1. Caseologue custom python script

This custom Python script complements the tests performed by the ELK reasoner and the ROBOT report tool. It allows us to test for specific features of the EDAM ontology (e.g. checking for Wikipedia links or for mandatory properties).

1.1. Get started

This tool is not yet available as a Python or conda package. To use it on your OWL file, you will need to clone the public GitHub repository:

git clone git@github.com:edamontology/caseologue.git

Move to the caseologue/caseologue_python directory and install requirements:

pip install -r requirements.txt

Then run:

EDAM_PATH=<path to test data> python3 caseologue.py

Note: On Windows, write python instead of python3; python also works on Mac.

By default the script runs all tests. You can filter the tests by error level using these options:

-E, --essential  runs all essential tests
-e, --error      runs all error tests
-c, --curation   runs all curation tests

The assignment of tests to error levels can be seen in the source code here.

1.1.1. Options

  • Essential: tests that can also be applied to other branches of the EDAM ontology, such as EDAM geo or EDAM bioimaging. They are also mandatory for a pull request to be merged.

  • Error: tests validating the semantic and syntactic consistency of the ontology. They must pass for a pull request to be merged on the GitHub repository.

  • Curation: optional tests, run by maintainers, whose failure does not compromise the integrity or the logical structure of the ontology. The curation level is also a staging area for tests that should be error or essential but still raise errors that need to be fixed.

1.2. Tests documentation

This Python script uses the unittest module to test the EDAM OWL file and report its errors.

For (almost) every test described below, the script runs a custom SPARQL query using the RDFLib library and reports the errors in a comprehensive table.

class caseologue.EdamQueryTest(methodName='runTest')[source]
test_bad_uri()[source]

Checks that every concept URI matches the regex ^http://edamontology.org/(data|topic|operation|format)_[0-9]{4}$.

> SPARQL query available here

Severity level: essential
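The same pattern can be tried outside SPARQL with Python's re module. This is a minimal sketch: the regex is copied from the description above, while the function name and the sample URIs are hypothetical and only serve the illustration.

```python
import re

# Pattern copied verbatim from the test description.
EDAM_URI_PATTERN = re.compile(
    r"^http://edamontology.org/(data|topic|operation|format)_[0-9]{4}$"
)


def find_bad_uris(uris):
    """Return the URIs that do not match the expected EDAM pattern."""
    return [uri for uri in uris if not EDAM_URI_PATTERN.match(uri)]


# Illustrative data only.
sample = [
    "http://edamontology.org/data_0006",
    "http://edamontology.org/topic_3307",
    "http://edamontology.org/format_19",       # too few digits
    "http://edamontology.org/Operation_0004",  # wrong case
]
bad = find_bad_uris(sample)
print(bad)
```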

test_bad_uri_reference()[source]

Checks that every reference (e.g. superclass) to another concept points to a concept that is actually declared in EDAM.

"get_uri.rq" retrieves all declared URIs. "uri_reference.rq" retrieves all referenced URIs. The test then checks that every referenced URI is among the declared concept URIs.

> SPARQL query available here

Severity level: essential
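The comparison step described above amounts to a set difference. The following is a minimal sketch, assuming the results of the two queries have already been collected into plain Python collections of URI strings; the function name and the sample URIs are hypothetical and do not come from the caseologue source.

```python
def find_undeclared_references(declared_uris, referenced_uris):
    """Return referenced URIs that are not declared, sorted for stable output."""
    return sorted(set(referenced_uris) - set(declared_uris))


# Illustrative data only: stands in for the results of get_uri.rq and uri_reference.rq.
declared = {
    "http://edamontology.org/data_0006",
    "http://edamontology.org/data_0842",
}
referenced = {
    "http://edamontology.org/data_0006",
    "http://edamontology.org/data_9999",  # never declared
}

undeclared = find_undeclared_references(declared, referenced)
print(undeclared)
```

An empty list would mean that every reference resolves to a declared concept.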

Checks that every topic has a Wikipedia link filled in the seeAlso property.

> SPARQL query available here

Severity level: curation

test_deprecated_replacement()[source]

Checks that every deprecated concept has a suggested replacement (replacedBy or consider).

> SPARQL query available here

Severity level: error

test_deprecated_replacement_obsolete()[source]

Checks that the suggested replacement (replacedBy/consider) for a deprecated term is not obsolete.

> SPARQL query available here

Severity level: curation

test_duplicate_all()[source]

Checks that there is no duplicate content (case sensitive, for computational reasons) across all the ontology on given properties.

> SPARQL query available here

Severity level: curation

test_duplicate_in_concept()[source]

Checks that there is no duplicate content (case insensitive) within a concept on given properties.

> SPARQL query available here

Severity level: curation

test_empty_property()[source]

Checks that no property is empty.

> SPARQL query available here

Severity level: error

test_format_property_missing()[source]

Checks that no mandatory property for formats is missing (documentation, is_format_of). To avoid missing an "is_format_of" property inherited from a parent concept, a CONSTRUCT query is used to add the missing triples to the graph.

> SPARQL query available here

Severity level: curation

test_formatting()[source]

Checks the formatting of the properties. Property values should have no space at the start or the end, no tab, and no end of line. Also checks that labels do not end with a dot and that definitions do end with a dot.

Severity level: curation
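The formatting rules listed above can be illustrated in plain Python. This is only a sketch under assumptions: the actual test relies on SPARQL queries, and the function name, the property names "label" and "definition" used as keys, and the problem messages are hypothetical.

```python
def formatting_problems(prop, value):
    """Return a list of formatting problems found in one property value."""
    problems = []
    if value != value.strip(" "):
        problems.append("leading or trailing space")
    if "\t" in value:
        problems.append("tab")
    if "\n" in value or "\r" in value:
        problems.append("end of line")
    # Labels must not end with a dot; definitions must end with one.
    if prop == "label" and value.endswith("."):
        problems.append("label ends with a dot")
    if prop == "definition" and not value.endswith("."):
        problems.append("definition lacks a final dot")
    return problems


print(formatting_problems("label", "Sequence."))
print(formatting_problems("definition", " A thing"))
print(formatting_problems("definition", "A thing."))
```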

test_id_unique()[source]

Checks that the numerical part of the URI is not duplicated.

Uses a small Python script, available here, to retrieve all duplicated ids.

Severity level: error
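One way to detect duplicated numerical ids is to group URIs by the part after the last underscore. This is a minimal sketch, not the script linked above; the function name and the sample URIs are hypothetical.

```python
from collections import defaultdict


def find_duplicate_ids(uris):
    """Map each numerical id used more than once to the URIs that share it."""
    by_id = defaultdict(list)
    for uri in uris:
        numeric_id = uri.rsplit("_", 1)[-1]  # e.g. "1234" from ".../data_1234"
        by_id[numeric_id].append(uri)
    return {num: shared for num, shared in by_id.items() if len(shared) > 1}


# Illustrative data only.
uris = [
    "http://edamontology.org/data_1234",
    "http://edamontology.org/topic_1234",  # same numerical part as above
    "http://edamontology.org/format_2330",
]
duplicates = find_duplicate_ids(uris)
print(duplicates)
```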

test_identifier_property_missing()[source]

Checks that no mandatory property for identifiers (subclasses of accession) is missing (regex).

> SPARQL query available here

Severity level: curation

Checks that all webpages and DOIs are declared as literal links.

> SPARQL query available here

Severity level: curation

test_mandatory_property_missing()[source]

Checks that no mandatory property for concepts is missing (oboInOwl:hasDefinition, rdfs:subClassOf, created_in, oboInOwl:inSubset, rdfs:label).

> SPARQL query available here

Severity level: error

test_missing_deprecated_property()[source]

Checks that no mandatory property for deprecated concepts is missing (edam:obsolete_since, edam:oldParent, oboInOwl:inSubset <http://purl.obolibrary.org/obo/edam#obsolete>, rdfs:subClassOf <http://www.w3.org/2002/07/owl#DeprecatedClass>).

> SPARQL query available here

Severity level: error

test_next_id_modif()[source]

Checks that the "next id" property is equal to "the maximal concept id numerical part" +1.

> SPARQL query available here

Severity level: error
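The rule above is simple arithmetic on the numerical parts of the concept URIs. The sketch below assumes the URIs and the "next id" value have already been extracted from the ontology; both function names and the sample data are hypothetical.

```python
def expected_next_id(uris):
    """Return the largest numerical id among the URIs, plus one."""
    return max(int(uri.rsplit("_", 1)[-1]) for uri in uris) + 1


def next_id_is_valid(next_id, uris):
    """Check that the declared next id equals the maximal id + 1."""
    return int(next_id) == expected_next_id(uris)


# Illustrative data only.
uris = [
    "http://edamontology.org/data_0006",
    "http://edamontology.org/topic_4010",
]
print(expected_next_id(uris))
print(next_id_is_valid("4011", uris))
```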

test_object_relation_obsolete()[source]

Checks that relations between concepts do not point towards obsolete concepts (is_format_of, has_input, has_output, is_identifier_of, has_topic ...).

> SPARQL query available here

Severity level: error

test_relation_too_broad()[source]

Checks that a concept is not in relation (restriction) with a concept "not recommended for annotation".

> SPARQL query available here

Severity level: curation

test_spelling_check()[source]

Uses the Unix codespell command and a custom spelling dictionary to check for spelling errors in EDAM.

> GitHub page of codespell available here

Severity level: curation

test_subset_id()[source]

Checks that the "subset" part of a concept id is the same as that of its superclass (e.g. a data concept can only be a subclass of another data concept).

> SPARQL query available here

Severity level: error
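The subset comparison can be sketched as follows, assuming a list of (concept URI, superclass URI) pairs that contains only EDAM concept URIs. The function names and the sample pairs are hypothetical; the real test is implemented as a SPARQL query.

```python
def subset_of(uri):
    """Return the subset prefix of an EDAM URI, e.g. "data" for ".../data_0006"."""
    return uri.rsplit("/", 1)[-1].split("_", 1)[0]


def find_subset_mismatches(subclass_pairs):
    """Return the (concept, superclass) pairs whose subset prefixes differ."""
    return [
        (concept, parent)
        for concept, parent in subclass_pairs
        if subset_of(concept) != subset_of(parent)
    ]


# Illustrative data only.
pairs = [
    ("http://edamontology.org/data_0842", "http://edamontology.org/data_0006"),
    ("http://edamontology.org/format_2330", "http://edamontology.org/data_0006"),
]
mismatches = find_subset_mismatches(pairs)
print(mismatches)
```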

test_super_class_refers_to_self()[source]

Checks that no concept refers to itself as a superclass.

> SPARQL query available here

Severity level: essential

caseologue.suite()[source]

Defines the level of error of each test.
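As an illustration of how tests might be grouped by level with unittest, here is a minimal sketch. It is not the caseologue implementation: the ExampleQueryTest class, the LEVELS mapping, and the levels parameter are hypothetical stand-ins, and the test bodies are placeholders. Only the three test names and their levels are taken from the documentation above.

```python
import unittest


class ExampleQueryTest(unittest.TestCase):
    """Hypothetical stand-in for EdamQueryTest, with placeholder test bodies."""

    def test_bad_uri(self):
        self.assertEqual([], [])

    def test_empty_property(self):
        self.assertEqual([], [])

    def test_spelling_check(self):
        self.assertEqual([], [])


# Hypothetical mapping from error level to test method names.
LEVELS = {
    "essential": ["test_bad_uri"],
    "error": ["test_empty_property"],
    "curation": ["test_spelling_check"],
}


def suite(levels=None):
    """Build a TestSuite with the tests of the requested levels (all by default)."""
    selected = levels or list(LEVELS)
    tests = unittest.TestSuite()
    for level in selected:
        for name in LEVELS[level]:
            tests.addTest(ExampleQueryTest(name))
    return tests


if __name__ == "__main__":
    # Run only the essential and error tests.
    unittest.TextTestRunner(verbosity=2).run(suite(["essential", "error"]))
```

Keeping the mapping in one place means that moving a test from one level to another only requires editing a single entry.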