1. Caseologue custom Python script
This custom Python script complements the tests performed by the ELK reasoner and the ROBOT report tool. It allows us to test for specific features of the EDAM ontology (e.g. checking for Wikipedia links or for mandatory properties).
1.1. Get started
This tool is not yet available as a Python or conda package. To use it on your OWL file, you will need to clone the public GitHub repository:
git clone git@github.com:edamontology/caseologue.git
Move to the caseologue/caseologue_python directory and install requirements:
pip install -r requirements.txt
Then run:
EDAM_PATH=<path to test data> python3 caseologue.py
Note: On Windows, you will need to write python instead of python3; python also works on Mac.
By default the script will run all tests. You can filter the tests on error level using these options:
-E, --essential runs all essential tests
-e, --error runs all error tests
-c, --curation runs all curation tests
The distribution of tests across these levels can be seen in the source code here.
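As a rough illustration of how such level flags could be wired up, here is a minimal sketch using argparse. The function name `parse_levels`, the `ALL_LEVELS` constant and the way the fallback is coded are assumptions made for this example; they are not taken from the caseologue source code.

```python
import argparse

# The three error levels described in this documentation.
ALL_LEVELS = ("essential", "error", "curation")


def parse_levels(argv):
    """Return the list of error levels selected on the command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Hypothetical level filter")
    parser.add_argument("-E", "--essential", action="store_true",
                        help="run all essential tests")
    parser.add_argument("-e", "--error", action="store_true",
                        help="run all error tests")
    parser.add_argument("-c", "--curation", action="store_true",
                        help="run all curation tests")
    args = parser.parse_args(argv)
    selected = [level for level in ALL_LEVELS if getattr(args, level)]
    # No flag given: fall back to every level, since all tests run by default.
    return selected or list(ALL_LEVELS)
```

Calling `parse_levels(["-e", "-c"])` would select the error and curation levels, while an empty argument list selects all three.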
1.1.1. Options
Essential: tests that are also applicable to other variants of the EDAM ontology, such as EDAM geo or EDAM bioimaging; passing them is also mandatory for a pull request to be merged.
Error: tests validating the semantic and syntactic consistency of the ontology; passing them is mandatory for a pull request to be merged on the GitHub repository.
Curation: optional tests, run by maintainers, that, if failed, do not compromise the integrity or the logical structure of the ontology. The curation level is also a staging area for tests that should be error or essential but still raise errors needing to be fixed.
1.2. Tests documentation
This Python script uses the unittest module to test the EDAM OWL file and report its errors.
For (almost) each test described below, the script calls a custom SPARQL query, using the RDFLib library, and reports the errors in a comprehensive table.
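The general pattern (run a query, then fail the test with a table of offending concepts) can be sketched with the standard unittest module alone. In this sketch the SPARQL step is replaced by a hard-coded list, and the names `OFFENDING_ROWS`, `ExampleQueryTest` and `run_example` are hypothetical; the real script obtains its rows from RDFLib queries.

```python
import unittest

# Stand-in for the rows a SPARQL query would return: (concept URI, problem).
# An empty list means the ontology passed the check.
OFFENDING_ROWS = []


class ExampleQueryTest(unittest.TestCase):
    def test_example_check(self):
        rows = OFFENDING_ROWS
        # Build a small table so that a failure lists every offending concept.
        report = "\n".join(f"{uri}\t{problem}" for uri, problem in rows)
        self.assertEqual(len(rows), 0, f"{len(rows)} problem(s) found:\n{report}")


def run_example():
    """Run the test case and return True when every check passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleQueryTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Putting the table in the assertion message keeps the report next to the failing test in the unittest output.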
- class caseologue.EdamQueryTest(methodName='runTest')[source]
- test_bad_uri()[source]
Checks that each concept URI matches the regex ^http://edamontology.org/(data|topic|operation|format)_[0-9]{4}$.
> SPARQL query available here
Severity level: essential
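For illustration, the same pattern can be applied in plain Python with the re module. This is only a sketch of the check: the function name `find_bad_uris` and the sample URIs are made up for the example, and the actual test relies on a SPARQL query.

```python
import re

# Regex quoted in the description of test_bad_uri.
URI_PATTERN = re.compile(
    r"^http://edamontology.org/(data|topic|operation|format)_[0-9]{4}$"
)


def find_bad_uris(uris):
    """Return the URIs that do not match the expected EDAM pattern."""
    return [uri for uri in uris if not URI_PATTERN.match(uri)]


# Illustrative URIs only: the second has too few digits, the third a wrong prefix.
sample = [
    "http://edamontology.org/topic_0003",
    "http://edamontology.org/topic_03",
    "http://edamontology.org/Format_1234",
]
bad = find_bad_uris(sample)
```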
- test_bad_uri_reference()[source]
Checks that a reference (e.g. superclass) to another concept is actually declared in EDAM.
"get_uri.rq" retrieves all declared URIs. "uri_reference.rq" retrieves all referenced URIs. The test then checks that every referenced URI is among the declared concept URIs.
> SPARQL query available here
Severity level: essential
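Once both lists of URIs are available, the comparison amounts to a set difference. A minimal sketch, assuming the two query results have already been turned into plain lists of strings (the function name `find_undeclared_references` is hypothetical):

```python
def find_undeclared_references(declared_uris, referenced_uris):
    """Return the referenced URIs that are not declared, sorted for a stable report."""
    declared = set(declared_uris)
    return sorted(set(referenced_uris) - declared)


# Illustrative data: topic_9999 is referenced but never declared.
declared = ["http://edamontology.org/topic_0003", "http://edamontology.org/data_0006"]
referenced = ["http://edamontology.org/topic_0003", "http://edamontology.org/topic_9999"]
undeclared = find_undeclared_references(declared, referenced)
```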
- test_check_wikipedia_link()[source]
Checks that every topic has a Wikipedia link filled in the seeAlso property.
> SPARQL query available here
Severity level: curation
- test_deprecated_replacement()[source]
Checks that every deprecated concept has a suggested replacement (replacedBy or consider).
> SPARQL query available here
Severity level: error
- test_deprecated_replacement_obsolete()[source]
Checks that the suggested replacement (replacedBy/consider) for a deprecated term is not obsolete.
> SPARQL query available here
Severity level: curation
- test_duplicate_all()[source]
Checks that there is no duplicate content (case sensitive, for computational reasons) across the whole ontology on given properties.
> SPARQL query available here
Severity level: curation
- test_duplicate_in_concept()[source]
Checks that there is no duplicate content (case insensitive) within a concept on given properties.
> SPARQL query available here
Severity level: curation
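The case-insensitive comparison within one concept can be pictured as follows. This is a sketch only: `duplicates_in_concept` is a hypothetical helper, and the values stand for whatever properties the query compares (for instance a label and its synonyms).

```python
from collections import Counter


def duplicates_in_concept(values):
    """Return the case-folded values that occur more than once within one concept."""
    counts = Counter(value.casefold() for value in values)
    return sorted(value for value, n in counts.items() if n > 1)


# "Sequence" and "sequence" differ only by case, so they count as duplicates.
found = duplicates_in_concept(["Sequence", "sequence", "Alignment"])
```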
- test_empty_property()[source]
Checks that no property is empty.
> SPARQL query available here
Severity level: error
- test_format_property_missing()[source]
Checks that no mandatory property for formats is missing (documentation, is_format_of). To make sure not to miss the "is_format_of" property inherited from a parent concept, a CONSTRUCT query is used to add the missing triples to the graph.
> SPARQL query available here
Severity level: curation
- test_formatting()[source]
Checks the formatting of the properties. Properties should have no space at the start or the end, no tab and no end of line. Also checks that labels have no dot at the end and that definitions do have a dot at the end.
> SPARQL query available here
Severity level: curation
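The rules above can be written down as a small Python function. This sketch is not the SPARQL query used by the test; the function name `formatting_problems` and the property keys "label" and "definition" are assumptions chosen for the example.

```python
def formatting_problems(prop, value):
    """Return a list of formatting problems found in one property value."""
    problems = []
    if value != value.strip(" "):
        problems.append("leading or trailing space")
    if "\t" in value:
        problems.append("tab")
    if "\n" in value or "\r" in value:
        problems.append("end of line")
    # Labels must not end with a dot, definitions must.
    if prop == "label" and value.endswith("."):
        problems.append("label ends with a dot")
    if prop == "definition" and not value.endswith("."):
        problems.append("definition lacks a final dot")
    return problems
```

An empty list means the value respects every rule; each string in the list names one rule that was broken.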
- test_id_unique()[source]
Checks that the numerical part of the URI is not duplicated.
Uses a small Python script, available here, to retrieve all duplicated ids.
Severity level: error
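One possible way to detect duplicated numerical ids is to group the URIs by the part after the last underscore. This is a sketch under that assumption, not a copy of the script mentioned above; `duplicated_ids` is a hypothetical name.

```python
from collections import defaultdict


def duplicated_ids(uris):
    """Map each numerical id used by more than one URI to the URIs sharing it."""
    by_id = defaultdict(list)
    for uri in uris:
        numeric_id = uri.rsplit("_", 1)[-1]  # e.g. "0001" from ".../data_0001"
        by_id[numeric_id].append(uri)
    return {i: shared for i, shared in by_id.items() if len(shared) > 1}


# Illustrative URIs: data_0001 and topic_0001 share the same numerical part.
clashes = duplicated_ids([
    "http://edamontology.org/data_0001",
    "http://edamontology.org/topic_0001",
    "http://edamontology.org/format_1234",
])
```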
- test_identifier_property_missing()[source]
Checks that no mandatory property for identifiers (subclasses of accession) is missing (regex).
> SPARQL query available here
Severity level: curation
- test_literal_links()[source]
Checks that all web pages and DOIs are declared as literal links.
> SPARQL query available here
Severity level: curation
- test_mandatory_property_missing()[source]
Checks that no mandatory property is missing for any concept (oboInOwl:hasDefinition, rdfs:subClassOf, created_in, oboInOwl:inSubset, rdfs:label).
> SPARQL query available here
Severity level: error
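As an illustration of this check, the sketch below looks for missing mandatory properties in a plain dictionary. The data layout (concept URI mapped to a dictionary of property values) and the name `missing_mandatory` are assumptions for the example; the actual test works on the RDF graph through SPARQL.

```python
# Mandatory properties listed in the description of the test.
MANDATORY = (
    "oboInOwl:hasDefinition",
    "rdfs:subClassOf",
    "created_in",
    "oboInOwl:inSubset",
    "rdfs:label",
)


def missing_mandatory(concepts):
    """Return {concept URI: [missing properties]} for incomplete concepts."""
    report = {}
    for uri, props in concepts.items():
        # A property counts as missing when absent or without any value.
        missing = [p for p in MANDATORY if not props.get(p)]
        if missing:
            report[uri] = missing
    return report
```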
- test_missing_deprecated_property()[source]
Checks that no mandatory property is missing for any deprecated concept (edam:obsolete_since, edam:oldParent, oboInOwl:inSubset <http://purl.obolibrary.org/obo/edam#obsolete>, rdfs:subClassOf <http://www.w3.org/2002/07/owl#DeprecatedClass>).
> SPARQL query available here
Severity level: error
- test_next_id_modif()[source]
Checks that the "next id" property is equal to "the maximal concept id numerical part" +1.
> SPARQL query available here
Severity level: error
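The arithmetic behind this check is short enough to sketch in Python. The helper names `expected_next_id` and `next_id_is_consistent` are hypothetical, and the URIs are illustrative; the actual test retrieves the values with a SPARQL query.

```python
def expected_next_id(uris):
    """Return the maximal numerical part among the concept URIs, plus one."""
    return max(int(uri.rsplit("_", 1)[-1]) for uri in uris) + 1


def next_id_is_consistent(uris, next_id):
    """Check the declared "next id" value against the expected one."""
    return int(next_id) == expected_next_id(uris)


# Illustrative URIs: the largest numerical part is 3307, so 3308 is expected.
uris = [
    "http://edamontology.org/data_0006",
    "http://edamontology.org/topic_3307",
    "http://edamontology.org/format_1929",
]
```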
- test_object_relation_obsolete()[source]
Checks that no relation between concepts points towards an obsolete concept (is_format_of, has_input, has_output, is_identifier_of, has_topic ...).
> SPARQL query available here
Severity level: error
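Conceptually, this check filters the relations whose target belongs to the set of obsolete concepts. A minimal sketch, assuming relations are available as (subject, property, object) tuples and that the obsolete URIs are already known; `relations_to_obsolete` is a hypothetical name and the URIs are illustrative.

```python
def relations_to_obsolete(relations, obsolete_uris):
    """Return the (subject, property, object) relations whose object is obsolete."""
    obsolete = set(obsolete_uris)
    return [(s, p, o) for s, p, o in relations if o in obsolete]


# Illustrative data: data_0002 is assumed obsolete for the sake of the example.
relations = [
    ("http://edamontology.org/format_1929", "is_format_of", "http://edamontology.org/data_0001"),
    ("http://edamontology.org/operation_0004", "has_input", "http://edamontology.org/data_0002"),
]
flagged = relations_to_obsolete(relations, ["http://edamontology.org/data_0002"])
```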
- test_relation_too_broad()[source]
Checks that a concept is not related (through a restriction) to a concept marked "not recommended for annotation".
> SPARQL query available here
Severity level: curation
- test_spelling_check()[source]
Uses the Unix codespell command and a custom spelling dictionary to check for spelling errors in EDAM.
> GitHub page of codespell available here
Severity level: curation