SDBV Group

Tools

FAIRDOM-SEEK

FAIRDOM-SEEK is a web-based cataloguing and commons platform for sharing heterogeneous scientific research datasets, models or simulations, processes and research outcomes. It preserves the associations between them, along with information about the people and organisations involved.

Underpinning FAIRDOM-SEEK is the ISA (Investigation, Study, Assay) infrastructure, a standard framework for describing how individual experiments are aggregated into wider studies and investigations. Within FAIRDOM-SEEK, ISA has been extended and made configurable so that the structure can also be used outside of biology.
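
The aggregation that ISA describes can be pictured as a simple nested data structure. The following is a minimal sketch in Python, not FAIRDOM-SEEK's actual data model or API: the class names follow the ISA terms (Investigation, Study, Assay), while the fields, the helper method and the example titles and file names are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """One individual experiment and the data files it produced (hypothetical fields)."""
    title: str
    data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    """A group of related assays."""
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """The top-level container that aggregates studies."""
    title: str
    studies: List[Study] = field(default_factory=list)

    def all_data_files(self) -> List[str]:
        # Walk the hierarchy and collect every data file, preserving order.
        return [f for s in self.studies for a in s.assays for f in a.data_files]


# Illustrative content only; the titles and file names are made up.
investigation = Investigation(
    title="Example investigation",
    studies=[
        Study(
            title="Example study",
            assays=[
                Assay("Assay A", ["a1.csv", "a2.csv"]),
                Assay("Assay B", ["b1.xlsx"]),
            ],
        )
    ],
)
print(investigation.all_data_files())
```

Because each level only holds a list of the level below it, the same shape could in principle be relabelled for non-biological work, which is the kind of reuse the configurable ISA structure is meant to allow.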

Flexible and detailed sharing permissions are available to manage the catalogued items from early collaborations within projects through to the publishing of final research results. At publication, a DOI can be generated for individual items or for entire aggregates packaged as Research Objects.

FAIRDOM-SEEK incorporates semantic technology, allowing sophisticated queries over the content. Metadata can be collected using standard Excel tools and processes, through the use of RightField.

FAIRDOM-SEEK can be downloaded, installed, and managed locally as a solution for data sharing within groups and consortia. In addition, a public instance of a FAIRDOM-SEEK commons is available as the FAIRDOMHub.

ChemHits

Normalization and Matching of Chemical Compound Names

Despite all standardization efforts in the field of chemical nomenclature, a single chemical compound can still be found under many different names, both trivial and systematic. Hence, the unambiguous identification of a chemical compound based solely on its name requires comprehensive chemical knowledge and often extensive searches in chemical databases. As many publications describe a chemical compound exclusively by its name, matching these diverging notations can be tedious. However, this identification is crucial for the integration of biochemical data, e.g. for bundling data in databases or for setting up biochemical models based on published data found in the literature.

We have developed ChemHits, an application that detects and matches synonymous names of (bio-)chemical compounds and thereby facilitates the merging of data that refer to the same compound under different names. The tool is based on natural language processing (NLP) methods and applies transformation rules to systematically convert chemical compound names into a unique, generic, normalized name form. It can normalize a given compound name and match it against names in (bio-)chemical databases such as KEGG COMPOUND, SABIO-RK or ChEBI, even when there is no exact name-to-name match. The tool can also match a complete list of compound names against these databases, which makes it useful for the automatic cross-annotation of chemical data in databases.
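
The general idea of normalizing names before matching can be illustrated with a short Python sketch. This is not the ChemHits implementation, and its transformation rules are not reproduced here. The three rules below (lowercasing, spelling out a few Greek letters, dropping punctuation and whitespace), the function names and the compound identifiers are all assumptions chosen only to show the technique.

```python
import re
import unicodedata
from typing import Dict, Iterable, List, Set

# Hypothetical rule table: spell out a few Greek letters.
GREEK = {"α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta"}


def normalize_name(name: str) -> str:
    """Reduce a compound name to a crude canonical key (illustrative rules only)."""
    text = unicodedata.normalize("NFKC", name).lower()
    for symbol, word in GREEK.items():
        text = text.replace(symbol, word)
    # Remove everything except ASCII letters and digits (hyphens, spaces, commas, brackets).
    return re.sub(r"[^a-z0-9]+", "", text)


def build_index(records: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each normalized name to the set of compound identifiers that carry it."""
    index: Dict[str, Set[str]] = {}
    for compound_id, names in records.items():
        for name in names:
            index.setdefault(normalize_name(name), set()).add(compound_id)
    return index


def match(name: str, index: Dict[str, Set[str]]) -> List[str]:
    """Return the identifiers whose normalized names equal the normalized query."""
    return sorted(index.get(normalize_name(name), set()))


# Made-up identifiers standing in for database entries.
records = {
    "CPD-1": ["D-Glucose 6-phosphate"],
    "CPD-2": ["alpha-Ketoglutarate", "2-Oxoglutarate"],
}
index = build_index(records)

for query in ["d-glucose-6-phosphate", "α-Ketoglutarate", "2-oxoglutarate", "unknown compound"]:
    print(query, "->", match(query, index))
```

Matching a complete list of names, as described above, amounts to calling match once per name against the same index. Rules this coarse can merge names that actually denote different compounds, so a real rule set has to be considerably more careful about which characters carry chemical meaning.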

NormSys

Registry for Modeling Standards

The NormSys Registry aims to survey standard formats for computational modeling in biology. It not only lists the standards, but also compares their major features, their possible fields of biological application and use cases (including model examples), as well as their relationships, commonalities and differences. The registry offers a common entry point for modelers and software developers who plan to apply the standards to their own use case, and provides them with detailed information and links to the standards, their specifications and APIs.