Scientific Databases and Visualization - BioReader

The main focus of this project is the development and application of natural language processing (NLP) methods for handling chemical compound names. A chemical compound can have many different names: several trivial names as well as several systematic names, even when naming recommendations such as those of the International Union of Pure and Applied Chemistry (IUPAC) are followed. Furthermore, underspecified names and class names frequently occur in publications, databases and patents.

This project pursues two different approaches:

ChemHits identifies names of chemical compounds via string normalization. Input names are normalized and subsequently matched against one of several reference databases (ChEBI, KEGG, etc.). (Version 1.0 released July 2010!)

CLP(name2structure) performs a deep analysis of a given name, with the aim of deriving a chemical structure and a classification for it.
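The normalize-then-match idea described for ChemHits can be illustrated with a minimal sketch. The normalization rules, the function names and the reference entries (including the `REF:` identifiers) are hypothetical and chosen only for illustration; they are not the rules or data that ChemHits actually uses.

```python
import re
import unicodedata

# Hypothetical reference entries: identifier -> known names.
# A real setup would load names from a database such as ChEBI or KEGG.
REFERENCE = {
    "REF:0001": ["D-Glucose", "Dextrose"],
    "REF:0002": ["Acetic acid", "Ethanoic acid"],
}

def normalize(name):
    """Reduce a compound name to a comparison key (illustrative rules only)."""
    # Fold Unicode compatibility forms and ignore case.
    key = unicodedata.normalize("NFKD", name).casefold()
    # Drop everything except ASCII letters and digits
    # (spaces, hyphens, commas, brackets, ...).
    return re.sub(r"[^a-z0-9]", "", key)

def build_index(reference):
    """Map each normalized name to the set of identifiers that carry it."""
    index = {}
    for ref_id, names in reference.items():
        for name in names:
            index.setdefault(normalize(name), set()).add(ref_id)
    return index

def match(name, index):
    """Return the sorted identifiers whose normalized names equal the input's."""
    return sorted(index.get(normalize(name), ()))

INDEX = build_index(REFERENCE)
print(match("d glucose", INDEX))
print(match("ACETIC-ACID", INDEX))
print(match("benzene", INDEX))
```

Stripping all punctuation is deliberately crude: it lets spelling variants collide, but it can also conflate names that differ only in locants or brackets, so a practical normalizer would need more careful rules.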

The methods and tools developed in this project are intended for use by curators of the SABIO-RK database to identify compounds.

 
page last modified: 22.12.2011,15:36



Project Manager

Priv.-Doz. Dr. Wolfgang Müller
Email:
Phone: +49 (0)6221 - 533 - 231

Fax: +49 (0)6221 - 533 - 298