The paper “BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages” by Benjamin Heinzerling and Michael Strube has been accepted for LREC 2018, which will take place in Miyazaki, Japan, from May 7th to 12th, 2018. The paper comes with a natural language resource, a collection of pre-trained subword embeddings in 275 languages, which can be downloaded from GitHub. The European Media Lab has been a regular sponsor of the LREC conference for several consecutive years.
HITS, the Heidelberg Institute for Theoretical Studies, was established in 2010 by physicist and SAP co-founder Klaus Tschira (1940-2015) and the Klaus Tschira Foundation as a private, non-profit research institute. HITS conducts basic research in the natural, mathematical, and computer sciences. Major research directions include complex simulations across scales, making sense of data, and enabling science via computational research. Application areas range from molecular biology to astrophysics. An essential characteristic of the Institute is interdisciplinarity, implemented in numerous cross-group and cross-disciplinary projects. The base funding of HITS is provided by the Klaus Tschira Foundation.