Cross-lingual and Multilingual Information Retrieval

The widespread use of the Internet has increased the amount of multilingual information available online, and the number of users who are not native English speakers has grown as well. Initially, online documents were used predominantly by English speakers. Now more than half (50.4%) of web users speak a native language other than English. It has therefore become more important to retrieve documents in different languages and from different cultures in response to a user's request.

Our research in this area focuses on supporting multilingual information retrieval with interactive retrieval tools, with an emphasis on European languages. In addition, special attention is given to the Arabic language. We pursue two approaches to multilingual information retrieval: one uses machine-readable multilingual dictionaries; the other automatically extracts likely translation equivalents through statistical analysis of parallel corpora. For the second approach, we apply a statistical/probabilistic method to parallel text written in multiple languages, using bilingual parallel text as training data, in order to identify the correct sense of a word's translation.
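
The interplay of the two approaches can be illustrated with a minimal sketch. It is not the method used in the publications below; it only assumes a toy German/English dictionary and a tiny sentence-aligned parallel corpus, both invented for illustration (DICTIONARY, PARALLEL_CORPUS and translate_query are hypothetical names). The dictionary proposes candidate translations for each query word, and simple co-occurrence counts over the parallel corpus select the candidate that fits the other query words.

```python
from collections import Counter

# Hypothetical machine-readable dictionary: German word -> English candidates.
DICTIONARY = {
    "schloss": ["castle", "lock"],
    "tür": ["door"],
    "könig": ["king"],
}

# Hypothetical sentence-aligned parallel corpus (German, English).
PARALLEL_CORPUS = [
    ("der könig wohnt im schloss", "the king lives in the castle"),
    ("das schloss an der tür ist kaputt", "the lock on the door is broken"),
    ("der könig liebt sein schloss", "the king loves his castle"),
    ("die tür hat ein neues schloss", "the door has a new lock"),
    ("der könig besucht das schloss", "the king visits the castle"),
]


def translate_query(query):
    """Translate a query word by word, disambiguating with corpus counts."""
    words = query.lower().split()
    translation = []
    for word in words:
        # Unknown words are kept untranslated.
        candidates = DICTIONARY.get(word, [word])
        # The remaining query words serve as disambiguation context.
        context = set(words) - {word}
        scores = Counter()
        for source, target in PARALLEL_CORPUS:
            source_tokens = set(source.split())
            target_tokens = set(target.split())
            # Count only sentence pairs where the word appears with context.
            if word in source_tokens and context & source_tokens:
                for candidate in candidates:
                    if candidate in target_tokens:
                        scores[candidate] += 1
        # Ties (including "no evidence") fall back to the first dictionary entry.
        translation.append(max(candidates, key=lambda c: scores[c]))
    return translation


print(translate_query("Schloss Tür"))
print(translate_query("König Schloss"))
```

In this toy setting the ambiguous word "Schloss" is translated as "lock" next to "Tür" and as "castle" next to "König". A realistic system would replace the raw counts with a proper probabilistic model and handle morphology, which matters especially for Arabic.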

Selected Publications

Andargachew Gezmu and Andreas Nürnberger, Neural Machine Translation for Amharic-English Translation, In: Proceedings of the 13th International Conference on Agents and Artificial Intelligence, Volume 1, Online, 4-6 February 2021. Setúbal: SCITEPRESS - Science and Technology Publications, Lda. (Rocha, Ana Paula), 2021, pp. 526-53.
Farag Ahmed and Andreas Nürnberger, multi Searcher: Can we Support People to get Information from Text they can't Read or Understand?, In: Proceedings of the 33rd Annual ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR2010), Geneva, Switzerland, 19-23 July, pp. 837-838.
Farag Ahmed and Andreas Nürnberger, Corpora based Approach for Arabic/English Word Translation Disambiguation. Journal of Speech and Language Technology, Volume 11, pp. 195-213, 2009.
Farag Ahmed and Andreas Nürnberger, Arabic/English Word Translations Disambiguation using Parallel Corpora and Matching Schemes, In: Proceedings of the 12th European Machine Translation Conference (EAMT08), 22-23 September 2008, University of Hamburg, Germany, pp. 6-11.
Ernesto William De Luca, Stefan Hauke, Andreas Nürnberger and Stefan Schlechtweg, MultiLexExplorer: Combining Multilingual Web Search with Multilingual Lexical Resources, In: Proceedings of the combined Workshop on Language-Enabled Educational Technology and Development and Evaluation of Robust Spoken Dialogue Systems, in conjunction with the 17th European Conference on Artificial Intelligence (ECAI'06), Riva del Garda, Italy, pp. 17-21, 2006.