Orthonormal Explicit Topic Analysis for Cross-Lingual Document Matching
McCrae, John Philip; Cimiano, Philipp; Klinger, Roman (2013): Orthonormal Explicit Topic Analysis for Cross-Lingual Document Matching, in: David Yarowsky, Timothy Baldwin, Anna Korhonen, et al. (eds.), Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, Washington: Association for Computational Linguistics, pp. 1732–1740.
Faculty/Chair:
Author:
McCrae, John Philip
Cimiano, Philipp
Klinger, Roman
Title of the compilation:
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
Editors:
Yarowsky, David
Baldwin, Timothy
Korhonen, Anna
Livescu, Karen
Bethard, Steven
Conference:
EMNLP
Publisher Information:
Year of publication:
2013
Pages:
1732–1740
Language:
English
Abstract:
Cross-lingual topic modelling has applications in machine translation, word sense disambiguation and terminology alignment. Multilingual extensions of approaches based on latent (LSI), generative (LDA, PLSI) as well as explicit (ESA) topic modelling can induce an interlingual topic space allowing documents in different languages to be mapped into the same space and thus to be compared across languages. In this paper, we present a novel approach that combines latent and explicit topic modelling approaches in the sense that it builds on a set of explicitly defined topics, but then computes latent relations between these. Thus, the method combines the benefits of both explicit and latent topic modelling approaches. We show that on a cross-lingual mate retrieval task, our model significantly outperforms LDA, LSI, and ESA, as well as a baseline that translates every word in a document into the target language.
GND Keywords:
Machine translation (Maschinelle Übersetzung)
Keywords:
Cross-Lingual Document Matching
DDC Classification:
RVK Classification:
Peer Reviewed:
Yes
International Distribution:
Yes
Open Access Journal:
Yes
Type:
Conference object
Activation date:
March 8, 2024
Permalink
https://fis.uni-bamberg.de/handle/uniba/94019