Orthonormal Explicit Topic Analysis for Cross-Lingual Document Matching
McCrae, John Philip; Cimiano, Philipp; Klinger, Roman (2024): Orthonormal Explicit Topic Analysis for Cross-Lingual Document Matching, in: Bamberg: Otto-Friedrich-Universität, pp. 1732–1740.
Author:
McCrae, John Philip; Cimiano, Philipp; Klinger, Roman
Publisher Information:
Bamberg: Otto-Friedrich-Universität
Year of publication:
2024
Pages:
1732–1740
Source/Other editions:
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing / David Yarowsky, Timothy Baldwin, Anna Korhonen, Karen Livescu, Steven Bethard (eds.). - Seattle, Washington : Association for Computational Linguistics, 2013, pp. 1732–1740.
Year of first publication:
2013
Language:
English
Abstract:
Cross-lingual topic modelling has applications in machine translation, word sense disambiguation and terminology alignment. Multilingual extensions of approaches based on latent (LSI), generative (LDA, PLSI) as well as explicit (ESA) topic modelling can induce an interlingual topic space allowing documents in different languages to be mapped into the same space and thus to be compared across languages. In this paper, we present a novel approach that combines latent and explicit topic modelling approaches in the sense that it builds on a set of explicitly defined topics, but then computes latent relations between these. Thus, the method combines the benefits of both explicit and latent topic modelling approaches. We show that on a cross-lingual mate retrieval task, our model significantly outperforms LDA, LSI, and ESA, as well as a baseline that translates every word in a document into the target language.
GND Keywords:
Maschinelle Übersetzung (machine translation)
Keywords:
Cross-Lingual Document Matching
Peer Reviewed:
Yes
International Distribution:
Yes
Open Access Journal:
Yes
Type:
Conference object
Activation date:
August 19, 2024
Permalink
https://fis.uni-bamberg.de/handle/uniba/96521