Multi-CAST Tabasaran (audio recordings)
Contributor(s):
Publisher Information:
Otto-Friedrich-Universität Bamberg
Year of publication:
2021
Language:
Multilingual/Other
Abstract:
This archive contains audio recordings for the Multi-CAST Tabasaran corpus (Bogomolova et al. 2021), originally published in January 2021 with version 2101 of the Multi-CAST collection (Haig & Schnell 2015). The annotation and documentation files accompanying these recordings have been archived separately. The recordings are available as WAV and MP3 files.
Tabasaran [taba1259] is a Nakh-Daghestanian (Caucasian) language from the Lezgic subbranch. Recent census data puts the number of speakers at about 120 000; Campbell et al. (2017) classify the language as vulnerable.
The texts in the Multi-CAST Tabasaran corpus were recorded by Natalia Bogomolova with the assistance of Dmitry Ganenkov in 2010, and subsequently transcribed, glossed, and translated by Bogomolova. The GRAID and RefIND annotations were added by Nils Schiborr between 2019 and 2020. The five texts in this corpus are a mixture of traditional and biographical narratives.
Citation
Bogomolova, Natalia & Ganenkov, Dmitry & Schiborr, Nils N. 2021. Multi-CAST Tabasaran. In Haig, Geoffrey & Schnell, Stefan (eds.), Multi-CAST: Multilingual corpus of annotated spoken texts. [version of the annotations used]. Bamberg: University of Bamberg.
References
Campbell, Lyle & Lee, Nala H. & Okura, Eve & Simpson, Sean & Ueki, Kaori. 2017. The catalogue of endangered languages (ELCat). (endangeredlanguages.com/)
Haig, Geoffrey & Schnell, Stefan (eds.). 2015. Multi-CAST: Multilingual corpus of annotated spoken texts. [version]. Bamberg: University of Bamberg.
Type:
Sound
Keywords:
spoken language corpus
Tabasaran
Format:
audio/mpeg
audio/wav
Version:
1
Permalink
https://fis.uni-bamberg.de/handle/uniba/97623