Trendi - a monitor corpus of Slovene
In this paper we present Trendi, a monitor corpus of written Slovene, which has been compiled recently as part of the SLED (Monitor corpus and related resources) project. The methodology and the contents of the corpus are presented, as well as the...
mehr
Volltext:
|
|
Zitierfähiger Link:
|
|
In this paper we present Trendi, a monitor corpus of written Slovene, which has been compiled recently as part of the SLED (Monitor corpus and related resources) project. The methodology and the contents of the corpus are presented, as well as the findings of the survey that aimed to identify the needs of potential users related to topical language use. The Trendi corpus currently contains news articles and other web content from 110 different sources, with the texts being collected and linguistically annotated on a daily basis. The corpus complements Gigafida 2.0, a 1.13-billion-word reference corpus of standard written Slovene. Also discussed are the ways in which the corpus will be integrated into various lexicographic projects, helping not only in the identification of neologisms but also in monitoring changes in already identified language phenomena.
|
Creating and working with spoken language corpora in EXMARaLDA
Spoken language corpora— as used in conversation analytic research, language acquisition studies and dialectology— pose a number of challenges that are rarely addressed by corpus linguistic methodology and technology. This paper starts by giving an...
mehr
Volltext:
|
|
Zitierfähiger Link:
|
|
Spoken language corpora— as used in conversation analytic research, language acquisition studies and dialectology— pose a number of challenges that are rarely addressed by corpus linguistic methodology and technology. This paper starts by giving an overview of the most important methodological issues distinguishing spoken language corpus workfrom the work with written data. It then shows what technological challenges these methodological issues entail and demonstrates how they are dealt with in the architecture and tools of the EXMARaLDA system.
|
Export in Literaturverwaltung |
|