Results for *

385 results were found.

Showing results 1 to 25 of 385.

  1. Verpackt in Feedbackschleifen : Einblicke in Digitale Lehrformate des digitalen Sommersemesters
    Published: 2021

    The pandemic-driven need to teach exclusively online in the summer semester raised a major question: "How do I get in touch with our students?" and, even more, "How do I maintain that contact?" Ideally, students are connected with one another through messenger groups, whereas we lecturers often have only one channel: traditional e-mail. Many students do not check their university e-mail account or forward it to a private address, which essentially meant that only our homepage was available as a reliable but one-way information channel. "Signposts" to the new digital spaces could be set up there, but what was offered in those spaces should, in my opinion, be adapted interactively to our students' needs. In fact, more interaction was needed than in face-to-face courses. For me, integrating opportunities for interaction was therefore a central aspect of transforming the courses planned for the summer semester, a process that was itself one of trial and error and would have come to nothing without feedback. In what follows, I present one seminar and three formats that integrate feedback and interaction in different ways.

     

    Source: BASE Fachausschnitt Germanistik
    Language: German
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400); Linguistics (410); Literature and rhetoric (800); Literatures of Germanic languages; German literature (830)
    Keywords: Virtual university; Moodle; Interaction; Interactivity; E-learning; Feedback; German studies; Linguistics
    License:

    Creative Commons - Attribution 4.0

  2. Kommunikative Formate : Weblogs in den Geistes- und Sozialwissenschaften
    Published: 10 November 2021

    Content notes: free of charge
    Source: GiNDok
    Language: German
    Media type: Conference publication; conferenceObject
    Format: Online
    DDC classification: Language (400); Linguistics (410); Literature and rhetoric (800)
    Keywords: Science communication; Weblog; Humanities; Social sciences
    License:

    creativecommons.org/licenses/by/2.0/de/deed.de ; info:eu-repo/semantics/openAccess

  3. Fachinformationsdienst Linguistik zwischen Innovation und Tradition
  4. Predicting sentiments and space in Swiss literature using BERT and Prodigy

    Grisot G, Pennino F, Herrmann JB. Predicting sentiments and space in Swiss literature using BERT and Prodigy. Presented at the CHR2023 - 3rd Conference on Computational Humanities Research, Antwerp. ; Thanks to the development of new powerful technologies for computational data analysis, an increasing number of researchers have investigated sentiment in texts, making use of traditional corpus linguistic approaches as well as machine learning tools. When considering literary texts, however, sentiment analysis is still in its infancy, especially when it focuses on languages other than English [1]. Crucially, only very few studies so far have related the representation of sentiment and emotions to that of space. This has depended partly on the limited amount of literary texts available digitally and partly on the challenges of defining and identifying space in literature. Emotions and space are however central to the experience of literary narrative [2, 3, 4], and recent advances in their systematic, quantitative analysis have been made within computational literary studies [5, 6, 7]. Using lexicon-based methods, Grisot and Herrmann [8] investigated emotions and sentiments in relation to the representation of literary space, looking in particular at the differences between the rural and urban landscapes portrayed in a corpus of Swiss novels written in German. The present paper takes a step forward, building on their data and using manual annotation and advanced machine learning methods to train a fine-tuned model, in order to automatically detect and recognise on the one hand sentiment (valence, arousal) and discrete emotions (joy, anger, sadness, disgust, fear, surprise), and on the other spatial entities (named and unnamed), in a historical corpus of Swiss novels. With such a model, we aim at higher levels of lexical coverage and validity when compared to existing results obtained with sentiment lexicons and entity lists.
Using a language model trained on a large corpus (3000+) of German literary texts spanning ...

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400); Literature and rhetoric (800); Germanic languages; German (430); Computer science, information and general works (000)
    Keywords: Sentiment Analysis; Geography of Literature; Machine Learning; BERT; Swiss Literature
    License:

    creativecommons.org/publicdomain/zero/1.0/ ; info:eu-repo/semantics/openAccess

  5. The role of animacy in the nominal possessive constructions of Modern Low Saxon
    Published: 2005

    The dialects of modern Low Saxon have multiple nominal constructions to express possession. The noun phrases in (1) – (4) exemplify these four different constructions.

    (1) sien Huus (his house) "his house"
    (2) Anna ehr Huus (Anna her house) "Anna's house"
    (3) Oma's Huus (grandma=POSS house) "grandma's house"
    (4) dat Huus vun de CDU (the house of the CDU) "the house of the Christian Democrats"

    After establishing these four constructions as a case of syntactic alternation using authentic examples taken from a one million word corpus of Low Saxon texts, I discuss the role of animacy in the domain of nominal possessive constructions. I examine animacy per se and several additional factors that have been connected to animacy in the literature such as concreteness, person, and number. Moreover, I also take a look at the possessive relation between possessor and possessum and discuss how it is connected to animacy. My research shows that animacy plays a very important role for the choice of possessive construction in Low Saxon and that the animacy level of the possessor is much more important than the animacy level of the possessum. The three prenominal constructions (1) – (3) in which the possessor phrase precedes the possessum phrase are generally used with more animate possessors than the postnominal construction in (4). This is in line with the few comments on these constructions given in descriptive grammars on Low Saxon, e.g. Saltveit (1983), and with results of previous studies on the possessive alternation in English (e.g. Leech et al. 1994 and Rosenbach 2002). The fact that the prenominal possessive constructions in (1) – (3) pattern more closely together than each does with the postnominal possessive construction in (4) lends further support to Rosenbach's theory for English that the influence of animacy on the choice of possessive construction is based on the different linear order of possessor and possessum.
However, I will also show that there are some differences between the prenominal constructions ...

     

    Source: BASE Fachausschnitt Germanistik
    Language: German; English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400); Germanic languages; German (430); Other Germanic languages (439)
    License:

    cc_by_nd

  6. Analyzing domain specific word embeddings for a large corpus of contemporary German. International Corpus Linguistics Conference, Cardiff, Wales, UK, July 22-26, 2019
    Published: 2019
    Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    Distributional models of word use constitute an indispensable tool in corpus based lexicological research for discovering paradigmatic relations and syntagmatic patterns (Belica et al. 2010). Recently, word embeddings (Mikolov et al. 2013) have revived the field by allowing to construct and analyze distributional models on very large corpora. This is accomplished by reducing the very high dimensionality of word cooccurrence contexts, the size of the vocabulary, to few dimensions, such as 100-200. However, word use and meaning can vary widely along dimensions such as domain, register, and time, and word embeddings tend to represent only the most prevalent meaning. In this paper we thus construct domain specific word embeddings to allow for systematically analyzing variations in word use. Moreover, we also demonstrate how to reconstruct domain specific co-occurrence contexts from the dense word embeddings.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Corpus; Phrase <syntagma>; Automatic language analysis; German
    License:

    creativecommons.org/licenses/by/4.0/deed.de ; info:eu-repo/semantics/openAccess

  7. A "polyglottal" speech synthesis - modifications for a replica of Kempelen's speaking machine
    Published: 2019
    Publisher: Dresden : TUDpress

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Kempelen, Wolfgang von; Speaking machine; Automatic speech production
    License:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  8. cmc-core: a basic schema for encoding CMC corpora in TEI
    Published: 2019
    Publisher: Cergy-Pontoise, France : Cergy-Pontoise University, France

    Since 2013 representatives of several French and German CMC corpus projects have developed three customizations of the TEI-P5 standard for text encoding in order to adapt the encoding schema and models provided by the TEI to the structural peculiarities of CMC discourse. Based on the three schema versions, a 4th version has been created which takes into account the experiences from encoding our corpora and which is specifically designed for the submission of a feature request to the TEI council. On our poster we would present the structure of this schema and its relations (commonalities and differences) to the previous schemas.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Corpus; Text Encoding Initiative; German; English
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  9. A corpus-based lexical resource of spoken German in interaction
    Published: 2019
    Publisher: Brno, Czech Republic : Lexical Computing CZ s.r.o.

    This paper presents the prototype of a lexicographic resource for spoken German in interaction, which was conceived within the framework of the LeGeDe-project (LeGeDe=Lexik des gesprochenen Deutsch). First of all, it summarizes the theoretical and methodological approaches that were used for the initial planning of the resource. The headword candidates were selected by analyzing corpus-based data. Therefore, the data of two corpora (written and spoken German) were compared with quantitative methods. The information that was gathered on the selected headword candidates can be assigned to two different sections: meanings and functions in interaction. Additionally, two studies on the expectations of future users towards the resource were carried out. The results of these two studies were also taken into account in the development of the prototype. Focusing on the presentation of the resource’s content, the paper shows both the different lexicographical information in selected dictionary entries, and the information offered by the provided hyperlinks and external texts. As a conclusion, it summarizes the most important innovative aspects that were specifically developed for the implementation of such a resource.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Computer-assisted lexicography; Spoken language; Corpus; German
    License:

    creativecommons.org/licenses/by-sa/4.0/ ; info:eu-repo/semantics/openAccess

  10. The microstructure of Online Linguistics Dictionaries: obligatory and facultative elements
    Published: 2019
    Publisher: Ljubljana : Trojina, Institute for Applied Slovene Studies

    The planning of a dictionary should consider both theoretical and empirical aspects, for both its macro- and microstructure; this is also true for Online Specialized Dictionaries of Linguistics. In particular, the microstructure should be standardized and structured so as to fit with the primary and secondary functions of a dictionary. Unfortunately, empirical studies that investigate Online Specialized Dictionaries of Linguistics are rare, making it unclear which microstructural elements are obligatory and which are facultative. This article will present and comment upon the results of an investigation into a corpus of Online Specialized Dictionaries of Linguistics, focusing attention on these aspects and also the most important theoretical issues. An example taken from DIL, a German-Italian Online Dictionary of Linguistics, will end the article.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Computer-assisted lexicography; Microstructure; Methodology; Specialized language
    License:

    creativecommons.org/licenses/by-sa/4.0/ ; info:eu-repo/semantics/openAccess

  11. Deep learning for free indirect representation
    Published: 2019
    Publisher: München [et al.] : German Society for Computational Linguistics & Language Technology und Friedrich-Alexander-Universität Erlangen-Nürnberg

    In this paper, we present our work-in-progress to automatically identify free indirect representation (FI), a type of thought representation used in literary texts. With a deep learning approach using contextual string embeddings, we achieve F1 scores between 0.45 and 0.5 (sentence-based evaluation for the FI category) on two very different German corpora, a clear improvement on earlier attempts for this task. We show how consistently marked direct speech can help in this task. In our evaluation, we also consider human inter-annotator scores and thus address measures of certainty for this difficult phenomenon.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: German; Indirect speech; Free indirect speech; Automatic language analysis; Corpus
    License:

    creativecommons.org/licenses/by-nc-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

  12. Detecting the boundaries of sentence-like units on spoken German
    Published: 2019
    Publisher: München [et al.] : German Society for Computational Linguistics & Language Technology und Friedrich-Alexander-Universität Erlangen-Nürnberg

    Automatic division of spoken language transcripts into sentence-like units is a challenging problem, caused by disfluencies, ungrammatical structures and the lack of punctuation. We present experiments on dividing up German spoken dialogues where we investigate the impact of task setup and data representation, encoding of context information as well as different model architectures for this task.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: German; Spoken language; Automatic language analysis; Segmentation; Sentence
    License:

    creativecommons.org/licenses/by-nc-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

  13. Overview of GermEval Task 2, 2019 shared task on the identification of offensive language
    Published: 2019
    Publisher: München [et al.] : German Society for Computational Linguistics & Language Technology und Friedrich-Alexander-Universität Erlangen-Nürnberg

    We present the second edition of the GermEval Shared Task on the Identification of Offensive Language. This shared task deals with the classification of German tweets from Twitter. Two subtasks were continued from the first edition, namely a coarse-grained binary classification task and a fine-grained multi-class classification task. As a novel subtask, we introduce the classification of offensive tweets as explicit or implicit. The shared task had 13 participating groups submitting 28 runs for the coarse-grained task, another 28 runs for the fine-grained task, and 17 runs for the implicit-explicit task. We evaluate the results of the systems submitted to the shared task. The shared task homepage can be found at projects.fzai.h-da.de/iggsa/

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: German; Insult; Social media; Twitter <software platform>; Tweet; Automatic speech recognition
    License:

    creativecommons.org/licenses/by-nc-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

  14. A descriptive analysis of a German corpus annotated with opinion sources and targets
    Published: 2019
    Publisher: München [et al.] : German Society for Computational Linguistics & Language Technology und Friedrich-Alexander-Universität Erlangen-Nürnberg

    We present a descriptive analysis on the two datasets from the shared task on Source, Subjective Expression and Target Extraction from Political Speeches (STEPS), the only existing German dataset for opinion role extraction of its size. Our analysis discusses the individual properties of the three components, subjective expressions, sources and targets and their relations towards each other. Our observations should help practitioners and researchers when building a system to extract opinion roles from German data.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: German; Political language; Spoken language; Propositional attitude; Automatic language analysis
    License:

    creativecommons.org/licenses/by-nc-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

  15. A Supervised learning approach for the extraction of opinion sources and targets from German text
    Published: 2019
    Publisher: München [et al.] : German Society for Computational Linguistics & Language Technology und Friedrich-Alexander-Universität Erlangen-Nürnberg

    We present the first systematic supervised learning approach for the extraction of opinion sources and targets on German language data. A wide choice of different features is presented, particularly syntactic features and generalization features. We point out specific differences between opinion sources and targets. Moreover, we explain why implicit sources can be extracted even with fairly generic features. In order to ensure comparability our classifier is trained and tested on the dataset of the STEPS shared task.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: German; Semantic analysis; Propositional attitude; Automatic language analysis
    License:

    creativecommons.org/licenses/by-nc-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

  16. A web of loans: multilingual loanword lexicography with property graphs
    Published: 2019
    Publisher: Brno, Czech Republic : Lexical Computing CZ s.r.o.

    The Lehnwortportal Deutsch (2012 seqq.) serves as an integrated online information system on German lexical borrowings into other languages, synthesizing an increasing number of lexicographical dictionaries and providing basic cross-resource search options. The paper discusses the far-reaching revision of the system’s conceptual, lexicographical and technological underpinnings currently under way, focussing on their relevance for multilingual loanword lexicography.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Loanword
    License:

    creativecommons.org/licenses/by-sa/4.0/ ; info:eu-repo/semantics/openAccess

  17. How much “tourism” is there in dictionary apps? An empirical study of lexicographical resources on mobile devices (German, Italian, Spanish)
    Published: 2019
    Publisher: Brno, Czech Republic : Lexical Computing CZ s.r.o.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Online dictionary; Computer-assisted lexicography; Tourism; German; Italian; Spanish
    License:

    creativecommons.org/licenses/by-sa/4.0/ ; info:eu-repo/semantics/openAccess

  18. From thousands of graphics to one conclusion. Visualization of the vocabulary of quotation expressions
    Published: 2019
    Publisher: Brno, Czech Republic : Lexical Computing CZ s.r.o.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Corpus; Visualization; Phraseology
    License:

    creativecommons.org/licenses/by-sa/4.0/ ; info:eu-repo/semantics/openAccess

  19. A corpus-based lexical resource of spoken German in interaction
    Published: 2019
    Publisher: Brno, Czech Republic : Lexical Computing CZ s.r.o.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Computer-assisted lexicography; Spoken language; Corpus
    License:

    creativecommons.org/licenses/by-sa/4.0/ ; info:eu-repo/semantics/openAccess

  20. Präskriptive Terminologiearbeit optimieren: Potenziale der deskriptiven Phase gezielt nutzen
    Published: 2019
    Publisher: Stuttgart : tcworld

    Source: BASE Fachausschnitt Germanistik
    Language: German
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Terminology; Visualization; Database; Methodology; Specialized language
    License:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  21. CLARIN Web Services for TEI-annotated Transcripts of Spoken Language
    Published: 2020
    Publisher: Utrecht : CLARIN

    We present web services implementing a workflow for transcripts of spoken language following TEI guidelines, in particular ISO 24624:2016 "Language resource management - Transcription of spoken language". The web services are available at our website and will be available via the CLARIN infrastructure, including the Virtual Language Observatory and WebLicht.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Text Encoding Initiative; Spoken language; Transcription; Computational linguistics; Web services
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  22. Uralic multimedia corpora: ISO/TEI corpus data in the project INEL
    Published: 2020
    Publisher: Stroudsburg, PA : Association for Computational Linguistics

    In this paper, we describe a data processing pipeline used for annotated spoken corpora of Uralic languages created in the INEL (Indigenous Northern Eurasian Languages) project. With this processing pipeline we convert the data into a loss-less standard format (ISO/TEI) for long-term preservation while simultaneously enabling a powerful search in this version of the data. For each corpus, the input we are working with is a set of files in EXMARaLDA XML format, which contain transcriptions, multimedia alignment, morpheme segmentation and other kinds of annotation. The first step of processing is the conversion of the data into a certain subset of TEI following the ISO standard ’Transcription of spoken language’ with the help of an XSL transformation. The primary purpose of this step is to obtain a representation of our data in a standard format, which will ensure its long-term accessibility. The second step is the conversion of the ISO/TEI files to a JSON format used by the “Tsakorpus” search platform. This step allows us to make the corpora available through a web-based search interface. As an addition, the existence of such a converter allows other spoken corpora with ISO/TEI annotation to be made accessible online in the future. ; Tässä paperissa kuvataan aineistonnprosessointimenetelmä joka on käytössä uralilaisten puhuttujen korpusten luonnissa kieltedokumentointiprojekti INELissä. Prosessointimenetelmää käytetään konvertoimaan dataa häviöttömään ISO/TEI- standardiformaattiin pitkän aikavälin säilytystä varten sekä samanaikaisesti tehokkaisiin hakutoimintoihin tälle akineistoversiolle. Jokaisen korpuksen lähtöaineistona on joukko tiedostoja EXMARaLDAn XML-formaatissa, joka sisältää transkriptejä, multimediaa kohdennuksineen, morfeemijäsennyksiä ja muita annotaatiota. Ensimmäinen käsittelyaskel on aineiston konvertointi TEI:n osajoukkoon, joka muodostaa ISO-standardin puhutun kielen transkripteille, XSL-transformaatioita käyttäen. 
Tämän askelen ensisijainen tarkoitus on saada aineisto sellaiseen ...

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Computational linguistics; Uralic languages; Corpus; Text Encoding Initiative; Spoken language; Annotation
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  23. WebAnno-MM: EXMARaLDA meets WebAnno
    Published: 2020
    Publisher: Linköping : Linköping University Electronic Press

    In this paper, we present WebAnno-MM, an extension of the popular web-based annotation tool WebAnno, which is designed for the linguistic annotation of transcribed spoken data with time aligned media files. Several new features have been implemented for our current use case: a novel teaching method based on pair-wise manual annotation of transcribed video data and systematic comparison of agreement between students. To enable the annotation of transcribed spoken language data, apart from technical and data model related challenges, WebAnno-MM offers an additional view to data: a (musical) score view for the inspection of parallel utterances, which is relevant for various methodological research questions regarding the analysis of interactions of spoken content.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Annotation; Transcription; Multimedia; Computational linguistics
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  24. Interaction of technology and methodology in building and sharing an annotated learner corpus of spoken German
    Published: 2020
    Publisher: València : Editorial Universitat Politècnica de València

    This paper discusses the technological and methodological challenges in creating and sharing HAMATAC, the Hamburg Map Task Corpus. The first version of the corpus, consisting of 24 recordings with orthographic transcriptions and metadata, is publicly available. A second version featuring different types of linguistic annotation is in progress. I will describe how the various software tools and data formats of the EXMARaLDA system were used for transcription and multi-level annotation, to compile recordings and transcriptions into a corpus and manage metadata, and to publish the corpus, and how they can be used for carrying out corpus queries (KWIC) and analyses. Some recurrent issues in corpus building and sharing and the interaction of technological and methodological aspects will be illustrated using HAMATAC.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Spoken language; Annotation; Transcription; Corpus; Methodology
    License: rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  25. Addressing Cha(lle)nges in Long-Term Archiving of Large Corpora
    Published: 2020
    Publisher: Paris : European Language Resources Association

    This paper addresses long-term archiving of large corpora. It focuses on three aspects specific to language resources: (1) the removal of resources for legal reasons, (2) versioning of (unchanged) objects in constantly growing resources, especially where objects can be part of multiple releases but also part of different collections, and (3) the conversion of data to new formats for digital preservation. It explains why language resources may have to be changed and why formats may need to be converted. As a solution, it suggests the use of an intermediate proxy object called a signpost. The approach is exemplified with the corpora of the Leibniz Institute for the German Language in Mannheim, namely the German Reference Corpus (DeReKo) and the Archive for Spoken German (AGD).

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Corpus; Long-term archiving; Usage rights; File format
    License: creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess