Results for *

13 results found.

Showing results 1 to 13 of 13.

  1. Das Referenzkorpus Altdeutsch. Das Konzept, die Realisierung und die neuen Möglichkeiten
    Published: 2024
    Publisher: Tübingen : Narr ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

  2. A BLARK extension for temporal annotation mining
    Published: 2024
    Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    The Basic Language Resource Kit (BLARK) proposed by Krauwer is designed for the creation of initial textual resources. There are a number of toolkits for the development of spoken language resources and systems, but tools for second level resources, that is, resources which are the result of processing primary level speech resources such as speech recordings, are lacking. Typically, processing of this kind in phonetics is done manually, with the aid of spreadsheets and multi-purpose statistics software. We propose a Basic Language and Speech Kit (BLAST) as an extension to BLARK and suggest a strategy for integrating the kit into the Natural Language Toolkit (NLTK). The prototype kit is evaluated in an application to examining temporal properties of spoken Brazilian Portuguese.

    Source: BASE (German studies subset)
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Annotation; Data mining; Spoken language; Phonetics; Rhythm
    License: creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess

  3. Building a historical corpus for Classical Portuguese: some technological aspects
    Published: 2024
    Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This paper describes the restructuring process of a large corpus of historical documents and the system architecture that is used for accessing it. The initial challenge of this process was to get the most out of existing material, normalizing the legacy markup and harvesting the inherent information using widely available standards. This resulted in a conceptual and technical restructuring of the formerly existing corpus. The development of the standardized markup and techniques allowed the inclusion of important new materials, such as original 16th and 17th century prints and manuscripts, and enlarged the potential user groups. On the technological side, we worked from the premise that open standards are the best way of making sure that the resources will be accessible even after years in an archive. This is a welcome result in view of the additional consequence of the remodeled corpus concept: it serves as a repository for important historical documents, some of which had been preserved for 500 years in paper format. This very rich material can from now on be handled freely for linguistic research goals.

    Source: BASE (German studies subset)
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Corpus; Portuguese; Archiving; Annotation; Metadata; Language data; Computational linguistics
    License: creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess

  4. CoGesT: a formal transcription system for conversational gesture
    Published: 2024
    Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    In order to create reusable and sustainable multimodal resources, a transcription model for hand and arm gestures in conversation is needed. We argue that transcription systems so far developed for sign language transcription and psychological analysis are not suitable for the linguistic analysis of conversational gesture. Such a model must adhere to a strict form-function distinction and be both computationally explicit and compatible with descriptive notations such as feature structures in other areas of computational and descriptive linguistics. We describe the development and evaluation of a suitable formal model using a feature-based transcription system, concentrating as a first step on arm gestures within the context of the development of an annotated video resource and gesture lexicon.

    Source: BASE (German studies subset)
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Transcription; Body language; Conversation; Gesture; Computational linguistics; Annotation; Conversation analysis
    License: creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess

  5. Consistent storage of metadata in inference lexica: the MetaLex approach
    Published: 2024
    Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    With MetaLex we introduce a framework for metadata management where information can be inferred from different areas of metadata coding, such as metadata for catalogue descriptions, linguistic levels, or tiers. This is done for consistency and efficiency in metadata recording and applies the same inference techniques that are used for lexical inference. For this purpose we motivate the need for metadata descriptions on all document levels, describe the different structures of metadata, use existing metadata recommendations on different levels of annotations, and show a use case of metadata inference.

    Source: BASE (German studies subset)
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Metadata; Inference; Lexicon; Annotation; Computational linguistics
    License: creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess

  6. Building bridges. Reconstructing implicit information in argumentative texts using commonsense knowledge
    Author: Becker, Maria
    Published: 2024
    Publisher: Mannheim : IDS-Verlag ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    In many argumentative texts a substantial amount of knowledge remains implicit. This implicit knowledge is often crucial for a deep understanding and correct interpretation of arguments. In this work we investigate how to automatically reconstruct implicit knowledge in argumentative texts, and how the reconstruction of implicit knowledge can help in improving computational argument analysis. We point out that knowledge which stays implicit can in most cases be framed as commonsense knowledge, which has been shown to be helpful for solving many Natural Language Processing (NLP) tasks. However, it has not yet been leveraged for an in-depth analysis of arguments. This work addresses this research gap by integrating commonsense knowledge in computational argument analysis. We explore ways to fill implicit knowledge gaps in arguments automatically by utilizing commonsense knowledge, in order to build bridges between argumentative sentences, with the ultimate goal of improving argument analysis.

    Source: BASE (German studies subset)
    Language: English
    Media type: Book (monograph)
    Format: Online
    DDC classification: Language (400)
    Keywords: Argumentation; Common sense; Implicit knowledge; Annotation; Automatic language analysis
    License: creativecommons.org/licenses/by-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

  7. Annotation driven concordancing: the PAX toolkit
    Published: 2024
    Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    We describe PAX, "Portable Audio Concordance System", a proof-of-concept prototype of a multipurpose, multilingual audio concordance toolkit. The primary goal is to support efficient grammar and lexicon construction in the documentation of unwritten languages; languages currently included are Ega, Anyi, and Koulango (Ivory Coast), with additional samples in German and English. The approach combines methods from corpus linguistics, annotation theory and practice, phonetics and lexicography.

    Source: BASE (German studies subset)
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Annotation; Concordance; Corpus; Phonetics; Lexicography; XML; Spoken language; Multimodal system
    License: creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess

  8. Sprachressourcen in der Standardisierung
    Published: 2024
    Publisher: Hildesheim : Gesellschaft für Sprachtechnologie und Computerlinguistik ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    We report on international standardization work in the field of language resources. The standards are developed by international working groups within the International Organization for Standardization (ISO) and are accompanied and discussed at the national level by corresponding groups, coordinated in Germany by the Deutsches Institut für Normung (DIN). For years, automatic language processing has had a growing need for electronic resources: lexica, corpora, grammars, annotation conventions, language data collections, and so on. So that such resources can be reused beyond a single application context and exchanged between working groups, work is under way to standardize their representation formats and the vocabularies that can be used to describe resource contents (data categories). Whereas past standardization efforts were restricted to particular parts of the spectrum of linguistic descriptions of resources (e.g. the EU projects SAM in the area of spoken language, and EAGLES and ISLE in the areas of morphosyntax, syntax, lexical semantics in texts and lexica, and language technology), the aims of the ISO (TC37SC4) and DIN (NAT AA6) working groups, founded in 2002 and 2003 respectively, are broader: they concern meta-guidelines for the representation and annotation of texts as well as data categories for lexica, morphological and morphosyntactic analysis, and so on. We describe the current state of the standardization discussion.

    Source: BASE (German studies subset)
    Language: German
    Media type: Journal article
    Format: Online
    DDC classification: Language (400)
    Keywords: Standardization; Language processing; Annotation; Data; Online resource
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  9. Metadata for time aligned corpora
    Published: 2024
    Publisher: Luxembourg : European Language Resources Association ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    For a detailed description of time aligned corpora, for example spoken language corpora and multimodal corpora, specific metadata categories are necessary, extending the scope of traditional metadata categories. We argue that it is necessary to allow metadata on all levels of annotation, i.e. on a general level for catalogues, on the session level for each recording, on the annotation level for multi tier score annotation, even on the level of individual annotation segments. We use existing standards where they allow this distinction and introduce metadata categories for the layer level.

    Source: BASE (German studies subset)
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Metadata; Corpus; Spoken language; Annotation; Multimodal system
    License: rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  10. Detecting impact relevant sections in scientific research
    Published: 2024
    Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    Impact assessment is an evolving area of research that aims at measuring and predicting the potential effects of projects or programs on a variety of stakeholders. While measuring the impact of scientific research is a vibrant subdomain of impact assessment, a recurring obstacle in this specific area is the lack of an efficient framework that facilitates labeling and analysis of lengthy reports. To address this issue, we propose, implement, and evaluate a framework for automatically assessing the impact of scientific research projects by identifying pertinent sections in research reports that indicate potential impact. We leverage a mixed-method approach that combines manual annotation with supervised machine learning to extract these passages from project reports. We experiment with different machine learning algorithms, including traditional statistical models as well as pre-trained transformer language models. Our results show that our proposed method achieves accuracy scores up to 0.81, and that our method is generalizable to scientific research from different domains and different languages.

    Source: BASE (German studies subset)
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Research; Annotation; Impact analysis; Language data; Infrastructure
    License: creativecommons.org/licenses/by-nc/4.0/deed.de ; info:eu-repo/semantics/openAccess

  11. Out of the mouths of MPs: Speaker attribution in parliamentary debates
    Published: 2024
    Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This paper presents GePaDeSpkAtt, a new corpus for speaker attribution in German parliamentary debates, with more than 7,700 manually annotated events of speech, thought and writing. Our role inventory includes the sources, addressees, messages and topics of the speech event and also two additional roles, medium and evidence. We report baseline results for the automatic prediction of speech events and their roles, with high scores for both event triggers and roles. Then we apply our model to predict speech events in 20 years of parliamentary debates and investigate the use of factives in the rhetoric of MPs.

    Source: BASE (German studies subset)
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Political speech; Annotation; Parliamentary debate; Politician; Opposition
    License: creativecommons.org/licenses/by-nc/4.0/deed.de ; info:eu-repo/semantics/openAccess

  12. Das kleine Wörterbuch der Redeeinleiter
    Published: 2024
    Publisher: Zenodo ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

  13. Investigating reply relations on Wikipedia talk pages to reconstruct interactional strategies of Wikipedia authors
    Published: 2024
    Publisher: Amsterdam/Philadelphia : Benjamins ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication]

    This chapter presents the annotation and analysis of interpretative reply relations on Wikipedia talk pages using data from the WikiDemoCorpus (WDC). Building on an approach of annotating interpretative reply relations to analyze these relations in Wikipedia talk page posts, the chapter presents nine reply relation categories found in the German WDC. Additionally, linguistic cues for each category and the Wikipedia discussion pages overall are explained in detail, illustrated through reply relation targets. The results of the linguistic annotation are threefold: First, we provide an annotation scheme that can be used by third parties to produce more data according to their needs. Second, we shed light on and quantify the numerous ways Wikipedia authors reply to each other’s posts on talk pages. Finally, we provide richly annotated data that can be used for further analyses, such as identifying interactional relations on higher levels or training tasks in machine learning algorithms.

    Source: BASE (German studies subset)
    Language: English
    Media type: Contribution to an edited volume
    Format: Online
    DDC classification: Language (400)
    Keywords: Corpus; Wikipedia; Interaction; Annotation; German; Computer-mediated communication
    License: rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess