Results for *

22 results were found.

Showing results 1 to 22 of 22.

  1. What is the Text Encoding Initiative?
    Author: Burnard, Lou
    Published: 2014
    Publisher: OpenEdition Press, [s.l.]

    Europa-Universität Viadrina, Universitätsbibliothek
    unrestricted interlibrary loan, copying and lending
    Full text (free of charge)
    Source: Union catalogues
    Language: English
    Media type: Ebook
    Format: Online
    ISBN: 9782821834590
    RVK classification: ES 935
    Keywords: digital editing, digital humanities, TEI, text encoding, XML; Text Encoding Initiative; markup language; electronic publication
    Extent: 1 online resource (114 pp.)
    Note(s):

    The Text Encoding Initiative (TEI) Guidelines have long been regarded as the de facto standard for the preparation of digital textual resources in the scholarly research community. For the beginner, they offer a daunting range of possibilities, reflecting the huge range of potential applications for text encoding, from traditional scholarly editions, to language corpora, historical lexicons, digital archives and beyond. Drawing on many examples of TEI-encoded text from a variety of research domains, this simple and straightforward book is intended to help the beginner make their own choices from the full range of TEI options. It explains the XML technology used by the TEI in language accessible to the non-technical reader and provides a guided tour of the many parts of the TEI universe, and how it may be customized to suit an individual project's needs.

  2. Making great work even better : Appraisal and Digital Curation of widely dispersed Electronic Textual Resources (c. 15th–19th cent.) in CLARIN-D
    Published: 2012
    Publisher: Berlin-Brandenburgische Akademie der Wissenschaften, Berlin

  3. What is the Text Encoding Initiative?
    how to add intelligent markup to digital resources
    Author: Burnard, Lou
    Published: 2014
    Publisher: OpenEdition Press, Marseille

    Universitätsbibliothek Clausthal
    no interlibrary loan
    Universitätsbibliothek Mannheim
    no interlibrary loan
    Universitätsbibliothek Mannheim
    no interlibrary loan
    Universitätsbibliothek Mannheim
    no interlibrary loan
    Full text (free of charge)
    Source: Union catalogues
    Language: English
    Media type: Ebook
    Format: Online
    ISBN: 9782821834606
    RVK classification: ES 935
    Series: Encyclopédie numérique ; 3
    Keywords: Text Encoding Initiative; electronic publication; markup language
    Extent: 1 online resource (113 pp.)

  4. What is the Text Encoding Initiative?
    Author: Burnard, Lou
    Published: 2014
    Publisher: OpenEdition Press, [s.l.]

    Hochschulbibliothek Ansbach
    unrestricted interlibrary loan, copying and lending
    Staatliche Bibliothek, Schloßbibliothek
    unrestricted interlibrary loan, copying and lending
    Universitätsbibliothek Augsburg
    unrestricted interlibrary loan, copying and lending
    Staatsbibliothek Bamberg
    unrestricted interlibrary loan, copying and lending
    Universitätsbibliothek Bamberg
    unrestricted interlibrary loan, copying and lending
    Universitätsbibliothek Bayreuth
    unrestricted interlibrary loan, copying and lending
    Landesbibliothek Coburg
    unrestricted interlibrary loan, copying and lending
    Universitätsbibliothek Eichstätt-Ingolstadt
    unrestricted interlibrary loan, copying and lending
    Universitätsbibliothek Erlangen-Nürnberg, Hauptbibliothek
    unrestricted interlibrary loan, copying and lending
    Kunsthistorisches Institut in Florenz, Max-Planck-Institut, Bibliothek
    Hochschule Weihenstephan-Triesdorf, Zentralbibliothek
    no lending of volumes; only paper copies are sent
    Hochschulbibliothek Ingolstadt
    unrestricted interlibrary loan, copying and lending
    Hochschule Landshut, Hochschule für Angewandte Wissenschaften, Bibliothek
    unrestricted interlibrary loan, copying and lending
    Bayerische Staatsbibliothek
    unrestricted interlibrary loan, copying and lending
    Deutsches Museum, Bibliothek
    no lending of volumes; only paper copies are sent
    Hochschule München, Bibliothek
    unrestricted interlibrary loan, copying and lending
    Landesamt für Digitalisierung, Breitband und Vermessung, Bibliothek
    no interlibrary loan
    Technische Universität München, Universitätsbibliothek
    unrestricted interlibrary loan, copying and lending
    Universitätsbibliothek der LMU München
    unrestricted interlibrary loan, copying and lending
    Zentralinstitut für Kunstgeschichte, Bibliothek
    no lending of volumes; only paper copies are sent
    Hochschule für angewandte Wissenschaften Neu-Ulm, Hochschulbibliothek
    unrestricted interlibrary loan, copying and lending
    Technische Hochschule Nürnberg Georg Simon Ohm, Bibliothek
    no lending of volumes; only paper copies are sent
    Deutsches Forum für Kunstgeschichte, Bibliothek
    Universitätsbibliothek Passau
    unrestricted interlibrary loan, copying and lending
    Staatliche Bibliothek Regensburg
    unrestricted interlibrary loan, copying and lending
    Universitätsbibliothek Regensburg
    unrestricted interlibrary loan, copying and lending
    Bibliotheca Hertziana - Max-Planck-Institut für Kunstgeschichte
    Universitätsbibliothek Würzburg
    unrestricted interlibrary loan, copying and lending
    Full text (free of charge)
    Source: Union catalogues
    Language: English
    Media type: Ebook
    Format: Online
    ISBN: 9782821834590
    RVK classification: ES 935
    Keywords: digital editing, digital humanities, TEI, text encoding, XML; Text Encoding Initiative; markup language; electronic publication
    Extent: 1 online resource (114 pp.)
    Note(s):

    The Text Encoding Initiative (TEI) Guidelines have long been regarded as the de facto standard for the preparation of digital textual resources in the scholarly research community. For the beginner, they offer a daunting range of possibilities, reflecting the huge range of potential applications for text encoding, from traditional scholarly editions, to language corpora, historical lexicons, digital archives and beyond. Drawing on many examples of TEI-encoded text from a variety of research domains, this simple and straightforward book is intended to help the beginner make their own choices from the full range of TEI options. It explains the XML technology used by the TEI in language accessible to the non-technical reader and provides a guided tour of the many parts of the TEI universe, and how it may be customized to suit an individual project's needs.

  5. What is the Text Encoding Initiative?
    how to add intelligent markup to digital resources
    Author: Burnard, Lou
    Published: 2014
    Publisher: OpenEdition Press, Marseille

    Universitätsbibliothek Augsburg
    unrestricted interlibrary loan, copying and lending
    Universitätsbibliothek Passau
    unrestricted interlibrary loan, copying and lending
    Full text (free of charge)
    Source: Union catalogues
    Language: English
    Media type: Ebook
    Format: Online
    ISBN: 9782821834606; 9782821834590
    RVK classification: ES 935
    Series: Encyclopédie numérique ; 3
    Keywords: markup language; Text Encoding Initiative; electronic publication
    Extent: 1 online resource

  6. CLARIN Web Services for TEI-annotated Transcripts of Spoken Language
    Published: 2020
    Publisher: Linköping : Linköping University Electronic Press

    We present web services which implement a workflow for transcripts of spoken language following the TEI guidelines, in particular ISO 24624:2016 “Language resource management – Transcription of spoken language”. The web services are available at our website and will be available via the CLARIN infrastructure, including the Virtual Language Observatory and WebLicht.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Article in an edited volume
    Format: Online
    DDC classification: Language (400)
    Keywords: Text Encoding Initiative; corpus; data management; spoken language; web services; annotation
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  7. Towards comprehensive definitions of data quality for audiovisual annotated language resources
    Published: 2021
    Publisher: Linköping : Linköping University Electronic Press

    Though digital infrastructures such as CLARIN have been successfully established and now provide large collections of digital resources, the lack of widely accepted standards for data quality and documentation still makes re-use of research data a difficult endeavour, especially for more complex resource types. The article gives a detailed overview over relevant characteristics of audiovisual annotated language resources and reviews possible approaches to data quality in terms of their suitability for the current context. Conclusively, various strategies are suggested in order to arrive at comprehensive and adequate definitions of data quality for this specific resource type and possibly for digital language resources in general.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: data quality; audiovisual material; corpus; Text Encoding Initiative; metadata; research data; data management
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  8. The TEI-based ISO standard “Transcription of Spoken Language” as an exchange format within CLARIN and beyond
    Published: 2021
    Publisher: Utrecht : CLARIN ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This paper describes the TEI-based ISO standard 24624:2016 “Transcription of spoken language” and other formats used within CLARIN for spoken language resources. It assesses the current state of support for the standard and the interoperability between these formats and with relevant tools and services. The main idea behind the paper is that a digital infrastructure providing language resources and services to researchers should also allow the combined use of resources and/or services from different contexts. This requires syntactic and semantic interoperability. We propose a solution based on the ISO/TEI format and describe the necessary steps for this format to work as an exchange format with basic semantic interoperability for spoken language resources across the CLARIN infrastructure and beyond.

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: ISO standard; oral communication; transcription; Text Encoding Initiative; corpus; computational linguistics; data management
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  9. Smart dictionary editing with LeXmart
    Published: 2022
    Publisher: Mannheim : IDS-Verlag ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    Given the relevance of interoperability, born-digital lexicographic resources as well as legacy retro-digitised dictionaries have been using structured formats to encode their data, following guidelines such as the Text Encoding Initiative or the newest TEI Lex-0. While this new standard is being defined in a stricter approach than the original TEI dictionary schema, its reuse of element names for several types of annotation as well as the highly detailed structure makes it difficult for lexicographers to efficiently edit resources and focus on the real content. In this paper, we present the approach designed within LeXmart to facilitate the editing of TEI Lex-0 encoded resources, guaranteeing consistency through all editing processes.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Article in an edited volume
    Format: Online
    DDC classification: English and Old English (420)
    Keywords: computer-assisted lexicography; digitisation; Text Encoding Initiative
    License: creativecommons.org/licenses/by-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

  10. CMC-core: a schema for the representation of CMC corpora in TEI ; Le CMC-core : un schéma de représentation des corpus de la CMR en TEI
    Published: 2023
    Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    In this paper, we describe a schema and models which have been developed for the representation of corpora of computer-mediated communication (CMC corpora) using the representation framework provided by the Text Encoding Initiative (TEI). We characterise CMC discourse as dialogic, sequentially organised interchange between humans and point out that many features of CMC are not adequately handled by current corpus encoding schemas and tools. We formulate desiderata for a representation of CMC in encoding schemes and argue why the TEI is a suitable framework for the encoding of CMC corpora. We propose a model of basic CMC units (utterances, posts, and nonverbal activities) and the macro- and micro-level structures of interactions in CMC environments. Based on these models, we introduce CMC-core, a TEI customisation for the encoding of CMC corpora, which defines CMC-specific encoding features on the four levels of elements, model classes, attribute classes, and modules of the TEI infrastructure. The description of our customisation is illustrated by encoding examples from corpora by researchers of the TEI SIG CMC, representing a variety of CMC genres, i.e. chat, wiki talk, Twitter, blog, and Second Life interactions. The material described, i.e. schemata, encoding examples, and documentation, is available from the TEI CMC SIG Wiki and will accompany a feature request to the TEI council in late 2019.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Journal article
    Format: Online
    DDC classification: Language (400)
    Keywords: computer-mediated communication; corpus; Text Encoding Initiative; interaction; computational linguistics
    License: rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  11. TEI Feature Structures as a Representation Format for Multiple Annotation and Generic XML Documents
    Published: 2017

    Feature structures are mathematical entities (rooted labeled directed acyclic graphs) that can be represented as graph displays, attribute value matrices or as XML adhering to the constraints of a specialized TEI tag set. We demonstrate that this latter ISO-standardized format can be used as an integrative storage and exchange format for sets of multiple annotation XML documents. This specific domain of application is rooted in the approach of multiple annotations, which marks a possible solution for XML-compliant markup in scenarios with conflicting annotation hierarchies. A more extreme proposal consists in the possible use as a meta-representation format for generic XML documents. For both scenarios our strategy concerning pertinent feature structure representations is grounded on the XDM (XQuery 1.0 and XPath 2.0 Data Model). The ubiquitous hierarchical and sequential relationships within XML documents are represented by specific features that take ordered list values. The mapping to the TEI feature structure format has been implemented in the form of an XSLT 2.0 stylesheet. It can be characterized as exploiting aspects of both the push and pull processing paradigm as appropriate. An indexing mechanism is provided with regard to the multiple annotation documents scenario. Hence, implicit links concerning identical primary data are made explicit in the result format. In comparison to alternative representations, the TEI-based format does well in many respects, since it is both integrative and well-formed XML. However, the result documents tend to grow very large depending on the size of the input documents and their respective markup structure. This may also be considered as a downside regarding the proposed use for generic XML documents. On the positive side, it may be possible to achieve a hookup to methods and applications that have been developed for feature structure representations in the fields of (computational) linguistics and knowledge representation.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Text Encoding Initiative; XML; markup language
    License: rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  12. Lightweight grammatical annotation in the TEI: new perspectives
    Published: 2018
    Publisher: Paris, France : European language resources association (ELRA)

    In mid-2017, as part of our activities within the TEI Special Interest Group for Linguists (LingSIG), we submitted to the TEI Technical Council a proposal for a new attribute class that would gather attributes facilitating simple token-level linguistic annotation. With this proposal, we addressed community feedback complaining about the lack of a specific tagset for lightweight linguistic annotation within the TEI. Apart from @lemma and @lemmaRef, up till now TEI encoders could only resort to using the generic attribute @ana for inline linguistic annotation, or to the quite complex system of feature structures for robust linguistic annotation, the latter requiring relatively complex processing even for the most basic types of linguistic features. As a result, there now exists a small set of basic descriptive devices which have been made available at the cost of only very small changes to the TEI tagset. The merit of a predefined TEI tagset for lightweight linguistic annotation is the homogeneity of tagging and thus better interoperability of simple linguistic resources encoded in the TEI. The present paper introduces the new attributes, makes a case for one more addition, and presents the advantages of the new system over the legacy TEI solutions.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Article in an edited volume
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Text Encoding Initiative; annotation
    License: creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess

  13. TEI-based XML-Applications: Transcriptions
    Author: Witt, Andreas
    Published: 2018

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: computational linguistics; Text Encoding Initiative; SGML; XML
    License: rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  14. Compilation and Annotation of the Discourse-structured Blog Corpus for German
    Published: 2018
    Publisher: Ljubljana : Ljubljana University Press

    The present paper reports the first results of the compilation and annotation of a blog corpus for German. The main aim of the project is the representation of the blog discourse structure and relations between its elements (blog posts, comments) and participants (bloggers, commentators). The data included in the corpus were manually collected from the scientific blog portal SciLogs. The feature catalogue for the corpus annotation includes three types of information which is directly or indirectly provided in the blog or can be construed by means of statistical analysis or computational tools. At this point, only directly available information (e.g., title of the blog post, name of the blogger etc.) has been annotated. We believe our blog corpus can be of interest for the general study of blog structure or related research questions as well as for the development of NLP methods and techniques (e.g. for authorship detection).

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: computer-mediated communication; corpus; Text Encoding Initiative
    License: creativecommons.org/licenses/by-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

  15. Extending the possibilities for collaborative work with TEI/XML through the usage of a wiki system
    Published: 2018
    Publisher: New York : ACM

    This paper presents and discusses an integrated project-specific working environment for editing TEI/XML-files and linking entities of interest to a dedicated wiki system. This working environment has been specifically tailored to the workflow in our interdisciplinary digital humanities project GeoBib. It addresses some challenges that arose while working with person-related data and geographical references in a growing collection of TEI/XML-files. While our current solution provides some essential benefits, we also discuss several critical issues and challenges that remain.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: XML; Text Encoding Initiative; corpus; digital humanities
    License: rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  16. Reply relations in CMC: types and annotation
    Published: 2018
    Publisher: Antwerpen : University of Antwerp

    This paper analyses reply relations in computer-mediated communication (CMC), which occur between post units in CMC interactions and which describe references between posts. We take a look at existing practices in the description and annotation of such relations in chat, wiki talk, and blog corpora. We distinguish technical reply structures, indentation structures, and interpretative reply relations, which include reply relations induced by linguistic markers. We sort out the different levels of description and annotation that are involved and propose a solution for their combined representation within the TEI annotation framework.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: corpus; annotation; Text Encoding Initiative; computer-mediated communication
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  17. Reply relations in CMC: types and annotation
    Published: 2018
    Publisher: Antwerpen : University of Antwerp

    This paper analyses reply relations in computer-mediated communication (CMC), which occur between post units in CMC interactions and which describe references between posts. We take a look at existing practices in the description and annotation of such relations in chat, wiki talk, and blog corpora. We distinguish technical reply structures, indentation structures, and interpretative reply relations, which include reply relations induced by linguistic markers. We sort out the different levels of description and annotation that are involved and propose a solution for their combined representation within the TEI annotation framework.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Article in an edited volume
    Format: Online
    DDC classification: Language (400)
    Keywords: computer-mediated communication; corpus; annotation; Text Encoding Initiative; reply
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  18. cmc-core: a basic schema for encoding CMC corpora in TEI
    Published: 2019
    Publisher: Cergy-Pontoise, France : Cergy-Pontoise University, France

    Since 2013 representatives of several French and German CMC corpus projects have developed three customizations of the TEI-P5 standard for text encoding in order to adapt the encoding schema and models provided by the TEI to the structural peculiarities of CMC discourse. Based on the three schema versions, a 4th version has been created which takes into account the experiences from encoding our corpora and which is specifically designed for the submission of a feature request to the TEI council. On our poster we would present the structure of this schema and its relations (commonalities and differences) to the previous schemas.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: corpus; Text Encoding Initiative; German; English
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  19. CLARIN Web Services for TEI-annotated Transcripts of Spoken Language
    Published: 2020
    Publisher: Utrecht : CLARIN

    We present web services implementing a workflow for transcripts of spoken language following TEI guidelines, in particular ISO 24624:2016 "Language resource management - Transcription of spoken language". The web services are available at our website and will be available via the CLARIN infrastructure, including the Virtual Language Observatory and WebLicht.

     

    Source: BASE, German studies subset
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Text Encoding Initiative; spoken language; transcription; computational linguistics; web services
    License: creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  20. Uralic multimedia corpora: ISO/TEI corpus data in the project INEL
    Published: 2020
    Publisher: Stroudsburg, PA : Association for Computational Linguistics

    In this paper, we describe a data processing pipeline used for annotated spoken corpora of Uralic languages created in the INEL (Indigenous Northern Eurasian Languages) project. With this processing pipeline we convert the data into a lossless standard format (ISO/TEI) for long-term preservation while simultaneously enabling a powerful search in this version of the data. For each corpus, the input we are working with is a set of files in EXMARaLDA XML format, which contain transcriptions, multimedia alignment, morpheme segmentation and other kinds of annotation. The first step of processing is the conversion of the data into a certain subset of TEI following the ISO standard “Transcription of spoken language” with the help of an XSL transformation. The primary purpose of this step is to obtain a representation of our data in a standard format, which will ensure its long-term accessibility. The second step is the conversion of the ISO/TEI files to a JSON format used by the “Tsakorpus” search platform. This step allows us to make the corpora available through a web-based search interface. As an addition, the existence of such a converter allows other spoken corpora with ISO/TEI annotation to be made accessible online in the future.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Computational linguistics; Uralic languages; Corpus; Text Encoding Initiative; Spoken language; Annotation
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  21. Using Full Text Indices for Querying Spoken Language Data
    Published: 2020
    Publisher: Paris : European Language Resources Association

    As part of the ZuMult project, we are currently modelling a backend architecture that should provide query access to corpora from the Archive of Spoken German (AGD) at the Leibniz Institute for the German Language (IDS). We are exploring how to reuse existing search engine frameworks that provide full text indices and allow corpora to be queried with one of the corpus query languages (QLs) established and actively used in the corpus research community. For this purpose, we tested MTAS, an open source Lucene-based search engine for querying text with multilevel annotations. We applied MTAS to three oral corpora stored in the TEI-based ISO standard for transcriptions of spoken language (ISO 24624:2016). These corpora differ from the corpus data that MTAS was developed for, because they include interactions with two or more speakers and are enriched, inter alia, with timeline-based annotations. In this contribution, we report our test results and address issues that arise when search frameworks originally developed for querying written corpora are transferred to the field of spoken language.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Corpus; Query; Spoken language; Text Encoding Initiative; Computational linguistics
    License:

    creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess

  22. Data modelling
    Published: 2024
    Publisher: Berlin/Boston : de Gruyter ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    In this chapter, we will first discuss different data formats in which structured content can be formally represented, explaining their respective advantages and disadvantages, and how suitable query languages can be used to retrieve information from these data structures. The third section covers the core issues of data modelling: how to describe the structure of specific lexicographical content, e.g. which “boxes” the lexicographic content should be put in, both in abstract terms and with reference to the data structures introduced before, including the associated advantages and disadvantages. There are many lexicographic projects that face largely similar challenges. For this reason, initiatives have been launched oriented towards developing standardised solutions for modelling lexicographic data, similar to a set of guidelines for sorting Lego bricks. We report on these in the fourth section.

     

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Chapter in an edited volume
    Format: Online
    DDC classification: Language (400)
    Keywords: Data modelling; Query language; Lexicography; XML; Data model; Text Encoding Initiative
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess