Search results

Intra-connecting an exemplary literary corpus with semantic web technologies for exploratory literary studies

Author: Dittrich, Andreas

Published: 2017

Publisher: Institut für Deutsche Sprache, Bibliothek, Mannheim

Source:	Union catalogues
Contributors:	Bański, Piotr (editor); Kupietz, Marc (editor); Lüngen, Harald (editor); Rayson, Paul (editor); Biber, Hanno (editor); Breiteneder, Evelyn (editor); Clematide, Simon (editor); Mariani, John (editor); Stevenson, Mark (editor); Sick, Theresa (editor)
Language:	English
Media type:	Book (monograph)
Format:	Online
Further identifiers:	URN: urn:nbn:de:bsz:mh39-62441
Subject headings:	Corpus <linguistics>; Literature; Austria; Aichinger, Ilse; Text Encoding Initiative (TEI); Intertextuality; Semantic Web; Digital Humanities
Further subject headings:	Word associations; Corpus linguistics; Intertextuality; Literary corpus
Extent:	Online resource
Note(s):	In: Proceedings of the Workshop on Challenges in the Management of Large Corpora and Big Data and Natural Language Processing (CMLC-5+BigNLP) 2017 including the papers from the Web-as-Corpus (WAC-XI) guest section. Birmingham, 24 July 2017. - Mannheim : Institut für Deutsche Sprache, 2017, pp. 1-6

SusTEInability of linguistic resources through feature structures

Authors: Witt, Andreas ; Rehm, Georg ; Hinrichs, Erhard ; Lehmberg, Timm ; Stegmann, Jens

Published: 2015

Publisher: Oxford : Oxford University Press

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/4490 https://ids-pub.bsz-bw.de/files/4490/Witt_Rehm_Hinrichs_SusTEInability_of_Linguisitc_Resources_2009-1.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-44901

This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.

Source:	BASE, German studies subset
Language:	English
Media type:	Journal article
Format:	Online
DDC classification:	Linguistics (410)
Subject headings:	Programming language; Annotation; Text Encoding Initiative (TEI)
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Integrating corpora of computer-mediated communication in CLARIN-D: Results from the curation project ChatCorpus2CLARIN

Authors: Lüngen, Harald ; Beißwenger, Michael ; Ehrhardt, Eric ; Herold, Axel ; Storrer, Angelika

Published: 2016

Publisher: Bochum : Sprachwissenschaftliches Institut, Ruhr-Universität Bochum

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/5574 https://ids-pub.bsz-bw.de/files/5574/Luengen_Beisswenger_Ehrhardt_Herold_Storrer_Integrating_corpora_2016.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-55743

We introduce our pipeline to integrate CMC and SM corpora into the CLARIN-D corpus infrastructure. The pipeline was developed by transforming an existing CMC corpus, the Dortmund Chat Corpus, into a resource conforming to current technical and legal standards. We describe how the resource has been prepared and restructured in terms of TEI encoding, linguistic annotations, and anonymisation. The output is a CLARIN-conformant resource integrated in the CLARIN-D research infrastructure.

Source:	BASE, German studies subset
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Subject headings:	German; Chatting <communication>; Corpus; Text Encoding Initiative (TEI)
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

(Best) Practices for Annotating and Representing CMC and Social Media Corpora in CLARIN-D

Authors: Beißwenger, Michael ; Ehrhardt, Eric ; Herold, Axel ; Lüngen, Harald ; Storrer, Angelika

Published: 2016

Publisher: Ljubljana : Academic Publishing Division of the Faculty of Arts of the University of Ljubljana

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/5581 https://ids-pub.bsz-bw.de/files/5581/Beisswenger_Ehrhardt_Herold_Luengen_Storrer_Best_Practices_for_Annotating_and_Representing_2016.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-55810

The paper reports the results of the curation project ChatCorpus2CLARIN. The goal of the project was to develop a workflow and resources for the integration of an existing chat corpus into the CLARIN-D research infrastructure for language resources and tools in the Humanities and the Social Sciences (http://clarin-d.de). The paper presents an overview of the resources and practices developed in the project, describes the added value of the resource after its integration and discusses, as an outlook, to what extent these practices can be considered best practices which may be useful for the annotation and representation of other CMC and social media corpora.

Source:	BASE, German studies subset
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Subject headings:	Corpus; Chatting <communication>; German; Text Encoding Initiative (TEI)
License:	creativecommons.org/licenses/by-sa/4.0/ ; info:eu-repo/semantics/openAccess

Intra-connecting an exemplary literary corpus with semantic web technologies for exploratory literary studies

Author: Dittrich, Andreas

Published: 2017

Publisher: Mannheim : Institut für Deutsche Sprache

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/6244 https://ids-pub.bsz-bw.de/files/6244/Dittrich_Intra-connecting_2017.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-62441

Many (modernist) works of literature can be understood by their associativeness, be it constructed or “free”. This network-like character of (modernist) literature has often been addressed by terms like “free association”, “connotation”, “context” or “intertext”. This paper proposes an experimental and exemplary approach to intraconnect a literary corpus of the Austrian writer Ilse Aichinger with semantic web-technologies to enable interactive explorations of word-associations.

Source:	BASE, German studies subset
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Subject headings:	Corpus; Literature; Austria; Aichinger, Ilse; Text Encoding Initiative (TEI); Intertextuality; Semantic Web; Digital Humanities
License:	creativecommons.org/licenses/by-nc-nd/3.0/de/deed.de ; info:eu-repo/semantics/openAccess
