Search results

Clemens Räthel (Hg.): Den Ädelmodiga Abbedissan / Die edelmütige Äbtissin. Berliner Beiträge zur Skandinavistik, Band 28. Berlin: Nordeuropa-Institut 2021, 245 S.

Author: Huber, Patrizia

Published: 2023

Publisher: Humboldt-Universität zu Berlin

Full text:	http://edoc.hu-berlin.de/18452/27667
Citable link:	https://nbn-resolving.org/urn:nbn:de:kobv:11-110-18452/27667-8 https://doi.org/10.18452/26979

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Review
Format:	Online
DDC classification:	Other Germanic literatures (839); Literatures of Germanic languages; German literature (830)
Keywords:	Annotation; Edited volume; Den Ädelmodiga Abbedissan; Die edelmütige Äbtissin; Prinzessinnenbibliothek
License:	creativecommons.org/licenses/by/4.0/

Patrick Ledderose: Dramatische Zeiten. Zeitkonzepte in skandinavischen Theatertexten um 1900 und 2000. Nordica, Band 28. Baden-Baden: Rombach Wissenschaft 2021, 391 S.

Author: Arntzen, Knut Ove

Published: 2023

Publisher: Humboldt-Universität zu Berlin

Full text:	http://edoc.hu-berlin.de/18452/28140
Citable link:	https://nbn-resolving.org/urn:nbn:de:kobv:11-110-18452/28140-3 https://doi.org/10.18452/27472

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Review
Format:	Online
DDC classification:	Other Germanic literatures (839); Literatures of Germanic languages; German literature (830)
Keywords:	Annotation; Review; Drama; Dramatic literature; Theatre; Theatre texts; Scandinavia
License:	creativecommons.org/licenses/by/4.0/

„… ein Gemisch von Gehörtem und selbst Zugeseztem“ ; Nachschriften der ‚Kosmos-Vorträge‘ Alexander von Humboldts: Dokumentation, Kontextualisierung und exemplarische Analysen

Author: Thomas, Christian

Published: 2023

Publisher: Humboldt-Universität zu Berlin

Full text:	http://edoc.hu-berlin.de/18452/28342
Citable link:	https://nbn-resolving.org/urn:nbn:de:kobv:11-110-18452/28342-6 https://doi.org/10.18452/27521

This dissertation is situated in the field of digital editions of archival sources, their cataloguing and (computer-assisted) analysis. It centres on the so-called Kosmos-Vorträge (Kosmos lectures), which Alexander von Humboldt delivered in Berlin in 1827/28 in two lecture cycles. These are placed within Humboldt's publication history as two publications of equal standing. A central chapter (ch. 7) lays an edition-theoretical foundation for editing lecture notes (Nachschriften), first in general and then with respect to the notes on the Kosmos lectures. Before that, the research field is surveyed, since little was previously known about the circumstances and contents of the two lecture series. Humboldt's motivation for these lectures, their connection with Kosmos (1845–62) and other publications of his, and the respective organisational conditions are examined. The content of the Kosmos lectures has so far been little studied, partly because the most important sources had not been taken into account. Thanks to the digitisation of Humboldt's papers (Nachlass), and above all the digital edition of the notes taken by members of the audience, the conditions for this are now much better. To support future work with these documents, I document and reflect in chapter 8 on the practical implementation of the edition model in accordance with the guidelines of the Text Encoding Initiative (TEI). I then present the edited notes from both lecture cycles and show how the digital full texts can be worked with. This involves quantitative investigations and procedures such as automatic collation and plagiarism detection, but also 'traditional hermeneutic' methods. Ultimately, my aim in this work is to substantially improve the basis for further research on the two lecture series and, by means of some exemplary analyses, to take first steps in that direction.

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Dissertation
Format:	Online
DDC classification:	Literatures of Germanic languages; German literature (830)
Keywords:	Alexander von Humboldt; Kosmos-Vorträge; Nachschriften; Edition; Korpus; Annotation; Text Encoding Initiative / TEI-XML; Digital Humanities; Digitale Edition; Kosmos-Lectures; Attendee's Notebooks; Corpus; Digital Scholarly Edition
License:	creativecommons.org/licenses/by-sa/4.0/

Soziale Netzwerkanalysen zum mittelhochdeutschen Artusroman oder: Vorgreiflicher Versuch, Märchenhaftigkeit des Erzählens zu messen: Anhang

Author: Braun, M. (Manuel) ; Ketschik, N. (Nora)

Published: 2019

Full text:	https://miami.uni-muenster.de/Record/96be20cc-6e51-4817-9e05-1bd194cda970 https://repositorium.uni-muenster.de/transfer/miami/96be20cc-6e51-4817-9e05-1bd194cda970
Citable link:	https://nbn-resolving.org/urn:nbn:de:hbz:6-55189457397 https://doi.org/10.17879/55189456328

Companion data publication to the article "Soziale Netzwerkanalysen zum mittelhochdeutschen Artusroman oder: Vorgreiflicher Versuch, Märchenhaftigkeit des Erzählens zu messen" by Manuel Braun and Nora Ketschik, which appeared in the special issue "Digitale Mediävistik" of the journal "Das Mittelalter", vol. 24/1 (2019). It contains the figures for the article.

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Other
Format:	Online
DDC classification:	Literatures of Germanic languages; German literature (830)
Keywords:	Soziale Netzwerkanalyse; Annotation; Entitätserkennung; Artusromane; europäische Volksmärchen; Social Network Analysis; Entity Extraction; Arthurian Romance; European Folktale; German literature and literatures of related languages
License:	CC BY-NC 4.0 ; info:eu-repo/semantics/openAccess

Anschaulichkeit messen. Eine quantitative Metaphernanalyse an deutschsprachigen Erzählanfängen zwischen 1880 und 1926

Author: Herrmann, J. Berenike

Published: 2018

Citable link:	https://nbn-resolving.org/urn:nbn:de:0070-pub-29560091

This study investigates possible conditions under which a potential for vividness (Anschaulichkeit) arises in German-language narrative texts from the period 1880-1926. The approach is a quantitative analysis of metaphorical language use, in which I try out a synthesis of cognitive linguistics, formalised corpus linguistics, usage-based text linguistics and traditional rhetoric. Metaphorical language, in particular conventionalised metaphor, is differentiated lexico-grammatically by the major word classes in order to explore patterns relating to a potential for vividness, starting from the typical discourse functions of the word classes (above all reference to objects, states, processes, properties and relations). The corpus under study was extracted from the digital reference corpus Deutsches Textarchiv and consists of the opening passages of 35 literary works. The opening of a text is chosen as a key point in the communication between text and reader, one that places particular demands on the shaping of the discourse because it has representative as well as persuasive functions.

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Article in an edited volume; Journal article
Format:	Online
DDC classification:	Literatures of Germanic languages; German literature (830)
Keywords:	Digital Humanities; Metaphors; Vividness; Annotation; Corpus; Style; Modernism; Realism
License:	CC0 1.0

Lightweight grammatical annotation in the TEI: new perspectives

Author: Bański, Piotr ; Haaf, Susanne ; Mueller, Martin

Published: 2018

Publisher: Paris, France : European Language Resources Association (ELRA)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/7487 https://ids-pub.bsz-bw.de/files/7487/Banski_Haaf_Mueller_Lightweight_grammatical_annotation_2018.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-74879

In mid-2017, as part of our activities within the TEI Special Interest Group for Linguists (LingSIG), we submitted to the TEI Technical Council a proposal for a new attribute class that would gather attributes facilitating simple token-level linguistic annotation. With this proposal, we addressed community feedback complaining about the lack of a specific tagset for lightweight linguistic annotation within the TEI. Apart from @lemma and @lemmaRef, up till now TEI encoders could only resort to using the generic attribute @ana for inline linguistic annotation, or to the quite complex system of feature structures for robust linguistic annotation, the latter requiring relatively complex processing even for the most basic types of linguistic features. As a result, there now exists a small set of basic descriptive devices which have been made available at the cost of only very small changes to the TEI tagset. The merit of a predefined TEI tagset for lightweight linguistic annotation is the homogeneity of tagging and thus better interoperability of simple linguistic resources encoded in the TEI. The present paper introduces the new attributes, makes a case for one more addition, and presents the advantages of the new system over the legacy TEI solutions.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Text Encoding Initiative; Annotation
License:	creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess

Aspekte der texttechnologischen Modellierung

Author: Lobin, Henning

Published: 2018

Publisher: Wiesbaden : VS Verlag für Sozialwissenschaften

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/7616 https://ids-pub.bsz-bw.de/files/7616/Lobin_Mehler_Aspekte_der_texttechnologischen_Modellierung_2004.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-76166

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Text technology; Natural language system; Automatic language analysis; Annotation; XML; SGML
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Dokumentgrammatiken als Grundlage von XML-Tools

Author: Lobin, Henning

Published: 2018

Publisher: Wiesbaden : VS Verlag für Sozialwissenschaften

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/7617 https://ids-pub.bsz-bw.de/files/7617/Lobin_Dokumentgrammatiken_als_Grundlage_XML_Tools_2004.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-76173

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Natural language processing; Annotation; Structure tree; Information structure; Information modelling; Natural language system; XML; SGML
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Textauszeichnungssprachen und Dokumentgrammatiken

Author: Lobin, Henning

Published: 2018

Publisher: Tübingen : Stauffenburg Verlag

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/7618 https://ids-pub.bsz-bw.de/files/7618/Lobin_Textauszeichnung_und_Dokumentgrammatiken_2004.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-76184

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Annotation; Markup language; Structure tree; Grammar theory; XML; SGML
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

The role of generic and logical document structure in relational discourse analysis

Author: Bärenfänger, Maja ; Lüngen, Harald ; Hilbert, Mirco ; Lobin, Henning

Published: 2018

Publisher: Amsterdam/Philadelphia : Benjamins

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/7764 https://ids-pub.bsz-bw.de/files/7764/Baerenfaenger_Luengen_Hilbert_Lobin_Role_of_generic_and_logic_2010_Postprint.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-77647 https://doi.org/10.1075/pbns.194.05bar

This study examines what kind of cues and constraints for discourse interpretation can be derived from the logical and generic document structure of complex texts, using scientific journal articles as an example. We performed statistical analysis on a corpus of scientific articles annotated on different annotation layers within the framework of XML-based multi-layer annotation. We introduce different discourse segment types that constrain the textual domains in which to identify rhetorical relation spans, and we show how a canonical sequence of text type structure categories is derived from the corpus annotations. Finally, we demonstrate how and which text type structure categories assigned to complex discourse segments of the type “block” statistically constrain the occurrence of rhetorical relation types.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Discourse analysis; Text technology; Corpus; Academic language; Annotation
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

A syntax-based scheme for the annotation and segmentation of German spoken language interactions

Author: Westpfahl, Swantje ; Gorisch, Jan

Published: 2018

Publisher: Stroudsburg, PA, USA : Association for Computational Linguistics

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/7923 https://ids-pub.bsz-bw.de/files/7923/Westpfahl_Gorisch_A_syntax_based_scheme_for_the_annotation_2018.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-79235

Unlike corpora of written language where segmentation can mainly be derived from orthographic punctuation marks, the basis for segmenting spoken language corpora is not predetermined by the primary data, but rather has to be established by the corpus compilers. This impedes consistent querying and visualization of such data. Several ways of segmenting have been proposed, some of which are based on syntax. In this study, we developed and evaluated annotation and segmentation guidelines in reference to the topological field model for German. We can show that these guidelines are used consistently across annotators. We also investigated the influence of various interactional settings with a rather simple measure, the word-count per segment and unit-type. We observed that the word count and the distribution of each unit type differ in varying interactional settings and that our developed segmentation and annotation guidelines are used consistently across annotators. In conclusion, our syntax-based segmentations reflect interactional properties that are intrinsic to the social interactions that participants are involved in. This can be used for further analysis of social interaction and opens the possibility for automatic segmentation of transcripts.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Keywords:	Spoken language; Corpus; Segmentation; Annotation
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Sprucing up the trees – error detection in treebanks

Author: Rehbein, Ines ; Ruppenhofer, Josef

Published: 2018

Publisher: Stroudsburg PA, USA : The Association for Computational Linguistics

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/7993 https://ids-pub.bsz-bw.de/files/7993/Rehbein_Ruppenhofer_Sprucing_up_the_trees_2018.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-79938

We present a method for detecting annotation errors in manually and automatically annotated dependency parse trees, based on ensemble parsing in combination with Bayesian inference, guided by active learning. We evaluate our method in different scenarios: (i) for error detection in dependency treebanks and (ii) for improving parsing accuracy on in- and out-of-domain data.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Keywords:	Automatic speech recognition; Annotation; Parser
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Universal Dependencies are hard to parse – or are they?

Author: Rehbein, Ines ; Steen, Julius ; Do, Bich-Ngoc ; Frank, Anette

Published: 2018

Publisher: Linköping, Sweden : Linköping University Electronic Press

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8023 https://ids-pub.bsz-bw.de/files/8023/Rehbein_etal._Universal_dependences_are_hard_to%20parse_2017.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-80232

Universal Dependency (UD) annotations, despite their usefulness for cross-lingual tasks and semantic applications, are not optimised for statistical parsing. In the paper, we ask what exactly causes the decrease in parsing accuracy when training a parser on UD-style annotations and whether the effect is similarly strong for all languages. We conduct a series of experiments where we systematically modify individual annotation decisions taken in the UD scheme and show that this results in an increased accuracy for most, but not for all languages. We show that the encoding in the UD scheme, in particular the decision to encode content words as heads, causes an increase in dependency length for nearly all treebanks and an increase in arc direction entropy for many languages, and evaluate the effect this has on parsing accuracy.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Syntax; Annotation; Parser; Universal grammar
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Detecting annotation noise in automatically labelled data

Author: Rehbein, Ines ; Ruppenhofer, Josef

Published: 2018

Publisher: Stroudsburg PA, USA : The Association for Computational Linguistics

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8034 https://ids-pub.bsz-bw.de/files/8034/Rehbein_Ruppenhofer_Detecting_annotation_noise_2017.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-80343 https://doi.org/10.18653/v1/P17-1107

We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning. We test our approach on in-domain and out-of-domain data in two languages, in AL simulations and in a real world setting. For all settings, the results show that our method is able to detect annotation errors with high precision and high recall.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Keywords:	Computational linguistics; Natural language processing; Annotation; Error analysis
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

POS tagset refinement for linguistic analysis and the impact on statistical parsing

Author: Rehbein, Ines ; Hirschmann, Hagen

Published: 2018

Publisher: Tübingen : University of Tübingen

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8036 https://ids-pub.bsz-bw.de/files/8036/Rehbein_Hirschmann_POS_tagset_refinement_2014.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-80368

The annotation of parts of speech (POS) in linguistically annotated corpora is a fundamental annotation layer which provides the basis for further syntactic analyses, and many NLP tools rely on POS information as input. However, most POS annotation schemes have been developed with written (newspaper) text in mind and thus do not carry over well to text from other domains and genres. Recent discussions have concentrated on the shortcomings of present POS annotation schemes with regard to their applicability to data from domains other than newspaper text.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Corpus; Parts of speech; Syntactic analysis; Annotation
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Reply relations in CMC: types and annotation

Author: Lüngen, Harald ; Herzberg, Laura

Published: 2018

Publisher: Antwerpen : University of Antwerp

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8041 https://ids-pub.bsz-bw.de/files/8041/Luengen_Herzberg_Reply_relations_in_CMC_2018.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-80414

This paper analyses reply relations in computer-mediated communication (CMC), which occur between post units in CMC interactions and which describe references between posts. We take a look at existing practices in the description and annotation of such relations in chat, wiki talk, and blog corpora. We distinguish technical reply structures, indentation structures, and interpretative reply relations, which include reply relations induced by linguistic markers. We sort out the different levels of description and annotation that are involved and propose a solution for their combined representation within the TEI annotation framework.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Corpus; Annotation; Text Encoding Initiative; Computer-mediated communication
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Reply relations in CMC: types and annotation

Author: Lüngen, Harald ; Herzberg, Laura

Published: 2018

Publisher: Antwerpen : University of Antwerp

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8126 https://ids-pub.bsz-bw.de/files/8126/Luengen_Herzberg_Reply_relations_in_CMC_2018.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-81268

This paper analyses reply relations in computer-mediated communication (CMC), which occur between post units in CMC interactions and which describe references between posts. We take a look at existing practices in the description and annotation of such relations in chat, wiki talk, and blog corpora. We distinguish technical reply structures, indentation structures, and interpretative reply relations, which include reply relations induced by linguistic markers. We sort out the different levels of description and annotation that are involved and propose a solution for their combined representation within the TEI annotation framework.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Keywords:	Computer-mediated communication; Corpus; Annotation; Text Encoding Initiative; Reply
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Modeling and annotating complex data structures

Author: Bański, Piotr ; Witt, Andreas

Published: 2018

Publisher: London et al. : Routledge, Taylor & Francis Group

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8222 https://ids-pub.bsz-bw.de/files/8222/Banski_Witt_Modeling_and_annotating_complex_data_structures_2019.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-82229 https://doi.org/10.4324/9781315552941

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Keywords:	Digital Humanities; Annotation; Data structure; Argument structure
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Redewiedergabe – Schritte zur automatischen Erkennung ; Speech, thought and writing representation – towards automatic detection

Author: Brunner, Annelen

Published: 2019

Publisher: Berlin [et al.] : de Gruyter

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8769 https://ids-pub.bsz-bw.de/files/8769/Brunner_Redewiedergabe_2019.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-87699 https://doi.org/10.1515/zgl-2019-0007

This contribution presents a quantitative approach to speech, thought and writing representation (ST&WR) and steps towards its automatic detection. Automatic detection is necessary for studying ST&WR in a large number of texts and thus identifying developments in form and usage over time and in different types of texts. The contribution summarizes results of a pilot study: First, it describes the manual annotation of a corpus of short narrative texts in relation to linguistic descriptions of ST&WR. Then, two different techniques of automatic detection – a rule-based and a machine learning approach – are described and compared. Evaluation of the results shows success with automatic detection, especially for direct and indirect ST&WR.

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Journal article
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Language statistics; Automatic language analysis; Speech representation (Redewiedergabe); Annotation; Corpus
License:	creativecommons.org/licenses/by-nc-nd/3.0/de/deed.de ; info:eu-repo/semantics/openAccess

Das Redewiedergabe-Korpus. Eine neue Ressource

Author: Brunner, Annelen ; Weimer, Lukas ; Tu, Ngoc Duyen Tanja ; Engelberg, Stefan ; Jannidis, Fotis

Published: 2019

Publisher: Frankfurt am Main : Zenodo

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8771 https://ids-pub.bsz-bw.de/files/8771/Brunner_Weimer_Tu_Engelberg_Jannidis_Das_Redewiedergabe_Korpus_2019.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-87710 https://doi.org/10.5281/zenodo.2600812

This contribution presents the Redewiedergabe-Korpus (RW corpus), a historical corpus of fictional and non-fictional texts that contains detailed manual annotation of forms of speech representation (Redewiedergabe). The corpus is being created as part of an ongoing DFG project and is not yet finalised; however, a beta release is planned for spring 2019 and will be made available to the research community. The final release is scheduled for spring 2020. The RW corpus is a novel resource for research on speech representation, one that has not previously been available at this level of detail for German, and it can serve both for quantitative linguistic and literary studies and as training material for machine learning.

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Corpus; Speech representation (Redewiedergabe); Annotation; Automatic speech recognition; German
License:	creativecommons.org/licenses/by-sa/4.0/ ; info:eu-repo/semantics/openAccess

Guideline: Syntactic annotation and segmentation in the SegCor Project

Author(s): Westpfahl, Swantje ; Proske, Nadine ; Hobich, Melanie ; Borlinghaus, Anton ; Strub, Hanna

Published: 2019

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8919 https://ids-pub.bsz-bw.de/files/8919/Westpfahl_etal._Guideline_Syntactic_annotation_2018.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-89194 https://doi.org/10.14618/ids-pub-8919

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Report
Format:	Online
DDC classification:	Language (400)
Keywords:	Spoken language; Corpus; Syntax; Annotation
License:	creativecommons.org/licenses/by-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

Types and annotation of reply relations in computer-mediated communication

Author(s): Lüngen, Harald ; Herzberg, Laura

Published: 2019

Publisher: Berlin [et al.] : de Gruyter

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9264 https://ids-pub.bsz-bw.de/files/9264/Herzberg_Luengen_Types_and_annotation_2019.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-92645 https://doi.org/10.1515/eujal-2019-0006

This paper presents types and annotation layers of reply relations in computer-mediated communication (CMC). Reply relations hold between post units in CMC interactions and describe references from one given post to a previous post. We classify three types of reply relations in CMC interactions: first, technical replies, i.e. the possibility to reply directly to a previous post by clicking a ‘reply’ button; second, indentations, e.g. in wiki talk pages, in which users insert their contributions in the existing talk page by indenting them; and third, interpretative reply relations, i.e. the reply action is not realised formally but signalled by other structural or linguistic means such as address markers ‘@’, greetings, citations and/or Q-A structures. We take a look at existing practices in the description and representation of such relations in corpora and examples of chat, Wikipedia talk pages, Twitter and blogs. We then provide an annotation proposal that combines the different levels of description and representation of reply relations and which adheres to the schemas and practices for encoding CMC corpus documents within the TEI framework as defined by the TEI CMC SIG. It constitutes a prerequisite for correctly identifying higher levels of interactional relations such as dialogue acts or discussion trees.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Journal article
Format:	Online
DDC classification:	Language (400)
Keywords:	Computer-mediated communication; Corpus; Annotation
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Little strokes fell great oaks. Creating CoRoLa, the reference corpus of contemporary Romanian

Author(s): Tufiș, Dan ; Barbu Mititelu, Verginica ; Irimia, Elena ; Păiș, Vasile ; Ion, Radu ; Diewald, Nils ; Mitrofan, Maria ; Onofrei, Mihaela

Published: 2019

Publisher: Bucureşti : Editura Academiei Române

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9385 https://ids-pub.bsz-bw.de/files/9385/Tufis_Mititelu_Irimia_et_al_Little_strokes_fell_great_oaks_CoRoLa_2019.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-93851

The paper presents the quite long-standing tradition of Romanian corpus acquisition and processing, which reaches its peak with the reference corpus of contemporary Romanian language (CoRoLa). The paper describes decisions behind the kinds of texts collected, as well as processing and annotation steps, highlighting the structure and importance of metadata to the corpus. The reader is also introduced to the three ways in which (s)he can plunge into the rich linguistic data of the corpus, waiting to be discovered. Besides querying the corpus, word embeddings extracted from it are useful to various natural language processing applications and for linguists, when user-friendly interfaces offer them the possibility to exploit the data.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Journal article
Format:	Online
DDC classification:	Language (400)
Keywords:	Romanian; Corpus; Annotation; Metadata
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Translate and label! An encoder-decoder approach for cross-lingual semantic role labeling

Author(s): Daza, Angel ; Frank, Anette

Published: 2019

Publisher: Stroudsburg, PA, USA : The Association for Computational Linguistics

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9439 https://ids-pub.bsz-bw.de/files/9439/Daza_Arevalo_Frank_Translate_and_label_2019.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-94395 https://doi.org/10.18653/v1/D19-1056

We propose a Cross-lingual Encoder-Decoder model that simultaneously translates and generates sentences with Semantic Role Labeling annotations in a resource-poor target language. Unlike annotation projection techniques, our model does not need parallel data during inference time. Our approach can be applied in monolingual, multilingual and cross-lingual settings and is able to produce dependency-based and span-based SRL annotations. We benchmark the labeling performance of our model in different monolingual and multilingual settings using well-known SRL datasets. We then train our model in a cross-lingual setting to generate new SRL labeled data. Finally, we measure the effectiveness of our method by using the generated data to augment the training basis for resource-poor languages and perform manual evaluation to show that it produces high-quality sentences and assigns accurate semantic role annotations. Our proposed architecture offers a flexible method for leveraging SRL data in multiple languages.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Keywords:	Simultaneous interpreting; Natural language processing; Annotation; Computational linguistics; Semantics
License:	creativecommons.org/licenses/by/4.0/deed.de ; info:eu-repo/semantics/openAccess

Uralic multimedia corpora: ISO/TEI corpus data in the project INEL

Author(s): Arkhangelskiy, Timofey ; Ferger, Anne ; Hedeland, Hanna

Published: 2020

Publisher: Stroudsburg, PA : Association for Computational Linguistics

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9718 https://ids-pub.bsz-bw.de/files/9718/Arkhangelskiy_Ferger_Hedeland_Uralic_multimedia_corpora_2019.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-97187 https://doi.org/10.18653/v1/W19-0310

In this paper, we describe a data processing pipeline used for annotated spoken corpora of Uralic languages created in the INEL (Indigenous Northern Eurasian Languages) project. With this processing pipeline we convert the data into a lossless standard format (ISO/TEI) for long-term preservation while simultaneously enabling a powerful search in this version of the data. For each corpus, the input we are working with is a set of files in EXMARaLDA XML format, which contain transcriptions, multimedia alignment, morpheme segmentation and other kinds of annotation. The first step of processing is the conversion of the data into a certain subset of TEI following the ISO standard “Transcription of spoken language” with the help of an XSL transformation. The primary purpose of this step is to obtain a representation of our data in a standard format, which will ensure its long-term accessibility. The second step is the conversion of the ISO/TEI files to a JSON format used by the “Tsakorpus” search platform. This step allows us to make the corpora available through a web-based search interface. In addition, the existence of such a converter allows other spoken corpora with ISO/TEI annotation to be made accessible online in the future.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Computational linguistics; Uralic languages; Corpus; Text Encoding Initiative; Spoken language; Annotation
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess
