Search results

Modeling and Measuring Short Text Similarities. On the Multi-Dimensional Differences between German Poetry of Realism and Modernism

Author(s): Ehrmanntraut, Anton ; Hagen, Thora ; Jannidis, Fotis ; Konle, Leonard ; Kröncke, Merten ; Winko, Simone

Published: 2025

Publisher: Darmstadt : Universitäts- und Landesbibliothek Darmstadt ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/13079 https://ids-pub.bsz-bw.de/files/13079/Ehrmanntraut_Hagen_Jannidis_Modeling_and_measuring_2025.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-130792 https://doi.org/10.48694/jcls.116

This study contributes to the ongoing discussion on how to operationalize text similarity for the purposes of computational literary studies by defining, justifying theoretically and employing a multi-dimensional text model. Additionally, we evaluate a set of strategies to implement this model for very short texts like poetry using a range of methods from weighted sparse vectors up to very recent neural sentence embeddings based on annotations of emotions, genre and similarity. And finally, we show the relevance of using such a complex text model by applying the best method to a research question about the development of early modernism in German poetry. While we can confirm some important hypotheses from literary studies, we are also able to differentiate or relativize others. In particular, our findings do not support the widely held thesis that the change from realism to modernism was a revolutionary 'rupture'.

Source:	BASE Fachausschnitt Germanistik
Language:	German
Media type:	Journal article
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Similarity; Poetry; Modernism; Realism
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Type- and Token-based Word Embeddings in the Digital Humanities

Author(s): Ehrmanntraut, Anton ; Hagen, Thora ; Konle, Leonard ; Jannidis, Fotis

Published: 2025

Publisher: Aachen : CEUR Workshop Proceedings ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/13080 https://ids-pub.bsz-bw.de/files/13080/Ehrmanntraut_Hagen_Konle_Type_and_token_based_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-130808

In the general perception of the NLP community, the new dynamic, context-sensitive, token-based embeddings from language models like BERT have replaced the older static, type-based embeddings like word2vec or fastText, due to their better performance. We can show that this is not the case for one area of applications for word embeddings: the abstract representation of the meaning of words in a corpus. This application is especially important for the Computational Humanities, for example in order to show the development of words or ideas. The main contributions of our paper are: 1) We offer a systematic comparison between dynamic and static embeddings with respect to word similarity. 2) We test the best method to convert token embeddings to type embeddings. 3) We contribute new evaluation datasets for word similarity in German. The main goal of our contribution is to make an evidence-based argument that research on static embeddings, which basically stopped after 2019, should be continued not only because they need less computing power and smaller corpora, but also because for this specific set of applications their performance is on par with that of dynamic embeddings.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Neuro-linguistic programming; Corpus
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Quantitative Analysis of Gendered Assumptions in a Nineteenth-Century Women’s Encyclopedia

Author(s): Ketzan, Erik ; Hagen, Thora ; Jannidis, Fotis ; Witt, Andreas

Published: 2025

Publisher: Tokyo : DH2022 Local Organizing Committee ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/13093 https://ids-pub.bsz-bw.de/files/13093/Ketzan_Hagen_Jannidis_Quantitative_analysis_of_gendered_2025.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-130938

This paper quantifies textual patterns relating to gendered assumptions in a fairly unique text, an entire “women’s encyclopedia” from 1830s Germany, which at 10 volumes and 1,461,000 word tokens was of comparable size to contemporary general encyclopedias, but written and marketed for a female audience. We perform experiments on classifying the gender of biographical entries and on querying a specific textual feature, calendar dates, with context from comparison encyclopedias of the 19th and 20th centuries from the EncycNet corpus.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Quantitative analysis; Text linguistics; Gender studies; Corpus
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Tracing the shift to “objectivity” in German encyclopedias of the long nineteenth century

Author(s): Hagen, Thora ; Konle, Leonard ; Ketzan, Erik ; Jannidis, Fotis ; Witt, Andreas

Published: 2025

Publisher: Graz : Zentrum für Informationsmodellierung - Austrian Centre for Digital Humanities, University of Graz ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/13094 https://ids-pub.bsz-bw.de/files/13094/Hagen_Konle_Tracing_the_shift_2023.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-130941 https://doi.org/10.5281/zenodo.8107633

This paper presents experiments on tracing the shift toward "objectivity" in encyclopedias of the long nineteenth century, as discussed by scholars, by querying surface features (personal pronouns, exclamation points, and interjections) and by emotion analysis. We report a decline in these personal and emotive, and thus less "objective", textual characteristics.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Encyclopedia; German; Objectivity; Digital Humanities; Corpus
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Mental Maps in EncycNet: Exploring Global Representation in a Historical, German Knowledge Graph

Author(s): Hagen, Thora ; Jannidis, Fotis ; Witt, Andreas

Published: 2025

Publisher: Geneva : Zenodo ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/13095 https://ids-pub.bsz-bw.de/files/13095/Hagen_Jannidis_Witt_Mental_Maps_in_EncycNet_2024.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-130950

First popularized by Google in 2012 (Singhal 2012), knowledge graphs (KG) have now become a staple data representation method. KGs can be described as directed graphs, where the nodes (subject and object entities) represent any type of human knowledge. The edge specifies the relationship between subject and object. KGs are increasingly moving into the (digital) humanities. At their core, the humanities facilitate the preservation, dissemination, and analysis of cultural heritage knowledge. In all three aspects, this work is gradually changing towards the digital, such as digital thesauri. The network structure of KGs in particular opens up new possibilities for data aggregation and data analysis in the humanities. For one, KGs are an opportunity to interlink different fields of study through ontologies. They can also be used as additional sources for text preparation, as a new technology to query, aggregate, and analyze data in new ways, or to discover previously unseen relationships (Hawkins 2022, Zhang et al. 2021). In other words, as per Hyvönen (2020), KGs are not only good for data exploration and solving pre-set problems, they can also be employed for “finding research problems in the first place, for addressing them, and even for solving them automatically under the constraints set by the human researcher.” This abstract introduces the first openly published version of the EncycNet KG, a semantic knowledge graph built from historical German encyclopedias, as well as its potential for the digital humanities. In particular, using EncycNet and Wikidata, we analyze how the representation of countries in encyclopedias has changed from the 19th century until today.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Knowledge graph; Digital Humanities
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess