Results for *

8 results found.

  1. Domain adaptation with linked encyclopedic data: A case study for historical German
    Author: Hagen, Thora
    Published: 2025
    Publisher: Aachen : CEUR Workshop Proceedings ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This paper outlines a proposal for the use of knowledge graphs for historical German domain adaptation. From the EncycNet project, the encyclopedia-based knowledge graph from the early 20th century was borrowed to examine whether text-based domain adaptation using the source encyclopedia’s text or graph-based adaptation produces a better domain-specific model. To evaluate the approach, a novel historical test dataset based on a second encyclopedia of the early 20th century was created. This dataset is categorized by knowledge type (factual, linguistic, lexical) with special attention paid to distinguishing simple and expert knowledge. The main finding is that, surprisingly, simple knowledge has the most potential for improvement, whereas expert knowledge lags behind. In this study, broad signals like simple definitions and word origin yielded the best results, while more specialized knowledge such as synonyms was not as effectively represented. A follow-up study focusing on simple contemporary lexical knowledge was carried out to control for historicity and text genre; its results confirm that language models can still be enhanced by incorporating simple lexical knowledge using the proposed workflow.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Semantics
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  2. Modeling and Measuring Short Text Similarities. On the Multi-Dimensional Differences between German Poetry of Realism and Modernism
    Published: 2025
    Publisher: Darmstadt : Universitäts- und Landesbibliothek Darmstadt ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This study contributes to the ongoing discussion on how to operationalize text similarity for the purposes of computational literary studies by defining, theoretically justifying and employing a multi-dimensional text model. Additionally, we evaluate a set of strategies to implement this model for very short texts like poetry using a range of methods from weighted sparse vectors up to very recent neural sentence embeddings based on annotations of emotions, genre and similarity. Finally, we show the relevance of using such a complex text model by applying the best method to a research question about the development of early modernism in German poetry. While we can confirm some important hypotheses from literary studies, we are also able to differentiate or relativize others. In particular, our findings do not support the widely held thesis that the change from realism to modernism was a revolutionary 'rupture'.

    Source: BASE Fachausschnitt Germanistik
    Language: German
    Media type: Journal article
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Similarity; Poetry; Modernism; Realism
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  3. Type- and Token-based Word Embeddings in the Digital Humanities
    Published: 2025
    Publisher: Aachen : CEUR Workshop Proceedings ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    In the general perception of the NLP community, the new dynamic, context-sensitive, token-based embeddings from language models like BERT have replaced the older static, type-based embeddings like word2vec or fastText, due to their better performance. We can show that this is not the case for one area of applications for word embeddings: the abstract representation of the meaning of words in a corpus. This application is especially important for the Computational Humanities, for example in order to show the development of words or ideas. The main contributions of our paper are: 1) We offer a systematic comparison between dynamic and static embeddings with respect to word similarity. 2) We test the best method to convert token embeddings to type embeddings. 3) We contribute new evaluation datasets for word similarity in German. The main goal of our contribution is to make an evidence-based argument that research on static embeddings, which basically stopped after 2019, should be continued not only because they need less computing power and smaller corpora, but also because for this specific set of applications their performance is on par with that of dynamic embeddings.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Neuro-linguistic programming; Corpus
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  4. Verwendung von Wissensgraphen zur inhaltlichen Ergänzung kleinerer Textkorpora [Using knowledge graphs to supplement the content of smaller text corpora]
    Author: Hagen, Thora
    Published: 2025
    Publisher: Geneva : Zenodo ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    Corpus creation is one of the most essential steps in carrying out a research project in the Digital Humanities. For more specialized domains in particular (such as the analysis of subgenres or dialects), however, there is often not enough material available to reuse methods from NLP, since these require gigabytes of text. This paper shows how knowledge graphs, which can be built from dictionaries, for example, help to enrich smaller text corpora. In the experiment conducted here, a fastText model trained on 20 megabytes is enriched with information from GermaNet. The resulting model achieves the same performance as a plain fastText model trained on about three times as much data. A contribution to the 8th conference of the association "Digital Humanities im deutschsprachigen Raum" - DHd 2022 Kulturen des digitalen Gedächtnisses.

    Source: BASE Fachausschnitt Germanistik
    Language: German
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Knowledge graph; Corpus; Neuro-linguistic programming; GermaNet; Digital Humanities
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  5. Introducing traveling word pairs in historical semantic change: a case study of privacy words in 18th and 19th century English
    Published: 2025
    Publisher: Aachen : Sun SITE Central Europe ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    In recent years, lexical semantic change detection (LSCD) has become a central task of NLP. Because most studies in LSCD only consider the semantic change of words in isolation, in this paper, we propose a new direction for the analysis of semantic shifts: traveling word pairs. First, we introduce shift correlation to find pairs of words that semantically shift together in a similar fashion. Second, we propose word relation shift to analyze how the relationship between two words has changed over time. As a test case, we investigate the word privacy (and related words identified by a pre-existing dictionary), as an example of a word that has shifted semantics historically and remains vibrantly explored as a concept in contemporary humanistic discourse. We report that the term privacy by comparison shows relatively little change initially, while correlation analysis reveals more about how key terms surrounding privacy have shifted in tandem. We also explore nuanced changes through word pair analysis, which suggests a shift toward concreteness in particular.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Semantic change; Case study; English; Semantics; Computational linguistics; Natural language; Language change; Language; History
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  6. Quantitative Analysis of Gendered Assumptions in a Nineteenth-Century Women’s Encyclopedia
    Published: 2025
    Publisher: Tokyo : DH2022 Local Organizing Committee ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This paper quantifies textual patterns relating to gendered assumptions in a fairly unique text, an entire “women’s encyclopedia” from 1830s Germany, which at 10 volumes and 1,461,000 word tokens was of comparable size to contemporary general encyclopedias, but written and marketed for a female audience. We perform experiments on classifying the gender of biographical entries and querying a specific textual feature, calendar dates, using 19th- and 20th-century encyclopedias from the EncycNet corpus as a point of comparison.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Quantitative analysis; Text linguistics; Gender studies; Corpus
    License:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  7. Tracing the shift to “objectivity” in German encyclopedias of the long nineteenth century
    Published: 2025
    Publisher: Graz : Zentrum für Informationsmodellierung - Austrian Centre for Digital Humanities, University of Graz ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This paper presents experiments on tracing the shift toward "objectivity" in encyclopedias of the long nineteenth century, as discussed by scholars, by querying surface features (personal pronouns, exclamation points, and interjections) and by emotion analysis. We report a decline in these personal and emotive, and thus less "objective", textual characteristics.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Language (400)
    Keywords: Encyclopedia; German; Objectivity; Digital Humanities; Corpus
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  8. Mental Maps in EncycNet: Exploring Global Representation in a Historical, German Knowledge Graph
    Published: 2025
    Publisher: Geneva : Zenodo ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    First popularized by Google in 2012 (Singhal 2012), knowledge graphs (KGs) have now become a staple data representation method. KGs can be described as directed graphs, where the nodes (subject and object entities) represent any type of human knowledge. The edge specifies the relationship between subject and object. KGs are increasingly moving into the (digital) humanities. At their core, the humanities facilitate the preservation, dissemination, and analysis of cultural heritage knowledge. In all three aspects, this work is gradually changing towards the digital, such as digital thesauri. The network structure of KGs in particular opens up new possibilities for data aggregation and data analysis in the humanities. For one, KGs are an opportunity to interlink different fields of study through ontologies. They can also be used as additional sources for text preparation, as a new technology to query, aggregate, and analyze data in new ways, or to discover previously unseen relationships (Hawkins 2022, Zhang et al. 2021). In other words, as per Hyvönen (2020), KGs are not only good for data exploration and solving pre-set problems, but can also be employed for “finding research problems in the first place, for addressing them, and even for solving them automatically under the constraints set by the human researcher.” This abstract introduces the first openly published version of the EncycNet KG, a semantic knowledge graph built from historical German encyclopedias, as well as its potential for the digital humanities. In particular, using EncycNet and Wikidata, we analyze how the representation of countries in encyclopedias has changed from the 19th century until today.

    Source: BASE Fachausschnitt Germanistik
    Language: English
    Media type: Conference publication
    Format: Online
    DDC classification: Germanic languages; German (430)
    Keywords: Knowledge graph; Digital Humanities
    License:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess