Ergebnisse für *

Es wurden 7 Ergebnisse gefunden.

Zeige Ergebnisse 1 bis 7 von 7.

Sortieren

  1. A pragmatic approach to XML interoperability – the Component Metadata Infrastructure (CMDI)
    Erschienen: 2022
    Verlag:  Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    XML has been designed for creating structured documents, but the information that is encoded in these structures are, by definition, out of scope for XML. Additional sources, normally not easily interpretable by computers, such as documentation are... mehr

     

    XML has been designed for creating structured documents, but the information that is encoded in these structures are, by definition, out of scope for XML. Additional sources, normally not easily interpretable by computers, such as documentation are needed to determine the intention of specific tags in a tag-set. The Component Metadata Infrastructure (CMDI) takes a rather pragmatic approach to foster interoperability between XML instances in the domain of metadata descriptions for language resources. This paper gives an overview of this approach.

     

    Export in Literaturverwaltung
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Konferenzveröffentlichung
    Format: Online
    DDC Klassifikation: Sprache (400)
    Schlagworte: XML; Metadaten; Repository; Datenmanagement; Computerlinguistik
    Lizenz:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  2. Word sense alignment and disambiguation for historical encyclopedias
    Erschienen: 2022
    Verlag:  Gießen : Graphen & Netzwerke; AG des Verbandes Digital Humanities im deutschsprachigen Raum e.V. ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This paper will address the challenge of creating a knowledge graph from a corpus of historical encyclopedias with a special focus on word sense alignment (WSA) and disambiguation (WSD). More precisely, we examine WSA and WSD approaches based on... mehr

     

    This paper will address the challenge of creating a knowledge graph from a corpus of historical encyclopedias with a special focus on word sense alignment (WSA) and disambiguation (WSD). More precisely, we examine WSA and WSD approaches based on article similarity to link messy historical data, utilizing Wikipedia as aground-truth component – as the lack of a critical overlap in content paired with the amount of variation between and within the encyclopedias does not allow for choosing a ”baseline” encyclopedia to align the others to. Additionally, we are comparing the disambiguation performance of conservative methods like the Lesk algorithm to more recent approaches, i.e. using language models to disambiguate senses.

     

    Export in Literaturverwaltung   RIS-Format
      BibTeX-Format
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Aufsatz aus einem Sammelband
    Format: Online
    DDC Klassifikation: Sprache (400)
    Schlagworte: Semasiologie; Enzyklopädie; Wissensgraph; Korpus; Wikipedia; Computerlinguistik
    Lizenz:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  3. Was darf die sprachwissenschaftliche Forschung? Juristische Fragen bei der Arbeit mit Sprachdaten
    Erschienen: 2022
    Verlag:  Paderborn : Wilhelm Fink ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS) [Zweitveröffentlichung]

    Sich in der Linguistik mit rechtlichen Themen beschäftigen zu müssen, ist auf den ersten Blick überraschend. Da jedoch in den Sprachwissenschaften empirisch gearbeitet wird und Sprachdaten, insbesondere Texte und Ton- und Videoaufnahmen sowie... mehr

     

    Sich in der Linguistik mit rechtlichen Themen beschäftigen zu müssen, ist auf den ersten Blick überraschend. Da jedoch in den Sprachwissenschaften empirisch gearbeitet wird und Sprachdaten, insbesondere Texte und Ton- und Videoaufnahmen sowie Transkripte gesprochener Sprache, in den letzten Jahren auch verstärkt Sprachdaten internetbasierter Kommunikation, als Basis für die linguistische Forschung dienen, müssen rechtliche Rahmenbedingungen für jede Art von Datennutzung beachtet werden. Natürlich arbeiten auch andere Wissenschaften, wie z. B. die Astronomie oder die Meteorologie, empirisch. Jedoch gibt es einen grundsätzlichen Unterschied der empirischen Basis: Im Gegensatz zu Temperaturen, die gemessen, oder Konstellationen von Himmelskörpern, die beobachtet werden, basieren Sprachdaten auf schriftlichen, mündlichen oder gebärdeten Äußerungen von Menschen, wodurch sich juristisch begründete Beschränkungen ihrer Nutzung ergeben.

     

    Export in Literaturverwaltung   RIS-Format
      BibTeX-Format
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Deutsch
    Medientyp: Aufsatz aus einem Sammelband
    Format: Online
    DDC Klassifikation: Recht (340); Sprache (400)
    Schlagworte: Datenschutz; Digital Humanities; Forschungsdaten; Recht; Personenbezogene Daten; Forschung; Sprachdaten; Geistiges Eigentum; Urheberrecht; Wissenschaft; Creative Commons; Open Access
    Lizenz:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  4. Ethical issues in language resources and language technology – Tentative taxonomy
    Erschienen: 2022
    Verlag:  Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    Ethical issues in Language Resources and Language Technology are often invoked, but rarely discussed. This is at least partly because little work has been done to systematize ethical issues and principles applicable in the fields of Language... mehr

     

    Ethical issues in Language Resources and Language Technology are often invoked, but rarely discussed. This is at least partly because little work has been done to systematize ethical issues and principles applicable in the fields of Language Resources and Language Technology. This paper provides an overview of ethical issues that arise at different stages of Language Resources and Language Technology development, from the conception phase through the construction phase to the use phase. Based on this overview, the authors propose a tentative taxonomy of ethical issues in Language Resources and Language Technology, built around five principles: Privacy, Property, Equality, Transparency and Freedom. The authors hope that this tentative taxonomy will facilitate ethical assessment of projects in the field of Language Resources and Language Technology, and structure the discussion on ethical issues in this domain, which may eventually lead to the adoption of a universally accepted Code of Ethics of the Language Resources and Language Technology community.

     

    Export in Literaturverwaltung   RIS-Format
      BibTeX-Format
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Aufsatz aus einem Sammelband
    Format: Online
    DDC Klassifikation: Sprache (400); Ethik (170)
    Schlagworte: Ethik; Privatsphäre; Eigentum; Gleichheit; Transparenz; Freiheit; Daten; Sprachdaten
    Lizenz:

    creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess

  5. Language matters. The European research infrastructure CLARIN, today and tomorrow
    Erschienen: 2022
    Verlag:  Berlin/Boston : de Gruyter ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    CLARIN stands for “Common Language Resources and Technology Infrastructure”. In 2012 CLARIN ERIC was established as a legal entity with the mission to create and maintain a digital infrastructure to support the sharing, use, and sustainability of... mehr

     

    CLARIN stands for “Common Language Resources and Technology Infrastructure”. In 2012 CLARIN ERIC was established as a legal entity with the mission to create and maintain a digital infrastructure to support the sharing, use, and sustainability of language data (in written, spoken, or multimodal form) available through repositories from all over Europe, in support of research in the humanities and social sciences and beyond. Since 2016 CLARIN has had the status of Landmark research infrastructure and currently it provides easy and sustainable access to digital language data and also offers advanced tools to discover, explore, exploit, annotate, analyse, or combine such datasets, wherever they are located. This is enabled through a networked federation of centres: language data repositories, service centres, and knowledge centres with single sign-on access for all members of the academic community in all participating countries. In addition, CLARIN offers open access facilities for other interested communities of use, both inside and outside of academia. Tools and data from different centres are interoperable, so that data collections can be combined and tools from different sources can be chained to perform operations at different levels of complexity. The strategic agenda adopted by CLARIN and the activities undertaken are rooted in a strong commitment to the Open Science paradigm and the FAIR data principles. This also enables CLARIN to express its added value for the European Research Area and to act as a key driver of innovation and contributor to the increasing number of industry programmes running on data-driven processes and the digitalization of society at large.

     

    Export in Literaturverwaltung   RIS-Format
      BibTeX-Format
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Aufsatz aus einem Sammelband
    Format: Online
    DDC Klassifikation: Sprache (400)
    Schlagworte: Infrastruktur; Sprachdaten; Datensatz; Open Access; Open Science; FAIR data principles; Innovation; Digitalisierung; SSH
    Lizenz:

    creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

  6. Preface
    Erschienen: 2022
    Verlag:  Berlin/Boston : de Gruyter ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

  7. The future is now. The digital transformation in the German linguistics community and the key role of the IDS
    Erschienen: 2022
    Verlag:  Budapest : Nyelvtudományi Kutatóközpont / Hungarian Research Centre for Linguistics ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS) [Zweitveröffentlichung]

    The Leibniz-Institute for the German Language (IDS) was established in Mannheim in 1964. Since then, it has been at the forefront of innovation in German linguistics as a hub for digital language data. This chapter presents various lessons learnt... mehr

     

    The Leibniz-Institute for the German Language (IDS) was established in Mannheim in 1964. Since then, it has been at the forefront of innovation in German linguistics as a hub for digital language data. This chapter presents various lessons learnt from over five decades of work by the IDS, ranging from the importance of sustainability, through its strong technical base and FAIR principles, to the IDS’ role in national and international cooperation projects and its expertise on legal and ethical issues related to language resources and language technology.

     

    Export in Literaturverwaltung
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Konferenzveröffentlichung
    Format: Online
    DDC Klassifikation: Sprache (400)
    Schlagworte: Digitalisierung; Deutsch; Leibniz-Institut für Deutsche Sprache (IDS); Digitale Daten; Sprachdaten; Nachhaltigkeit; FAIR data principles; Ethik; Kooperation
    Lizenz:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess