Search results

Results for *

2 results found.

Feature-based encoding and querying language resources with character semantics

Authors: Hughes, Baden ; Gibbon, Dafydd ; Trippel, Thorsten

Published: 2024

Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text: https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/12623 https://ids-pub.bsz-bw.de/files/12623/Hughes_Gibbon_Feature_based_encoding_2006.pdf
Citable link: https://nbn-resolving.org/urn:nbn:de:bsz:mh39-126238

In this paper we discuss the explicit representation of character features pertaining to written language resources, which we argue are critically necessary in the long term of archiving language data. Much focus on the creation of language resources and their associated preservation is at the level of the corpus itself; however it is generally accepted that long term interpretation of these language resources requires more than a best practice data format. In particular, where language resources are created in linguistic fieldwork, and especially for minority languages, the need for preservation not only of the resource itself, but of additional metadata which allows for the resource to be accurately interpreted in the future is becoming a topic of research in itself. In this paper we extend earlier work on semantically based character decomposition to include representation of character properties in a variety of models, and a mechanism for exploiting these properties through queries.

Source: BASE Fachausschnitt Germanistik
Language: English
Media type: Conference publication
Format: Online
DDC classification: Language (400)
Keywords: Language data; Archiving; Metadata; Phonetics; Ontology <knowledge processing>; XML
License: creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess

Building a historical corpus for Classical Portuguese: some technological aspects

Authors: Paixão de Sousa, Maria Clara ; Trippel, Thorsten

Published: 2024

Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text: https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/12640 https://ids-pub.bsz-bw.de/files/12640/Paixao_de_Sousa_Trippel_Building_a_historical_corpus_2006.pdf
Citable link: https://nbn-resolving.org/urn:nbn:de:bsz:mh39-126407

This paper describes the restructuring process of a large corpus of historical documents and the system architecture that is used for accessing it. The initial challenge of this process was to get the most out of existing material, normalizing the legacy markup and harvesting the inherent information using widely available standards. This resulted in a conceptual and technical restructuring of the formerly existing corpus. The development of the standardized markup and techniques allowed the inclusion of important new materials, such as original 16th and 17th century prints and manuscripts; and enlarged the potential user groups. On the technological side, we were grounded on the premise that open standards are the best way of making sure that the resources will be accessible even after years in an archive. This is a welcomed result in view of the additional consequence of the remodeled corpus concept: it serves as a repository for important historical documents, some of which had been preserved for 500 years in paper format. This very rich material can from now on be handled freely for linguistic research goals.

Source: BASE Fachausschnitt Germanistik
Language: English
Media type: Conference publication
Format: Online
DDC classification: Language (400)
Keywords: Corpus; Portuguese; Archiving; Annotation; Metadata; Language data; Computational linguistics
License: creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess
