Search results

Datenübernahmerichtlinien des Leibniz-Instituts für Deutsche Sprache

Author: Arnold, Denis

Published: 2019

Publisher: Institut für Deutsche Sprache, Bibliothek, Mannheim

Source:	DNB subject group German Language and Literature
Contributors:	Fankhauser, Peter (author); Fisseni, Bernhard (author); Kupietz, Marc (author); Lüngen, Harald (author); Schmidt, Thomas (author); Witt, Andreas (author)
Language:	German
Media type:	Unspecified
Format:	Online
Other identifiers:	doi: 10.14618/ids-pub-8791 urn: urn:nbn:de:bsz:mh39-87919
Subject headings:	Corpus <linguistics>; German; Data; Data protection; Research data
Additional keywords:	Data protection policy
Extent:	Online resource

Das Gesamtkonzept des Deutschen Referenzkorpus DeReKo. Vom Design bis zur Verwendung und darüber hinaus

Author: Kupietz, Marc

Published: 2023

Publisher: de Gruyter, Berlin/Boston ; Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication], Mannheim

Source:	DNB subject group German Language and Literature
Contributors:	Lüngen, Harald (author); Diewald, Nils (author); Deppermann, Arnulf (editor); Fandrych, Christian (editor); Kupietz, Marc (editor); Schmidt, Thomas (editor)
Language:	German
Media type:	Unspecified
Format:	Online
Other identifiers:	doi: 10.1515/9783111085708-002 urn: urn:nbn:de:bsz:mh39-115951
Subject headings:	Corpus <linguistics>; German; Contrastive linguistics; Empirical linguistics; German studies; Data preparation; Language data; Heuristics; Research data
Additional keywords:	Deutsches Referenzkorpus (DeReKo); corpus design; corpus preparation
Extent:	Online resource
Note(s):	In: Korpora in der germanistischen Sprachwissenschaft. Mündlich, schriftlich, multimedial. - Berlin/Boston : de Gruyter, 2023, pp. 1-28. - (Jahrbuch / Leibniz-Institut für Deutsche Sprache (IDS) ; 2022). - ISBN 978-3-11-108570-8

Multimodale und agile Korpora. Perspektiven für Digital Herrnhut (N-ARC1)

Author: Lasch, Alexander

Published: 2023

Publisher: de Gruyter, Berlin/Boston ; Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication], Mannheim

Source:	DNB subject group German Language and Literature
Contributors:	Deppermann, Arnulf (editor); Fandrych, Christian (editor); Kupietz, Marc (editor); Schmidt, Thomas (editor)
Language:	German
Media type:	Unspecified
Format:	Online
Other identifiers:	doi: 10.1515/9783111085708-011 urn: urn:nbn:de:bsz:mh39-116084
Subject headings:	Moravian Church; Corpus <linguistics>; Multimodality; Research data; Knowledge base
Additional keywords:	Multimodal corpora; Digital Herrnhut; reference corpus; Nex-Gen Agile Reference Corpus (NARC); data indexing; data structuring; data extension; data linking
Extent:	Online resource
Note(s):	In: Korpora in der germanistischen Sprachwissenschaft. Mündlich, schriftlich, multimedial. - Berlin/Boston : de Gruyter, 2023, pp. 225-249. - (Jahrbuch / Leibniz-Institut für Deutsche Sprache (IDS) ; 2022). - ISBN 978-3-11-108570-8

Datenübernahmerichtlinien des Leibniz-Instituts für Deutsche Sprache

Author: Arnold, Denis ; Fankhauser, Peter ; Fisseni, Bernhard ; Kupietz, Marc ; Lüngen, Harald ; Schmidt, Thomas ; Witt, Andreas

Published: 2019

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/8791 https://ids-pub.bsz-bw.de/files/8791/LZA_IDS_Depositing_Policy_2019.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-87919 https://doi.org/10.14618/ids-pub-8791

Source:	BASE subject section German Studies
Language:	German
Media type:	Unspecified
Format:	Online
DDC classification:	Language (400)
Subject headings:	Data protection; Research data; Corpus
License:	creativecommons.org/licenses/by-sa/4.0/deed.de ; info:eu-repo/semantics/openAccess

Proceedings of the LREC 2020 Workshop, Language Resources and Evaluation Conference, 11–16 May 2020, 8th Workshop on Challenges in the Management of Large Corpora (CMLC-8)

Author: Bański, Piotr ; Barbaresi, Adrien ; Clematide, Simon ; Kupietz, Marc ; Lüngen, Harald ; Pisetta, Ines

Published: 2020

Publisher: Paris : European Language Resources Association (ELRA)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9811 https://ids-pub.bsz-bw.de/files/9811/Banski_Barbaresi_Clematide_Kupietz_Luengen_Pisetta_Proceedings_LREC_2020.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-98112

In order to satisfy the information needs of a wide range of researchers across a number of disciplines, large textual datasets require careful design, collection, cleaning, encoding, annotation, storage, retrieval, and curation. This daunting set of tasks has coalesced into a number of key themes and questions that are of interest to the contributing research communities: (a) what sampling techniques can we apply? (b) what quality issues should we be aware of? (c) what infrastructures and frameworks are being developed for the efficient storage, annotation, analysis and retrieval of large datasets? (d) what affordances do visualisation techniques offer for the exploratory analysis approaches of corpora? (e) what legal paths can be followed in dealing with IPR and data protection issues governing both the data sources and the query results? (f) how to guarantee that corpus data remain available and usable in a sustainable way?

Source:	BASE subject section German Studies
Language:	English
Media type:	Book (monograph)
Format:	Online
DDC classification:	Language (400)
Subject headings:	Corpus; Computational linguistics; Research data; Data management
License:	creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess

RKorAPClient: An R Package for Accessing the German Reference Corpus DeReKo via KorAP

Author: Kupietz, Marc ; Diewald, Nils ; Margaretha, Eliza

Published: 2020

Publisher: Paris : European Language Resources Association

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9843 https://ids-pub.bsz-bw.de/files/9843/Kupietz_Diewald_Margaretha_RKorAPClient_2020.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-98430

Making corpora accessible and usable for linguistic research is a huge challenge in view of (too) big data, legal issues and a rapidly evolving methodology. This does not only affect the design of user-friendly graphical interfaces to corpus analysis tools, but also the availability of programming interfaces supporting access to the functionality of these tools from various analysis and development environments. RKorAPClient is a new research tool in the form of an R package that interacts with the Web API of the corpus analysis platform KorAP, which provides access to large annotated corpora, including the German reference corpus DeReKo with 45 billion tokens. In addition to optionally authenticated KorAP API access, RKorAPClient provides further processing and visualization features to simplify common corpus analysis tasks. This paper introduces the basic functionality of RKorAPClient and exemplifies various analysis tasks based on DeReKo, that are bundled within the R package and can serve as a basic framework for advanced analysis and visualization approaches.

Source:	BASE subject section German Studies
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Subject headings:	Corpus; Visualization; Research data; R; Web services
License:	creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess

Das Gesamtkonzept des Deutschen Referenzkorpus DeReKo. Vom Design bis zur Verwendung und darüber hinaus

Author: Kupietz, Marc ; Lüngen, Harald ; Diewald, Nils

Published: 2023

Publisher: Berlin/Boston : de Gruyter ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication]

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11595 https://ids-pub.bsz-bw.de/files/11595/Kupietz_Luengen_Diewald_Das_Gesamtkonzept_des_DeReKo_2023.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-115951 https://doi.org/10.1515/9783111085708-002

The German Reference Corpus DeReKo serves as an empirical basis for German linguistics. In this contribution we give an overview of the fundamentals of DeReKo, recent developments, and its possible uses, as well as an insight into its overall strategic concept, which aims to make DeReKo usable, despite limited resources, both for as many applications as possible and for innovative and demanding ones. In particular, we explain strategies for preparing very large corpora with necessarily heuristic methods, and the challenges that arise on the way to making such corpora linguistically accessible.

Source:	BASE subject section German Studies
Language:	German
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Germanic languages; German (430)
Subject headings:	Corpus; Empirical linguistics; German studies; Data preparation; Language data; Heuristics; Research data; Contrastive linguistics
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-9) 2021. Limerick, 12 July 2021 (Online-Event)

Author: Lüngen, Harald ; Kupietz, Marc ; Bański, Piotr ; Barbaresi, Adrien ; Clematide, Simon ; Pisetta, Ines

Published: 2021

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10467 https://ids-pub.bsz-bw.de/files/10467/CMLC9_Proceedings_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-104676 https://doi.org/10.14618/ids-pub-10467

Contents: 1. Julien Abadji, Pedro Javier Ortiz Suárez, Laurent Romary and Benoît Sagot: "Ungoliant: An Optimized Pipeline for the Generation of a Very Large-Scale Multilingual Web Corpus", pp. 1-9. 2. Markus Gärtner, Felicitas Kleinkopf, Melanie Andresen and Sibylle Hermann: "Corpus Reusability and Copyright - Challenges and Opportunities", pp. 10-19. 3. Nils Diewald, Eliza Margaretha and Marc Kupietz: "Lessons learned in Quality Management for Online Research Software Tools in Linguistics", pp. 20-26.

Source:	BASE subject section German Studies
Language:	English
Media type:	Book (monograph)
Format:	Online
DDC classification:	Language (400)
Subject headings:	Corpus; Data management; Computational linguistics; Copyright; Research data
License:	creativecommons.org/licenses/by/4.0/deed.de ; info:eu-repo/semantics/openAccess

Legal issues related to the use of twitter data in language research

Author: Kamocki, Paweł ; Hannesschläger, Vanessa ; Hoorn, Esther ; Kelli, Aleksei ; Kupietz, Marc ; Lindén, Krister ; Puksas, Andrius

Published: 2021

Publisher: Utrecht : CLARIN ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10718 https://ids-pub.bsz-bw.de/files/10718/Kamocki_Hannesschlaeger_Legal_issues_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-107188

Twitter data is used in a wide variety of research disciplines in Social Sciences and Humanities. Although most Twitter data is publicly available, its re-use and sharing raise many legal questions related to intellectual property and personal data protection. Moreover, the use of Twitter and its content is subject to the Terms of Service, which also regulate re-use and sharing. This extended abstract provides a brief analysis of these issues and introduces the new Academic Research product track, which enables authorized researchers to access Twitter API on a preferential basis.

Source:	BASE subject section German Studies
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Law (340); Language (400)
Subject headings:	Law; Twitter <software platform>; Research data; Social media; Social sciences; Digital humanities; Intellectual property; Data protection
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Recent developments in the European Reference Corpus EuReCo

Author: Kupietz, Marc ; Diewald, Nils ; Trawiński, Beata ; Cosma, Ruxandra ; Cristea, Dan ; Tufiş, Dan ; Váradi, Tamás ; Wöllstein, Angelika

Published: 2023

Publisher: Louvain-la-Neuve : Presses universitaires de Louvain ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11829 https://ids-pub.bsz-bw.de/files/11829/Kupietz_Diewald_Recent_developments_in_EuReCo_2020.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-118291

This paper reports on recent developments within the European Reference Corpus EuReCo, an open initiative that aims at providing and using virtual and dynamically definable comparable corpora based on existing national, reference or other large corpora. Given the well-known shortcomings of other types of multilingual corpora such as parallel/translation corpora (shining-through effects, over-normalization, simplification, etc.) or web-based comparable corpora (covering only web material), EuReCo provides a unique linguistic resource offering new perspectives for fine-grained contrastive research on authentic cross-linguistic data, applications in translation studies and foreign language teaching and learning.

Source:	BASE subject section German Studies
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Subject headings:	Corpus; Research data; Language data; Contrastive linguistics; Translation studies; Foreign language teaching; Foreign language learning
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess
