Search results

Datensatz Schwache Maskulina

Author: Weber, Thilo

Published: 2023

Publisher: Leibniz-Institut für Deutsche Sprache, Mannheim

Access:	Resolving system; long-term archiving (German National Library); publisher (free of charge)

Source:	DNB subject group German Language and Literature
Contributor:	Hansen, Sandra (author)
Language:	German
Media type:	Unspecified
Format:	Online
Other identifiers:	doi: 10.14618/schwachemaskulinadb urn: urn:nbn:de:bsz:mh39-124718
Keywords:	Corpus <linguistics>; German; Dataset; Masculine; Noun; Grammar; Grammis
Further keywords:	Weak masculine noun; Corpus grammar; Deutsches Referenzkorpus (DeReKo); Indefinite article
Extent:	Online resource
Note(s):	In: Mannheim : Leibniz-Institut für Deutsche Sprache, (2023) In: Grammatisches Informationssystem „grammis“

Datensatz Genitiv- und von-Attribute

Author: Kopf, Kristin

Published: 2021

Publisher: Leibniz-Institut für Deutsche Sprache (IDS), Mannheim

Access:	Resolving system; long-term archiving (German National Library); publisher (free of charge)

Source:	DNB subject group German Language and Literature
Contributor:	Bildhauer, Felix (author)
Language:	German
Media type:	Unspecified
Format:	Online
Other identifiers:	doi: 10.14618/genitivvondb urn: urn:nbn:de:bsz:mh39-107238
Keywords:	Attribute; Noun phrase; Corpus <linguistics>; Genitive attribute; Grammis; Grammar; Dataset
Extent:	Online resource
Note(s):	In: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), (2021) In: Grammatisches Informationssystem „grammis“

Datensatz Verschachtelte Genitivattribute

Author: Kopf, Kristin

Published: 2021

Publisher: Leibniz-Institut für Deutsche Sprache (IDS), Mannheim

Access:	Resolving system; long-term archiving (German National Library); publisher (free of charge)

Source:	DNB subject group German Language and Literature
Language:	German
Media type:	Unspecified
Format:	Online
Other identifiers:	doi: 10.14618/schachtelgenitivDB urn: urn:nbn:de:bsz:mh39-107223
Keywords:	Genitive attribute; Grammis; Noun; Noun phrase; Grammar; Dataset; Corpus <linguistics>
Extent:	Online resource
Note(s):	In: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), (2021) In: Grammatisches Informationssystem „grammis“

Datensatz Nominalphrasen

Author: Weber, Thilo

Published: 2021

Publisher: Leibniz-Institut für Deutsche Sprache (IDS), Mannheim

Access:	Resolving system; long-term archiving (German National Library); publisher (free of charge)

Source:	DNB subject group German Language and Literature
Language:	German
Media type:	Unspecified
Format:	Online
Other identifiers:	doi: 10.14618/lex.nominalphrasendb urn: urn:nbn:de:bsz:mh39-107201
Keywords:	Noun phrase; Grammis; Grammar; Linguistics; Syntax; Dataset; Determination <linguistics>
Extent:	Online resource
Note(s):	In: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS), (2021) In: Grammatisches Informationssystem „grammis“

Datensatz Nominalphrasen

Author: Weber, Thilo

Published: 2021

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10720 https://ids-pub.bsz-bw.de/files/10720/Weber_Datensatz_Nominalphrasen_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-107201 https://doi.org/10.14618/lex.nominalphrasendb

The dataset Nominalphrasen contains attestations of non-pronominal (i.e. full, lexical) noun phrases (NPs) headed by a noun or a nominalization. Each attestation is annotated for a range of linguistically relevant features. In total, the dataset contains 8,137 attestations. After false hits have been sorted out (see the columns "valide" and "nicht-valide_Begründung"), 7,813 relevant attestations remain. The corpus query targeted the head noun; for details on data collection, see Weber (2021a). The head noun appears in the column "Kopf_der_NP". In some cases the NP consists of the head noun alone, but in most cases it extends beyond it, covering part of the preceding context (column "Satzkontext_vor_Beleg") and/or of the following context ("Satzkontext_nach_Beleg"). The dataset serves the study of the syntactic functions of NPs (Weber 2021a) and of determination in the NP (Weber 2021b).

Source:	BASE subject selection German Studies
Language:	German
Media type:	Unspecified
Format:	Online
DDC classification:	Language (400)
Keywords:	Noun phrase; Dataset; Grammis; Syntax; Determination; Grammar
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Datensatz Verschachtelte Genitivattribute

Author: Kopf, Kristin

Published: 2021

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10722 https://ids-pub.bsz-bw.de/files/10722/Kopf_Datensatz_Verschachtelte_Genitivattribute_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-107223 https://doi.org/10.14618/schachtelgenitivDB

The dataset contains 409 corpus attestations of noun phrases with embedded genitive attributes that in turn contain an embedded genitive attribute (Petras Nachfolgers Beisein). The attestations are classified according to whether the first embedded noun phrase precedes or follows the head noun of the overall noun phrase (Petras Nachfolgers Beisein vs. Beisein Petras Nachfolgers) and whether the first embedded noun phrase contains an adjective in addition to a genitive (Beisein Petras direkten Nachfolgers). For each attestation, the lemmas of the three nouns are also given in their order of embedding. Metadata (country, year) are included as well. The dataset contains all relevant attestations from the KoGra-Untersuchungskorpus with the structures listed in the following. The queries for the four structure types yielded 15,875 candidate attestations, of which 409 proved on manual inspection to be genuine noun phrases with doubly embedded genitive attributes. The dataset serves the study of special cases of the genitive attribute (Kopf 2021).

Source:	BASE subject selection German Studies
Language:	German
Media type:	Unspecified
Format:	Online
DDC classification:	Language (400)
Keywords:	Grammis; Grammar; Dataset; Genitive attribute; Corpus; Noun phrase
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Datensatz Genitiv- und von-Attribute

Author: Kopf, Kristin ; Bildhauer, Felix

Published: 2021

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10723 https://ids-pub.bsz-bw.de/files/10723/Kopf_Bildhauer_Datensatz_Genitiv_und_von_Attribute_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-107238 https://doi.org/10.14618/genitivvondb

The dataset contains 16,604 corpus attestations of noun phrases with genitive and von-attributes (die Ideen zahlreicher Kinder, die Ideen von zahlreichen Kindern), where the genitive attributes may occur pre- or postnominally (Mannheims Sehenswürdigkeiten, die Sehenswürdigkeiten Mannheims). For each attestation, information on country, decade and medium is included. In addition, there are annotations for the head and/or attribute lemma (e.g. name type, inflection class), the overall phrase (e.g. definiteness, case) and the attribute phrase (e.g. case distinction, length). Numerous special cases are also annotated (e.g. genitive with an uninflected adjective as in Gebäck Mannheimer Bäckereien, phrases with an adjectivally inflecting attribute noun as in die Ideen Jugendlicher, die Ideen von Jugendlichen).

Source:	BASE subject selection German Studies
Language:	German
Media type:	Unspecified
Format:	Online
DDC classification:	Language (400)
Keywords:	Grammis; Grammar; Dataset; Genitive attribute; Corpus; Noun phrase
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Datenmanagement – Gegenstand und Dienst der Computerlinguistik. 40th Annual Conference of the German Linguistic Society. Stuttgart, Germany.

Author: Trippel, Thorsten

Published: 2021

Publisher: Konstanz : Deutsche Gesellschaft für Sprachwissenschaft ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10812 https://ids-pub.bsz-bw.de/files/10812/Trippel_Datenmanagement_2018.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-108122

Data management is increasingly becoming part of the research landscape, driven by research funding organisations (for example in the EU's Horizon 2020, the Alliance of Science Organisations in Germany, or in DFG-funded projects). For computational linguistics, however, research data management is also part of the research field itself: data modelling and transformation for sustainable data storage fall within text technology and text linguistics, as does the modelling of the descriptive data about datasets.

Source:	BASE subject selection German Studies
Language:	German
Media type:	Unspecified
Format:	Online
DDC classification:	Language (400)
Keywords:	Data management; Computational linguistics; Research data; Data storage; Text technology; Text linguistics; Dataset; Metadata; Language processing; Linked Data
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Datensatz attributive dass-Sätze und zu-Infinitive

Author: Bildhauer, Felix ; Weber, Thilo

Published: 2023

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11610 https://ids-pub.bsz-bw.de/files/11610/Bildhauer_Weber_Datensatz_attributive_dass_Saetze_2023.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-116107 https://doi.org/10.14618/attributsatzdb

The dataset contains 10,113 corpus attestations of constructions in which a noun occurs with a dass-clause or a zu-infinitive (das Versprechen, dass man sich irgendwann wiedersieht vs. das Versprechen, sich irgendwann wiederzusehen). The data were drawn from: 1. the Korpusgrammatik-Untersuchungskorpus (Bubenhofer et al. 2014), based on the Deutsches Referenzkorpus DeReKo (Kupietz et al. 2010, 2018), release 2017-II; 2. the "Forum" subcorpus of the DECOW16B web corpus (Schäfer & Bildhauer 2012).

Source:	BASE subject selection German Studies
Language:	German
Media type:	Unspecified
Format:	Online
DDC classification:	Language (400)
Keywords:	Dataset; Corpus; Grammar; Grammis
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Datensatz Schwache Maskulina

Author: Weber, Thilo ; Hansen, Sandra

Published: 2024

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/12471 https://ids-pub.bsz-bw.de/files/12471/Weber_Hansen_Datensatz_Schwache_Maskulina_2023.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-124718 https://doi.org/10.14618/schwachemaskulinadb

The dataset contains a collection of 1,156 nouns (masculine with few exceptions) that are attested at least once with the "weak" ending -(e)n immediately after an accusative or dative form of the indefinite article (einen / einem), e.g. einen Aktivisten, einem Autoren. The attestations come from the Korpusgrammatik-Untersuchungskorpus (Bubenhofer et al. 2014), based on the Deutsches Referenzkorpus DeReKo (Kupietz et al. 2010, 2018), release 2017-II. For details on data collection, see Weber & Hansen (2023).

Source:	BASE subject selection German Studies
Language:	German
Media type:	Unspecified
Format:	Online
DDC classification:	Language (400)
Keywords:	Dataset; Masculine; Noun; Corpus; Grammar; Grammis
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Webtechnologien zur visuellen Darstellung von 3D-Objekten anhand von Datensätzen

Smart Graphics

Author: Schlender, Klaus

Published: 2017

Publisher: GRIN Verlag, München

Access:	Resolving system; long-term archiving (German National Library); publisher

Source:	Union catalogues
Language:	German
Media type:	Ebook
Format:	Online
ISBN:	9783668403710
Other identifiers:	9783668403710 urn: urn:nbn:de:101:1-201709011180
Edition:	1st edition
Keywords:	Visualization; Commercial graphics; Design; Technology; Dataset
Further keywords:	(Produktform)Electronic book text; (BISAC Subject Heading)COM051000; Smart;Graphics;Webtechnologien;Darstellung;3D;Objekten;Datensatz;Java;Enterprise;Edition;Kontext;EE;Bachelor;Ingenieur;Medieninformatik;Informatik;Kunst;Design;Grafikdesign;Gestaltungsregeln;Gestaltgesetze;Frondend;HTML;CSS;JS;JavaScript;Canvas;WebGL;three;threejs;Visualisierungstechnologie;Backend;MVC;Entwurfsmuster;Prasentationsschicht;Steuerungsschicht;Geschäftslogikschicht;Persistenzschicht;Technologien;Daten;Austausch;Java Server Faces;API;JSON; (VLB-WN)1635
Extent:	Online resource, 48 pages
Note(s):	Subject to licence. Offered by the publisher as print on demand and/or as an e-book

Geschlechtergerechte Sprache auf den Webseiten deutscher, österreichischer, schweizerischer und Südtiroler Städte

Author: Müller-Spitzer, Carolin ; Ochs, Samira

Published: 2023

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11743 https://ids-pub.bsz-bw.de/files/11743/Mueller_Spitzer_Ochs_Geschlechtergerechte_Sprache_2023.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-117430 https://doi.org/10.14618/sr-2-2023_mue

Source:	BASE subject selection German Studies
Language:	German
Media type:	Journal article
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Gender-fair language; Website; Dataset; Corpus; German
License:	creativecommons.org/licenses/by-sa/3.0/de/deed.de ; info:eu-repo/semantics/openAccess

New opportunities for researching digital youth language: The NottDeuYTSch corpus

Author: Cotgrove, Louis

Published: 2023

Publisher: Tübingen : Narr ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication]

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11879 https://ids-pub.bsz-bw.de/files/11879/Cotgrove_New_opportunities_for_researching_2023.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-118796

This article details the process of creating the Nottinghamer Korpus deutscher YouTube-Sprache ('The Nottingham German YouTube Language Corpus' - or NottDeuYTSch corpus) and outlines potential research opportunities. The corpus was compiled to analyse the online language produced by young German-speakers and offers significant opportunity for in-depth research across several linguistic fields including lexis, morphology, syntax, orthography, and conversational and discursive analysis. The NottDeuYTSch corpus contains over 33 million words taken from approximately 3 million YouTube comments from videos published between 2008 and 2018 and targeted at a young, German-speaking demographic, and represents an authentic language snapshot of young German speakers. The corpus was proportionally sampled based on video category and year from a database of 112 popular German-speaking YouTube channels in the DACH region for optimal representativeness and balance, and contains a considerable amount of associated metadata for each comment that enable further longitudinal cross-sectional analyses. The NottDeuYTSch corpus is available for analysis as part of the German Reference Corpus (DeReKo).

Source:	BASE subject selection German Studies
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Germanic languages; German (430)
Keywords:	Youth language; Corpus; YouTube; German; Metadata; Computer-mediated communication; Dataset
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Implicitly abusive comparisons – a new dataset and linguistic analysis

Author: Wiegand, Michael ; Geulig, Maja ; Ruppenhofer, Josef

Published: 2021

Publisher: Stroudsburg, Pennsylvania : Association for Computational Linguistics

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10417 https://ids-pub.bsz-bw.de/files/10417/Wiegand_Geulig_Ruppenhofer_Implicitly_Abusive_Comparisons_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-104170

We examine the task of detecting implicitly abusive comparisons (e.g. “Your hair looks like you have been electrocuted”). Implicitly abusive comparisons are abusive comparisons in which abusive words (e.g. “dumbass” or “scum”) are absent. We detail the process of creating a novel dataset for this task via crowdsourcing that includes several measures to obtain a sufficiently representative and unbiased set of comparisons. We also present classification experiments that include a range of linguistic features that help us better understand the mechanisms underlying abusive comparisons.

Source:	BASE subject selection German Studies
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Comparison <rhetoric>; Dataset; Crowdsourcing; Insult; Verbal abuse
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Implicitly abusive language – What does it actually look like and why are we not getting there?

Author: Wiegand, Michael ; Ruppenhofer, Josef ; Eder, Elisabeth

Published: 2021

Publisher: Stroudsburg, Pennsylvania : Association for Computational Linguistics

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10449 https://ids-pub.bsz-bw.de/files/10449/Wiegand_Ruppenhofer_Eder_Implicitly_abusive_language_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-104498

Abusive language detection is an emerging field in natural language processing which has received a large amount of attention recently. Still, the success of automatic detection is limited. In particular, the detection of implicitly abusive language, i.e. abusive language that is not conveyed by abusive words (e.g. dumbass or scum), is not working well. In this position paper, we explain why existing datasets make learning implicit abuse difficult and what needs to be changed in the design of such datasets. Arguing for a divide-and-conquer strategy, we present a list of subtypes of implicitly abusive language and formulate research tasks and questions for future research.

Source:	BASE subject selection German Studies
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Automatic language analysis; Research data; Dataset; Insult; Verbal abuse
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

LRTwiki: enriching the likelihood ratio test with encyclopedic information for the extraction of relevant terms

Author: Jakob, Niklas ; Müller, Mark-Christoph ; Gurevych, Iryna

Published: 2022

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11090 https://ids-pub.bsz-bw.de/files/11090/Jakob_Mueller_Gurevych_LRTwiki_2009.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-110906

This paper introduces LRTwiki, an improved variant of the Likelihood Ratio Test (LRT). The central idea of LRTwiki is to employ a comprehensive domain specific knowledge source as additional “on-topic” data sets, and to modify the calculation of the LRT algorithm to take advantage of this new information. The knowledge source is created on the basis of Wikipedia articles. We evaluate on the two related tasks product feature extraction and keyphrase extraction, and find LRTwiki to yield a significant improvement over the original LRT in both tasks.

Source:	BASE subject selection German Studies
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Library and information sciences (020); Language (400)
Keywords:	Likelihood ratio test; Encyclopedia; Information Extraction; Dataset; Algorithm; Wikipedia; Error analysis
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Flexible UIMA components for information retrieval research

Author: Müller, Christof ; Zesch, Torsten ; Müller, Mark-Christoph ; Bernhard, Delphine ; Ignatova, Kateryna ; Gurevych, Iryna ; Mühlhäuser, Max

Published: 2022

Publisher: Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11096 https://ids-pub.bsz-bw.de/files/11096/Mueller_Zesch_Flexible_UIMA_components_2008.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-110969

In this paper, we present a suite of flexible UIMA-based components for information retrieval research which have been successfully used (and re-used) in several projects in different application domains. Implementing the whole system as UIMA components is beneficial for configuration management, component reuse, implementation costs, analysis and visualization.

Source:	BASE subject selection German Studies
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400); Library and information sciences (020)
Keywords:	Information Retrieval; Configuration management; Information Extraction; Dataset; Research; Algorithm; Automatic language analysis; Information management
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Knowledge sources for bridging resolution in multi-party dialog

Author: Müller, Mark-Christoph ; Mieskes, Margot ; Strube, Michael

Published: 2022

Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11102 https://ids-pub.bsz-bw.de/files/11102/Mueller_Mieskes_Knowledge_sources_2008.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-111024

In this paper we investigate the coverage of the two knowledge sources WordNet and Wikipedia for the task of bridging resolution. We report on an annotation experiment which yielded pairs of bridging anaphors and their antecedents in spoken multi-party dialog. Manual inspection of the two knowledge sources showed that, with some interesting exceptions, Wikipedia is superior to WordNet when it comes to the coverage of information necessary to resolve the bridging anaphors in our data set. We further describe a simple procedure for the automatic extraction of the required knowledge from Wikipedia by means of an API, and discuss some of the implications of the procedure’s performance.

Source:	BASE subject selection German Studies
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Dialogue; WordNet; Wikipedia; Spoken language; Information; Dataset; Knowledge extraction; API; Discourse; Semantic Web; Lexicon
License:	creativecommons.org/licenses/by-nc-sa/3.0/ ; info:eu-repo/semantics/openAccess

Proceedings of the LREC 2022 Workshop on Challenges in the Management of Large Corpora (CMLC-10 2022). Marseille, 20 June 2022

Author: Bański, Piotr ; Barbaresi, Adrien ; Clematide, Simon ; Kupietz, Marc ; Lüngen, Harald

Published: 2022

Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11111 https://ids-pub.bsz-bw.de/files/11111/Banski_CMLC_10_Proceedings_2022.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-111115

Contents:
1. Vasile Pais, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Roxana Micu and Carol Luca Gasan: Challenges in Creating a Representative Corpus of Romanian Micro-Blogging Text. Pp. 1-7
2. Modest von Korff: Exhaustive Indexing of PubMed Records with Medical Subject Headings. Pp. 8-15
3. Luca Brigada Villa: UDeasy: a Tool for Querying Treebanks in CoNLL-U Format. Pp. 16-19
4. Nils Diewald: Matrix and Double-Array Representations for Efficient Finite State Tokenization. Pp. 20-26
5. Peter Fankhauser and Marc Kupietz: Count-Based and Predictive Language Models for Exploring DeReKo. Pp. 27-31
6. Hanno Biber: “The word expired when that world awoke.” New Challenges for Research with Large Text Corpora and Corpus-Based Discourse Studies in Totalitarian Times. Pp. 32-35

Source:	BASE subject selection German Studies
Language:	English
Media type:	Book (monograph)
Format:	Online
DDC classification:	Language (400)
Keywords:	Corpus; Data; Data management; Data collection; Data analysis; Dataset; Data quality
License:	creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess

Beyond the stars: exploiting free-text user reviews to improve the accuracy of movie recommendations

Author: Jakob, Niklas ; Weber, Stefan Hagen ; Müller, Mark-Christoph ; Gurevych, Iryna

Published: 2022

Publisher: New York : Association for Computing Machinery ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication]

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11139 https://ids-pub.bsz-bw.de/files/11139/Jakob_Weber_Mueller_Beyond_the_stars_2009.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-111390 https://doi.org/10.1145/1651461.1651473

In this paper we show that the extraction of opinions from free-text reviews can improve the accuracy of movie recommendations. We present three approaches to extract movie aspects as opinion targets and use them as features for the collaborative filtering. Each of these approaches requires different amounts of manual interaction. We collected a data set of reviews with corresponding ordinal (star) ratings of several thousand movies to evaluate the different features for the collaborative filtering. We employ a state-of-the-art collaborative filtering engine for the recommendations during our evaluation and compare the performance with and without using the features representing user preferences mined from the free-text reviews provided by the users. The opinion mining based features perform significantly better than the baseline, which is based on star ratings and genre information only.

Source:	BASE subject selection German Studies
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Review; Film; Recommendation; Collaborative filtering; Dataset; User; Automatic language analysis; Text analysis; Database; Data Mining; Algorithm; Recommender system
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Identifying implicitly abusive remarks about identity groups using a linguistically informed approach

Author: Wiegand, Michael ; Eder, Elisabeth ; Ruppenhofer, Josef

Published: 2022

Publisher: Stroudsburg : Association for Computational Linguistics ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11261 https://ids-pub.bsz-bw.de/files/11261/Wiegand_Identifying_implicitly_abusive_remarks_2022.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-112614 https://doi.org/10.18653/v1/2022.naacl-main.410

We address the task of distinguishing implicitly abusive sentences on identity groups (“Muslims contaminate our planet”) from other group-related negative polar sentences (“Muslims despise terrorism”). Implicitly abusive language consists of utterances not conveyed by abusive words (e.g. “bimbo” or “scum”). So far, the detection of such utterances could not be properly addressed since existing datasets displaying a high degree of implicit abuse are fairly biased. Following the recently proposed strategy to solve implicit abuse by separately addressing its different subtypes, we present a new focused and less biased dataset that consists of the subtype of atomic negative sentences about identity groups. For that task, we model components that each address one facet of such implicit abuse, i.e. depiction as perpetrators, aspectual classification and non-conformist views. The approach generalizes across different identity groups and languages.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Dataset; Insult; Verbal abuse; Computational linguistics
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Language matters. The European research infrastructure CLARIN, today and tomorrow

Authors: de Jong, Franciska ; Van Uytvanck, Dieter ; Frontini, Francesca ; van den Bosch, Antal ; Fišer, Darja ; Witt, Andreas

Published: 2022

Publisher: Berlin/Boston : de Gruyter ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11285 https://ids-pub.bsz-bw.de/files/11285/de_Jong_Van_Uytvanck_Language_matters_2022.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-112858 https://doi.org/10.1515/9783110767377-002

CLARIN stands for “Common Language Resources and Technology Infrastructure”. In 2012 CLARIN ERIC was established as a legal entity with the mission to create and maintain a digital infrastructure to support the sharing, use, and sustainability of language data (in written, spoken, or multimodal form) available through repositories from all over Europe, in support of research in the humanities and social sciences and beyond. Since 2016 CLARIN has had the status of Landmark research infrastructure and currently it provides easy and sustainable access to digital language data and also offers advanced tools to discover, explore, exploit, annotate, analyse, or combine such datasets, wherever they are located. This is enabled through a networked federation of centres: language data repositories, service centres, and knowledge centres with single sign-on access for all members of the academic community in all participating countries. In addition, CLARIN offers open access facilities for other interested communities of use, both inside and outside of academia. Tools and data from different centres are interoperable, so that data collections can be combined and tools from different sources can be chained to perform operations at different levels of complexity. The strategic agenda adopted by CLARIN and the activities undertaken are rooted in a strong commitment to the Open Science paradigm and the FAIR data principles. This also enables CLARIN to express its added value for the European Research Area and to act as a key driver of innovation and contributor to the increasing number of industry programmes running on data-driven processes and the digitalization of society at large.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Article in an edited volume
Format:	Online
DDC classification:	Language (400)
Keywords:	Infrastructure; Language data; Dataset; Open access; Open science; FAIR data principles; Innovation; Digitalization; SSH
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

Show imperatives in smartphone-based showing sequences in Czech and German

Author: Oloff, Florence

Published: 2022

Publisher: Göttingen : Verlag für Gesprächsforschung ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication]

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11310 https://ids-pub.bsz-bw.de/files/11310/Oloff_Show_imperatives_2021.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-113101

This article examines how the most frequent imperative forms of the verb to show in German (zeig mal) and Czech (ukaž) are deployed in object-centred sequences. Specifically, it focuses on smartphone-based showing activities as these were the main sequential environments of show imperatives in the datasets investigated. In both languages, the imperative form does not merely aim to elicit a responsive action from the smartphone holder (such as making the device available) but projects an individual course of action from the requester’s side in the form of an immediate visual inspection of the digital content. This inspection is carried out as part of a joint course of action, allowing the recipient to provide a more detailed response to a prior action. Therefore, this specific imperative form is proven to be cross-linguistically suited to technology-mediated inspection sequences.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Journal article
Format:	Online
DDC classification:	Language (400)
Keywords:	Dataset; Interaction; Language use; Conversation analysis; Smartphone; German; Czech; Imperative; Request
License:	rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

Metadata formats for learner corpora: case study and discussion

Author: Lange, Herbert

Published: 2023

Publisher: Linköping : LiU Electronic Press ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11458 https://ids-pub.bsz-bw.de/files/11458/Lange_Metadata_formats_for_learner_corpora_2022.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-114588 https://doi.org/10.3384/ecp190011

Metadata provides important information relevant both to finding and understanding corpus data. Meaningful linguistic data requires both reasonable annotations and documentation of these annotations. This documentation is part of the metadata of a dataset. While corpus documentation has often been provided in the form of accompanying publications, machine-readable metadata, both containing the bibliographic information and documenting the corpus data, has many advantages. Metadata standards allow for the development of common tools and interfaces. In this paper I want to add a new perspective from an archive’s point of view and look at the metadata provided for four learner corpora and discuss the suitability of established standards for machine-readable metadata. I am aware that there is ongoing work towards metadata standards for learner corpora. However, I would like to keep the discussion going and add another point of view: increasing findability and reusability of learner corpora in an archiving context.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Metadata; Corpus; Computational linguistics; Annotation; Documentation; Dataset; Archiving
License:	creativecommons.org/licenses/by/4.0/ ; info:eu-repo/semantics/openAccess

RefCo and its checker: improving language documentation corpora’s reusability through a semi-automatic review process

Authors: Lange, Herbert ; Aznar, Jocelyn

Published: 2023

Publisher: Paris : European Language Resources Association (ELRA) ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Full text:	https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/11459 https://ids-pub.bsz-bw.de/files/11459/Lange_Aznar_RefCo_and_its_checker_2022.pdf
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:mh39-114592

The QUEST (QUality ESTablished) project aims at ensuring the reusability of audio-visual datasets (Wamprechtshammer et al., 2022) by devising quality criteria and curating processes. RefCo (Reference Corpora) is an initiative within QUEST in collaboration with DoReCo (Documentation Reference Corpus, Paschen et al. (2020)) focusing on language documentation projects. Previously, Aznar and Seifart (2020) introduced a set of quality criteria dedicated to documenting fieldwork corpora. Based on these criteria, we establish a semi-automatic review process for existing and work-in-progress corpora, in particular for language documentation. The goal is to improve the quality of a corpus by increasing its reusability. A central part of this process is a template for machine-readable corpus documentation and automatic data verification based on this documentation. In addition to the documentation and automatic verification, the process involves a human review and potentially results in a RefCo certification of the corpus. For each of these steps, we provide guidelines and manuals. We describe the evaluation process in detail, highlight the current limits of automatic evaluation, and explain how the manual review is organized accordingly.

Source:	BASE Fachausschnitt Germanistik
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Language (400)
Keywords:	Corpus; Documentation; Dataset; Certification; Guideline; Language data; Spoken language; Annotation; Computational linguistics
License:	creativecommons.org/licenses/by-nc/4.0/ ; info:eu-repo/semantics/openAccess
