Ergebnisse für *

Es wurden 4 Ergebnisse gefunden.

Zeige Ergebnisse 1 bis 4 von 4.

Sortieren

  1. LRTwiki: enriching the likelihood ratio test with encyclopedic information for the extraction of relevant terms
    Erschienen: 2022
    Verlag:  Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    This paper introduces LRTwiki, an improved variant of the Likelihood Ratio Test (LRT). The central idea of LRTwiki is to employ a comprehensive domain specific knowledge source as additional “on-topic” data sets, and to modify the calculation of the... mehr

     

    This paper introduces LRTwiki, an improved variant of the Likelihood Ratio Test (LRT). The central idea of LRTwiki is to employ a comprehensive domain specific knowledge source as additional “on-topic” data sets, and to modify the calculation of the LRT algorithm to take advantage of this new information. The knowledge source is created on the basis of Wikipedia articles. We evaluate on the two related tasks product feature extraction and keyphrase extraction, and find LRTwiki to yield a significant improvement over the original LRT in both tasks.

     

    Export in Literaturverwaltung
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Konferenzveröffentlichung
    Format: Online
    DDC Klassifikation: Bibliotheks- und Informationswissenschaften (020); Sprache (400)
    Schlagworte: Likelihood-Quotienten-Test; Enzyklopädie; Information Extraction; Datensatz; Algorithmus; Wikipedia; Fehleranalyse
    Lizenz:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  2. Flexible UIMA components for information retrieval research
    Erschienen: 2022
    Verlag:  Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    In this paper, we present a suite of flexible UIMA-based components for information retrieval research which have been successfully used (and re-used) in several projects in different application domains. Implementing the whole system as UIMA... mehr

     

    In this paper, we present a suite of flexible UIMA-based components for information retrieval research which have been successfully used (and re-used) in several projects in different application domains. Implementing the whole system as UIMA components is beneficial for configuration management, component reuse, implementation costs, analysis and visualization.

     

    Export in Literaturverwaltung
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Konferenzveröffentlichung
    Format: Online
    DDC Klassifikation: Sprache (400); Bibliotheks- und Informationswissenschaften (020)
    Schlagworte: Information Retrieval; Konfigurationsmanagement; Information Extraction; Datensatz; Forschung; Algorithmus; Automatische Sprachanalyse; Informationsmanagement
    Lizenz:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  3. Beyond the stars: exploiting free-text user reviews to improve the accuracy of movie recommendations
    Erschienen: 2022
    Verlag:  New York : Association for Computing Machinery ; Mannheim : Leibniz-Institut für Deutsche Sprache (IDS) [Zweitveröffentlichung]

    In this paper we show that the extraction of opinions from free-text reviews can improve the accuracy of movie recommendations. We present three approaches to extract movie aspects as opinion targets and use them as features for the collaborative... mehr

     

    In this paper we show that the extraction of opinions from free-text reviews can improve the accuracy of movie recommendations. We present three approaches to extract movie aspects as opinion targets and use them as features for the collaborative filtering. Each of these approaches requires different amounts of manual interaction. We collected a data set of reviews with corresponding ordinal (star) ratings of several thousand movies to evaluate the different features for the collaborative filtering. We employ a state-of-the-art collaborative filtering engine for the recommendations during our evaluation and compare the performance with and without using the features representing user preferences mined from the free-text reviews provided by the users. The opinion mining based features perform significantly better than the baseline, which is based on star ratings and genre information only.

     

    Export in Literaturverwaltung
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Konferenzveröffentlichung
    Format: Online
    DDC Klassifikation: Sprache (400)
    Schlagworte: Rezension; Film; Empfehlung; Kollaborative Filterung; Datensatz; Benutzer; Automatische Sprachanalyse; Textanalyse; Datenbank; Data Mining; Algorithmus; Empfehlungssystem
    Lizenz:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess

  4. Information extraction with the Darmstadt Knowledge Processing Software Repository (Extended Abstract)
    Erschienen: 2022
    Verlag:  Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

    Current Natural Language Processing (NLP) systems feature high-complexity processing pipelines that require the use of components at different levels of linguistic and application specific processing. These components often have to interface with... mehr

     

    Current Natural Language Processing (NLP) systems feature high-complexity processing pipelines that require the use of components at different levels of linguistic and application specific processing. These components often have to interface with external e.g. machine learning and information retrieval libraries as well as tools for human annotation and visualization. At the UKP Lab, we are working on the Darmstadt Knowledge Processing Software Repository (DKPro) (Gurevych et al., 2007a; Müller et al., 2008) to create a highly flexible, scalable and easy-to-use toolkit that allows rapid creation of complex NLP pipelines for semantic information processing on demand. The DKPro repository consists of several main parts created to serve the purposes of different NLP application areas

     

    Export in Literaturverwaltung
    Quelle: BASE Fachausschnitt Germanistik
    Sprache: Englisch
    Medientyp: Konferenzveröffentlichung
    Format: Online
    DDC Klassifikation: Sprache (400)
    Schlagworte: Information Extraction; Automatische Sprachanalyse; Maschinelles Lernen; Information Retrieval; Annotation; Informationsverarbeitung; Text Mining
    Lizenz:

    rightsstatements.org/page/InC/1.0/ ; info:eu-repo/semantics/openAccess