Word Embedding-Based Approaches for Measuring Semantic Similarity of Arabic-English Sentences


Authors: Billah Nagoudi, El Moatez; Ferrero, Jérémy; Schwab, Didier; Cherroun, Hadda
Source:
https://hal.science/hal-01683494
6th International Conference on Arabic Language Processing, Oct 2017, Fez, Morocco
Subject:
Cross-Language; Word Embeddings; Semantic Sentences Similarity; Machine Translation; Arabic-English; [INFO]Computer Science [cs]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Record type:
conference object
Language:
English

Additional information
- Contributors:
  Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole (GETALP); Laboratoire d'Informatique de Grenoble (LIG); Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes 2016-2019 (UGA 2016-2019)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes 2016-2019 (UGA 2016-2019); Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes 2016-2019 (UGA 2016-2019); Compilatio Labs (CLabs); Compilatio; Université Grenoble Alpes 2016-2019 (UGA 2016-2019)
- Publication information:
  HAL CCSD
- Publication year:
  2017
- Collection:
  Université Grenoble Alpes: HAL
- Subject (geographic):
  Fez; Morocco
- Abstract:
  Semantic Textual Similarity (STS) is an important component of many Natural Language Processing (NLP) applications and plays a role in areas such as information retrieval, machine translation, information extraction, and plagiarism detection. In this paper we propose two word embedding-based approaches for measuring the semantic similarity between Arabic-English cross-language sentences. The main idea is to exploit Machine Translation (MT) and improved word embedding representations in order to capture the syntactic and semantic properties of words. MT is used to translate the English sentences into Arabic so that a classical monolingual comparison can be applied. Two word embedding-based methods are then developed to rate the semantic similarity. Additionally, Word Alignment (WA), Inverse Document Frequency (IDF), and Part-of-Speech (POS) weighting are applied to the examined sentences to help identify the words that are most descriptive in each sentence. The performance of our approaches is evaluated on a cross-language dataset containing more than 2400 Arabic-English sentence pairs. Moreover, the proposed methods are validated through the Pearson correlation between our similarity scores and human ratings.
- Relation:
  hal-01683494; https://hal.science/hal-01683494; https://hal.science/hal-01683494/document; https://hal.science/hal-01683494/file/Lecture_Notes_in_Computer_Science__LNCS__2%20%2819%29.pdf
- Rights:
  info:eu-repo/semantics/OpenAccess
- Accession number:
  edsbas.7C157401
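
The pipeline summarized in the abstract (compare sentences through weighted word embeddings, then check the scores against human ratings with Pearson correlation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the English sentence has already been machine-translated, uses English tokens as stand-ins for Arabic words, and relies on a hypothetical two-dimensional `EMBEDDINGS` table, a smoothed IDF formula, and helper names (`idf`, `sentence_vector`, `cosine`, `similarity`, `pearson`) chosen only for illustration. Word alignment and POS weighting are omitted.

```python
import math

# Hypothetical toy embeddings; a real system would load pretrained Arabic vectors.
EMBEDDINGS = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.6],
    "car": [0.0, 1.0],
    "the": [0.5, 0.5],
}

def idf(word, corpus):
    """Smoothed inverse document frequency over a list of tokenized sentences."""
    df = sum(1 for sentence in corpus if word in sentence)
    return math.log((1 + len(corpus)) / (1 + df)) + 1.0

def sentence_vector(tokens, corpus):
    """IDF-weighted sum of the word vectors of the known tokens."""
    dim = len(next(iter(EMBEDDINGS.values())))
    vec = [0.0] * dim
    for tok in tokens:
        if tok not in EMBEDDINGS:
            continue  # skip out-of-vocabulary words
        weight = idf(tok, corpus)
        for i, x in enumerate(EMBEDDINGS[tok]):
            vec[i] += weight * x
    return vec

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity(a, b, corpus):
    """Similarity score for two tokenized sentences."""
    return cosine(sentence_vector(a, corpus), sentence_vector(b, corpus))

def pearson(xs, ys):
    """Pearson correlation between system scores and human ratings."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Toy corpus, assumed to be already translated and tokenized.
corpus = [["the", "cat"], ["the", "dog"], ["the", "car"]]
print(similarity(corpus[0], corpus[1], corpus))
print(similarity(corpus[0], corpus[2], corpus))
```

With these toy vectors, the IDF weighting reduces the influence of the shared word "the", so the score depends mostly on the content words of each sentence.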
