Linguistically Conditioned Semantic Textual Similarity

Authors: Tu, Jingxuan; Xu, Keer; Yue, Liulu; Ye, Bingyang; Rim, Kyeongmin; Pustejovsky, James
Subjects:
Computer Science - Computation and Language; Computer Science - Artificial Intelligence
Record Type:
Working Paper
Online Access:
http://arxiv.org/abs/2406.03673

Additional Information
- Publication Year:
  2024
- Collection:
  Computer Science
- Abstract:
  Semantic textual similarity (STS) is a fundamental NLP task that measures the semantic similarity between a pair of sentences. To reduce the inherent ambiguity posed by the sentences, a recent work called Conditional STS (C-STS) has been proposed to measure the sentences' similarity conditioned on a certain aspect. Despite the popularity of C-STS, we find that the current C-STS dataset suffers from various issues that could impede proper evaluation on this task. In this paper, we reannotate the C-STS validation set and observe an annotator discrepancy on 55% of the instances, resulting from annotation errors in the original labels, ill-defined conditions, and a lack of clarity in the task definition. After a thorough dataset analysis, we improve the C-STS task by leveraging the models' capability to understand the conditions under a QA task setting. With the generated answers, we present an automatic error identification pipeline that is able to identify annotation errors in the C-STS data with over 80% F1 score. We also propose a new method that largely improves the performance over baselines on the C-STS data by training the models with the answers. Finally, we discuss the conditionality annotation based on the typed-feature structure (TFS) of entity types. We show in examples that the TFS is able to provide a linguistic foundation for constructing C-STS data with new conditions.
  Comment: To appear in the ACL 2024 main proceedings
- Accession Number:
  edsarx.2406.03673
