An Analysis of French-Language Tweets About COVID-19 Vaccines: Supervised Learning Approach

  • Additional Information
    • Contributors:
      Laboratoire de Psychologie Sociale et Cognitive (LAPSCO); Centre National de la Recherche Scientifique (CNRS)-Université Clermont Auvergne (UCA); Institut national polytechnique Clermont Auvergne (INP Clermont Auvergne); Université Clermont Auvergne (UCA); Laboratoire de Mathématiques Blaise Pascal (LMBP)
    • Publication Information:
      HAL CCSD
      JMIR Publications
    • Subject:
      2022
    • Collection:
      Archive ouverte HAL (Hyper Article en Ligne, CCSD - Centre pour la Communication Scientifique Directe)
    • Abstract:
      Background: As the pandemic progressed, disinformation, fake news, and conspiracy theories spread through many parts of society. According to the literature, the disinformation spreading through social media is one of the causes of increased COVID-19 vaccine hesitancy. In this context, the analysis of social media is particularly important, but the large amount of data exchanged on social networks requires specific methods. This is why machine learning and natural language processing (NLP) models are increasingly applied to social media data.
      Objective: The aim of this study is to examine the capability of the CamemBERT French language model to faithfully predict elaborated categories, with the knowledge that tweets about vaccination are often ambiguous, sarcastic, or irrelevant to the studied topic.
      Methods: A total of 901,908 unique French tweets related to vaccination published between July 12, 2021, and August 11, 2021, were extracted using the Twitter API v2. Approximately 2,000 randomly selected tweets were labeled with two types of categorization: (1) arguments for ("pros") or against ("cons") vaccination (sanitary measures included) and (2) the type of content of tweets ("scientific", "political", "social", or "vaccination status"). The CamemBERT model was fine-tuned and tested for the classification of French tweets. The model performance was assessed by computing the F1-score, and confusion matrices were obtained.
      Results: The accuracy of the applied machine learning reached up to 70.6% for the first classification ("pros" and "cons" tweets) and up to 90.0% for the second classification ("scientific" and "political" tweets). Furthermore, a tweet was 1.86 times more likely to be incorrectly classified by the model if it contained fewer than 170 characters (odds ratio = 1.86; 95% confidence interval 1.20 to 2.86).
      Conclusions: The accuracy is affected by the classification chosen and the topic of the message examined. When the vaccine debate is jostled by contested political decisions, ...
    • Relation:
      hal-03659154; https://hal.science/hal-03659154; https://hal.science/hal-03659154v2/document; https://hal.science/hal-03659154v2/file/Sauvayre_et_al_An_Analysis_of_French-Language_Tweets_About_COVID-19_Vaccines_Supervised_Learning_Approach_accepted.pdf
    • Identifier:
      10.2196/37831
    • Rights:
      http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
    • Identifier:
      edsbas.33476818
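
The abstract reports three kinds of quantities: confusion matrices, F1-scores, and an odds ratio with a 95% confidence interval. The sketch below shows one common way to compute each of them. It is an illustration only, not the authors' code: the function names, the toy labels, and the 2x2 counts are hypothetical, and the Wald (log-scale) interval is an assumption, since the record does not say how the interval was obtained.

```python
import math


def confusion_matrix(y_true, y_pred, labels):
    """Count predictions: rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def f1_per_class(matrix):
    """Per-class F1 = 2*TP / (2*TP + FP + FN), read off a confusion matrix."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[r][k] for r in range(n)) - tp  # column total minus TP
        fn = sum(matrix[k]) - tp                       # row total minus TP
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio of a 2x2 table with a Wald interval on the log scale.

    a: short tweets misclassified    b: short tweets correctly classified
    c: long tweets misclassified     d: long tweets correctly classified
    All four counts must be positive. (Assumed layout, for illustration.)
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Toy data, invented for this example (not taken from the study).
labels = ["pros", "cons"]
y_true = ["pros", "pros", "cons", "cons", "cons"]
y_pred = ["pros", "cons", "cons", "cons", "pros"]

matrix = confusion_matrix(y_true, y_pred, labels)
f1_scores = f1_per_class(matrix)
odds_ratio, ci_low, ci_high = odds_ratio_wald_ci(20, 10, 10, 20)

print(matrix)
print(dict(zip(labels, f1_scores)))
print(odds_ratio, ci_low, ci_high)
```

An interval that excludes 1, as the reported 1.20 to 2.86 does, indicates that the association between tweet length and misclassification is unlikely to be due to chance at the 5% level.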