
The Dark Side of Likert-type Scales: Implications of the Midscale Disagreement Problem

  • Additional Information
    • Contributors:
      Cognition, langues, langage, ergonomie (CLLE); École Pratique des Hautes Études (EPHE); Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-Université Toulouse - Jean Jaurès (UT2J); Université de Toulouse (UT)-Université de Toulouse (UT)-Université Bordeaux Montaigne (UBM)-Centre National de la Recherche Scientifique (CNRS)-Toulouse Mind & Brain Institut (TMBI); Université Toulouse - Jean Jaurès (UT2J); Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3); Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J); Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3); Université de Toulouse (UT)
    • Publication Information:
      HAL CCSD
    • Publication Year:
      2024
    • Collection:
      EPHE (Ecole pratique des hautes études, Paris): HAL
    • Abstract:
      International audience; Proper stimulus control in psychology experiments plays a fundamental role in the validity of their results. In many cases, the criteria used for stimulus selection include dimensions (e.g. concreteness, emotional valence/arousal) assessed behaviourally through Likert-type scales. Typically, a pilot group of participants is asked to rate a list of potential stimuli on a scale. The average rating for each item is then computed and used to determine which items fit the criteria to establish the experiment’s stimulus lists. Recently, several factors such as technological advances, the need for standardised materials and a high number of theoretically relevant dimensions to control have led to a proliferation of large rating databases which provide standard summary statistics for hundreds to tens of thousands of items. Despite their importance and the increasing amount of resources dedicated to their collection, however, the ratings in themselves have been subject to surprisingly little methodological consideration. One of the key assumptions in the aforementioned approach is that the average rating reflects the item’s position on the scale’s continuum. Through a case study in psycholinguistics, we show instead that most items with an average rating towards the middle of the scale display high disagreement among raters, and thus that their averages do not, by themselves, capture any meaningful information about the underlying responses. Rather, they are an artefact of the scale. After providing an intuitive graphical interpretation of Likert-type summary statistics (the typically reported means and standard deviations), we derive two additional implications of this midscale disagreement problem and argue that it greatly affects the validity of a large number of studies – either because of inadequate stimulus sampling or statistical modelling. We finally extend our analysis to some fields in which Likert-type ratings are treated as the dependent variable and show that they also suffer ... (an illustrative sketch of the midscale disagreement problem follows this record)
    • Electronic Access:
      https://univ-tlse2.hal.science/hal-04712729
      https://univ-tlse2.hal.science/hal-04712729v1/document
      https://univ-tlse2.hal.science/hal-04712729v1/file/Poster_PSE6.pdf
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • Identifier:
      edsbas.4E32DD66
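
The abstract's central claim, that a midscale average can mask strong rater disagreement, is easy to reproduce numerically. The sketch below is illustrative only and is not taken from the paper: the two hypothetical items and their ratings are invented to show how identical means on a 1–7 Likert-type scale can arise either from genuine consensus around the midpoint or from polarised responses at the extremes.

```python
# Illustrative sketch (hypothetical data, not from the paper): two items on a
# 1-7 Likert-type scale with the same midscale mean but very different rater
# behaviour, i.e. the "midscale disagreement" issue described in the abstract.
from statistics import mean, stdev

# Item A: raters genuinely perceive the item as midscale (consensus around 4).
item_a = [4, 4, 3, 4, 5, 4, 4, 3, 5, 4]

# Item B: raters are polarised between the scale's extremes (no consensus).
item_b = [1, 7, 1, 7, 1, 7, 2, 6, 1, 7]

for name, ratings in [("Item A", item_a), ("Item B", item_b)]:
    print(f"{name}: mean = {mean(ratings):.2f}, SD = {stdev(ratings):.2f}")

# Both means land at 4.00, yet only Item A's mean describes a typical rater;
# Item B's mean is an artefact of averaging polarised responses, which the
# much larger standard deviation makes visible.
```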