When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

Record type:
Electronic Resource
Online access:
http://arxiv.org/abs/2210.01478

Additional information
- Publisher Information:
  2022-10-04 2022-10-27
- Added Details:
  Jin, Zhijing
  Levine, Sydney
  Gonzalez, Fernando
  Kamal, Ojasv
  Sap, Maarten
  Sachan, Mrinmaya
  Mihalcea, Rada
  Tenenbaum, Josh
  Schölkopf, Bernhard
- Abstract:
  AI systems are becoming increasingly intertwined with human life. In order to effectively collaborate with humans and ensure safety, AI systems need to be able to understand, interpret and predict human moral judgments and decisions. Human moral judgments are often guided by rules, but not always. A central challenge for AI safety is capturing the flexibility of the human moral mind -- the ability to determine when a rule should be broken, especially in novel or unusual situations. In this paper, we present a novel challenge set consisting of rule-breaking question answering (RBQA) of cases that involve potentially permissible rule-breaking -- inspired by recent moral psychology studies. Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MORALCOT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments. MORALCOT outperforms seven existing LLMs by 6.2% F1, suggesting that modeling human reasoning might be necessary to capture the flexibility of the human moral mind. We also conduct a detailed error analysis to suggest directions for future work to improve AI safety using RBQA. Our data is open-sourced at https://huggingface.co/datasets/feradauto/MoralExceptQA and code at https://github.com/feradauto/MoralCoT
  Comment: NeurIPS 2022 Oral
- Subject:
  Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Computer Science - Computers and Society; Computer Science - Machine Learning; text
- Other Numbers:
  COO oai:arXiv.org:2210.01478
  1381571114
- Contributing Source:
  CORNELL UNIV
  From OAIster®, provided by the OCLC Cooperative.
- Identifier:
  edsoai.on1381571114
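
The abstract reports its headline result as an F1 difference. As a reading aid, here is a minimal sketch of how an F1 score can be computed for binary "permissible / not permissible" judgments. The function names, the 0/1 label encoding, and the choice of macro averaging are assumptions made for illustration; the record does not state which F1 variant the authors used.

```python
from typing import Sequence


def f1_for_class(gold: Sequence[int], pred: Sequence[int], positive: int) -> float:
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Sequence[int], pred: Sequence[int], labels=(0, 1)) -> float:
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    return sum(f1_for_class(gold, pred, label) for label in labels) / len(labels)


# Hypothetical toy labels: 1 = "breaking the rule is permissible", 0 = "not permissible".
gold_labels = [1, 1, 0, 0]
model_labels = [1, 0, 0, 0]
score = macro_f1(gold_labels, model_labels)
```

A gap of 6.2% F1, as quoted in the abstract, would correspond to a difference of 0.062 between two such scores if they are on a 0 to 1 scale.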
