Recommendation of deep reinforcement learning based on value function considering error reduction

Authors: JinLian Zhou; DeRong Shen; Ying Guo; Yan Wu; JianHua Ma
Source:
Scientific Reports, Vol 15, Iss 1, Pp 1-17 (2025)
Subject:
Knowledge graph; Reinforcement learning; Proximal policy optimization; Recommendation system; Medicine; Science
Record type:
article
Language:
English
Online access:
https://doaj.org/article/eea8ac0ad98a4ddb8ad30f5b3916cbf3

Additional information
- Publication details:
  Nature Portfolio, 2025.
- Subject:
  2025
- Collection:
  LCC:Medicine
  LCC:Science
- Abstract:
  Deep reinforcement learning (DRL) algorithms have been widely applied in user cold-start recommender systems because they can gradually capture users’ dynamic interest preferences. Deep Q-Networks (DQN) have become the most popular reinforcement learning (RL) method due to their simple update strategy and excellent performance. In many user cold-start scenarios, the action space is gradually reduced to avoid recommending duplicate items to users. However, current DQN-based RL recommender systems output the entire action space fixedly, inevitably leading to discrepancies with the gradually shrinking action space. This paper demonstrates that such discrepancies cause a decrement error in the action space corresponding to the temporal difference (TD) in the original RL, rendering standard DQN reinforcement learning methods inaccurate in Q-value estimation. Moreover, in long-term recommendation scenarios, the differences in the lengths of interactions recommended to different users are significant, making it difficult to ignore such errors, thereby challenging the applicability of these methods in scenarios where the action space gradually reduces. To address this issue, this paper introduces a new algorithm called Q-AD (Q-learning Action Decrease), which is based on DQN and aims to mitigate the reduction error in the action space by buffering the Q-value estimation error at each update. Q-AD augments the standard DQN with an error reduction term for TD updates. Through experiments, it was observed that the Q-AD algorithm significantly reduces value estimation errors and achieves better accuracy and efficiency compared to previous methods across different datasets.
- File Description:
  electronic resource
- ISSN:
  2045-2322
- Relation:
  https://doaj.org/toc/2045-2322
- Identifier:
  10.1038/s41598-025-18926-7
- Identifier:
  edsdoj.8ac0ad98a4ddb8ad30f5b3916cbf3
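
The mismatch described in the abstract can be illustrated with a minimal sketch. It is not the Q-AD algorithm: the abstract does not specify the form of the error reduction term, so none is shown here. The sketch only contrasts a one-step TD target whose maximum runs over the full, fixed action space with one whose maximum runs over the items not yet recommended. The function name `td_target`, the Q-values, the reward, and the discount factor are all hypothetical, chosen for illustration.

```python
def td_target(reward, next_q, gamma, available=None):
    """One-step TD target: reward + gamma * max_a Q(s', a).

    next_q:    list of Q-value estimates for every item (full action space).
    available: optional set of item indices still recommendable in s';
               None means the max runs over the full action space.
    """
    if available is None:
        candidates = next_q
    else:
        candidates = [next_q[a] for a in available]
    # With no items left there is nothing to bootstrap from.
    best_next = max(candidates) if candidates else 0.0
    return reward + gamma * best_next


# Hypothetical numbers, for illustration only.
next_q = [0.9, 0.4, 0.7, 0.1]        # Q(s', a) for four items
recommended = {0}                    # item 0 was already shown to the user
remaining = set(range(len(next_q))) - recommended

full = td_target(1.0, next_q, 0.5)                # max over all four items
masked = td_target(1.0, next_q, 0.5, remaining)   # max over remaining items
gap = full - masked                               # overestimate from ignoring the shrinkage
```

In this toy case the full-space target bootstraps from an item that can no longer be recommended, so it exceeds the target restricted to the remaining items. Restricting the maximum is shown only to make the discrepancy visible; the paper's own remedy is the error reduction term added to the TD update.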
