Beyond Value Estimation: Adaptive Potential Functions For Reinforcement Learning

  • Authors: Chen, Yifei
  • Source:
    Chen, Y 2025, 'Beyond Value Estimation: Adaptive Potential Functions For Reinforcement Learning', Doctor of Philosophy, University of Groningen, [Groningen]. https://doi.org/10.33612/diss.1246874213
  • Record type:
    book
  • Language:
    English
  • Additional information
    • Publication information:
      University of Groningen
    • Subject:
      2025
    • Collection:
      University of Groningen research database
    • Abstract:
      Despite tremendous advances in reinforcement learning algorithms over the last decade, fundamental limitations remain. While reinforcement learning is an effective process in the biological domain, its digital counterpart requires extensive training experience and evolves slowly, with significant uncertainty and variation. From the outset of this project, we hypothesized that ineffective estimation of value and utility might be the root cause of these challenges. This thesis focuses on improving reinforcement learning (RL) by addressing two major challenges: inaccurate value estimation and dependence on extensive domain knowledge. It investigates the value-estimation process, a core aspect of RL, with two primary contributions: 1. Learning Rate and Overestimation Bias: This study examines how learning rates influence overestimation bias in Q-learning algorithms, demonstrating that decaying learning rates enhance value-estimation accuracy and training performance across diverse environments. We emphasize that setting correct learning rates is essential to avoiding the overestimation problem of Q-learning algorithms. 2. Adaptive Potential Function (APF): This novel reward-shaping mechanism accelerates learning by leveraging historical trial data. APF is integrated into RL frameworks (e.g., Q-learning and actor-critic methods) and adapted for tasks in discrete, continuous, and high-dimensional state spaces. Experimental results confirm its effectiveness across mazes, robotic control tasks, and Atari games. Additionally, a deep-learning-based encoder, W-Net, further enhances APF's performance in high-dimensional environments by reducing memory demands. Empirical results show substantial improvements in training efficiency and performance across various Atari games. This thesis concludes by highlighting APF and W-Net as key contributions to RL research, with possibilities for broader applications.
    • File Description:
      application/pdf
    • Identifier:
      10.33612/diss.1246874213
    • Online access:
      https://hdl.handle.net/11370/166afea6-e169-41ba-86bc-f8604fb1e0fb
      https://research.rug.nl/en/publications/166afea6-e169-41ba-86bc-f8604fb1e0fb
      https://doi.org/10.33612/diss.1246874213
      https://pure.rug.nl/ws/files/1246874215/Title_and_contents.pdf
      https://pure.rug.nl/ws/files/1246874217/Chapter_1.pdf
      https://pure.rug.nl/ws/files/1246874219/Chapter_2.pdf
      https://pure.rug.nl/ws/files/1246874221/Chapter_3.pdf
      https://pure.rug.nl/ws/files/1246874223/Chapter_4.pdf
      https://pure.rug.nl/ws/files/1246874225/Chapter_5.pdf
      https://pure.rug.nl/ws/files/1246874227/Chapter_6.pdf
    • Rights:
      info:eu-repo/semantics/openAccess
    • Identifier:
      edsbas.521290C8