Value Iteration Networks with Double Estimator for Planetary Rover Path Planning.

  • Authors: Jin X; Lan W; Wang T; Yu P
  • Source:
    Sensors (Basel, Switzerland) [Sensors (Basel)] 2021 Dec 16; Vol. 21 (24). Date of Electronic Publication: 2021 Dec 16.
  • Publication Type:
    Journal Article
  • Language:
    English
  • Additional Information
    • Source:
      Publisher: MDPI Country of Publication: Switzerland NLM ID: 101204366 Publication Model: Electronic Cited Medium: Internet ISSN: 1424-8220 (Electronic) Linking ISSN: 14248220 NLM ISO Abbreviation: Sensors (Basel) Subsets: MEDLINE
    • Publication Data:
      Original Publication: Basel, Switzerland : MDPI, c2000-
    • Subject:
    • Abstract:
      Path planning technology is significant for planetary rovers that perform exploration missions in unfamiliar environments. In this work, we propose a novel global path planning algorithm based on the value iteration network (VIN), which embeds a differentiable planning module built on the value iteration (VI) algorithm and has emerged as an effective method for learning to plan. Despite its ability to learn environment dynamics and perform long-range reasoning, the VIN suffers from several limitations, including sensitivity to initialization and poor performance in large-scale domains. We introduce the double value iteration network (dVIN), which decouples action selection from value estimation in the VI module, using the weighted double estimator method to approximate the maximum expected value instead of maximizing over the estimated action values. We also devise a simple yet effective two-stage training strategy for VI-based models to address the high computational cost and poor performance in large domains. We evaluate the dVIN on planning problems in grid-world domains and on realistic datasets generated from terrain images of a moon landscape. We show that the dVIN empirically outperforms the baseline methods and generalizes better to large-scale environments.
    • References:
      Nature. 2015 Feb 26;518(7540):529-33. (PMID: 25719670)
      Nature. 2020 Dec;588(7839):604-609. (PMID: 33361790)
    • Grant Information:
      2016YFC0301500 National Key Research and Development Program of China
    • Contributed Indexing:
      Keywords: deep neural network; double estimator method; planetary rover path planning; reinforcement learning; value iteration algorithm
    • Subject:
      Date Created: 20211228 Date Completed: 20211229 Latest Revision: 20211231
    • Subject:
      20221213
    • Identifier:
      PMC8709000
    • Identifier:
      10.3390/s21248418
    • Identifier:
      34960508
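
The abstract contrasts maximizing over estimated action values with a weighted double estimator that decouples action selection from value estimation. This record does not give the dVIN update rule, so the following is only a minimal sketch of that general idea on plain arrays. It assumes two independent sets of action-value estimates, `mu_a` and `mu_b`, and a weighting of the form used in weighted double Q-learning; the function names and the constant `c` are hypothetical and are not taken from the paper.

```python
import numpy as np


def max_estimator(mu_a):
    """Single estimator: select and evaluate with the same estimates."""
    return float(np.max(mu_a))


def double_estimator(mu_a, mu_b):
    """Double estimator: select the action with mu_a, evaluate it with mu_b."""
    best = int(np.argmax(mu_a))
    return float(mu_b[best])


def weighted_double_estimator(mu_a, mu_b, c=1.0):
    """Weighted blend of the single and double estimates.

    Assumed weighting (hypothetical here): beta grows with the gap that
    mu_b assigns between the best and worst actions chosen by mu_a.
    """
    best = int(np.argmax(mu_a))
    worst = int(np.argmin(mu_a))
    gap = abs(mu_b[best] - mu_b[worst])
    beta = gap / (c + gap)  # lies in [0, 1) for c > 0
    return float(beta * mu_a[best] + (1.0 - beta) * mu_b[best])


# Two hypothetical, independently obtained estimates of three action values.
mu_a = np.array([1.0, 3.0, 2.0])
mu_b = np.array([0.5, 2.0, 2.5])

print(max_estimator(mu_a))                    # selection and evaluation coupled
print(double_estimator(mu_a, mu_b))           # selection and evaluation decoupled
print(weighted_double_estimator(mu_a, mu_b))  # interpolates between the two
```

Because the weighted estimate is a convex combination of the single and double estimates for the selected action, it always falls between them. The intent of such a blend is to temper both the overestimation of the plain maximum and the underestimation of the pure double estimator.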