
Knowledge Distillation for Ensemble Learning, Generative Modeling, and Continual Learning ; 앙상블학습, 생성모델링, 지속학습을 위한 지식 증류

  • Additional Information
    • Contributors:
      한보형; Minsoo Kang; Department of Electrical and Computer Engineering, College of Engineering; Machine Learning, Computer Vision
    • Publication Information:
      Graduate School, Seoul National University
    • Subject:
      2024
    • Collection:
      Seoul National University: S-Space
    • Abstract:
      Thesis (Ph.D.) -- Graduate School, Seoul National University: Department of Electrical and Computer Engineering, College of Engineering, February 2024. Advisor: 한보형. ; Deep neural networks have achieved superior performance in various domains, including computer vision, natural language processing, robotics, and many other applications. Despite these accomplishments, their applicability to resource-constrained systems is still limited by their computational and physical costs in terms of model size, FLOPs, and power consumption. In addition, when the models are trained on a sequence of tasks in an online manner, they fail to generalize well, especially on previous tasks. This dissertation addresses these two limitations of deep neural networks via knowledge distillation. In the case of ensemble learning, although aggregating the predictions of multiple models is an effective and straightforward way to achieve better generalization, it significantly increases inference cost, which makes deployment on portable devices more challenging. To tackle this issue, we compress multiple models into a single model via knowledge distillation, treating the ensemble as the teacher. Based on the ensemble teacher, we propose an oracle knowledge distillation loss combined with neural architecture search, which addresses the model-capacity issue and enables the target model to achieve accuracy competitive with the teacher. Generative models likewise typically incur high inference costs, which limits their use on low-power devices such as mobile phones. Existing knowledge distillation approaches for compressing generative models simply employ L1 or L2 norms to compute the pixel-wise difference between the representations of the two networks, a framework widely adopted in image classification. Going beyond these methods, we propose a knowledge distillation approach that maximizes the mutual information between the student and the teacher. For the domain of generative modeling, ... (a minimal sketch of the generic distillation objective appears after this record)
    • File Description:
      xiv, 128
    • ISBN:
      978-0-00-000000-2
      0-00-000000-0
    • Relation:
      000000179939; https://hdl.handle.net/10371/210033; https://dcollection.snu.ac.kr/common/orgView/000000179939
    • Electronic Access:
      https://hdl.handle.net/10371/210033
      https://dcollection.snu.ac.kr/common/orgView/000000179939
    • Accession Number:
      edsbas.4059D192
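
The abstract above describes two distillation-based results (an ensemble teacher for classification and a mutual-information objective for generative models) without reproducing the losses themselves. As a point of reference only, the sketch below shows the generic knowledge-distillation objective with an ensemble teacher: the student matches the averaged, temperature-softened teacher predictions while also fitting the ground-truth labels. It is a minimal PyTorch sketch; the function name and the `temperature` and `alpha` values are illustrative assumptions, not the dissertation's oracle distillation loss.

```python
import torch
import torch.nn.functional as F

def ensemble_distillation_loss(student_logits, teacher_logits_list, labels,
                               temperature=4.0, alpha=0.5):
    """Generic knowledge-distillation objective with an ensemble teacher.

    The ensemble target is the average of the teachers' temperature-softened
    probabilities; the student is trained to match it (KL term) while also
    fitting the ground-truth labels (cross-entropy term).
    """
    # Average the softened teacher distributions to form the ensemble target.
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)

    # KL divergence between the student's softened prediction and the target.
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd_term = F.kl_div(log_student, teacher_probs,
                       reduction="batchmean") * temperature ** 2

    # Supervised cross-entropy on the hard labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term
```

Scaling the KL term by the square of the temperature keeps its gradient magnitude comparable to the cross-entropy term, following the standard convention for distillation losses of this form.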