Weak Labelling for File-level Source Code Classification

Authors: Sas, Cezar; Capiluppi, Andrea
Source:
Sas, C & Capiluppi, A 2023, Weak Labelling for File-level Source Code Classification. in T Zhang, X Xia & N Novielli (eds), Proceedings - 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2023. Institute of Electrical and Electronics Engineers Inc., pp. 698-702, 30th IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2023, Macao, China, 21/03/2023. https://doi.org/10.1109/SANER56733.2023.00074
Subject:
semantic reverse engineering; software categories key-words; software classification
Record Type:
article in journal/newspaper
Language:
English

Additional Information
- Contributors:
  Zhang, Tao; Xia, Xin; Novielli, Nicole
- Publisher Information:
  Institute of Electrical and Electronics Engineers Inc.
- Publication Year:
  2023
- Collection:
  University of Groningen research database
- Abstract:
  Software repository hosting services contain large amounts of open-source software, with GitHub hosting over 200 million repositories, from new to established ones. However, these repositories are not easy to find, calling for various attempts to classify their application domains automatically. However, most proposed approaches use artifacts, like README files, as a proxy for the project, losing the information in the source code and the interaction between files. Furthermore, they all focus on the project-level, ignoring the decomposition of software projects into components and modules. This work presents a weak labelling approach based on keyword extraction to annotate source files in a software project. Our findings suggest that using keywords to perform file-level annotations is an effective approach that can capture enough information from the source file so that new labels can be predicted. The long-term goal of our research is to classify source code files and use these annotations to identify semantic components in software projects. In addition, these annotations can be used for semantic reverse engineering, software reuse, and more. We plan to train machine learning models that use our proposed weak supervision to better annotate source files inside software projects.
- File Description:
  application/pdf
- ISBN:
  978-1-66545-278-6
  1-66545-278-1
- Relation:
  urn:ISBN:9781665452786
- DOI:
  10.1109/SANER56733.2023.00074
- Online Access:
  https://hdl.handle.net/11370/3a266d78-ef50-47e0-b3fd-ad61643e1fd1
  https://research.rug.nl/en/publications/3a266d78-ef50-47e0-b3fd-ad61643e1fd1
  https://doi.org/10.1109/SANER56733.2023.00074
  https://pure.rug.nl/ws/files/744460647/Weak_Labelling_for_File-level_Source_Code_Classification.pdf
  http://www.scopus.com/inward/record.url?scp=85160512827&partnerID=8YFLogxK
- Rights:
  info:eu-repo/semantics/openAccess
- Accession Number:
  edsbas.B9E2448
