- Patent Number:
11,809,523
- Appl. No:
17/178717
- Application Filed:
February 18, 2021
- Abstract:
A method, and information storage media having instructions stored thereon, for supervised Deep Learning (DL) systems to learn directly from unlabeled data without any user annotation. The annotation-free solution incorporates a new learning module, the Localization, Synthesis and Teacher/Annotation Network (LSTN) module, which features a data synthesis and generation engine as well as a Teacher network for object detection and segmentation that feeds the processing loop with new annotated objects detected in images captured in the field. As its first step, the LSTN module learns to localize and segment the objects within a given image/scene in an unsupervised manner, since no annotations of the objects' segmentation masks or bounding boxes are provided.
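The unsupervised localization step described in the abstract rests on thresholding a classification network's activation maps, in the spirit of the class-activation-map work of Zhou et al. cited below. A minimal NumPy sketch of binarizing an activation map into a weak object mask (the function name and the relative-threshold parameter are hypothetical, not from the patent) might look like:

```python
import numpy as np

def weak_mask_from_cam(cam: np.ndarray, rel_threshold: float = 0.5) -> np.ndarray:
    """Binarize a class-activation map into a weak binary object mask.

    Pixels whose activation reaches `rel_threshold` of the map's peak
    value are labeled object (1); all other pixels are background (0).
    """
    cam = cam.astype(np.float64)
    peak = cam.max()
    if peak <= 0:
        # no positive evidence for the class anywhere in the map
        return np.zeros_like(cam, dtype=np.uint8)
    return (cam >= rel_threshold * peak).astype(np.uint8)

# toy activation map with a single bright blob
cam = np.zeros((8, 8))
cam[2:6, 3:7] = 1.0
mask = weak_mask_from_cam(cam, rel_threshold=0.5)
```

In the patented pipeline such weak masks are only a starting point; they are refined by modeling the object/non-object pixel distributions and by training a dedicated segmentation CNN.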
- Inventors:
IRIDA LABS S.A. (Patras, GR)
- Assignees:
IRIDA LABS S.A. (Patras, GR)
- Claim:
1. A method for learning to generate bounding box and segmentation masks from categorical labeled images comprising: collecting two or more images from two or more categories, that are fed, as is, in a process pipeline, and for each of said images: localizing boundaries of objects within the images in an unsupervised learning manner by utilizing a deep Convolutional Neural Network (CNN) classification model, with a global average pool layer, configured to generate soft object proposals and configured to generate weak binary masks around the objects by applying a threshold on activation maps of the classification CNN; using the threshold to define a segmentation mask and assign pixels in object/non-object categories; modeling a distribution of object/non-object pixels represented as vectors learnt from the classification CNN; using the modeled distribution and a threshold to assign pixels to object/non-object categories and extract segmentation masks; training a segmentation CNN model on extracted coarse segmentation masks, thereby determining finer object boundaries; generating novel annotated images by arbitrarily blending segmented objects with other background images; generating bounding boxes by fitting a rectangle on the fine segmentation masks; and outputting the annotated images.
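The last steps of claim 1, fitting a rectangle on a fine segmentation mask and blending a segmented object with another background, can be sketched in NumPy as follows. This is an illustrative sketch with hypothetical helper names for grayscale 2-D arrays, not the patented implementation:

```python
import numpy as np

def bbox_from_mask(mask: np.ndarray):
    """Fit the tightest axis-aligned rectangle around the mask's foreground.

    Returns (x0, y0, x1, y1) in pixel coordinates, or None for an empty mask.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def blend_object(background: np.ndarray, obj: np.ndarray,
                 mask: np.ndarray, top: int, left: int) -> np.ndarray:
    """Paste a segmented object onto a background image at (top, left).

    Object pixels (mask == 1) replace the background; the rest is kept.
    """
    out = background.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]
    out[top:top + h, left:left + w] = np.where(mask.astype(bool), obj, region)
    return out

# demo: paste a 3x3 segmented object into a 10x10 background at row 2, col 3
mask = np.ones((3, 3), dtype=np.uint8)
bg = np.zeros((10, 10))
obj = np.full((3, 3), 5.0)
composite = blend_object(bg, obj, mask, top=2, left=3)

# fit a bounding box on the pasted object's full-image mask
full_mask = np.zeros((10, 10), dtype=np.uint8)
full_mask[2:5, 3:6] = 1
box = bbox_from_mask(full_mask)
```

A real synthesis engine would additionally vary object scale, position, and background per generated image, and use soft (alpha) blending at the boundary rather than a hard paste.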
- Claim:
2. The method of claim 1, wherein the deep CNN classification model is trained to localize objects within images via activation maps.
- Claim:
3. The method of claim 1, wherein the CNN segmentation model is trained to perform fine object segmentation and bounding box regression.
- Claim:
4. A non-transitory computer readable information storage medium having stored therein instructions, that when executed by one or more processors, cause a method to be performed for learning to generate bounding box and segmentation masks from categorical labeled images, comprising: collecting two or more images from two or more categories, that are fed, as is in a process pipeline, and for each of said images: localizing boundaries of objects within the images in an unsupervised learning manner by utilizing a deep Convolutional Neural Network (CNN) classification model, with a global average pool layer, configured to generate soft object proposals and configured to generate weak binary masks around the objects by applying a threshold on activation maps of the classification CNN; using the threshold to define a segmentation mask and assign pixels in object/non-object categories; modeling a distribution of object/non-object pixels represented as vectors learnt from the classification CNN; using the modeled distribution and a threshold to assign pixels to object/non-object categories and extract segmentation masks; training a segmentation CNN model on extracted coarse segmentation masks, thereby determining finer object boundaries; generating novel annotated images by arbitrarily blending segmented objects with other background images; generating bounding boxes by fitting a rectangle on the fine segmentation masks; and outputting the annotated images.
- Claim:
5. The media of claim 4, wherein the deep CNN classification model is trained to localize objects within images via activation maps.
- Claim:
6. The media of claim 4, wherein the CNN segmentation model is trained to perform fine object segmentation and bounding box regression.
- Patent References Cited:
20210027103 January 2021 Brower
20210142107 May 2021 Vineet
20210241034 August 2021 Laradji
20210287054 September 2021 Zhang
20220012530 January 2022 Singh
- Other References:
Pinheiro Pedro O et al: “Learning to Refine Object Segments”, Computer Vision—ECCV 2016—14th European Conference, Oct. 11-14, 2016, vol. 9905, pp. 75-91, XP055828334 (Year: 2016). cited by examiner
Anonymous: “Unsupervised learning”, Wikipedia, Feb. 16, 2021, pp. 1-3, XP055918811 Retrieved from the Internet: https://en.wikipedia.org/w/index.php?title=Unsupervised_learning&oldid=1007200825 (Year: 2021). cited by examiner
Zhou Bolei et al: “Learning Deep Features for Discriminative Localization”, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 27, 2016, pp. 2921-2929, XP033021473 (Year: 2016). cited by examiner
Anonymous “Unsupervised Learning” Wikipedia; pp. 1-3; Feb. 16, 2021. cited by applicant
Pinheiro, Pedro O. et al. “Learning to Refine Object Segments” Springer International Publishing AG 2016; ECCV 2016, Part I, LNCS 9905, pp. 75-91, 2016. cited by applicant
Zhou, Bolei et al. “Learning Deep Features for Discriminative Localization” IEEE Conference on Computer Vision and Pattern Recognition; 2016. cited by applicant
International Search Report for International Application No. PCT/IB2022/051331, dated May 18, 2022. cited by applicant
Written Opinion for International Application No. PCT/IB2022/051331, dated May 18, 2022. cited by applicant
International Preliminary Report on Patentability for International Application No. PCT/IB2022/051331; dated Aug. 31, 2023. cited by applicant
- Primary Examiner:
Hsieh, Ping Y
- Attorney, Agent or Firm:
Vick, Jason H.
Sheridan Ross, PC
- Identifier:
edspgr.11809523