Self-supervised object detection from audio-visual correspondence

Authors: Afouras, T.; Asano, Y.M.; Fagan, F.; Vedaldi, A.; Metze, F.
Source:
Afouras, T, Asano, Y M, Fagan, F, Vedaldi, A & Metze, F 2022, Self-supervised object detection from audio-visual correspondence. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition: New Orleans, Louisiana, 19-24 June 2022: proceedings. CVPR, IEEE Computer Society, Los Alamitos, California, pp. 10565-10576, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, United States, 19/06/22. https://doi.org/10.48550/arXiv.2104.06401 , https://doi.org/10.1109/CVPR52688.2022.01032
Record type:
article in journal/newspaper
Language:
English

Additional information
- Publisher information:
  IEEE Computer Society
- Subject:
  2022
- Collection:
  Universiteit van Amsterdam: Digital Academic Repository (UvA DARE)
- Abstract:
  We tackle the problem of learning object detectors without supervision. Differently from weakly-supervised object detection, we do not assume image-level class labels. Instead, we extract a supervisory signal from audio-visual data, using the audio component to “teach” the object detector. While this problem is related to sound source localisation, it is considerably harder because the detector must classify the objects by type, enumerate each instance of the object, and do so even when the object is silent. We tackle this problem by first designing a self-supervised framework with a contrastive objective that jointly learns to classify and localise objects. Then, without using any supervision, we simply use these self-supervised labels and boxes to train an image-based object detector. With this, we outperform previous unsupervised and weakly-supervised detectors for the task of object detection and sound source localization. We also show that we can align this detector to ground-truth classes with as little as one label per pseudo-class, and show how our method can learn to detect generic objects that go beyond instruments, such as airplanes and cats.
- File Description:
  application/pdf
- ISBN:
  978-1-66546-947-0
  1-66546-947-1
- Relation:
  urn:ISBN:9781665469470
- Identifier:
  10.48550/arXiv.2104.06401
- Electronic access:
  https://dare.uva.nl/personal/pure/en/publications/selfsupervised-object-detection-from-audiovisual-correspondence(6874f5ad-b084-4bbb-b3c8-675faafd998a).html
  https://doi.org/10.48550/arXiv.2104.06401
  https://hdl.handle.net/11245.1/6874f5ad-b084-4bbb-b3c8-675faafd998a
  https://pure.uva.nl/ws/files/130946932/Afouras_Self_Supervised_Object_Detection_From_Audio_Visual_Correspondence_CVPR_2022_paper.pdf
  https://www.proceedings.com/65666.html
  https://openaccess.thecvf.com/content/CVPR2022/html/Afouras_Self-Supervised_Object_Detection_From_Audio-Visual_Correspondence_CVPR_2022_paper.html
- Rights:
  info:eu-repo/semantics/openAccess
- Accession number:
  edsbas.BABD00C1
