Adding object detection skills to visual dialogue agents

Authors: Bani, G.; Belli, D.; Dagan, G.; Geenen, A.; Skliar, A.; Venkatesh, A.; Baumgärtner, T.; Bruni, E.; Fernández, R.
Source:
Bani, G, Belli, D, Dagan, G, Geenen, A, Skliar, A, Venkatesh, A, Baumgärtner, T, Bruni, E & Fernández, R 2019, Adding object detection skills to visual dialogue agents. in L Leal-Taixé & S Roth (eds), Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8-14, 2018: proceedings. vol. IV, Lecture Notes in Computer Science, vol. 11132, Cham, pp. 180-187, 15th European Conference on Computer Vision, Workshops, Munich, Bavaria, Germany ....
Record type:
article in journal/newspaper
Language:
English

Additional information
- Contributors:
  Leal-Taixé, L.; Roth, S.
- Subject:
  2019
- Collection:
  Universiteit van Amsterdam: Digital Academic Repository (UvA DARE)
- Abstract:
  Our goal is to equip a dialogue agent that asks questions about a visual scene with object detection skills. We take the first steps in this direction within the GuessWhat?! game. We use Mask R-CNN object features as a replacement for ground-truth annotations in the Guesser module, achieving an accuracy of 57.92%. This proves that our system is a viable alternative to the original Guesser, which achieves an accuracy of 62.77% using ground-truth annotations, and thus should be considered an upper bound for our automated system. Crucially, we show that our system exploits the Mask R-CNN object features, in contrast to the original Guesser augmented with global, VGG features. Furthermore, by automating the object detection in GuessWhat?!, we open up a spectrum of opportunities, such as playing the game with new, non-annotated images and using the more granular visual features to condition the other modules of the game architecture.
- File Description:
  application/pdf
- ISBN:
  978-3-030-11017-8
  3-030-11017-6
- Relation:
  https://dare.uva.nl/personal/pure/en/publications/adding-object-detection-skills-to-visual-dialogue-agents(d2e5c306-f1de-4a2a-a130-ae7815b41255).html; urn:ISBN:9783030110178
- Identifier:
  10.1007/978-3-030-11018-5_17
- Online access:
  https://dare.uva.nl/personal/pure/en/publications/adding-object-detection-skills-to-visual-dialogue-agents(d2e5c306-f1de-4a2a-a130-ae7815b41255).html
  https://doi.org/10.1007/978-3-030-11018-5_17
  https://hdl.handle.net/11245.1/d2e5c306-f1de-4a2a-a130-ae7815b41255
  https://pure.uva.nl/ws/files/36985420/BaniEtal_sivl2018.pdf
  https://pure.uva.nl/ws/files/36985422/Adding_Object_Detection_Skills_to_Visual_Dialogue_Agents.pdf
  https://staff.fnwi.uva.nl/r.fernandezrovira/papers/2018/BaniEtal-sivl2018.pdf
- Rights:
  info:eu-repo/semantics/openAccess
- Identifier:
  edsbas.F6CD094D
