Adding object detection skills to visual dialogue agents

  • Authors: Bani, G.; Belli, D.; Dagan, G.; Geenen, A.; Skliar, A.; Venkatesh, A.; Baumgärtner, T.; Bruni, E.; Fernández, R.
  • Source:
    Bani, G, Belli, D, Dagan, G, Geenen, A, Skliar, A, Venkatesh, A, Baumgärtner, T, Bruni, E & Fernández, R 2019, Adding object detection skills to visual dialogue agents. In L Leal-Taixé & S Roth (eds), Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8-14, 2018: proceedings. vol. IV, Lecture Notes in Computer Science, vol. 11132, Cham, pp. 180-187, 15th European Conference on Computer Vision, Workshops, Munich, Bavaria, Germany ....
  • Record Type:
    article in journal/newspaper
  • Language:
    English
  • Additional Information:
    • Contributors:
      Leal-Taixé, L.; Roth, S.
    • Subject:
      2019
    • Collection:
      Universiteit van Amsterdam: Digital Academic Repository (UvA DARE)
    • Abstract:
      Our goal is to equip a dialogue agent that asks questions about a visual scene with object detection skills. We take the first steps in this direction within the GuessWhat?! game. We use Mask R-CNN object features as a replacement for ground-truth annotations in the Guesser module, achieving an accuracy of 57.92%. This proves that our system is a viable alternative to the original Guesser, which achieves an accuracy of 62.77% using ground-truth annotations, and thus should be considered an upper bound for our automated system. Crucially, we show that our system exploits the Mask R-CNN object features, in contrast to the original Guesser augmented with global, VGG features. Furthermore, by automating the object detection in GuessWhat?!, we open up a spectrum of opportunities, such as playing the game with new, non-annotated images and using the more granular visual features to condition the other modules of the game architecture.
    • File Description:
      application/pdf
    • ISBN:
      978-3-030-11017-8
      3-030-11017-6
    • Relation:
      https://dare.uva.nl/personal/pure/en/publications/adding-object-detection-skills-to-visual-dialogue-agents(d2e5c306-f1de-4a2a-a130-ae7815b41255).html; urn:ISBN:9783030110178
    • Identifier:
      10.1007/978-3-030-11018-5_17
    • Online Access:
      https://dare.uva.nl/personal/pure/en/publications/adding-object-detection-skills-to-visual-dialogue-agents(d2e5c306-f1de-4a2a-a130-ae7815b41255).html
      https://doi.org/10.1007/978-3-030-11018-5_17
      https://hdl.handle.net/11245.1/d2e5c306-f1de-4a2a-a130-ae7815b41255
      https://pure.uva.nl/ws/files/36985420/BaniEtal_sivl2018.pdf
      https://pure.uva.nl/ws/files/36985422/Adding_Object_Detection_Skills_to_Visual_Dialogue_Agents.pdf
      https://staff.fnwi.uva.nl/r.fernandezrovira/papers/2018/BaniEtal-sivl2018.pdf
    • Rights:
      info:eu-repo/semantics/openAccess
    • Identifier:
      edsbas.F6CD094D