Abstract: A good knowledge-based visual question answering (KB-VQA) model requires detailed visual information, semantically clear questions, and relevant external knowledge to address open visual ...
Abstract: The rapidly evolving field of robotics necessitates methods that can facilitate the fusion of multiple modalities. Specifically, when it comes to interacting with tangible objects, ...