Information fusion in visual question answering: A Survey
Related resources
Investigating Embedded Question Reuse in Question Answering
The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance by reusing information in the answer to one question to answer another related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...
Visual question answering: A survey of methods and datasets
Visual Question Answering (VQA) is a challenging task that has received increasing attention from both the computer vision and the natural language processing communities. Given an image and a question in natural language, it requires reasoning over visual elements of the image and general knowledge to infer the correct answer. In the first part of this survey, we examine the state of the art b...
Biomedical question answering: A survey
OBJECTIVES: In this survey, we reviewed the current state of the art in biomedical QA (Question Answering), within a broader framework of semantic knowledge-based QA approaches, and projected directions for future research development in this critical area of intersection between Artificial Intelligence, Information Retrieval, and Biomedical Informatics. MATERIALS AND METHODS: We devised a ...
Survey of Recent Advances in Visual Question Answering
Visual Question Answering (VQA) presents a unique challenge as it requires the ability to understand and encode the multi-modal inputs in terms of image processing and natural language processing. The algorithm further needs to learn how to perform reasoning over this multi-modal representation so it can answer the questions correctly. This paper presents a survey of different approaches propos...
Generalized Hadamard-Product Fusion Operators for Visual Question Answering
We propose a generalized class of multimodal fusion operators for the task of visual question answering (VQA). We identify generalizations of existing multimodal fusion operators based on the Hadamard product, and show that specific nontrivial instantiations of this generalized fusion operator exhibit superior performance in terms of OpenEnded accuracy on the VQA task. In particular, we introdu...
Journal
Journal title: Information Fusion
Year: 2019
ISSN: 1566-2535
DOI: 10.1016/j.inffus.2019.03.005