VAQA: Visual Arabic Question Answering

Abstract

Visual Question Answering (VQA) is the problem of automatically answering a natural language question about a given image or video. Standard Arabic is the sixth most spoken language around the world. However, to the best of our knowledge, there are neither research attempts nor datasets for VQA in Arabic. In this paper, we generate the first Visual Arabic Question Answering (VAQA) dataset, which is fully automatically generated. The dataset consists of almost 138k Image-Question-Answer (IQA) triplets and is specialized in yes/no questions about real-world images. A novel database schema and an IQA ground-truth generation algorithm are specially designed to facilitate automatic VAQA dataset creation. We propose the first Arabic-VQA system, where the VQA task is formulated as a binary classification problem. The proposed system consists of five modules, namely visual features extraction, question pre-processing, textual features extraction, feature fusion, and answer prediction. Since it is the first research on VQA in Arabic, we investigate several approaches in the question channel to identify the most effective approaches for Arabic question pre-processing and representation. For this purpose, 24 Arabic-VQA models are developed, in which two question-tokenization approaches, three word-embedding algorithms, and four LSTM networks with different architectures are investigated. A comprehensive performance comparison is conducted between all these Arabic-VQA models on the VAQA dataset. Experiments indicate that the performance of all Arabic-VQA models ranges from 80.8 to 84.9%, while two Arabic-specific question pre-processing approaches significantly improve performance: treating the question tool "[Image missing]" as a special case to be separated, and embedding the question words using fine-tuned Word2Vec models from AraVec2.0. The best-performing model treats the question tool as a separate token, embeds the question words using the AraVec2.0 Skip-Gram model, and extracts the textual features using a one-layer unidirectional LSTM. Further, our best Arabic-VQA model is compared with related VQA models developed on other popular VQA datasets in a different natural language, considering their performance only on yes/no questions according to the scope of this paper, showing very comparable performance.
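The last two stages the abstract names, feature fusion and answer prediction as binary classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the element-wise product used for fusion, the logistic output layer, and every function name and parameter below are assumptions made for illustration, since the abstract does not specify them. The image and question feature vectors are taken as given (in the described system they would come from the visual feature extractor and the LSTM).

```python
import math
from typing import List


def fuse(image_features: List[float], question_features: List[float]) -> List[float]:
    """Fuse the two feature vectors by element-wise product.

    Assumption: the abstract does not name the fusion operator.
    """
    if len(image_features) != len(question_features):
        raise ValueError("feature vectors must have the same length")
    return [v * q for v, q in zip(image_features, question_features)]


def predict_yes_probability(fused: List[float], weights: List[float], bias: float) -> float:
    """Logistic output unit: probability that the answer is 'yes' (hypothetical layer)."""
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def answer(image_features: List[float], question_features: List[float],
           weights: List[float], bias: float, threshold: float = 0.5) -> str:
    """Binary classification: map the fused features to a yes/no answer."""
    p = predict_yes_probability(fuse(image_features, question_features), weights, bias)
    return "yes" if p >= threshold else "no"
```

In a trained model, `weights` and `bias` would be learned jointly with the LSTM parameters; here they are plain inputs so the sketch stays dependency-free.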

Related resources

Arabic-English Question Answering

The goal of a Question Answering (QA) system is to provide inexperienced users with flexible access to information, allowing them to write a query in natural language and obtain a concise answer. QA systems are mainly suited to English as the target language. In this paper we will investigate how much the translation of the queries, from the Arabic into the English language, could re...

Question Analysis for Arabic Question Answering Systems

The first step in processing a question in Question Answering (QA) systems is to carry out a detailed analysis of the question to determine what it is asking for and how best to approach answering it. Our question analysis uses several techniques to analyze any question given in natural language: a Stanford POS tagger and parser for the Arabic language, a named entity recognizer...

Towards Logical Inference for Arabic Question-Answering

This article opens a line of thinking about the modeling and analysis of Arabic texts within a question-answering system. The aim is to go beyond traditional investigations focused on morpho-syntactic approaches. We present a new approach that analyzes a text, transforms it into logical predicates, and extracts the accurate answer. In addition, we represent different levels of in...

Investigating Embedded Question Reuse in Question Answering

The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance by reusing information in the answer to one question to answer another related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...

Revisiting Visual Question Answering Baselines

Visual question answering (VQA) is an interesting learning setting for evaluating the abilities and shortcomings of current systems for image understanding. Many of the recently proposed VQA systems include attention or memory mechanisms designed to support “reasoning”. For multiple-choice VQA, nearly all of these systems train a multi-class classifier on image and question features to predict ...

Journal

Journal title: Arabian Journal for Science and Engineering

Year: 2023

ISSN: 2191-4281, 2193-567X

DOI: https://doi.org/10.1007/s13369-023-07687-y