Answer Search Indonesian Language Hadith Using Vector Space Model in PDF Document
Abstract
Digital text documents circulate in many formats; the most widely used today include Word and PDF. This research builds an application for searching text documents using the vector space model. The document format used is PDF: text is extracted from the PDF and then ranked using the vector space model. The PDF document consists of ten pages, each containing one hadith. In general, the system searches the PDF document quite well and displays a list of results ordered by their relevance to the question.
Keywords: answer retrieval, vector space model, text mining
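The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which term weighting or similarity measure the system uses, so TF-IDF weights and cosine similarity are assumed here as one common instance of the vector space model. PDF extraction is left out: the sketch assumes each page has already been extracted to a plain string. All function names and the sample page texts are hypothetical placeholders, not material from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_tfidf(pages):
    """Return (idf, vectors): IDF weights and one sparse TF-IDF vector per page."""
    tokenized = [tokenize(page) for page in pages]
    n_pages = len(pages)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per page
    # Assumption: smoothed IDF, log(N / df) + 1, so no term gets zero weight.
    idf = {term: math.log(n_pages / df) + 1.0 for term, df in doc_freq.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_pages(question, pages):
    """Return (page_index, score) pairs with score > 0, best match first."""
    idf, vectors = build_tfidf(pages)
    # Question terms that occur in no page carry no weight and are dropped.
    counts = Counter(t for t in tokenize(question) if t in idf)
    query = {term: tf * idf[term] for term, tf in counts.items()}
    scored = [(i, cosine(query, vec)) for i, vec in enumerate(vectors)]
    return sorted(
        [(i, s) for i, s in scored if s > 0.0],
        key=lambda pair: pair[1],
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical placeholder pages standing in for the extracted PDF text.
    pages = [
        "halaman satu membahas niat dan amal",
        "halaman dua membahas kebersihan dan wudhu",
        "halaman tiga membahas sedekah dan amal",
    ]
    for index, score in rank_pages("apa hubungan niat dan amal", pages):
        print(f"page {index + 1}: {score:.3f}")
```

In a fuller pipeline, the `pages` list would come from a PDF text extraction step, and stop-word removal or Indonesian stemming could be applied in `tokenize`; neither is confirmed by the abstract.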
Similar resources
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
Indonesian-English Cross Language Question Answering
Our Indonesian-English Cross Language Question Answering (CLQA) is divided into 4 components: question analyzer, keyword translator, passage retriever and answer finder component. The Indonesian question is inputted into a question analyzer which yields Indonesian keyword list, Indonesian question focus and question class. We defined the question class by using an SVM machine implemented in Wek...
A Machine Learning Approach for an Indonesian-English Cross Language Question Answering System
We have built a CLQA (Cross Language Question Answering) system for a source language with limited data resources (e.g. Indonesian) using a machine learning approach. The CLQA system consists of four modules: question analyzer, keyword translator, passage retriever and answer finder. We used machine learning in two modules, the question classifier (part of the question analyzer) and the answer ...
Classification of Hadiths using LVQ based on VSM Considering Words Orde
The religion of Islam is based on a sacred text called Qur'an, a divine speech expressed in Arabic language. Qur'an constitutes the main root of Islam jurisprudence which has a second source of inspiration known as Hadiths. As the Muslim's life is governed by those holy texts, need of their authenticity is required. Using VSM (Vector Space Model), we can represent Hadiths as a vector of words. ...