Search results for: arabic text classification
Number of results: 727070
In this work, we present a method for classifying handwritten and printed Arabic text zones in noisy document images. We use Three-Adjacent-Segment (TAS) [8] based features, which capture properties of a script. We construct two different codebooks of the local shape features extracted from a set of handwritten and printed Arabic documents and use them to train both Support Vector Machine and Fish...
Many algorithms have been applied to the problem of Automatic Text Categorization (ATC). Most of the work in this area has been carried out on English texts, with only a few researchers addressing Arabic texts. We have investigated the use of the K-Nearest Neighbour (K-NN) classifier with the Inew, cosine, Jaccard, and Dice similarity measures in order to enhance Arabic ATC. We represent the datas...
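As a rough illustration of the similarity measures named in the snippet above, the following minimal sketch implements cosine, Jaccard, and Dice similarity over bag-of-words vectors and plugs them into a simple K-NN vote. It is not the authors' implementation: the whitespace tokenization, the toy training set `TRAIN`, and the helper name `knn_classify` are all assumptions made for illustration, and the Inew measure is omitted because the snippet does not define it.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard similarity on the sets of distinct terms."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice(a, b):
    """Dice similarity on the sets of distinct terms."""
    sa, sb = set(a), set(b)
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


def knn_classify(query, training, k=3, similarity=cosine):
    """Label `query` by majority vote among its k most similar training texts.

    Texts are split on whitespace (an assumption; real Arabic text would
    normally need normalization and stemming first).
    """
    q = Counter(query.split())
    scored = sorted(
        ((similarity(q, Counter(text.split())), label) for text, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, for illustration only.
TRAIN = [
    ("match team goal", "sports"),
    ("team player goal", "sports"),
    ("bank market price", "economy"),
    ("price market trade", "economy"),
]

if __name__ == "__main__":
    for sim in (cosine, jaccard, dice):
        print(sim.__name__, knn_classify("goal team win", TRAIN, k=3, similarity=sim))
```

Passing the similarity function as a parameter keeps the classifier itself unchanged while the measures are swapped, which is the kind of comparison the snippet describes.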
Word segmentation is an important task for many methods related to document understanding, especially word spotting and word recognition. Several approaches to word segmentation have been proposed for Latin-based languages while a few of them have been introduced for...
Document categorization is an important field in the area of natural language processing. In this paper, we propose using Latent Semantic Indexing (LSI), the singular value decomposition (SVD) method, and clustering techniques to group similar unlabeled documents into a pre-specified number of topics. The generated groups are then categorized using a suitable label. For clustering, we used Expectation–...
Arabic Text Classification using Feature-Reduction Techniques for Detecting Violence on Social Media
The main aim of this thesis is to build adaptive language models of Arabic text that can achieve better compression performance than existing models. Prediction by partial matching (PPM) language models have been the best performing among adaptive language models over the past three decades in terms of compression performance. In order to get such performance for Arabic text, the ri...
We present a novel technique for Arabic morphological annotation. The technique utilizes diacritization to produce morphological annotations of quality comparable to human annotators. Although Arabic text is generally written without diacritics, diacritization is already available for large corpora of Arabic text in several genres. Furthermore, diacritization can be generated at a low cost for ...
Purpose: This research studies the analytic cataloging and Library of Congress classification of Arabic books in the National Library & Archives of IR of Iran during 2001-2007. Methodology: The research uses the documentary method; a list made by the researcher was used for data collection. The samples included 300 catalogues of...
In many applied natural language processing tasks, information is thrown out. For example, in speech recognition systems, prosodic information is commonly discarded; in information retrieval systems, a document is commonly treated as an unordered bag of words and syntactic information is thrown out; and in machine translation systems, pragmatic information (e.g., topic-comment structure and ref...
In this paper we propose a new document representation for text mining based on signal representation and spectral processing with the wavelet transform. Our method offers a solution to the problem of syntactic and semantic descriptor dependency without deleting information. This is done by grouping dependent descriptors into clusters with a single representative. Thereafter each class is represented by a...