Search results for: text database

Number of results: 420490

2004
Ludovic Lebart

The specific complexity of textual data sets (free answers in surveys, documentary data bases, etc.) is emphasized. Recent trends of research show that classification techniques (discrimination and unsupervised clustering as well) are widely used and have great potential in both Information Retrieval and Text Mining.

2010
Lisa Dalhuijsen Lieven van Velthoven

Out of dissatisfaction with currently available software, we built a music management program for large digital music libraries. We propose a system that solves some of the difficulties of organizing these libraries. The system introduces a new visual interaction style with the user’s music collection. A physics system coupled to album’s genre information allows the user to spatially order his ...

2016
Cyril Grouin

In this paper, we present the experiments we made to recover the original two-column page layout structure from layout-damaged digitized files. We designed several CRF-based approaches, either to identify the column separator or to classify each token from each line into the left or right column. We achieved our best results with a model trained on homogeneous corpora (only files composed of 2 c...

Journal: Data Knowl. Eng. 2002
Anthony Hunter

Structured text is a general concept that is implicit in a variety of approaches to handling information. Syntactically, an item of structured text is a number of grammatically simple phrases together with a semantic label for each phrase. Items of structured text may be nested within larger items of structured text. Much information is potentially available as structured text including tagged ...

2000
Uwe Quasthoff Christian Wolff

In this paper we describe a flexible and portable infrastructure for setting up large monolingual language corpora. The approach is based on collecting a large amount of monolingual text from various sources. The input data is processed on the basis of a sentence-based text segmentation algorithm. We describe the entry structure of the corpus database as well as various query types and tools for...

1999
Gaël Dias Spela Vintar José Gabriel Pereira Lopes Sylvie Guilloré

Various efforts have been made for the development of tools and methods dedicated to the automatic processing of multilingual terminology databases. For that purpose, multilingual parallel corpora have been used as a basis resource. However, most of the neologisms in technical and scientific domains are realised by multiword terms that are rarely identified in parallel corpora. In this paper, w...

2015
M. Malathi M. Srividya

Word spotting is a technique that can extract text from an input image. Here, we applied it to scanned Tamil land documents. Using Gabor features, we extract the feature values for the input image. The main goal is to recognize the text in the document using a K-nearest-neighbor classifier. The features were calculated and then combined. Using these features, we can classify and rec...
