Search results for: text segmentation

Number of results: 227918

2006
Luning Ji Qin Lu Wenjie Li Yi-Rong Chen

Automatic term extraction is the first step towards automatic or semi-automatic updating of an existing domain knowledge base. Most research has applied word segmentation as a preprocessing step for Chinese term extraction. However, segmentation ambiguity is unavoidable, especially in identifying unknown words in Chinese. In this paper, we discuss the effect and limitations of segmentation to C...

2013
Hai Zhao Masao Utiyama Eiichiro Sumita Bao-Liang Lu

Word segmentation has been shown to be helpful for Chinese-to-English machine translation (MT), yet the way different segmentation strategies affect MT is poorly understood. In this paper, we focus on comparing different segmentation strategies in terms of machine translation quality. Our empirical study covers both English-to-Chinese and Chinese-to-English translation for the first time. Our results ...

2013
Namisha Modi Khushneet Jindal

Gurumukhi script is a two-dimensional composition of symbols with connected and disconnected diacritics. Handwritten Gurumukhi script has complexities such as connected and overlapping text lines, which are one of the major reasons for errors during the recognition process. Text line segmentation is a challenging job in unconstrained, writer-independent handwritten document image processing. There is ...

2010
Mike Tian-Jian Jiang Shih-Hung Liu Cheng-Lung Sung Wen-Lian Hsu

This paper proposes a novel feature for a conditional random field (CRF) model in a Chinese word segmentation system. The system uses a conditional random field as its machine learning model, with one simple feature called term contributed boundaries (TCB) in addition to the “BIEO” character-based label scheme. TCB can be extracted from unlabeled corpora automatically, and segmentation variations of dif...

Journal: ELCVIA Electronic Letters on Computer Vision and Image Analysis 2014

Journal: Computing and Informatics 2022

Given a text, can we segment it into semantically coherent sections in an automatic way? Can we detect the semantic boundaries, if we know how many they are? Can we determine how many distinct sections there are in the text? These are the questions we address in this paper. To respond, we use Bidirectional Encoder Representations from Transformers (BERT) to analyze the text and evaluate a function that we call local incoherence, which we expect to show maxima at points w...

Journal: IJPRAI 2005
Bing-Fei Wu Yen-Lin Chen Chung-Cheng Chiu

Text is commonly printed on a complex background. Segmenting text is an important part of document analysis. In the past, some methods have been proposed for segmenting text from images. However, previous studies have not sufficiently addressed complex compound documents. This investigation presents an algorithm for the segmentation of text in various document images. The proposed segment...

2008
Bilan Zhu Masaki Nakagawa

This paper describes a method of on-line handwritten Japanese text recognition by improving segmentation quality. The method produces hypothetical segmentation points according to features such as distance and overlap between adjacent strokes. Moreover, it extracts multidimensional features from these hypothetical segmentation points and applies an SVM to the extracted features to produce segm...

2012
Archana A. Shinde

Optical Character Recognition (OCR) systems have been effectively developed for the recognition of printed script. The accuracy of an OCR system mainly depends on the text preprocessing and segmentation algorithm being used. When a document is scanned, it can be placed at an arbitrary angle, and it then appears on the computer monitor at the same angle. This paper addresses the algorithm for corre...

2004
Anna Esposito Guido Aversano

This paper describes several text-independent speech segmentation methods. State-of-the-art applications and the prospective use of automatic speech segmentation techniques are presented, including the direct applicability of automatic segmentation in recognition, coding and speech corpora annotation, which is a central issue in today’s speech technology. Moreover, a novel parametric segmentatio...
