text segmentation

Search results for: text segmentation

Number of results: 227918

A Comparative Study of the Effect of Word Segmentation On Chinese Terminology Extraction

2006

Luning Ji, Qin Lu, Wenjie Li, Yi-Rong Chen

Automatic term extraction is the first step towards automatic or semi-automatic updating of an existing domain knowledge base. Most research has applied word segmentation as a preprocessing step for Chinese term extraction. However, segmentation ambiguity is unavoidable, especially in identifying unknown words in Chinese. In this paper, we discuss the effect and limitations of segmentation to C...

An Empirical Study on Word Segmentation for Chinese Machine Translation

2013

Hai Zhao, Masao Utiyama, Eiichiro Sumita, Bao-Liang Lu

Word segmentation has been shown to be helpful for Chinese-to-English machine translation (MT), yet the way different segmentation strategies affect MT is poorly understood. In this paper, we focus on comparing different segmentation strategies in terms of machine translation quality. Our empirical study covers both English-to-Chinese and Chinese-to-English translation for the first time. Our results ...

Text Line detection and Segmentation in Handwritten Gurumukhi Scripts

2013

Namisha Modi, Khushneet Jindal

Gurumukhi script is a two-dimensional composition of symbols with connected and disconnected diacritics. Handwritten Gurumukhi script has complexities such as connected and overlapping text lines, which are one of the major sources of error during the recognition process. Text line segmentation is a challenging task in unconstrained, writer-independent handwritten document image processing. There is ...

Term Contributed Boundary Feature using Conditional Random Fields for Chinese Word Segmentation Task

2010

Mike Tian-Jian Jiang, Shih-Hung Liu, Cheng-Lung Sung, Wen-Lian Hsu

This paper proposes a novel feature for a conditional random field (CRF) model in a Chinese word segmentation system. The system uses a conditional random field as its machine learning model, with one simple feature called term contributed boundaries (TCB) in addition to the “BIEO” character-based label scheme. TCB can be extracted from unlabeled corpora automatically, and segmentation variations of dif...

Methods for text segmentation from scene images

Journal: ELCVIA Electronic Letters on Computer Vision and Image Analysis, 2014

Semantic Segmentation of Text Using Deep Learning

Journal: Computing and Informatics, 2022

Given a text, can we segment it into semantically coherent sections in an automatic way? Can we detect the semantic boundaries if we know how many there are? Can we determine how many distinct sections there are in a text? These are the questions we address in this paper. To respond, we use Bidirectional Encoder Representations from Transformers (BERT) to analyze the text and evaluate a function that we call local incoherence, which we expect to show maxima at the points w...

Multi-layer segmentation of complex document images

Journal: IJPRAI, 2005

Bing-Fei Wu, Yen-Lin Chen, Chung-Cheng Chiu

Text is commonly printed on a complex background. Segmenting text is an important part of document analysis. In the past, some methods have been presented for the segmentation of texts with images. However, previous studies have not sufficiently addressed complex compound documents. This investigation presents an algorithm for the segmentation of text in various document images. The proposed segment...

On-line Handwritten Japanese Text Recognition by Improving Segmentation Quality

2008

Bilan Zhu, Masaki Nakagawa

This paper describes a method of on-line handwritten Japanese text recognition by improving segmentation quality. The method produces hypothetical segmentation points according to features such as distance and overlap between adjacent strokes. Moreover, it extracts multidimensional features from these hypothetical segmentation points and applies an SVM to the extracted features to produce segm...

Text Pre-processing and Text Segmentation for OCR

2012

Archana A. Shinde

Optical Character Recognition (OCR) systems have been effectively developed for the recognition of printed script. The accuracy of an OCR system depends mainly on the text preprocessing and segmentation algorithms used. When a document is scanned, it can be placed at any arbitrary angle, and it then appears on the computer monitor at the same angle. This paper addresses the algorithm for corre...

Text Independent Methods for Speech Segmentation

2004

Anna Esposito, Guido Aversano

This paper describes several text-independent speech segmentation methods. State-of-the-art applications and the prospective use of automatic speech segmentation techniques are presented, including the direct applicability of automatic segmentation in recognition, coding and speech corpora annotation, which is a central issue in today's speech technology. Moreover, a novel parametric segmentatio...
