Search results for: historical fields

Number of results: 350717

2016
Dan Garrette Hannah Alpert-Abrams

Historical documents frequently exhibit extensive orthographic variation, including archaic spellings and obsolete shorthand. OCR tools typically seek to produce so-called diplomatic transcriptions that preserve these variants, but many end tasks require transcriptions with normalized orthography. In this paper, we present a novel joint transcription model that learns, unsupervised, a probabili...

2012
Eva Pettersson Beáta Megyesi Joakim Nivre

Even though NLP tools are widely used for contemporary text today, there is a lack of tools that can handle historical documents. Such tools could greatly facilitate the work of researchers dealing with large volumes of historical texts. In this paper we propose a method for extracting verbs and their complements from historical Swedish text, using NLP tools and dictionaries developed for conte...

2003
Michael Droettboom Karl MacMillan Ichiro Fujinaga

This paper describes the Gamera framework for building custom document recognition systems. This open-source system is designed to support the test-and-refine development cycle: an important style for developing recognition systems that work with difficult historical documents, since the solutions are often non-obvious. This paper explains the overall architecture of the system, in addition to d...

2015
Rafael C. Carrasco Isabel Martínez-Sempere Enrique Mollá-Gandía Felipe Sánchez-Martínez Gustavo Candela Romero Maria Pilar Escobar Esteban

The BVC section of the impact-es diachronic corpus of historical Spanish compiles 86 books containing approximately 2 million words. About 27% of the words, which provide representative coverage of the most frequent word forms, have been annotated with their lemma, part of speech, and modern equivalent following the Text Encoding Initiative guidelines. We describe how this type of annotation can...

2010
Sai-Ming Li Mohammad Mahdian R. Preston McAfee

The standard business model in the sponsored search marketplace is to sell click-throughs to the advertisers. This involves running an auction that allocates advertisement opportunities based on the value the advertiser is willing to pay per click, times the click-through rate of the advertiser. The click-through rate of an advertiser is the probability that if their ad is shown, it would be cl...

2011
Sokratis Vavilis Ergina Kavallieratou Roberto Paredes Kostas Sotiropoulos

In this chapter, a binarization technique specifically designed for historical document images is presented. Existing binarization techniques focus either on finding an appropriate global threshold or adapting a local threshold for each area in order to remove smear, strains, uneven illumination etc. Here, a hybrid approach is presented that first applies a global thresholding technique and, th...

2016
Mathias Coeckelbergs Seth van Hooland

Providing useful and efficient semantic annotations is a major challenge for knowledge design of any body of text, especially historical documents. In this article, we propose Topic Modeling as an important first step to gather semantic information beyond the lexicon which can be added as annotations in the SHEBANQ. By laying out a case study, we discuss both noise and structure found in compar...

Journal: Pattern Recognition Letters 2014
Raid Saabni Abedelkadir Asi Jihad El-Sana

0167-8655/$ see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.patrec.2013.07.007 Corresponding author at: Department of Computer Science, Triangle Research & Development Center, Kafr Qarea, Israel.

2017
Christoph Dann Emma Brunskill René Kizilcec

In this project, we aim to estimate the effect of different teaching strategies in a tutoring system on student learning and how that effect varies across different groups of students. More specifically, we want to shed light on whether choosing exercise problems adaptively based on prior student performance is more effective at teaching elementary school students about fractions than non-adapt...

2008
Jyi-Shane Liu

In this paper, we report a databank development project in which structured textual data from historical documents are extracted to provide information access of higher data granularity. The availability of the databank opens up tremendous opportunities for research topics in government personnel systems that were limited by data acquisition difficulty in the past. The project demonstrates the ...
