Unsupervised Sumerian Personal Name Recognition
Abstract
This paper describes an unsupervised named-entity recognition (NER) system that identifies personal names in Sumerian cuneiform documents from the Ur III period. We are motivated by the needs of social and economic historians of that period, who want to identify specific persons of importance and such historically relevant facts as can be discerned from the surviving texts. The work faced two challenges: Sumerian is not a well-understood language, and the texts have come down to us in damaged condition. We based our recognizer on the Decision List CoTrain algorithm, modifying it experimentally to accommodate the nature of the data and a task narrower than the one it was originally devised for. We achieved 92.5% recall and 56.0% precision, results that are usable by economic and social historians. We describe the results of our work and suggest further applications of the techniques we have devised, also in the analysis of ancient Sumerian texts.
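For readers less familiar with the two reported metrics, the following is a minimal Python sketch of how precision, recall, and their harmonic mean (F1) are conventionally computed. The function names and the toy counts are hypothetical and chosen only for illustration; they are not taken from the paper. The F1 value is not reported in the abstract; it is simply derived here from the stated 92.5% recall and 56.0% precision.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw counts (hypothetical helper)."""
    predicted = true_positives + false_positives  # everything the system tagged as a name
    actual = true_positives + false_negatives     # every name really present in the texts
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def f1_from_rates(precision, recall):
    """Harmonic mean of precision and recall (hypothetical helper)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Toy counts, invented for illustration only (not the paper's data):
# 3 names found correctly, 1 spurious name, 1 name missed.
toy_precision, toy_recall = precision_recall(3, 1, 1)

# F1 implied by the rates stated in the abstract.
reported_f1 = f1_from_rates(0.560, 0.925)
print(f"toy precision={toy_precision:.2f}, toy recall={toy_recall:.2f}")
print(f"F1 implied by reported rates: {reported_f1:.3f}")
```

High recall with moderate precision means the system misses few names but also proposes a fair number of false candidates, which suits a workflow where a historian reviews the candidate list.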
Similar resources
Enhancing Sumerian Lemmatization by Unsupervised Named-Entity Recognition
Lemmatization for the Sumerian language is much more challenging than for modern languages because it is a long-dead language, highly skilled language experts are extremely scarce, and more and more Sumerian texts are being published. This paper describes how our unsupervised Sumerian named-entity recognition (NER) system helps to improve the lemmatization of the Cuneiform Digital Librar...
Unsupervised Chinese Personal Name Recognition Using Search Session
Personal name recognition is an important part of named entity recognition in Web search query logs. An unsupervised method for Chinese personal name recognition in queries is proposed using search sessions. Based on seed personal names, which are produced automatically by introducing Chinese surnames, a local expansion method is proposed by using search sessions in query logs; and by modeling the...
Unsupervised Learning of Name Structure From Coreference Data
We present two methods for learning the structure of personal names from unlabeled data. The first simply uses a few implicit constraints governing this structure to gain a toehold on the problem — e.g., descriptors come before first names, which come before middle names, etc. The second model also uses possible coreference information. We found that coreference constraints on names improve the...
Combine Person Name and Person Identity Recognition and Document Clustering for Chinese Person Name Disambiguation
This paper presents the HITSZ_CITYU system in the CIPS-SIGHAN bakeoff 2010 Task 3, Chinese person name disambiguation. This system incorporates person name string recognition, person identity string recognition and an agglomerative hierarchical clustering for grouping the documents to each identical person. Firstly, for the given name index string, three segmentors are applied to segment the se...
Comparison Between Unsupervised and Supervise Fuzzy Clustering Method in Interactive Mode to Obtain the Best Result for Extract Subtle Patterns from Seismic Facies Maps
Pattern recognition on seismic data is a useful technique for generating seismic facies maps that capture changes in the geological depositional setting. Seismic facies analysis can be performed using the supervised and unsupervised pattern recognition methods. Each of these methods has its own advantages and disadvantages. In this paper, we compared and evaluated the capability of two unsuperv...