Complementary Approaches to Representing Differences Between Structured Documents
Abstract
Structured documents
Documents can be represented as hierarchical structures of text and non-text nodes, where each node is labelled with a category name such as “paragraph” or “section”. This representation is a natural consequence of using the Standard Generalized Markup Language (SGML) to encode the content and form of documents [10, 11, 7]. SGML is widely used. HTML, the encoding used for World Wide Web documents, is an application of SGML [6]; although HTML is used to build hypertext networks of documents rather than hierarchies, each document is itself a hierarchy, with explicitly coded links that build the network. The Text Encoding Initiative uses SGML to encode complex texts [21, 4, 2]. Even documents that are not simple hierarchies can be represented using SGML [3]. Formally, documents represented in SGML or HTML are trees with labelled nodes in which the left-to-right ordering of a node's children is significant. Any piece of text in the document can be treated as a single labelled node; all leaf nodes are text nodes (or empty), and any node with children is a structural (non-text) node.
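The formal model above can be illustrated with a minimal Python sketch. The `Node` class, the `#text` label for text leaves, and the helper names `is_well_formed` and `labels_preorder` are assumptions made for illustration only; they do not come from the paper or from any SGML toolkit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A labelled node in an ordered tree (hypothetical illustration)."""
    label: str                      # category name, e.g. "section" or "paragraph"
    text: Optional[str] = None      # set only on text leaves
    children: List["Node"] = field(default_factory=list)  # order is significant


def is_well_formed(node: Node) -> bool:
    """Check that only leaves carry text; nodes with children are structural."""
    if node.children:
        return node.text is None and all(is_well_formed(c) for c in node.children)
    return True  # a leaf is either a text node or empty


def labels_preorder(node: Node) -> List[str]:
    """List labels in document order (parent first, children left to right)."""
    labels = [node.label]
    for child in node.children:
        labels.extend(labels_preorder(child))
    return labels


# A small document: one section holding a title and two paragraphs.
first = Node("paragraph", children=[Node("#text", text="First paragraph.")])
second = Node("paragraph", children=[Node("#text", text="Second paragraph.")])
title = Node("title", children=[Node("#text", text="Introduction")])

doc = Node("section", children=[title, first, second])

# Same nodes, but with the two paragraphs in the opposite order.
swapped = Node("section", children=[title, second, first])
```

Because the children are stored in a list, comparing `doc` with `swapped` reports them as different trees even though they contain the same nodes, which reflects the requirement that sibling order is significant.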