Towards the Quantification of the Semantic Information Encoded in Written Language
Abstract
Written language is a complex communication signal capable of conveying information encoded in the form of ordered sequences of words. Beyond the local order ruled by grammar, semantic and thematic structures affect long-range patterns in word usage. Here, we show that a direct application of information theory quantifies the relationship between the statistical distribution of words and the semantic content of the text. We show that there is a characteristic scale of a few thousand words, which sets the typical size of the most informative segments in written language. Moreover, we find that the words that contribute most to the overall information are the ones most closely associated with the main subjects and topics of the text. This scenario can be explained by a model of word usage in which words are distributed along the text in domains of a characteristic size, where their frequency is higher than elsewhere. Our conclusions are based on the analysis of a large database of written language, diverse in subjects and styles, and are thus likely to apply to general language sequences encoding complex information.
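The abstract does not spell out the estimator, so the following is only a minimal sketch of one way such a measurement could be set up, not the authors' method. It assumes the text is cut into equal-sized segments, measures the mutual information between word identity and segment index, and subtracts the value obtained from randomly shuffled copies of the same words. The function names `word_segment_information` and `information_above_shuffled`, the parameter `n_shuffles`, and the choice of a shuffled baseline are all assumptions introduced here for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def word_segment_information(words, segment_size):
    """Mutual information (bits) between word identity and segment index.

    Returns the total and a dict with each word's contribution.
    Assumption: equal-sized segments; a ragged tail is dropped.
    """
    n_segments = len(words) // segment_size
    if n_segments < 2:
        raise ValueError("need at least two full segments")
    used = words[: n_segments * segment_size]
    total = len(used)
    word_counts = Counter(used)
    p_segment = segment_size / total  # every segment has the same weight

    per_word = defaultdict(float)
    for s in range(n_segments):
        segment = used[s * segment_size:(s + 1) * segment_size]
        for word, count in Counter(segment).items():
            p_joint = count / total
            p_word = word_counts[word] / total
            # One term of sum_{w,s} p(w,s) * log2( p(w,s) / (p(w) p(s)) )
            per_word[word] += p_joint * math.log2(p_joint / (p_word * p_segment))
    return sum(per_word.values()), dict(per_word)


def information_above_shuffled(words, segment_size, n_shuffles=20, seed=0):
    """Observed information minus the mean over shuffled copies (hypothetical baseline)."""
    usable = (len(words) // segment_size) * segment_size
    used = list(words[:usable])
    observed, per_word = word_segment_information(used, segment_size)

    rng = random.Random(seed)
    shuffled = list(used)
    baseline = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # destroys word order, keeps word frequencies
        baseline += word_segment_information(shuffled, segment_size)[0]
    return observed - baseline / n_shuffles, per_word
```

Under these assumptions, evaluating `information_above_shuffled` over a range of `segment_size` values and looking for the size at which the excess peaks would correspond to the characteristic scale described above, and sorting the per-word contributions would surface candidate topical words. A per-word shuffled baseline would be needed before reading those contributions as more than a rough ranking.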