On Counting Meaningful Units in Texts

Author

  • Maurice Gross
Abstract

Automatic syntactic analysis, the first step in a procedure for the fine-grained interpretation of texts by computer, relies on tools such as grammars and dictionaries. These tools, as currently available, are not sufficient: they must take an electronic form, which requires major revisions of their form and content. We present a linguistic methodology that has made it possible to build electronic tools with broad coverage of languages. These new tools bring out meaningful linguistic units, which leads to a substantial change in the analysis of the content of texts.

Similar resources

Published with Permission From: Foundation for Rehabilitation Information

Objective: To explore how content analysis can be used together with linking rules to link texts on assessment and intervention to the International Classification of Functioning, Disability and Health – version for children and youth (ICF-CY). Methods: Individual habilitation plans containing texts on assessment and intervention for children with disabilities and their families were linked to...

Using content analysis to link texts on assessment and intervention to the International Classification of Functioning, Disability and Health - version for Children and Youth (ICF-CY).

OBJECTIVE To explore how content analysis can be used together with linking rules to link texts on assessment and intervention to the International Classification of Functioning, Disability and Health - version for children and youth (ICF-CY). METHODS Individual habilitation plans containing texts on assessment and intervention for children with disabilities and their families were linked to ...

Prosodic Cues as Basis for Restructuring

In most cases, spontaneously uttered units of speech (e.g. in face-to-face dialogues) contain performance phenomena such as repairs, break-offs, omissions and others that motivate a restructuring procedure which allows storage or further processing of the input. In our view, this restructuring procedure is based on the segmentation of the input into a set of functional (i.e. meaningful) ...

Semantic Clustering and Convolutional Neural Network for Short Text Categorization

Short texts usually suffer from data sparsity and ambiguity in their representations because they lack context. In this paper, we propose a novel method to model short texts based on semantic clustering and convolutional neural network. Particularly, we first discover semantic cliques in embedding spaces by a fast clustering algorithm. Then, multi-scale semantic units are detected under the su...

Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity

In this study we analyze texts used in the Russian Unified State Exam (USE) in English. The texts, which form two small research corpora, were retrieved from two resources: the official USE database, as a reference point, and “Neznaika” (https://neznaika.pro/), a popular website used by pupils for USE training. The two corpora are balanced in size: USE has 11934 tokens and “Neznaika” has 11918 tokens. We share Biber’...

Publication date: 2008