NLP-Based Scripting For CALL Activities

Authors

  • G. Antoniadis
  • S. Echinard
  • Olivier Kraif
  • T. Lebarbe
  • M. Loiseau
  • C. Ponton

Abstract

This article focuses on the development of Natural Language Processing (NLP) tools for Computer Assisted Language Learning (CALL). After identifying the inherent limitations of NLP-free tools, we describe the general framework of Mirto, an NLP-based authoring platform under construction in our laboratory and organized into four distinct layers: functions, scripts, activities and scenarios. Through several examples, we explain how Mirto's architecture makes it possible to implement state-of-the-art NLP functions and to integrate them into easily handled scripts, so that didactic activities can be created without computing skills and recorded in more complex sequences or scenarios.

1 CALL: Conjugating NLP and language didactics

It is generally assumed that computer science can be a great aid in language learning. In fact, computer scientists and didactics experts most often do not agree on the notion of “language”. For the former, it is a sequence of codes; for the latter, it is a system of forms and concepts. This divergence is easily explained: computer science, by definition, can only consider and process the form of the language independently of any interpretation, whereas for language didactics the form only exists through its properties and the concepts it is supposed to represent.

The consequences of these diverging approaches are visible in the great majority of language learning software, many of whose imperfections stem from this divergence. Most language learning software is designed and implemented as a computer product, able to take into account only a language form deprived of all semantics, or with extremely poor semantics. To take a caricatural case, rules as basic as the interpretation of the space character remain ignored, which leads to unfortunate learning situations.
For instance, if the learner answers “la  casa” (a sequence containing two spaces), the answer will not be accepted, because the expected answer was “la casa” (a sequence with one space). The pedagogical consequences of this poor “space processing” are obvious: the software teaches that a sequence of two spaces is not part of the language, and also that any word preceded or followed by a space has nothing in common with the same word without the space! This down-to-earth example of the “spacebar syndrome” characterizes, in our opinion, the deficiencies of today’s language learning software.

As (Chanier, 1998) and (Brun & al., 2002) point out, and as (Antoniadis & Ponton, 2002) and (Antoniadis, 2004) have shown, only the use of NLP methods and techniques makes it possible to consider and process language as a system of forms and concepts. Taking them into account might lead to answers for two of the issues of existing CALL software.

The first concerns the rigidity of software: the data (instructions, examples, expected answers...) must be predefined and, a few exceptions aside, can be neither modified nor enriched. Answer handling processes are intimately tied to this data. They are thus unable to handle new entries unless these were explicitly anticipated.

The second problem concerns the inability of CALL software to adapt the course to the learners. Two types of courses are generally proposed. The first, the more classic, offers a predefined linear sequence of activities. Whatever his or her answers and expectations, the learner will do (and redo) the same activities, using the same data. The second type of course is a “free” progression within a scenarized environment. This is the case of exploration software, in which the learner is given a mission in a given environment (virtual reality). The dialogue, grammar or other activities are predefined, but are performed in an order that depends on how the learner goes about completing the mission.
This latter type of course, despite allowing the learner a wider field of action (order of the mission, choice of activities...), does not offer real personalization or adaptation of the activities to the learner. Indeed, the course of action is independent of his or her answers at each stage, because the software is unable to evaluate them. Finally, we should point out that although the order in which the learner encounters the activities can vary according to his or her mission, the content of each activity remains the same whenever it is included in the course.

The last problem, which partly derives from the first two, characterizes current CALL software. As didactic products, such software should, a priori, be designed solely according to didactic solutions, expressed without constraints using pedagogical concepts. Yet current learning software consists in fact of computer products which require their users (language teachers with little or no computing knowledge) to manipulate concepts and notions which, a priori, do not belong to their language learning set of problems. Thus, instead of expressing pedagogical answers with the tools of their own discipline, they are forced to look for computerized solutions that match their own models or pedagogical aims as closely as possible. They might even have to give up on some pedagogical solutions, because they are unable to express them in a computer-understandable way or because computer science is not able to handle them.

To our knowledge, language didactics is presently able to imagine open pedagogical scenarios with exercises adapted to each learner, examples that change when the same activity is repeated within a given session, appropriate texts chosen to illustrate pedagogical contexts, and open and variable learning situations... Computer science is (and will be) unable to take these aspects into consideration with its own set of problems.
Resorting to other knowledge (linguistics and language didactics) and to its modeling is essential. The use of NLP tools can constitute a way to resort to linguistic knowledge; the collaborative work of language didactics and NLP experts ought to provide answers concerning language didactics knowledge.

The problems that we have just presented explain, in our opinion, the nature of language learning software so far. It has been designed and implemented as a computing problem and product that only uses the aspects of language didactics that computer science is able to consider. The pedagogical solutions are often altered or truncated so that they can be computed. This approach, and also most of the deficiencies of CALL software, stem from computer science’s narrow view of language (a simple sequence of codes).

Our approach towards the development of language learning software is radically different from those mentioned above. We consider that language learning software is above all a didactic product, a program which provides a didactic solution to a problem of language didactics without altering either the solution or, a fortiori, the problem. The design of such software requires that we be able to adapt the possibilities of computer science to the implementation of a previously determined pedagogical solution. In this approach, taking into account language properties, which are invariably present in every pedagogical solution concerning languages, is a must. Considering that only NLP methods, techniques and products are capable of satisfying this condition, language learning software should be defined as the adaptation of NLP possibilities to the predefined didactic aims of language learning.
In our opinion, such an approach is the only way to offer language didactics experts not only tools that do not narrow the scope of treatment of their set of problems, but also tools with pedagogical added value, capable of widening the set of problems of their discipline.

The use of NLP in the design of CALL software is not a new idea; systems like ELEONORE (Renié, 1995), ALEXIA (Chanier & Selva, 2000), or the EXILLS platform (Brun & al., 2002) resort to NLP methods and use NLP resources. Nevertheless, such examples remain marginal and concern non-commercial products. Paradoxically, CALL and NLP, two fields centered on language, still seem to be ignoring each other. Most of the time, not using NLP is justified by the added cost resulting from its use. But more than the often-invoked extra cost, it is the lack of NLP culture that should be held responsible for its absence.

In line with the systems mentioned above, the Mirto platform (Antoniadis & Ponton, 2004) (Forestier, 2002) aims to provide a global answer to the problems of CALL software, through an NLP approach on the one hand and collaborative work with didactics experts on the other. More than a finished product, Mirto seeks to be a tool for the creation of didactic solutions for language learning. In the rest of the paper, we present the aspects of the system that describe our approach and its implementation.
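
The “spacebar syndrome” discussed earlier in this section can be made concrete with a minimal sketch. The function names below (strict_match, normalize_spaces, tolerant_match) are hypothetical and chosen for illustration only; they are not taken from Mirto or from any existing CALL product. The sketch contrasts a raw comparison of character sequences with a comparison that first applies the most basic rule about spaces. Collapsing whitespace is of course far short of genuine NLP; it only illustrates the gap between a sequence of codes and even the simplest property of language.

```python
import re


def strict_match(answer: str, expected: str) -> bool:
    """Naive check: compares the two raw character sequences."""
    return answer == expected


def normalize_spaces(text: str) -> str:
    """Collapse every run of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def tolerant_match(answer: str, expected: str) -> bool:
    """Compare answers after whitespace normalization (assumed rule)."""
    return normalize_spaces(answer) == normalize_spaces(expected)


learner_answer = "la  casa"   # two spaces between the words
expected_answer = "la casa"   # one space

print(strict_match(learner_answer, expected_answer))    # False
print(tolerant_match(learner_answer, expected_answer))  # True
```

With the strict comparison the learner's answer is rejected although it is linguistically identical to the expected one; the tolerant comparison accepts it, while still rejecting an answer such as "lacasa" where the word boundary is actually missing.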

Publication date: 2004