Thursday, 11 September 2008

The Role of Interlinguistic Modelling and Machine Translation in the Experiment of Cybernetic Pedagogics for an Introduction to the Chinese Language

The Role of Interlinguistic Modelling and Machine Translation in the Experiment of Cybernetic Pedagogics for an Introduction to the Chinese Language (An Abstract)

By QIAO Yi


Cybernetic Pedagogics is the guiding principle of the modelling approach to instruction; the related experiment involves psychocybernetics, educational objectivation, informational transformation (based upon Language Oriented Instruction), effect measurement and lecture planning.


Interlinguistics and Machine Translation may play powerful roles in the modelling of cybernetic language pedagogics. The argument can be viewed from three angles: theoretical, empirical and project design. The author's own work in each of these areas is detailed as follows:



1) Theoretical domain -

Interlinguistic modelling of the Chinese language structure; String transformation for machine translation (cf. the appendix)

2) Empirical domain -

Objectivation of lecturing, measurement and statistics in accordance with educational cybernetics

3) Project designing -

Computer assisted instruction: reading and writing Chinese characters (hànzì);
Prototype of Machine Translation projects for the databank ISD (International Scientist Document); Multilingual machine dictionary.

Interlinguistics in our case refers to the application of the international language ILo as a modelling language for the structure of Chinese. The purpose of the modelling is to facilitate language study by means of contrastive observation. In this approach, ILo (as well as German and English) not only functions as the language of exposition; a constant effort is also made to translate, compare, contrast and transfer between the language being learned and the languages used for modelling. In this way we have established an interlinguistic model whose conception involves eight categories, namely:

1) lexical classification;
2) grammatical enunciation by steps;
3) systematical transplantation;
4) paraphrase by morphemes;
5) contrastive transformation;
6) comparative rules and regulations;
7) structural disambiguation;
8) modelling partially by computer.

The crux of the modelling is to work out the various blocks covering the structure of Chinese within a limited basic lexicon, by means of lexical classification, morphemic paraphrase and other related approaches.

The choice of grammatical entities is based upon the following criteria:

1) features and practical uses of various parts of speech;
2) features of important syntactic structures;
3) various sentence patterns;
4) word-order;
5) morphology (such as case markers etc.);
6) lexical structures;
7) pragmatics;
8) general properties of the language structures.

Several dozen such blocks have been made available in this approach. Each block has its own headline naming a specific grammatical topic, and comprises three components:

1) Description of a specific grammatical topic;
2) Paraphrase of each model sentence in Chinese characters (Hz), its phonetic equivalent in Pinyin (Py), a morphemic transcription in the intermediate language (IL), and the translation into the international language (ILo);
3) Summary in the form of tabulation or in tree structure.




Machine translation is related to language cybernetics, whose principles are valid both in computer assisted language instruction and in the objectivation of lecturing.

The approach used by the author in machine translation is based on relative analysis and independent synthesis. The relative analysis involves dictionary look-up to form the primary string and grammatical parsing to form the intermediary string. The task of synthesis is to rearrange the word order for the target language by means of string transformation, according to the transformational information conveyed by the intermediary string. (See the appendix: String Transformation for Machine Translation.)

With the same approach, the author has developed Machine Translation projects for French-Chinese and ILo-Chinese for the databank ISD, using the programming languages Basic, Cobol and Prolog. The automatic analysis of lexical and syntactic units in machine translation is equally valid for computer assisted language instruction (CALI).
The objectivation of instruction benefits considerably from machine translation, for the principle of string transformation is valid in CALI as well. It is well known that the instruction system Q consists of the teaching manner (B) and the teaching means (M), in short Q=(B,M). The concept of objectivation implies replacing the human teaching system with a non-human system Q, as in computer assisted instruction, where the teaching manner B must follow a precise algorithm in the form of a computer program.
The author's work at the Institute for Business-Organisation and Information Technology (InBIT) is an empirical realization of educational cybernetics. The project developed is a computer-aided training system for both reading and writing Chinese characters. The basis of this project is a multilingual machine dictionary of Chinese characters, Pinyin, ILo, German and English, within a lexicon for the learning system P. On top of these data runs an overall program that steers the instruction algorithm.

As an empirical test of educational cybernetics, in addition to what has been mentioned about lecture objectivation, we have also worked on the methodological development of our courses, for example the overall planning, the selection of teaching elements, didactics and efficiency measurement.

To measure the lecturing increment, a multiple-choice test is normally given for each lesson learned. It is preferable to record, for each student (No.S) and each question (No.D), in our situation 14 students and 20 questions per lesson, whether the answer is correct (R) or wrong (F). By summing the F marks along each row (questions) and each column (students), we can see explicitly which student is the least competent in that test and which question is the least familiar to the students.
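The tally described above can be sketched in a few lines of Python. The answer sheet below is invented for illustration (4 questions and 3 students rather than the 20 and 14 of the experiment), and the names `sheet`, `tally_errors`, `weakest_student` and `hardest_question` are assumptions, not part of the original project.

```python
# Hypothetical answer sheet: one row per question, one column per student.
# "R" marks a correct answer, "F" a wrong one, as in the text.
sheet = [
    # S1   S2   S3
    ["R", "F", "R"],   # question 1
    ["F", "F", "F"],   # question 2
    ["R", "R", "R"],   # question 3
    ["R", "R", "F"],   # question 4
]

def tally_errors(rows):
    """Return (errors per question, errors per student) for an R/F sheet."""
    per_question = [row.count("F") for row in rows]
    per_student = [sum(1 for row in rows if row[s] == "F")
                   for s in range(len(rows[0]))]
    return per_question, per_student

per_question, per_student = tally_errors(sheet)

# 1-based numbers, matching No.D and No.S; ties resolve to the first maximum.
hardest_question = per_question.index(max(per_question)) + 1
weakest_student = per_student.index(max(per_student)) + 1
```

With this sample sheet, question 2 collects the most F marks, and students 2 and 3 tie with two errors each, so the first of them is reported.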

With a multiple-choice test, we distinguish between competence (the internal state) and behavior (the mark obtained). If a student gives 6 wrong answers out of 20, his mark, as behavior, is R=7/10. Given b=1/4 (the chance of guessing correctly when each question offers four choices A, B, C, D), his competence (C) is then:

C=(R-b)/(1-b)=60%

It follows that the number of questions he could not answer from competence is likely to be larger than the number of wrong answers shown by his behavior: in this case the number of wrong answers in the test, F=6, divided by (1-b) gives 8.
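As a check on the arithmetic, the correction for guessing can be written out as a short Python sketch. The function names `competence` and `unknown_answers` are hypothetical; exact fractions are used so that the 60% and the 8 come out without rounding.

```python
from fractions import Fraction

def competence(wrong, total, options=4):
    """C = (R - b) / (1 - b), with R the share of correct answers
    and b = 1/options the chance of a correct guess."""
    b = Fraction(1, options)
    r = Fraction(total - wrong, total)
    return (r - b) / (1 - b)

def unknown_answers(wrong, options=4):
    """F / (1 - b): questions the student could not answer from competence."""
    b = Fraction(1, options)
    return wrong / (1 - b)

c = competence(6, 20)        # Fraction(3, 5), i.e. 60%
f = unknown_answers(6)       # Fraction(8, 1), i.e. 8 questions
```

The two results agree with each other: a competence of 60% on 20 questions leaves 8 questions unknown, of which about a quarter are answered correctly by chance.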

For the statistics of the lecturing increment, the formula W=Fv/Fn applies. (Here Fv is the number of mistakes made in the test before lecturing and Fn the number made in the test after lecturing.) For students who need not take the prior test because they have no previous knowledge at all, we use the equation Fv=D-D*b as an estimate, so that Fv is 15 when D is 20 and b=1/4. If, for instance, Fn=5, then W=15/5=3. If instead the prior test gives, say, Fv=10 (only for those who have some previous knowledge) and Fn=5, then W=10/5=2. The empirical average value of the lecturing increment in our experiment is approximately 3.
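The two cases above can be reproduced with a minimal Python sketch. The function names `baseline_errors` and `lecturing_increment` are assumptions chosen for illustration; the guard against Fn=0 is an addition, since the formula is undefined there.

```python
def baseline_errors(questions, b):
    """Fv = D - D*b: expected mistakes of a student with no prior knowledge."""
    return questions - questions * b

def lecturing_increment(fv, fn):
    """W = Fv / Fn: mistakes before lecturing over mistakes after."""
    if fn == 0:
        raise ValueError("W is undefined when no mistakes remain (Fn = 0)")
    return fv / fn

fv_novice = baseline_errors(20, 0.25)            # 15.0
w_novice = lecturing_increment(fv_novice, 5)     # 3.0
w_experienced = lecturing_increment(10, 5)       # 2.0
```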

According to the formula i(t)=Cv*T, the information learned depends upon the speed of learning (which is related to age) and the time spent. The speed peaks around the age of 21. We might compare a child of 10 and an adult of 40. If the amount of information (say, in foreign language learning) received by a child is greater than that received by an adult, what we can conclude is that the child spends more time learning (he may well be learning even while playing with a native speaker of the language!) than the adult, for it is known that Cv(40)=0.5 bit/sec is still somewhat faster than Cv(10)=0.4 bit/sec (cf. Frank 1985). It becomes obvious that we can hardly compare speeds of learning simply in terms of the amount of information learned non-continuously over a long period. With this understanding we arrive at a rational insight into the language training of adults. It is advisable to encourage them to learn with increased degrees of learning facilitation "*" , which can be calculated according to the following formula:

* = **Cv(A)I
To raise the value of * , it is far more practical to reduce the subjective instruction information "I" than to increase the speed Cv(A) or the efficiency * of learning.
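The child and adult comparison made earlier with i(t)=Cv*T can be made concrete with a small Python sketch. The one-hour study time is an invented figure, and the helper names are hypothetical; only the two learning speeds come from the text.

```python
CV_CHILD = 0.4   # bit/sec at age 10, as quoted in the text
CV_ADULT = 0.5   # bit/sec at age 40, as quoted in the text

def information_learned(cv, seconds):
    """i(t) = Cv * T: information learned at speed cv over `seconds`."""
    return cv * seconds

def time_needed(information, cv):
    """T = i / Cv: time required to learn `information` bits at speed cv."""
    return information / cv

# Assumed study time of one hour for the adult.
adult_bits = information_learned(CV_ADULT, 3600)
# Time the child needs to take in the same amount of information.
child_seconds = time_needed(adult_bits, CV_CHILD)
ratio = child_seconds / 3600
```

Under these figures the child needs roughly a quarter more time than the adult for the same amount of information, which is the point of the comparison: a child who ends up knowing more has spent more time learning.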



Appendix :
String Transformation for Machine Translation

A SLIT project report presented to University Paderborn (Summary)


It is well known that there are three major procedures in machine translation, namely analysis (A), transformation (T) and synthesis (S). Transformation can be performed independently or incorporated into analysis or synthesis. In P.R. China, the major concern of machine translation is to render various foreign languages into Chinese, and the usual approach is relative analysis and independent synthesis (AT - S). According to the MT algorithm of the project TMFC, the relative analysis involves lexical and grammatical parsing. The task of synthesis is the rearrangement of word order for the target language.
The general layout of the project is composed of the following subroutines. For lexical parsing there are three: consultation of the major machine dictionary with segmentation of morphological endings, look-up in the sub-dictionary of word groups, and consultation of the sub-dictionaries of homographs, polysemy and separate syntagmas. For grammatical analysis, four subroutines are available: the treatment of verbs, punctuation and conjunctions; the processing of nouns; the retreatment of conjunctions; and syntactic analysis. Synthesis of the target language is performed by rearranging the word order with both the indirect and the direct syntagmas of sentences.
The translated lexical sequence passes through three stages of string transformation. By consultation of the machine dictionaries, the preliminary string is formed to render the surface structure of the source sentence. The intermediary string is based upon the grammatical analysis, by which each syntagma is given a transformational constituent to represent the deep structure. The result of the transformation performed by synthesis is the terminal string, whose sequence, as can be seen in the generative tree, is ready for output as the target discourse.
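The three stages can be illustrated with a deliberately small Python sketch. It is not the TMFC algorithm: the dictionary entries, the single reordering rule and all function names are assumptions made for illustration, the target side is shown as capitalised glosses rather than Chinese text, and only the word-order part of the transformational constituent is modelled.

```python
# Toy machine dictionary: source word -> target gloss (invented entries).
DICTIONARY = {"livre": "BOOK", "de": "POSS", "Paul": "PAUL"}

def preliminary_string(words):
    """Stage 1: dictionary look-up, keeping the source word order."""
    return [(word, DICTIONARY[word]) for word in words]

def intermediary_string(prelim):
    """Stage 2: attach a target position to every item.

    Toy rule: in "possessed de possessor" the two outer items swap places,
    because the possessor precedes the possessed in the target order.
    """
    positions = list(range(len(prelim)))
    for i in range(1, len(prelim) - 1):
        if prelim[i][0] == "de":
            positions[i - 1], positions[i + 1] = positions[i + 1], positions[i - 1]
    return [(gloss, pos) for (_, gloss), pos in zip(prelim, positions)]

def terminal_string(inter):
    """Stage 3: rearrange the glosses according to their target positions."""
    return [gloss for gloss, _ in sorted(inter, key=lambda item: item[1])]

prelim = preliminary_string(["livre", "de", "Paul"])
inter = intermediary_string(prelim)
terminal = terminal_string(inter)   # ["PAUL", "POSS", "BOOK"]
```

Keeping the reordering information in the intermediary string, rather than reordering during look-up, mirrors the separation between relative analysis and independent synthesis described in the text.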
The key element of this approach is the transformational analysis, by which the contrastive features of the language pair are worked out in terms of the transformational constituents. Four domains are involved in such constituents: the information for word order shifting, the occurrence of case markers, syntactic function and valency stratum. Each syntagma is indicated by the sequence number of its central word, which may be enclosed in brackets to show any contraction that has occurred. Based upon this shifting of strings, the transformational tree is constructed. Finally, the contracted string consists of just three parts: the subject, the predicate and the object. With the leaves of the terminal strings, it is easy to construct the generative tree. The procedure of synthesis can be read from its branches, and the leaves are in effect the semantic entities of the Chinese text in their correct order.
