Kazakhstan-American Free University
The Problem of Testing in Grammar Assessment
Journal issue contents: Вестник КАСУ No. 2, 2010
Author: Елаков В.В.
The study of grammar has played a long and important role in the history of foreign language teaching. Among all the words used in a classroom, there is one word that usually makes students shudder: “test”. Hardly anyone would claim that students like tests or find them very motivating. Yet many scholars consider the role of tests useful and important, especially in language learning: a test is a means of showing both the students and the teacher how much the learners have learned during a course.
Language testing today reflects the current interest in teaching genuine communication, but it also reflects earlier concerns for scientifically sound tests. Testing during the last century was largely intuitive, or subjective, and dependent on the personal impressions of teachers. Grammar skills can be measured by means of grammatical tests, and today testing has become a popular way of checking knowledge in many fields. Many researchers have studied tests and the testing process for a long time: for example, Harold S. Madsen studied testing techniques, James Purpura studied grammar assessment, and J. Alderson studied test construction (Madsen, 1999; Purpura, 2002; Alderson, 2001).
A test is any standardized procedure for measuring sensitivity, memory, intelligence, aptitude, personality, and so on; “the test was standardized on a large sample of students”. Testing in education is a systematic way of studying the ability of an individual or a group to solve problems or perform tasks under controlled conditions. It is an attempt to measure a person’s knowledge, intelligence, or other characteristics in a systematic way. The teacher gives a test to see how well students have learned a particular subject or grammar point. Tests draw on the same kinds of exercises used in training particular aspects of language and kinds of speech activity. To select appropriate exercises, the purposes and objects of testing, and its criteria and parameters, must be carefully correlated.
Testing productive competences, as in speaking exams, requires active, creative answers, while testing receptive competences, as in multiple-choice reading tests, tends to rely on recognition, with students simply choosing the letter of the best answer. Tests of language subskills measure the separate components of English, such as vocabulary, grammar, and pronunciation. Communication skills tests, on the other hand, show how well students can use the language in actually exchanging ideas and information.
Another set of contrasting tests is that of norm-referenced and criterion-referenced exams. Norm-referenced tests compare each student with his or her classmates, whereas criterion-referenced exams rate students against certain standards, regardless of how other students do. Still another pair of categories is that of discrete-point and integrative tests. In discrete-point tests, each item tests something very specific, such as a preposition or a vocabulary item. Integrative tests are those that combine various language skills.
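As an illustration of the norm- versus criterion-referenced contrast, the two scoring schemes can be sketched in a few lines of code. The class scores, the 70-point cut score, and the function names below are hypothetical, chosen only to make the distinction concrete.

```python
def norm_referenced_rank(score, class_scores):
    """Norm-referenced: a percentile rank comparing a student with classmates."""
    below = sum(1 for s in class_scores if s < score)
    return 100.0 * below / len(class_scores)

def criterion_referenced_pass(score, cut_score=70):
    """Criterion-referenced: pass/fail against a fixed standard,
    regardless of how other students perform."""
    return score >= cut_score

class_scores = [55, 62, 70, 78, 85, 91]
print(norm_referenced_rank(78, class_scores))  # rank depends on the class
print(criterion_referenced_pass(78))           # result depends only on the standard
```

The same raw score of 78 can look strong or weak in a norm-referenced report depending on the class, while its criterion-referenced interpretation never changes.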
Tests cannot be avoided completely, for they are inevitable elements of the learning process. They are included in the school curriculum and serve to check the students’ level of knowledge and what they are able to do; they can be administered at the beginning and at the end of the academic year, or after working on new topics and acquiring new vocabulary. Moreover, students face tests when entering a foreign university or certifying their level of English.
It is often assumed that tests are used mostly for assessment: the test gives a score which is assumed to define the level of knowledge of the test-taker. This may serve to decide whether the student is suitable for a certain job or for admission to an institution, has passed a course, and so on. But in fact testing and assessment overlap only partially: there are other ways of assessing students, and there are certainly other reasons for testing. A test may be used as a means to:
- give the teacher information about where
the students are at the moment, to help decide what to teach next;
- give the students information about what
they know, so that they also have an awareness of what they need to learn or review;
- motivate students to learn or review
specific material;
- get a noisy class to keep quiet and
concentrated;
- get students to make an effort, which is
likely to lead to better results and a feeling of satisfaction;
- give students tasks which themselves may actually provide useful review or practice, as well;
- provide students with a sense of achievement
and progress in their learning (Ur, 2003).
However, very often the test itself can provoke the students’ failure to complete it. Following the linguists, we can state that there are two main causes of a test being inaccurate:
- Test content and techniques;
- Lack of reliability.
The first means that the test design should correspond to what is being tested. First, the test must contain exactly the material that is to be tested. Second, the activities, or techniques, used in the test should be adequate and relevant to what is being tested. They should not frustrate the learners but, on the contrary, should help the students take the test successfully.
The second implies that one and the same test given at different times must yield the same scores; the results should not differ merely because of the shift in time. For example, a test cannot be called reliable if the score received on the first administration differs from that received on the second. Furthermore, reliability can fail due to the improper design of a test (unclear instructions and questions, etc.) and due to the way it is scored.
Tests can facilitate the students’ acquisition process and function as a tool to increase their motivation; however, too much testing can be disastrous, entirely changing the students’ attitude towards learning the language, especially if the results are usually unsatisfying. Assessment is important for both the teacher and the students, and it should go hand in hand with teaching.
A test should be valid and reliable: it should test what was taught, taking the learner’s individual pace into account, and its instructions should be unambiguous. Validity concerns what is tested and the degree to which a test measures what it is supposed to measure. Reliability means that the results will be similar and will not change if one and the same test is given on different days. A test should also be practical, or in other words efficient: easily understood by the examinee, easily administered, and easily scored. It should not last for eternity, for both examiner and examinee can become tired, for instance during a five-hour non-stop testing session. Moreover, while testing students, teachers should be aware that, as well as checking their knowledge, a test can influence students negatively. Therefore, teachers ought to design tests that encourage students: the test should be a friend, not an enemy. Thus, the issue of validity and reliability is essential in creating a good test. A test should measure what it is supposed to measure, not knowledge beyond the students’ abilities. Moreover, the test will be a true indicator of whether the learning process and the teacher’s work are effective (Alderson, 1995).
Language consists of its vocabulary and its grammar. Vocabulary alone does not constitute the language, but it acquires tremendous significance when governed by the grammar of a language. Grammar, the structural glue, the “code” of language, is arguably at the heart of language use, whether this involves speaking, listening, reading, or writing (Azar, 1998).
For the past fifty years, grammar competence has in many cases been defined as morphosyntactic form and tested either in a discrete-point, selected-response format (a practice initiated by several large language testing firms and emulated by classroom teachers) or in a discrete-point, limited-production format, typically by means of cloze or other gap-filling tasks. These tests have typically been scored right/wrong, with grammatical accuracy as the sole criterion for correctness. Tests of this kind are appropriate for certain purposes and make sense, for example, in situations where individual grammatical forms are emphasized, such as in form-focused instruction. However, we must recognize that separate tests of explicit grammatical knowledge provide only a partial measure of grammar competence, and scores from these tests may only be loosely related to those produced by more comprehensive measures of grammar competence.
In recent years, the assessment of grammar competence has taken an interesting turn in certain situations. Grammar has been assessed in the context of language use under the rubric of testing speaking or writing. This has led, in some cases, to examinations in which grammatical knowledge is no longer included as a separate and explicit component of language in the form of a separate subtest. In other words, only the students’ implicit knowledge of grammar, along with other components of communicative language ability (e.g., topic, organization, register), is measured (Purpura, 2002). Even with the sudden increase since the mid-1980s of research on grammar teaching and learning, there remains a surprising lack of consensus on:
- what constitutes grammatical knowledge;
- what type of assessment task might best allow teachers and testers to infer that grammatical knowledge has been acquired;
- how to design tasks that elicit students’ grammatical knowledge for some specific assessment purpose, while at the same time providing reliable and valid measures of performance (Purpura, 2002).
Cloze tests are an attempt to test what is called expectancy grammar: the context is at the paragraph or discourse level, and meaning comes into play. In spite of some claims to the contrary, cloze techniques do not test the ability to use the language. There have been other suggestions for how to test communicative grammar, but none of them has gained wide acceptance. This may be because it is impossible to measure communicative grammar directly. Rea-Dickins says that in order to measure communicative grammar, a test must have five characteristics:
- The test must provide more context than only a single sentence.
- The test taker should understand what the communicative purpose of the task is.
- He or she should also know who the intended audience is.
- He or she must focus on meaning, and not only form, to answer correctly.
- Recognition is not sufficient: the test taker must be able “to produce grammatical responses” (Trasher, 2000).
The term assessment is generally used to
refer to all activities teachers use to help students learn and to gauge
student progress. Though the notion of assessment is generally more complicated
than the following categories suggest, assessment is often divided for the sake
of convenience using the following distinctions:
- formative and summative;
- objective and subjective;
- referencing (criterion-referenced,
norm-referenced);
- informal and formal (Alderson, 2001).
Grammar tests are designed to measure student proficiency in matters ranging from inflection to syntax. Syntax involves the relationships of words in a sentence, including matters such as word order and connectives. There are several reasons for testing grammar: much English language teaching has been based on grammar, and, unlike various measures of communicative skills, there is general agreement on what to test. Assessing the grammatical accuracy of a piece of written or spoken discourse cannot be done fairly using a “count the number of mistakes” approach, but this does not mean that it cannot be done. Test takers who opted to sacrifice fluency for accuracy would be rated highly on the accuracy side but would be given a low rating for the difficulty of the task that they attempted.
A test taker who took the opposite course (sacrificed accuracy for fluency) would get a lower rating in accuracy but would be rated highly on the difficulty side. Taking both fluency and accuracy into consideration is fairer than looking at accuracy alone, but how can we fairly assess grammatical accuracy? Should we merely count the number of errors, or should we consider both the number and the severity of the errors?
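The question of number versus severity can be made concrete with a simple weighted scheme. The weights and the per-100-words normalization below are hypothetical, not a standard rubric; they merely show how severity can be folded into an accuracy score.

```python
# Hypothetical severity weights: a "global" error that obscures meaning
# costs more than a "local" slip such as a missing article.
SEVERITY = {"global": 3, "local": 1}

def accuracy_score(word_count, errors, max_score=10.0):
    """Deduct weighted error points per 100 words, floored at zero."""
    penalty = sum(SEVERITY[kind] for kind in errors) * 100 / word_count
    return max(0.0, max_score - penalty)

# A 150-word passage with one global and two local errors:
print(accuracy_score(150, ["global", "local", "local"]))
```

Under a pure error count, three local slips would cost as much as three global breakdowns; the weighted version keeps them apart.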
Assessing grammar by examining the accuracy of what the test taker produces in a test of writing or speaking is not the only possible approach. Multiple-choice items can be used to measure the ability to decide whether a grammatical structure is correct or not. Such items have been criticized because they do not measure the ability to produce grammatically correct structures and are therefore claimed to be inauthentic language tasks.
The ability to recognize mistakes in grammar is a skill we use even when speaking our native language, so multiple-choice items that tap this skill cannot be called inauthentic tasks. Developing the same skill is one of our tasks in learning a second or foreign language. This skill allows us to monitor our production of the spoken language, or proofread what we write, and know when to make appropriate repairs or restatements. It is also possible to test grammar using fill-in-the-blank items. In this sort of item, the test taker must write in the best word to fill the blank and put it in the appropriate grammatical form (Trasher, 2000). Grammar items, such as auxiliaries, are spotted and counted. As with vocabulary exams, either passive or active skills can be checked, and grammar tests can be tailored to beginners or advanced learners. We can do a good job of measuring progress in a grammar class, and we can diagnose student needs in this area (Madsen, 1999).
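A fill-in-the-blank item of the kind just described can be scored automatically if the accepted forms are listed in advance. The item data and function below are hypothetical; normalizing case and spacing is the only leniency applied.

```python
import re

def check_gap(item, answer):
    """Accept any listed correct form, ignoring case and extra spacing."""
    normalized = re.sub(r"\s+", " ", answer.strip().lower())
    return normalized in item["accepted"]

item = {
    "prompt": "By the time we arrived, the film ___ (start).",
    "accepted": {"had started"},  # the right word in the right form
}
print(check_gap(item, "  Had  Started "))  # True: normalization handles case/spacing
print(check_gap(item, "started"))          # False: wrong grammatical form
```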
In developing grammar assessments, teachers first articulate the purposes of the test, consider the constructs, and identify the situational domain in which they would like to make inferences about the test-takers’ grammatical ability. As the goal of grammar assessment is to provide as useful a measurement as possible of students’ grammatical ability, teachers need to design test tasks in which the variability of students’ scores is attributable to differences in their grammatical ability, and not to uncontrolled or irrelevant variability resulting from the types or quality of the tasks that teachers have put on their tests. As all teachers know, the kinds of tasks used in a test, and their quality, can greatly influence how students will perform. In other words, carefully designed tasks produce the types of variability in test scores that can be attributed to the underlying constructs, given the contexts in which they are measured.
It is always easier to correct someone
else’s written work than one’s own. Using this generalization as a starting
point, we can propose three peer correction strategies that are useful when the
teacher wants students to focus on correcting specific grammatical errors
during the editing process.
We suggest that paragraphs or short papers by students be used whenever possible, and that papers be selected because they illustrate frequent error types, such as substitution of the present perfect for the simple past tense or overuse of the infinitive to + verb after modals. To ensure maximum focus, the teacher may even correct all other errors and ask students to do specific problem-solving correction activities, such as “Find two verbs that need an -s to show the present tense” or “Find four nouns that should take the definite article”. We recommend using explicit grammatical terminology with the students, especially in classes where students already know it, although example errors and corrections or informal circumlocutions can also be used. All exercises presuppose that the grammar points focused on in the activity have already been covered.
An alternative to using students’ paragraphs or essays as sources of peer correction material is to prepare a composite essay or text for group correction that illustrates similar errors from several students’ written work. This avoids embarrassment by focusing on common problems. This procedure can be used, for example, to practice correction of tense and modal errors in conditional sentences, with good results.
In large classes in which students write a lot, the teacher cannot correct everything. Instead, an individualized approach can be taken using a method called “the blue sheet”. In this approach the teacher attaches a blue sheet to the paragraph, essay, or test, lists two obvious structural errors made by the student, and refers the student to pages and exercises in the class grammar text pertinent to those two errors. Students do the assigned exercises when they get their blue sheets; the teacher then corrects the exercises before the students rewrite their passages. The same approach can be used to identify and correct specific grammatical errors that the teacher has detected with some frequency in the students’ work.
Even in smaller classes, not every error needs to be corrected on every paper. In fact, such overkill tends to discourage students and thus impedes progress in the long run. The best results are achieved by focusing on one or two error types at a time, at least in the beginning.
An individualized checklist encourages students to focus when they edit and correct their own work. Its use is the next logical step in the grammar editing process, and it works best if, at first, short pieces of writing are used. When the teacher returns the first draft, grammar errors are underlined, and major areas of difficulty are listed or checked off on the attached checklist. In this way, each student is aware of the errors to correct as well as their location.
Students should consult the teacher or a tutor if they do not understand what the errors are or how to correct them. Each student should keep these checklists and all the drafts of each writing assignment together in a notebook or folder. The teacher can hold individual conferences and, where necessary, refer students to additional exercises for those grammar areas in which errors are most persistent. Students soon become adept at making such targeted corrections. As they progress, they should be asked to correct several structures already covered in class in passages in which the errors are only minimally indicated, for example by underlining. Once the class (or the student) gets very good at editing by this means, neither the location nor the error type needs to be specified.
Some students respond better to a less judgmental correction procedure in which the teacher or tutor merely specifies, on an attached sheet, rewordings for sentences or phrases that contain grammatical errors. The best results are achieved if the teacher moves gradually from more focused correction procedures to less focused ones. Whichever error correction strategy is used, it is imperative that students incorporate the corrections and become aware of their major grammatical problems. Over a period of time, these correction strategies, combined with systematic grammar instruction, have a positive effect on the accuracy of the writing produced by ESL students (Witbeck, 1976).
Finally, the teacher should not reuse tasks studied in the classroom for the test. This is explained by the fact that, in testing, we need to learn about the students’ progress, not merely to check what they remember. Designing a test is not as daunting as many teachers think. When working out grammar tests, the teacher has to make decisions about such factors as grading, the degree of control, and the degree of realism.
BIBLIOGRAPHY
1. Alderson J. The Test of English Grammar. – Cambridge, 2001.
2. Alderson J., Clapham C. Language Test Construction and Evaluation. – Cambridge, 1995.
3. Azar B.S. Understanding and Using English Grammar. – Oxford, 1998.
4. Madsen H.S. Techniques in Testing. – USA, 1999.
5. Purpura J.E. Assessing Grammar. – Oxford, 2002.
6. Trasher R. Test Theory and Test Design. – International Christian University, 2000.
7. Ur P. A Course in Language Teaching. – Cambridge, 2003.
8. Witbeck J. Text Editing and Grammar Correction. – Cambridge, 1976.