TOEFL and IELTS proficiency tests as an instrument for assessing language skills

Journal issue: Вестник КАСУ (Vestnik KASU), No. 2, 2011
Author: Yana Aleksandrovna Esterlein (Эстерлейн Яна Александровна)

In the modern world, studying and knowing a foreign language is important. But knowing a foreign language means more than possessing speaking skills. It means possessing all the language skills (listening, reading, writing and speaking), which we need in every field of life and for successful business. We need listening and reading skills to obtain new information, and speaking and writing skills to share the information we have. Accordingly, the four basic skills are related to each other by two parameters: the mode of communication (oral or written) and the direction of communication (receiving or producing the message). Listening and reading are receptive skills, while speaking and writing are productive skills. It is worth mentioning that receptive and productive skills do not take place simultaneously: listening precedes speaking, and reading precedes writing. Making the best use of receptive and productive skills depends on the level of proficiency, which should be assessed during the academic year and at the end of the year with the help of different tasks, exams and test formats. Well-known standardized English-language testing programs such as TOEFL (Test of English as a Foreign Language) and CFC (Cambridge First Certificate), which colleges and universities use to gauge the language skills of prospective international students, can serve as assessments of language ability.

Gear said: “TOEFL and CFC tests are an examination that intends to evaluate the level of the English language of a foreign speaker.” Moreover, such a test is commonly one of the components of the entrance requirements of universities in the USA, the European Union and England. The TOEFL test, like the CFC test, consists of four parts. These include listening comprehension, which takes approximately 35 minutes and consists of three parts; structure and written expression, which has a 25-minute time limit and is composed of two tasks; and reading comprehension, which takes 55 minutes and consists of several passages. The tests differ in several ways. While the TOEFL test consists of just four parts, the CFC includes an additional speaking part. The sequence of the parts also differs: for example, the CFC test starts with reading, whereas the TOEFL test starts with listening. The types of tasks and activities used in the tests differ as well. Moreover, each part of each test includes a different range of tasks: each part of the TOEFL test is mainly composed of two tasks, whereas each part of the CFC typically contains four different activities. In any case, both types of test involve the four skills of reading, listening, speaking and writing, and assessing them helps to establish the level of a student’s or job applicant’s language proficiency.

There are many reasons for a comprehensive assessment of all four English language skills; we will point out only several of them, the main ones in this field:

1. Users of English language proficiency tests like the CFC and TOEFL tests may sometimes be more interested in some language skills (speaking, for instance) than others. However, what they value most often is a person’s ability to communicate in English in a variety of contexts, which is likely to involve the use of multiple language skills either singly or in combination.

2. A more accurate estimate of a person’s skill in any specific area (speaking, for example) can be attained by testing skills not only in that area but in related areas as well. Because the four aspects of language are inextricably intertwined, a measure of ability in a related domain (e.g., listening) can, when used in conjunction with a measure of the target ability (e.g., speaking), add depth and accuracy to the measurement of the target ability.

3. The four skills are strongly correlated, but not to the degree that a measure of one can substitute perfectly for a measure of another. They are distinct enough, both logically and empirically, that they have to be measured separately. Failing to measure all of these important aspects of proficiency, therefore, may leave critical gaps in a test taker’s language proficiency profile.

4. Related to point 2 above is that, for most kinds of decision making, more information is almost always better than less. More trustworthy decisions are possible when additional relevant information is used to supplement initially available information, whether that decision concerns language abilities or other types of skills.

5. Standardized tests are almost always fairer to those who take them when multiple methods and multiple question formats are used. Some people perform better on some types of test questions than on others, and so it is appropriate to use a variety of methods and question types to assess critical abilities. Obtaining more information about test takers is not only valuable to the test user but also fairer to the test taker.

6. There are long-term societal consequences of testing English-language skills selectively. What is tested can affect what is taught as well as what is learned. Selective testing can result in greater attention being paid to some language skills than to others, resulting in uneven profiles of proficiency in overall communication skills. Testing all four skills is not only fairer to individuals, but it benefits society as well.

It is important, however, to test each of these four skills individually because each is a critical aspect of communicative competence. Furthermore, direct evidence of specific individual skills can provide at least indirect evidence of other skills, because they are strongly related to each other. Listening, reading, writing and speaking are distinct, and each contributes uniquely to an individual’s overall communicative ability. When test scores are used to make consequential decisions, the use of several sources of information provides better decisions than does a more selective use of information. Moreover, assessment is fairer to test takers if they are allowed to demonstrate their skills in multiple ways: with different tests, different methods and different question formats. Comprehensive testing also encourages broader and more general teaching and learning of language skills. All of the reasons given here are consistent with the trend toward more comprehensive, integrated testing of language skills seen in many prominent language testing programs.

There are also many other tasks that teachers can use for skills assessment, but for every type of assessment teachers and test developers should be guided by the cornerstones of good testing practice when constructing or choosing their tasks, tests or exams.

The first cornerstone is validity: the teacher must be clear about what is to be assessed and ensure that something else is not being assessed instead. Assessment must also have some degree of reliability, i.e. it must be consistent, so that under the same conditions and with the same performance by students it produces the same or at least similar results. Practicality is another important feature: assessment must be practical in terms of physical resources such as tape recorders and photocopies, and it must not be too time-consuming in terms of class hours and of the teacher’s own time outside class. The washback effect, or the influence of assessment on both teaching and learning, is another cornerstone; it can cause stress, or it can evoke positive emotions if all tasks are forward-thinking, communicative and taken from real-life situations (authenticity). Transparency is the next feature, which answers the questions: Are expectations clear to students? Do students and teachers have access to information about the test or assessment? The final element is accountability. As professionals, teachers should be able to provide learners, parents, institutions and society in general with clear indications of what progress has been made and, if it has not, why that is so. They should also be able to explain the rationale behind the way assessment takes place and how conclusions are drawn, rather than hiding behind a smoke screen of professional secrecy.

If the teacher follows all these cornerstones when writing or choosing a test, task or exam, the assessment will provide information for improvement when learning is less than satisfactory. Through practice in assessment, faculty become better able to understand and promote learning, and increase their ability to help students themselves become more effective, self-assessing, self-directed learners. Simply put, the central purpose of assessment is to empower both teachers and their students to improve the quality of learning in the classroom; in language teaching, its purpose is to keep track of each student’s language skills, both receptive and productive.

Students passively receive and process information through the receptive skills. Keen competition for ultimate success always goes on among students. Students who are vigilant, curious and thirsty for knowledge make the best use of their receptive skills. They are good listeners and untiring readers; they are fond of being with learned people and listening to their lectures, and they like to spend as much time as possible reading books to enrich their knowledge, until they are able to produce wonderful things on their own.

To assess reading, most language teachers use component subskills, because it is not possible to observe reading behavior directly. They normally focus on certain important skills, which can be divided into major and minor (or contributing) reading skills.

Major reading skills include skimming for gist, scanning for specific details, and establishing the overall organization of the passage; reading carefully for main ideas, supporting details, the author’s argument and purpose, the relationship of paragraphs, and fact versus opinion; and transferring information from nonlinear texts.

Minor reading skills include understanding at the sentence level (syntax, vocabulary, cohesive markers) and at the inter-sentence level (reference, discourse markers). They also include understanding the components of nonlinear texts, such as the meaning of graph or chart labels and keys, and the ability to find and interpret intersection points.

For the assessment of listening abilities, according to Buck (2001), teachers can use three major approaches: the discrete-point, integrative and communicative approaches. The discrete-point approach identifies and isolates separate elements of listening. Question types used in this approach include phonemic discrimination, paraphrase recognition and response evaluation. An example of phonemic discrimination is assessing students’ ability to distinguish minimal pairs such as ship/sheep. Paraphrase recognition is a format that requires students to listen to a statement and then select the option closest in meaning to it. Response evaluation is an objective format that presents students with questions and then four response options. The underlying rationale for the discrete-point approach stems from two beliefs: first, that it is important to be able to isolate one element of language from a continuous stream of speech; second, that spoken language is the same as written language, only presented orally.

The integrative approach “attempts to assess a learner’s capacity to use many bits at the same time, whereas discrete items attempt to test knowledge of language one bit at a time” (Oller, 1979:37). Proponents of the integrative approach to listening assessment believed that the whole of language is greater than the sum of its parts. Common question types in this approach were dictation and cloze.

The third approach, the communicative approach, holds that the listener must be able to comprehend the message and then use it in context. Communicative question formats must be authentic in nature.

A number of issues make the assessment of listening different from the assessment of other skills. Buck (2001) has identified several issues that need to be taken into account. They are: setting, rubric, input, voiceovers, test structure, formats, timing, scoring and finding texts.

The main rule for comprehensive assessment of reading and listening skills lies in the choice of texts and audio or video recordings. They should be carefully chosen to fit the purpose of the assessment and the level of the students, and they can be purpose-written, taken directly from authentic material, or adapted. Factors such as length, density and readability should be taken into consideration. Teachers should avoid texts and recordings with controversial or biased material, because they can upset students and affect the reliability of test results. Ninety percent of the vocabulary in a prose passage should be known to the students (Nation, 1990). The best way to develop good reading and listening assessments is to be constantly on the watch for appropriate and authentic material from newspapers, magazines, brochures, instruction guides, news and films: anything that is a suitable source of real texts and audio or video recordings. Other ways to find material on particular topics are to use an encyclopedia written at an appropriate readability level or to use an Internet search engine. Whatever the source, cite it properly.
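
The ninety percent vocabulary guideline cited above (Nation, 1990) can be checked roughly before a passage is used in a test. The following is only a minimal sketch: the function name `known_word_coverage`, the word list and the sample passage are hypothetical, and a real check would need the class's actual vocabulary list and some handling of word families.

```python
import re


def known_word_coverage(passage: str, known_words: set[str]) -> float:
    """Return the fraction of running words in `passage` found in `known_words`."""
    # Tokenize into lowercase alphabetic words (apostrophes kept inside words).
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", passage.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


# Hypothetical word list and passage, for illustration only.
known_words = {"the", "cat", "sat", "on", "a", "mat", "and", "looked", "at", "dog"}
passage = "The cat sat on the mat and looked at the ferocious dog."

coverage = known_word_coverage(passage, known_words)
suitable = coverage >= 0.90  # threshold taken from the 90% guideline in the text
print(f"coverage = {coverage:.2f}, suitable = {suitable}")
```

Here 11 of the 12 running words are on the list, so the passage just clears the threshold. Counting running words rather than distinct words is an assumption of this sketch.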

Productive skills such as speaking and writing should also be assessed by the teacher, because students with effective productive skills are able to produce something: an essay, a book, a research paper or a speech.

For speaking assessment it is good to start with a simple task that puts students at ease so they can perform better. Often this takes the form of asking the students for some personal information, or of an interview in which the teacher questions a student or one student questions another. The teacher may also ask a student to describe a photograph or an item, or to narrate a story from a given series of pictures or a cartoon. Another exercise the teacher can use to check speaking ability is the information gap activity, in which one student has information the other lacks and vice versa, and the students have to exchange information to see how it fits together. Negotiation is also a very useful task: students working together may have different opinions, and they have to reach a conclusion in a limited period of time. In role plays, students are given cue cards with information about their “character” and the setting; they have to imagine the situation and act it out. Oral presentations should push students to speak impromptu rather than from a rehearsed text.

Based on Bygate’s categories, Weir (1993) divides oral skills into two main groups: speaking skills that are part of a repertoire of routines for exchanging information or interacting, and improvisational skills such as negotiating meaning and managing the interaction. The routine skills are largely associated with language functions and the spoken language required in certain situations. By contrast, the improvisational skills are more general and may be brought into play at any time for clarification, to keep a conversation flowing, to change topics or to take turns. In circumstances when presentation skills form an important component of a program, naturally they should be assessed. However, avoid situations where a student simply memorizes a prepared speech. Decide which speaking skills are most germane to a particular program and then create assessment tasks that sample skills widely with a variety of tasks. While it is possible to assess speaking skills on an individual basis, most large exam boards opt to test pairs of students with pairs of testers. Within tests organized in this way, there are times when only one student speaks and other times when the students interact in a conversation. This setup makes it possible to test common routine functions as well as a range of improvisational skills. For reliability, interlocutors should work from a script so that all students get similar questions framed in the same way. In general, the teacher or interlocutor should keep in the background and only intercede if truly necessary.

To assess students’ written proficiency, the teacher should first decide which kind of marking scale to use. The foreign language assessment literature generally recognizes two types of writing scales: holistic marking and analytical marking. Selecting the appropriate marking scale depends on the context in which a teacher works. This includes the availability of resources, the amount of time allocated to getting reliable writing marks to the administration, the teacher population and the management structure of the institution. Reliability can be increased by using multiple marking, which reduces the scope for error that is inherent in a single score.

Holistic marking, according to McNamara, is where the scorer “records a single impression of the impact of the performance as a whole”. In short, holistic marking is based on the marker’s total impression of the essay as a whole. Holistic marking is variously termed impressionistic, global or integrative marking. Experts in holistic marking scales maintain that this type of marking is quick and reliable if three to four people mark each script. The general rule of thumb for holistic marking is to mark for no more than two hours before taking a rest, grading no more than 20 scripts per hour. Holistic marking is most successful with scales of a limited range (e.g., from 0 to 6).

Analytical marking, according to Hamp-Lyons, is where “raters provide separate assessments for each of a number of aspects of performance”. In other words, raters mark selected aspects of a piece of writing and assign point values to quantifiable criteria (Coombe & Evans, 2001). In the literature, analytical marking has also been termed discrete-point marking and focused holistic marking. Analytical marking scales are generally more effective with inexperienced teachers, and they are more reliable for scales with a larger point range.
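
As an illustration of assigning point values to separate criteria, an analytical mark can be computed as the sum of the per-criterion marks. This is a sketch only: the five criteria and their 0 to 5 ranges are hypothetical and are not taken from any of the scales cited above; a real scale would be derived from the goals of the course.

```python
# Hypothetical criteria and maximum points; a real scale would follow the course goals.
CRITERIA_MAX = {
    "content": 5,
    "organization": 5,
    "vocabulary": 5,
    "grammar": 5,
    "mechanics": 5,
}


def analytic_total(marks: dict[str, int]) -> int:
    """Sum the per-criterion marks after checking each is present and in range."""
    for criterion, maximum in CRITERIA_MAX.items():
        if criterion not in marks:
            raise ValueError(f"missing mark for {criterion}")
        if not 0 <= marks[criterion] <= maximum:
            raise ValueError(f"{criterion} mark must be between 0 and {maximum}")
    return sum(marks[criterion] for criterion in CRITERIA_MAX)


# One rater's marks for one script (illustrative values).
script_marks = {"content": 4, "organization": 3, "vocabulary": 4, "grammar": 2, "mechanics": 3}
total = analytic_total(script_marks)
print(total, "out of", sum(CRITERIA_MAX.values()))
```

Keeping the per-criterion marks alongside the total lets the teacher report to the student which aspects of the writing were weak, which a single holistic mark does not show.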

For test reliability, it is recommended that clear criteria for grading be established and that rater training in using these criteria take place prior to marking. The criteria can be based on holistic or analytical rating scales. However, whatever scale is chosen, it is crucial that all raters adhere to the same scale regardless of their personal preference. The best way to achieve inter-rater reliability is to practice. As always, assessment should first and foremost reflect the goals of the course. For assessment to be fair to students, they should have plenty of opportunities to practice a variety of different writing skills in tasks of varying lengths. In other words, language tests should be shorter and more frequent, rather than a “snapshot” taken only at midterm and final exams.
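
One simple way to see whether rater training is working is to compare two raters' marks for the same scripts. The sketch below computes exact agreement (identical marks) and adjacent agreement (marks within one point) and lists the scripts that may need a further reading. The marks are invented for illustration, the function name `agreement_rates` is hypothetical, and the one-point tolerance is an assumption, not a rule from the sources cited here.

```python
def agreement_rates(rater_a: list[int], rater_b: list[int]) -> tuple[float, float]:
    """Return (exact, adjacent) agreement between two raters.

    exact    = share of scripts given identical marks
    adjacent = share of scripts whose marks differ by at most one point
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must mark the same non-empty set of scripts")
    pairs = list(zip(rater_a, rater_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
    return exact, adjacent


# Hypothetical holistic marks on a 0-6 scale for eight scripts.
rater_a = [4, 3, 5, 2, 6, 4, 3, 1]
rater_b = [4, 2, 5, 4, 6, 3, 3, 1]

exact, adjacent = agreement_rates(rater_a, rater_b)

# Scripts (by position) where the raters differ by more than one point.
discrepant = [i for i, (a, b) in enumerate(zip(rater_a, rater_b)) if abs(a - b) > 1]
print(f"exact = {exact:.3f}, adjacent = {adjacent:.3f}, discrepant scripts = {discrepant}")
```

Scripts flagged as discrepant are natural candidates for discussion in a rater training session or for marking by an additional rater.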

On the basis of the above, we can conclude that assessment is a major component of acquiring language skills and of the learning process, and that it involves students and teachers in continuous monitoring. It provides faculty with feedback about their effectiveness as teachers, and it gives students a measure of their progress as learners. Because assessments are created, administered, and analyzed by teachers themselves on questions of teaching and learning that are important to them, the likelihood that instructors will apply the results of the assessment to their own teaching is greatly enhanced. It is very important, however, to make a clear distinction between assessment and evaluation. Assessment is the observation of students in the process of learning, the collection of frequent feedback on students’ learning, and the design of modest classroom experiments that provide information on how students learn and how they respond to particular teaching approaches. Assessment helps individual teachers obtain useful feedback on what, how much, and how well their students are learning. In short, assessment is feedback from the student to the instructor about the student’s learning. Evaluation, on the other hand, is feedback from the instructor to the student about the student’s learning, because during evaluation teachers use methods and measures to judge student learning and understanding of the material for purposes of grading and reporting. Evaluation involves looking at all the factors that influence the learning process, such as syllabus objectives, course design, materials, methodology, teacher performance and assessment. Assessment and evaluation are often linked, because assessment is one of the most valuable sources of information about what is happening in a learning environment.

REFERENCES

1. Bachman L. F., Davidson F., Ryan K., Choi I.-C. An investigation into the comparability of two tests of English as a foreign language: The Cambridge-TOEFL comparability study. Cambridge, England: Cambridge University Press, 1995.

2. Coombe, Christine, Keith Folse and Nancy Hubley. A Practical Guide to Assessing English Language Learners. Ann Arbor, MI: University of Michigan Press, 2007.

3. Everson P. The importance of four skills in English education. Presentation at the Global Talent Cultivation Symposium, Seoul, Korea, 2009, February.

4. Buck, Gary. Assessing Listening. Cambridge Language Assessment Series. Cambridge: Cambridge University Press, 2001.

5. McNamara, Tim. Language Testing. Oxford Introductions to Language Study, (ed.) H.G. Widdowson. Oxford: Oxford University Press, 2000.

6. Milanovic, Michael and Cyril Weir (series eds.). Studies in Language Testing Series. Cambridge: Cambridge University Press, 1993.

7. Angelo, Thomas A. and K. Patricia Cross. Classroom Assessment Techniques: A Handbook for College Teachers. 2nd ed. Jossey-Bass, 1993.

8. http://www.league.org/
