CHAPTER TWENTY-TWO
CLASSROOM ADMINISTRATION AND SCORING
22.1 Objectives
By the end of this chapter, you should be able to:
i. describe test administration and its procedures
ii. define a marking scheme and explain its preparation
iii. explain procedures for scoring
iv. convert scores to grades
v. carry out item analysis
22.2 Introduction
As you will recall, we discussed test construction as one of the basic steps in testing. To reach reliable and valid judgements about every student who takes part in a classroom test, we also need to discuss test administration and the techniques of scoring and grading. The purpose of this chapter is to explain test administration and the different techniques for scoring and grading students' work.
22.3 Test Administration
Test administration refers to the array of actions designed for an examination programme to reduce measurement error and improve the chances of a fair, valid, and accurate evaluation. Properly standardized procedures improve measurement by enhancing test reliability and consistency. Exam administration procedures are designed to enhance uniformity, assure test security, and protect the fairness and reliability of examination outcomes. These procedures are followed before, during, and after testing. Prior to testing, a test manual should be created covering all regulations governing the administration of the examination. In addition, training should be provided so that all test administrators are aware of their duties. Before delivering a test, it is also appropriate to identify and assign the required testing rooms. All exam materials should be adequately protected.
During testing, test administrators should be present in the testing environment (hall, etc.) to give technical support for any issues that may emerge. During operational testing, it is also the responsibility of the test administrators to ensure that all test security rules and procedures are followed. All testing irregularities (such as student misbehaviour, copying other candidates' answers, accessing banned materials, or using electronic devices) that occur during testing must be reported, and the regulations must be followed throughout.
After testing is complete, all secure and unsecured testing materials must be returned to a secure location for reconciliation. Before leaving school for the day, the test administrator should pack and secure test materials in the safe storage place. All the procedures before, during, and after test administration should be considered and followed to ensure uniformity and test security.
22.4 Scoring and Evaluation of Classroom Examination
As a course lecturer or subject instructor, one of your primary responsibilities is to score (or mark) and accurately evaluate student responses. This is because the final evaluation of the students depends on their scores (or grades). On occasion, incorrect scoring and grading systems have resulted in incorrect student evaluations. The objective of this section is to present several approaches for scoring and grading student responses.
22.4.1 What is a scoring system?
A scoring system awards points for accurate answers or performance in an examination or competition. It is a strategy or set of standards used to evaluate the written output of students, and it allows students to inspect their marked examination papers and see how the grading method was applied.
22.4.2 Marking Scheme Needs
Designing and employing a marking scheme has a variety of advantages for both the instructor and the student. These include:
i. All students who took the same exam are treated equally, since the marking scheme offers a standard basis for measurement.
ii. Since all students are treated similarly, the marking scheme enables a more objective approach to grading them.
iii. If the course lecturer or subject instructor who created the exam is unavailable, another expert can use the marking scheme to grade the students.
iv. A well-designed marking scheme can serve as a basis for question preparation if it is created in advance of the exam.
v. The adoption of a marking scheme enables students to self-evaluate and also facilitates peer evaluation.
vi. A marking scheme explains exactly how a student's grade is determined and how each mark is accounted for.
22.4.3 Designing Marking Scheme
The following steps can be taken to create a marking scheme for either an essay or an objective-type test:
§ Write a sample response to each question, where the topic allows.
§ Decide the maximum score for the full examination. This may vary; however, tests are often graded on a scale of 100 percent.
§ Keep the mark allocation as simple as possible. This is easiest when questions are of equal weight and responses are graded equally. If the questions are not weighted equally, the marks per question will vary; this should be specified on the exam question paper for the students' benefit.
§ When assigning points, take each question's format into consideration. If a question comprises several parts, the weight of each part determines how many points are awarded.
§ Aim to make your marking scheme accessible to non-specialists in the subject.
§ Aim to design your marking scheme so that any two markers grading the same answers would agree within two marks.
§ Allow for consequential marks: if a candidate makes an early error but subsequently recovers and proceeds correctly, award some marks for the correct steps even if the final answer is substantially incorrect.
§ Pilot your marking scheme by showing it to others for their feedback, and incorporate their revisions into the final design.
§ Consider what others have done in the past, so that you can draw on their experience when writing your own marking scheme.
§ Learn from your past errors. No marking scheme is flawless; as soon as you begin applying it to scripts you will begin to modify it. Note any difficulties you have in sticking to it and record them for next time.
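The steps above can be sketched in code. The question names and mark allocations below are purely illustrative (not from this chapter); the sketch represents a marking scheme as a table of parts and marks, from which the maximum score and a student's total follow mechanically:

```python
# A hypothetical marking scheme: each question broken into parts,
# each part carrying its own marks (illustrative values only).
marking_scheme = {
    "Q1": {"state the formula": 2, "substitute values": 2, "final answer": 1},
    "Q2": {"state the theorem": 3, "apply the theorem": 4, "final answer": 3},
}

def maximum_score(scheme):
    """Total marks available across all questions and their parts."""
    return sum(sum(parts.values()) for parts in scheme.values())

def score_script(scheme, awarded):
    """Sum the marks awarded per part, capped at each part's maximum."""
    total = 0
    for question, parts in scheme.items():
        for part, max_marks in parts.items():
            total += min(awarded.get(question, {}).get(part, 0), max_marks)
    return total

# A candidate who slips early but recovers still collects the
# consequential marks for the later correct steps.
awarded = {
    "Q1": {"state the formula": 2, "substitute values": 1, "final answer": 0},
    "Q2": {"state the theorem": 3, "apply the theorem": 4, "final answer": 3},
}
```

Because every part's marks are written down in advance, any marker applying this table to the same script should arrive at the same total.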
Student Exercise
1. What does test administration entail?
2. Outline test administration techniques.
3. Explain a marking scheme.
4. Enumerate the main components of an effective marking scheme in mathematics instruction.
5. List five arguments for using a marking scheme.
22.5 Points on the Essay Exam
Generally, essay questions in mathematics are regarded as the classic style of question, which may require brief or extensive responses. An essay-type question demands that students state theorems or formulas, apply them, and arrive at an accurate solution, as the task may require. There are two sorts of grading methods that may be applied to essay tests given to students: holistic and analytic.
i. Holistic Scoring System: a method introduced in the 1960s and widely adopted by the mid-1970s. A holistic scoring system gives the examinee a single score based on the overall quality of their work (or performance); the student receives one overall grade for the whole paper. The examiner evaluates the response as a whole and issues a single score or grade. The approach merely verifies whether the candidate has actually answered the questions and met the task requirements.
ii. The Analytic Scoring Method is a technique for evaluating student work that assigns a separate score to each task dimension. It is most frequently employed when evaluating how well students perform on individual aspects of a full product or performance. This technique examines several aspects of problem-solving, such as formula or theorem formulation, value sorting, application, and calculation. In addition, it provides a more detailed description of the examinee's performance than a holistic score. When thorough feedback is necessary, analytic scoring is preferred over holistic scoring: the detailed feedback enables examiners (teachers) to determine which aspects of an essay examination students excel or struggle with, which helps teachers plan subsequent exercises.
22.6 Objective Test Points
Depending on the type of objective exam, several methods can be used to score it: manual scoring, stencil scoring, and machine scoring.
i. Manual Scoring
In this scoring procedure, the column of responses on each examinee's paper is compared with the column of answers on the master copy.
ii. Stencil Scoring
In this method, a scoring stencil is produced by punching holes where the correct answers should be on a blank response sheet. To score, the stencil is laid over each answer sheet and the response marks that show through the holes are counted. Each exam paper is also scanned beforehand to make sure no item has been given multiple answers.
iii. Machine Scoring
Another approach for grading objective tests, particularly for large mathematics classes, is to score the test items using a machine with a computer and other suitable scoring equipment.
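Manual, stencil, and machine scoring all reduce to the same comparison: each candidate's responses are checked against the master key, item by item. A minimal sketch (the key and answer strings are made-up examples):

```python
def score_objective(key, answers):
    """Count the responses that match the master copy, item by item."""
    return sum(1 for correct, given in zip(key, answers) if given == correct)

master_key = "ABCDAEBCAD"   # hypothetical 10-item answer key

perfect = score_objective(master_key, "ABCDAEBCAD")   # all items match
one_off = score_objective(master_key, "ABCEAEBCAD")   # item 4 is wrong
```

This is exactly what a scoring machine automates; the stencil method performs the same count physically, through the punched holes.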
22.7 Conversion of Scores to Grades
Typically, a grade is determined by the number of points earned. Grading is the awarding of grades to students' scores: the process of comparing measurement results with a 'score of reference' to determine the value of those results. Strong evidence demonstrates that, although schools strive to implement honest and fair grading procedures, actual practices differ greatly from institution to institution. Grading is essentially an exercise in professional judgment by educators, especially mathematics instructors. It entails gathering evidence on students' success or performance throughout a given time period, such as a term, semester, year, or the full programme, and translating many sorts of descriptive information and performance measurements into grades or marks that describe students' achievements. Although some mathematics instructors differentiate between grades and marks, the majority consider them interchangeable. Both terms refer to a system of symbols, phrases, or numbers used to denote varying degrees of success or performance. They might be letter grades such as A, B, C, D, E, and F; symbols such as CO, NA, A+, B+, and C+; phrases such as satisfactory, pass, merit, excellent, good, and fail; or numbers such as 4, 3, 2, and 1. Below are examples of grading practices at various institutions.
Raw Score Letter Grade
70 and above A
60 - 69 B
50 - 59 C
40 - 49 D
39 and below E
Some institutions use grades that correspond to score ranges with descriptions, as shown below:
Score Grade Description
70 and above A Excellent
60 - 69 B Good
50 - 59 C Fair
40 - 49 D Pass
39 and below E Fail
These descriptions assist in interpreting student performance in the prescribed examination. Other forms of grading system are also used, as shown below:
Score Grade Description
70 and above A Excellent
60 - 69 B Good
50 - 59 C Merit
45 - 49 D Pass
40 - 44 E Low Pass
39 and below F Fail
Some schools use this grading system to report the performance of students. These grades tend to simplify the reporting of performance, although they do not reveal the strengths and weaknesses of the student.
Score Grade Description Grade Point
70 - 100 A Distinction 5
60 - 69 B Credit 4
50 - 59 C Merit 3
45 - 49 D Pass 2
40 - 44 E Low Pass 1
0 - 39 F Fail 0
These grading systems present an easy and simple interpretation of students' performance in various examinations, as the case may be.
22.8 Item Analysis
Item analysis is the evaluation of test quality through examining student responses to individual examination questions. The purpose is to evaluate the quality of the items and of the examination as a whole, and thereby to determine and enhance the test's validity and reliability. Item analysis focuses primarily on four factors: test score reliability, item difficulty, item discrimination, and distractor information.
22.8.1 Test Rating Reliability
Reliability refers to the consistency of a measurement: the degree to which a test assesses what it measures consistently, or the extent to which it yields the same scores on different occasions. A test is considered reliable if it consistently yields the same result. Typically, the consistency of a measurement is stated numerically as a reliability coefficient. There are four common ways to estimate test score reliability: test-retest, split-half, parallel forms, and the Kuder-Richardson Formula 20 (KR-20).
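Of these four methods, KR-20 is the easiest to automate for dichotomously scored (right/wrong) items. A sketch, with a made-up response matrix of four students by three items, assuming the standard KR-20 formula (k/(k-1)) * (1 - Σpq/σ²):

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 reliability for 0/1-scored items.
    responses: one list per student, one 0/1 entry per item."""
    n = len(responses)                 # number of students
    k = len(responses[0])              # number of items
    # proportion answering each item correctly (p); q = 1 - p
    p = [sum(student[i] for student in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # population variance of the students' total scores
    totals = [sum(student) for student in responses]
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / variance)

# Made-up responses: 4 students x 3 items (1 = correct, 0 = wrong)
data = [[1, 1, 1],
        [1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]
```

A coefficient near 1 indicates internally consistent scores; near 0, inconsistent ones.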
22.8.2 Index of Difficulty
The difficulty index is sometimes referred to as the ease percentage. It is the proportion of students in the upper and lower groups who answered a question correctly; in other words, the proportion of students who answered a given exam question correctly. The higher the difficulty index, the easier the question; the lower the difficulty index, the more challenging the item.
Difficulty index formula: p = n/N or p = (U + L)/N
n = number of students who answered the item correctly (equivalently U + L, the numbers picking the right answer in the upper and lower groups)
N = total number of students who responded to the item
Level of Difficulty
Index Range Difficulty Level
0.00 – 0.20 Very Difficult
0.21 – 0.40 Difficult
0.41 – 0.60 Moderate
0.61 – 0.80 Easy
0.81 – 1.00 Very Easy
Let us look at an example of computing item difficulty: if, in a class of 56, 15 students answered a test item correctly, then p = 15/56 ≈ 0.27.
Alternatively, the class can be split into upper and lower halves, as shown in the table below:
Table 22.1 Responses for 56 students in a class
Responses A B C D E
Upper Half 8 9 5 2 4
Lower Half 0 6 4 10 8
If B is the correct answer to the given item, then
the difficulty index p = (U + L)/N = (9 + 6)/56 = 15/56 ≈ 0.27. The difficulty index is used to determine which items are too easy or too difficult. If an item is too easy or too difficult, it should be discarded or modified before it is used again. The ideal difficulty index is in the moderate range, from 0.41 to 0.60.
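The worked example can be checked in code; the level labels follow the difficulty table above:

```python
def difficulty_index(upper_correct, lower_correct, total_students):
    """p = (U + L) / N, as defined above."""
    return (upper_correct + lower_correct) / total_students

def difficulty_level(p):
    """Map a difficulty index to the levels in the table above."""
    if p <= 0.20:
        return "Very Difficult"
    if p <= 0.40:
        return "Difficult"
    if p <= 0.60:
        return "Moderate"
    if p <= 0.80:
        return "Easy"
    return "Very Easy"

# Worked example: 9 upper-half and 6 lower-half students chose B,
# out of 56 students in all.
p = difficulty_index(9, 6, 56)
```

Here p ≈ 0.27, which the table classifies as a difficult item, outside the ideal 0.41-0.60 range.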
22.8.3 Discrimination Index
This index measures the item's ability to distinguish between students who performed well on the exam as a whole and those who did not; it helps determine how well the test item separates the high-performing and average students in the same class. There are three forms of discrimination: positive, negative, and zero.
Positive discrimination occurs when more pupils in the upper group correctly answered the question than those in the lower group.
When more pupils in the lower group correctly answered the question than those in the higher group, this is known as negative discrimination.
Zero discrimination occurs when equal numbers of students in the upper and lower groups answer the test item correctly.
Level of Discrimination
Index Range Discrimination Level
0.19 and below Poor item; should be removed or revised
0.20 – 0.29 Marginal item; needs some revision
0.30 – 0.39 Reasonably good item, but possibly open to improvement
0.40 and above Very good item
Discrimination index formula: D = (CUG - CLG)/N
D= Discrimination index value
CUG = number of students selecting the correct answer in the upper group
CLG = number of students selecting the correct answer in the lower group
N = the total number of students that are involved in test
Using the example above, the discrimination index is:
D = (CUG - CLG)/N = (9 - 6)/56 = 3/56 ≈ 0.05
A test with many poor questions will give a false impression of students' performance. Usually, a discrimination index of 0.40 and above is acceptable, while an item that discriminates negatively is a bad item.
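The chapter's formula and worked example translate directly into code; note that, as defined above, the divisor N is the total number of examinees:

```python
def discrimination_index(correct_upper, correct_lower, n_total):
    """D = (CUG - CLG) / N, following the formula above."""
    return (correct_upper - correct_lower) / n_total

# Worked example: 9 upper-group and 6 lower-group students answered
# correctly, out of 56 examinees in total.
D = discrimination_index(9, 6, 56)
```

The result, about 0.05, falls in the "0.19 and below" band of the table above, so this item would be flagged for removal or revision.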
22.8.4 Distractor Index
The distractor index measures the power of an incorrect option to draw students away from the correct answer. It is computed for the incorrect alternatives in a multiple-choice test and is aimed at measuring how plausible the wrong options are, and thus the quality of each question's answer choices. For a good distractor the index normally reads negative, because it differentiates between the performance of high- and low-performing students; a positive distractor index marks a poor distractor.
The distractor index is computed with the formula:
dt = (BU - BL)/n
dt = distractor index
BU = number of students in the upper group who chose a wrong option (failed the item)
BL = number of students in the lower group who chose a wrong option (failed the item)
n = number of students in each group
Use the table below to compute the distractor index:
Table Tallying Responses for Mathematics Test
Responses A B C D E
Upper 2 5 1 11 1
Lower 3 6 1 8 2
If alternative D is the correct answer, then we have:
dt = (BU - BL)/n = (9 - 12)/20 = -3/20 = -0.15
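The same computation in code, aggregating the wrong-option counts for each half exactly as the text does:

```python
def distractor_index(wrong_upper, wrong_lower, group_size):
    """dt = (BU - BL) / n, following the formula above."""
    return (wrong_upper - wrong_lower) / group_size

# Tallies from the mathematics-test table above (20 students per half).
upper = {"A": 2, "B": 5, "C": 1, "D": 11, "E": 1}
lower = {"A": 3, "B": 6, "C": 1, "D": 8, "E": 2}
key = "D"                                                    # correct option

bu = sum(n for option, n in upper.items() if option != key)  # wrong in upper
bl = sum(n for option, n in lower.items() if option != key)  # wrong in lower
dt = distractor_index(bu, bl, 20)
```

The negative result (-0.15) indicates that the wrong options attract more lower-group than upper-group students, which is what a good set of distractors should do.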
Student Activity
1. Explain the following methods of scoring: i. holistic ii. analytic
2. What is item analysis?
3. Enumerate three indices involved in item analysis.
4. What is a positive discrimination index?
5. Compute the difficulty index, discrimination index, and distractor index for the responses below, given that B is the correct answer.
Alternatives A B C D E
Upper 2 10 0 0 3
Lower 4 5 4 0 2
6. Ten students have taken a mathematics test of ten objective questions as shown in the table below:
Table 2: Maths Test Scores
Question 1 2 3 4 5 6 7 8 9 10
Upper 4 5 4 4 5 4 4 4 2 4
Lower 2 5 4 1 3 2 1 2 2 2
a. Which question was the easiest?
b. Which question was the most difficult?
c. Which item has the poorest discrimination?
d. Which questions would you eliminate first (if any), and why? Calculate the difficulty index and the discrimination index for each item.
7. Enumerate and explain the methods of scoring an objective test.
22.9 Summary
The major points raised in this chapter include the following:
§ Test administration means the range of activities developed for an examination in order to help reduce measurement error and increase fairness, validity, and reliability. It covers activities before, during, and after the test.
§ Marking scheme as an outline of the expected answers together with the mark allotted to each question.
§ There are two scoring methods for an essay test: holistic and analytic.
§ Manual, stencil, and machine scoring are methods for scoring an objective test.
§ Grades tend to simplify the reporting of performance, although they do not reveal the strengths and weaknesses of the student. Grading is the process of comparing measurement results with a 'score of reference' to produce a statement of value.
§ Item analysis is an act of analyzing student responses to individual examination questions with the intention of evaluating examination quality.
§ Item analysis focuses on four major factors: score reliability, item difficulty, item discrimination, and distractors.