# Education

## When is a 65 not a 65? Change the curve and watch scores plummet

24 June 2003

### The problem

Grades on the New York State Regents Physics Exam have experienced significant declines that began when the first of the new format exams appeared in June 2002. It is my contention that this decline is not the result of any increase in difficulty, but instead is due primarily to a change in scoring practices that eliminated a generous bonus and replaced it with a slight penalty. Questions have *not* gotten significantly harder. Students have *not* gotten weaker. From June 1992 to January 2002, a student who answered 57% of the questions correctly was awarded a passing score of 65. To receive the same score on the June 2003 exam, a student would have to answer 67% of the questions correctly. If the new exams were graded using the old system for converting raw points into a final mark, average scores and passing rates would have stayed essentially constant.

Average exam score and percent passing for eleven years' worth of Regents Physics Exams, as recorded in my personal class records. The dark red line is for students enrolled in regular physics classes. The dark green line is for students enrolled in Advanced Placement classes. Performance is clearly lower in 2002 and 2003 among both populations of students.

### The numbers

Let's begin with a comparison of the old and new exam formats.

#### Old exam format (1992–2002)

Old exams were divided into three parts. Parts I and III consisted of multiple choice and free response questions, respectively, taken from the five core units of the syllabus. Part II consisted of multiple choice questions taken from the six optional units.

- Part I, Required Multiple Choice: 55 questions worth 65 points. The "number right" is used to determine the number of "credits" according to the scoring table on the front of each student's *Answer Paper*. This "curve" added as much as 20 points to all scores except those near 0 (no answers correct) and 55 (all answers correct).
- Part II, Optional Multiple Choice: Choose 2 out of 6 groups, 10 questions per group, one point per correct answer, for a total of 20 points.
- Part III, Required Free Response: Several questions worth 15 points total.

Total points: 100

No further adjustments are made, so the final score is also out of 100.

#### New exam format (2002–present)

New exams are divided into four sections. All sections are mandatory. None are optional.

- Part A: 35 multiple choice questions worth one point each.
- Part B: This section is split into two parts.
  - Part B1: 12 multiple choice questions worth one point each.
  - Part B2: 18 points of free response questions.
- Part C: 20 points worth of "extended" free response questions.

The logic behind this division is unclear. Sections A and B1 could be combined into one group of multiple choice questions and sections B2 and C could be combined into one group of free response questions with no loss of coherence. In essence, this exam consists of 47 multiple choice questions worth one point each and 38 points worth of free response questions for a total of 85 points.

Total points: 85

This number is adjusted using the raw to scaled score conversion chart on the last page of the *Scoring Key and Rating Guide*, which varies from exam to exam. On the June 2003 exam, this "curve" subtracted as much as 3 points from all scores except those near 0 (no answers correct) and 85 (all answers correct).

Maximum scaled score: 100

There are several obvious changes:

- There are fewer multiple choice questions and more free response questions. It is generally considered good pedagogic practice to use questions that force students to generate their own answers. This is a positive change.
- There are no more optional portions. Balancing the difficulty of the optional groups for every exam was tricky. Students choosing a more difficult group were essentially punishing themselves. Eliminating them made for a fairer test.
- The method of converting raw to scaled scores, commonly referred to as the "curve", now lowers grades instead of raising them. That change is the subject of this essay, so let's compare the curves before and after the exam was overhauled to see the effect.

#### Old curve

A scoring page exactly like the one above was used for all exams from June 1992 to January 2002 [curve-page-2002-01.pdf].

As shown in the graph above, scores on exams prior to June 2002 were adjusted upwards. For example, a student who answered 13 of the 55 multiple choice questions on Part I correctly (about 25%) had a bonus of 20 points added to his or her grade. This was standard practice from June 1992 to January 2002.

Using the "old" scoring page, a student who answered 31 of the 55 multiple choice questions in Part I correctly would be awarded 45 points. Assuming a similar performance on the remaining parts of the exam, our hypothetical student would likely earn 20 of the 35 remaining points possible in Parts II and III combined. (31 out of 55 is approximately equal to 20 out of 35.) This student earned 51 out of 90, or 57%, of the possible raw score points, but was awarded a scaled score of 45 + 20 = 65.

Informally stated, "In the old days a 57 would get you a 65".
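The arithmetic behind this example can be checked with a short Python sketch. The 45 credits for 31 correct answers and the 20 points on Parts II and III come from the paragraph above; the variable names are illustrative only.

```python
# Hypothetical student from the example above (old exam format).
part1_right = 31        # Part I questions answered correctly (out of 55)
part1_credits = 45      # credits the old scoring table awards for 31 right
parts23_points = 20     # points earned on Parts II and III (out of 35)

# Raw performance: everything earned out of everything available.
raw_earned = part1_right + parts23_points       # 51
raw_possible = 55 + 35                          # 90
raw_percent = 100 * raw_earned / raw_possible   # about 56.7

# Reported score: curved Part I credits plus the uncurved remainder.
final_score = part1_credits + parts23_points    # 65

print(f"raw: {raw_earned}/{raw_possible} = {raw_percent:.1f}%")
print(f"reported score: {final_score}")
```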

#### New curve

A scoring page similar to the one above has been in use since June 2002 [curve-page-2003-06.pdf].

As shown in the graph above, scores on exams from June 2002 to the present have been adjusted downwards. For example, a student who earned 55 of 85 points (65%) on the entire exam had a penalty of 3 points subtracted from his or her grade. Again, this is only strictly true for the June 2003 exam. Adjustments now vary from exam to exam.

Using the "new" scoring page from the June 2003 exam, a student who earned 57 of the 85 possible raw points (67% correct) would receive a passing score of 65. In January 2003 and August 2002 it was 56 raw points (66% correct). In June 2002 it was 58 (68% correct). In all four cases, students now must do better than 65% correct to earn a score of 65.

Informally stated, "Nowadays, you need a 67 to get a 65".
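As a rough check on these percentages, the following Python sketch converts each quoted cut score into a straight percent and shows the implied adjustment at the passing mark. The raw scores are the ones given above; the dictionary and function names are illustrative assumptions.

```python
# Raw points (out of 85) needed for a scaled score of 65, as quoted above.
RAW_TOTAL = 85
cut_scores = {
    "June 2002": 58,
    "August 2002": 56,
    "January 2003": 56,
    "June 2003": 57,
}

def straight_percent(raw, total=RAW_TOTAL):
    """Percent of available points earned, with no curve applied."""
    return 100 * raw / total

for exam, raw in cut_scores.items():
    pct = straight_percent(raw)
    # A negative adjustment means the curve lowered the grade.
    adjustment = 65 - pct
    print(f"{exam}: {raw}/85 = {pct:.1f}% -> 65 (adjustment {adjustment:+.1f})")
```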

### The repairs needed

What would happen if the old curve were applied to the new exam? Let's adjust scores so that a student who answered 25% of the multiple choice questions correctly had 20 points added to their score and decrease the adjustment in a linear fashion like that shown in the graph above right.

Distribution of grades on the June 2003 Regents Physics Exam for students in my classes. The red filled graph shows the actual grade distribution determined using the current scoring system. The blue filled graph shows the hypothetical grade distribution had the exam been graded using the earlier scoring system. The vertical line shows the passing mark of 65.

Scoring the new format exams according to the old format rules would restore average scores and passing rates to their pre-2002 values. Once again, these values are taken from my personal records. It is assumed that a similar rebound would occur for all schools statewide.

Here are the mathematical details of how the graphs that appear in the analysis above were generated.

#### Old curve

The graph showing the "old curve" was produced using the data in this text file [scale-2002-01.txt]. The first two columns came directly from the *Answer Paper* [curve-page-2002-01.pdf] provided to each student.

- Number Right: The number of Part I multiple choice questions the student answered correctly.
- Credits: The number of points the student was awarded on Part I.
- Linear Scale: Part I is worth 65 points, but there are only 55 questions. If there were no curve, the credits awarded would be found by multiplying the "Number Right" by 65/55.
- Adjustment: The difference between the "Credits" actually awarded to the student and the credits that would have been awarded had the exam been graded with a "Linear Scale".
- Straight Percent: Calculated using the basic principle taught in middle school mathematics: take the number of points earned, divide by the total number of points possible (55), and multiply by 100%.
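The three calculated columns can be sketched in Python as follows. This is only an illustration of the definitions in the list above, not the procedure actually used to build the text file; the function name is an assumption, and the single data row shown (31 right, 45 credits) is the one quoted earlier in this essay.

```python
PART1_QUESTIONS = 55   # multiple choice questions on Part I
PART1_POINTS = 65      # points Part I is worth

def old_curve_columns(number_right, credits):
    """Return (linear_scale, adjustment, straight_percent) for one table row."""
    linear_scale = number_right * PART1_POINTS / PART1_QUESTIONS
    adjustment = credits - linear_scale
    straight_percent = 100 * number_right / PART1_QUESTIONS
    return linear_scale, adjustment, straight_percent

# The (Number Right, Credits) pair quoted in the essay: 31 right -> 45 credits.
linear, adj, pct = old_curve_columns(31, 45)
print(f"linear scale {linear:.1f}, adjustment {adj:+.1f}, straight percent {pct:.1f}%")
```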

#### New curve

The graph showing the "new curve" was produced using the data in this text file [scale-2003-06.txt]. The first two columns came directly from the *Scoring Key and Rating Guide* for the June 2003 Regents exam [curve-page-2003-06.pdf]. The third and fourth columns were calculated.

- Raw Score: The number of points the student received on all parts of the exam totaled.
- Scaled Score: This is the grade that gets reported.
- Straight Percent: Calculated using the basic principle taught in middle school mathematics: take the number of points earned, divide by the total number of points possible (85), and multiply by 100%.
- Adjustment: The difference between the "Scaled Score" that the student received and the calculated "Straight Percent". On an exam without a curve, the adjustment would be zero.

Use the following formula to calculate the score that a student would have received on the new format exam had it been scaled using the old format curve:

"Multiple Choice" × 5/6 + (25 + 5/6) + "Free Response" × 35/38

Where:

- "Multiple Choice" is the number right on Part A and Part B1, combined. Multiplying this number by 5/6 and adding 25 + 5/6 scales the 47 multiple choice questions up to a maximum of 65 points and adds a bonus of 20 points to a student who answered 25% of these questions correctly. This formula is not valid for students who answered fewer than 25% of the multiple choice questions (fewer than 12 of 47) correctly, however, as it would add more than 20 points to their scores. Since none of my students scored lower than this, it wasn't a factor in the reanalysis of their grades.
- "Free Response" is the number of points awarded on Part B2 and Part C, combined. The multiplication factor 35/38 scales the 38 points awarded on these two sections on the new exam to the 35 points that would have been awarded on Part II and Part III of the old exam.

To see the effect that this hypothetical re-grading had on my students, see this text file [histograms.txt]. The first 85 entries are for students in the regular physics class and the last 26 are for the students in the AP class. The data columns are as follows:

- "mc": The number of multiple choice questions answered correctly on Part A and Part B1.
- "fr": The number of points awarded on the free response questions on Part B2 and Part C.
- "tot": The total points on all four parts of the exam.
- "new": The scaled score determined from the total points using the *Scoring Key and Rating Guide* for the new format exam.
- "old": The score calculated according to the formula above.
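For readers who want to reproduce the "old" column from the "mc" and "fr" columns, here is a minimal Python sketch of the re-grading described in this section: the multiple choice term with its 5/6 factor and 25 + 5/6 offset, plus the free response term with its 35/38 factor. The function name and the guard against scores below 25% are my assumptions for illustration, and the 32 + 25 split in the usage line is a made-up example, not a row from the data file.

```python
# Sketch of the re-grading formula; function and constant names are illustrative.
MC_QUESTIONS = 47    # Parts A and B1 combined
FR_POINTS = 38       # Parts B2 and C combined

def old_style_score(mc, fr):
    """Score a new-format exam as if the old-format curve applied.

    mc: multiple choice questions answered correctly (0-47)
    fr: free response points awarded (0-38)
    Only meaningful when mc is at least 25% of 47 (12 or more).
    """
    if mc < 0.25 * MC_QUESTIONS:
        raise ValueError("formula over-rewards fewer than 25% correct")
    mc_credits = mc * 5 / 6 + (25 + 5 / 6)   # 47 correct -> 65 credits
    fr_credits = fr * 35 / 38                # 38 points -> 35 credits
    return mc_credits + fr_credits

# A perfect paper under the re-grading.
print(round(old_style_score(47, 38), 1))
# Hypothetical split of 57 raw points (the June 2003 passing mark): 32 mc + 25 fr.
print(round(old_style_score(32, 25), 1))
```

The guard mirrors the validity limit noted above: below 25% correct, the linear formula would add more than the 20 point maximum bonus.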

### Postscripts

#### November 2003: Exam grades to be revised

The following paragraphs summarize the actions to be taken by the Board of Regents regarding the Physical Setting: Physics exams offered in 2002 and 2003. Basically, the grades on these exams will be adjusted upwards to bring them in line with more reasonable expectations of performance for students in an introductory, high school general physics course.

New cut scores for passing (65) and passing with distinction (85) will be established, based on the standard setting done in early November. We will use these scores to issue new conversion charts for the June and August 2002 and January and June 2003 examinations, as well as for future administrations of this test beginning with January 2004. These revised conversion charts will reflect how the passing scores would have been set for a first-level course, consistent with the Regents Work Group on Physics recommendations and the work of the Physical Setting/Physics Standard Setting Committee. The conversion charts for past administrations will be issued by December 2003.

School districts are required to record the revised score on each student's permanent record. However, districts have discretion in how they apply these scores for transcripts, final class grades, class ranks, and other reports. This approach responds to concerns raised by districts regarding scores for students who graduated in 2002 and 2003. Many school districts have indicated that they do not want to be required to reevaluate class grades and ranks for students who have already graduated and would like to record new grades as pass/fail. The pass/fail option is available to districts.

James A. Kadamus, Deputy Commissioner; Office for Elementary, Middle, Secondary and Continuing Education; New York State Department of Education.

#### January 2004: Physics conversion charts

What a wonderful feeling it is to be proven right (or at least to be agreed with in a significant manner). The state department of education agrees with me that the Regents Physics Exams were unfairly graded from June 2002 to June 2003.

January 2004 physics conversion charts

At its October 2003 meeting, the Board of Regents accepted the policy recommendation of the Regents Work Group on Physics concerning future examinations. Accordingly, in November 2003, the Department organized a standard setting panel of 27 New York State teachers and other professionals to set new passing (65) and passing with distinction (85) scores in accordance with the Board’s directive to redefine the Regents physics course as a "first-level" high school course in physics comparable to other "first-level" high school sciences.... After the standard setting process, the Department, in consultation with psychometric contractors, established the new conversion charts for the June and August 2002 and January and June 2003 examinations.

James A. Kadamus, Deputy Commissioner; Office for Elementary, Middle, Secondary and Continuing Education; New York State Department of Education.