The SAT: Aptitude or demographics?

Introduction
- About the test
Validity
Coachability
- The little green book that lies
- How to beat the test
Biases
- Income
- Race
Implications
- A word about test score decline
- Hope will be kept within reasonable bounds
Postscripts
Endnotes
Works cited

5 May 1992

Introduction

We tend to be better at concocting excuses for standardized tests than we are at making sense of their results. Faced with the baffling complexity of human thought, we look for "objective" methods in hopes of a direct route to its assessment. No one would presume to describe a student's mind in a single sentence; but we are confident that a number can say it all. Every year, 1.7 million students subject themselves to the Scholastic Aptitude Test at the request of college admissions officers looking for just such a short cut (Milwaukee Journal). Faced with a diverse array of applicants, 92% of American colleges require SAT scores on the assumption that they provide a method of equating students with differing academic backgrounds but identical grade point averages (Hartnett & Feldmesser 4). Educational Testing Service (ETS), which produces the SAT and four other entrance exams¹, claims that the test measures not just how capably individuals answer analogy and geometry questions, but how capably they will perform in the academic world. A review of the literature tells a different story. The SAT is not a measure of how successful one will be in college, but how well one conforms to the demographics of the group that did well on the first exam.

About the test

The original SAT was offered in 1926. The format was much the same then as it is now: two 30-minute "verbal" sections on "vocabulary, verbal reasoning, and reading comprehension;" two 30-minute "math" sections on "arithmetic, algebra, and geometry;" and an additional 30-minute "experimental" section (verbal or math) used to equate the exam with previous versions of itself and to pre-test questions that might appear on future exams² (CEEB 1991 3). The experimental section is not identified on the test and does not count towards a candidate's score. Scores range from 200 to 800 points on each of the math and verbal sections. Between 1926 and 1941, the scores were readjusted to produce an average of 500 points. All versions of the test subsequent to 1941 are equated to one another using the experimental section. The first SAT was developed for the College Entrance Examination Board (CEEB) by Carl Campbell Brigham, who had previously participated in the development of the "Army-Alpha" intelligence tests. The College Board, the Carnegie Foundation, and the American Council on Education (all of which still exist) consolidated their standardized testing programs under the Educational Testing Service in 1947 (adapted from Owen and Nairn).

Validity

Statistics

Colleges derive predictions of applicants' performance from regression equations based on the performance of the previous year's students. The ability to predict freshman grades is the backbone of the SAT's claim to measure aptitude. In the world of psychometrics, the valid aptitude test is one that can predict a person's performance. A perfect prediction would be 100% accurate. A physical measurement, such as that provided by a thermometer or a ruler, generally delivers accuracy of 95% or more. The SAT, according to figures compiled by Ford and Campos of ETS, ranges in accuracy from 8 to 15% in the prediction of freshman grade point average³ (11). This means that, on the average, for 88% of the applicants (though it is impossible to know which ones) an SAT score will predict their grade rank no more accurately than a pair of dice. The best known record of prediction by the SAT, reported in a 1978 ETS survey of studies, was at a New Jersey college where the 1978 SAT-Verbal would have been matched as a predictor by random chance only 59% of the time. The worst result was reported at a university in Indiana where chance would have predicted grades as well as the 1972 SAT-Verbal 99.96% of the time (Breland & Minsky 149, 153).
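The "accuracy" percentages quoted in this section are squared correlation coefficients (see endnote 3). As a rough illustration, the conversion in both directions can be sketched in a few lines of Python. The function names are my own, chosen for this example, and are not taken from Nairn or ETS.

```python
import math


def percent_of_perfect_prediction(r: float) -> float:
    """Square a correlation coefficient and express it as a percentage."""
    return 100.0 * r ** 2


def correlation_from_percent(pct: float) -> float:
    """Recover the (non-negative) correlation behind a given percentage."""
    return math.sqrt(pct / 100.0)


if __name__ == "__main__":
    # The 8% to 15% range reported for predicting freshman GPA.
    for pct in (8, 15):
        r = correlation_from_percent(pct)
        print(f"{pct}% of perfect prediction corresponds to r = {r:.2f}")
```

Read this way, the 8 to 15% range corresponds to correlations of roughly 0.28 to 0.39 between SAT scores and freshman grades.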

As dismal as these self-reported results are, the actual results are lower still. ETS has been known to take liberties with correlation coefficients for use in defending the SAT. Slack and Porter, in a 1980 Harvard Educational Review article, showed that Ford and Campos consistently misreported validity calculations in an apparent effort to make the SAT look better. Ford and Campos found average predictive accuracies of 16% for SAT-Verbal, 12% for SAT-Math, and 25% for high school record (11). But when Slack and Porter redid the arithmetic they found actual values of 14%, 10%, and 27% respectively (165). The errors were quite systematic, and always in favor of the SAT. Previous grades are thus about twice as good as the SAT at predicting academic achievement.

Although the SAT is an inferior predictor relative to high school grades, it can increase the accuracy of prediction when used in combination with them. This has been the main justification for requiring the tests for admissions. Data from several validity studies, however, indicate that inclusion of SAT scores improves prediction by an average of only 5% or less (Nairn 66). The major reason that the benefits are so low is that the SAT provides redundant information. Gottfredson and Crouse argued that at least 90% of the decisions to admit or reject a student are the same whether the SAT is used in conjunction with high school rank or not. "SAT scores and high school rank," they said, "are moderately correlated with each other [from 0.4 to 0.5] so that outcomes predicted from high school rank alone have a part-whole correlation of at least 0.8 with outcomes predicted from rank plus SAT" (368).
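The redundancy argument can be checked with the standard formula for the squared multiple correlation of two standardized predictors. The sketch below is only an illustration. The three input correlations are hypothetical values that I picked to sit near the figures quoted in this section (about 25% for high school record, about 14% for the SAT, and a rank-to-SAT correlation between 0.4 and 0.5). They are not data from Gottfredson and Crouse.

```python
import math


def multiple_r_squared(r_hy: float, r_sy: float, r_hs: float) -> float:
    """R^2 for predicting outcome y from two standardized predictors h and s.

    r_hy: correlation of high school rank with the outcome
    r_sy: correlation of SAT score with the outcome
    r_hs: correlation of high school rank with SAT score
    """
    return (r_hy ** 2 + r_sy ** 2 - 2 * r_hy * r_sy * r_hs) / (1 - r_hs ** 2)


def part_whole_correlation(r_hy: float, r_sy: float, r_hs: float) -> float:
    """Correlation between predictions from h alone and from h plus s."""
    return abs(r_hy) / math.sqrt(multiple_r_squared(r_hy, r_sy, r_hs))


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to resemble the figures in the text.
    r_hy, r_sy, r_hs = 0.50, 0.37, 0.45
    gain = multiple_r_squared(r_hy, r_sy, r_hs) - r_hy ** 2
    print(f"Added share of perfect prediction: {100 * gain:.1f} points")
    print(f"Part-whole correlation: {part_whole_correlation(r_hy, r_sy, r_hs):.2f}")
```

With these assumed inputs, the SAT adds under three percentage points, and the two sets of predictions correlate at well above 0.8. Both results are in line with the claims quoted above.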

Marginal as they are, the predictions of first year grades are the test's most accurate forecasts. Correlations between scores and grades in later years, and overall college average, are lower still. One study found that the ability of college admission tests to predict grades declined consistently from one semester to the next throughout eight semesters (Humphreys). The virtual disappearance of the aptitude test's ability to predict beyond the freshman year has been explained by some commentators as a result of the nature of advanced study. Multiple choice testing predominates in introductory courses, they argue, but intermediate and advanced courses demand a broader range of performance.

An even better standard of scholastic success is staying in school. Alexander W. Astin suggested that:

In a very practical sense, the student's ability to stay in college is a more appropriate measure of his success than is his freshman GPA. Although it is true that good grades will help him gain admission to graduate school, to win graduate fellowships, and even to secure certain types of jobs, they are irrelevant to any of these outcomes if the student drops out of college before completing his degree requirements (14–15).

Astin found that using SAT scores to predict who will graduate resulted in 3.2% of perfect prediction for men and 2.9% for women (17–18). This means that for over 95% of the cases, random selection would predict the odds of remaining in school as well as the SAT. "Whether or not the student will drop out of college after the freshman year," Astin noted, "can be predicted with only a low degree of accuracy" (20).

Crouse and Trusheim, in their book The Case Against the SAT, conducted the most detailed statistical analysis of the SAT's predictive shortcomings. Using data from the National Longitudinal Study (NLS) of the high school class of 1972, they calculated the number of additional correct admissions using high school rank (HSR) alone and with the SAT. With four different measures of undergraduate success, they calculated that using the SAT in admissions adds between 0.1 and 2.7 additional correct forecasts per 100 applicants (see Table 1).

Table 1: Additional admissions correctly forecasted compared to random selection per 100 applicants (adapted from Crouse & Trusheim 54–56)
Predictor	Freshman year, GPA > 2.5	Freshman year, GPA > 3.0	Bachelor's degree, GPA > 2.5	Bachelor's degree, GPA > 3.0
HSR alone	9.2	14.0	6.1	5.9
HSR+SAT	11.9	16.2	6.2	6.1
Difference	2.7	2.2	0.1	0.2

Statistics on test scores and college success, however, can never reveal what might have been. In a rare practical experiment, Williams College admitted 358 students over a ten-year period, ten percent of each year's new admissions, who would otherwise have been rejected under the school's normal test score and grade requirements. The identities of the "ten percenters" were kept secret from faculty and students. They were subjected to the same academic requirements as other students and received no special aid. In 1976 the results were announced: 71% of the "ten percenters" graduated, compared with the school average of 85%. In one graduating class, the class president, the president of the college council, and the president of the honor society were all "ten percenters" (New York Times).

Predictors that are better than the SAT

The qualities students need to succeed over the long haul were examined in a 1972 study of over 3,400 college teachers by ETS psychologist Jonathan R. Warren. His findings were as follows:

What does it take to succeed in college? Motivation was the quality most frequently cited by over 3400 college teachers during a recent study of academic performance. The teachers mentioned students' academic commitment and interest even more often than intellectual ability as characteristics of their best students (qtd in Nairn 71).

Similarly, a study by the American College Testing Program reported that "variables such as motivation and a student's background" discriminated between average students and those who dropped out of college, while academic data such as SAT scores and College Board Achievement Tests did not discriminate between these groups (Nicholson). In general, the best predictor of creative output in adulthood is participation during youth in independent, "self-sustaining" ventures (Joekel 6). According to research summarized by ETS in 1979, "the best predictor of accomplishment in college" is not the SAT but "accomplishment in the same area in high school, as measured by simple check lists of nonacademic achievements" (Baird qtd in Nairn 77).

In a truly bizarre experiment sponsored by ETS, Dr. John R. P. French of the University of Michigan reported high correlations of "achievement orientation" with uric acid levels in the blood (0.66). According to Dr. French:

[We] were able to predict four and a half years in advance which high school students would go on to college and which would not. We were also able to predict which ones would drop out of college, and if we took into account IQ, how long before they dropped out.... We hope to do some studies of serum uric acid in the selection of executives (31).

Considering the SAT's poor predictive ability, one has to wonder exactly what takes priority at ETS. Such criticisms are not isolated. A former ETS executive once described the company as "an educational country club [designed to] pamper over-priced researchers who sit all day and contemplate their psychometric navels⁴" (Owen 8).

There is no correct answer

To get an idea of what ETS really thinks about the accuracy of the SAT, consider its principal method of detecting cheating. ETS' scoring machines are programmed to set aside the answer sheets of students who score suspiciously higher or lower in taking the SAT for the second time. In order to set off the machines, there has to be a 150 point difference on either half of the test or a 250 point combined difference between the first and the second times a test is taken (Milwaukee Journal). When a paper is set aside, ETS investigators check for inconsistencies in handwriting or signatures and can recreate the seating arrangements to look for indications of collaboration or spying. In most cases cheating is not found and the results are allowed to stand. If this isn't cheating then what is? Does ETS think that scholastic aptitude is so volatile that it can grow or shrink 25% in three months⁵? If aptitude were really an innate, unlearnable thing and if the SAT really measured it, then any change over 34 points, the SAT's standard error of measurement, should be suspect (CEEB 1965 21).
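The flagging rule described above is simple enough to sketch in code. This is a minimal illustration built only from the thresholds reported in the text. I am assuming that the thresholds are inclusive and that "combined difference" means the change in the verbal-plus-math total. The function names are hypothetical and do not describe ETS' actual software.

```python
SECTION_THRESHOLD = 150    # point change on either half of the test
COMBINED_THRESHOLD = 250   # point change in the verbal + math total
STANDARD_ERROR = 34        # standard error of measurement cited in the text
SCALE_MIN, SCALE_MAX = 200, 800


def is_flagged(first: tuple[int, int], second: tuple[int, int]) -> bool:
    """Return True if a retest would be set aside under the reported rule.

    Each argument is a (verbal, math) pair of scaled scores.
    """
    verbal_change = abs(second[0] - first[0])
    math_change = abs(second[1] - first[1])
    combined_change = abs(sum(second) - sum(first))
    return (
        verbal_change >= SECTION_THRESHOLD
        or math_change >= SECTION_THRESHOLD
        or combined_change >= COMBINED_THRESHOLD
    )


def fraction_of_scale(points: int) -> float:
    """Express a point change as a fraction of the 200-800 score range."""
    return points / (SCALE_MAX - SCALE_MIN)


if __name__ == "__main__":
    # A 140-point verbal gain is more than four standard errors, yet passes.
    print(is_flagged((500, 500), (640, 600)))
    print(140 / STANDARD_ERROR)
    print(fraction_of_scale(SECTION_THRESHOLD))
```

Under these assumptions, a student can gain 140 points on one section, more than four standard errors of measurement, without triggering any review. The 150 point threshold itself is a quarter of the 600 point score range.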

That ETS would allow such a wide variance is, to me, an indication of the exam's misguided construction. The SAT is not built from content specifications or on a model of human reasoning, but rather from statistical guidelines. This results in circular reasoning, where the right answer is the one that the students who perform best on the test chose most. Questions are selected solely for their ability to discriminate high scorers from low ones. When new questions are written they are "pretested" to see if they conform to these requirements. A different thirty-minute section of each SAT consists of untried questions that don't count towards the test-taker's score. How students respond to these items determines whether or not they will be used on real SATs. An item writer for ETS explained the process:

It was all very pragmatic. It wasn't... theoretical or anything. I had always known this to be true, but it had never been presented to me with such force. There is no Platonic correct answer to any of these questions; it's all determined by the statistical performance of the question as it relates to other questions. If students who do well on the exam generally tend to pick the same answer then it must be pretty good (Owen 79).
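To make the circularity concrete, here is a small sketch of a common textbook discrimination index: the share of high scorers who pick the keyed answer minus the share of low scorers who do. I do not know which statistic ETS actually uses. The 27% group size is a conventional textbook choice, and the sample data are invented for illustration.

```python
def discrimination_index(total_scores, item_correct, fraction=0.27):
    """Upper-lower discrimination index for one test item.

    total_scores: each student's overall test score
    item_correct: 1 if that student chose the keyed answer, else 0
    fraction:     share of students placed in the high and low groups
                  (0.27 is a textbook convention, assumed here)
    """
    ranked = sorted(zip(total_scores, item_correct), key=lambda pair: pair[0])
    group_size = max(1, int(len(ranked) * fraction))
    low_group = ranked[:group_size]
    high_group = ranked[-group_size:]
    p_low = sum(correct for _, correct in low_group) / group_size
    p_high = sum(correct for _, correct in high_group) / group_size
    return p_high - p_low


if __name__ == "__main__":
    # Invented data: ten students, ordered by overall score.
    scores = [300, 350, 400, 450, 500, 550, 600, 650, 700, 750]
    tracks_score = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
    ignores_score = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
    print(discrimination_index(scores, tracks_score))
    print(discrimination_index(scores, ignores_score))
```

Nothing in the calculation refers to whether the keyed answer is right in any independent sense. An item survives if the people who already score well tend to choose it.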

Coachability

The little green book that lies

It's interesting to note that when the first SAT was administered, the idea of multiple choice testing was almost unknown in American schools. Many of the students who took the test were hesitant to guess when they weren't certain of an answer. "They felt that guessing was not only risky but even immoral, equating it with cheating" (Owen 93). This led one ETS researcher to suggest that students should learn "how to behave effectively when taking a test" (qtd in Nairn 95). If one could be taught effective test taking, might it not also be possible to teach superior test taking? The official line at ETS was no.

In 1976, the Federal Trade Commission responded to ETS' long-standing wish for a government investigation of the coaching schools. ETS' claim was that the aptitude the SAT measured was acquired over years, so promises of significant results (over 100 points) in six weeks were false advertising. In Effects of Coaching on Scholastic Aptitude Test Scores, the College Board reported that:

Despite variable factors from one study to another, the net result across all studies is that score gains directly attributable to coaching amount, on the average, to fewer than 10 points — a difference of such small magnitude... that it is unreasonable to expect it to affect college admissions decisions. The magnitude of the gains resulting from coaching vary slightly, but they are always small regardless of the coaching method used or the differences in the students coached (4).

Unfortunately for ETS, the plan backfired. The test preparation schools were not cited for fraudulent advertising; ETS was. The initial FTC report found that coaching courses, on the average, raised scores more than 100 points on both the verbal and math sections⁶ (Nairn 102). "Contrary to [the] explicit claims of ETS/CEEB," said Albert Kramer, Director of the Bureau of Consumer Protection, "coaching can be effective..." (Levine 5).

How ETS got into this situation is beyond me. The original 1968 report, Effects of Coaching on Scholastic Aptitude Test Scores, had become known within the company as "The Little Green Book That Lies" (Owen 89). By the time of the investigation, it had been known for four years that the math portion was vulnerable to coaching (Pike & Evans). In 1978, Lewis W. Pike of ETS summarized the results of previous coaching studies, including several that ETS had neglected to cite in its earlier summaries. He concluded that the SAT-Math was clearly coachable and that the SAT-Verbal probably was, but that no comprehensive study of the latter had been attempted (Pike). A few weeks later, Lewis Pike was fired. The following year, the College Board issued a new official statement on coaching, published under this headline: "Board reaffirms its position that 'coaching' for SAT is not likely to improve students' scores" (Owen 100). One really has to doubt the objectivity of ETS in assessing its own products.

How to beat the test

In order for a test question to make it onto a real SAT it has to have certain statistical characteristics. A question is used only if high scoring students tend to get it right and low scoring students tend to get it wrong. If low scoring students do as well as high scoring students, then the question will have an unacceptably low discrimination, and ETS will either have to rewrite it or discard it. The key to a successful test preparation course lies in its ability to address the circular logic in writing questions on a statistical foundation. David Owen, in his book None of the Above, describes just one such coaching school: The Princeton Review. In their method, test candidates are taught how to recognize the incorrect alternatives to a question. As Owen reported:

Adam Robinson [one of Princeton Review's founders] calls this average test taker Joe Bloggs. When Joe Bloggs takes the SAT, he scores 450. When ETS lays a trap for him, he steps in it. Princeton Review students learn how to avoid these traps by learning to understand how Joe Bloggs thinks. When Princeton Review students come to a hard question they don't understand, they ask themselves: What would Joe do here? Then they do something else (124).

Owen's reported average score increases for Princeton Review students were 185 points on either portion of the SAT. Roughly 30% of the students experienced gains in excess of 250 points (122).

At Princeton Review, candidates are also taught how to find the experimental section. Since it doesn't count towards the final score, they fill it out at random and save themselves the trouble⁷. This effectively sabotages the SAT's statistics on future versions of the exam. When they answer the questions at random, they reduce the difficulty of future SATs by making the pretest questions look harder than they are. Remarked John Katzman, another founder of Princeton Review, "The SAT is bullshit" (Owen 140).
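The arithmetic behind this sabotage is easy to sketch. If some fraction of candidates answer a pretest question at random, the observed share of correct answers is a blend of the true share and the chance rate. The five answer choices per question and the sample figures below are my assumptions for illustration, not numbers from Owen.

```python
def observed_proportion_correct(true_p, random_fraction, n_choices=5):
    """Share of correct answers seen when some candidates guess blindly.

    true_p:          share of serious candidates who answer correctly
    random_fraction: share of all candidates who fill in answers at random
    n_choices:       answer choices per question (five is assumed here)
    """
    chance = 1.0 / n_choices
    return (1 - random_fraction) * true_p + random_fraction * chance


if __name__ == "__main__":
    # Hypothetical item: 60% of serious candidates get it right,
    # but one candidate in five is guessing at random.
    print(observed_proportion_correct(0.60, 0.20))
```

In this hypothetical case, an item that 60% of serious candidates answer correctly appears to be answered correctly by only about 52% of the pretest group, so it is rated harder than it is. The effect runs the other way only for items that serious candidates get right less often than chance.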

Biases

The triumph of Princeton Review over the SAT reveals an inherent problem with a test based on a statistical model. If the psychometricians can't define what aptitude is — outside of saying that apt people have it — what exactly are they measuring? It might pay to examine some of the characteristics of the people ETS identifies as less apt.

Income

If the SAT is an extremely weak predictor of academic potential, it is a moderate predictor of family income. Average scores rise with family income: students from families with higher incomes tend to receive higher scores⁸. Estimates of the correlation between SAT score and family income vary from 0.23 to 0.40 (Crouse & Trusheim and Doermann, respectively). This ranking by income prevails not just when large groups are averaged together but also among applicants within the same institution.

A table from Crouse & Trusheim's book The Case Against the SAT (reprinted below) indicates that SAT scores differentiate people not only by income but also by their parents' role in the economic system. The average scores of the children of professionals are higher than those of the children of white collar workers, which, in turn, are higher than those of the children of blue collar workers. High school rank, which is a better measure of academic achievement than SAT scores, shows no such correlation.

Table 2: Correlations of SAT and high school rank (HSR) with socioeconomic background (Crouse & Trusheim 126)
Test	Family Income	Father's Occupation	Father's Education	Mother's Education
SAT	0.286	0.238	0.296	0.269
HSR	0.029	0.043	0.085	0.067

If SAT scores really measure a person's scholastic aptitude, then that aptitude is distributed according to parental income.

Some have used such studies to indicate the academic superiority of the upper classes. Further investigation reveals the folly of such assumptions. An American Council on Education study of 36,581 students in 55 colleges concluded that: "The income of a student's parents has no relationship to freshman GPA, either before or after controlling for high school grades, academic aptitude, and college selectivity" (Astin 1971 14). Similarly, an ETS study of 15,535 college bound students found that actual accomplishments outside the classroom did not correlate with income either:

Although educational ambitions were significantly related to accomplishments in several areas, family income was not [one of them]. That is, students from families with different incomes did not significantly differ in the number or level of accomplishments they reported (23).

Not only do the children of the wealthy score unusually high on the SAT, they also have, by virtue of their wealth, increased access to test preparation materials and coaching schools — tuition for the Princeton Review course was $500 in 1984. The FTC investigation found that those candidates who had taken advantage of coaching were heavily concentrated in the upper income brackets: In 1978, 41% of the coached students came from the top income bracket of $30,000 or more (Levine qtd in Nairn 98). As Owen said of Princeton Review students, "[They] simply don't take the same test.... The effect would be the same if ETS randomly selected a thousand white, wealthy students each year, gave them the answers to the SAT in advance, and then denied that it had done so" (139).

Race

In addition to its socioeconomic bias, the SAT is also prejudiced against non-whites. For example:

- An ETS study of students at integrated colleges found that, in the six schools for which the information was available, "while SAT score means for blacks were lower than those for their white counterparts, their mean high school ranks were higher" (Davis & Temp 2).
- An extensive study by Alexander Astin found that "dropout rates of black students attending white colleges... are slightly lower than is predicted from grades and test scores" (Astin 1970 92).
- Goldman and Widawski, in a study of four University of California institutions, reported that "blacks and Chicanos are clearly not benefited by the use of the SAT in selection of college students — at least at the institutions we have investigated. In every instance, far fewer black and Chicano students would be selected when the SAT is used than when it is not" (196). Use of the SAT at the four universities would make inadmissible 12–15% of the Chicano and 14–43% of the black applicants who would have been admitted on the basis of high school GPA (192–3).

ETS' reply to such claims — some of them from their own researchers — is that the SAT does not discriminate against particular groups per se, but rather that it reflects the fundamental inequality of American society. The SAT is no more responsible for these inequalities than a thermometer is responsible for a fever. It is one thing to use test scores to illuminate disparity, but it is something else entirely to restrict opportunities with them. The real crime of the SAT is that it disguises this disparity as a morally neutral difference in aptitude. Daniel Seligman, associate managing editor of Fortune magazine, had this to say on the subject:

ETS tests persist in showing some people to be smarter than others. And if some people are smarter than others, there might actually be some justification for an economic system in which some people have more money and authority than others.... The really interesting question is not whether rich people are smarter, but why they are. Is it because of their superior environments or their superior genes? The answer... is 'both, obviously' (84).

Wealthy whites don't see SAT results as proof that the poor are mistreated; they see them as proof that mistreatment of the poor is fair.

Implications

A word about test score decline

The decline of test scores with age has long been a feature of standardized aptitude and intelligence tests. The American College Testing Program openly admitted the age discrimination in its college admissions test: "Age groups are combined for prediction [by ACT scores]; however, this procedure leads to consistent underprediction of the grades of older students, and thus to bias against them" (ACT 23). If claims about what these tests measure are taken at face value, they show that adults decline in aptitude as soon as they pass their early twenties. If we buy into the whole notion that the SAT measures "verbal and mathematical abilities... developed over many years both in and out of school" then on the average most people lose ability shortly after high school (CEEB 1991 3). What's really happening, however, is that as the test taker advances in the performance and skill-oriented job world he moves farther and farther away from the test-oriented school world. The tendency of aptitude tests to penalize people without recent practice in test-taking skills has its greatest impact on candidates returning to school after several years in the job market. Those who weren't able to go on to higher education immediately out of high school, displaced workers looking for additional education, and homemakers returning to school are all penalized.

A lot of press over the last 20 years has been devoted to the consistent decline in average SAT scores beginning in 1963 and their miraculous turnaround in 1982. Every SAT since 1941 has been normed against the test administered in April of that year. Whatever else has changed in the world, the SAT remains, according to a College Board publication, an "unchanging standard" (Advisory Panel 8). Various explanations for the decline have been proposed, from increased electives and "diminished seriousness" to television and fluoride in drinking water, but none of them is based on a test-oriented model⁹ (Advisory Panel 46–48). Given that the test is a better predictor of status quo demographics than of scholastic aptitude, I would imagine that any statistically significant changes are directly attributable to demographic changes in the population of students that take the test. In much the same manner that scores decline with age, they decline as the demographics of the test-takers move away from the white, Anglo-Saxon, upper class norm of 1941 and towards a multiethnic, economically heterogeneous sample.

Hope will be kept within reasonable bounds

Carl Campbell Brigham, the creator of the first SAT, was a firm believer in tying advancement and opportunity to "merit." Unfortunately, he was also a bigot. His only book, A Study of American Intelligence, "proved," through army intelligence statistics, that Catholics, Greeks, Hungarians, Italians, Jews, Poles, Russians, Turks and, especially, Negroes were innately less intelligent than Germanic and Scandinavian peoples. "We... face here," he wrote, "a possibility of racial admixture here that is infinitely worse than that faced by any European country today for we are incorporating the Negro into our racial stock, while all of Europe is comparatively free from this taint" (209). By carefully sampling the mental power of the nation's young people, he hoped to identify and reward those citizens whose racial inheritance had granted them superior intellectual powers. The SAT was to be the currency of merit in a new American social order based on an aristocracy of aptitude, or meritocracy.

Henry Chauncey, founder and first president of ETS, inherited the meritocratic ideal and continued to preach its virtues:

To many the prospect of measuring in quantitative terms what have previously been considered intangible qualities is frightening, if not downright objectionable. Yet, I venture to predict that we will become accustomed to it and find ourselves better off for it. In no instance that I can think of has the advance of accurate knowledge been detrimental to society.... Educational and vocational guidance, personal and social adjustment most certainly should be greatly benefited. Life may have less mystery but it will also have less disillusionment and disappointment. Hope will not be a lost source of strength, but it will be kept within reasonable bounds (qtd in Nairn 4).

The Scholastic Aptitude Test is a clever attempt to conceal aristocracy and racism behind the cover of science and objectivity. Students, parents, teachers, and administrators who submit themselves to the SAT and believe the lies that ETS tells them are not acting in their own best interest. The SAT is a tool for the privileged to maintain the status quo. Like the razor wire surrounding a gated community, the "reasonable bounds" of the SAT serve to isolate the well-to-do from the rest of society and ensure that the wealthy and powerful are the only ones with access to the wealth and power.

Postscripts

November 1997: Race and class intelligence gaps narrowed

Cornell University News Service. Research News Release: 6 November 1997.

Intelligence test scores among racial and socio-economic segments of American society are not growing ever wider, contrary to arguments in The Bell Curve, but are, in fact, converging, say Cornell University psychologists Wendy M. Williams and Stephen J. Ceci, based on analyses of national data sets of mental test scores. This is contrary to often-reported arguments that Americans are getting dumber because low-IQ parents are outbreeding high-IQ parents (EurekAlert!).

October 1996: The SAT no longer measures aptitude

Complete text of an email reply from an ETS representative.

Subject: re: SAT Inquiry
   Date: Wed, 16 Oct 96 15:38:24 EDT
   From: sat_agent3 <[email protected]>
     To: [email protected] 

Thank you for contacting College Board Online.

SAT is now an acronym for Scholastic Assessment Test.
The name change had been in effect since March 1994.
If we can be of further assistance, please contact us.

Well, you wouldn't know it from reading any ETS publications. They seem ashamed to admit what it is they actually measure with their products (see footnote 9).

February 1996: Steve Jobs: The voice of underrepresented billionaires

Quote from an interview with Steve Jobs, co-founder of Apple Computer.

It's a political problem. The problems are sociopolitical. The problems are unions. You plot the growth of the NEA [National Education Association] and the dropping of SAT scores, and they're inversely proportional. The problems are unions in the schools. The problem is bureaucracy. I'm one of these people who believes the best thing we could ever do is go to the full voucher system.

I have a 17-year-old daughter who went to a private school for a few years before high school. This private school is the best school I've seen in my life. It was judged one of the 100 best schools in America. It was phenomenal. The tuition was $5,500 a year, which is a lot of money for most parents. But the teachers were paid less than public school teachers — so it's not about money at the teacher level. I asked the state treasurer that year what California pays on average to send kids to school, and I believe it was $4,400. While there are not many parents who could come up with $5,500 a year, there are many who could come up with $1,000 a year.

If we gave vouchers to parents for $4,400 a year, schools would be starting right and left. People would get out of college and say, "Let's start a school." You could have a track at Stanford within the MBA program on how to be the businessperson of a school. And that MBA would get together with somebody else, and they'd start schools. And you'd have these young, idealistic people starting schools, working for pennies.

They'd do it because they'd be able to set the curriculum. When you have kids you think, What exactly do I want them to learn? Most of the stuff they study in school is completely useless. But some incredibly valuable things you don't learn until you're older — yet you could learn them when you're younger. And you start to think, What would I do if I set a curriculum for a school?

God, how exciting that could be! But you can't do it today. You'd be crazy to work in a school today. You don't get to do what you want. You don't get to pick your books, your curriculum. You get to teach one narrow specialization. Who would ever want to do that? (Wolf 158)

How exciting indeed! You're right, Steve, I am paid too much. I can't wait to start working for pennies. I feel so deprived teaching only one narrow specialization. Throw some more work my way. Unions? They just get in the way, don't they? All these teachers with more education than you, demanding reasonable salaries and decent working conditions. The temerity! Send a couple of MBAs our way, Steve. We need their leadership and insightful knowledge. How did I ever manage to teach without a Stanford grad manning the whip? Thank you, Steve. Thank you for solving our nation's educational problems. Please excuse my tears of joy.

Endnotes

1. These are the Preliminary Scholastic Aptitude Test/National Merit Scholarship Qualifying Test (PSAT/NMSQT), the Law School Admission Test (LSAT), the Graduate Record Exam (GRE), and the Graduate Management Admission Test (GMAT).
2. Another section, the Test of Standard Written English (TSWE), was recently added to the SAT test battery. Since it is fairly new, I had trouble finding enough material from which to draw conclusions. As I do not know how this exam is used, I will not consider it in this paper.
3. The statistic used in this report for describing the accuracy of a predictor is what Nairn calls the "percentage of perfect prediction." It is the "improvement in accuracy of prediction over prediction by chance" and is calculated by taking the square of the correlation coefficient (r²) and multiplying by 100% (416).
4. It is estimated that only 6.5% of the $17.00 test fee actually goes toward test development. This leaves ETS, a tax-exempt corporation, with a lot of money to burn, some of it on what one researcher called "stupid or crazy" research activities (qtd. in Owen 235).
5. The SAT is offered seven times a year during the regular academic calendar of most schools. Realistically speaking, this is the shortest time possible between two exams, allowing for scoring and entrance deadlines.
6. These figures were for the students who had already enrolled in a test preparation course. Since family income and SAT scores are slightly correlated, and test preparation fees attract students from upper income brackets in disproportionately high numbers, these figures do not represent the across-the-board gains expected. Were all the SAT candidates to enroll in a coaching school, the actual increases would average 25 points (Levine 5).
7. When Owen took the exam, he found the experimental section and filled it out "in a handsome zigzag pattern" (Owen 151).
8. This pattern has been detected in several studies, among them Admissions Testing 1975, Ramist & Arbeiter, Doermann, and Crouse & Trusheim.
9. One reason for this might be that those researchers who don't buy into the structural theories (like the "disintegration" of the American family that we hear so much about) probably agree with John Katzman that "the SAT is bullshit." Another study of the SAT would likely draw more attention to the exam than to itself and thus defeat its purpose. Allan Nairn's original work on the test, before the Nader report, is nearly 20 years old, and yet we hear more talk of standardized testing than ever before. The SAT is still around, and the NTE (which presumably stands for the National Teacher Exam, although its Bulletin of Information never explicitly states this) is fast becoming the next cure for our educational "ills."
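
A note on the arithmetic behind the "percentage of perfect prediction" defined in the endnote on Nairn's statistic: it is simply the square of the correlation coefficient multiplied by 100. The short sketch below works through that calculation. The function name and the sample correlations are hypothetical, chosen only to show how quickly the figure shrinks as r falls; they are not values reported in this paper or in Nairn.

```python
def percentage_of_perfect_prediction(r: float) -> float:
    """Return 100 * r**2, the statistic Nairn calls the
    "percentage of perfect prediction", for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return 100.0 * r * r


if __name__ == "__main__":
    # Hypothetical correlations, for illustration only (not data from the paper).
    for r in (0.2, 0.5, 0.9):
        pct = percentage_of_perfect_prediction(r)
        print(f"r = {r:.1f} gives about {pct:.0f}% of perfect prediction")
```

Because the correlation is squared, a predictor that sounds respectable as a correlation can look far weaker on this scale: a correlation of 0.5, for instance, corresponds to 25%.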

Works cited

"Academic Dragnet: SAT Cheaters Likely to Get Caught." Milwaukee Journal 30 April 1992: A2.
"Selecting College Material." New York Times 4 April 1976: E7.
Advisory Panel on the Scholastic Aptitude Test Score Decline. On Further Examination: Report of the Advisory Panel on the Scholastic Aptitude Test Score Decline. New York: College Entrance Examination Board, 1977.
American College Testing Program. Highlights of the ACT Technical Report. Iowa City: American College Testing Program, 1973.
Astin, Alexander W. "Racial Considerations in Admissions." The Campus and the Racial Crisis. Eds. David C. Nichols & Olive Mills. Washington, DC: American Council on Education, 1970: 113–141.
Astin, Alexander W. Predicting Academic Performance in College. New York: Free Press, 1971.
Breland, Hunter M. & Minsky, Shula. Population Validity and College Entrance Measures. Princeton, NJ: Educational Testing Service, 1978.
Brigham, Carl C. A Study of American Intelligence. Princeton, NJ: Princeton University Press, 1923.
College Entrance Examination Board. Effects of Coaching on Scholastic Aptitude Test Scores. New York: College Entrance Examination Board, 1965.
College Entrance Examination Board. Registration Bulletin, 1991–92, SAT and Achievement Tests. New York: College Entrance Examination Board, 1991.
Crouse, James & Trusheim, Dale. The Case Against the SAT. Chicago: University of Chicago Press, 1988.
Davis, Junius A. & Temp, George. "Is the SAT Biased Against Black Students?" College Board Review no. 81 (Fall 1971): 2–23.
Doermann, Humphrey. "Lack of Money: A Barrier to Higher Education." Barriers to Higher Education. New York: College Entrance Examination Board, 1971: 130–147.
Educational Testing Service. NTE Programs, 1991–92, Bulletin of Information. Princeton, NJ: Educational Testing Service, 1991.
Ford, Susan F. & Campos, Sandy. Summary of Validity Data from the Admissions Testing Program Validity Study Service. New York: College Entrance Examination Board, 1977.
French, John R. P., Jr. "Quantification of Organizational Stress." Managing Organizational Stress: Proceedings of the Executive Study Conference: November 29–30, 1967. Princeton, NJ: Educational Testing Service, 1968: 5–41.
Goldman, Roy D. & Widawski, Melvin H. "An Analysis of Types of Errors in Selection of Minority College Students." Journal of Educational Measurement 13, no. 3 (Fall 1976): 185–200.
Gottfredson, Linda S. & Crouse, James. "Validity Versus Utility of Mental Tests: Example of the SAT." Journal of Vocational Behavior 29 (1986): 363–78.
Hartnett, R. & Feldmesser, D. "College Admissions Testing and the Myth of Selectivity: Unresolved Questions and Needed Research." AAHE Bulletin 32 (March 1980): 3–6.
Humphreys, Lloyd G. "Race and Sex Differences and Their Implication for Educational and Occupation Equality." Educational Theory 26 (1976): 135–146.
Joekel, Ronald G. "Student Activities and Academic Eligibility Requirements." NASSP Bulletin 69, no. 483 (October 1985): 3–9.
Levine, Arthur E. Effects of Coaching on Standardized Admission Examinations: Revised Statistical Analyses of Data Gathered by Boston Regional Office of the Federal Trade Commission, Bureau of Consumer Protection. Washington, DC: US Government Printing Office, 1979.
Nairn, Allan & Associates. The Reign of ETS: The Corporation That Makes Up Minds. Washington, DC: Ralph Nader Institute, 1980.
Nicholson, E. Predictors of Graduation from College. Iowa City: American College Testing Program, 1973.
Owen, David. None of the Above: The Myth of Scholastic Aptitude. Boston: Houghton Mifflin, 1985.
Pike, Lewis W. & Evans, Franklin R. Effects of Special Instruction for Three Kinds of Mathematics Aptitude Items. New York: College Entrance Examination Board, 1972.
Pike, Lewis W. Short-Term Instruction, Testwiseness, and the Scholastic Aptitude Test: A Literature Review with Research Recommendations. New York: College Entrance Examination Board, 1978.
Ramist, Leonard & Arbeiter, Jean. Profiles, College Bound Seniors: 1983. New York: College Entrance Examination Board, 1983.
Seligman, Daniel. "The Rich are Different." Fortune 5 May 1980: 83–86.
Slack, Warner V. & Porter, Douglas. "The Scholastic Aptitude Test: A Critical Appraisal." Harvard Educational Review 50 (1980): 154–75.
Wolf, Gary. "Steve Jobs: The Next Insanely Great Thing." Wired 4.02 (February 1996): pages unknown.