"Polarized" Accountability Debate Dominates Education Research Conference

TITLE I REPORT
The Comprehensive Information Source on Title 1 and Compensatory Education

By Julie A. Miller and John Brighton

New Orleans — Standards-based accountability improves student performance by providing incentives and focusing attention and resources on the children and schools that are failing.

Alternatively: high-stakes assessment is a plot perpetrated by the political elite on disadvantaged children, who are being pushed out of school, denied real opportunity and subjected to a test-focused curriculum so narrow as to be meaningless. The gains claimed by proponents are shallow or illusory.

Also online:
• Texas Testing, Chicago Retention Policy Scrutinized at AERA
• Survey Finds States' Test Accommodations Give Little Real Help to LEP Students

Both views were well represented as the intertwined topics of standards, accountability and assessment — particularly crucial to educators in the Title I schools that are ground zero of the accountability trend — thoroughly dominated this year’s American Educational Research Association conference. It was held here April 24-28, attracting a record crowd of over 13,000.

The overpowering dominance of a single topic was unusual for this annual, massive gathering of academics, as was the high ratio of policymakers and policy advocates mixing with the policy wonks, the passionate pitch of many discussions and the large number of sessions that were really forums for a point of view rather than presentations of research findings.

"I’ve been to a lot of AERA meetings, and I’ve never seen assessment so high on the agenda, and I’ve never seen it so polarized," said Anne Lewis, a veteran education writer who moderated one forum. She suggested a sarcastic way to summarize varying perspectives on the standards movement: "We’re on a roll; the dice are loaded; it’s a crapshoot, anyway."

The State of Play

While increased attention to accountability is clearly a national trend, there is great variation in who is being held accountable and on what terms.

"It’s very muddy out there," said Margaret Goertz, a University of Pennsylvania professor and one of the directors of the Center for Policy Research in Education, who outlined the state of play revealed by an ongoing survey of state accountability policies and a more in-depth study of eight states and 23 districts. The confusion is partially due to Title I accountability requirements, she said, as states "trying to respond to the Title I deadlines" are in a "period of transition."

One national trend is greater use of high-stakes tests for students, such as graduation tests. "The focus is supposed to be on holding schools accountable for student performance," Goertz said, but "there’s a shift to holding students accountable."

"Districts seem to be forgotten in this," she said. "Although Title I calls for district level accountability, they will be the last to come on line."

"I found it interesting how states are defining success," Goertz said, noting that some set absolute targets for student performance on tests, while others measured growth or called for narrowing "performance gaps" between schools or groups of students. This last concept is especially prevalent in accountability systems closely focused on Title I, she said.

Five of the eight states studied by CPRE in 1998 and 1999 have what Goertz called "state defined" accountability systems, and districts in those states uniformly based their own accountability and school-improvement efforts on the state requirements. In these states, she found, Title I accountability and the state system were one and the same. However, three states allowed districts to set their own benchmarks, and researchers found that this resulted in "separate accountability policies for Title I schools," which were held accountable under state policies tied to the federal requirements that did not apply to other schools.

Another key difference is that "consequences" for poor performance were far more likely to be imposed in state-defined systems, as districts left to set their own goals were more likely to use performance data to target support efforts than to impose sanctions. Where "formal consequences" existed, Goertz said, they tended to affect students and principals, not teachers.

Accountability Fallout

That does not mean teachers don’t feel pressured. Multiple AERA presentations focused at least in part on the effect of accountability measures on educators. For example, Jane Clark Lindle of the University of Kentucky reported that teachers in schools she studied found it "professionally insulting" to have their work reduced to a level of "acceptable or unacceptable," while being given rewards for good performance under the state program was not as gratifying as a "thank you" from one student.

However, her study and others conclude that the threat of accountability sanctions often has surprisingly little effect on what goes on in classrooms.

James Spillane, a Northwestern University professor who studied implementation of Michigan’s accountability system in the early 1990s, said he found that teachers altered the curriculum to focus more on areas covered by the test, and did some explicit test preparation, but teachers "don’t change their basic pedagogy."

"There is some evidence that the curriculum narrows, but not as much as was feared," said Robert Floden, a Michigan State University professor who is working on the CPRE accountability study. "Teachers do talk a lot about the results of tests. It’s a step forward, but hardly a strong commitment to talk about best practices."

Heinrich Mintrop of the University of Maryland, who presented findings from three researchers’ studies of responses to accountability in Kentucky, Maryland and California, said that schools tend to respond in two ways: adopting "packaged" programs, for instruction in a subject or in areas such as parent involvement; or adopting "structural" changes in areas such as school governance. "Does accountability actually focus these schools?" he asked. "We saw schools adopt a mean of 46 activities in a year; that’s not a clear focus."

Kentucky schools were somewhat more likely to adopt programs that affect the classroom, he said, adding that the presence in many schools there of "distinguished educators" seemed to have "a significant effect."

Mintrop and colleague James Cibulka also contributed a paper concluding that being placed on a state list of Maryland schools eligible for "reconstitution" had little impact overall. The schools on that list showed performance trends similar to those in other schools. Part of the problem is concentration, they concluded; all but two of these schools were in Baltimore or Prince George’s County, a Washington suburb with a high minority population. In Baltimore, in particular, they concluded, so many schools were on the list that it quickly ceased to be a meaningful label. The researchers reported that teachers did not believe the state performance goals were realistic for schools serving disadvantaged students, and this discouraged them from trying to meet the goals. In addition, they wrote, the teachers measure their own success by their ability to help individual students.

Another session focused wholly on Prince George’s County’s effort to avoid state intervention by reconstituting six schools on its own. A group of University of Maryland researchers collected data on all these schools and studied three in depth. They painted a bleak picture. These schools may have been in a performance rut, the researchers said, but they turned into confused institutions with no coherent plans and virtually no additional support from the district. About 75 percent of the new staffs consisted of brand-new teachers, who found themselves focusing on nothing but test preparation and day-to-day survival.

"The teachers talked about where to place the trash can so it isn’t disruptive," said Donna Redmond Jones. "There was not much discussion about how to shape instruction so they might engage students."

Two of the schools involved increased their test scores, the researchers said, which made the initiative look somewhat successful, but even those schools achieved that distinction by focusing all their attention on test preparation, not with a more comprehensive plan.

On The Attack

That finding reflects one of the strongest arguments made by critics of high-stakes accountability in general, who contend that accountability initiatives have a profound negative effect on low performing schools when the stakes are especially high and public scrutiny especially intense.

"The kind of schooling you get with test-driven reform is not going to prepare kids for the real world even if it does produce some test score gains," said Monty Neill, executive director of the Center for Fair and Open Testing (FairTest).

"There’s been a displacement of the regular curriculum," said Linda McNeil, a Rice University researcher who has been a vocal critic of the Texas accountability system, at one of four sessions she participated in. "There are kids who read no prose all year except short excerpts."

"Schools are giving them a phony curriculum to get the scores up," she said. "Even if you pass, it’s a ticket to nowhere."

In addition, McNeil charged, in places like Texas and Chicago, "there are consultants telling educators how to focus on kids who are on the bubble. They say you’re never going to make it if you focus on kids at the bottom, you have to forget about them."

(The Texas accountability system and Chicago’s tough testing-and-retention policy came in for particularly strong scrutiny at AERA. See our sidebar.)

Research papers focusing on the adverse consequences of high-stakes assessment began showing up at AERA a few years ago. But this year, the opponents of high-stakes testing came in force to the education establishment’s premier gathering to stage attacks on the conventional wisdom supporting standards-based reform. Many of these presenters spoke with fervor about what they view as an attack on disadvantaged students and their teachers.

"I’ve heard in corporate offices and Chamber of Commerce lunches, ‘If we didn’t have so many black teachers, we wouldn’t have to do this, if we didn’t have these special interest groups, we wouldn’t have to go to such a centralized system. If it were not for these kids, we could teach the regular curriculum,’" McNeil said.

"The data are being reworked by people with a great deal of prominence in organizations like this, who are receiving a great deal of money to make this look as good as possible while students’ lives are being destroyed," said George Schmidt, a former Chicago teacher who was forced to resign after a publication he edits published some questions from the city’s standardized tests.

Schmidt spoke at a session where FairTest spotlighted local efforts to fight high-stakes tests.

Since her school district in Upper Arlington, Ohio, began participating in high-stakes tests, her children’s progressive school has "lost true reading assessment, math connected to real applications, multi-age classrooms, thematic curricula, field trips, district-developed holistic assessments," said parent activist Mary O’Brien, as well as "inclusion, tolerance, a sense of community and trust."

The Harvard Civil Rights Project organized a session featuring research papers on high-stakes testing that the organization plans to publish in book form, similar to a collection of papers on Title I that the group published in an effort to influence the ongoing reauthorization process.

Economist Henry Levin argued that standardized test scores bear little correlation with later job performance. Neill contended that if high-stakes testing improved performance, states with graduation tests would show greater improvement in the National Assessment of Educational Progress. McNeil contributed a paper on Texas.

"I am troubled by the one-sided nature of this discussion," said William Taylor, co-chair of the Citizen Commission on Civil Rights, rising at the Harvard session to speak as "a supporter of standards-based reform" and illustrating the division among liberals on these issues.

"What is the substitute for standards-based reform?" asked Taylor, whose organization pushes for stronger enforcement of Title I accountability on the theory that holding schools and districts responsible for student performance will force attention to the needs of disadvantaged children.

Assessment Abuse

Not all the criticism came from liberal activists and their academic allies. A number of experts on testing complained vociferously that the accountability trend has caused tests, especially norm-referenced, multiple-choice exams, to be misused in dangerous ways.

States and school districts frequently "engage in practices that are plainly inconsistent with the well publicized principles of the testing profession," said Jay P. Heubert, a professor at Columbia University’s Teachers College, who directed a study on testing for the National Research Council that produced the report "High Stakes."

Chicago officials said they "know the Iowa [Test of Basic Skills] is not valid for promotion purposes," Heubert said, but changing tests "would look bad. We can be credible in the eyes of the people who know something or in the eyes of the readers of the Chicago Tribune."

Speaker after speaker — including representatives of test publishers — pointed out that norm-referenced tests are not designed for accountability purposes. They are designed to separate students into performance strata, and questions that too many students can answer get dumped. They are not comparable to each other, and individuals’ scores can vary according to which specific version of a test they get. They test predominantly basic skills, not higher-order skills. Results tell more about students’ socioeconomic status than about a school’s effectiveness. A nationally distributed test cannot possibly be truly aligned with state or local standards and curricula.

"Test publishers make all the right noises about looking at other outcomes, but they also say you can use standardized tests to judge the effectiveness of educational programs," said W. James Popham, a professor emeritus at UCLA and an expert on testing who has in recent years become a high-profile critic of high-stakes assessment. Standardized tests are not suitable for that purpose, he argued, and "are not designed to detect the effects of instruction, even first-rate instruction."

Popham lent his expertise and sarcastic wit to six AERA sessions, including one of several sessions that focused on the likelihood that educators will try to cheat on tests and whether it is possible to prevent it.

Exposing students to items from a test or very similar questions is cheating in Popham’s book, but he argued that it is to be expected when high stakes are attached to tests and teachers perceive it as the only way to raise scores. The real problem, he said, is that test developers rarely offer teachers a clear enough explanation of the body of knowledge being sampled by their tests so that they can boost scores by teaching appropriate concepts. Showing a testmaker’s description of the skills and objectives being measured on a test, he said: "If I were a teacher, I wouldn’t know what to do with that. It’s a no-win situation."

At more than one session, Daniel Koretz, a senior social scientist for the RAND Corp. and a professor of research, measurement, and evaluation at Boston College, pointed out that there will always be performance variances on tests, and greater variance on difficult tests. "The belief that we have a lot of low achievers and it’s because of diversity is wrong," he said, offering graphs that plotted student scores on an international mathematics examination, showing that the range of scores in the U.S. was hardly remarkable, and actually narrower than in some other countries.

"I’m not arguing for giving up on kids, but I’d argue that the way to help is to know where they are," Koretz said. "If standards and assessment don’t acknowledge this, reforms will be harmful to the kids on the bottom."

Common Ground

Even supporters of standards-based accountability agree with some of the criticisms leveled by testing opponents. At many sessions, particularly those set up as forums for multiple points of view, experts expressed concern about ways they think accountability efforts, especially as regards high-stakes tests, are being misused and misdirected.

"We’ve seen an expansion of the policy [of standards-based accountability] at the same time we are recognizing its flaws," said Eva L. Baker, director of UCLA’s Center for the Study of Evaluation, warning against "retreating into the fantasy that this is all going to work out."

"It seems like simple stuff, but we’re not following our own rules and we don’t fight very hard for them," said Marshall S. Smith, a professor at Stanford University who fought to implement standards mandates during more than seven years as a top education official in the Clinton administration. "We don’t know how to do student accountability well."

In particular, critics and proponents of standards-based accountability agree that:

• Tests, especially norm-referenced tests, are being used for purposes they were not designed for, raising troubling questions of validity and fairness.

• Standards should be set by experts, but are all too often set by politicians with non-academic agendas.

"We have negotiated our standards," Smith said. "We have not tried to develop them in a reasonable way."

• There is inherent tension between the principle of "high standards" and the principle of "standards for all students," and it is difficult to develop a single standards-based accountability system that can meet often conflicting sets of demands.

"We have created a paradox," Baker said at the forum Lewis moderated. "We want broad coverage [of content] but at the same time, we want focused growth. I don’t think we can have both."

In addition, she and others noted, setting a single standard effectively means "setting the standard highest for students who are struggling the most and teachers who are struggling the most."

"When the goal is to make 25 percent improvement over the next three years, that’s unrealistic and it creates a lot of anger," said Allan Odden, a professor at the University of Wisconsin—Madison. In some cities, he said, the stated standards are so high "it would result in holding back the vast majority of kids, and that kind of system is not going to be viable."

• Accountability systems should include more than one measure, and more than one kind of measure. In particular, it isn’t fair or valid to base major decisions, such as whether or not a student can graduate, on a single test score.

"Just about everybody in the world except a few superintendents are against single measures," Smith said in commenting on the Chicago policy. "I’m surprised nobody has sued."

• If students are held accountable for their performance on tests, they must be given the "opportunity to learn" the material, including access to adequate facilities and well-trained teachers.

"In a state system where states control resources, responsibility is a two-way street," said Jacob Adams of Vanderbilt University, arguing that many states have held students and educators responsible for performance without providing any help.

From Consensus to Practice

However, while experts of various philosophical stripes may agree on principles for the valid use of tests, that does not mean they will be quickly adopted.

"The problem," said Sharon Lewis, director of research for the Council of the Great City Schools, at one testing forum, "is that decisions on how assessments are used are not being made by educators."

"States are not accountable to professional testing standards," said Richard Duran of the University of California — Santa Barbara. "Their accountability is to politics and satisfying their constituencies," he said, and some have been "irresponsible."

At the district level, some LEAs have created sophisticated systems of their own, but most will follow the lead of their states.

"These folks are not typical of what is happening in California, unfortunately," said Fred Tempes, a former state official who now works for the WestEd regional laboratory in Oakland, at a session where representatives of three districts outlined how they responded to the challenge of creating accountability systems while the state policy was in transition.

"I’m encouraged by our colleagues who have taken the high road," he said, but it will become increasingly difficult to maintain high local standards "as the specter of state sanctions are held out two years down the road."

At another session on district assessment practices, educators from several states expressed similar fears that multiple measures will be pushed out by concerns about too much testing and fear that additional standards could cause more schools to be targeted for state sanctions. "Unless we can empower data consumers with knowledge and skills they do not yet possess and demonstrate the predictive validity of our districtwide assessments, the district testing program will likely disappear," wrote Alisabeth H. Hohn and William R. Veitch of the Colorado Springs, Colo. district.

Educators from Long Beach, Calif., offered a very specific example of the pressure exerted by state assessments. The district developed graduated frameworks that allowed a student who scored very well on the state-mandated norm-referenced test to be considered "proficient" despite a too-low score on a local test in the same subject, and vice versa. As educators learned more about the state’s plans, especially plans to sanction "low performing" schools, "the resolve of the district to stick to the high road in setting standards wavered," they wrote. The district decided to deem "proficient" any student who scored above the 50th percentile on the state test, regardless of how poorly they did on the local exam, while still allowing a good score on the latter test to "compensate" for a poor score on the Stanford 9.

Standards-based accountability can meet stiff resistance at the local level for other reasons, said Gerry Postlewait, superintendent of schools in Horry County, S.C. Moving "from the bell curve to a district with expectations of achievement by all" requires changes in educators’ "belief system," the structure of schooling and the mission of the schools, she said.

"When you talk about spending more money on poor children whose families have been disenfranchised, you find out very quickly where the community’s values are," Postlewait said. "When you try to go to a system that brings all kids to a certain level, you find out that the assumptions aren’t where you thought they were."

Federal Mandates?

After civil rights groups began arguing that high-stakes testing has a discriminatory impact on minority students, the U.S. Education Department’s Office for Civil Rights drafted guidelines on appropriate assessment practices. The guidelines, which were controversial, largely reflect the consensus of experts on what represents "proper" uses of tests. In particular, the guidelines call for the use of valid, multiple measures that are "educationally justified."

"The ultimate question is ‘Does it make educational sense?’" said Arthur L. Coleman, a former deputy assistant secretary for civil rights who spoke at a session on the guidelines. "If you have a ‘yes’ answer to that question, you are likely to be on the safe side of the line from a legal standpoint."

OCR has investigated the Texas system and is currently investigating complaints of alleged discrimination in connection with high-stakes assessments in Nevada and North Carolina, as well as Chicago’s student-retention policies.

However, others argued that the OCR guidelines are unenforceable, citing the outcome of legal challenges to the Texas tests. OCR ruled that they were not discriminatory, and the state pledged to ensure that all students received the necessary instruction to pass the test.

Critics then took the issue to court, where U.S. District Judge Edward C. Prado ruled in January that the assessment system "is not perfect, but the court cannot say that it is unconstitutional." He wrote that while the test adversely affects minority students, the gap is closing and the plaintiffs did not show that the adverse impact could be avoided while still meeting "the state’s articulated legitimate goals."

"The idea of ‘educational necessity’ is not defined in the law," said Heubert, an expert on the legal aspects of student testing, who also contributed to the Civil Rights Project’s session. "What happens is that courts typically defer to the educators."

Coleman said OCR considered giving the force of regulation to "Standards for Educational and Psychological Testing," guidelines published by the AERA and the American Psychological Association that are considered the "gold standard" for the testing profession. But officials were uncomfortable about interfering with the "educational judgement" of schools and districts, Coleman said.

"There aren’t obvious answers to this," he said.

In some respects, ED’s enforcement of Title I accountability requirements may prove to be the most promising avenue to prod states toward assessment systems that experts view as fair and valid. The Title I guidelines call for multiple-measure systems, help for low performing schools and tests that experts deem valid for the purposes for which states are using them. It remains to be seen how strictly those rules will be enforced.

Much of the research discussed at the AERA conference is of potential interest to our readers, and our reporting will continue in the next issue.

 

 


Copyright © 1998-2002 Small Axe Educational Communications Inc.