One of the central pillars of evaluation is assessing the quality of something, often described as its merit. Along with worth (value) and significance (importance), assessing the merit of a program, product, or service is one of the principal areas on which evaluators focus their energy.
However, if you think that would be something that’s relatively simple to do, you would be wrong.
This was brought home clearly in a session on quality and evaluation that I took part in at the recent conference of the American Evaluation Association, entitled Who decides if it’s good? How? Balancing rigor, relevance, and power when measuring program quality. The conversation session was hosted by Madeline Brandt and Kim Leonard from the Oregon Community Foundation, who presented on some of their work in evaluating quality within the school system in that state.
As they described the context of their work in schools, I was struck by some of the situational variables that came into play, such as high staff turnover (and a resulting shortage among the staff who remain) and the decision to operate some schools on a four-day week instead of five as a means of addressing shortfalls in funding. I’ve since learned that Oregon is not alone in adopting the four-day school week; many states have begun experimenting with it to curb costs. The argument is, presumably, that schools can and must do more with less time.
This means that students are receiving up to one-fifth less classroom time each week, yet are expected to perform at the same level as those with five days. What does that mean for quality? Like much of evaluation work, it all depends on the context.
Quality in context
The United States has a long history of standardized testing, which was instituted partly as a means of ensuring quality in education. The thinking was that, with such diversity in schools, school types, and populations, there needed to be some means to compare capabilities and achievement across these contexts. A standardized test was presumed to serve as a means of assessing these attributes by creating a benchmark (standard) against which student performance could be measured and compared.
While there is a certain logic to this, standardized testing has a series of flaws embedded in its core assumptions about how education works. For starters, it assumes a standard curriculum and model of instruction that is largely one-size-fits-all. Anyone who has been in a classroom knows this is simply not realistic or appropriate. Teachers may teach the same material, but the manner in which it is introduced and engaged with is meant to reflect the state of the classroom: its students, physical space, availability of materials, and place within the curriculum (among others).
If we put aside for a minute the ridiculous assumption that all students are alike in their ability and preparedness to learn each day and just focus on the classroom itself, we already see the problem with evaluating quality when we look back at the four-day school week. Four-day weeks mean either that teachers are taking shortcuts in how they introduce subjects and are not teaching all of the material they have, or that they are teaching the same material in a compressed amount of time, giving students less opportunity to ask questions and engage with the content. This means the intervention (i.e., classroom instruction) is not consistent across settings, so how could one expect things like standardized tests to reflect a common attribute? What quality education means in this context is different from what it means in others.
And that’s just the variable of time. Consider the teachers themselves. If we have high staff turnover, it is likely an indicator that there are some fundamental problems with the job. It may be low pay, poor working conditions, unreasonable demands, insufficient support or recognition, or little opportunity for advancement, to name a few. How motivated, supported, or prepared do you think these teachers are?
With all due respect to those teachers, they may be incompetent to facilitate high-quality education in this kind of classroom environment. By incompetent, I mean not being prepared to manage compressed schedules, lack of classroom resources, demands from standardized tests (and parents), high student-teacher ratios, individual student learning needs, plus fitting in the other social activities that teachers participate in around school such as clubs, sports, and the arts. Probably no teachers have the competency for that. Those teachers — at least the ones that don’t quit their job — do what they can with what they have.
Context in quality
This situation then demands new thinking about what quality means in the context of teaching. Is a high-quality teaching performance one where teachers are better able to adapt, respond to the changes, and manage to simply get through the material without losing their students? It might be.
Exemplary teaching in the context of depleted or scarce resources (time, funding, materials, attention) might look far different than if conducted under conditions of plenty. The learning outcomes might be considerably different, too. So the link between the quality of teaching and learning outcomes is highly dependent on many contextual variables, and if we fail to account for them, we will misattribute causes and effects.
What does this mean for quality? Is it an objective standard or a negotiated, relative one? Can it be both?
This is the conundrum that we face when evaluating something like the education system and its outcomes. Are we ‘lowering the bar’ for our students and society by recognizing outstanding effort in the face of unreasonable constraints, or showing that quality can exist in even the most challenging of conditions? With one definition, we risk accepting something that under many conditions is unacceptable; with the other, we risk blaming others for outcomes they can’t possibly achieve.
From the perspective of standardized tests, the entire system is flawed to the point where the measurement is designed to capture outcomes that schools aren’t equipped to generate (even if one assumes that standardized tests measure the ‘right’ things in the ‘right’ way, which is another argument for another day).
Speaking truth to power
This year’s AEA conference theme was speaking truth to power, and this situation provides a strong illustration of that. While evaluators may not be able to resolve this conundrum, what they can do is illuminate the issue through their work. By drawing attention to the standards of quality, their application, and the conditions that are associated with their realization in practice, not just theory, evaluation can serve to point to areas where there are injustices, unreasonable demands, and areas for improvement.
Rather than assert blame or unfairly label something as good or bad, evaluation, when done with an eye to speaking truth to power, can play a role in fostering quality and promoting the kind of outcomes we desire, not just the ones we get. In this way, perhaps the real measure of quality is the degree to which our evaluations do this. That is a standard that, as a profession, we can live up to and that our clients — students, teachers, parents, and society — deserve.
Image credit: Lex Sirikiat