
The Quality Conundrum in Evaluation


One of the central pillars of evaluation is assessing the quality of something, often described as its merit. Along with worth (value) and significance (importance), assessing the merit of a program, product or service is one of the principal areas on which evaluators focus their energy.

However, if you think that would be something that’s relatively simple to do, you would be wrong.

This was brought home clearly in a discussion I took part in as part of a session on quality and evaluation at the recent conference of the American Evaluation Association entitled: Who decides if it’s good? How? Balancing rigor, relevance, and power when measuring program quality. The conversation session was hosted by Madeline Brandt and Kim Leonard from the Oregon Community Foundation, who presented on some of their work in evaluating quality within the school system in that state.

In describing the context of their work in schools, I was struck by some of the situational variables that came into play, such as high staff turnover (and the added burden that staffing shortages place on those who remain) and the decision to operate some schools on a four-day week instead of five as a means of addressing shortfalls in funding. I’ve since learned that Oregon is not alone in adopting the four-day school week; many states have begun experimenting with it to curb costs. The argument is, presumably, that schools can and must do more with less time.

This means that students receive up to one fifth less classroom time each week, yet they are expected to perform at the same level as those attending five days. What does that mean for quality? Like much of evaluation work, it all depends on the context.

Quality in context

The United States has a long history of standardized testing, which was instituted partly as a means of ensuring quality in education. The thinking was that, with such diversity in schools, school types, and populations, there needed to be some means to compare capabilities and achievement across these contexts. A standardized test was presumed to serve this purpose by creating a benchmark (standard) against which student performance could be measured and compared.

While there is a certain logic to this, standardized testing has a series of flaws embedded in its core assumptions about how education works. For starters, it assumes a standard curriculum and model of instruction that is largely one-size-fits-all. Anyone who has been in a classroom knows this is simply not realistic or appropriate. Teachers may teach the same material, but the manner in which it is introduced and engaged with is meant to reflect the state of the classroom — its students, physical space, availability of materials, and place within the curriculum (among others).

If we set aside for a minute the ridiculous assumption that all students are alike in their ability and preparedness to learn each day, and just focus on the classroom itself, we can already see the problem with evaluating quality by looking back at the four-day school week. Four-day weeks mean either that teachers are taking short-cuts in how they introduce subjects and are not teaching all of the material they have, or that they are teaching the same material in a compressed amount of time, giving students less opportunity to ask questions and engage with the content. This means the intervention (i.e., classroom instruction) is not consistent across settings, so how could one expect something like a standardized test to reflect a common attribute? What a quality education means in this context is different than in others.

And that’s just the variable of time. Consider the teachers themselves. If we have high staff turnover, it is likely an indicator that there are some fundamental problems with the job. It may be low pay, poor working conditions, unreasonable demands, insufficient support or recognition, or little opportunity for advancement to name a few. How motivated, supported, or prepared do you think these teachers are?

With all due respect to those teachers, they may be incompetent to facilitate high-quality education in this kind of classroom environment. By incompetent, I mean not being prepared to manage compressed schedules, lack of classroom resources, demands from standardized tests (and parents), high student-teacher ratios, individual student learning needs, plus fitting in the other social activities that teachers participate in around school such as clubs, sports, and the arts. Probably no teachers have the competency for that. Those teachers — at least the ones that don’t quit their job — do what they can with what they have.

Context in Quality

This situation then demands new thinking about what quality means in the context of teaching. Is a high-quality teaching performance one where teachers are better able to adapt, respond to the changes, and manage to simply get through the material without losing their students? It might be.

Exemplary teaching in the context of depleted or scarce resources (time, funding, materials, attention) might look far different from teaching conducted under conditions of plenty. The learning outcomes might be considerably different, too. So the link between the quality of teaching and learning outcomes depends heavily on contextual variables that, if we fail to account for them, will lead us to misattribute causes and effects.

What does this mean for quality? Is it an objective standard or a negotiated, relative one? Can it be both?

This is the conundrum we face when evaluating something like the education system and its outcomes. Are we ‘lowering the bar’ for our students and society by recognizing outstanding effort in the face of unreasonable constraints, or are we showing that quality can exist in even the most challenging of conditions? With one definition we risk accepting something that would be unacceptable under many other conditions; with the other, we risk blaming people for outcomes they can’t possibly achieve.

From the perspective of standardized tests, the entire system is flawed to the point where the measurement is designed to capture outcomes that schools aren’t equipped to generate (even if one assumes that standardized tests measure the ‘right’ things in the ‘right’ way, which is another argument for another day).

Speaking truth to power

This year’s AEA conference theme was speaking truth to power, and this situation provides a strong illustration of that. While evaluators may not be able to resolve this conundrum, what they can do is illuminate the issue through their work. By drawing attention to standards of quality, their application, and the conditions associated with their realization in practice, not just theory, evaluation can point to injustices, unreasonable demands, and areas for improvement.

Rather than assert blame or unfairly label something as good or bad, evaluation, when done with an eye to speaking truth to power, can play a role in fostering quality and promoting the kind of outcomes we desire, not just the ones we get. In this way, perhaps the real measure of quality is the degree to which our evaluations do this. That is a standard that, as a profession, we can live up to and that our clients — students, teachers, parents, and society — deserve.

Image credit:  Lex Sirikiat


Beyond the Big and New: Innovating on Quality

The newest, biggest, shiny thing


Innovation is a term commonly associated with ‘new’ and sparkly products and things, but that quest for the bigger and shinier often obscures the true innovative potential within systems. Rethinking what we mean by innovation, and considering the role that quality plays, might help us determine whether bigger and glossier is just that, rather than necessarily better.

Einstein’s oft-paraphrased line about new thinking and problems goes something like this:

“Problems cannot be solved with the same mind set that created them.”

In complex conditions, this quest for novel thinking is not just ideal; it is necessary. However genuine, the quest for the new idea and the new thing draws heavily upon widely shared human fears of the unknown, and it is framed within a context of Western values. Not all cultures revere the new over what came before it, but in the Western world the ‘new’ has become celebrated, and nowhere more so than through the word innovation.

Innovation: What’s in a word?

Innovation web


A look at some of the terms associated with innovation (above) finds an emphasis on discovery and design, which can imply a positive sense of wonder and control to those with Westernized sentiments. Indeed, a survey of the landscape of actors, services and products seeking to make positive change in the world finds innovation everywhere and an almost obsessive quest for ideas. What is less attended to is providing a space for these ideas to take flight and answer meaningful, not trivial, questions in an impactful way.

Going Digital Strategy by Tom Fishburne


I recently attended an event with Zaid Hassan speaking on Social Labs and his new book on the subject. While there was much interest in the way a social lab engages citizens in generating new ideas I was pleased to hear Hassan emphasize that the energy of a successful lab must be directed at the implementation of ideas into practice over just generating new ideas.

Another key point of discussion was the overall challenge of going deep into something and the costs of doing that. This last point got me thinking about the way we frame innovation and what is privileged in that discussion.

Innovating beyond the new

Sometimes innovation takes place not only in building new products and services, but also in thinking new thoughts and seeing new possibilities.

Thinking new thoughts requires asking new or better questions about what is happening. As for seeing new possibilities, that might mean looking at things long forgotten and at past practices to inform new practice, not just coming up with something novel. Ideas are sexy and fun and generate excitement, yet it is the realization of these ideas that matters more than anything.

The ‘new’ idea might actually be an old one, rethought and re-purposed. The reality for politicians and funders is often confined to equating ‘new’ things with action and work. Yet re-purposing knowledge and products, re-thinking, or simply developing ideas in an evolutionary manner are harder to see and less sexy to sell to donors and voters.

When new means better, not necessarily bigger

Much of the social innovation sector is consumed, even obsessed, with scale. The Stanford Social Innovation Review, the key journal for the burgeoning field, is filled with articles, events and blog posts that emphasize the need for scaling social innovations. Scaling, in nearly all of these contexts, means taking an idea to more places to serve more people. The idea of taking a constructive idea that, when realized, benefits as many people as possible is hard to argue against. However, such a goal rests on a number of assumptions about the intervention, the population of focus, the context, resource allocations, and the political and social acceptability of what is proposed, and these assumptions are often not aligned.

What is bothersome is that there is nowhere near the same concern for quality in these discussions. In public health we often speak of intervention fidelity, intensity, duration, reach, fit and outcome, particularly for initiatives with a social component. In that context, low-quality information poses a real threat in some circumstances, because it can lead someone to make a poorly informed or misleading choice. We don’t seem to see the same care and attention in other areas of social innovation. Sometimes that is because there is no absolute level of quality to judge, or because the benefits of greater quality are imperceptibly small.

But I suspect that this is a case of not asking the question about quality in the first place. Apple under Steve Jobs was famous for creating “insanely great” products and using a specific language to back that up. We don’t talk like that in social innovation and I wonder what would happen if we did.

Would we pay more attention to showing impact than just talking about it?

Would we design more with people than for them?

Would we be bolder in our experiments?

Would we be less quick to use knee-jerk dictums around scale and speak of depth of experience and real change?

Would we put resources into evaluation, sensemaking and knowledge translation so we could adequately share our learning with others?

Would we be less hyperbolic and sexy?

Might we be more relevant to more people, more often and (ironically, perhaps) scale social innovation beyond measure?


Marketoonist Cartoon used under license.


Design Thinking, Design Making

Designing and thinking


Critics of design thinking suggest that it neglects the craft of products, while advocates suggest that it extends itself beyond the traditional constraints of design’s focus on the brief. What separates the two are the implications associated with making something and the question: can we be good design thinkers without being good design makers?

A review of the literature and discussions on design thinking finds a great deal of debate on whether it is a fad, a source of innovation salvation, or whether it is a term that fails to take the practice of design seriously.  While prototyping — and particularly rapid prototyping — is emphasized there is little attention to the manner in which that object is crafted. There are no standards of practice for design thinking and the myriad settings in which it could be applied — everything from business to education to the military to healthcare — indicate that there is unlikely to be a single model that fits. But should there be some form of standards?

While design thinking encourages prototyping there is remarkably little in the literature on the elements of design that focus on the made product. Unlike design where there is at least some sense of what makes a product good or not, there are no standards for what ought to emerge from design thinking. Dieter Rams, among the most vocal critics of the term design thinking, has written 10 principles for good design that can be applied to a designed product. These principles include a focus on innovation, sustainability, aesthetics, and usability.

These principles can be debated, but they at least offer something others can comment on or use as foil for critique. Design thinking lacks the same correlate. Is that a good (or necessary) thing?

Designing for process and outcome

Unlike design itself, design thinking is not tied to a particular product profile; it can be used to create physical products as easily as policies and programs. Design thinking is a process centred largely on complex, ambiguous problems where success has no pre-defined outcome and the journey has no set pathway. It is for this reason that concepts like best practices are inappropriate for design thinking and complex problem solving. Design thinking offers a great deal of conceptual freedom without the pressure to produce a specific outcome that might be prescribed by a design brief.

Yet, design thinking is not design. Certainly many designers draw on design thinking in their work, but there is no requirement to create products using that way of approaching design problems. Likewise, there is little demand for design thinking to produce products that would fit what Dieter Rams suggests are hallmark features of good design. Indeed, we can use design thinking to create many possible futures without a requirement to actually manifest any of them.

Design requires an outcome, and one that can be judged by a client (or customer or user or donor) as satisfactory, exemplary or otherwise. While what is considered ‘good design’ might be debated, there is little debate that if a client does not like what is produced, that product is a failure on some level*. Yet, if design thinking produces a product (a design?), what is the source of its excellence or failure? And does it matter if anything is produced at all?

Herein lies a fundamental dilemma of design and design thinking: how do we know when we are doing good or great work?

Can we have good design thinking and poor design making?

The case of the military

Roger Martin, writing in Design Observer, highlighted how design thinking was being applied to the US Army through the adaptation of its Field Operations Manual. This new version was based on principles of complexity science and systems thinking, which encourage adaptive, responsive unit actions rather than relying solely on top-down directives. It was an innovative step and design thinking helped contribute to the development of this new Field Manual.

In discussing the process of developing the new manual (FM 5-0), Martin writes:

In the end, FM5-0 defines design as “a methodology for applying critical and creative thinking to understand, visualize, and describe complex, ill-structured problems and develop approaches to solve them” (Page 3.1), which is a pretty good definition of design. Ancker and Flynn go on to argue that design “underpins the exercise of battle command within the operations process, guiding the iterative and often cyclic application of understanding, visualizing, and describing” and that it should be “practiced continuously throughout the operations process.” (p. 15-16)

The manual’s development involved design thinking and the process in which it is enacted is based on applying design thinking to field operations. As unseemly as it may be to some, the US Army’s application of design thinking is notable and something that can be learned from. But what is the outcome?

Does a design thinking soldier become better at killing their enemy? Or does their empathy for the situation — their colleagues, opponents and neutral parties — increase their sensitivity to the multiplicities of combat and lead them to treat it as a wicked problem? What is the outcome to which design thinking is contributing, and how can it be evaluated in its myriad consequences, intended or otherwise? In the case of the US Army it might not be so clear.

Craft

One of the terms conspicuously absent from the dialogue on design thinking is craft. In a series of interviews with professionals doing design thinking, it was noted that those trained as designers — makers — often referred to ‘craft’ and ‘materials’ in describing design thinking. Those who were not designers did not**. No assessment can be made about the quality of the design thinking each participant did (that was out of scope for the study), but it is interesting to note how concepts traditionally associated with making — craft, materials, studios — have little parallel discussion in design thinking.

Should they?

One reason to consider craft is that it can be assessed with at least some independence. There is an ability to judge the quality of materials and the product integrity associated with a designed object according to some standards that can be applied somewhat consistently — if imperfectly — from reviewer to reviewer. For programs and policies, this could be done by looking at research evidence or through developmental evaluation of those products. Developmental design, an approach I’ve written about before, could be the means in which evaluation data, rapid prototyping, design excellence and evidence could come together to potentially create more robust design thinking products.

We have few such correlates for assessing design thinking.

The danger in looking at evaluation and design thinking is falling into the trap of devising and applying rigid metrics, best practices and the like to domains of complexity (where design thinking resides), where they tend to fail catastrophically. Yet there is an equal danger that, by not aspiring to a vision of what great design thinking looks like, we produce results that not only fail (which is often a necessary and positive step in innovation if there is learning from it), but are true failures in the sense that they don’t produce excellent products. It is indeed possible to create highly collaborative, design thinking-inspired programs, policies and products that are dull, ineffective and uninspiring.

Where we go and how we get there is a problem for design and design thinking. Applying them both to each other might be a way to create the very products we seek.

* It is interesting to note that Finnish designer Alvar Aalto’s 1933 three-legged children’s stool has been considered both a design flop from a technical standpoint (it’s unstable given its three legs) and one of the biggest commercial successes for Artek, its manufacturer.

** The analysis of the findings of the project is still ongoing. Updates and results will be published on the Design Thinking Foundations project site in the coming months, where this post will be re-published.