Category: evaluation

evaluation, innovation

The Mindset for Design-Driven Evaluation

What distinguishes design-driven evaluation from other types of utilization-focused evaluation or innovation development is that it views the evaluative act as part of a service offering. It’s not a shift in method, but in mindset.

Evaluation is the innovator’s secret advantage. Any sustained attempt to innovate is driven by good data and systems to make sense of this data. Some systems are better than others and sometimes the data collected is not particularly great, but look at any organization that consistently develops new products and services that are useful and attractive and you’ll see some commitment to evaluation.

Innovation involves producing something that adds new value and evaluation is the means of assessing what that value is. What design-driven evaluation does is take that process one step further and view the process of data collection, sensemaking, decision-making, and action as part of the value chain of a service or product line.

It’s not a new way of evaluating things, it’s a new mindset for how we understand the utility of evaluation and its role in supporting sustained innovation and its culture within an organization. It does this by viewing evaluation as a product on its own and as a service to the organization.

In both cases, the way we approach this kind of evaluation is the way we would approach designing a product and a service. It’s both.

Evaluation as a product

What does an evaluation of something produce? What is the product?

The simple, traditional answer is that an evaluation generates material for presentations or reports based on the merit, worth, and significance of what is being evaluated. A utilization-focused or developmental evaluation might suggest that the product is data that can be used to make decisions and learn.

Design-driven evaluation can do both, but extends our understanding of what the product is. The evaluation itself — the process of paying attention to what is happening in the development and use of a product or service, selecting what is most useful and meaningful, and collecting data on those activities and outcomes — has distinctive value on its own.

Viewed as a product, an evaluation can serve as a part of the innovation itself. Consider the tools that we use to generate many of our innovations, from Sharpie markers and Post-it Notes to our whiteboards and wheely chairs to Figma or Adobe Illustrator software to the MacBook Pro or HP Envy PC that we use to type on. The best tools are designed to serve the creative process. There are many markers, computers, software packages, and platforms, but the ones we choose are the ones that serve a purpose well for what we need and what we enjoy (and that includes factoring in constraints) — they are well-designed. Why should an evaluation — a tool in the service of innovation — be any different?

Just as the reams of sticky notes that we generate with ideas serve as a product of the process of designing something new (innovating), so can an evaluation serve this same function.

These products are not just functional, they are stable and often have a positive emotional appeal (e.g. they look good, feel good, help you to feel good, etc.). Exceptional products do this while being sustainable, accessible, (usually) affordable, and culturally and environmentally sensitive to the environments in which they are deployed. The best products combine it all.

Evaluations can do this. A design-driven evaluation generates something that is not only useful and used, but attractive. It invites conversation and use, and showcases what is done in the service of creating an innovation by design.

The principles of good product design — designing for use, attraction, interaction, and satisfaction — are applied to an evaluation using this approach. This means selecting methods and tools that fit this function and aesthetic (and don’t divorce the two). It means treating the evaluation design and what it generates (e.g., data) as a product.

Evaluation as a service

The other role of a design-driven evaluation is to treat it as a service and thus design it as such.

Service design is a distinct area of practice within the field of design that focuses on creating optimal experiences through service.

Designers Marc Stickdorn and Jakob Schneider suggest that service design should be guided by five basic principles:

  1. User-centered, through understanding the user by doing qualitative research;
  2. Co-creative, by involving all relevant stakeholders in the design process;
  3. Sequencing, by partitioning a complex service into separate processes;
  4. Evidencing, by visualizing service experiences and making them tangible;
  5. Holistic, by considering touchpoints in a network of interactions and users.

If we consider these principles in the scope of an evaluation, what we’ll see is something very different than just a report or presentation. This approach to designing evaluation as a service means making a more concerted effort to identify present and potential future uses of an evaluation, understanding the user at the outset, and designing for their needs, abilities, and preferences.

It also involves considering how evaluation can integrate into or complement existing service offerings. For innovators, it positions evaluation as a means of making innovation happen as part of the process and making that process better and more useful.

This goes beyond A/B testing or other forms of ‘testing’ innovations to positioning evaluation as a service to those who are innovating. In developmental evaluations, this means designing evaluation activities — from the data collection through to the synthesis, sensemaking, application, and re-design efforts of a program — as a service to the innovation itself.

Designing a mindset

Design-driven evaluation requires a mindset of an innovator and designer with the discipline of an evaluator. It is a way of approaching evaluation differently and goes beyond simple use to true service. This is not an approach that is needed for every evaluation either. But if you want to generate better use of evaluation results, contribute to better innovations and decision making, generate real learning (not learning artifacts), then designing a mindset for evaluation that views it alongside the care and attention that goes into all the other products and services we engage in matters a great deal.

If we want better, more useful evaluations (and their designs) we need to think and act like designers.

Photo credits: Heading: Cameron Norman
Second Photo by Mark Rabe on Unsplash
Third Photo by Carli Jeen on Unsplash

evaluation, innovation

Developmental Evaluation is not for you

Developmental evaluation is a powerful tool to support innovation, engage communities, and foster deep learning. While it might be growing in popularity, increasingly in demand, and a key difference-maker for social and technological innovators, it might also not be for you.

Developmental evaluation (DE) is an approach to evaluation that is designed to support innovation and gather data to make sense of things in a complex environment. It is a powerful tool full of promise and many traps and has become increasingly popular in the social, finance, and health sectors. Maybe it’s for you. Maybe it’s not.

Chances are, it’s not.

If you are looking to force an outcome, DE is not for you.

DE might be for you if you are confused, nervous, a little excited, and curious about what it is that you’re doing, how you can make it more sustainable and useful, and interested in working with complexity, not fighting against it.

If you are not interested in learning — really, truly learning — skip the DE and try something else. DE is only good for those individuals and organizations that are serious about learning. This might mean struggling with uncertainty, honestly reflecting on past actions (including all the false-starts, non-starts, rough starts, and bad finishes), envisioning the future, and challenging what you believe (and sometimes affirming beliefs, too). A DE prompts you to do all of this and if that’s not your thing, don’t get into DE.

If you know the end of the story with your innovation before you begin, DE is not for you either.

If the status quo is your thing, DE is not.

Therapists see this all the time. They encounter people who say: “I want to change” and then witness them fight, struggle, deny, and abandon efforts to do the work to make the change happen, because it’s far easier to ask for change than it is to do it. This is OK — this struggle is part of being human. But if you are unwilling to do the work, struggle with it, and truly learn from your efforts, DE is not for you.

If you have the best idea in the world and a plan to change the world with it, DE is probably not for you. DE might get you to re-think parts of your plan or the whole thing. It’s going to make your expected outcomes less expected and gum up the nice, simple, but wrong picture.

If practice makes perfect, DE is not for you. If practice is more of a vocation like medicine or doing meditation — a way of doing the work — then that’s a different story. For a DE practitioner, it’s not about becoming great at something, an improved version of yourself or your organization, or the best in the world. It’s about learning, growing and evolving (see above).

If you think DE is going to make you better as a person: nope. Just as a 30-year-old is not six times better than a 5-year-old, someone who does DE is no better than they were beforehand. But they may have learned a lot and evolved as an innovator.

If you want something fast, efficient, outcomes-driven, and evidence-based from top to bottom, don’t even think about DE.

Want to be trendy? Do DE. It’s what the cool kids are doing in evaluation and if being cool is important to you – definitely get into DE. (Unless you don’t like putting in a lot of work to become proficient in areas of complexity, social and organizational behaviour, many different aspects of evaluation, and even design.)

Lazy? Uncommitted? Allergic to creativity? Undisciplined? Low energy? Have a low tolerance for ambiguity? Then DE is not for you.

If you’re looking for a direct plan, a clear pathway to improvement and betterment, and quantifiable outcomes, DE is not for you.

If innovation has a specific look, feel, ROI, and outcome then you need tools and strategies that will assess all of that – which means you should not engage in DE. DE will only disappoint you. You will be exposed to many things, including possibilities you’d never considered, but they very likely won’t fit your model because, if what you are doing is truly innovative, it’s never really been tried before.

If you are changing the game while playing it, the rules you started with won’t apply to what happens when you finish. You can’t start playing chess and wind up playing volleyball and still seek to measure the movements of the Rook, Bishop, or Queen. If you’re not really into game-changing – the kind that’s not about hyperbole and catchphrases – DE is not for you.

If you are short on time, commitment, and resources to bring people together, take time to pause and truly reflect, sit with uncertainty, delight in surprise, exceed your expectations, and sometimes end up disappointed, DE isn’t for you.

If strategy is a plan that you stick to no matter what, then DE is not for you.

If you’ve embraced failure as a mantra or are afraid of “failure” (which to you means not doing everything you set out to do in the manner you set out to do it), then DE is certainly not for you. The only way you will fail at DE is the failure to devote attention to learning.

If you view relationships as transactions, rather than as opportunities to grow and transform, DE is most certainly not for you.

Innovation is about discovery. If you wish to work in ways that are aligned with natural development — the kind we see in our children, pets, gardens, communities, and ourselves – you might find yourself discovering a lot and DE can be a big help. If ‘discovery’ is a code-word for re-packaging what you already have or doing what you’ve always done (be honest with yourself), then DE is a big waste of time.

Can’t handle surprises? Run away from DE and use something else.

If you’re looking to just check off a box because you committed to doing that in your corporate plan, then make your life easy and give DE a pass. If you see organizations as living beings and wish to create value for others in a manner that is consistent with this perspective, then DE could be a powerful ally in that process.

DE is becoming popular, but it is most certainly not for everyone. Maybe not for you, either. Now, you have lots of reasons to show why you should try something else.

If you still think DE is for you after all this, let’s connect — because DE seems to suit us at Cense just fine and we can help it to suit you, too.

Photo by Loren Gu on Unsplash

design thinking, evaluation

Utility in Design-Driven Evaluation

Design-driven evaluation focuses attention on what organizations use in their decision-making and learning to support the creation of their products, delivery of services, and the overall quality of their work. While ultimately practical, this focus on utility also introduces some difficult conversations about what organizations truly value, not just what they say.

To say something is design-driven is to imply that its emphasis is on the process of creating something of value for someone. In this case, the value is through evaluation and what it means. Value in this case is framed as utility, asking: what is it for (and for whom, and in what context)?

These are the questions that designers ask of any service, product, or structure that they seek to apply their attention to. Designers might ask: “what are you hiring [the thing being designed] to do?” These are simple questions that may provoke much discussion and can transform the way we approach the creation and maintenance of something moving forward.

A different take is to ask people to describe what they already do (and what they want) to frame the discussion of how to approach the design. This can lead us into a trap of the present moment. It keeps people framing their work in the context of language that supports their present identity and the conceptions (and misconceptions) associated with it, not necessarily where they want to go.

Evidence-based?

You show me an evidence-based human service organization and I’ll show you one that is lying to you (and maybe to themselves), is in deep denial, or is focused on a narrow, established scope of practice. Very few fit the last category (but they do exist), which leaves us with the unsettling reality that we are likely dealing in some level of bullshit — a deliberate misrepresentation of the facts to impress others and themselves (see Harry Frankfurt’s work on the subject).

This is not to say that these organizations don’t use evidence at all or care about its application, but that there are so many areas within that scope of work that are not based on solid or even superficial evidence that to describe something as ‘evidence-based’ is an over-reach at best, a lie at worst. The reasons for this deception are many, but among them is simply that there is not enough evidence available to inform many aspects of the work. It’s impossible to truly be evidence-based when dealing with areas of complexity, social innovation, or complex innovation.

Consider this: an organization seeks to develop an evidence-based program and spends weeks or months gathering and reviewing research. They may even collect some data on their own and synthesize the findings together to inform a recommendation based on evidence. At play is the evidence for the program, the evidence to support the design of the program (converting evidence developed in one context into actionable structures, procedures, and plans in another), the evidence to support the implementation of the designed program in a new context, and the evidence generated through evaluation of the design, delivery, and outcomes associated with the program.

That is a lot of evidence to consider. I’ve never seen a program come even close to having evidence that even reasonably fits all of these contexts, let alone strong evidence. Why? Because there are so many variables at play in the program context (e.g., design, delivery, fidelity, etc.) and the process of evidence generation itself (e.g., design, data availability, analysis, etc.).

Utility means looking at what people actually use, not just what they say they use. To illustrate, I worked with an organization that proudly claimed that they were both evidence-based and a learning organization. When I asked what evidence they used and how they learned, I was told with much more modest confidence that staff typically read “one or two” research articles per month (and that was it — and this was in a highly volatile, transnational, multidisciplinary field of practice). They also said that they engaged in reflective practice by writing up case reports, which (if completed at all) usually took up to four months to prepare after a visit to a particular site or event due to the other activities they had to do as part of the day-to-day work of the organization.

This organization did their best, but that best wasn’t anywhere near enough if they truly wished to be a learning organization or evidence-based. Yet, because they insisted they were these things, they also insisted on an evaluation design that fit that narrative. They had not designed their organization to be evidence-based or a real learning organization. A design-driven approach would have developed things that suited that context and perhaps pushed them a little further toward being the organization they saw themselves to be.

Another Way: Fit for Purpose

Why bring up evidence-based decision-making? The reason has much to do with defining what makes design-driven evaluation different from other forms of evaluation (even research). Design-driven evaluation is about generating evidence for use within specific contexts. It involves using design principles and strategies to uncover and understand those contexts ahead of the evaluation being designed. It means designing not only the evaluation itself, but the manner in which it produces products and the means associated with decision-making based on those products.

It is about being fit-for-purpose.

Most published forms of evidence are developed independent of the context in which they are to be used. That’s the traditional model for science. We learn things in one setting (maybe a lab) and then move them out into other settings (e.g., a clinic), do more trials and then eventually develop a body of evidence that is used to generalize to other settings. This works reasonably well for problems and issues that are simple or complicated in their structure.

As a situation involves ever greater complexity, that ability to translate from one setting or context to another breaks down. This complexity might also influence what the purpose and expected outcomes are of a program within that context. For example, a community-based health promotion program may have a theory, even a program logic model, and goals, but it will need to consider neighbourhood design, differences in resident needs, local history, and the availability of other programs and resources. The purpose in one neighbourhood might be to provide a backstop to a local organization that is having financial problems, whereas in another neighbourhood it might be to provide a vehicle for local leaders to take action where there are no other alternatives.

Not Leaving Things to Chance

Developing a fit-for-purpose program is not something that should be left to chance, because chances are it won’t happen. Good design improves the use, usability, and overall translation of knowledge. A look at how real evidence-based practice emerges comes down to ways in which the design — intended or not — of the knowledge, the exchange opportunities, relationships, and systems come together.

Design-driven evaluation seeks to remedy one of the fundamental problems within the evidence translation process: the poor fit of the evaluation (data, process, focus) for the implementation of its findings. It’s about not leaving it to chance with the hope that maybe someone will figure out how to use things, overcome poor usability, persist through confusion, and still make good use of an evaluation.

Is this the system we want? Or could we do better? My answer is ‘no’ to the first and ‘yes’ to the second. Design-driven evaluations can be the means to get us to that ‘yes’ because as things get more complicated and complex, and the need for better data, improved decisions, and decisive action rises, we need to make sure we don’t leave doing better to chance.

Photo by Jeff Sheldon on Unsplash

If you’re interested in doing better through design-driven evaluation, contact Cense.

design thinking, evaluation

Design-driven Evaluation

Fun Translates to Impact

A greater push for inclusion of evaluation data to make decisions and support innovation is not generating value if there is little usefulness of the evaluations in the first place. A design-driven approach to evaluation is the means to transform utilization into both present and future utility.

I admit to being puzzled the first time I heard the term utilization-focused evaluation. What good is an evaluation if it isn’t utilized, I thought. Why do an evaluation in the first place if not to have it inform some decisions, even if just to assess how past decisions turned out? Experience has taught me that this happens more often than I ever imagined and evaluation can be simply an exercise in ‘faux’ accountability; a checking off of a box to say that something was done.

This is why utilization-focused evaluation (U-FE) is another invaluable contribution to the field of practice by Michael Quinn Patton.

U-FE is an approach to evaluation, not a method. Its central focus is engaging the intended users in the development of the evaluation and ensuring that users are involved in decision-making about the evaluation as it moves forward. It is based on the idea (and research) that an evaluation is far more likely to be used if grounded in the expressed desires of the users and if those users are involved in the evaluation process throughout.

This approach generates a participatory activity chain that can be adapted for different purposes as we’ve seen in different forms of evaluation approaches and methods such as developmental evaluation, contribution analysis, and principles-focused approaches to evaluation.

Beyond Utilization

Design is the craft, production, and thinking associated with creating products, services, systems, or policies that have a purpose. In service of this purpose, designers will explore multiple issues associated with the ‘user’ and the ‘use’ of something — what are the needs, wants, and uses of similar products? Good designers go beyond simply asking about these things: they measure, observe, and conduct design research ahead of the actual creation of something rather than taking things at face value. They also attempt to see things beyond what is right in front of them to possible uses, strategies, and futures.

Design work is both an approach to a problem (a thinking & perceptual difference) and a set of techniques, tools, and strategies.

Utilization can run into problems when we take the present as an example of the future. Steve Jobs didn’t ask users for ‘1000 songs in their pockets’ nor was Henry Ford told he needed to invent the automobile over giving people faster horses (even if the oft-quoted line about this was a lie). The impact of their work was being able to see possibilities and orchestrate what was needed to make these possibilities real.

Utilization of evaluation is about making what already exists fit better for use by taking into consideration the user’s perspective. A design-driven evaluation looks beyond this to what could be. It also considers how what we create today shapes what decisions and norms come tomorrow.

Designing for Humans

Alongside the false statement attributed to Henry Ford about people wanting faster horses is a more universal false statement said by innovators and students alike: “I love learning.” Many humans love the idea of learning or the promise of learning, but I would argue that very few love learning with the sense of absoluteness that the phrase above conveys. Much of our learning comes from painful, frustrating, prolonged experiences and is sometimes boring, covert, and confusing. It might be delayed in how it manifests itself, with its true effects not felt until long after the ‘lesson’ is taught. Learning is, however, useful.

A design-driven approach seeks to work with human qualities to design for them. For example, a utilization-focused evaluation approach might yield a process that involves regular gatherings to discuss an evaluation or reports that use a particular language, style, and layout to convey the findings. These are what the users, in this case, are asking for and what they see as making evaluation findings appealing, and thus have been built into the process.

Except, what if the regular gatherings don’t involve the right people, are difficult to set up and thus ignored, or when those people show up they are distracted with other things to do (because this process adds another layer of activity into a schedule that is already full)? What if the reports that are generated are beautiful, but then sit on a shelf because the organization doesn’t have a track record of actually drawing on reports to inform decisions despite wanting such a beautiful report? (We see this with so many organizations that claim to be ‘evidence-based’ yet use evidence haphazardly, arbitrarily, or don’t actually have the time to review the evidence).

What we get are things that have been created with the best intentions for use, but are not based on the actual behaviour of those involved. Asking about that behaviour and designing for it is not just an approach, it’s a way of doing an evaluation.

Building Design into Evaluation

There are a couple of approaches to introducing design for evaluation. The first is to develop certain design skills — such as design thinking and applied creativity. This work is being done as part of the Design Loft Experience workshop held at the annual American Evaluation Association conference. The second is more substantive and that is about incorporating design methods into the evaluation process from the start.

Design thinking has become popular as a means of expressing aspects of design in ways that have been taken up by evaluators. Design thinking is often characterized by a playful approach to generating new ideas and then prototyping those ideas to find the best fit. Lego, play dough, markers, and sticky notes (as shown above) are some of the tools of the trade. Design thinking can be a powerful way to expand perspectives and generate something new.

Specific techniques, such as those taught at the AEA Design Loft, can provide valuable ways to re-imagine what an evaluation could look like and support design thinking. However, as I’ve written elsewhere, there is a lot of hype, over-selling, and general bullshit being spouted in this realm, so proceed with some caution. Evaluation can help design thinking just as much as design thinking can help evaluation.

What Design-Driven Evaluation Looks Like

A design-driven evaluation takes as its premise a few key things:

  • Holistic. Design-driven evaluation is a holistic approach to evaluation and extends the thinking about utility to everything from the consultation process and engagement strategy to instrumentation, dissemination, and discussions on use. Good design isn’t applied only to one part of the evaluation, but the entire thing from process to products to presentations.
  • Systems thinking. It also utilizes systems thinking in that it expands the conversation of evaluation use beyond the immediate stakeholders involved to consider other potential users and their positions within the system of influence of the program. Thus, a design-driven evaluation might ask: who else might use or benefit from this evaluation? How do they see the world? What would use mean to them?
  • Outcome and process oriented. Design-driven evaluations are directed toward an outcome (although that may be altered along the way if used in a developmental manner), but designers are agnostic to the route to the outcome. An evaluation must contain integrity in its methods, but it must also be open for adaptation as needed to ensure that the design is optimal for use. Attending to the process of design and implementation of the evaluation is an important part of this kind of evaluation.
  • Aesthetics matter. This is not about making things pretty, but it is about making things attractive. This means creating evaluations that are not ignored. This isn’t about gimmicks, tricks, or misrepresenting data, it’s considering what will draw and hold attention from the outset in form and function. One of the best ways is to create a meaningful engagement strategy for participants from the outset and involve people in the process in ways that fit with their preferences, availability, skill set, and desires rather than as tokens or simply as ‘role players.’ It’s about being creative about generating products that fit with what people actually use, not just what they want or think a good evaluation is. This might mean doing a short video or producing a series of blog posts rather than writing a report. Kylie Hutchinson has a great book on innovative reporting for evaluation that can expand your thinking about how to do this.
  • Inform Evaluation with Research. Research is not just meant to support the evaluation, but to guide the evaluation itself. Design research is about looking at what environments, markets, and contexts a product or service is entering. Design-driven evaluation means doing research on the evaluation itself, not just for the evaluation.
  • Future-focused. Design-driven evaluation draws data from social trends and drivers associated with the problem, situation, and organization involved in the evaluation to not only design an evaluation that can work today but one that anticipates use needs and situations to come. Most of what constitutes use for evaluation will happen in the future, not today. By designing the entire process with that in mind, the evaluation can be set up to be used in a future context. Methods of strategic foresight can support this aspect of design research and help strategically plan for how to manage possible challenges and opportunities ahead.

Principles

Design-driven evaluation also works well with principles-focused evaluation. Good design is often grounded in key principles that drive its work. One of the most salient of these is accessibility — making what we do accessible to those who can benefit from it. This extends us to consider what it means to create things that are physically accessible to those with visual, hearing, or cognitive impairments (or, when doing things in physical spaces, making them available for those who have mobility issues).

Accessibility is also about making information understandable (avoiding unnecessary jargon, using the appropriate language for each audience, using plain language when possible, and accounting for literacy levels). It’s also about designing systems of use — for inclusiveness. This means going beyond doing things like creating an executive summary for a busy CEO when that over-simplifies certain findings to designing in space within that leader’s schedule and work environment to make the time to engage with the material in the manner that makes sense for them. This might be a different format of a document, a podcast, a short interactive video, or even a walking meeting presentation.

There are also many principles of graphic design and presentation that can be drawn on (these will be expanded on in future posts). Principles for service design, presentations, and interactive use are all available and widely discussed. What a design-driven evaluation does is consider what these might be and build them into the process. While a design-driven evaluation is not necessarily a principles-focused one, it can be, and the two approaches are very close.

This is the first in a series of posts that will be forthcoming on design-driven evaluation. It’s a starting point and far from the end. By taking into account how we create not only our programs but their evaluation from the perspective of a designer we can change the way we think about what utilization means for evaluation and think even more about its overall experience.

evaluation, social systems

Baby, It’s Cold Outside (and Other Evaluation Lessons)

Competing desires or imposing demands?

The recent decision by many radio stations to remove the song “Baby, It’s Cold Outside” from their rotation this holiday season provides lessons on culture, time, perspective, and ethics beyond the musical score for those interested in evaluation. The implications of these lessons extend far beyond any wintery musical playlist. 

As the holiday season approaches, the airwaves, content streams, and in-store music playlists get filled with their annual turn toward songs of Christmas, the New Year, Hanukkah, and the romance of cozy nights inside and snowfall. One of those songs has recently been given the ‘bah humbug’ treatment and voluntarily removed from playlists, initiating a fresh round of debates (which have been around for years) about the song and its place within pop culture art. The song, “Baby, It’s Cold Outside,” was written in 1944 and has been performed and recorded by dozens of duos ever since.

It’s not hard for anyone sensitive to gender relations to find some problematic issues with the song and the defense of it on the surface, but it’s once we get beneath that surface that the arguments become more interesting and complicated. 

One Song, Many Meanings

One of these arguments has come from jazz vocalist Sophie Millman, whose take on the song on the CBC morning radio show Metro Morning was that the lyrics are actually about competing desires within the times, not a work about predatory advances.

Others, like feminist author Cammila Collar, have gone so far as to describe the opposition to the song as ‘slut shaming’.

Despite those points (and acknowledging some of them), others suggest that the manipulative nature of the dialogue attributed to the male singer is a problem no matter what year the song was written. For some, the idea that this was just harmless banter overlooks the enormous power imbalance between genders then and now when men could impose demands on women with fewer implications. 

Lacking a certain DeLorean to go back in time to fully understand the intent and context of the song when it was written and released, I came to appreciate that this is a great example of some of the many challenges that evaluators encounter in their work. Is “Baby, It’s Cold Outside” good or bad for us? As with many situations evaluators encounter: it depends (and depends on what questions we ask).

Take (and Use) the Fork

Yogi Berra famously suggested (or didn’t) that “when you come across a fork in the road, take it.” For evaluators, we often have to take the fork in our work and the case of this song provides us with a means to consider why.

A close read of the lyrics and a cursory knowledge of the social context of the 1940s suggest that the arguments put forth by Sophie Millman and Cammila Collar have some merit and at least warrant plausible consideration. This might just be a period piece highlighting playful, slightly romantic banter between a man and woman on a cold winter night.

At the same time, what we can say with much more certainty is that the song agitates many people now. Lydia Liza and Josiah Lemanski revised the lyrics to create a modern, consensual take on the song, which has a feel that is far more in keeping with the times. This doesn’t negate the original intent and interpretation of the lyrics, rather it places the song in the current context (not a historical one) and that is important from an evaluative standpoint.

If the intent of the song is to delight and entertain, then what once worked well now might not. In evaluation terms, we might say that while the original merit of the song may hold based on historical context, its worth has changed considerably within the current context.

We may, as Berra might have said, have to take the fork and accept two very different understandings within the same context. We can do this by asking some specific questions. 

Understanding Contexts

Evaluators typically ask of programs (at least) three questions: What is going on? What’s new? and What does it mean? In the case of “Baby, It’s Cold Outside,” we can see that the context has shifted over the years, meaning that no matter how benign the original intent, the potential for misinterpretation or re-visioning of the intent in light of current times is worth considering.

What is going on is that we are seeing a lot of discussion about the subject matter of a song and what it means in our modern society. This issue is an attractor for a bigger discussion of historical treatment, inequalities, and the language and lived experience of gender.

The fact that the song is still being re-recorded and re-imagined by artists illustrates the tension between a historical version and a modern interpretation. It hasn’t disappeared and it may be better known now than ever given the press it receives.

What’s new is that society is far more aware of the scope and implications of gender-based discrimination, violence, and misogyny in our world than before. It’s hard to look at many historical works of art or expression without referencing the current situation in the world. 

When we ask about what it means, that’s a different story. The myriad versions of the song are out there on records, CDs, and through a variety of streaming sources. While it might not be included in a few major outlets, it is still available. It is also possible to be a feminist and challenge gender-based violence and discrimination and love or leave the song.

The two perspectives may not be aligned explicitly, but they can be with a larger, higher-level purpose of seeking empowerment and respect for women. It is within this context of tension that we can best understand where works like this live.

This is the tension in which many evaluations live when dealing with human services and systems. There are many contexts and we can see competing visions and accept them both, yet still work to create a greater understanding of a program, service, or product. Like technology, evaluations aren’t good or bad, but nor are they neutral. 

Image credit MGM/YouTube via CBC.ca

Note: The writing of this article happened to coincide with the anniversary of the horrific murder of 14 women at L’Ecole Polytechnique de Montreal. It shows that, no matter how we interpret works of art, we all need to be concerned with misogyny and gender-based violence. It’s not going away.

education & learning, evaluation

Learning: The Innovators’ Guaranteed Outcome

Innovation involves bringing something new into the world and that often means a lot of uncertainty with respect to outcomes. Learning is the one outcome that any innovation initiative can promise if the right conditions are put into place. 

Innovation (the act of doing something new to produce value) in human systems is fraught with complications from the standpoint of evaluation, given that the outcomes are not always certain, the processes aren’t standardized (or even set), and the relationship between the two is often in an ongoing state of flux. And yet, evaluation is of enormous importance to innovators looking to maximize benefit, minimize harm, and seek solutions that can potentially scale beyond their local implementation.

Non-profits and social innovators are particularly vexed by evaluation because there is an often unfair expectation that their products, services, and programs make a substantial change to social issues such as poverty, hunger, employment, chronic disease, and the environment (to name a few). These are issues that are large, complex, and over which no actor has complete ownership or control, yet they require some form of action, individually and collectively.

What is an organization to do or expect? What can they promise to funders, partners, and their stakeholders? Apart from what might be behavioural or organizational outcomes, the one outcome that an innovator can guarantee, if they manage themselves right, is learning.

Learning as an Outcome

For learning to take place, there need to be a few things included in any innovation plan. The first is that there needs to be some form of data capture of the activities that are undertaken in the design of the innovation. This is often the first hurdle that many organizations face because designers are notoriously bad at showing their work. Innovators (designers) need to capture what they do and what they produce along the way. This might include false starts, stops, ‘failures’, and half-successes, which are all part of the innovation process. Documenting what happens between idea and creation is critical.

Secondly, there needs to be some mechanism to attribute activities and actions to indicators of progress. Change can only be detected in relation to something else so, in the process of innovation, we need to be able to compare events, processes, activities, and products at different stages. Some of the selection of these indicators might be arbitrary at first, but as time moves along it becomes easier to know whether things like a stop or start are really just ‘pauses’ or whether they really are pivots or changes in direction.

Learning as organization

Andrew Taylor and Ben Liadsky from Taylor Newberry Consulting recently wrote a great piece on the American Evaluation Association’s AEA 365 blog outlining a simple approach to asking questions about learning outcomes. Writing about their experience working with non-profits and grantmakers, they comment on how evaluation and learning require creating a culture that supports the two in tandem:

Given that organizational culture is the soil into which evaluators hope to plant seeds, it may be important for us to develop a deeper understanding of how learning culture works and what can be done to cultivate it.

What Andrew and Ben speak of is the need to create, at the start, the environment in which learning can occur. Some of that is stirred by asking some critical questions, as they point out in their article. These include identifying whether there are goals for learning in the organization and what kind of time and resources are invested in regularly gathering people together to talk about the work that is done. This is the third big part of evaluating for learning: create the culture for it to thrive.

Creating Consciousness

It’s often said that learning is as natural as breathing, but if that were true much more would be gained from innovation than there is. Just like breathing, learning can take place passively and can be manipulated or controlled. In both cases, there is a need to create a consciousness around what ‘lessons’ abound.

Evaluation serves to make the unconscious, conscious. By paying attention to, and being mindful of, what is taking place and linking that to innovation at the level of the organization (not just the individual), evaluation can be a powerful tool to aid the process of taking new ideas forward. While we cannot always guarantee that a new idea will transform a problem into a solution, we can ensure that we learn in our effort to make change happen.

The benefit of learning is that it can scale. Many innovations can’t, but learning is something that can readily be added to and built on, and it transforms the learner. In many ways, learning is the ultimate outcome. So next time you look to undertake an innovation, make sure to evaluate it and build in the kind of questions that help ensure that, no matter what the risks are, you can assure yourself a positive outcome.

Image Credit: Rachel on Unsplash

education & learning, evaluation

The Quality Conundrum in Evaluation

One of the central pillars of evaluation is assessing the quality of something, often described as its merit. Along with worth (value) and significance (importance), assessing the merit of a program, product or service is one of the principal areas on which evaluators focus their energy.

However, if you think that would be something that’s relatively simple to do, you would be wrong.

This was brought home clearly in a discussion I took part in as part of a session on quality and evaluation at the recent conference of the American Evaluation Association entitled: Who decides if it’s good? How? Balancing rigor, relevance, and power when measuring program quality. The conversation session was hosted by Madeline Brandt and Kim Leonard from the Oregon Community Foundation, who presented on some of their work in evaluating quality within the school system in that state.

As they described the context of their work in schools, I was struck by some of the situational variables that came into play, such as high staff turnover (and a resulting shortage among those staff that remain) and the decision to operate some schools on a four-day workweek instead of five as a means of addressing shortfalls in funding. I’ve since learned that Oregon is not alone in adopting the 4-day school week; many states have begun experimenting with it to curb costs. The argument is, presumably, that schools can and must do more with less time.

This means that students are receiving up to one fifth less classroom time each week, yet are expected to perform at the same level as those with five days. What does that mean for quality? Like much of evaluation work, it all depends on the context.

Quality in context

The United States has a long history of standardized testing, which was instituted partly as a means of ensuring quality in education. The thinking was that, with such diversity in schools, school types, and populations, there needed to be some means to compare the capabilities and achievement across these contexts. A standardized test was presumed to serve as a means of assessing these attributes by creating a benchmark (standard) against which student performance could be measured and compared.

While there is a certain logic to this, standardized testing has a series of flaws embedded in its core assumptions about how education works. For starters, it assumes a standard curriculum and model of instruction that is largely one-size-fits-all. Anyone who has been in a classroom knows this is simply not realistic or appropriate. Teachers may teach the same material, but the manner in which it is introduced and engaged with is meant to reflect the state of the classroom: its students, physical space, availability of materials, and place within the curriculum (among others).

If we put aside the ridiculous assumption that all students are alike in their ability and preparedness to learn each day for a minute and just focus on the classroom itself, we already see the problem with evaluating quality by looking back at the 4-day school week. Four-day weeks mean either that teachers are creating short-cuts in how they introduce subjects and are not teaching all of the material they have or they are teaching the same material in a compressed amount of time, giving students less opportunity to ask questions and engage with the content. This means the intervention (i.e., classroom instruction) is not consistent across settings and thus, how could one expect things like standardized tests to reflect a common attribute? What quality education means in this context differs from what it means in others.

And that’s just the variable of time. Consider the teachers themselves. If we have high staff turnover, it is likely an indicator that there are some fundamental problems with the job. It may be low pay, poor working conditions, unreasonable demands, insufficient support or recognition, or little opportunity for advancement, to name a few. How motivated, supported, or prepared do you think these teachers are?

With all due respect to those teachers, they may be incompetent to facilitate high-quality education in this kind of classroom environment. By incompetent, I mean not being prepared to manage compressed schedules, lack of classroom resources, demands from standardized tests (and parents), high student-teacher ratios, individual student learning needs, plus fitting in the other social activities that teachers participate in around school such as clubs, sports, and the arts. Probably no teachers have the competency for that. Those teachers — at least the ones that don’t quit their job — do what they can with what they have.

Context in Quality

This situation then demands new thinking about what quality means in the context of teaching. Is a high-quality teaching performance one where teachers are better able to adapt, respond to the changes, and manage to simply get through the material without losing their students? It might be.

Exemplary teaching in the context of depleted or scarce resources (time, funding, materials, attention) might look far different than if conducted under conditions of plenty. The learning outcomes might be considerably different, too. So the link between the quality of teaching and learning outcomes is highly dependent on many contextual variables; if we fail to account for them, we will misattribute causes and effects.

What does this mean for quality? Is it an objective standard or a negotiated, relative one? Can it be both?

This is the conundrum that we face when evaluating something like the education system and its outcomes. Are we ‘lowering the bar’ for our students and society by recognizing outstanding effort in the face of unreasonable constraints or showing quality can exist in even the most challenging of conditions? We risk accepting something that under many conditions is unacceptable with one definition and blaming others for outcomes they can’t possibly achieve with the other.

From the perspective of standardized tests, the entire system is flawed to the point where the measurement is designed to capture outcomes that schools aren’t equipped to generate (even if one assumes that standardized tests measure the ‘right’ things in the ‘right’ way, which is another argument for another day).

Speaking truth to power

This year’s AEA conference theme was speaking truth to power and this situation provides a strong illustration of that. While evaluators may not be able to resolve this conundrum, what they can do is illuminate the issue through their work. By drawing attention to the standards of quality, their application, and the conditions that are associated with their realization in practice, not just theory, evaluation can serve to point to areas where there are injustices, unreasonable demands, and areas for improvement.

Rather than assert blame or unfairly label something as good or bad, evaluation, when done with an eye to speaking truth to power, can play a role in fostering quality and promoting the kind of outcomes we desire, not just the ones we get. In this way, perhaps the real measure of quality is the degree to which our evaluations do this. That is a standard that, as a profession, we can live up to and that our clients — students, teachers, parents, and society — deserve.

Image credit: Lex Sirikiat