The error of treating statistical significance as ipso facto equivalent to social significance seems to be standard practice in the program evaluation domain.
– David Hunter, Hunter Consulting, LLC
Statistical significance and policy relevance are not the same thing. One has to reach a judgment about any project’s results by separately considering both forms of significance.
– Gordon Berlin, President, MDRC
The July 2015 exchange among ambassadors about statistically significant impact vs. social (or policy) value in many ways reflects the internal debate over the hot-button issue of external evaluation during the development of “The Performance Imperative.” Ultimately, ambassadors robustly endorsed external evaluation as Pillar 7 of the Performance Imperative, while making clear that no one type of evaluation is right for every organization at every stage of organizational development. Similarly, ambassadors both questioned and offered thoughtful explanations of the value of statistical significance, while agreeing that it is not the same as social value and that additional context and considerations are necessary to ensure socially valuable outcomes.
The full context of each ambassador’s contribution to the exchange is listed in the Appendix.
In a community post on July 11, Mario Morino solicited ambassadors’ thoughts on the Social Innovation Research Center’s (SIRC) report, “Social Innovation Fund: Early Results Are Promising,” which described, with reasonable qualification, positive evaluation findings for several SIF-funded programs. David Hunter’s concern with the report centered on the way in which SIRC interpreted the results of MDRC’s evaluation of Reading Partners, a volunteer tutoring program for kids who are 6-30 months behind their age-mates in reading. While taking no issue with the quality of MDRC’s research, which found “statistically significant impacts…equivalent to approximately one-and-a-half to two months of additional progress in reading,” David critiqued SIRC for concluding, “the Reading Partners model is effective.” As David stated, “I want to focus on one issue: the difference between statistical significance and social significance…and how very frequently (including in this instance) the former is treated as if it were inherently the latter.”
David illustrated his point further:
If our child was 30 months behind his or her age cohort in reading, would we be happy if that child caught up 2 months (maximum) after participating in a 28-week reading improvement program? He or she would still be 28 months behind and, I think it is fair to say, still at a great educational disadvantage….In other words, the report does not address whether the statistically significant impacts of Reading Partners has any social value.
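To make the distinction concrete, here is a minimal illustrative sketch (in Python; it is not part of the original exchange). It shows how, with a large enough sample, an average gain of roughly two months against a 30-month deficit can easily clear the bar of statistical significance even though the typical student remains about 28 months behind. The sample size and student-to-student variability below are assumed purely for illustration.

```python
# Illustrative sketch only: a modest average gain can be statistically
# significant while leaving most of the deficit in place.
# The 30-month deficit and ~2-month gain mirror the Reading Partners example;
# the sample size and variability are assumptions made for this illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n = 500                      # students per group (assumed)
baseline_deficit = 30.0      # months behind age-mates (from the example)
program_gain = 2.0           # average months gained (from the example)
noise_sd = 6.0               # variation across students, in months (assumed)

# Months still behind at follow-up, control vs. tutored students
control = baseline_deficit + rng.normal(0, noise_sd, n)
treated = baseline_deficit - program_gain + rng.normal(0, noise_sd, n)

t_stat, p_value = stats.ttest_ind(treated, control)
print(f"p-value: {p_value:.4f}")                                   # well below 0.05
print(f"average deficit remaining: {treated.mean():.1f} months")   # roughly 28 months
```

The point of the sketch is simply that the p-value answers "is the difference real?" while the remaining 28-month deficit speaks to the separate question of whether the difference matters for the child.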
Three ambassadors – Gordon Berlin, Kris Moore, and Mari Kuraishi – responded to David’s critique by offering additional considerations for assessing the value of statistical vs. social significance.
Statistical and social significance must be considered separately
Gordon Berlin: “Statistical significance and policy relevance are not the same thing. One has to reach a judgment about any project’s results by separately considering both forms of significance.”
Statistically significant reading outcomes may signal greater value (in science, math, behavior) than a single study can determine
Kris Moore: “While I agree with these cautions, I want to add another consideration, which is that the ability to read, even a little better, makes it more likely that a child will be able to do science and math problems, and perhaps they are more likely to read on their own. And, hopefully, if school is less frustrating, a child might even behave a little better. A longitudinal follow-up with data on exploratory (as well as confirmatory) outcomes could answer these kinds of questions.”
Statistical significance has value to the individual student making some gains (private good), even if insignificant in the broader scheme of things (social good)
Mari Kuraishi: “This also fits the classic economics divide of private good vs. social good…As a parent with a kid with an IEP, I know that ANY gain is sought after and appreciated. But it is frustrating not to know how to get the gains faster and or cheaper.”
“This equation becomes a lot harder to balance when we have to think about this in the public good context because we are not asking individual households to define their own utility functions and letting them choose their own price point–the decision has to be made for them all as a group (and frequently the households are not consulted in choice of intervention in any event), and moreover the households aren’t paying directly for this intervention. They may be paying indirectly via taxes etc, but they are not at liberty to set price. And in this example we have no idea of alternative interventions, their cost and efficacy, as Gordon points out below.”
Gordon and David agreed that whether a program yields statistically significant evaluation results is a different question from whether it has policy relevance. Yet they offered different perspectives on how to weigh those results once cost considerations are added to the equation.
Gordon Berlin: When cost is low, the policy value of modest statistical significance increases. “In considering policy relevance, cost matters. And because Reading Partners relies on volunteers, the cost is low. If policymakers can get a two month boost on average for all served children for very low cost, in the resource-constrained world we inhabit, policy relevance would rise. Benefits outweigh the modest costs.”
David Hunter: Is scaling inexpensive programs that get only modest results ultimately good for society? “Is it really good for our society and its people to have relatively inexpensive programs that produce weak results at best? Results that are unlikely to exert a meaningful influence over the life prospects of the people who most need a helping hand? Wouldn’t we want our social policy analysts to take this wider view? And might it not well be true that paying for inexpensive but weak programs will, in the end, cost our society more (in lost opportunities for people who needed more robust help than the weak programs provide–and the loss in contributions they might have made had they received such help)?
“Consider the example of home visitation programs for single mothers. If there were someone in your family who qualified for such a program, would you want her to receive services from the Nurse-Family Partnership or Healthy Families America? In the former there are proven and significant (life-changing) impacts both for the mother and sixteen years later for their children. In the latter, there also are proven impacts–but the effects are very modest in comparison and don’t seem to be strong enough to change life trajectories. The Nurse-Family Partnership is more expensive. Is that a good reason for policy analysts and makers to favor Healthy Families America when looking to spend public moneys?”
Dean Fixsen responded by offering several references from social science that go beyond RCTs/statistical significance to help “describe and assess social validity.” As he stated, “Present day views of social sciences have wed themselves to randomized group designs as the only way of knowing and in that mode statistical significance is the only test of what matters. Science is broader than that, and any science aimed at improving human service outcomes definitely goes far beyond that narrow view.” References recommended by Dean (with links added):
Our thanks to David Hunter, Gordon Berlin, Dean Fixsen, Kris Moore, and Mari Kuraishi for sharing their insights.
This document, developed collaboratively by the Leap of Reason Ambassadors Community (LAC), is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License. We encourage and grant permission for the distribution and reproduction of copies of this material in its entirety (with original attribution). Please refer to the Creative Commons link for license terms for unmodified use of LAC documents.
Because we recognize, however, that certain situations call for modified uses (adaptations or derivatives), we offer permissions beyond the scope of this license (the “CC Plus Permissions”). The CC Plus Permissions are defined as follows:
You may adapt or make derivatives (e.g., remixes, excerpts, or translations) of this document, so long as they do not, in the reasonable discretion of the Leap of Reason Ambassadors Community, alter or misconstrue the document’s meaning or intent. The adapted or derivative work is to be licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License, conveyed at no cost (or at the cost of reproduction), and used in a manner consistent with the purpose of the Leap of Reason Ambassadors Community; the integrity and quality of the original material is to be maintained, and its use must not adversely reflect on the reputation of the Leap of Reason Ambassadors Community.
Attribution is to be in the following formats:
“From ‘Statistical vs. Social Significance,’ developed collaboratively by the Leap of Reason Ambassadors Community, licensed under CC BY ND https://creativecommons.org/licenses/by-nd/4.0/. For more information or to view the original product, visit https://leapambassadors.org/ambassador-insights/statistical-vs-social-significance/.”
The above is consistent with Creative Commons License restrictions, which require that “appropriate credit” be given and that the “name of the creator and attribution parties, a copyright notice, a license notice, a disclaimer notice and a link to the original material” be included.
The Leap of Reason Ambassadors Community may revoke the additional permissions described above at any time. For questions about copyright issues or special requests for use beyond the scope of this license, please email us at info@leapambassadors.org.