Statistical vs. Social Significance
Insights from the Ambassadors
The error of treating statistical significance as ipso facto equivalent to social significance seems to be standard practice in the program evaluation domain.
– David Hunter, Hunter Consulting, LLC
Statistical significance and policy relevance are not the same thing. One has to reach a judgment about any project’s results by separately considering both forms of significance.
– Gordon Berlin, President, MDRC
Overview
The July 2015 exchange among ambassadors about statistically significant impact vs. social (or policy) value in many ways reflects the internal debate over the hot-button issue of external evaluation during the development of “The Performance Imperative.” Ultimately, ambassadors robustly endorsed external evaluation as Pillar 7 of the Performance Imperative, while making clear that no one type of evaluation is right for every organization at every stage of organizational development. Similarly, ambassadors both questioned and offered thoughtful explanations of the value of statistical significance, while agreeing that it is not the same as social value and that additional context and considerations are necessary to ensure socially valuable outcomes.
The full context of each ambassador’s contribution to the exchange is listed in the Appendix.
Background/The Issue at Hand
In a community post on July 11, Mario Morino solicited ambassadors’ thoughts on the Social Innovation Research Center’s (SIRC) report, “Social Innovation Fund: Early Results Are Promising,” which described, with reasonable qualification, positive evaluation findings for several SIF-funded programs. David Hunter’s concern with the report centered on the way in which SIRC interpreted the results of MDRC’s evaluation of Reading Partners, a volunteer tutoring program for kids who are 6-30 months behind their age-mates in reading. While taking no issue with the quality of MDRC’s research, which found “statistically significant impacts…equivalent to approximately one-and-a-half to two months of additional progress in reading,” David critiqued SIRC for concluding, “the Reading Partners model is effective.” As David stated, “I want to focus on one issue: the difference between statistical significance and social significance…and how very frequently (including in this instance) the former is treated as if it were inherently the latter.”
David illustrated his point further:
If our child was 30 months behind his or her age cohort in reading, would we be happy if that child caught up 2 months (maximum) after participating in a 28-week reading improvement program? He or she would still be 28 months behind and, I think it is fair to say, still at a great educational disadvantage….In other words, the report does not address whether the statistically significant impacts of Reading Partners have any social value.
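To make the distinction concrete, here is a minimal, purely illustrative sketch; all figures are hypothetical assumptions chosen to mirror David’s example, not numbers from the MDRC evaluation. With a large enough sample, a modest average gain easily clears the bar of statistical significance while barely denting the gap David describes.

```python
# Illustrative only: hypothetical figures chosen to mirror David's example,
# not data from the MDRC study of Reading Partners.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 500  # hypothetical number of children per group

# Simulated reading growth over the program period, in months of progress
control = rng.normal(loc=5.0, scale=3.0, size=n)    # hypothetical control-group growth
treatment = rng.normal(loc=6.8, scale=3.0, size=n)  # ~1.8 months of additional growth

t_stat, p_value = stats.ttest_ind(treatment, control)
impact = treatment.mean() - control.mean()
print(f"Estimated impact: {impact:.1f} months (p = {p_value:.4f})")  # p falls far below 0.05

# Social significance is a different question: a child who started 30 months
# behind is still roughly (30 - impact) months behind after the program.
print(f"Remaining gap for a child who began 30 months behind: {30 - impact:.1f} months")
```

The point of the sketch is only that the p-value answers “is the average difference likely real?” and says nothing about whether the difference is large enough to matter.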
Context and Considerations
Three ambassadors, Gordon Berlin, Kris Moore, and Mari Kuraishi, responded to David’s critique by offering additional considerations for weighing statistical significance against social significance.
Statistical and social significance must be considered separately
Gordon Berlin: “Statistical significance and policy relevance are not the same thing. One has to reach a judgment about any project’s results by separately considering both forms of significance.”
- A statistically significant impact is only an average, and, like any average, there is a lot of dispersion around that mean. “Some children gained more, some children gained nothing; in fact, some might have lost ground. The analysis and the numbers do not really tell us anything about individual children.”
- When the control group also gets extra attention, a statistically significant treatment difference carries more weight. “In analyzing the results of an experiment one always wants to understand the control group context, keeping in mind that the program’s impact is driven by the treatment difference between those in the program group and those in the control group. In this case, both groups of children got extra attention and resources above and beyond what they would have normally gotten. That RP still made a difference despite the fact that control group children also received school-provided special resources adds to the author’s and MDRC’s confidence that RP is policy relevant.”
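A brief follow-on sketch (again using hypothetical numbers, not individual-level data from the Reading Partners evaluation) illustrates Gordon’s first point: the same significant average impact is compatible with very different individual trajectories, and the experiment identifies only the average treatment-control difference, not which children gained or lost ground.

```python
# Illustrative only: hypothetical dispersion around a 1.8-month average effect.
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical individual program effects, in months: same average impact as
# before, but with wide variation across children.
effects = rng.normal(loc=1.8, scale=4.0, size=n)

print(f"Average effect:              {effects.mean():.1f} months")
print(f"Share gaining 3+ months:     {(effects >= 3).mean():.0%}")
print(f"Share with roughly no gain:  {((effects > -1) & (effects < 1)).mean():.0%}")
print(f"Share losing ground:         {(effects <= -1).mean():.0%}")
```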
Statistically significant reading outcomes may signal greater value (in science, math, behavior) than a single study can determine
Kris Moore: “While I agree with these cautions, I want to add another consideration, which is that the ability to read, even a little better, makes it more likely that a child will be able to do science and math problems, and perhaps they are more likely to read on their own. And, hopefully, if school is less frustrating, a child might even behave a little better. A longitudinal follow-up with data on exploratory (as well as confirmatory) outcomes could answer these kinds of questions.”
A statistically significant gain has value to the individual student who makes it (a private good), even if it is insignificant in the broader scheme of things (a social good)
Mari Kuraishi: “This also fits the classic economics divide of private good vs. social good…As a parent with a kid with an IEP, I know that ANY gain is sought after and appreciated. But it is frustrating not to know how to get the gains faster and or cheaper.”
- Upon review of the exchange, Mari offered this additional explanation:
“Markets work in private goods because ‘in theory’ there are multiple suppliers (in non-monopoly situations) that match up to multiple demanders, each with their own utility function reflected in different willingness to pay, and therefore different clearing prices. So in this example that David sets up, ‘If our child was 30 months behind his or her age cohort in reading, would we be happy if that child caught up 2 months (maximum) after participating in a 28-week reading improvement program?’ Different families might be willing to pay different prices for this intervention that results in a net 10-week gain, or 2+ month gain. Positing that repeating this intervention has cumulative results (completely unrealistic, I know), a parent might be willing to pay the price 14 times over to get their kid caught up because the gain to them is a kid that’s caught up, and hypothetically able to participate in public education with no supports etc.
“This equation becomes a lot harder to balance when we have to think about this in the public good context because we are not asking individual households to define their own utility functions and letting them choose their own price point–the decision has to be made for them all as a group (and frequently the households are not consulted in choice of intervention in any event), and moreover the households aren’t paying directly for this intervention. They may be paying indirectly via taxes etc, but they are not at liberty to set price. And in this example we have no idea of alternative interventions, their cost and efficacy, as Gordon points out below.”
A Point of Contention: The Cost Factor
Gordon and David agreed that whether a program yields statistically significant evaluation results is a different question than whether it has policy relevance. Yet they offered different perspectives once cost considerations were added to the equation.
Gordon Berlin: When cost is low, the policy value of modest statistical significance increases. “In considering policy relevance, cost matters. And because Reading Partners relies on volunteers, the cost is low. If policymakers can get a two month boost on average for all served children for very low cost, in the resource-constrained world we inhabit, policy relevance would rise. Benefits outweigh the modest costs.”
David Hunter: Is scaling inexpensive programs that get only modest results ultimately good for society? “Is it really good for our society and its people to have relatively inexpensive programs that produce weak results at best? Results that are unlikely to exert a meaningful influence over the life prospects of the people who most need a helping hand? Wouldn’t we want our social policy analysts to take this wider view? And might it not well be true that paying for inexpensive but weak programs will, in the end, cost our society more (in lost opportunities for people who needed more robust help than the weak programs provide–and the loss in contributions they might have made had they received such help)?
“Consider the example of home visitation programs for single mothers. If there were someone in your family who qualified for such a program, would you want her to receive services from the Nurse-Family Partnership or Healthy Families America? In the former there are proven and significant (life-changing) impacts both for the mother and sixteen years later for their children. In the latter, there also are proven impacts–but the effects are very modest in comparison and don’t seem to be strong enough to change life trajectories. The Nurse-Family Partnership is more expensive. Is that a good reason for policy analysts and makers to favor Healthy Families America when looking to spend public moneys?”
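Much of this disagreement turns on what belongs in the denominator of a cost-effectiveness comparison. The sketch below uses entirely hypothetical costs and impacts (none of these figures describe Reading Partners, the Nurse-Family Partnership, or Healthy Families America) to show how the two framings diverge.

```python
# Illustrative only: hypothetical costs and impacts, not published program figures.

def cost_per_unit_of_impact(cost_per_participant: float, impact_units: float) -> float:
    """Dollars spent per unit of measured impact (e.g., per month of reading gain)."""
    return cost_per_participant / impact_units

# Hypothetical low-cost program with a modest impact
cheap_modest = cost_per_unit_of_impact(cost_per_participant=700, impact_units=1.8)

# Hypothetical high-cost program with a large impact
costly_strong = cost_per_unit_of_impact(cost_per_participant=9000, impact_units=12.0)

print(f"Low-cost, modest-impact program:  ${cheap_modest:,.0f} per unit of impact")
print(f"High-cost, strong-impact program: ${costly_strong:,.0f} per unit of impact")

# Gordon's framing: per unit of measured impact, the low-cost program looks attractive.
# David's framing: if only the larger impact is big enough to change a life trajectory,
# "per unit of impact" is the wrong yardstick; the unit that matters is the
# trajectory-changing outcome, which the modest program may never deliver.
```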
Resources for Understanding Social Value
Dean Fixsen responded by offering several references from social science that go beyond RCTs/statistical significance to help “describe and assess social validity.” As he stated, “Present day views of social sciences have wed themselves to randomized group designs as the only way of knowing and in that mode statistical significance is the only test of what matters. Science is broader than that, and any science aimed at improving human service outcomes definitely goes far beyond that narrow view.” References recommended by Dean (with links added):
- Wolf, M. M. (1978). Social Validity: The case for subjective measurement or how applied analysis is finding its heart. Journal of Applied Behavior Analysis, 11, 203-214.
- Schwartz, I. S., & Baer, D. M. (1991). Social Validity Assessments: Is Current Practice State of the Art? Journal of Applied Behavior Analysis, 24(2), 189-204.
- Wolf, M. M. (1991). Why did social validity become a classic? Current Contents, 23, 8.
- Sudsawad, P. (2005). Concepts in clinical scholarship – A conceptual framework to increase usability of outcome research for evidence-based practice. American Journal of Occupational Therapy, 59(3), 351-355.
- Hurley, J. J. (2012). Social Validity Assessment in Social Competence Interventions for Preschool Children: A Review. Topics in Early Childhood Special Education, First published online April 6, 2012.
Conclusion
Our thanks to David Hunter, Gordon Berlin, Dean Fixsen, Kris Moore, and Mari Kuraishi for sharing their insights.
Download the complete file that includes the Appendix with full text of ambassadors’ individual comments from the online exchange.
Share With Your Networks
Sample Tweets
- Are statistical & social significance synonymous in #eval? See what @LeapAmbassadors say - http://bit.ly/2beUapS #NPStrong
- What's the difference between statistical & social significance? @LeapAmbassadors weigh in http://bit.ly/2beUapS #NPStrong
Sample Facebook or LinkedIn Post
What's the difference between statistical & social significance? @LeapAmbassadors, a private online community of nonprofit leaders and practitioners, provides an explanation: Statistical vs. Social Significance
This document, developed collaboratively by the Leap of Reason Ambassadors Community (LAC), is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License. We encourage and grant permission for the distribution and reproduction of copies of this material in its entirety (with original attribution). Please refer to the Creative Commons link for license terms for unmodified use of LAC documents.
Because we recognize, however, that certain situations call for modified uses (adaptations or derivatives), we offer permissions beyond the scope of this license (the “CC Plus Permissions”). The CC Plus Permissions are defined as follows:
You may adapt or make derivatives (e.g., remixes, excerpts, or translations) of this document, so long as they do not, in the reasonable discretion of the Leap of Reason Ambassadors Community, alter or misconstrue the document’s meaning or intent. The adapted or derivative work is to be licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License, conveyed at no cost (or the cost of reproduction), and used in a manner consistent with the purpose of the Leap of Reason Ambassadors Community; the integrity and quality of the original material is to be maintained, and its use must not adversely reflect on the reputation of the Leap of Reason Ambassadors Community.
Attribution is to be in the following formats:
- For unmodified use of this document, the attribution information already contained in the document is to be maintained intact.
- For adaptations or derivatives of this document, attribution should be prominently displayed and should substantially follow this format:
“From ‘Statistical vs. Social Significance,’ developed collaboratively by the Leap of Reason Ambassadors Community, licensed under CC BY ND https://creativecommons.org/licenses/by-nd/4.0/. For more information or to view the original product, visit https://leapambassadors.org/resources/ambassador-insights/statistical-vs-social-significance/.”
The above is consistent with Creative Commons license restrictions, which require “appropriate credit” and that the “name of the creator and attribution parties, a copyright notice, a license notice, a disclaimer notice and a link to the original material” be included.
The Leap of Reason Ambassadors Community may revoke the additional permissions described above at any time. For questions about copyright issues or special requests for use beyond the scope of this license, please email us at info@leapambassadors.org.