Kinder believes that there are still too many charities carrying out programs without sufficient evidence that these programs are effective or work as intended. This state of affairs is reflected in public perception and may be causing harm to the charitable sector.
Kinder wants to shift the culture so that charities invest in interventions that work and regain public trust by providing evidence and reasoning. That’s why stage two of the Kinder Vetting Framework, Organisational Competence, includes measures of charities’ commitment to conducting research on their programs and producing data.
Lack of sufficient evidence in the sector is by no means a new problem, and Kinder is certainly not the first organization to try to rectify it. Back in 2006, the Stanford Social Innovation Review (SSIR) was already warning that charities were “drowning in data”. According to SSIR, charities were conducting low-quality evaluations that produced low-quality results, at the request of large donors who underestimated the difficulty and expense of meaningful evaluation.
We want to avoid this type of failure and the failures related to it. Goodhart’s Law encapsulates our concern: “when a measure becomes a target, it ceases to be a good measure”.
A close-to-home example of a Goodhart-type failure is the current hyperfocus on reducing overhead (operational) costs in philanthropy. This failure can, and usually does, cause underinvestment in skilled staff. Ron Sellers, president of the charity consultancy Grey Matter, argues that one reason the overhead ratio became so popular among nonprofits is that it is far easier to calculate than impact. But as a guide to impact per unit donated, it is almost useless.
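To see why the overhead ratio can mislead, consider a toy sketch with entirely hypothetical figures (the charity names, budgets, and outcome counts below are invented for illustration): two charities with identical budgets, where the one that spends more on operations, for instance on skilled staff, delivers more outcomes per pound donated.

```python
# Hypothetical figures: two charities with identical budgets.
# Charity A keeps overhead low but underinvests in skilled staff;
# Charity B spends more on operations and delivers more outcomes.
charities = {
    "A": {"budget": 100_000, "overhead": 10_000, "outcomes": 200},
    "B": {"budget": 100_000, "overhead": 30_000, "outcomes": 500},
}

for name, c in charities.items():
    overhead_ratio = c["overhead"] / c["budget"]    # what donors often rank on
    cost_per_outcome = c["budget"] / c["outcomes"]  # closer to impact per unit donated
    print(f"{name}: overhead {overhead_ratio:.0%}, cost per outcome {cost_per_outcome:.0f}")
```

Ranked by overhead ratio, Charity A looks better (10% versus 30%); ranked by cost per outcome, Charity B wins (500 outcomes versus 200 on the same budget). A single easy-to-calculate metric, made into a target, points donors in exactly the wrong direction.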
To sum up, rewarding charities for doing internal research could, if not done with care, result in organizations wasting resources, publishing low-quality research, or attempting to “prove” their effectiveness instead of testing it.
What we are already doing right
In some ways, Kinder is already combating this issue. In step two of the vetting framework, we look for several signifiers of quality that are relatively hard to fake. Among other things, we check for:
Data transparency (e.g. the frequency of updates)
Adjustment of programs in sync with data collection (a test of the intentions behind the data collection)
Making the research available online and inviting peer review
The use of external references and external assessments in impact analysis
We also have a section that assesses the logic of the charity’s strategy, and another that looks at whether the charity checks that its programs are functioning as intended. Both are prerequisites for a valuable impact analysis.
Rewarding organizations for producing high-quality research, rather than only for evidence of effectiveness, also counters publication bias: the bias that occurs when the results of a research project influence the decision to publicly share it.
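A small simulation illustrates the mechanism (the numbers and the publication threshold are invented for illustration): if studies of an intervention with zero true effect are only published when they happen to show a positive result, the published record overstates the effect even though every individual study was honest.

```python
import random

random.seed(0)

TRUE_EFFECT = 0.0  # the intervention actually does nothing
N_STUDIES = 1000

# Each study estimates the effect with random measurement noise.
results = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(N_STUDIES)]

# Publication bias: only studies that "found" a clearly positive
# effect are publicly shared (hypothetical cutoff of 0.5).
published = [r for r in results if r > 0.5]

mean_all = sum(results) / len(results)
mean_published = sum(published) / len(published)
print(f"mean effect across all studies:  {mean_all:+.2f}")
print(f"mean effect in published studies: {mean_published:+.2f}")
```

The mean across all studies hovers near the true value of zero, while the mean of the published subset is strongly positive. Rewarding the quality of the research process, rather than the direction of its results, removes the incentive to bury null findings.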
As for the authenticity of internal research, the third step, Intervention Effectiveness, will be the proof of the pudding: organisations that genuinely improve will automatically score higher in effectiveness over time.
Ways we could improve
In future upgrades of our framework’s second step, we should focus on checking the authenticity of the intention to improve. This can be done by adding more questions to the vetting framework and by applying human discretion; both make the framework harder to ‘game’. The trade-off is that standardised questions keep charities easy to compare, while human discretion is harder to fake but harder to standardise.
To set an example, and to better understand the difficulties charities face in gathering data, Kinder should take its own medicine. First, we should commission external research on the effects Kinder could have on charities’ behaviour, and later, research on the effects it is actually having. We will then be able to track, with evidence, whether our intervention is functioning as intended, and adapt or abandon it as needed.
Further reading: Goodhart’s Law and Why Measurement is Hard: “why the problem with metrics — and algorithms that rely on them — isn’t something that can be avoided.”