Learning and Impact

Evaluation of the Ethics and Governance of Artificial Intelligence Initiative


In 2016, as the Ethics and Governance of AI Initiative (the Initiative) was being conceptualized, numerous events occurred that would impact research, policy, and public discourse on the ethics and governance of AI. Examples include: the founding of the Partnership on AI (PAI); the ProPublica investigation that uncovered significant racial bias in AI used by law enforcement; and Brexit and the US presidential election, two political events which involved the spreading of misinformation on social media platforms.1 The funders recalled that the field of AI ethics was nascent when the Initiative was created: “There was definitely a sense in 2016 that there was so much going on […] it was a very rapidly moving field that hadn’t taken shape at all.”2

By 2017, $26 million had been raised for the Initiative, which sought “to ensure that technologies of automation and machine learning are researched, developed, and deployed in a way which vindicates social values of fairness, human autonomy, and justice.”3 Philanthropic support was provided by Luminate (founded by The Omidyar Group), Reid Hoffman, Knight Foundation, and the William and Flora Hewlett Foundation. The Miami Foundation provided fiscal management. The Initiative was structured as a joint project of the MIT Media Lab and the Harvard Berkman-Klein Center for Internet and Society (BKC).

In their joint proposal, BKC and the Media Lab articulated that, in collaboration with partners, they would “deploy new prototypes, conduct research, directly impact both policy and technologies, build community, teams, and even institutions, and engage in education and outreach that meaningfully connects human values with the technical capabilities of AI….”4 The Initiative was active from 2017 to 2022 and awarded approximately $23 million to 39 grantees working on 42 projects.

As the Initiative neared the end of its funding, The Miami Foundation and funding partners sought to assess the durability of its collaborative efforts, and the impact of projects supported through its grants. In August 2021, The Miami Foundation contracted Caribou Digital to evaluate the Initiative by reviewing 200+ Initiative documents, surveying grantees, and conducting 30 interviews with Initiative stakeholders. The top-level findings and recommendations from this evaluation are presented in this report.

Initiative Impact

Using the Initiative’s implied Theory of Change—reconstructed by Caribou Digital in Annex 3—as a framework, impact was described and assessed across four categories: 1) relevance and centrality of assets developed under the Initiative, 2) informed public and private sectors, 3) changes in governance, public policy, and industry practice, and 4) building the AI ethics and governance community.

The Initiative generated vast quantities of assets: over 250 publications (Annex 6), more than a dozen products (Annex 5), and countless engagements. In terms of their centrality and utility to the broader field, academic citations of these assets ranged from zero to thousands. Uptake of AI products for public good varied, with some stand-out examples of high and sustained uptake. Insights from large events funded by the Initiative suggested high relevance, engagement, and value.

One in three of the Initiative’s grantees provided examples of their contributions to more informed public and private sectors. Grantees informed policymakers in a variety of ways: providing evidence, testifying, delivering briefings, sitting on advisory groups, and engaging in partnerships. Industry representatives were notably more difficult to engage, as they were less accessible and less likely to share that a grantee’s work was informative. Excluding a number of internationally mandated institutions, the majority of examples emanated from North America, Europe, and the UK.

One in four of the Initiative’s grantees linked explicit policy changes and actions to their work. Grantees identified changes at several major technology companies, including: improvements to the quality of information on platforms (Twitter, Pinterest, Google, Facebook); online safety protections for users (Disqus); and assessments of bias within AI systems (Amazon and HSBC). However, grantees highlighted that impact on technology companies may be underreported. Within the public sector, all changes were concentrated in the US, at the local level, such as the ban on face surveillance technology in Massachusetts, and Electronic Frontier Foundation’s (EFF) work on the Public Oversight of Surveillance Technology (POST) Act in New York City. EFF’s two legal rulings to reverse the use of AI to implicate or imprison set precedents for future campaigns.

Views about the cohesiveness and strength of the AI ethics and governance community varied considerably, as did assessments of the roles of BKC, MIT, and the Initiative in strengthening it. Some noted that it “definitely exists” and that BKC and MIT “definitely contributed to it.” Others thought that, while there is a “healthy field,” the unification of computer science and social science “hasn’t ended up with that galvanization.” But one thing is certain: the number of institutions producing outputs and the number of people convened under the banner of AI ethics grew, and the Initiative fueled this growth.

Two broader changes enabled by the Initiative surfaced: organizational progression, and career progression and change. Some research products produced through the Initiative contributed to their authors’ career progression from research to developing policy or practices around AI ethics and governance. A few grantees credited the Initiative with their growth from projects to organizations and as leaders in their field. For example, the Markup used Initiative funds as seed funding; by the end of their grant, they had raised $25 million.5 DigiChina transitioned from a startup project within the New America Foundation to a program based at Stanford University with multi-year funding (New America, final report, October 2021).6

Unsurprisingly for an initiative of this scale and diversity, many projects continued to generate impact and a few closed. The benefits of educated professionals, members of the public, research assets, and legal reforms are relatively durable. Some Initiative projects, such as Tattle, the Markup, DigiChina, CivilServant and the FAT ML (later the ACM FAccT) conference, have grown or found homes in new institutions and continue to add value. However, 18% (n=7) of grantees, representing 6% ($1,367,188) of total funding, did not report on impact beyond outputs.

Was aggregate impact observed enough? While it may be up to each funder to assess whether reported impact was sufficient, it is also worth considering the nascency of AI ethics and governance in 2016, which required an element of foundation laying. This more exploratory work tends to weigh towards outputs rather than longer-term impacts. Ultimately, Initiative leadership and funders should emerge with new knowledge and a clearer view on where resources should be focused next or new learnings to apply to similar future initiatives.

Initiative Implementation

There was a relatively proportional mix of theoretical and practical projects. There were significantly more research papers produced than AI products for public good developed. However, if a broader view of “practical” is taken (i.e., including engagement with public and private sector actors and the development of public resources and trainings), the theoretical and practical mix does not appear disproportional. Further, 72% of grantees worked in more than one of the Initiative’s three strategic areas: 1) community and capacity building, 2) research sprints and pilot projects, and 3) education, training, and outreach. This demonstrates that most projects embodied a mix of theoretical and practical.

The Initiative was responsive to most trends in the broad field of AI. This responsiveness can be characterized across four efforts: 1) to embrace the inherent interdisciplinarity of AI (and of AI ethics and governance); 2) to uplift and amplify a diversity of voices; 3) to include and engage broader elements of society; and 4) to develop and support a counterweight to industry resources and priorities.

  • Interdisciplinarity. It is notable, responsive, and appropriate that the majority of the Initiative projects had various interdisciplinary aspects to them, either in their teams or in the people they convened. Grantees felt that such interdisciplinarity was important to continue, and that in the long term the community will be healthier and more resilient for it.
  • Diversity. The “diversity disaster” in the broader AI field is well known.7 Within the sub-field of AI ethics and governance, grantees noted that a field that relied on the same voices, geographies and, often, institutions would result in missing perspectives. With 15% of grantees being non-US based8 and 80% of funding supporting academic institutions, the Initiative could have done more on this front, in line with its international ambitions.9 There remains an imperative to push for substantive diversity of thought and experience, both within geographies and across them.
  • Active inclusion of society. Over 50% of the Initiative’s projects included society as one of their target audiences. Several grantees shared that the AI ethics and governance community must build on the Initiative’s efforts to actively engage with society. Democratic societies determine how their governments use technology and automated systems; the Initiative illustrates how it is vital to bridge the knowledge gap and explain how these systems work to ensure that current social injustices are not replicated through AI.
  • Industry counterweight. The Initiative may be seen as a counterweight to the significant industry resources invested in AI. Donors have the opportunity to offset market and geopolitical incentives in support of human-centric and ethical applications of AI. While it will not be possible for donors’ funds to equal the amount spent on AI by industry or major governments, academic and civil society organizations will continue to play a key role in increasing public awareness and influencing policy.

Recommendations for funders

The three most pertinent recommendations to support broad, complex, multi-donor, multi-year, multi-grantee initiatives are highlighted here.

  1. Design for diversity (in institutions, approaches, and geography) in the selection process. Conducting informative activities, such as ecosystem scanning and surveys on priorities, prior to selection processes is an opportunity to gain consensus on the gaps in research and practice and increase awareness of a broader range of organizations conducting relevant work. Another approach is to set a quota/cap on the number of grants provided to certain organizational types, those from specific countries, or those representing specific interests. This will enable a richer set of implementers, perspectives, and impacts. Support for intra-initiative engagement, through internal newsletters, discussion forums, or annual convenings, could further maximize the benefits of diverse grantee voices.
  2. Define the community mission to galvanize people and institutions. Building communities is inherently difficult work, made more difficult without clarity on the community’s mission. For future community-building initiatives, articulating the vision and mission, clarifying membership, and determining strategies to achieve the mission would galvanize people and institutions towards it.
  3. Design for impact measurement at the start of initiatives. A framework could include: a robust theory of change, measurement principles, specific and appropriate metrics, dedicated resources to regularly aggregate and review insights generated by grantees, and intra-Initiative learning convenings.