COVID-19 Response Policies: Uses of Data and Implications for Equity
Carly Jones, Nick Chan, Tobi Jegede, Emily Reece, Kit Rodolfa, Rayid Ghani (Carnegie Mellon University)
INTRODUCTION
Over the past year, the global COVID-19 pandemic has produced wide-ranging social, health, and economic harms, demanding swift action from policymakers. The scope and speed of the resulting policy decisions are unprecedented, with new changes to regulations, shutdowns and reopenings, and allocations of relief resources to individuals and businesses announced daily across the country. With so many policy responses, it is difficult for policymakers, analysts, and researchers to understand what decisions have been made, or to assess which have been implemented successfully and which have not.
Further complicating the pandemic response is the recognition that harms are not being felt equally across communities. Over the past year, we have witnessed disproportionately large COVID-19 outbreaks at historically Black long-term care facilities, soaring infection rates among Native American populations, and death rates among Latino residents more than triple those of white residents. The asymmetrical impacts of the pandemic were so evident in its early stages that the Centers for Disease Control and Prevention issued specific guidance to policymakers listing social determinants of health as primary risk factors for infection: persons of color, the poor, the elderly, and other vulnerable populations are suffering more acutely than the average person in this country.
Although we seem to have entered a phase of recovery, with many indicators showing signs of improvement, the benefits of this recovery are not necessarily being shared by everyone; considerable inequities still exist across groups. This raises the question: To what extent have policy decision-making processes created or exacerbated these disparate outcomes? And how could we do things differently?
As a first step to answering these questions, our team of researchers at Carnegie Mellon University has amassed a broad inventory of policy decisions made across the country in response to COVID-19. This initial inventory comprises decisions made at all levels of government, and in select nonprofits and private organizations, to address the social, health, and economic impacts of the pandemic, each with its own goals, actions, and intended targets. Developing this inventory has given us some preliminary insight into the ways in which policymakers are making and implementing COVID-19 response decisions, including the data and tools they are using to do so.
We are providing this resource as a starting point for policymakers, analysts, and researchers to critically engage with the landscape of potential approaches to addressing urgent public needs created by crises such as the pandemic. By supporting comparative analysis of policy decisions, we aim to help policymakers identify alternative, and perhaps more equitable, methods for informing, implementing, and documenting their decisions. Most importantly, we hope to motivate closer examination of the ways in which responses to the pandemic may have been disparately harmful instead of, or perhaps in addition to, being helpful at an aggregate level.
THE INVENTORY
In our initial phase of research, our team scoured the Federal Register, state websites, and reliable media sources to catalogue 226 policy decisions enacted across the US in response to COVID-19, categorized by sector, policy area, and policy action(s). The aim of this work was not to develop a comprehensive accounting of all the policy responses that exist, but rather to explore the landscape of decisions made, the data and tools used, and their possible impact on equity. As the project continues, and with contributions from the policy and research communities, we hope to expand this inventory into a rich and comprehensive dataset. We welcome additions of relevant policy decisions via the input form provided.
To facilitate analysis of the ways in which data or analytical tools have been used to make or implement these policy decisions, we have also captured stated data sources, referenced data fields or criteria used to justify or implement a policy decision, and any tools used. The primary data sources cited by each policy are further divided by category and by type. Each entry in the inventory is labeled along the following dimensions (a minimal sketch of one possible record structure follows the list):
- Sector. The government, nonprofit, or private sector that the policymaker belongs to. Categories are: Federal, Multi-State, State, Multi-Local, Local, Tribal, Non-Profit, and Private.
- Policy Area. The topic area or domain that the policy decision belongs to. Categories are: Housing, Public Health, Transportation, Employment, Criminal Justice, Economic Development, Food Security, Education, and General Social Services.
- Primary Goal. The explicit or implied goal of the policy decision. Categories are: Reduce Community Spread, Support Economic Recovery, and Support People in Need.
- Policy Actions. The mechanism or intervention being used to achieve the goals of the policy decision. Multiple may be selected. Categories are: Limit Mobility or Services, Shutdown, Reopen, Provide Individual Relief, Provide Business Relief, Provide Local Government Relief, Allocate Resources, Change Regulation, and Communicate to the Public.
- Primary Data Source Category. The domain of the primary data source used to make or administer the policy decision. Categories are: Mobility, Economic, Health, Demographic, Other, and Not Stated.
- Primary Data Source Type. The type of the primary data source. Categories are: Public Record, Public Administrative or Operational, Self-Reported, Industry, Academic, and Not Stated.
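To make these dimensions concrete, here is a minimal sketch, in Python, of how a single inventory record could be represented. The `PolicyRecord` class, its field names, and the encoded example entry are our own illustration based on the categories above, not a published schema.

```python
# A minimal, illustrative record structure for the inventory. Field names and
# category strings follow the dimensions listed above; nothing here is an
# official schema.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PolicyRecord:
    index: int                 # row identifier within the inventory
    description: str           # short summary of the policy decision
    sector: str                # e.g., "Federal", "State", "Local", "Tribal"
    policy_area: str           # e.g., "Housing", "Public Health", "Employment"
    primary_goal: str          # e.g., "Reduce Community Spread"
    policy_actions: List[str] = field(default_factory=list)  # one or more actions
    data_source_category: str = "Not Stated"  # e.g., "Economic", "Health"
    data_source_type: str = "Not Stated"      # e.g., "Public Record", "Industry"
    stated_criteria: Optional[str] = None     # criteria cited to justify/implement
    tools_used: Optional[str] = None          # any tool referenced, if stated

# Illustrative encoding of the Pandemic Unemployment Assistance entry (Index 25);
# the field values are our reading of the policy, not authoritative labels.
pua = PolicyRecord(
    index=25,
    description="Pandemic Unemployment Assistance under the CARES Act",
    sector="Federal",
    policy_area="Employment",
    primary_goal="Support People in Need",
    policy_actions=["Provide Individual Relief", "Change Regulation"],
)
```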
OBSERVATIONS
Although our inventory is not a comprehensive representation of all the ways in which policymakers have responded to the COVID-19 pandemic, we believe that there are important insights to be gleaned from this initial set of 226 policies. When it comes to mitigating the possibility of inequitable outcomes, identifying interesting or even concerning patterns that warrant additional investigation is an important first step.
With this in mind, we offer the following general observations of the COVID-19 policy response landscape as we have captured it.
1. A majority of policies mention using data or evidence to inform their decision-making process, but few provide complete information or sources of that evidence.
Overall, our team found it difficult to get a full picture of how a policy decision was made from publicly available information. Most policies were described as “data-driven” or “evidence-based,” but few offer detailed information regarding the evidence or data used to make the decision, let alone describe the process and criteria behind it.
While we found that policymakers generally cite the criteria used to justify or implement a policy decision, these criteria are often difficult to validate or apply without reference to an appropriate data source. For example, the federal Coronavirus Aid, Relief, and Economic Security (CARES) Act established the Pandemic Unemployment Assistance program to temporarily expand unemployment insurance eligibility to persons working in the informal job sector, including self-employed and part-time workers (Index 25). However, we wonder how labor officials actually implemented this program at the state level: unemployment insurance claim data typically do not cover these types of workers, nor does the CARES Act propose an alternative data source for identifying the universe of workers who could or should benefit from the program.
At the state and local levels, where many policy decisions focus on allocating resources directly to individuals, local businesses, and healthcare providers, we found references to data sources to be similarly missing. For example, the Governor of Pennsylvania provides detailed eligibility criteria for the CARES Act-funded small business relief grant program, including the requirement for businesses to demonstrate that “COVID-19 has had an adverse economic impact” (Index 192). However, grant funding is allocated through Community Development Financial Institutions, and our team was unable to determine how these institutions track funding awards or use data to verify eligibility. Without this information, it is difficult, if not impossible, for policymakers, analysts, and researchers to determine how these resources were allocated, in what amounts, and, most importantly, what impact they have had.
The lack of publicly available information about the data used to inform or implement these policies is likely to be a significant barrier to effective auditing and evaluation. When it comes to mitigating potential biases, not knowing a policy's goal or purpose, or how data were used to inform, justify, or implement the decision, hinders our ability to identify where in the policy-making process an intervention is needed to proactively address downstream issues. From a different perspective, if policymakers were to provide comprehensive information about their successful policy innovations, others could learn from them and adopt or adapt good ideas.
Despite these difficulties, our team did find some examples of well-documented decisions, particularly among policies enacted after the pandemic's first wave. We suspect that both data use and transparency about the decision-making process have increased over time, as the coronavirus becomes less novel and more complete and reliable data are made available to policymakers. The most transparent examples our team found were state reopening plans, which often provide a detailed enumeration of the processes, criteria, and data sources being used by state officials to ease restrictions. Many states, including Michigan and Missouri, have also created reopening data dashboards that publish a variety of health and economic indicators for public consumption (Indices 156 and 159).
2. Data-driven tools are generally being used to implement policy responses, rather than to inform their design.
Our team observed that a variety of tools, including artificial intelligence (AI) and machine learning (ML) applications, are being used by policymakers to respond to the pandemic. However, the majority of the policies we examined that cite the creation or operationalization of a tool appear to do so as a method for implementing a policy rather than for informing its design and the broader decision process around it. These tools generally apply technology to enhance the efficiency of a response, such as using an AI chatbot to communicate information about the virus to the public (Indices 76, 91, and 157) or using data processing technologies to transition existing public services to remote delivery (Indices 90 and 97). One such example is the Orleans Parish Communication District in Louisiana, which adopted a cloud-based emergency communications platform to connect 911 callers to first responders via video chat (Index 90).
We believe that evidence-based methods, including AI and ML, hold much promise for public sector resource prioritization, and were disappointed to find that these tools rarely seem to be used to inform decision processes or to design policies. One exception is epidemiological models, which at the state and local levels seem to be referenced by policymakers to inform a wide variety of public health decisions. However, many of these modeling applications also lack sufficient documentation, as the methodologies and specific data sources used to generate the models are not always provided. A notable counterexample is the Colorado COVID-19 Modeling Group, which has published detailed documentation of the assumptions and formulation of its deterministic susceptible, exposed, infected, recovered (SEIR) model (Index 123).
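For readers unfamiliar with this class of model, here is a minimal sketch of a deterministic SEIR model of the general kind referenced above, integrated with a simple forward-Euler step. The parameter values are hypothetical placeholders, not the Colorado group's published formulation.

```python
# A minimal deterministic SEIR model: susceptible -> exposed -> infected -> recovered.
# All parameters below are hypothetical placeholders for illustration only.

def seir(population, beta, sigma, gamma, days, dt=0.1, initial_infected=10):
    """Integrate the SEIR equations with a forward-Euler step.

    beta:  transmission rate per day (contact rate * infection probability)
    sigma: 1 / mean incubation period in days
    gamma: 1 / mean infectious period in days
    """
    s, e, i, r = population - initial_infected, 0.0, float(initial_infected), 0.0
    trajectory = [(0.0, s, e, i, r)]
    for step in range(1, int(days / dt) + 1):
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        trajectory.append((step * dt, s, e, i, r))
    return trajectory

# Hypothetical run: R0 = beta / gamma = 2.5, 5-day incubation, 7-day infectious period.
history = seir(population=5_000_000, beta=2.5 / 7, sigma=1 / 5, gamma=1 / 7, days=180)
peak_day = max(history, key=lambda row: row[3])[0]
print(f"Infections peak around day {peak_day:.0f}")
```

Documentation like the Colorado group's matters precisely because each of these parameters embeds an assumption that can materially change the policy guidance a model produces.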
In contrast to what we observed in our inventory of policy decisions, additional research by our team suggests that a wide variety of COVID-specific AI and ML applications have been developed or proposed by researchers. In addition to enabling automated contact tracing (Indices 151 and 214) and evaluating the effectiveness of interventions (Index 78), these applications span uses much broader than those indicated in our inventory, including enabling new diagnostic methods, estimating the virus's reproduction rate, and predicting the future spread of infections. While it is possible that our team simply did not find many instances of these applications being used in practice, there seems to be a considerable gap between the tools and methods being developed by the research and tech communities and their actual adoption by policymakers and practitioners. Although there will certainly be variability in how well any given AI/ML approach might improve the response to a pandemic, we suspect there is real scope for these tools to inform more effective policymaking, and we expect that the research and development taking place during the COVID-19 pandemic may prove helpful in preparing for the next crisis.
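As a concrete illustration of one application type mentioned above, the following sketch estimates the effective reproduction number from a daily case series using the renewal-equation relationship R_t = I_t / sum_s(w_s * I_{t-s}), where w is a discretized generation-interval distribution. The case counts and interval weights are hypothetical placeholders, not drawn from any of the indexed applications.

```python
# Estimating the effective reproduction number R_t from daily case counts via
# the renewal equation. All numbers are hypothetical, for illustration only.
cases = [10, 12, 15, 20, 26, 33, 41, 50, 58, 64]  # hypothetical daily new cases
gen_interval = [0.10, 0.20, 0.30, 0.25, 0.15]     # hypothetical weights (sum to 1)

for t in range(len(gen_interval), len(cases)):
    # Expected cases today if R_t were exactly 1: the current infectiousness
    # contributed by recent cases, weighted by the generation interval.
    expected = sum(w * cases[t - s - 1] for s, w in enumerate(gen_interval))
    print(f"day {t}: R_t ~= {cases[t] / expected:.2f}")
```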
3. There are many points in the policy-making process where policymaker choices critically impact equity in outcomes.
Through this research, our team has realized that for as many ways as a policymaker might respond to an urgent crisis, there are at least as many pitfalls that could undermine the intended impact of the response. Some of these pitfalls are well known to researchers, such as failing to address persistent data quality issues or designing a system to optimize a metric that is misaligned with the outcome of interest. Other problems may be less obvious, especially those resulting from choices made in the early development or late implementation stages of the policy-making process. Often, as our team has observed, process, data, and modeling problems collide, resulting in complex and difficult-to-address sources of bias. As we collected and reviewed policies for our inventory, we thought critically about how equity issues could arise from the ways that data, criteria, and tools or models inform decision making, and identified several potential sources of bias that we feel demand careful examination.
As a starting point, we considered the potential impact of data quality issues on equity, as these are familiar problems for anyone who works with data. Our team suspects that data quality issues affect many of the pandemic response policies in our inventory, even though we cannot easily detect or measure them given the observed lack of information about data sources and usage. These issues may include lack of representativeness or coverage, mismeasurement or inaccuracies, and problems with the timeliness or granularity of reporting; they are most pernicious when they occur not randomly but systematically within subgroups of the data being used to make a decision.
To illustrate this problem, consider the large number of policy decisions made to address mass unemployment caused by COVID-19. At the state and local level, policymakers typically use unemployment insurance claims as a proxy for which working-eligible individuals are without a job, where they are located, and what types of work they did. However, these data include only those who file for unemployment insurance and do not represent every type of unemployed person in need of assistance. Those who are not eligible for benefits, such as undocumented, informal, or gig workers, and those who do not file, perhaps due to stigma, lack of knowledge, or lack of access resulting from technological or language barriers, are excluded from these data even though they may have great need for the resources being allocated on the basis of them. Furthermore, when a particularly novel and devastating event such as a pandemic occurs, the unemployed population may not simply grow in scale but also shift significantly in its underlying distributions. If policymakers continue to rely on historical unemployment records to inform policy decisions, those decisions may be subject to coverage issues in which particularly hard-hit groups, such as minorities, are underrepresented and ultimately under-assisted.
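A small simulation makes this coverage problem concrete. The group sizes and filing rates below are hypothetical illustrations, not estimates from any real claims data; the point is only that when filing rates differ systematically across groups, group shares in the claims data diverge from shares of true need.

```python
# Illustrative only: systematic differences in who files for unemployment
# insurance distort the apparent distribution of need across groups.
true_unemployed = {"covered_workers": 50_000, "gig_informal_workers": 30_000}

# Hypothetical filing rates: covered workers usually file; ineligible or
# deterred workers rarely appear in the claims data at all.
filing_rate = {"covered_workers": 0.90, "gig_informal_workers": 0.10}

observed_claims = {g: n * filing_rate[g] for g, n in true_unemployed.items()}

for group, n_true in true_unemployed.items():
    true_share = n_true / sum(true_unemployed.values())
    claims_share = observed_claims[group] / sum(observed_claims.values())
    print(f"{group}: {true_share:.0%} of the truly unemployed, "
          f"but {claims_share:.0%} of observed claims")

# With these made-up numbers, gig/informal workers are 38% of the truly
# unemployed but only about 6% of claims; resources allocated in proportion
# to claims would systematically under-assist them.
```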
Our team also considered the ways in which a policymaker’s choice of metrics to drive policy implementation may impact equity. It is important that any selection criteria, thresholds, or eligibility requirements used to administer a policy or program align with the problem at hand, the amount of resources available for intervention, and the incentives surrounding participation or compliance. Use of an inappropriate metric or set of criteria may result in a policy impacting the wrong group, or, potentially, influencing behavior in a way that undermines the fairness of the impact.
As an example, consider housing assistance programs established in response to the widespread economic impacts of COVID-19. Many states, including our resident state of Pennsylvania, implemented rent, mortgage, and home energy assistance programs to allocate small-sum grants to households with demonstrated need. One definition of need frequently used by state policymakers is household income at or below 80% of Area Median Income, the federal definition of “low income.” While eligibility thresholds for longstanding, large-budget federal public housing programs are generally set using both Area Median Income and Fair Market Rent estimates, many of the state and local COVID-19 housing assistance programs our team examined seem to use only Area Median Income to set eligibility thresholds. Without accounting for the range of living costs within an area, these policies may inadvertently exclude households with disproportionately high costs of living from receiving needed assistance. It is also important to consider the level of aggregation at which metrics such as income limits are calculated: applying eligibility thresholds calculated at the county level rather than the more granular tract level, for example, may lead to significant differences in the population of eligible households.
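The aggregation-level point lends itself to a toy calculation. The incomes and two-tract county below are synthetic; the sketch shows only that moving a hypothetical 80%-of-median income limit from the county level to the tract level changes which households qualify.

```python
# Illustrative only: the geographic level at which an income limit is computed
# changes the set of eligible households. All incomes are synthetic.
from statistics import median

county_tracts = {
    "tract_A": [28_000, 35_000, 41_000, 52_000],   # lower-income tract
    "tract_B": [60_000, 75_000, 90_000, 120_000],  # higher-income tract
}

# County-wide limit: 80% of the median income across all households.
all_incomes = [inc for incomes in county_tracts.values() for inc in incomes]
county_limit = 0.80 * median(all_incomes)

for tract, incomes in county_tracts.items():
    tract_limit = 0.80 * median(incomes)  # tract-level alternative
    eligible_county = sum(inc <= county_limit for inc in incomes)
    eligible_tract = sum(inc <= tract_limit for inc in incomes)
    print(f"{tract}: {eligible_county} of {len(incomes)} households eligible "
          f"under the county-level limit, {eligible_tract} under a tract-level limit")
```

In this toy county, the county-level limit concentrates eligibility in the lower-income tract, while tract-level limits qualify the relatively worst-off households in each tract; neither choice is automatically correct, which is why the choice deserves explicit attention.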
Lastly, we thought broadly about how design or process choices made by policymakers can lead to inequitable outcomes. In the rush to provide life-saving assistance, many policymakers implemented processes that allocate response resources on a first-come-first-served basis (including large-scale efforts like the Paycheck Protection Program), relying heavily on self-reported information provided by applicants through online portals. While administering critical policies in this way may offer the ease and speed of implementation that are attractive in a rapidly-evolving crisis, it is important to consider whether these methods support fair outcomes in practice. In particular, awareness of the opportunity, facility in navigating bureaucratic systems, and access to technology could all pose barriers that prevent already-vulnerable populations from receiving support through programs administered in this way.
A case in point is the process of distributing COVID-19 vaccines. Following federal guidance, most states have established vaccine rollout policies that include tiered prioritization of high-risk populations. However, we also observed that these high-risk individuals often bear the burden of finding, understanding, and meeting vaccine eligibility rules, in addition to needing to secure their own appointments via online scheduling. This process likely disadvantages two of the high-risk populations it is meant to serve: the elderly, who may have difficulty accessing the technology required to make a vaccine appointment, and immigrants, who may have difficulty navigating vaccination protocols in a language other than their own. Failing to consider these kinds of barriers to access when designing how a policy decision will be implemented can increase the risk of disparate outcomes.
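To make the first-come-first-served concern concrete, the following sketch simulates two equally needy applicant groups, one of which (hypothetically) faces technology or language barriers that delay its applications. Group sizes, delay distributions, and the supply level are invented for illustration.

```python
# Illustrative only: first-come-first-served allocation when groups differ in
# how quickly they can complete an application.
import random

random.seed(0)
SUPPLY = 500  # hypothetical number of grants or appointments available

# Two equally needy groups; the second (hypothetically) files more slowly due
# to technology or language barriers. Times are days until a completed application.
applicants = (
    [("low_barrier", random.gauss(2.0, 1.0)) for _ in range(600)]
    + [("high_barrier", random.gauss(5.0, 2.0)) for _ in range(600)]
)

# Serve applicants strictly in order of arrival until the supply runs out.
served = [group for group, _ in sorted(applicants, key=lambda a: a[1])[:SUPPLY]]

for group in ("low_barrier", "high_barrier"):
    print(f"{group}: {served.count(group) / SUPPLY:.0%} of awards, "
          f"despite being 50% of applicants")
```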
NEXT STEPS
Our goal in developing this inventory was to map and understand the wide-ranging landscape of policy decisions being made in response to the pandemic, with a particular focus on uses of data and technology and their effects on the disparate impact we have witnessed across communities. Through our initial review of this policy landscape, we identified several avenues for additional work which could increase the capacity of policymakers and researchers to efficiently respond to crises while taking the equity of outcomes into account.
First, additional primary source research is needed to understand the extent to which data has actually been used to design or implement policies in response to COVID-19. We suspect that many common economic, health, and mobility data sources were used to make a wide range of policy decisions; however, as we have noted, these sources are generally not listed publicly. To gather additional information, we have reached out to policymakers, analysts, and researchers in both government and academia for interviews focused on data use and decision-making during the pandemic. We hope that these conversations will fill some of the gaps in our understanding, as well as facilitate further study of the equity implications of individual uses of data. Likewise, we believe the inventory we present here can serve as a helpful starting point for other researchers focused on specific decisions or policy areas to dive deeper and contribute to the growing body of knowledge about policymaking in the context of a widespread and rapidly-developing crisis like the COVID-19 pandemic.
Second, there is considerable work to be done understanding the gap between AI and ML tool development and adoption. For both policy and tech researchers, it will be valuable to take a closer look at these tools and models to investigate where and to what extent they might have been able to improve the pandemic response, had adoption been higher. For policymakers, a close evaluation of this toolkit as the urgency of the crisis begins to recede might help improve day-to-day processes in general or at least provide a set of tools that can be quickly deployed when they are faced with the next crisis.
Third, if many policymakers are, in fact, using common economic, health, and mobility data sources to make a wide range of decisions, there is a need to document the limitations of these sources and to identify the equity implications of using them for specific policy decisions. In the midst of an urgent crisis, policymakers need ready access to information about potential issues with the data, metrics, and models available to them, so that they can consider bias as a factor influencing the ultimate outcomes of their decisions. After gathering more detailed information from interviews, our team hopes to develop an equity analysis framework that facilitates the preparation and use of this information.
We started this work not only to diagnose common policy choices that may lead to disparities but also to develop practical mitigation strategies that policymakers can employ. While there may be a perceived tension between the urgency of distributing critical resources and the necessity of considering implications for equity, we believe that careful planning and preparation can give policymakers a wider range of options for responding to crises as they emerge while ensuring that equity goals are not left behind.
We hope that the resources we develop will prove useful to policymakers and other researchers hoping to better understand what happened during the COVID-19 pandemic and be better prepared for the next one. As such, we see this inventory as a living document and intend to continue to update and improve on it as we (along with the community) learn more, and very much encourage input, feedback, and additions to our initial work described here.
What’s next? Over the next few weeks, we will be publishing the second part of this work focusing on the biases in commonly used unemployment, health, and mobility data sources. We will also be proposing a framework that policymakers can use to analyze the potential equity impacts of these biases to support them in making more informed and more equitable decisions.
About the team:
Carly Jones is a second-year Master’s student at Heinz College studying public policy and data analytics. Through her research and professional work, she hopes to equip policymakers with practical tools that support responsible data-driven decision-making.
Nick Chan is a second-year dual-degree Master's and Juris Doctor student at Heinz College and Pitt Law studying public policy and law. He hopes to examine the intersections of law, data, and public policy to advocate for equitable outcomes.
Tobi Jegede is a first-year Master's student at Heinz College studying public policy and data analytics. Her academic and professional goals include working on projects that leverage the power of data science to shed light on, and provide solutions to, social problems.
Emily Reece is a second-year Master’s student at Heinz College studying public policy and data analytics. She is passionate about using data analytics and technology to improve the policymaking process through evidence-based and cost-effective solutions.
Kit Rodolfa is a Senior Research Scientist at Carnegie Mellon University, working with Professor Rayid Ghani at the intersection of machine learning and public policy on using these methods to benefit society. His research interests include the bias, fairness, and interpretability of machine learning methods.
Rayid Ghani is a Distinguished Career Professor at the Machine Learning Department and the Heinz College of Information Systems and Public Policy at Carnegie Mellon University, working at the intersection of AI, Machine Learning, Social Good and Public Policy. His work, in collaboration with government agencies and non-profits, is focused on building data-driven systems to help make decisions in various policy areas that lead to fair and equitable outcomes.
How to contribute to the inventory:
Please fill out a COVID-19 Response Policy Inventory input form with complete information about any policy decision you would like to have added to the inventory. Our team will review inputs periodically and make changes to the published inventory. All references should be publicly available.