The current COVID-19 pandemic has highlighted the long-standing disparities and inequities that exist in the health infrastructures of many countries around the world, including the United States. Vulnerable and marginalized populations in the United States continue to be disproportionately affected by the COVID-19 pandemic because of these disparities, which include, but are not limited to, lack of access to quality health facilities, adverse socioeconomic conditions, and racial discrimination. For example, the United States requires at least $4.5 billion more in funding to provide a minimum standard for foundational public health infrastructure. Access to and utilization of this infrastructure, however, vary widely by race and ethnicity, as Hispanic, African American, and American Indian/Alaska Native groups are more likely to be uninsured and/or to delay their medical care. Additionally, a recent New England Journal of Medicine (NEJM) article found that structural racism is significantly associated with poorer mental and physical health.

Moreover, the collateral consequences of COVID-19 for the economy and society may continue to exacerbate other social and economic disparities that disproportionately impact marginalized groups. In fact, the Journal of the American Medical Association has found that in the United States, economic and social disparities have widened in relation to socioeconomic factors. For example, Black and Latinx people have faced 50% more job loss than White people due to COVID-19 and are therefore more susceptible to housing instability.

These social and economic disparities constitute the “social determinants of health”: the conditions in which people are born, grow, work, live, and age. These determinants, which include poverty, homelessness, occupation, population density, food access, and many more, are inextricably linked to health conditions. They are shaped by both present and historical circumstances, and their interplay has led to profoundly uneven impacts on COVID-19 morbidity and mortality.

As the COVID-19 pandemic continues, more work needs to be done to elucidate these disparities, how and why they occur, and their impact on different aspects of society and on different groups of people. Through this datathon, we hope that you can use your findings to shed light on these problems and to work out how we may tackle the challenges that COVID-19 will present in the future.

The Research Problem:

COVID-19 has disproportionately affected ethnic and racial minorities. For instance, a report from the Centers for Disease Control and Prevention reveals that African American and Hispanic populations in the United States have been three times more likely to contract COVID-19 than White residents and nearly twice as likely to die from it.

Additionally, differential access to, and distribution of, essential health infrastructure and goods have shaped how different regions in the US have responded to the COVID-19 pandemic. They have also shaped how different populations have experienced the social and economic effects of COVID-19.

For this datathon, we would like you to explore the given data to 1) further understand how different factors have contributed to disparities in COVID-19 outcomes between different groups, and 2) further analyze how these disparities progress as COVID-19 continues. As a starting point, we ask you to consider the following questions when designing your experiments and interpreting your findings:

  1. What is the correlation between income levels and insurance costs? How does that affect access to healthcare? What is the distribution of medical insurance purchase across racial/ethnic minorities and economically disadvantaged groups?
  2. What is the correlation between education and COVID-19 cases? What is the gender/ethnicity/age distribution? If your findings reveal a trend, what hypothesis would explain it?
  3. What is the correlation between housing conditions and social distancing measures in areas populated by racial/ethnic minorities and economically disadvantaged groups?
  4. Dr. Eliseo Pérez-Stable, Director of NIH’s National Institute on Minority Health and Health Disparities (NIMHD), has cited language barriers as a prominent factor in health disparities. Approximately 5 to 10 percent of US adults do not speak English well and, according to him, disseminating information to this section of the population is challenging. What is the correlation between COVID-19 rates and areas with higher percentages of non-English-speaking citizens? What are the major racial/ethnic groups that comprise this population? What are possible ways in which this challenge can be addressed?
  5. What is the correlation between the number of ICU beds available and the coronavirus cases per state among the population aged 60+? How does that correlate with the death rate per state among this age group?
  6. What is the correlation between ethnicity and access to health facilities? How does that vary from state to state?
  7. What is the correlation between disability status and coronavirus cases per state? If a trend is found, is there a correlation between the number of cases and the availability of special-accommodation infrastructure?
  8. What is the correlation between race, age, and unemployment during the COVID-19 pandemic? How might you relate this to the health outcomes for different demographics?
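
Most of the questions above come down to measuring the correlation between two variables observed per state or per group. As a starting point, here is a minimal sketch in plain Python of two common measures: Pearson's r for linear association, and Spearman's rank correlation for monotonic association. The variable names and all numbers are made-up placeholders for illustration and are not taken from the datathon data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical per-state values (placeholders, NOT real statistics).
icu_beds_per_100k = [20.0, 25.0, 30.0, 35.0, 40.0]
cases_60plus_per_100k = [900.0, 850.0, 700.0, 650.0, 500.0]

print("Pearson r:   ", round(pearson(icu_beds_per_100k, cases_60plus_per_100k), 3))
print("Spearman rho:", round(spearman(icu_beds_per_100k, cases_60plus_per_100k), 3))
```

Comparing the two coefficients can help: a Spearman value much stronger than the Pearson value suggests a monotonic but non-linear relationship. When comparing states, using per-capita rates rather than raw counts keeps large populations from dominating the result.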

The Data:

The puzzle of the potential predictors and driving factors behind the rise in COVID-19 prevalence presents an excellent opportunity to apply data science techniques to a pressing health crisis. We invite you to find creative answers to the research questions above, but we also encourage you to ask, and begin answering, any other questions you find relevant.