Aim 1. Data Mining and Geospatial Analysis of ClinicalTrials.gov

01

Study Design

Aim 1 of this project systematically mined U.S.-based COVID-19 clinical trial data from ClinicalTrials.gov. The process began by filtering for relevant keywords, conditions, and MeSH terms related to COVID-19. Key attributes such as study type, phase, design, conditions, and interventions were extracted. Natural Language Processing (NLP) techniques were used to identify additional information on minority populations and social disparities.
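
The filtering and attribute-extraction steps described above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Trial` record, the `COVID_TERMS` list, and the function names are all assumptions, and the actual keyword and MeSH lists used in the project are not given on this page.

```python
from dataclasses import dataclass, field

# Hypothetical search terms; the project's real keyword/MeSH lists are not stated.
COVID_TERMS = ("covid-19", "covid19", "sars-cov-2", "2019-ncov")


@dataclass
class Trial:
    """Hypothetical, simplified view of one registry record."""
    nct_id: str
    title: str
    conditions: list = field(default_factory=list)
    mesh_terms: list = field(default_factory=list)
    study_type: str = ""
    phase: str = ""
    countries: list = field(default_factory=list)


def is_covid_trial(trial, terms=COVID_TERMS):
    """True if any search term appears in the title, conditions, or MeSH terms."""
    haystack = " ".join([trial.title, *trial.conditions, *trial.mesh_terms]).lower()
    return any(term in haystack for term in terms)


def select_us_covid_trials(trials):
    """Keep COVID-19 trials that list at least one U.S. location."""
    return [t for t in trials if is_covid_trial(t) and "United States" in t.countries]


def extract_attributes(trial):
    """Flatten the key attributes into one row for a tabular dataset."""
    return {
        "nct_id": trial.nct_id,
        "study_type": trial.study_type,
        "phase": trial.phase,
        "conditions": "; ".join(trial.conditions),
    }
```

Substring matching on lowercased text is the simplest possible filter; a production pipeline would likely need to handle synonyms and false positives more carefully.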

The data were organized into a COVID-19-specific dataset, linked to geospatial information, and managed using Python scripts and a data lake in Amazon S3. Secondary data from sources such as the American Community Survey and CDC WONDER were integrated for multivariate statistical testing of health equity and community-level variables, with a focus on communities underrepresented in COVID-19 research.
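
One way to picture the integration step is a county-level join between trial-site counts and secondary demographic data. The sketch below assumes each site has already been assigned a county FIPS code and that the demographic table is keyed the same way; the function names and the example values are illustrative assumptions, not project data.

```python
from collections import Counter


def sites_per_county(site_fips):
    """Count trial sites per county, given one FIPS code per site."""
    return Counter(site_fips)


def join_county_table(site_counts, county_demographics):
    """Attach site counts to every county in the demographic table.

    Counties with no sites are kept with n_sites = 0, since the absence
    of sites is exactly what a disparity analysis needs to see.
    """
    rows = []
    for fips, demo in sorted(county_demographics.items()):
        row = {"fips": fips, "n_sites": site_counts.get(fips, 0)}
        row.update(demo)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up values for illustration only (not ACS or project figures).
    counts = sites_per_county(["06037", "06037", "36061"])
    demographics = {
        "06037": {"pct_example_group": 0.10},
        "06073": {"pct_example_group": 0.20},
        "36061": {"pct_example_group": 0.30},
    }
    for row in join_county_table(counts, demographics):
        print(row)
```

Driving the join from the demographic table rather than from the site list is a deliberate choice here, so that counties without any trial sites still appear in the output.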

Geospatial analysis examined proximity to clinical trial sites to identify clusters of health equity interest. The project resulted in a continuously updated nationwide database of COVID-19 clinical trials, with all software made available as open source on GitHub.
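
As a rough illustration of a distance-to-site measure, the sketch below computes the great-circle (haversine) distance from a location to its nearest trial site. The page does not say which distance metric or software the project used, so the haversine formula and the function names here are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_site_km(point, sites):
    """Distance from `point` to the closest site; both are (lat, lon) tuples."""
    lat, lon = point
    return min(haversine_km(lat, lon, s_lat, s_lon) for s_lat, s_lon in sites)
```

For example, `nearest_site_km(county_centroid, site_coordinates)` would give one proximity value per county, which could then be compared against community-level variables.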

02

Key Findings

  • 37,043 U.S. sites for 1,654 COVID-19 clinical studies were listed on ClinicalTrials.gov, with 70% being interventional and over 30% of these being Phase 2 trials.

  • The Los Angeles and New York City areas were statistically significant hot spots across all study types.

  • Clinical study sites were more likely to be found in counties with higher proportions of Asian American (p<0.001) and Native American residents (p<0.001). 

  • Areas with greater concentrations of African American residents had significantly lower concentrations of observational (p<0.001) and government-sponsored (p=0.003) COVID-19 studies in the national analysis, and significantly lower concentrations of study sites in both Los Angeles (p<0.001) and New York (p=0.007).

03

Project Deliverables

  • Publication (Paper) - Demographic Disparities of COVID-19 Clinical Study Site Proximity in the United States: A Geospatial Analysis (Under Review) 

  • Conference Poster - Identifying Spatial Disparities in Online Discussions About Clinical Trials for COVID-19 Vaccines. APHA Annual Meeting & Expo 2022, Boston, Nov. 6-9.
