Chan Zuckerberg Initiative Grant Will Diversify Cancer and Normal Tissue Datasets, Improve AI Models

Summary
- People of African ancestry are underrepresented in biological datasets used to study cancer.
- A new award from the Chan Zuckerberg Initiative aims to help rectify this disparity.
- The international project, led by Sylvester researchers, will generate data to power AI-based models of disease.
A new $1.9 million grant awarded to researchers at Sylvester Comprehensive Cancer Center, part of the University of Miami Miller School of Medicine, will fund a project to increase the diversity of biological datasets used to study prostate, breast and gynecological cancers.
Data from the project, funded by the Chan Zuckerberg Initiative (CZI), will be used to improve artificial intelligence tools that model tumors.
The project itself will also be “democratized,” said principal investigator Sophia George, Ph.D., a molecular geneticist at Sylvester. Data will be shared with study participants, and researchers from Africa, the Caribbean and the U.S. will be trained in specimen collection, processing and analysis.

The output will be “one of the largest AI-ready, single-cell datasets capturing the genomic diversity of African ancestry populations,” said Vasileios Stathias, Ph.D., assistant director for data science at Sylvester. “I’m thrilled to be contributing.”
Implementing International Science
As part of the project, researchers will analyze samples of breast, prostate, uterine, ovarian and cervical tumors and normal tissue. More than 3,200 normal and tumor samples will be collected across 20 sites in Africa, the Caribbean and the Miami region, home to a diverse population of individuals with African ancestry.
Scientists will scrutinize the DNA and RNA of individual cells in the samples. DNA sequencing can reveal, for example, the mutations driving a tumor. RNA analyses can identify which genes are active.
The data will be integrated into existing public databases and the Sylvester Data Portal, the institution’s management and analysis platform for research data. Information on patient characteristics such as health history and genetic predisposition will also be uploaded.
The project will help correct a longstanding imbalance in the number of people of African ancestry represented in biological datasets.
The new data “will enable researchers to explore ancestral genomic differences at the cellular level and train next-generation AI models applicable to ethnically and demographically diverse populations,” said Dr. Stathias.
Outstanding research questions include why young Black men are at an elevated risk of aggressive prostate cancer and why Black women suffer disproportionately from aggressive forms of endometrial and breast cancer.
The new data may shed light on such questions by helping researchers understand the molecular and cellular basis of these cancers, and their links to ancestry, underlying health and other patient characteristics.
“Data from the single-cell research can help inform cancer diagnosis and prevention strategies for our patients and even personalized treatments in the future,” said Aisha Mustapha, M.D., M.P.H., the project’s site lead for Nigeria, based at the gynecologic oncology unit in the Department of Obstetrics and Gynecology at Ahmadu Bello University Teaching Hospital in Zaria.
Building Cancer Research Capacity
The new effort builds on another CZI-funded project led by Dr. George, the African Caribbean Cancer Consortium. That consortium has been collecting and analyzing normal breast and gynecological samples across several Caribbean and African countries.
“Participating in the single-cell study has helped us to build research capacity, improve data collection and facilitate collaboration and networking within Nigeria, across the African continent and in the United States,” said Dr. Mustapha.

Sample collection will now be expanded to tumor tissue and involve more research sites, which will send scientists to Sylvester for training.
It’s been “inspiring and motivating” to interact with such a diverse group with a variety of perspectives, added Dr. Stathias, who also works closely on the project with Stephan Schürer, Ph.D.
“This project represents a transformative opportunity to address disparities in cancer research by creating AI-ready datasets that reflect the genomic diversity of African ancestry populations,” added Dr. Schürer, associate director of data science at Sylvester and professor of molecular and cellular pharmacology at the Miller School. “By integrating cutting-edge AI methods with single-cell biology data, we aim to uncover new insights into cancer biology that will ultimately lead to more inclusive and effective treatments.”
Riding the AI Wave
The new grant showcases how artificial intelligence is rapidly changing scientific research, said Dr. George, an associate professor of obstetrics, gynecology and reproductive sciences in the Division of Gynecological Oncology at the Miller School.
“AI is allowing us to discover things that we wouldn’t necessarily discover on our own, because it’s going to find relationships between cell states, DNA, RNA and patient metadata” such as ancestry and lifestyle, she said. “The technology is changing how quickly we can analyze data and interpret the results that we have.”
The project will help ensure that AI models are built on data that represent the diversity of the human population.
“This will be an invaluable resource that will significantly accelerate the genomic AI efforts of the scientific community,” said Dr. Stathias.