Summer 2023 CIC Webinar Recap

Editor's note:

Guest Post: Anna Eggers

The twenty-sixth session of the COVID Information Commons (CIC) webinar series took place on July 26th, 2023. In this forum, leading COVID-19 scientists presented their current research on the global pandemic. 

Event moderators included Florence Hudson, Executive Director of the Northeast Big Data Innovation Hub at Columbia University and COVID Information Commons Principal Investigator (PI); Lauren Close, Operations & Communications Manager; and Emily Rothenberg, National Student Data Corps (NSDC) Program Manager.

The researchers presented a range of topics, including education, medication administration, machine learning, and more. Each touched on broader themes related to the COVID-19 pandemic. 

The session started with a presentation by Carlos Badenes-Olmedo of Universidad Politécnica de Madrid. Carlos discussed his research project, Drugs4Covid: Knowledge Graph about Drugs used in the Clinical Control of the Coronavirus.

Badenes-Olmedo and his team found that the large body of research on the drugs administered during the COVID-19 pandemic lacks clear connections between findings. Medical practitioners and researchers have therefore struggled to use this research to its full potential for clinical treatment. Using the CORD-19 dataset (one of the 80 datasets available through the CIC portal), which houses over 400,000 scientific articles about the COVID-19 pandemic, they created a workflow that turns scientific publications from their initial text files into knowledge graphs and easily accessible user interfaces. By extracting drug information with language models fine-tuned for identification and normalization, they were able to provide better access to the evidence and conclusions in scientific articles and improve understanding of the drugs being used to treat COVID-19.

A video of Carlos’ presentation can be found on the CIC website.


Next, Niu Gao from the Public Policy Institute of California presented her research on the Impact of COVID-19 on Science Education: Early Evidence from California. This project was funded by the NSF Division of Research on Learning.

Gao and her team raised questions about student performance in science education following reports of falling test scores in more heavily prioritized subjects such as English Language Arts (ELA). In survey responses from 213 districts in California, the team found that 62% of districts reported placing less emphasis on science education after the pandemic began. Support programs also differed sharply by subject: over 60% of districts offered small group instruction for Math/ELA, while only 25% offered it for science. When making educational recovery plans, over 80% of districts considered Math/ELA a high priority area, while nearly 40% of districts regarded science as a low priority or no priority at all.

A video of Niu's presentation can be found on the CIC website.


Next, Hong Qin from the University of Tennessee, Chattanooga presented his research project, Develop and Evaluate Computational Frameworks to Predict and Prevent Future Coronavirus Pandemics. This research is funded by the NSF Division of Computing and Communication Foundations.

Qin and his team created a deep learning model intended to predict the viability of potential future SARS-CoV-2 strains. Using the sequences of all known bat coronaviruses, they examine existing mutations and build a geospatial model intended to predict their recombination probability. A visual viral fitness landscape offers insight into the likelihood of different recombination events. The team hopes that this work on the pathological fitness of viruses will allow AI to predict and warn us of future coronavirus strains while also benefiting vaccine development. In the future, this technology could also be used to predict influenza strains. This research is funded by the NSF Predictive Intelligence for Pandemic Prevention (PIPP) program.

A video of Hong's presentation can be found on the CIC website.


Next, Evelyn Yemurai Zhou from the University of South Africa presented her research on Advances in Machine Learning Explainability to Contextualize Equity Market Sustainability in South Africa During the COVID-19 Era. She was the 2022 COVID Information Commons Undergraduate Student Paper Challenge 1st Place Winner.

Zhou sought to uncover the impact of the COVID-19 pandemic on the stock returns of the top 40 companies on the Johannesburg Stock Exchange in South Africa. She gathered closing stock prices from January 2017 to September 2022 and ran them through multiple regressions and machine learning models to assess the correlation between COVID-19 prevalence and stock market prices. Standard Bank was found to be negatively impacted by COVID-19, a result which can be attributed to the relief it offered on banking fees and loan payments during the pandemic. AngloGold Ashanti, a gold mining company, was positively impacted, as its precious materials remained a valuable commodity. Clicks, a pharmaceutical company, was also positively impacted, as it produced essential pharmaceuticals throughout the pandemic. Ultimately, Zhou found that traditional stock market models were insufficient when applied to developing economies and suggested that novel machine learning methods can provide more reliable results.

A video of Evelyn’s presentation can be found on the CIC website.


To finish the webinar, Xin Zan from the University of Florida presented her research on Data-driven Adaptive Testing Resource Allocation Strategies for Real-time Monitoring of Infectious Diseases. She was the 2022 COVID Information Commons Graduate Student Paper Challenge 3rd Place Winner.

Zan and her team recognized the growing challenge that unreliable and insufficient testing poses for global disease outbreaks, and how it impedes the analysis of those outbreaks. Given the limited testing options available, their goal was to find a test allocation strategy for quick detection of disease outbreaks. Understanding the dynamics of transmission and the health disparity caused by infectious diseases allows researchers to develop a physics-informed model of prospective infection risk. They can then prioritize sample populations for testing based on whether groups are found to be at higher risk of contracting the disease or of experiencing it severely. Simulations showed that the model was robust and highly accurate at early recognition of disease outbreaks.

A video of Xin’s presentation can be found on the CIC website.


Following the presentations, Florence Hudson, Lauren Close, and Emily Rothenberg hosted a Q&A session in which audience members could speak directly with the researchers, have their questions answered, and engage in enriching discussions.

A recording of this event is available on the Northeast Big Data Innovation Hub’s YouTube Channel and the COVID Information Commons website. The COVID Information Commons is an NSF-funded project brought to you by the Big Data Innovation Hubs, led by the Northeast Big Data Innovation Hub at Columbia University. 

We look forward to welcoming you to the next CIC Lightning Talks webinar! Please sign up for the CIC newsletter to be informed of future CIC events.


August 08, 2023