Biomedical signal processing

The open-source electrophysiological toolbox (OSET)

Cardiovascular health

OSET is a codebase that aims to advance biomedical informatics and engineering through open-source software development, with a particular focus on biomedical signal processing. Conceived in 2006 to bridge the gap between advanced data-driven methods and domain-specific expertise, OSET builds on well-designed, science-driven methodologies for physiological time-series analysis. Its functionality began with toolsets for electrocardiogram data analysis and was gradually extended to electroencephalogram and phonocardiogram data, which are among the most commonly acquired time-series in biomedical applications. OSET also features biosignal modeling tools for data augmentation when training machine learning and deep learning models, as well as generic biosignal processing tools for denoising and curating all types of physiological time-series recordings. The codebase is primarily developed in MATLAB, with partial functionality in C++ and, more recently, Python. It operates under a permissive open license, encouraging community-driven development. This approach provides a transparent means of implementing physiological signal processing pipelines and offers standardized benchmarking for research and development. Over the years, several OSET modules have been translated into Python and C/C++ and integrated into medical devices and cloud-based diagnostic software for managing large datasets.
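
To make the scope concrete, here is a minimal, hedged sketch of a generic ECG denoising step of the kind such a toolbox provides; it uses SciPy rather than OSET's own functions, and the sampling rate, cutoff, and notch frequency are illustrative assumptions.

```python
# Illustrative only (not OSET's API): a generic ECG denoising step of the kind
# such a toolbox provides, written with SciPy. Sampling rate, cutoff, and notch
# frequency are assumed values.
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def denoise_ecg(ecg, fs=500.0):
    """Suppress baseline wander and powerline interference in a 1-D ECG."""
    # High-pass at 0.5 Hz to remove baseline wander.
    b_hp, a_hp = butter(2, 0.5 / (fs / 2), btype="highpass")
    ecg = filtfilt(b_hp, a_hp, ecg)
    # Notch at 60 Hz (assumed powerline frequency).
    b_n, a_n = iirnotch(60.0, Q=30.0, fs=fs)
    return filtfilt(b_n, a_n, ecg)

if __name__ == "__main__":
    fs = 500.0
    t = np.arange(0, 10, 1 / fs)
    # Synthetic test input: beat-like oscillation + slow drift + 60 Hz interference.
    raw = np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 0.2 * t) + 0.1 * np.sin(2 * np.pi * 60 * t)
    clean = denoise_ecg(raw, fs)
```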

ECG-Image-Kit

With the shift from printed to digital formats, there is a pressing need to convert paper and scanned ECG archives into time-series data for machine learning (ML) model training: most ECG analysis models operate on digitized waveforms rather than scanned images, paper records are at risk of physical deterioration, and modern models require data at scale.

To overcome these challenges, we have developed ECG-Image-Kit, a toolset for creating synthetic ECG images that mimic real-world imperfections, such as wrinkles and handwritten notes on paper-like backgrounds, while ensuring no personally identifiable information is included. These images enhance the training and evaluation of ML models for ECG digitization and analysis. ECG-Image-Kit offers tools for analyzing scanned ECGs, generating synthetic images, and converting them into digital data. We aim to improve ECG digitization and computerized diagnosis through advanced deep learning and image processing techniques. The toolkit is used for data augmentation in the 2024 PhysioNet Challenge (a minimal rendering sketch follows the references below). Read more here:

  • Kshama Kodthalu Shivashankara, Deepanshi, Afagh Mehri Shervedani, Gari D. Clifford, Matthew A. Reyna, Reza Sameni (2024). A Synthetic Electrocardiogram (ECG) Image Generation Toolbox to Facilitate Deep Learning-Based Scanned ECG Digitization. doi: 10.48550/ARXIV.2307.01946

  • ECG-Image-Kit: A Toolkit for Synthesis, Analysis, and Digitization of Electrocardiogram Images (2024). URL: https://github.com/alphanumericslab/ecg-image-kit
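
As a rough illustration of the underlying idea (not the ECG-Image-Kit API), the sketch below renders a digital ECG trace onto an ECG-paper-style grid with matplotlib; the sampling rate, grid spacing, and placeholder waveform are assumptions for demonstration only.

```python
# Illustrative sketch only (not the ECG-Image-Kit API): render a digital ECG
# trace onto a paper-like grid image, the basic step behind synthetic ECG images.
import numpy as np
import matplotlib.pyplot as plt

fs = 500.0                                  # assumed sampling frequency (Hz)
t = np.arange(0, 2.5, 1 / fs)               # a 2.5 s strip
ecg = 0.1 * np.sin(2 * np.pi * 1.2 * t)     # placeholder waveform; real use would load a signal

fig, ax = plt.subplots(figsize=(10, 2), dpi=200)
ax.set_facecolor("#fdf3f3")                 # paper-like background tint
# Standard ECG paper: small boxes every 0.04 s / 0.1 mV, large boxes every 0.2 s / 0.5 mV.
ax.set_xticks(np.arange(0, 2.5, 0.2)); ax.set_yticks(np.arange(-1.0, 1.0, 0.5))
ax.set_xticks(np.arange(0, 2.5, 0.04), minor=True)
ax.set_yticks(np.arange(-1.0, 1.0, 0.1), minor=True)
ax.grid(which="major", color="#f4a3a3", linewidth=0.8)
ax.grid(which="minor", color="#f8caca", linewidth=0.3)
ax.plot(t, ecg, color="black", linewidth=0.9)
ax.set_xlim(0, 2.5); ax.set_ylim(-1.0, 1.0)
fig.savefig("synthetic_ecg_image.png", bbox_inches="tight")
# Distortions such as wrinkles, rotations, and handwritten text would then be
# composited onto the saved image with standard image-processing operations.
```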

Electrocardiogram (ECG) analysis

ECG signal processing

In a continuum of research, our team has pioneered model-based cardiac signal processing, with applications in adult and fetal electrocardiography, phonocardiography, long-term Holter monitoring, and wearable devices. Highlights of this research, which over several years established a new track in biomedical cardiac signal processing, include:
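
As a technical aside, the sketch below simulates the well-known sum-of-Gaussians dynamical ECG model introduced by McSharry et al., which much model-based cardiac signal processing builds on; the P/Q/R/S/T parameters and sampling settings are illustrative, not values from our publications.

```python
# A minimal sketch of the sum-of-Gaussians dynamical ECG model (McSharry et al.),
# a common starting point for model-based cardiac signal processing. The P/Q/R/S/T
# parameters below are illustrative defaults, not values from our publications.
import numpy as np

def synthetic_ecg(fs=500.0, heart_rate=60.0, duration=10.0):
    omega = 2 * np.pi * heart_rate / 60.0                              # cardiac phase velocity
    theta_i = np.array([-np.pi/3, -np.pi/12, 0.0, np.pi/12, np.pi/2])  # P, Q, R, S, T centers
    a_i     = np.array([ 1.2, -5.0, 30.0, -7.5, 0.75])                 # wave amplitudes
    b_i     = np.array([ 0.25, 0.1, 0.1, 0.1, 0.4])                    # wave widths
    dt, n = 1.0 / fs, int(duration * fs)
    theta, z, ecg = -np.pi, 0.0, np.zeros(n)
    for k in range(n):
        dtheta = np.mod(theta - theta_i + np.pi, 2 * np.pi) - np.pi    # wrapped phase distance
        dz = -np.sum(a_i * dtheta * np.exp(-dtheta**2 / (2 * b_i**2))) - z
        theta = np.mod(theta + omega * dt + np.pi, 2 * np.pi) - np.pi  # advance the phase
        z += dz * dt                                                   # Euler integration
        ecg[k] = z
    return ecg

ecg = synthetic_ecg()    # 10 s of synthetic ECG at 500 Hz
```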

Multimodal cardiac monitoring

Cardiovascular health

The EPHNOGRAM project aimed to develop a low-cost, low-power device for simultaneous electrocardiogram (ECG) and phonocardiogram (PCG) recording, incorporating additional channels for environmental audio to enhance PCG through active noise cancellation. The objective was to study the multimodal electro-mechanical activities of the heart, providing insights into the differences and synergies between these modalities during various cardiac activity levels. To date, we have collected an open-access simultaneous ECG and PCG dataset from young, healthy adults during a stress test.

Funded by the American Heart Association, we are using this dataset to advance the field of cardiovascular diagnostics by developing a novel sensor fusion technology that integrates ECG and PCG to simultaneously monitor the electrical and mechanical activities of the heart. By leveraging state-of-the-art machine learning and time-series analysis, we aim to create multimodal biomarkers that enhance diagnostic precision beyond the capabilities of individual modalities. This project addresses significant gaps in current clinical practices and knowledge, especially the limited use of multimodal data in cardiovascular research and its potential in low-resource and ambulatory settings. We aim to develop robust algorithms to extract and fuse ECG and PCG data, investigate the interaction between the heart’s electrical and mechanical functions, and explore the generalizability of this approach to other cardiac modalities. The success of this research could lead to breakthroughs in understanding heart function and improving cardiovascular disease (CVD) diagnostics, which would be particularly beneficial for ambulatory monitoring and in resource-limited environments. Read more:
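
As a simplified, hypothetical example of the kind of ECG-PCG fusion biomarker this project investigates, the sketch below estimates per-beat intervals from each ECG R-peak to the onset of the first heart sound (S1) in the PCG envelope; the peak detector and onset rule are naive assumptions, not the project's algorithms.

```python
# Illustrative only: a simple ECG-PCG fusion measure — the per-beat interval from
# each ECG R-peak to the onset of the first heart sound (S1) in the PCG envelope.
# The detectors below are simplistic placeholders for demonstration.
import numpy as np
from scipy.signal import hilbert, find_peaks

def r_to_s1_intervals(ecg, pcg, fs):
    """Return R-peak-to-S1 intervals (seconds) for synchronized ECG/PCG channels."""
    r_peaks, _ = find_peaks(ecg, distance=int(0.4 * fs), prominence=np.std(ecg))
    envelope = np.abs(hilbert(pcg))                     # PCG energy envelope
    intervals = []
    for r in r_peaks:
        window = envelope[r:r + int(0.2 * fs)]          # search 200 ms after the R-peak
        if window.size == 0:
            continue
        s1_onset = r + int(np.argmax(window > 0.5 * window.max()))
        intervals.append((s1_onset - r) / fs)
    return np.array(intervals)
```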

Electroencephalogram (EEG) analysis

The EEG phase, including the instantaneous phase (IP) and instantaneous frequency (IF), has emerged as a rich complement to the EEG spectrum, essential for understanding phenomena such as phase coupling and phase resetting. We have made significant contributions to EEG analysis, demonstrating through theoretical proof that the common EEG phase calculation method, widely documented in the literature, is highly susceptible to noise and to minor variations in algorithmic parameters. To address this, our team developed a robust Monte Carlo algorithm for EEG phase calculation and successfully applied it in brain-computer interface applications. The source code of this project is available online for public use. Additionally, we have developed algorithms and software to remove electrooculogram (EOG) artifacts from multichannel EEG recordings, complementing classical ICA-based methods, which often face challenges in automation and performance consistency. Read more:
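
For reference, the sketch below implements the conventional instantaneous phase and frequency pipeline (narrow-band filtering followed by the Hilbert analytic signal), i.e., the common approach whose noise sensitivity our work analyzes; it is not our robust Monte Carlo method, and the alpha-band limits are illustrative.

```python
# The conventional EEG instantaneous phase/frequency pipeline: narrow-band
# filtering followed by the Hilbert analytic signal. This is the standard method
# analyzed in our work, not our robust Monte Carlo algorithm.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def instantaneous_phase_freq(eeg, fs, band=(8.0, 12.0)):
    """Return instantaneous phase (rad) and frequency (Hz) in a narrow band (alpha by default)."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="bandpass")
    narrow = filtfilt(b, a, eeg)
    analytic = hilbert(narrow)
    ip = np.unwrap(np.angle(analytic))                  # instantaneous phase
    i_f = np.diff(ip) * fs / (2 * np.pi)                # instantaneous frequency
    return ip, i_f
```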

Noninvasive fetal electrocardiography

Since 2005, our team has made significant contributions to various aspects of fetal cardiac monitoring, employing noninvasive modalities such as the fetal electrocardiogram, magnetocardiogram, phonocardiogram, and Doppler ultrasound. These efforts have culminated in a US patent utilized in an FDA-approved fetal ECG monitor produced by MindChild Medical Inc., a series of publications, and open-access datasets and code. Recent scientific advances from the team include: a) the implementation of online fetal ECG (fECG) extraction using online source separation algorithms; b) the use of fECG to estimate and track fetal movements and rotations relative to maternal body coordinates; c) noninvasive fECG extraction from low-rank (as few as a single channel) and time-varying mixtures; and d) the development of a novel semi-blind source separation algorithm for fECG extraction in the presence of nonstationary noise and irregular maternal beats.
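
As a generic illustration of the source separation idea (not the patented or semi-blind algorithms described above), the sketch below applies scikit-learn's FastICA to multichannel abdominal recordings; the channel layout and number of sources are assumptions.

```python
# Illustrative only: a generic ICA-based maternal/fetal source separation step
# using scikit-learn's FastICA, not the patented or semi-blind algorithms above.
import numpy as np
from sklearn.decomposition import FastICA

def separate_sources(abdominal, n_sources=4):
    """abdominal: array of shape (n_samples, n_channels) of maternal abdominal ECG leads."""
    ica = FastICA(n_components=n_sources, whiten="unit-variance", random_state=0)
    sources = ica.fit_transform(abdominal)      # estimated independent components
    # In practice, the fetal component is then identified by its faster, weaker
    # QRS activity relative to the maternal component, and post-filtered.
    return sources
```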

Noninvasive fetal magnetoencephalography

Congenital heart defects

Using advanced multichannel signal processing techniques, we have been able to extract and study the fetal magnetoencephalogram (MEG) from noninvasive signals recorded by superconducting quantum interference device (SQUID) systems. Our contribution to this project has been on the signal processing side; the datasets were provided by our colleague Prof. Dirk Hoyer of the Biomagnetic Center in Jena, Germany. Read more:


Public health

Demography-aware blood pressure monitoring

Blood pressure monitoring

Regular blood pressure (BP) monitoring is crucial for managing cardiovascular diseases, yet concerns persist about inaccuracies due to device errors and biases against under-represented patient demographic groups. In a series of ongoing studies, we are investigating how advanced AI and machine learning models, enhanced by large language models trained on clinical literature and notes, can improve the accuracy of BP measurements and remove demographic biases. Our analysis of over 90 million patient encounters at Emory Healthcare, involving more than 3.4 million unique patients, reveals significant demographic variations in BP, underscoring the need for personalized healthcare approaches. Additionally, we have developed Bayesian estimation models to address BP measurement biases caused by respiratory effects and device-specific errors, aiming to refine BP monitoring technologies for better clinical outcomes. More recently, we have been building blood-pressure-specific natural language processing (NLP) and large language model (LLM)-based tools. Read more:
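
As a toy illustration of the Bayesian idea, the sketch below fuses repeated cuff readings with a Gaussian prior while accounting for a known device error variance; the prior and variance values are illustrative assumptions, not estimates from our studies.

```python
# A minimal sketch, under simplifying Gaussian assumptions, of bias-aware Bayesian
# BP estimation: fuse repeated cuff readings with a prior while accounting for a
# known device error variance. All numbers below are illustrative, not study values.
import numpy as np

def posterior_bp(readings, prior_mean=120.0, prior_var=15.0**2, device_var=8.0**2):
    """Conjugate Gaussian update: returns posterior mean and variance of the true systolic BP."""
    readings = np.asarray(readings, dtype=float)
    n = readings.size
    post_var = 1.0 / (1.0 / prior_var + n / device_var)
    post_mean = post_var * (prior_mean / prior_var + readings.sum() / device_var)
    return post_mean, post_var

mean, var = posterior_bp([128.0, 133.0, 126.0])   # three repeated cuff readings
```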

Population-based surveillance of congenital heart diseases

Congenital heart defects

Congenital heart defects (CHDs) affect about 1% of births in the U.S. annually, ranging from severe anatomical defects requiring early surgery to more progressive conditions such as valve and coronary anomalies. Survival rates have improved significantly due to early detection, innovative surgeries, and collaborative public health efforts. Since 2012, the Emory Adolescent and Adult CHD Program, in partnership with the CDC and other institutions, has enhanced CHD surveillance, analyzed healthcare utilization, and informed healthcare policy to improve outcomes for people with CHD and address healthcare inequities. Since 2020, our lab has collaborated closely with Prof. Wendy Book and Emory's CHD program. We have recently extended this work to a longitudinal study of ECG data in CHD patients. Visit our detailed CHD webpage, and read more from our published research:

Epidemic disease spread modeling and non-pharmaceutical control

Pandemic modeling and control

During the COVID-19 pandemic, our lab contributed to the development of mathematical models for tracking the trend of pandemic spread in large populations and in built settings. We created novel algorithms for model-based prediction and optimal control of pandemics through non-pharmaceutical interventions (NPIs). Our team at Emory University, the Alphanumerics Team, participated in the XPRIZE Pandemic Response Challenge, which aimed to advance the forecasting and control of pandemics using such methods, and was recognized as one of the finalists. We also collaborated closely with the Simulation and Estimation of Epidemics with Algorithms (SEEPIA) research group at Grenoble Alpes University, France: a group of researchers with backgrounds in mathematics, control theory, signal processing, epidemiology, and research engineering, who combined their expertise to model and forecast the pandemic's spread under social and economic constraints.
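
As a minimal sketch of the kind of compartmental model used in this work, the code below simulates a discrete-time SEIR model whose contact rate is scaled by a non-pharmaceutical intervention signal; all parameter values are illustrative, not those of our published models.

```python
# A minimal sketch of NPI-controlled epidemic modeling: a discrete-time SEIR model
# whose contact rate is scaled by an intervention signal u(t) in [0, 1].
# All parameter values are illustrative, not those of our published models.
import numpy as np

def seir_with_npi(u, beta0=0.4, sigma=1/5.0, gamma=1/7.0, N=1e6, I0=10, days=180):
    """u: array of length `days` giving the NPI stringency (0 = none, 1 = full lockdown)."""
    S, E, I, R = N - I0, 0.0, float(I0), 0.0
    infectious = np.zeros(days)
    for t in range(days):
        beta = beta0 * (1.0 - u[t])               # NPIs reduce the effective contact rate
        new_exposed    = beta * S * I / N
        new_infectious = sigma * E
        new_recovered  = gamma * I
        S -= new_exposed
        E += new_exposed - new_infectious
        I += new_infectious - new_recovered
        R += new_recovered
        infectious[t] = I
    return infectious

# Example: no intervention for 60 days, then a moderate NPI for the rest of the horizon.
u = np.concatenate([np.zeros(60), 0.5 * np.ones(120)])
trajectory = seir_with_npi(u)
```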

Maternal-fetal healthcare in low-resourced settings

Maternal-fetal health

In collaboration with Prof. Gari Clifford and our extended research team at Emory University, the Emory Co-Design Lab, and the Guatemalan NGO Wuqu' Kawoq, we are advancing maternal and fetal mobile health technology in low-resourced settings. The AI-driven safe+natal system, led by Prof. Gari D. Clifford, Rachel Hall-Clifford, and Dr. Peter Rohloff, supports community healthcare workers in the highlands of Guatemala and has improved pregnancy and early childhood outcomes, leading to its adoption as standard care after a successful randomized controlled trial (RCT). The team is now expanding to new populations, working with Morehouse School of Medicine on the IMPROVE Initiative in Georgia and partnering with the Gates Foundation in sub-Saharan Africa. This project is supported by the NIH, Google.org, and the MacArthur Foundation. Read more:


Data challenges and crowdsourcing

Data challenges

PhysioNet Challenges

The George B. Moody PhysioNet Challenges are annual competitions that invite participants to develop automated approaches for addressing important physiological and clinical problems. Our lab collaborates closely with Dr. Matthew A. Reyna and Prof. Gari D. Clifford and contributes to the annual PhysioNet Challenges.

PhysioCrowd

PhysioCrowd aims to create a sustainable ecosystem for enhancing ECG-based diagnostics by integrating mobile and wearable device technology with advanced machine learning. The initiative addresses the gap between the overwhelming influx of biomedical data and the scarcity of expert analysis, a gap exacerbated by limited data diversity and limited expertise in algorithmic interpretation. PhysioCrowd proposes a crowdsourcing platform that combines human expertise with algorithmic annotations to develop, share, and use extensive ECG databases effectively. It emphasizes open licensing to ensure global accessibility, particularly benefiting under-resourced institutions and underserved populations. The platform is designed to support a collaborative environment for algorithm developers, clinicians, and researchers, fostering advances in cardiology diagnostics and promoting inclusive participation from a diverse range of contributors worldwide. PhysioCrowd will ultimately serve as a major hub for annotated physiological data, advancing open-source algorithm development and contributing broadly to global health improvements. Read more: PhysioCrowd ECG Annotator


Former projects

Portable monitors

Cardiovascular health

In our former lab at Shiraz University, our team prototyped several hardware devices for biomedical applications, including three-lead ECG Holter monitors, simultaneous ECG and PCG acquisition systems, and noninvasive infrared (IR) vein finders.

Computational hardware/firmware architecture design

Shiraz University FPGA board design

Hardware accelerators are at the heart of many machine learning and biomedical signal processing systems. In our former research lab at Shiraz University, our team contributed to the development of efficient computational firmware based on field-programmable gate array (FPGA) technology, with the goal of building firmware modules common to many machine learning and biomedical signal processing systems. Our contributions include:

  • FPGA-based linear and nonlinear filter units;
  • automated deep and shallow neural network architectures;
  • low-level toolboxes for matrix and vector manipulation on hardware;
  • automated mechanisms for porting state-space systems onto FPGA;
  • an automated mechanism for transforming recursive signal processing pseudo-code into FPGA-based modules.

The objective of this research was to develop an ecosystem of open-source firmware modules that can be integrated to build machine learning and signal processing hardware accelerators. Since FPGA technology is also used for prototyping application-specific integrated circuits (ASICs), the developed units can eventually be used to build customized machine learning chips. The FPGA hardware required for our firmware design and evaluation was also designed and manufactured by our team and used as trainer boards for an FPGA laboratory course developed and taught at Shiraz University from 2008 to 2018. Some of our scientific contributions in this area include:
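
As a simple software-level aside, the sketch below simulates a discrete-time state-space filter, the kind of recursive kernel these porting tools target, in floating-point Python; the hardware versions use fixed-point arithmetic, and the matrices here are arbitrary examples.

```python
# Illustrative only: a discrete-time state-space filter of the kind our automated
# tools map onto FPGA fabric, written in floating-point Python for clarity
# (hardware versions use fixed-point arithmetic). Matrices below are arbitrary.
import numpy as np

def state_space_filter(u, A, B, C, D):
    """Simulate x[n+1] = A x[n] + B u[n], y[n] = C x[n] + D u[n] sample by sample."""
    x = np.zeros(A.shape[0])
    y = np.zeros_like(u, dtype=float)
    for n, un in enumerate(u):
        y[n] = C @ x + D * un
        x = A @ x + B * un           # one multiply-accumulate pass per clock tick on hardware
    return y

A = np.array([[0.9, 0.1], [0.0, 0.8]]); B = np.array([1.0, 0.5])
C = np.array([1.0, 0.0]); D = 0.0
y = state_space_filter(np.random.randn(1000), A, B, C, D)
```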

Interpretive signal processing

Interpretive Signal Processing (ISP) is an ad hoc technique for customizing signal processing algorithms for non-numeric data, such as genomic DNA or protein sequences. Contrary to the conventional approach of encoding non-numeric data into numeric values and decoding the results, the main idea of ISP is to interpret signal processing algorithms as they are and to tailor analogous operators that manipulate non-numeric data directly. We have studied two cases of ISP in our previous research:
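
As a toy illustration of the general ISP idea (separate from the two cases above), the sketch below applies a mode ("majority vote") smoother directly to a DNA sequence, by analogy with a median filter, instead of first encoding the symbols numerically; the window length and example sequence are arbitrary.

```python
# A toy ISP-style operator: rather than mapping symbols to numbers, a smoother
# analogous to a median filter acts directly on a DNA sequence. Window length
# and the example sequence are arbitrary illustrations.
from collections import Counter

def symbolic_mode_filter(sequence, window=3):
    """Sliding-window mode ('majority vote') filter acting directly on symbols."""
    half = window // 2
    out = []
    for i in range(len(sequence)):
        segment = sequence[max(0, i - half): i + half + 1]
        out.append(Counter(segment).most_common(1)[0][0])
    return "".join(out)

print(symbolic_mode_filter("AACATTTGTT"))    # smooths isolated symbol flips
```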