Department of Health Informatics and Data Science and 

Center for Health Outcome and Informatics Research



“Making Free Text Reports Available for Researchers via a De-identification and Natural Language Processing Pipeline”


  Presented by:

Umit Topaloglu, PhD, FAMIA

Associate Professor, Wake Forest School of Medicine, Wake Forest University, Winston-Salem, NC

Abstract: With recent advances in predictive modeling and Machine Learning (ML) in healthcare, the availability and accessibility of biomedical data provide an exciting opportunity for researchers. However, because the majority of clinical data is unstructured, reproducible and scalable informatics tools that comply with regulations are needed. Dr. Topaloglu will present a set of tools that enable processing, de-identification, and extraction of concepts and named entities, along with a search tool that lets researchers access and search notes prior to IRB approval. These tools have been approved by the Privacy Board and the Compliance Office and implemented at Wake Forest Baptist Medical Center. Dr. Topaloglu will also discuss use cases for a newly developed specimen search tool for pathology and radiology reports. In addition, he will present ML-based natural language processing implementations for specific data marts and potential improvements via a federated learning paradigm.


When: Wednesday, March 31st, 11:00 am – 12:00 pm

Join via Zoom:  https://luc.zoom.us/j/83887402738



About the Speaker: Dr. Topaloglu is a biomedical informatician with 15+ years of research experience in semantic research data frameworks that include standards-based data collection, Natural Language Processing, and Machine Learning, as well as the data quality dimension of those endeavors. He has been working on the implementation of innovative, scalable clinical research informatics systems for the last 10+ years. He has developed unified research informatics solutions and strategies across the research spectrum, which have been used successfully by the National Children’s Study and many cancer and translational research studies. He is also co-leading the Oncology Domain Team at the National COVID Cohort Collaborative (N3C). He currently serves as the Associate Director for the Center for Biomedical Informatics, CTSI Informatics Program, and the Wake Forest Baptist Comprehensive Cancer Center.


Visit here to watch previous presentations and to find more information about future seminars.