Natural Language Processing

Principal Investigator: John Pestian, PhD

Medical records include more than 50 different types of clinical annotations such as radiology reports, discharge summaries and surgical notes. Essential for delivery of care, this free-text also could be used by researchers in a variety of ways, for example in combination with genomic and proteomic data to make advances in personalized medicine. To date, however, studies of clinical free-text have been sparse, at best, for several reasons, including regulations requiring the data to be anonymized and the nature of the data itself. Redundant, lacking structure and filled with abbreviations, acronyms and jargon, clinical free-text is much more difficult to mine than structured data such as laboratory results.

John Pestian, PhD, and his research team are attempting to overcome these obstacles by using natural language processing (NLP). Specifically, the group is focused on developing and implementing neuro-cognitive algorithms that enable computers to understand the concepts and semantic relationships within clinical text. Already, the group has developed a tool that anonymizes free-text and has used this tool to create a corpus to support NLP research. The group's next steps include further annotating the existing corpus, developing a second corpus, and using these corpora to train new, memory-based text processing algorithms.

Encryption Broker and Ontologizer

To meet Health Insurance Portability and Accountability Act (HIPAA) and other regulatory requirements, Dr. Pestian and his group developed Encryption Broker, a tool that anonymizes and disambiguates clinical text without corrupting its meaning. The resulting text can then be fed into the group's ontologizer, a tool that creates conceptual maps based on the Unified Medical Language System (UMLS). By creating an automated solution to the problems of anonymizing and annotating clinical free-text, the group hopes to provide researchers, particularly those involved in personalized medicine, with new sources of knowledge that might lead to improved clinical therapies and prevention strategies. Investigators interested in using Encryption Broker or the ontologizer should contact Dr. Pestian.
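As a rough illustration of the two-stage idea described above (anonymize first, then map the cleaned text to concepts), here is a minimal Python sketch. It is not the group's Encryption Broker or ontologizer, whose internals are not described here. The regular expressions, the placeholder labels, the function names and the tiny term-to-concept table are all hypothetical; the concept identifiers are invented and are not UMLS identifiers. The sketch also leaves out disambiguation entirely.

```python
import re

# Hypothetical patterns for a few identifier types. Real de-identification
# needs far broader coverage (names, addresses, and so on).
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
    ("MRN", re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE)),
]

# Hypothetical term-to-concept table. The identifiers are made up for this
# sketch and are NOT real UMLS concept identifiers.
CONCEPTS = {
    "pneumonia": "CONCEPT_0001",
    "chest x-ray": "CONCEPT_0002",
}


def anonymize(text):
    """Replace each matched identifier with a bracketed placeholder label."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


def map_concepts(text):
    """Return (term, concept id) pairs for table terms found in the text."""
    lowered = text.lower()
    return sorted((term, cid) for term, cid in CONCEPTS.items() if term in lowered)


# Invented example note, not real patient data.
note = "Seen 03/14/2006, MRN: 123456. Chest x-ray shows pneumonia. Call 513-555-0100."
clean = anonymize(note)
concepts = map_concepts(clean)
print(clean)
print(concepts)
```

Running the anonymization step before the concept lookup means the mapping stage only ever sees placeholder labels in place of identifiers, which mirrors the order described in the paragraph above.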

Group Members

Dr. Pestian's collaborators include Christopher Brew, PhD, and DJ Hovermale of The Ohio State University; Wlodzislaw Duch, PhD, of Nicolaus Copernicus University in Torun, Poland; Kevin Cohen of the Center for Computational Pharmacology at the University of Colorado Health Sciences Center; Max Wiznitzer, MD, of the Case Western Reserve University School of Medicine; and Tracy Glauser, MD, Robert Kowatch, MD, Pawel Matykiewicz, Todd Nick, PhD, Cindy Prows, MSN, RN, Shannon Saldana, PharmD, Randy Sallee, MD, PhD, Malik Spencer, Erica Steckl, Sander Vinks, PharmD, PhD, FCP, and Kejian Zhang, MD, of Cincinnati Children's. There's always room for more bright and energetic scientists, too!

Projects