
The promise of secondary use of clinical data, from population health to clinical trial matching, often founders on poor data quality. This session presents an end-to-end framework for converting fragmented, multimodal clinical data into audit-ready, high-fidelity data assets. The presentation is divided into three segments:

The Evidence for Multimodal Integration — We begin by addressing the "structured data gap." Clinical research indicates that nearly 40% of critical diagnoses, and the vast majority of Social Determinants of Health (SDoH) and tumor properties, reside exclusively in unstructured free text. We demonstrate why risk scores and quality measures derived solely from structured EHR fields are inherently incomplete and how the integration of multimodal data (text, FHIR, PDF, and imaging) is a prerequisite for clinical accuracy.

Architecting the Secondary Use Data Platform — Having established the need for data synthesis, we detail a scalable solution architecture that serves as the foundation for modern AI Agents. We outline the transition from raw data ingestion to "OMOP Gold" status using a tiered Bronze/Silver/Gold data layer strategy. This section covers the technical requirements for automated de-identification, clinical coding to standard terminologies, and the preservation of full provenance — ensuring that every inference made by an AI agent is traceable back to its source evidence.
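As a rough illustration of the tiered idea described above, the following Python sketch carries a provenance pointer from a raw (Bronze) note through de-identification (Silver) to a coded (Gold) fact. It is a toy under stated assumptions, not the platform presented in the session: every class and function name, the regex-based masking rule, and the `TOY-` codes are hypothetical, and a real pipeline would map to standard terminologies and OMOP tables rather than a small dictionary.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    doc_id: str  # identifier of the raw source document
    start: int   # character offsets of the evidence span in that document
    end: int


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


@dataclass(frozen=True)
class Mention:
    text: str
    provenance: Provenance


@dataclass(frozen=True)
class CodedFact:
    code: str
    label: str
    provenance: Provenance


# Hypothetical toy vocabulary; these are NOT real terminology codes.
TOY_VOCAB = {
    "type 2 diabetes": "TOY-0001",
    "current smoker": "TOY-0002",
}


def deidentify(doc: Document) -> Document:
    """Bronze -> Silver: mask long digit runs (a stand-in for real de-identification).

    Each match is replaced by a same-length string so character offsets
    still line up with the raw document.
    """
    masked = re.sub(r"\d{6,}", lambda m: "#" * len(m.group()), doc.text)
    return Document(doc.doc_id, masked)


def extract_mentions(doc: Document) -> list[Mention]:
    """Find vocabulary terms in the text and record where each one occurs."""
    mentions = []
    for term in TOY_VOCAB:
        for m in re.finditer(re.escape(term), doc.text, flags=re.IGNORECASE):
            span = Provenance(doc.doc_id, m.start(), m.end())
            mentions.append(Mention(m.group(), span))
    return sorted(mentions, key=lambda x: x.provenance.start)


def code_mentions(mentions: list[Mention]) -> list[CodedFact]:
    """Silver -> Gold: attach a code to each mention, keeping its provenance."""
    return [
        CodedFact(TOY_VOCAB[m.text.lower()], m.text.lower(), m.provenance)
        for m in mentions
    ]


def trace(fact: CodedFact, bronze: dict[str, Document]) -> str:
    """Resolve a Gold fact back to the exact span of raw text that supports it."""
    p = fact.provenance
    return bronze[p.doc_id].text[p.start:p.end]


if __name__ == "__main__":
    raw = Document("note-1", "MRN 12345678. Patient has Type 2 diabetes.")
    gold = code_mentions(extract_mentions(deidentify(raw)))
    for fact in gold:
        print(fact.code, "<-", repr(trace(fact, {raw.doc_id: raw})))
```

The design choice worth noting is that masking preserves string length, so an offset recorded on the de-identified text can still be resolved against the raw note. A pipeline whose de-identification changes text length would need an explicit offset map instead.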

Case Study: Achieving Regulatory-Grade Accuracy in Cancer Registries — Finally, we provide a deep dive into a high-stakes application of this architecture: the automation of cancer registries. Abstracting a single registry requires navigating over 2,500 pages of specifications and 750+ data fields that vary by year and jurisdiction. We present definitive benchmarks and evaluation metrics across major solid tumor types, proving that AI-powered abstraction can achieve parity with certified human cancer registrars.

Watch the Recording

 

Learning Objectives

  1. Identify the limitations of structured EHR data in predicting clinical outcomes and population health metrics.
  2. Understand the design patterns for a "secondary use" data platform that bridges the gap between raw clinical records and AI Agents.
  3. Evaluate the benchmarks required to move AI abstraction from "experimental" to "regulatory-grade" in complex oncological use cases.

Speaker

David Talby
CEO
John Snow Labs

 

Type: Industry Partner
Course Format(s): On Demand
Price: Free