Migrating a research data warehouse to a public cloud: challenges and opportunities
Article authors Michael G. Kahn and Michael J. Ames discuss this month's JAMIA Journal Club selection:
Kahn MG, Mui JY, Ames MJ, et al. Migrating a research data warehouse to a public cloud: challenges and opportunities. J Am Med Inform Assoc. 2022;29(4):592-600. doi:10.1093/jamia/ocab278
Dr. Kahn is Emeritus Professor of Medicine at the University of Colorado Denver. Prior to retirement, he was the Translational Informatics Core Director for the Colorado Clinical and Translational Sciences Institute; co-Director of the Colorado Center for Personalized Medicine; and Director of Research Informatics at Children’s Hospital Colorado. Dr. Kahn also led Health Data Compass, the first-in-class Google Cloud-based research data warehouse that combines data from four clinical, financial, and research institutions plus state and federal sources. Dr. Kahn participates in multiple regional, national, and international clinical data research networks. His informatics research focus areas are data model harmonization and developing sharable data quality measures in distributed research networks.
Mr. Ames is Senior Director of Healthcare and Life Sciences at SADA, a Google Cloud reseller and consultancy. In this role, he helps connect the diverse scientific and clinical needs of the healthcare and life sciences community with Google Cloud's computing, data, and analytics technologies.
Prior to joining SADA, Mr. Ames was Director of Technology Innovation for the Colorado Center for Personalized Medicine and Associate Director of Health Data Compass, where he worked with Dr. Michael Kahn in designing and deploying a first-of-its-kind enterprise data warehouse and analytics platform on Google Cloud. This work was preceded by Mr. Ames’s role as Research Informatics Manager for the University of Colorado Cancer Center.
Prior roles include leadership of Boston Scientific's Global Clinical Information Systems department, and startup companies in the healthcare analytics space.
Mr. Ames holds a business degree from Utah State University and a Master of Biomedical Informatics from OHSU.
JAMIA Journal Club managers and monthly moderators are JAMIA Student Editorial Board members.
Moderator and Manager
Statement of Purpose
Academic institutions and commercial entities are creating massive-scale integrated research data warehouses (RDWs), a growing number of which include linkages to biobanks, biological data repositories, and open data archives. As the depth and scale of these warehouses explode and the analytics applied to these data increase in complexity, the computing and storage needs of these research environments may quickly exceed the capacity of on-premises systems. New data management and analytics environments are migrating to cloud platforms for the scalability and flexibility needed to meet these challenges. A recent body of literature, mostly from the National Center for Advancing Translational Sciences (NCATS) and the Clinical and Translational Science Awards (CTSA) community, has highlighted the diversity of environments, functionalities, and governance across institutions. However, the migration of systems and functionalities required to support data-driven discoveries from on-premises environments into the cloud is far more complex than is generally realized, and the challenges of a cloud migration are rarely, if ever, discussed explicitly.
We describe our experience in migrating a multi-institutional RDW to a public cloud. We organized unanticipated challenges into eight categories: (1) networking/network security; (2) data engineering; (3) computation; (4) storage; (5) secure analytics; (6) sandboxes/public data; (7) innovation and consulting services; (8) costs versus utilization.
While migrating our RDW to the cloud has enabled capabilities and innovations that would not have been possible with an on-premises environment, the efforts have not been as straightforward as we had anticipated. Notwithstanding the challenges of managing cloud resources, the resulting RDW capabilities have been highly beneficial to our institution, research community, and partners.
Our presentation offers institutions seeking to make a similar transition the insights we wish we had known when we started our migration.
The target audience for this activity is professionals and students interested in health informatics.
After participating in this webinar, the listener should be better able to:
- Compare the similarities and differences of cloud-based research data warehouses (RDWs) and traditional on-premises RDWs
- Anticipate unique challenges in promoting, planning, executing, and supporting cloud-based research environments
- Characterize classes of cloud-based RDW challenges that require multi-disciplinary engagement across institutional boundaries, roles and responsibilities, and existing policies and procedures
The webinar format is:
- 35-minute presentation by article author(s) considering salient features of the published study and its potential impact on practice
- 25-minute discussion of questions submitted by listeners via the webinar tools and moderated by JAMIA Student Editorial Board members
The American Medical Informatics Association is accredited by the Accreditation Council for Continuing Medical Education to provide continuing medical education for physicians.