Biomedical informatics and data science: evolving fields with significant overlap.
Author(s): Brennan, Patricia Flatley, Chiang, Michael F, Ohno-Machado, Lucila
DOI: 10.1093/jamia/ocx146
Allergies are increasing, but the reasons for this are unclear. Although environmental factors are thought to be important, there is a lack of data on how they contribute to symptom development. To understand this relationship better, we need accurate data about both symptoms and environmental factors. Our objective here is to ascertain whether experience sampling is a reliable approach for collecting allergy symptom data in the general population, allowing us [...]
Author(s): Vigo, Markel, Hassan, Lamiece, Vance, William, Jay, Caroline, Brass, Andrew, Cruickshank, Sheena
DOI: 10.1093/jamia/ocx148
The DAta Tag Suite (DATS) is a model supporting dataset description, indexing, and discovery. It is available as an annotated serialization with schema.org, a vocabulary used by major search engines, thus making the datasets discoverable on the web. DATS underlies DataMed, the National Institutes of Health Big Data to Knowledge Data Discovery Index prototype, which aims to provide a "PubMed for datasets." The experience gained while indexing a heterogeneous range [...]
Author(s): Gonzalez-Beltran, Alejandra N, Campbell, John, Dunn, Patrick, Guijarro, Diana, Ionescu, Sanda, Kim, Hyeoneui, Lyle, Jared, Wiser, Jeffrey, Sansone, Susanna-Assunta, Rocca-Serra, Philippe
DOI: 10.1093/jamia/ocx119
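To make the idea of a schema.org serialization concrete, here is a minimal sketch of a dataset description written as JSON-LD using the public schema.org `Dataset` type. It is not the DATS model itself: the helper name `dataset_jsonld`, the chosen fields, and all example values are assumptions made for illustration.

```python
import json


def dataset_jsonld(name, description, identifier, creators):
    """Build a schema.org Dataset description as a JSON-LD dict.

    Illustrative only: this uses generic schema.org properties and is
    not the DATS schema described in the abstract.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,
        "creator": [{"@type": "Person", "name": c} for c in creators],
    }


# Hypothetical dataset metadata, invented for this example.
record = dataset_jsonld(
    name="Example cohort measurements",
    description="Placeholder description for illustration.",
    identifier="example-0001",
    creators=["A. Researcher"],
)

# Serialized form that could be embedded in a web page for indexing.
print(json.dumps(record, indent=2))
```

Embedding such a record in a dataset landing page is the general mechanism by which search engines that read schema.org markup can pick up dataset metadata.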
A growing variety of diverse data sources is emerging to better inform health care delivery and health outcomes. We sought to evaluate the capacity for clinical, socioeconomic, and public health data sources to predict the need for various social service referrals among patients at a safety-net hospital.
Author(s): Kasthurirathne, Suranga N, Vest, Joshua R, Menachemi, Nir, Halverson, Paul K, Grannis, Shaun J
DOI: 10.1093/jamia/ocx130
Electronic health record (EHR)-based phenotyping infers whether a patient has a disease based on the information in his or her EHR. A human-annotated training set with gold-standard disease status labels is usually required to build an algorithm for phenotyping based on a set of predictive features. The time intensiveness of annotation and feature curation severely limits the ability to achieve high-throughput phenotyping. While previous studies have successfully automated feature curation [...]
Author(s): Yu, Sheng, Ma, Yumeng, Gronsbell, Jessica, Cai, Tianrun, Ananthakrishnan, Ashwin N, Gainer, Vivian S, Churchill, Susanne E, Szolovits, Peter, Murphy, Shawn N, Kohane, Isaac S, Liao, Katherine P, Cai, Tianxi
DOI: 10.1093/jamia/ocx111
Bioinformatics publications typically include complex software workflows that are difficult to describe in a manuscript. We describe and demonstrate the use of interactive software notebooks to document and distribute bioinformatics research. We provide a user-friendly tool, BiocImageBuilder, that allows users to easily distribute their bioinformatics protocols through interactive notebooks uploaded to either a GitHub repository or a private server.
Author(s): Almugbel, Reem, Hung, Ling-Hong, Hu, Jiaming, Almutairy, Abeer, Ortogero, Nicole, Tamta, Yashaswi, Yeung, Ka Yee
DOI: 10.1093/jamia/ocx120
To provide an open-source, interoperable, and scalable data quality assessment tool for evaluating and visualizing completeness and conformance in electronic health record (EHR) data repositories.
Author(s): Estiri, Hossein, Stephens, Kari A, Klann, Jeffrey G, Murphy, Shawn N
DOI: 10.1093/jamia/ocx109
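As a rough illustration of the two quality dimensions named above, the sketch below computes completeness as the share of non-missing values per column and conformance as the share of non-missing values that fall in an allowed set. These definitions, the function names, and the sample records are assumptions for illustration and do not describe the tool in this article.

```python
MISSING = (None, "")


def completeness(rows, columns):
    """Fraction of rows with a non-missing value, per column."""
    total = len(rows)
    result = {}
    for col in columns:
        present = sum(1 for r in rows if r.get(col) not in MISSING)
        result[col] = present / total if total else 0.0
    return result


def conformance(rows, column, allowed):
    """Fraction of non-missing values in `column` that are in `allowed`."""
    values = [r[column] for r in rows if r.get(column) not in MISSING]
    if not values:
        return 0.0
    return sum(1 for v in values if v in allowed) / len(values)


# Hypothetical patient records, invented for this example.
rows = [
    {"gender": "F", "birth_date": "1980-01-01"},
    {"gender": "M", "birth_date": None},
    {"gender": "", "birth_date": "1975-05-05"},
    {"gender": "X", "birth_date": "1990-09-09"},
]

scores = completeness(rows, ["gender", "birth_date"])
gender_conformance = conformance(rows, "gender", {"F", "M"})
print(scores, gender_conformance)
```

Reporting the two measures separately keeps "the value is absent" distinct from "the value is present but not in the expected vocabulary".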
Outpatient clinics lack guidance for tackling modern efficiency and productivity demands. Workflow studies require large amounts of timing data that are prohibitively expensive to collect through observation or tracking devices. Electronic health records (EHRs) contain a vast amount of timing data (timestamps collected during regular use) that can be mapped to workflow steps. This study validates using EHR timestamp data to predict outpatient ophthalmology clinic workflow timings at [...]
Author(s): Hribar, Michelle R, Read-Brown, Sarah, Goldstein, Isaac H, Reznick, Leah G, Lombardi, Lorinna, Parikh, Mansi, Chamberlain, Winston, Chiang, Michael F
DOI: 10.1093/jamia/ocx098
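The general idea of turning timestamps into workflow timings can be sketched as follows: order one visit's timestamped events and take the elapsed time between consecutive ones. The step names, the sample visit, and the function `step_durations` are hypothetical, and the study's actual mapping from EHR timestamps to workflow steps is not given in this abstract.

```python
from datetime import datetime


def step_durations(events):
    """Return (from_step, to_step, minutes) for consecutive events.

    `events` is a list of (step_name, ISO 8601 timestamp) pairs for a
    single visit; they are sorted by time before differencing.
    """
    ordered = sorted((datetime.fromisoformat(ts), step) for step, ts in events)
    durations = []
    for (t0, s0), (t1, s1) in zip(ordered, ordered[1:]):
        minutes = (t1 - t0).total_seconds() / 60
        durations.append((s0, s1, minutes))
    return durations


# Hypothetical visit, invented for this example.
visit = [
    ("check_in", "2017-01-05T09:00:00"),
    ("physician_exam", "2017-01-05T09:40:00"),
    ("technician_exam", "2017-01-05T09:12:00"),
    ("check_out", "2017-01-05T10:05:00"),
]

for start, end, minutes in step_durations(visit):
    print(f"{start} -> {end}: {minutes:.0f} min")
```

Sorting first matters because audit-log style records are not guaranteed to arrive in chronological order.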
Lack of reproducibility in medical studies is a barrier to the generation of a robust knowledge base to support clinical decision-making. In this paper we outline the Medical Information Mart for Intensive Care (MIMIC) Code Repository, a centralized code base for generating reproducible studies on an openly available critical care dataset.
Author(s): Johnson, Alistair EW, Stone, David J, Celi, Leo A, Pollard, Tom J
DOI: 10.1093/jamia/ocx084
Biomedical science is driven by datasets that are being accumulated at an unprecedented rate, with ever-growing volume and richness. There are various initiatives to make these datasets more widely available to recipients who sign Data Use Certificate agreements, whereby penalties are levied for violations. A particularly popular penalty is the temporary revocation, often for several months, of the recipient's data usage rights. This policy is based on the assumption that [...]
Author(s): Xia, Weiyi, Wan, Zhiyu, Yin, Zhijun, Gaupp, James, Liu, Yongtai, Clayton, Ellen Wright, Kantarcioglu, Murat, Vorobeychik, Yevgeniy, Malin, Bradley A
DOI: 10.1093/jamia/ocx101