# Health Data

> "*The capability of handling big data is becoming an enabler to carry out unprecedented research studies and to implement new models of healthcare delivery.*" \[[1](#references)]

The term "big data" has been used since the early 1990s \[[2](#references)]. Big data are characterized by the "3 Vs": Volume (size), Velocity (speed of generation), and Variety (different types) \[[1](#references)]. This list has since been expanded with additional Vs (5 Vs, 10 Vs, 14 Vs, etc.), such as Veracity, Value, Validity, Variability, and Vocabulary.

There are many sources of big data in biomedicine and health care \[[3](#references)]. These include Electronic Health Records (EHR) \[[4](#references)], Health Information Exchanges (HIE) \[[5](#references)], All-Payer Claims Databases (APCD) \[[6](#references)], biological and biomedical databases \[[7](#references)], and public health surveys \[[8](#references)].

Health data can be broadly categorized as "structured" (e.g., demographics, diagnoses, procedures, and medications) or "unstructured" (e.g., clinical reports and notes) \[[9](#references)].  Use of established [**Health Data Standards**](https://docs.bcbi.brown.edu/codiac-for-health/foundations/health-data-standards) is critical for sharing and exchange of health data within and across organizations to support [**Artificial Intelligence in Health**](https://docs.bcbi.brown.edu/codiac-for-health/foundations/artificial-intelligence-in-health) and [**Observational Health Research**](https://docs.bcbi.brown.edu/codiac-for-health/foundations/observational-health-research).
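
The distinction between structured and unstructured data can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the field names, the helper functions, and the two synthetic patients are hypothetical and do not come from any particular EHR system or data standard. The diagnosis codes are written in ICD-10-CM style only as an example of coded values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StructuredRecord:
    """Field-based, coded data (hypothetical fields, for illustration only)."""
    patient_id: str
    birth_year: int
    diagnoses: List[str] = field(default_factory=list)    # coded diagnoses
    medications: List[str] = field(default_factory=list)  # medication names


@dataclass
class UnstructuredNote:
    """Free-text clinical narrative (hypothetical, for illustration only)."""
    patient_id: str
    text: str


def patients_with_diagnosis(records, code):
    """Structured query: exact match against a coded field."""
    return [r.patient_id for r in records if code in r.diagnoses]


def notes_mentioning(notes, term):
    """Naive unstructured query: case-insensitive substring search."""
    term = term.lower()
    return [n.patient_id for n in notes if term in n.text.lower()]


# Synthetic example data; not drawn from any real patient.
records = [
    StructuredRecord("p1", 1960, ["E11.9"], ["metformin"]),
    StructuredRecord("p2", 1985, ["I10"], []),
]
notes = [
    UnstructuredNote("p1", "Patient with long-standing diabetes, on metformin."),
    UnstructuredNote("p2", "Patient denies diabetes; blood pressure elevated."),
]

coded_matches = patients_with_diagnosis(records, "E11.9")
text_matches = notes_mentioning(notes, "diabetes")
print(coded_matches)
print(text_matches)
```

In this sketch, the coded lookup returns only the patient whose record carries the diagnosis code, while the keyword search also returns the patient whose note says "denies diabetes". This is one reason unstructured data typically needs additional processing, beyond simple keyword matching, before it can be used alongside structured data.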

{% hint style="info" %}
**See the CODIAC for Health chapter on Health Data and Data Standards (forthcoming) for more information.**
{% endhint %}

## References

1. Bellazzi R. **Big data and biomedical informatics: a challenging opportunity**. Yearb Med Inform. 2014 May 22;9(1):8-13. doi: 10.15265/IY-2014-0024. PMID: 24853034; PMCID: [PMC4287065](https://pmc.ncbi.nlm.nih.gov/articles/PMC4287065/).
2. Lohr S. **The Origins of ‘Big Data’: An Etymological Detective Story**. The New York Times. 2013 Feb 1. \[ [Link](https://archive.nytimes.com/bits.blogs.nytimes.com/2013/02/01/the-origins-of-big-data-an-etymological-detective-story/) ]
3. **Healthcare Big Data and the Promise of Value-Based Care**. Catalyst Carryover. 2018 Jan 1. \[ [Link](https://catalyst.nejm.org/doi/full/10.1056/CAT.18.0290) ]
4. Ehrenstein V, Kharrazi H, Lehmann H, et al. **Obtaining Data From Electronic Health Records**. In: Gliklich RE, Leavy MB, Dreyer NA, editors. Tools and Technologies for Registry Interoperability, Registries for Evaluating Patient Outcomes: A User’s Guide, 3rd Edition, Addendum 2 \[Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2019 Oct. Available from: <https://www.ncbi.nlm.nih.gov/books/NBK551878/>
5. Sarkar IN. **Health Information Exchange as a Global Utility**. Chest. 2023 May;163(5):1023-1025. doi: 10.1016/j.chest.2022.12.001. PMID: [37164575](https://pubmed.ncbi.nlm.nih.gov/37164575/).
6. Love D, Custer W, Miller P. **All-payer claims databases: state initiatives to improve health care transparency**. Issue Brief (Commonw Fund). 2010 Sep;99:1-14. PMID: [20830868](https://pubmed.ncbi.nlm.nih.gov/20830868/).
7. Sayers EW, Beck J, Bolton EE, et al. **Database resources of the National Center for Biotechnology Information**. Nucleic Acids Res. 2024 Jan 5;52(D1):D33-D43. doi: 10.1093/nar/gkad1044. PMID: [37994677](https://pubmed.ncbi.nlm.nih.gov/37994677/); PMCID: PMC10767890.
8. Blewett LA, Call KT, Turner J, Hest R. **Data Resources for Conducting Health Services and Policy Research**. Annu Rev Public Health. 2018 Apr 1;39:437-452. doi: 10.1146/annurev-publhealth-040617-013544. Epub 2017 Dec 22. PMID: [29272166](https://pubmed.ncbi.nlm.nih.gov/29272166/); PMCID: PMC5880724.
9. Weber GM, Mandl KD, Kohane IS. **Finding the missing link for big biomedical data**. JAMA. 2014 Jun 25;311(24):2479-80. doi: 10.1001/jama.2014.4228. PMID: [24854141](https://pubmed.ncbi.nlm.nih.gov/24854141/).

## Resources

### Books/Chapters

* Ehrenstein V, Kharrazi H, Lehmann H, et al. **Obtaining Data From Electronic Health Records**. In: Gliklich RE, Leavy MB, Dreyer NA, editors. Tools and Technologies for Registry Interoperability, Registries for Evaluating Patient Outcomes: A User’s Guide, 3rd Edition, Addendum 2 \[Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2019 Oct. Available from: <https://www.ncbi.nlm.nih.gov/books/NBK551878/>
* NIH Pragmatic Trials Collaboratory Rethinking Clinical Trials
  * [Acquiring Real-World Data](https://rethinkingclinicaltrials.org/chapters/conduct/acquiring-real-world-data/introduction/)
  * [Using Electronic Health Record Data](https://rethinkingclinicaltrials.org/chapters/design/using-electronic-health-record-data-pragmatic-clinical-trials-top/using-electronic-health-record-data-in-pragmatic-clinical-trials-introduction/)
* **Secondary Analysis of Electronic Health Records** \[Internet]. Cham (CH): Springer; 2016. Available from: <https://www.ncbi.nlm.nih.gov/books/NBK543630/> doi: 10.1007/978-3-319-43742-2.

### Articles

* Sarkar IN. **Transforming Health Data to Actionable Information: Recent Progress and Future Opportunities in Health Information Exchange**. Yearb Med Inform. 2022 Aug;31(1):203-214. doi: 10.1055/s-0042-1742519. Epub 2022 Dec 4. PMID: [36463879](https://pubmed.ncbi.nlm.nih.gov/36463879/); PMCID: PMC9719753.
* Sarkar IN. **Health Information Exchange as a Global Utility**. Chest. 2023 May;163(5):1023-1025. doi: 10.1016/j.chest.2022.12.001. PMID: [37164575](https://pubmed.ncbi.nlm.nih.gov/37164575/).
