# Health Data

<mark style="color:$primary;">"</mark>*<mark style="color:$primary;">The capability of handling big data is becoming an enabler to carry out unprecedented research studies and to implement new models of healthcare delivery.</mark>*<mark style="color:$primary;">"</mark> \[[1](#references)]

The term "big data" has been used since the early 1990s \[[2](#references)]. Big data are characterized by the "3 Vs": Volume (size), Velocity (speed of generation), and Variety (different types) \[[1](#references)]. This list has since been extended (to 5, 10, or even 14 Vs) with additional characteristics such as Veracity, Value, Validity, Variability, and Vocabulary.

There are many sources of big data in biomedicine and health care \[[3](#references)]. These include Electronic Health Records (EHR) \[[4](#references)], Health Information Exchanges (HIE) \[[5](#references)], All-Payer Claims Databases (APCD) \[[6](#references)], biological and biomedical databases \[[7](#references)], and public health surveys \[[8](#references)].

Health data can be broadly categorized as "structured" (e.g., demographics, diagnoses, procedures, and medications) or "unstructured" (e.g., clinical reports and notes) \[[9](#references)]. Using established [**Health Data Standards**](https://docs.bcbi.brown.edu/codiac-for-health/foundations/health-data-standards) is critical for sharing and exchanging health data within and across organizations to support [**Artificial Intelligence in Health**](https://docs.bcbi.brown.edu/codiac-for-health/foundations/artificial-intelligence-in-health) and [**Observational Health Research**](https://docs.bcbi.brown.edu/codiac-for-health/foundations/observational-health-research).
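The structured/unstructured distinction described above can be illustrated with a minimal Python sketch. The class names, field names, and helper functions here are hypothetical and chosen only for illustration; they do not correspond to any particular EHR schema or health data standard, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StructuredRecord:
    """Hypothetical structured record: discrete fields that can be queried directly."""
    patient_id: str
    birth_year: int
    diagnoses: List[str] = field(default_factory=list)    # coded diagnoses
    procedures: List[str] = field(default_factory=list)   # coded procedures
    medications: List[str] = field(default_factory=list)  # medication names


@dataclass
class UnstructuredNote:
    """Hypothetical unstructured record: free text such as a clinical note."""
    patient_id: str
    text: str


def has_diagnosis(record: StructuredRecord, code: str) -> bool:
    # Structured data: an exact lookup against a discrete field.
    return code in record.diagnoses


def mentions(note: UnstructuredNote, term: str) -> bool:
    # Unstructured data: a naive case-insensitive substring search.
    # Real clinical text generally needs more careful text processing.
    return term.lower() in note.text.lower()


# Made-up example values for illustration only.
record = StructuredRecord("p-001", 1970, diagnoses=["E11.9"], medications=["metformin"])
note = UnstructuredNote("p-001", "Patient reports good adherence to Metformin.")

print(has_diagnosis(record, "E11.9"))
print(mentions(note, "metformin"))
```

In this sketch, the structured record answers a question with a direct field lookup, while the note has to be searched as text, which is one reason the two categories are typically handled with different tools.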

{% hint style="info" %}
**See the CODIAC for Health chapter on Health Data and Data Standards (forthcoming) for more information.**
{% endhint %}

## References

1. Bellazzi R. [**Big data and biomedical informatics: a challenging opportunity**](https://pmc.ncbi.nlm.nih.gov/articles/PMC4287065/). Yearb Med Inform. 2014 May 22;9(1):8-13. doi: 10.15265/IY-2014-0024. PMID: 24853034; PMCID: PMC4287065.
2. Lohr S. [**The Origins of ‘Big Data’: An Etymological Detective Story**](https://archive.nytimes.com/bits.blogs.nytimes.com/2013/02/01/the-origins-of-big-data-an-etymological-detective-story/). The New York Times. 2013 Feb 1.
3. [**Healthcare Big Data and the Promise of Value-Based Care**](https://catalyst.nejm.org/doi/full/10.1056/CAT.18.0290). Catalyst Carryover. 2018 Jan 1.
4. Ehrenstein V, Kharrazi H, Lehmann H, et al. [**Obtaining Data From Electronic Health Records**](https://www.ncbi.nlm.nih.gov/books/NBK551878/). In: Gliklich RE, Leavy MB, Dreyer NA, editors. Tools and Technologies for Registry Interoperability, Registries for Evaluating Patient Outcomes: A User’s Guide, 3rd Edition, Addendum 2 \[Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2019 Oct.
5. Sarkar IN. [**Health Information Exchange as a Global Utility**](https://pubmed.ncbi.nlm.nih.gov/37164575/). Chest. 2023 May;163(5):1023-1025. doi: 10.1016/j.chest.2022.12.001. PMID: 37164575.
6. Love D, Custer W, Miller P. [**All-payer claims databases: state initiatives to improve health care transparency**](https://pubmed.ncbi.nlm.nih.gov/20830868/). Issue Brief (Commonw Fund). 2010 Sep;99:1-14. PMID: 20830868.
7. Sayers EW, Beck J, Bolton EE, et al. [**Database resources of the National Center for Biotechnology Information**](https://pubmed.ncbi.nlm.nih.gov/37994677/). Nucleic Acids Res. 2024 Jan 5;52(D1):D33-D43. doi: 10.1093/nar/gkad1044. PMID: 37994677; PMCID: PMC10767890.
8. Blewett LA, Call KT, Turner J, Hest R. [**Data Resources for Conducting Health Services and Policy Research**](https://pubmed.ncbi.nlm.nih.gov/29272166/). Annu Rev Public Health. 2018 Apr 1;39:437-452. doi: 10.1146/annurev-publhealth-040617-013544. Epub 2017 Dec 22. PMID: 29272166; PMCID: PMC5880724.
9. Weber GM, Mandl KD, Kohane IS. [**Finding the missing link for big biomedical data**](https://pubmed.ncbi.nlm.nih.gov/24854141/). JAMA. 2014 Jun 25;311(24):2479-80. doi: 10.1001/jama.2014.4228. PMID: 24854141.

## Resources

### Books/Chapters

* Ehrenstein V, Kharrazi H, Lehmann H, et al. [**Obtaining Data From Electronic Health Records**](https://www.ncbi.nlm.nih.gov/books/NBK551878/). In: Gliklich RE, Leavy MB, Dreyer NA, editors. Tools and Technologies for Registry Interoperability, Registries for Evaluating Patient Outcomes: A User’s Guide, 3rd Edition, Addendum 2 \[Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2019 Oct.
* NIH Pragmatic Trials Collaboratory Rethinking Clinical Trials
  * [**Acquiring Real-World Data**](https://rethinkingclinicaltrials.org/chapters/conduct/acquiring-real-world-data/introduction/)
  * [**Using Electronic Health Record Data**](https://rethinkingclinicaltrials.org/chapters/design/using-electronic-health-record-data-pragmatic-clinical-trials-top/using-electronic-health-record-data-in-pragmatic-clinical-trials-introduction/)
* [**Secondary Analysis of Electronic Health Records**](https://www.ncbi.nlm.nih.gov/books/NBK543630/) \[Internet]. Cham (CH): Springer; 2016. doi: 10.1007/978-3-319-43742-2.

### Articles

* Sarkar IN. [**Transforming Health Data to Actionable Information: Recent Progress and Future Opportunities in Health Information Exchange**](https://pubmed.ncbi.nlm.nih.gov/36463879/). Yearb Med Inform. 2022 Aug;31(1):203-214. doi: 10.1055/s-0042-1742519. Epub 2022 Dec 4. PMID: 36463879; PMCID: PMC9719753.
* Sarkar IN. [**Health Information Exchange as a Global Utility**](https://pubmed.ncbi.nlm.nih.gov/37164575/). Chest. 2023 May;163(5):1023-1025. doi: 10.1016/j.chest.2022.12.001. PMID: 37164575.

