# Research Pipeline

Below is an example pipeline for conducting research with [**Health Data**](https://docs.bcbi.brown.edu/codiac-for-health/foundations/health-data), such as EHR data. The list is not exhaustive, but it is a good place to start. Each step could fill a page of its own, and most likely will in future releases of CODIAC for Health.

1. Conduct a literature review.
2. Explicitly describe the research question.
3. Form an interdisciplinary team that can guide and perform each step of the study.
4. Fully specify the research protocol *in advance* of executing the study.
5. Apply for IRB approval of the study.
6. Apply for an Institutional Reliance Agreement, if necessary.
7. Execute a Data (Transfer and) Use Agreement (DUA, DTUA), as required.
8. Comply with any application and approval procedures set forth by the data provider.
9. Request access to, or set up, computing infrastructure, as necessary.
10. Assess the suitability (strengths and weaknesses) of the dataset(s) to be used in the study.
11. Assess the quality of the dataset(s).
12. Define the study cohort (and matching cases, if applicable).
13. Create standard code sets for each clinical concept in the cohort definition and every independent and dependent variable.
14. Compose a computable data request / data extraction specification.
15. Clean and stage extracted data for analysis; handle missing values according to protocol.
16. Characterize the study cohort (and matching cases, if applicable).
17. Adjust for any bias or confounders in the data.
18. Analyze the data according to protocol.
19. Produce research products.
20. Comply with any review procedures required by the data provider.
21. Publish your work!
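
To make a few of the data-facing steps concrete, here is a minimal sketch of steps 12, 13, 15, and 16 on a handful of made-up records. Everything in it is an assumption for illustration: the `PatientRecord` shape, the two-code `T2DM_CODES` set, the age threshold of 18, and the choice to exclude patients with a missing age. A real study would take each of these from its protocol and from a reviewed code set.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional

# Hypothetical code set for one clinical concept (step 13).
# Two ICD-10-CM type 2 diabetes codes, for illustration only;
# a real code set would be reviewed by the study team.
T2DM_CODES = frozenset({"E11.9", "E11.65"})


@dataclass(frozen=True)
class PatientRecord:
    patient_id: str
    age: Optional[int]  # None marks a missing value
    diagnosis_codes: tuple


def define_cohort(records, code_set, min_age=18):
    """Step 12: keep patients with a known age >= min_age and a qualifying code."""
    return [
        r for r in records
        if r.age is not None
        and r.age >= min_age
        and code_set.intersection(r.diagnosis_codes)
    ]


def count_missing_age(records):
    """Step 15: count missing values so they can be handled per protocol."""
    return sum(1 for r in records if r.age is None)


def characterize(cohort):
    """Step 16: a minimal cohort summary (size and mean age)."""
    return {
        "n": len(cohort),
        "mean_age": mean(r.age for r in cohort) if cohort else None,
    }


# Made-up records, not real patient data.
records = [
    PatientRecord("p1", 54, ("E11.9", "I10")),
    PatientRecord("p2", 17, ("E11.65",)),
    PatientRecord("p3", None, ("E11.9",)),
    PatientRecord("p4", 61, ("I10",)),
    PatientRecord("p5", 46, ("E11.65",)),
]

cohort = define_cohort(records, T2DM_CODES)
summary = characterize(cohort)
print("Cohort:", [r.patient_id for r in cohort])
print("Records with missing age:", count_missing_age(records))
print("Summary:", summary)
```

Keeping the code set, the cohort rule, and the summary as separate pieces mirrors the pipeline: each can be reviewed against the protocol on its own before any analysis is run.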

## Resources

### Books

* [**Secondary Analysis of Electronic Health Records**](https://www.ncbi.nlm.nih.gov/books/NBK543630/) \[Internet]. Cham (CH): Springer; 2016. doi: 10.1007/978-3-319-43742-2
* Wu H, et al. (Eds.) **Statistics and Machine Learning Methods for EHR Data: From Data Extraction to Data Analytics**. CRC Press; 2021. ISBN 978-0-367-44239-2
* O’Neil ST, Beasley W, Loomba J, Patrick S, Wilkins KJ, Crowley KM, Anzalone AJ (Eds.) (2023). [**The Researcher’s Guide to N3C: A National Resource for Analyzing Real-World Health Data**](https://zenodo.org/record/7749367#.ZEgPD3bMJmM). doi: 10.5281/zenodo.7749367
* [**The Book of OHDSI**](https://ohdsi.github.io/TheBookOfOhdsi/)

### Articles

* Blewett LA, Call KT, Turner J, Hest R. [**Data Resources for Conducting Health Services and Policy Research**](https://pmc.ncbi.nlm.nih.gov/articles/PMC5880724/). Annu Rev Public Health. 2018 Apr 1;39:437-452. doi: 10.1146/annurev-publhealth-040617-013544. Epub 2017 Dec 22. PMID: 29272166; PMCID: PMC5880724.
* Sayers EW, Beck J, Bolton EE, et al. [**Database resources of the National Center for Biotechnology Information**](https://pmc.ncbi.nlm.nih.gov/articles/PMC7778943/). Nucleic Acids Res. 2021 Jan 8;49(D1):D10-D17. doi: 10.1093/nar/gkaa892. PMID: 33095870; PMCID: PMC7778943.
* Shang N, Weng C, Hripcsak G. [**A conceptual framework for evaluating data suitability for observational studies**](https://pubmed.ncbi.nlm.nih.gov/29024976/). J Am Med Inform Assoc. 2018 Mar 1;25(3):248-258. doi: 10.1093/jamia/ocx095. PMID: 29024976; PMCID: PMC7378879.
* Weber GM, Mandl KD, Kohane IS. [**Finding the Missing Link for Big Biomedical Data**](https://jamanetwork.com/journals/jama/article-abstract/1883026). *JAMA.* 2014;311(24):2479–2480. doi:10.1001/jama.2014.4228


