Database research can be a great addition to any research toolkit. At a fundamental level, a database research project includes accessing large datasets that were generated by someone else, searching them for something interesting, conducting an analysis, and writing up the results for publication. There are, of course, pros and cons to database research.
Pros:
- Existing data eliminates expensive and time-consuming data collection, along with
the worry of low response rates and low statistical power.
- Typically, there is no need for complicated IRB applications.
- Database research studies are not as time sensitive as other studies where you
may need to collect data over an extended period of time.
- Generally, database research can be conducted in a shorter timeframe than other
studies.
Cons:
- Publicly available datasets can be large and intimidating to download and process.
- The database may require specialized, or even custom, computer programs.
- The database developer’s documentation of what data was collected, and how it
was collected, can be massive, complicated, and sometimes incomplete.
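A large file does not always need specialized software just to get a first look. The sketch below is one possible approach, not a required workflow: it reads a CSV export one row at a time and keeps a running count, mean, and standard deviation for a single column, so the whole file never has to fit in memory. The column names (id, age, bmi) and the tiny inline sample are hypothetical stand-ins for a real download.

```python
import csv
import io
import math


def stream_summary(rows, column):
    """Return count, missing count, mean, and sample SD of one numeric column."""
    n = 0
    missing = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for row in rows:
        raw = (row.get(column) or "").strip()
        if raw == "":
            # Blank cells are counted as missing rather than treated as zero.
            missing += 1
            continue
        x = float(raw)
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    sd = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")
    return {"n": n, "missing": missing, "mean": mean, "sd": sd}


# Hypothetical extract standing in for a large downloaded file.
SAMPLE = """id,age,bmi
1,34,22.0
2,51,
3,47,26.0
4,29,24.0
"""

# With a real download, swap io.StringIO(SAMPLE) for open("your_file.csv", newline="").
reader = csv.DictReader(io.StringIO(SAMPLE))
summary = stream_summary(reader, "bmi")
print(summary)
```

Counting missing values separately matters here because, as noted above, the documentation of how the data was collected can be incomplete, and a quick tally of blanks is an easy first check.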
The value of developing skills to conduct database research is significant. You can use the practice of database research to further build your own knowledge of material that is of value to you as a professional. In addition, the reduced timeframe and flexibility of database research can advance your research productivity.
Note: To supplement the content on this page, you can view other content and videos that are available on my site. On the Exploring Statistics page, there is a video on the basics of SPSS. This is an introductory video that will teach you to import data into SPSS, understand the variables within SPSS, and execute basic computations such as descriptives, as well as correlation, linear regression, and two-variable analyses. On the Conducting Research page, you can view a presentation on writing quality research questions, as well as a video on getting started in research.
The articles and presentation below provide an overview of the database research experience.
The following videos provide brief overviews of various databases. By the end of each video, you should be able to access the data, export the data to an external program (e.g., Excel, SPSS, or Stata), and begin exploring the data to generate a research question.
An eight-minute video that provides an overview of accessing and exporting data from the Centers for Disease Control and Prevention (CDC) website.
An 11-minute video that provides an overview of accessing and exporting data from the Wharton Research Data Services (WRDS) website.
An eight-minute video that provides an overview of accessing and exporting data from the Pew Research Center website.
A brief 10-minute overview of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (NCI). It is a source of information on cancer incidence and survival in the United States.
A more in-depth (120-minute) video presented by the NCI on the SEER Program and SEER*Stat (Note: topics are presented at the following timepoints: SEER introduction from minute 1 to minute 15, sample study from minute 15 to minute 20, SEER*Stat introduction from minute 20 to minute 49, SEER*Stat demonstration from minute 50 to the end of the presentation).
A sixteen-minute video that provides an overview of the past, present, and future of NHANES. It is a useful source when describing the dataset in a manuscript's Method section.
A seven-minute video that provides an introduction to exploring the NHANES database.
Often, a difficult task in database research is merging data files. This eight-minute video provides an overview of merging NHANES data files in SPSS.
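For readers who work outside SPSS, the same merge idea can be sketched in a few lines of Python. This is only an illustration of a one-to-one merge on a shared identifier, not the method shown in the video. It assumes each file has one row per respondent and that the files share the respondent sequence number column, which NHANES names SEQN. The records and the variable names age and bmi are hypothetical placeholders, not actual NHANES variable names.

```python
def merge_on_key(left_rows, right_rows, key="SEQN"):
    """One-to-one inner merge of two lists of dict records on a shared key."""
    right_by_key = {}
    for row in right_rows:
        # A repeated key would mean the file is not one row per respondent.
        if row[key] in right_by_key:
            raise ValueError(f"duplicate {key} in right file: {row[key]}")
        right_by_key[row[key]] = row
    merged = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            # Combine the columns from both files into one record.
            merged.append({**row, **match})
    return merged


# Hypothetical records standing in for two separate data files.
demographics = [
    {"SEQN": 1001, "age": 34},
    {"SEQN": 1002, "age": 51},
    {"SEQN": 1003, "age": 47},
]
body_measures = [
    {"SEQN": 1001, "bmi": 22.0},
    {"SEQN": 1003, "bmi": 26.0},
    {"SEQN": 1004, "bmi": 24.0},
]

merged = merge_on_key(demographics, body_measures)

# Respondents that appear in only one file are worth reviewing before analysis.
unmatched = {r["SEQN"] for r in demographics} ^ {r["SEQN"] for r in body_measures}

print(merged)
print(sorted(unmatched))
```

Checking the unmatched identifiers before analysis is a design choice worth keeping in any tool: a merge that silently drops respondents can change the sample you describe in your Method section.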
The National Center for Health Statistics (NCHS) covers a vast amount of American health data – categorized by subject.
Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world.
The National Center for Education Statistics (NCES)
The National Trauma Data Bank® (NTDB®) is the largest aggregation of U.S. trauma registry data ever assembled.
The National Health and Nutrition Examination Survey (NHANES) - Highly Recommended!
The National Electronic Injury Surveillance System (NEISS)
The Surveillance, Epidemiology, and End Results (SEER) Database
The National Ambulatory Medical Care Survey (NAMCS) Database
The HCUP family of administrative longitudinal databases contains encounter-level information on inpatient stays, emergency department visits, and ambulatory surgery in U.S. hospitals.