
Finding Data for Research

Looking for data for a research project? There is so much data to choose from that it can be overwhelming. To help students get started, we’ve created a list of commonly used datasets that are accessible to most students, organized by unit of analysis. If this list does not have what you’re looking for, scroll down for links to searchable websites.

(To see PowerPoint slides from a Schroeder Center informational workshop on types of health and healthcare data available for use in economics and public policy research, please click here.)

Data on U.S. States

  • Centers for Disease Control and Prevention (CDC)

    -  Behavioral Risk Factor Surveillance System (BRFSS)
    Measures:  Alcohol consumption, cholesterol awareness, chronic health indicators, colorectal cancer screening, e-cigarette use, days of poor health, demographics, fruits/vegetables, health care access/coverage, health status, HIV/AIDS, hypertension awareness, immunization, injury, oral health, overweight/obesity, physical activity, prostate cancer, tobacco use, and women’s health (available years vary by indicator)

    -  Cancer Data
    Measures: Cancer incidence and cancer mortality for all cancers and by cancer type (2017; 2013-2017)

    -  Diabetes Atlas
    Measures: Burden/magnitude, preventive care practices, health status/disability, risk factors for complications, end-stage renal disease, risk factors for diabetes (available years vary by indicator)

    -  Wonder Data
    Measures: Mortality, disability, heart disease and stroke, chronic diseases, communicable diseases, environmental health, health practice/prevention, injuries (available years vary by indicator)

    -  Youth Risk Behavior Survey (YRBS)
    Measures: Behaviors that contribute to unintentional injuries and violence, sexual behaviors, alcohol/other drug use, tobacco use, unhealthy dietary behaviors, and inadequate physical activity (available years vary by indicator)

  • Centers for Medicare & Medicaid Services (CMS)
    Measures: Medicare spending per beneficiary, state averages (quality measures, staffing, fine amount, number of deficiencies), unplanned hospital visits, inpatient psychiatric facility quality measure data, patient survey, payment, timely and effective care, ambulatory surgical center quality measures, complications/deaths, outpatient imaging efficiency, health care associated infections, patient survey PPS-exempt cancer hospital, outpatient/ambulatory consumer assessment of healthcare providers and systems, home healthcare, and dialysis facility (years vary by indicator)

  • Dartmouth Atlas of Health Care
    Measures: Medicare reimbursements, primary care access/quality measures, end-of-life chronic disease, mortality, hospital and post-acute care, health care for an aging population, prescription drug use (available years vary by indicator)

  • Kaiser Family Foundation (KFF): State Health Facts
    Measures:  Demographics and the economy, disparities, health costs/budgets, health coverage/uninsured, health insurance/managed care, health reform, health status, HIV/AIDS, Medicaid/CHIP, Medicare, providers/service use, and women’s health

  • National Cancer Institute: State Cancer Profiles
    Measures:  Demographics, screening and risk factors, incidence, prevalence, mortality (available years vary by indicator)

  • Schroeder Center: State Variable Longitudinal Data (Compiled from the U.S. Bureau of Labor Statistics Local Area Statistics Project, U.S. Census Bureau Small Area Income and Poverty Estimates, and U.S. Census Bureau Population and Housing Estimates for years 1980-2014)
    Measures: Population, age, gender, race/ethnicity, median household income, poverty level, and unemployment rate. 

Data on Virginia Counties/Cities

  • Centers for Disease Control and Prevention (CDC)

    -  Cancer Data
    Measures:  Cancer incidence and cancer mortality for all cancers and by cancer type (2017; 2013-2017)

    -  Wonder Data
    Measures: Mortality, disability, heart disease and stroke, communicable diseases, health practice/prevention, injuries (available years vary by indicator)

  • County Health Rankings: Virginia
    Measures:  Health outcomes (length of life, quality of life), health behaviors, clinical care, social/economic factors, and physical environment (2010-2020)

  • Dartmouth Atlas of Health Care
    Measures: Medicare spending (2003-2017), selected primary care access/quality measures (2003-2015), hospital post-discharge events (2004, 2009-2017)

  • National Cancer Institute: State Cancer Profiles
    Measures:  Demographics, screening and risk factors, incidence, mortality (available years vary by indicator)

  • Virginia Department of Health: Data Portal
    Measures:  Communicable diseases (2008-2018), demographics (1999-2016), health behaviors (available years vary by indicator), injury and violence (2016-2018), maternal and child health (2006-2017), environmental health (years depend on indicator), social determinants of health (poverty [2016] and unemployment [2000-2016])
    NOTE:  Each portal page provides guidance to researchers on how to access CSV and Excel files for their own use.

Data on Individuals/Households

  • Behavioral Risk Factor Surveillance System (BRFSS)
    Includes data from annual surveys of adults from 1985 to 2019. Survey measures include alcohol consumption, cholesterol awareness, chronic health indicators, colorectal cancer screening, e-cigarette use, days of poor health, demographics, fruits/vegetables, health care access/coverage, health status, HIV/AIDS, hypertension awareness, immunization, injury, oral health, overweight/obesity, physical activity, prostate cancer, tobacco use, and women’s health (available years vary by indicator)

  • Health and Retirement Study (HRS)
    Includes data from biennial surveys of older adults. Survey measures include demographic traits, health status, health care utilization, health care costs, cognition, functional limitations, expectations, family structure, housing, income, current and last job, job history, retirement and pension, social security, disability, health and life insurance, widowhood, divorce, internet use, physical measures, and COVID-19.
    NOTE:  W&M researchers interested in these data should contact Jennifer Mellor at jmmell@wm.edu

  • IPUMS Database: American Community Surveys (ACS)
    Includes data from surveys of nearly 3 million persons each year from 2005 to 2019. The ACS, developed by the U.S. Census Bureau, “provides an annual snapshot of the American population.” This survey provides the Census Bureau with important data on sources of health insurance coverage and the uninsured in the U.S. population. Other information is available on family interrelationships, demographics, race/ethnicity, education, work, income, socioeconomic status, migration, activity five years ago, disability, veteran status, and place of work, among others. To get the data, visit https://usa.ipums.org/usa-action/variables/group

  • Medical Expenditure Panel Survey (MEPS)
    Includes data from “surveys of families and individuals, their medical providers (doctors, hospitals, pharmacies, etc.), and employers across the United States. MEPS collects data on the specific health services that Americans use, how frequently they use them, the cost of these services, and how they are paid for, as well as data on the cost, scope, and breadth of health insurance held by and available to U.S. workers.”

  • Medicare Current Beneficiary Survey (MCBS)
    Includes data under restricted use from "a continuous, multipurpose survey of a nationally representative sample of the Medicare population, conducted by the Office of Enterprise Data and Analytics (OEDA) of the Centers for Medicare & Medicaid Services (CMS) through a contract with NORC at the University of Chicago."  Available measures include expenditures and sources of payment for all services used by Medicare beneficiaries (co-payments, deductibles, and non-covered services); all types of health insurance coverage; and outcomes over time (health status  and the impacts of Medicare program changes on satisfaction with care).  
    NOTE:  W&M researchers interested in these restricted data should contact Jennifer Mellor at jmmell@wm.edu

  • National Longitudinal Study of Adolescent to Adult Health (Add Health)
    A "longitudinal study of a nationally representative sample of over 20,000 adolescents who were in grades 7-12 during the 1994-95 school year, and have been followed for five waves to date, most recently in 2016-18." Public use data are available for waves 1-4 from three different sources. Add Health datasets for public use "contain all the survey data from In-Home Interviews but only for a subset of the full Add Health sample....Public-use data doesn’t contain ID numbers of friends, siblings or romantic partners, nor does it contain files on Obesity and Neighborhood Environment, genetics, disposition, political context and alcohol density."

  • National Longitudinal Surveys (NLS)
    A set of surveys, sponsored by the U.S. Bureau of Labor Statistics, that is "designed to gather information at multiple points in time on the labor market activities and other significant life events of several groups of men and women." Public use data are available for the following seven cohorts:
    National Longitudinal Survey of Youth 1997 (NLSY97), National Longitudinal Survey of Youth 1979 (NLSY79), NLSY79 Child and Young Adult, Older Men, Mature Women, Young Men, and Young Women.

  • Virginia Health Information (Virginia Hospitalizations): Patient Level Data
    Includes patient-level information, such as demographic, clinical, and financial information, for every discharge that occurs in Virginia hospitals (most years from 2008-2015)
    NOTE:  W&M researchers interested in these data should contact Jennifer Mellor at jmmell@wm.edu.

Are you looking for something different? Here are some places to try:

  • healthdata.gov -- Contains information on over 4,700 health-related datasets, searchable by keyword, agency, or dataset name.
  • Health Data -- Maintained by the U.S. Census Bureau, with links to data on health insurance, disability employment, health care industries, disability, fertility, HIV/AIDS, and small area health insurance estimates, among others.
  • Health Statistics & Data: Datasets/Raw Data -- Maintained by the University of California at Berkeley, with links to national and international datasets as well as California datasets.
  • ICPSR -- Maintained by the University of Michigan, with the ability to browse data by topic, series title/description, thematic data collections, geography, restriction type, data format, time period, and funding agency, among other things.
  • Statistics & Data Sets for Health Care and Public Health -- Maintained by Dartmouth, with links to national, state, and local datasets as well as datasets specific to health spending, utilization, and quality of care; food and drugs; and workplace injuries. Also includes links to global health data, disease/condition data, and interdisciplinary health data.