Looking for a clinical data science project? Here are some free data sets that might be of use (in no particular order):
Human Physiological and Neuroimaging Data
- Cam-CAN (Cambridge Centre for Ageing Neuroscience) dataset: fMRI, MRI, MEG, and behavioral human neurodata on people of various ages (i.e., cross-sectional data)
- PhysioNet: A wide array of clinically relevant physiological recordings (e.g., sleep staging EEG/EMG data, ECG data) as well as some software
- ieeg.org: A large database of EEG and intracranial EEG (iEEG) data from patients and animals suffering from epilepsy
- Kaggle iEEG epilepsy competitions
- Wrist sensor data for heart rate estimation
- Miller & Ojemann’s library of human electrocorticographic data and analyses: Free database of ECoG data acquired during various tasks (e.g., motor, memory, vision)
- Arnaud Delorme’s list of public EEG datasets
- ABIDE autism neuroimaging databases: MRI and fMRI data from individuals diagnosed with autism and control participants
- PAMAP2 Physical Activity Monitoring Data Set: Contains data from 18 different physical activities (such as walking, cycling, and playing soccer), performed by 9 subjects wearing 3 inertial measurement units and a heart rate monitor. The dataset can be used for activity recognition and intensity estimation, and for developing and applying algorithms for data processing, segmentation, feature extraction, and classification.
- The Human Connectome Project: fMRI, MRI, DTI, & MEG data from 900 individuals (some across several years I believe)
- OMEGA (The Open MEG Archive): Magnetoencephalography (MEG) database hosted by the Montreal Neurological Institute
- Heart Disease Data Set: 76 attributes of individuals that should help detect heart disease
Health Care Data
- Medicare Health Outcomes Survey: Measures the ‘physical and mental health and well-being’ of beneficiaries over a two-year period. The data set covers recipients from 1998 to 2014.
- Medical Expenditure Panel Survey: From the Agency for Healthcare Research and Quality.
- Behavioral Risk Factor Surveillance System: American adult health behaviors collected by the Centers for Disease Control and Prevention
- OpenFDA: The FDA’s open data platform. Includes things such as databases on medical devices.
- A Dutch Hospital’s Event Log and a Dutch sepsis patient event log: The former is from the 2011 Business Process Intelligence Challenge. Winning analyses from that competition are posted as well.
- Anonymous medical records
- U.S. Food and Drug Administration records on www.data.gov: Includes things like lists of FDA-approved therapeutic products, their marketing applications, and active ingredients
Note: most of the links in this last section are courtesy of the Data Incubator.