ISSN: 3143-164X · DOI: 10.XXXXX/rehs · Vol. 1, Issue 1 · 2025 · Published by University & Beyond
REHS Journal
Research Experience for High School Students
Open Access CC BY 4.0 · Double-Blind Peer Review · COPE Compliant
✦ REHS Research Infrastructure · Open Data · 80+ Curated Sources

Research Datasets for High School Scholars

Explore 80+ curated, freely accessible data sources spanning federal government repositories, machine learning archives, and discipline-specific collections: everything you need to power original research across science, social sciences, health, economics, and beyond.

✅ 100% Free Access 🏛️ US Federal Gov 🤖 UC Irvine ML 🌍 International 🗺️ State & City Data 🔬 20+ Disciplines 📊 Structured & Raw 🔗 Direct Links
ℹ️ How to use this page: All datasets listed here are publicly available at no cost. Click any dataset name to visit the source. Use the search bar and discipline filters to find data relevant to your research question. Always cite datasets properly in your manuscript — see our Author Guidelines for citation format.
80+ Curated Sources · 20+ Disciplines · 15+ Federal Agencies · 50 US States with Portals · Free Access to All Sources · Many Sources Have APIs
🤖
Machine Learning & Computer Science Repositories
UCI Machine Learning Repository, Kaggle, Hugging Face, OpenML: curated for CS, AI, and data science research
12 Sources
🎓
UC Irvine
The most cited ML dataset collection in academic research — 650+ datasets spanning classification, regression, clustering, and time series. Includes Iris, Adult Income, Wine Quality, Breast Cancer Wisconsin, and hundreds more. Ideal for any CS or data science project.
Computer Science · Math/Stats · Multi-discipline · Free · API
📦
Open Machine Learning Community
3,000+ datasets for reproducible machine learning experiments. Searchable by domain, size, and task type. Integrates with Python (scikit-learn), R, and WEKA. Great for benchmarking algorithms.
Computer Science · Statistics · Free · API
🏆
Kaggle / Google
50,000+ community-uploaded datasets across virtually every topic — sports, finance, health, language, images. Many come with starter notebooks in Python/R. Free account required.
Data Science · Multi-discipline · Free
🤗
Hugging Face
80,000+ datasets for NLP, computer vision, audio, and multimodal AI research. Essential for natural language processing projects — text classification, translation, question answering, and more.
AI / NLP · Computer Vision · Free · API
📓
Papers With Code
Research-grade datasets linked directly to published papers. Ideal for replication studies or building on prior research. Covers image, text, audio, video, and structured data.
ML Research · Multi-discipline · Free
🌐
Stanford University
Large-scale social and information network datasets — Twitter, Facebook, Reddit, citation networks, road networks. Perfect for graph theory, social network analysis, and computational social science.
Networks / Graphs · Free
🖼️
Stanford Vision Lab
14 million labeled images across 20,000 categories. The benchmark dataset for computer vision and image classification. Subsets suitable for student projects are available through the official download portal.
Computer Vision · Deep Learning · Free (registration)
🔢
Community Curated / GitHub
A community-maintained mega-list of 500+ free public datasets organized by topic — agriculture, biology, climate, economics, finance, government, health, machine learning, and more.
All Disciplines · Free
🧠
Google Research
A search engine built specifically for datasets. It indexes millions of datasets published on the web, which makes it a quick way to find data on a topic from institutions worldwide.
All Disciplines · Free

🏛️
US Federal Government Open Data
Census Bureau, CDC, NIH, FDA, NASA, USDA, EPA, BLS, BEA, and more — authoritative, peer-citable government datasets
20 Sources
🇺🇸
US General Services Administration
The official US government open data portal — 300,000+ datasets from all federal departments. Covers agriculture, climate, education, energy, finance, health, safety, science, and transportation.
All Disciplines · Free · API
👥
US Census Bureau
Population, housing, economic surveys, the American Community Survey (ACS), and decennial census data. Downloadable to CSV/Excel. Essential for sociology, urban planning, economics, public health, and immigration research.
Economics · Free · API
🦠
Centers for Disease Control & Prevention
Mortality, natality (births), cancer, STI, environmental health, and vaccination data. Query by state, county, age, sex, race, and cause of death. Perfect for public health, epidemiology, and healthcare policy research.
Public Health · Free
🏥
US Department of Health & Human Services
125+ high-value health datasets from HHS including hospital quality, Medicare claims, HRSA, SAMHSA mental health data, COVID-19 datasets, and community health indicators.
Healthcare · Free · API
🧬
National Center for Biotechnology Information / NIH
Genomes, genes, proteins, taxonomy, and clinical trial data. Download genome sequences, annotated gene datasets, and protein structures. Essential for molecular biology, genetics, and bioinformatics projects.
Biology · Genetics · Biomedical · Free · API
💊
US Food & Drug Administration
Drug adverse events, drug labels, recall enforcement, device adverse events, and food enforcement records. Full API access. Great for pharmacology, public health, and consumer safety research.
Pharmacology · Consumer Safety · Free · API
🍎
US Department of Agriculture
Comprehensive nutritional data for 300,000+ foods — branded products, survey foods, and experimental data. Ideal for nutrition science, public health, and agricultural research.
Nutrition · Agriculture · Free · API
🌍
Environmental Protection Agency
Air quality (AQS), water quality, toxic release inventory, Superfund site data, greenhouse gas emissions, and environmental justice data. Essential for environmental science and climate research.
Environmental Science · Climate · Free · API
🌡️
National Oceanic & Atmospheric Administration
Weather station data, climate normals, sea surface temperature, hurricane tracks, tornado records, snow/ice data, and more — dating back to the 1800s. Downloadable CSV format. Perfect for climate trend analysis.
Climatology · Meteorology · Geoscience · Free
🚀
National Aeronautics & Space Administration
Earth observation, climate satellite data, exoplanet catalogs, asteroid tracking, solar activity, and mission telemetry. Includes APIs for real-time space weather and Earth imagery.
Astronomy · Earth Science · Free · API
💼
US Bureau of Labor Statistics
Unemployment, Consumer Price Index, Producer Price Index, employment by industry/occupation, wages, and productivity. Essential for macroeconomics, labor economics, and public policy research.
Economics · Free · API
📈
Federal Reserve Bank of St. Louis
800,000+ economic time series — GDP, inflation, interest rates, housing, banking, and international data. The go-to source for economics and finance research. Excellent charts, downloadable data, and free API.
Macroeconomics · Finance · Free · API
⚖️
US Department of Justice
Crime, incarceration, court, policing, and victimization data, including the National Crime Victimization Survey (NCVS) and Uniform Crime Reporting (UCR) data. Essential for criminology, sociology, and public policy research.
Free
🏫
National Center for Education Statistics
School quality, achievement gaps, dropout rates, college enrollment, financial aid, and longitudinal studies of students from K–12 through higher education. Perfect for education policy and sociology research.
Free
🌿
US Geological Survey
Earthquake, volcano, water resources, topographic, mineral, wildlife, and land use data. Excellent for geology, hydrology, ecology, and natural hazard research.
Geology · Hydrology · Ecology · Free · API
US Energy Information Administration
US and international energy production, consumption, prices, and CO₂ emissions — electricity, petroleum, natural gas, coal, renewables. Full API access. Great for energy and environmental policy research.
Energy · Climate · Economics · Free · API
🩺
National Library of Medicine / NIH
500,000+ clinical studies worldwide — trial design, enrollment, outcomes, intervention types. Useful for medicine, pharmacology, psychology, and public health research. Full API access available.
Medicine · Pharmacology · Free · API
🐾
IUCN / USFWS
Species threat assessments, geographic range data, population trends, and extinction risk for 150,000+ species. Ideal for conservation biology, ecology, and biodiversity research.
Conservation Biology · Ecology · Free (registration)

🗺️
State & City Government Open Data Portals
All 50 states and major cities maintain open data portals — locally relevant, highly actionable for regional research projects
15+ Portals
🌎
Data.gov / GSA
A directory of all 50 US state open data portals in one place. Find your state's official data portal for locally relevant datasets on health, transportation, environment, education, and crime.
All Disciplines · Free
🏙️
City of New York
2,000+ datasets from all NYC agencies — 311 complaints, subway ridership, restaurant inspections, school quality, NYPD crime statistics, air quality, and real estate. Ideal for urban studies, public health, and social science.
Public Health · Environment · Free · API
🌉
City & County of San Francisco
500+ datasets including housing, homelessness, traffic, crime, fire incidents, and business registration. Excellent for urban policy, housing economics, and public safety research.
Housing · Free · API
🎰
State of California
California's official open data repository — agriculture, environment, health, transportation, education, and economic data. One of the most comprehensive state portals in the US.
Multi-discipline · Environment · Free · API
🍎
State of Texas
400+ datasets covering Texas government, agriculture, energy, health, and education. Particularly strong for energy production and agricultural research.
Multi-discipline · Energy · Free
🌿
State of New Jersey
NJ government data — environment, health, transportation, education, crime, and economic development. Especially relevant for REHS Journal's NJ-based student community.
Multi-discipline · Environment · Free · API
🏛️
City of Chicago
800+ datasets — crime, permits, food inspections, public health, transportation, and government spending. Widely used in academic research as a model open-data city.
Public Health · Free · API
☀️
Miami-Dade County, Florida
Community, environmental, transportation, property, and public safety data for the Miami-Dade region — a major population and immigration hub excellent for demographic and environmental research.
Environment · Free
🌲
State of Washington
Environment, agriculture, fisheries, transportation, and health data — particularly strong in salmon ecology and Pacific Northwest environmental datasets.
Ecology · Fisheries · Free
🔵
City of Boston
Transportation, crime, housing, schools, and health datasets for Boston — including property assessment data, 311 service requests, and hospital locations. Good for urban health research.
Public Health · Free · API

🔬
Discipline-Specific Research Repositories
Curated by field — Biology, Chemistry, Physics, Psychology, Economics, History, Environment, and more
20 Sources
🧬
European Bioinformatics Institute
Functional genomics data — gene expression experiments, epigenomics, and transcriptomics. Over 100,000 experiments. Great for molecular biology and genetics research involving RNA/DNA analysis.
Genomics · Molecular Biology · Free · API
🦋
GBIF Secretariat / International
2.3 billion occurrence records of plants, animals, fungi, and microbes from 100+ countries. Searchable by species, location, date. Perfect for ecology, conservation, and evolution research.
Ecology · Biodiversity · Conservation · Free · API
❤️
MIT / Beth Israel Deaconess Medical Center
De-identified health data from ~40,000 ICU patients — vital signs, medications, lab results, diagnoses, procedures. Ideal for biomedical research, ML in healthcare, and physiology projects. Free with CITI training.
Clinical Medicine · ML in Healthcare · Free (training req.)
📊
Pew Research Center
Survey datasets on political opinions, social trends, religion, media, international attitudes, and demographics. Great for political science, sociology, and communications research.
Free (registration)
🗳️
MIT Election Data & Science Lab
County- and state-level electoral data going back decades — vote totals, turnout, candidacy information, and election administration data. Essential for political science research.
Free
🌏
World Bank Group
Development indicators for 200+ countries — GDP, poverty, education, health, gender equality, infrastructure, and environment. Excellent for international economics, development studies, and global health research.
Development Economics · Free · API
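Many sources on this page are tagged API. As one hedged illustration, the sketch below builds a request URL for the World Bank Indicators API (version 2) and parses a response in its two-element JSON shape (metadata first, records second). The helper names `indicator_url` and `parse_series` are hypothetical, the sample response is hand-written with invented numbers, and the indicator code `NY.GDP.MKTP.KD.ZG` (annual GDP growth) is only an example choice. No network request is made.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.worldbank.org/v2"


def indicator_url(country, indicator, start_year, end_year, per_page=100):
    """Build a request URL for one country and one indicator."""
    query = urlencode(
        {
            "format": "json",
            "date": f"{start_year}:{end_year}",
            "per_page": per_page,
        },
        safe=":",  # keep the year range readable as 2018:2020
    )
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def parse_series(payload):
    """Turn a decoded response into {year: value}, skipping missing values."""
    _metadata, records = payload
    return {
        int(rec["date"]): rec["value"]
        for rec in records
        if rec["value"] is not None
    }


# Hand-written stand-in for a real response; the numbers are invented.
SAMPLE_RESPONSE = json.loads("""
[
  {"page": 1, "pages": 1, "per_page": 100, "total": 3},
  [
    {"date": "2020", "value": 3.0},
    {"date": "2019", "value": null},
    {"date": "2018", "value": 2.5}
  ]
]
""")

url = indicator_url("US", "NY.GDP.MKTP.KD.ZG", 2018, 2020)
series = parse_series(SAMPLE_RESPONSE)

print(url)
print(series)
```

To work with live data, the URL could be opened with `urllib.request.urlopen` and decoded with `json.load`; that step is left out here so the example runs offline. Years with no reported value are dropped rather than treated as zero, which is worth stating in a Methods section.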
⚛️
CERN (European Organization for Nuclear Research)
Real particle collision data from the Large Hadron Collider — LHC experimental data, software, and documentation. Unique opportunity for advanced physics research projects involving particle detection and analysis.
Particle Physics · Data Analysis · Free
🧠
Open Psychometrics
Raw personality test data from millions of respondents — Big Five, Dark Triad, MBTI-like tests, and more. Excellent for psychology research on personality, behavior, and social attitudes.
Free
💹
Yahoo Finance / Nasdaq
Historical stock prices, financial statements, and market data for tens of thousands of companies. Download CSV directly for finance, economics, and behavioral finance research.
Finance · Behavioral Economics · Free (basic)
🌊
NASA Goddard Space Flight Center
Satellite-derived ocean color, sea surface temperature, and chlorophyll concentration data. Ideal for oceanography, marine biology, and climate change impact research on ocean ecosystems.
Oceanography · Marine Biology · Free
📚
NORC at the University of Chicago
Conducted since 1972, this is one of the most influential US social surveys. It tracks social trends in opinions, behaviors, and demographics, covering politics, religion, race, gender, family, and more. 6,000+ variables.
Free
🗺️
OpenStreetMap Foundation
Free geographic data for the entire world — roads, buildings, land use, parks, water bodies, and points of interest. Downloadable by region. Excellent for GIS, urban planning, and spatial analysis projects.
Geography/GIS · Free
🌱
NASA Earth Science Data Systems
Satellite imagery and Earth observation data — land cover, vegetation index (NDVI), soil moisture, atmospheric composition, and sea ice extent. Includes tutorials for student researchers.
Remote Sensing · Climate · Free (account req.)
📰
GDELT Project (Google Ideas)
Monitors broadcast, print, and online news across 100 languages — catalogues events, sentiments, and trends worldwide. Great for international relations, media studies, political science, and conflict research.
NLP · Free

🌍
International & Intergovernmental Data
WHO, UN, OECD, IMF, World Bank, EU — authoritative global datasets for international comparative research
10 Sources
🏥
World Health Organization
Global health indicators — mortality, disease burden, health system capacity, nutrition, and mental health across 194 countries. The primary source for global public health and epidemiology research.
Global Health · Free · API
🌐
United Nations Statistics Division
UN databases on population, trade, environment, human development, gender equality, and the SDGs (Sustainable Development Goals). Essential for international relations and global studies.
Development · Free
💰
International Monetary Fund
Balance of payments, government finance, monetary and financial statistics for 190+ countries. Essential for international economics, macroeconomics, and financial systems research.
International Economics · Finance · Free · API
📋
Organisation for Economic Co-operation & Development
Economic, social, and environmental statistics for 38 OECD member countries — education, health, employment, trade, science, and well-being indicators. Often used for cross-country policy comparisons.
Economics · Free · API
🌡️
EU Copernicus Programme / ECMWF
European and global climate datasets — ERA5 reanalysis, seasonal forecasts, land surface, and ocean data. Among the highest-quality climate datasets available worldwide, free for research use.
Climatology · Atmospheric Science · Free (registration)
📖
Global Change Data Lab / Oxford
Beautifully curated global data on health, poverty, education, inequality, energy, and environment — all freely downloadable as CSV. Every chart links to its underlying data. Perfect for student researchers starting a new topic.
All Disciplines · Free
📜
UN High Commissioner for Refugees
Refugee, asylum seeker, internally displaced, and stateless population data by country of origin and asylum — going back to 1951. Vital for migration studies, international relations, and human rights research.
Free · API

For REHS Journal Researchers

📋 How to Use These Datasets in Your Research

01. Identify Your Research Question
Start with a specific, testable question — not "what affects health" but "does air quality index correlate with pediatric asthma hospitalization rates in NJ counties?" Then pick the matching dataset category.
02. Download & Explore the Data
Download CSV or Excel files and open them in Google Sheets, Excel, or Python (pandas). Look at the column names, check for missing values, and make sure you understand the unit of measurement for each variable.
03. Cite the Dataset Properly
Every dataset must be cited in your manuscript. Include: author/agency, title, version, year, and URL or DOI. Follow REHS Author Guidelines for dataset citation format in your reference list.
04. Describe Your Data Collection
In your Methods section, describe how the data was collected by the original source, what it covers (time range, geography, sample size), and any limitations or known biases in the dataset.
05. Conduct Your Analysis
Use appropriate statistical methods, such as descriptive statistics, correlation, regression, or machine learning. Free or student-licensed tools include R, Python (Google Colab), JASP, SPSS (student license), Excel, and Tableau Public.
06. Submit to REHS Journal
Once your analysis is complete and your manuscript written, submit to REHS Journal for peer review. Dataset-based research is one of our most common and successful submission types.
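
The exploration and analysis steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the CSV text, its column names (`county`, `aqi`, `asthma_rate`), and every number in it are made up for demonstration, and the helper names `load_rows`, `count_missing`, and `pearson` are hypothetical. It uses only the standard library, so it should run in Google Colab or any recent Python 3 installation.

```python
import csv
import io
import math

# Made-up sample standing in for a downloaded CSV file (all values invented).
SAMPLE_CSV = """county,aqi,asthma_rate
A,40,5.0
B,55,6.8
C,70,7.9
D,85,
E,100,11.2
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def count_missing(rows):
    """Count empty cells in each column."""
    counts = {name: 0 for name in rows[0]}
    for row in rows:
        for name in counts:
            value = row.get(name)
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rows = load_rows(SAMPLE_CSV)
columns = list(rows[0].keys())
missing = count_missing(rows)

# Keep only rows where both variables are present before analysing them.
complete = [r for r in rows if r["aqi"] and r["asthma_rate"]]
r_value = pearson(
    [float(r["aqi"]) for r in complete],
    [float(r["asthma_rate"]) for r in complete],
)

print("Columns:", columns)
print("Missing values per column:", missing)
print("Rows used:", len(complete), "of", len(rows))
print("Pearson r:", round(r_value, 3))
```

With pandas, the same checks are usually written as `pandas.read_csv(...)`, `df.isna().sum()`, and `df["aqi"].corr(df["asthma_rate"])`. Dropping incomplete rows is one simple way to handle missing values; whichever approach you choose, report it in your Methods section, and remember that a correlation from a handful of rows does not show causation.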