BSOS Data Curation Fellowship

One of the biggest challenges for social science faculty who teach statistics and data science courses is developing data examples that are relevant, interesting, and well structured. Project-based learning with real-life datasets is invaluable for showing students how data science is applied in the discipline, but the burden on faculty to clean, manage, and test those datasets is substantial. The BSOS Data Repository was built in response to calls from faculty across BSOS departments for high-quality social science datasets. The University of Maryland Teaching and Learning Transformation Center (TLTC) Experiential Learning program-level grant gave us the opportunity to build this data repository and launch a new Data Curation Fellowship, which gives undergraduate students a unique opportunity to learn about the upstream parts of the data project lifecycle and contribute to a real data resource that serves University of Maryland faculty.

Fellowship Overview

The BSOS Data Curation Fellowship was developed at the University of Maryland's College of Behavioral and Social Sciences to build up the data repository and give students the opportunity to learn to curate, clean, document, and publish social science datasets for teaching and research. Our fellowship teams work with real research data from psychology, public health, elections, and more, turning raw datasets into well-documented resources. The Data Curation Fellowship gives BSOS students hands-on experience in data management, documentation, and scholarly communication. Students learn industry-standard practices for data cleaning, quality assurance, and metadata creation while working on datasets that directly support faculty research and undergraduate instruction.

What students do: Identify relevant and interesting sources of raw data, collect and combine data sources, detect and handle missing values, standardize formats, create comprehensive codebooks, write dataset documentation, and publish datasets via CKAN for public access.
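
As a rough illustration of the cleaning and codebook steps listed above, here is a minimal Python sketch that uses only the standard library. The sample data, the set of missing-value tokens, and the helper names (`snake_case`, `clean_rows`, `build_codebook`) are assumptions made for this example and do not describe the fellowship's actual tooling.

```python
import csv
import io
import re

# Hypothetical tokens treated as missing; real datasets document their own codes.
MISSING_TOKENS = {"", "na", "n/a", "null", "-99"}


def snake_case(name):
    """Standardize a column header, e.g. 'Age (Years)' -> 'age_years'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean_rows(raw_csv):
    """Parse well-formed CSV text, standardize headers, map missing tokens to None."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    rows = []
    for record in reader:
        cleaned = {}
        for key, value in record.items():
            value = (value or "").strip()
            is_missing = value.lower() in MISSING_TOKENS
            cleaned[snake_case(key)] = None if is_missing else value
        rows.append(cleaned)
    return rows


def build_codebook(rows):
    """Summarize each variable: count of missing values and up to three examples."""
    codebook = {}
    for row in rows:
        for name, value in row.items():
            entry = codebook.setdefault(name, {"missing": 0, "examples": []})
            if value is None:
                entry["missing"] += 1
            elif value not in entry["examples"] and len(entry["examples"]) < 3:
                entry["examples"].append(value)
    return codebook


# Made-up sample data for illustration only.
SAMPLE = "Respondent ID,Age (Years),State\n1,34,MD\n2,NA,md \n3,-99,VA\n"
rows = clean_rows(SAMPLE)
codebook = build_codebook(rows)
```

In this sketch the codebook surfaces both the missing ages and the inconsistent state codes (`MD` and `md`), which is the kind of issue a curator would then resolve and record in the dataset documentation.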

Faculty Mentors

Brian Kim, PhD

Social Data Science, College of Behavioral and Social Sciences

University of Maryland, College Park

Role: Principal Investigator

Jacob J. Coutts, PhD, MAS

Lecturer in Social Data Science and Psychology

University of Maryland, College Park

Role: Faculty Mentor, BSOS Data Curation Fellowship

Student Teams — Fellowship Cohorts, 2023–2026

Fellowship cohorts are listed in reverse chronological order so the current 2026 groups appear first. Cohort naming follows the original fellowship structure: 2023, 2025, and 2026 were organized as Groups, while 2024 was organized as Teams.

2026 Data Curation Fellowship

The current fellowship cohort is in progress. These groups are featured now so faculty and students can see the active curation tracks even before the final datasets are published into the repository.

Group 1

Rachel Carreras, Sophia Chau, Sahana Prasad

Current Focus
MIDUS Dataset, New York State Dataset, Environment & Flu Rates Dataset

Group 2

Sumedha Vankadara, Mondukpe Somakpo, Kashvi Tiwari

Current Focus
Causes of Deaths & Potential Factors, Social Media Influence on Voting Behavior

Group 3

Okan Ulug-Berter, Lacie Hurst, Niyati Sharma

Current Focus
Ukraine Conflict Dataset, Syria Conflict Dataset

2025 Data Curation Fellowship

The 2025 groups moved into larger integrated curation tracks: MIDUS, mortality and health factors, and election plus Maryland transportation work.

Group 1

Brynn Saffer, Danny Pham, Deena Habash

Worked On
MIDUS (Midlife in the United States) Dataset

Group 2

Faique Ameer, Jasmine Armoo, Tuongvy Ky

Worked On

Group 3

Maya Sarma, Olivia Durán, Johannes Tsigea

Worked On

2024 Data Curation Fellowship

The 2024 teams focused on mental health, crime, political contribution, and student wellbeing datasets, with each team handling multiple related projects.

Team 1

Gabrielle White, Michael Schwartz, Sarina Li

Worked On

Team 3

Orla Collins, Jaya Marella, Tanvi Schwartz

Worked On

2023 Data Curation Fellowship

The inaugural fellowship cohort split work across trust and nutrition data, public safety and civic behavior projects, and a student COVID-19 survey study.

Group 2

Michael Medeiros, Tony Shen, Adam Chazan

Worked On

Group 3

Callie Sullivan, Aya Hussein, Maryam Jameel

Worked On

Technical Implementation & Repository

Repository Architecture & Technical Implementation by Namit — The data curation platform runs on a CKAN-based repository with a PostgreSQL backend, live analytics dashboards, and automated quality metrics. Namit designed and deployed the full platform infrastructure, including CKAN customizations, the analytics dashboard, the deployment pipeline, and the site integrations that enable students to publish and manage curated datasets at scale.

Curation Workflow

Workflow Figure — Coming Soon

A visual diagram of the data curation pipeline, from dataset discovery through cleaning, documentation, and publication, will be added here.

Usage Guidelines

  • Browse & Discover — Use the Groups page to explore datasets by thematic cluster or fellowship cohort.
  • Evaluate Quality — Each dataset page includes an Instructor Snapshot sidebar showing row/column counts, null percentages, and quality scores.
  • Download Data — CSV and JSON exports are available directly from dataset pages. Use the Data Preview to inspect records before downloading.
  • Read Documentation — Use the dataset notes, codebooks, and supporting files on each dataset page to understand provenance, variables, and curation decisions.
  • Explore Analytics — Visit the Interactive Dashboard for repository-wide trends, method breakdowns, and thematic coverage.
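
For readers who want to check a downloaded CSV themselves, the sketch below computes the kind of figures the Instructor Snapshot describes: row and column counts and per-column null percentages. It is an illustration only. The function name `snapshot`, the sample data, and the rule that an empty cell counts as null are assumptions, and the repository's own quality score is not reproduced here.

```python
import csv
import io


def snapshot(csv_text):
    """Return row count, column count, and percent of empty cells per column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    null_counts = [0] * len(header)
    n_rows = 0
    for row in reader:
        n_rows += 1
        for i in range(len(header)):
            # A cell that is absent or blank is counted as null (an assumption).
            cell = row[i].strip() if i < len(row) else ""
            if cell == "":
                null_counts[i] += 1
    null_pct = {
        name: (100.0 * count / n_rows if n_rows else 0.0)
        for name, count in zip(header, null_counts)
    }
    return {"rows": n_rows, "columns": len(header), "null_pct": null_pct}


# Made-up sample standing in for a downloaded export.
SAMPLE = "id,score,county\n1,4.5,Howard\n2,,Prince George's\n3,3.0,\n4,,Montgomery\n"
summary = snapshot(SAMPLE)
print(summary)
```

Running the same function on a real export before class gives a quick sense of how much missing data students will need to handle.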

History & Origins

The first Data Curation Fellowship ran in Spring 2023, and the datasets curated as part of that fellowship formed the initial data repository. The current website, hosted on CKAN, was developed in 2024, with undergraduate students from the 2024 fellowship contributing to maintaining the repository.

Funding & Support

The BSOS Data Repository and the Data Curation Fellowship were launched with support from the University of Maryland Teaching and Learning Transformation Center Experiential Learning program-level grant. Continued support for the data repository and undergraduate fellowship comes from the Social Data Science major.