BSOS Data Curation Fellowship
One of the biggest challenges for social science faculty teaching statistics and data science courses is developing data examples that are relevant, interesting, and well structured. Project-based learning with real-life datasets is invaluable for showing students exactly how data science is applied in the discipline, but the burden on faculty to clean, manage, and test those datasets is substantial. The BSOS Data Repository was built in response to calls from faculty across BSOS departments for high-quality social science datasets. The University of Maryland Teaching and Learning Transformation Center (TLTC) Experiential Learning program-level grant gave us the opportunity to build this data repository and launch a new Data Curation Fellowship, which gives undergraduate students a unique opportunity to learn about the upstream parts of the data project lifecycle and contribute to a real data resource that serves University of Maryland faculty.
Fellowship Overview
The BSOS Data Curation Fellowship was developed at the University of Maryland's College of Behavioral and Social Sciences to build up the data repository and give students the opportunity to learn to curate, clean, document, and publish social science datasets for teaching and research. Our fellowship teams work with real research data from psychology, public health, elections, and more, turning raw datasets into well-documented resources. The Data Curation Fellowship gives BSOS students hands-on experience in data management, documentation, and scholarly communication. Students learn industry-standard practices for data cleaning, quality assurance, and metadata creation while working on datasets that directly support faculty research and undergraduate instruction.
What students do: Identify relevant and interesting sources of raw data, collect and combine data sources, detect and handle missing values, standardize formats, create comprehensive codebooks, write dataset documentation, and publish datasets via CKAN for public access.
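As a rough illustration of the "detect and handle missing values" and "standardize formats" steps, the sketch below normalizes column names and converts common missing-value markers to `None`. It is a minimal example under stated assumptions, not the fellowship's actual tooling: the function names, the list of missing-value tokens, and the sample record are all hypothetical.

```python
import re

# Hypothetical set of strings treated as missing; real datasets may use others.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null"}


def standardize_name(name):
    """Convert a raw column header to lowercase snake_case."""
    collapsed = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return collapsed.strip("_").lower()


def clean_record(record):
    """Trim whitespace and map missing-value markers to None."""
    cleaned = {}
    for key, value in record.items():
        if value is None or value.strip().lower() in MISSING_TOKENS:
            cleaned[standardize_name(key)] = None
        else:
            cleaned[standardize_name(key)] = value.strip()
    return cleaned


# Illustrative record, not taken from any repository dataset.
raw = {"Respondent Age (years)": " 34 ", "Home State": "N/A"}
cleaned = clean_record(raw)
```

Recording which tokens were treated as missing, and why, is the kind of decision that would normally be written into the codebook alongside the cleaned file.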
Faculty Mentors
Brian Kim, PhD
Social Data Science, College of Behavioral and Social Sciences
University of Maryland, College Park
Role: Principal Investigator
Jacob J. Coutts, PhD, MAS
Lecturer in Social Data Science and Psychology
University of Maryland, College Park
Faculty Mentor on the BSOS Data Curation Fellowship project.
Student Teams — Fellowship Cohorts, 2023–2026
Fellowship cohorts are listed in reverse chronological order so the current 2026 groups appear first. Cohort naming follows the original fellowship structure: 2023, 2025, and 2026 were organized as Groups, while 2024 was organized as Teams.
2026 Data Curation Fellowship
The current fellowship cohort is in progress. These groups are featured now so faculty and students can see the active curation tracks even before the final datasets are published into the repository.
Group 1
Rachel Carreras, Sophia Chau, Sahana Prasad
Current Focus
Group 2
Sumedha Vankadara, Mondukpe Somakpo, Kashvi Tiwari
Current Focus
Group 3
Okan Ulug-Berter, Lacie Hurst, Niyati Sharma
Current Focus
2025 Data Curation Fellowship
The 2025 groups moved into larger integrated curation tracks: MIDUS, mortality and health factors, and election plus Maryland transportation work.
Group 1
Brynn Saffer, Danny Pham, Deena Habash
Worked On
Group 3
Maya Sarma, Olivia Durán, Johannes Tsigea
Worked On
2024 Data Curation Fellowship
The 2024 teams focused on mental health, crime, political contribution, and student wellbeing datasets, with each team handling multiple related projects.
Team 1
Gabrielle White, Michael Schwartz, Sarina Li
Worked On
Team 2
Ishita Singh, Abigail Mor, Katy Lamb
Worked On
Team 3
Orla Collins, Jaya Marella, Tanvi Schwartz
Worked On
2023 Data Curation Fellowship
The inaugural fellowship cohort split work across trust and nutrition data, public safety and civic behavior projects, and a student COVID-19 survey study.
Group 1
Jane Park, Victor Su, Josephine Whittington
Worked On
Group 2
Michael Medeiros, Tony Shen, Adam Chazan
Worked On
Group 3
Callie Sullivan, Aya Hussein, Maryam Jameel
Worked On
Technical Implementation & Repository
Repository Architecture & Technical Implementation by Namit: The data curation platform runs on a CKAN-based repository with a PostgreSQL backend, live analytics dashboards, and automated quality metrics. Namit designed and deployed the full platform infrastructure, including CKAN customizations, the analytics dashboard, the deployment pipeline, and the site integrations that enable students to publish and manage curated datasets at scale.
Curation Workflow
Workflow Figure — Coming Soon
A visual diagram of the data curation pipeline, from dataset discovery through cleaning, documentation, and publication, will be added here.
Usage Guidelines
- Browse & Discover — Use the Groups page to explore datasets by thematic cluster or fellowship cohort.
- Evaluate Quality — Each dataset page includes an Instructor Snapshot sidebar showing row/column counts, null percentages, and quality scores.
- Download Data — CSV and JSON exports are available directly from dataset pages. Use the Data Preview to inspect records before downloading.
- Read Documentation — Use the dataset notes, codebooks, and supporting files on each dataset page to understand provenance, variables, and curation decisions.
- Explore Analytics — Visit the Interactive Dashboard for repository-wide trends, method breakdowns, and thematic coverage.
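For instructors who want to reproduce the kind of figures shown in the Instructor Snapshot after downloading a CSV export, the sketch below computes row and column counts and per-column null percentages using only the Python standard library. It is an assumption-laden illustration: the `snapshot` function, the set of null tokens, and the sample data are hypothetical, and the repository's own quality metrics may be computed differently.

```python
import csv
import io


def snapshot(csv_text, null_tokens=("", "NA", "NaN", "null")):
    """Return row count, column count, and per-column null percentage."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    null_counts = {name: 0 for name in columns}
    rows = 0
    for record in reader:
        rows += 1
        for name in columns:
            value = record.get(name)
            # Short rows yield None; blank or marker strings also count as null.
            if value is None or value.strip() in null_tokens:
                null_counts[name] += 1
    null_pct = {
        name: (100.0 * count / rows if rows else 0.0)
        for name, count in null_counts.items()
    }
    return {"rows": rows, "columns": len(columns), "null_pct": null_pct}


# Illustrative data, not drawn from any repository dataset.
sample = "id,age,state\n1,34,MD\n2,,VA\n3,NA,MD\n4,29,\n"
result = snapshot(sample)
```

To run this against a downloaded file, read the file's contents into a string and pass it to `snapshot`; comparing the result with the sidebar values is a quick check that the download is complete.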
History & Origins
The first Data Curation Fellowship ran in Spring 2023, and the datasets curated as part of that fellowship formed the initial data repository. The current website, hosted on CKAN, was developed in 2024, with undergraduate students from the 2024 fellowship contributing to maintaining the repository.
Funding & Support
The BSOS Data Repository and the Data Curation Fellowship were launched with support from the University of Maryland Teaching and Learning Transformation Center Experiential Learning program-level grant. Continued support for the data repository and undergraduate fellowship comes from the Social Data Science major.