Working with Hierarchical and Web-based Data (In-Person)

This intermediate-level workshop equips researchers with essential skills for collecting, managing, and analyzing web-based data through APIs and hierarchical data structures. Participants will learn to construct and execute API requests to retrieve data from web services, authenticate with OAuth protocols, and convert semi-structured formats like JSON and XML into rectangular datasets suitable for analysis in R. Using modern R tools including httr2, jsonlite, and the tidyverse, researchers will develop robust workflows for cleaning, manipulating, and merging datasets from multiple web sources. The workshop emphasizes practical applications with digital trace data, covering rate limiting strategies, error handling, and best practices for storing and visualizing large web datasets. Participants will work with real APIs including the World Bank Development Indicators, New York Times Article Search, and Reddit APIs to gain hands-on experience with contemporary data collection methods.

For full participation in the hands-on exercises, attendees should register for a free New York Times developer account prior to the workshop. Basic familiarity with R and data manipulation using the tidyverse is recommended.

Date:
Wednesday, October 29, 2025
Time:
3:00pm - 5:00pm
Time Zone:
Eastern Time - US & Canada
Location:
RKZ Library Classroom 01
Campus:
Science Hill
Categories:
StatLab

Registration is required. There are 15 seats available.

Event Organizer

Ted Ellsworth