Web Scraping with Python

Websites can be full of useful data that is not always downloadable or easily accessible. Rather than manually copying and pasting from a site, Python lets you access the raw HTML behind a webpage and automate the process of retrieving, structuring, and outputting data from pages across a domain. This workshop will cover identifying good candidates for scraping, discovering what data can be scraped, and how Python helps automate the process. Attendees are encouraged to bring examples of sites they want to scrape, as there may be some time to discuss individual projects. This class assumes a working knowledge of Python (running code, installing libraries, etc.) and familiarity with HTML structure.

This workshop will use the Linux command line to run Python code. While lab computers have Python IDLE installed, attendees can use personal laptops with any Python environment to run code.
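
As a rough illustration of the retrieve, structure, and output steps described above, here is a minimal sketch that uses only the Python standard library. It is not the workshop's own material: the sample HTML, the `LinkCollector` class, and the `scrape_links` and `to_csv` helpers are all hypothetical names invented for this example, and the page is an inline string so the code runs without network access.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page. In practice the HTML would be retrieved from a
# site, for example with urllib.request.urlopen(url).read().decode("utf-8").
SAMPLE_HTML = """
<html><body>
  <ul>
    <li><a href="/events/1">Intro to R</a></li>
    <li><a href="/events/2">Web Scraping with Python</a></li>
  </ul>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_links(html):
    """Structure step: turn raw HTML into a list of (href, text) tuples."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def to_csv(rows):
    """Output step: serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["href", "text"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(scrape_links(SAMPLE_HTML)), end="")
```

Saved to a file, the sketch can be run from the command line with `python3` followed by the file name. Third-party libraries are often used for the same steps, but the standard library keeps the example free of installation requirements.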

Friday, February 24, 2017
1:30pm - 3:30pm
KBT C27 (Computer Classroom) Center for Science and Social Science Information (CSSSI)
Science Hill
Center for Science and Social Science Information (CSSSI), Data, Digital Scholarship, Digital Humanities, Programming, StatLab
Registration has closed.

Event Organizer

Joshua Dull