
Web Scraping with Python

This workshop is a first introduction to web scraping using Python with Spyder and covers the basic approach to most web scraping.

Websites can be full of useful data that are not always downloadable or easily accessible. Rather than manually copying and pasting from a site, Python allows you to access the raw HTML behind every webpage and automate the process of retrieving, structuring, and outputting data from pages across a domain.
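As a rough illustration of the "retrieve, structure, output" idea, the sketch below pulls the rows out of an HTML table and writes them as CSV. It is not taken from the workshop materials: the sample HTML, its placeholder values, and the names `TableParser` and `table_to_csv` are assumptions made for this example, and it uses only Python's standard library so it runs without installing anything. A real script would first download the page instead of using an inline string.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content with placeholder values; a real script
# would download the HTML from a website first.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return (rows, csv_text) for the table cells found in html."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return parser.rows, buffer.getvalue()


rows, csv_text = table_to_csv(SAMPLE_HTML)
print(csv_text)
```

To save the output to disk, the same `csv.writer` call could be pointed at a file opened with `open("results.csv", "w", newline="")` instead of the in-memory buffer.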

Software Used: Python, Spyder, Web Browser

Topics

This workshop will cover:

  • identifying websites for web scraping
  • automating scraping with Python
  • scraping HTML tables
  • scraping paginated search results
  • exporting results
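For the paginated search results topic, one common pattern is to scrape a page, look for a "next" link, and repeat until no such link remains. The sketch below shows that loop under stated assumptions: the three in-memory pages, the `result` class on list items, the `rel="next"` link, and the `fetch` and `scrape_all` helpers are all hypothetical, standing in for a real site and real HTTP requests. It is not the workshop's own code.

```python
from html.parser import HTMLParser

# Hypothetical site: three result pages held in memory in place of
# real HTTP responses. The markup conventions are assumptions.
PAGES = {
    "/search?page=1": (
        '<ul><li class="result">first</li><li class="result">second</li></ul>'
        '<a rel="next" href="/search?page=2">Next</a>'
    ),
    "/search?page=2": (
        '<ul><li class="result">third</li></ul>'
        '<a rel="next" href="/search?page=3">Next</a>'
    ),
    "/search?page=3": '<ul><li class="result">fourth</li></ul>',
}


class ResultPageParser(HTMLParser):
    """Collect result text and the URL of the next page, if any."""

    def __init__(self):
        super().__init__()
        self.results = []
        self.next_url = None
        self._in_result = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "result":
            self._in_result = True
        elif tag == "a" and attrs.get("rel") == "next":
            self.next_url = attrs.get("href")

    def handle_data(self, data):
        if self._in_result and data.strip():
            self.results.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_result = False


def fetch(url):
    """Stand-in for an HTTP request; returns the HTML for url."""
    return PAGES[url]


def scrape_all(start_url, max_pages=50):
    """Follow 'next' links from start_url and gather every result."""
    results = []
    seen = set()  # guards against a site whose links loop back
    url = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        parser = ResultPageParser()
        parser.feed(fetch(url))
        parser.close()
        results.extend(parser.results)
        url = parser.next_url
    return results


all_results = scrape_all("/search?page=1")
print(all_results)
```

Against a live site, `fetch` would make a real request, and it is usually considerate to pause between pages (for example with `time.sleep`) and to check the site's terms before scraping.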

During this Workshop

Participants will be able to download materials for the workshop and are encouraged to listen, practice, and interact as it suits their learning style.

The session will conclude with exercises, with instructional staff available to help.

Preparing for this Workshop

This workshop is designed for attendees who have the following:

  • familiarity with HTML structure
  • working knowledge of Python (running scripts, installing libraries, etc.)
  • if using a lab computer:
    • a Yale NetID
    • experience with Windows OS
  • if using a personal laptop:

To provide the best learning experience possible, we ask that you arrive before the scheduled start time of the workshop. Arriving more than 5 minutes late may make it difficult to catch up with the other participants.

Please email joshua.dull@yale.edu with any questions about this workshop.

This workshop will not be recorded, but session materials will be available upon request.

Date:
Friday, November 9, 2018
Time:
1:30pm - 4:00pm
Location:
KBT C27 (Computer Classroom) Center for Science and Social Science Information (CSSSI)
Campus:
Science Hill
Categories:
Data, Digital Scholarship, Programming, StatLab, python
Registration has closed.

Event Organizer

Joshua Dull