Web Scraping with Python


Web Scraping with Python In-Person

This workshop is a first introduction to web scraping using Python with Spyder and covers the basic approach behind most web scraping.

Websites can be full of useful data that are not always downloadable or easily accessible. Rather than manually copying and pasting from a site, Python allows you to access the raw HTML behind every webpage and automate the process of retrieving, structuring, and outputting data from pages across a domain.
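As a rough illustration of turning raw HTML into structured data, the sketch below pulls link text and URLs out of a page. It is not taken from the workshop materials: it uses Python's standard-library `html.parser` as a stand-in for BeautifulSoup so that it runs without extra installs, and the sample HTML, the `LinkCollector` class, and the `extract_links` function are all hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical page content; a real script would download this instead.
SAMPLE_HTML = """
<html><body>
  <h1>Workshops</h1>
  <ul>
    <li><a href="/events/1">Web Scraping with Python</a></li>
    <li><a href="/events/2">Intro to R</a></li>
  </ul>
</body></html>
"""


def extract_links(html):
    """Return a list of (link text, href) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


links = extract_links(SAMPLE_HTML)
for text, href in links:
    print(text, "->", href)
```

In practice the HTML string would come from an HTTP request rather than a hard-coded sample, and a library such as BeautifulSoup would likely replace the hand-written parser class.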

Software Used: Python, Spyder, Web Browser

Topics

This workshop will cover:

identifying websites for web scraping
automating scraping with Python
scraping HTML tables
scraping paginated search results
exporting results
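The last three topics can be sketched together: read the rows of an HTML table, repeat across numbered result pages until none are left, and write everything out as CSV. This is a minimal sketch under stated assumptions, not the workshop's own code. It uses only the standard library (`html.parser` and `csv`) in place of BeautifulSoup and requests, and the `PAGES` data, `fetch_page`, `scrape_all_pages`, and `to_csv` are hypothetical names invented for the example.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row currently open
        self._cell = None   # text fragments of the cell currently open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical paginated search results, keyed by page number.
PAGES = {
    1: "<table><tr><th>Title</th><th>Year</th></tr>"
       "<tr><td>Alpha</td><td>2016</td></tr></table>",
    2: "<table><tr><th>Title</th><th>Year</th></tr>"
       "<tr><td>Beta</td><td>2017</td></tr></table>",
}


def fetch_page(page_number):
    """Stand-in for an HTTP request; returns None past the last page."""
    return PAGES.get(page_number)


def scrape_all_pages():
    """Walk pages 1, 2, 3, ... and gather one header plus all data rows."""
    header, records = None, []
    page_number = 1
    while True:
        html = fetch_page(page_number)
        if html is None:
            break
        parser = TableParser()
        parser.feed(html)
        if parser.rows:
            header = header or parser.rows[0]   # keep the first header only
            records.extend(parser.rows[1:])
        page_number += 1
    return header, records


def to_csv(header, records):
    """Serialize the scraped rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        writer.writerow(header)
    writer.writerows(records)
    return buffer.getvalue()


header, records = scrape_all_pages()
csv_text = to_csv(header, records)
print(csv_text)
```

A real scraper would replace `fetch_page` with a network request, pause between requests, and check that the site permits automated access before running.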

During this Workshop

Participants will be able to download materials for the workshop and are encouraged to listen, practice, and interact as it suits their learning style.

The session will conclude with exercises, and instructional staff will be available to help.

Preparing for this Workshop

This workshop is designed for attendees who have the following:

familiarity with HTML structure
working knowledge of Python (running scripts, installing libraries, etc.)
if using a lab computer:
- a Yale NetID
- experience with Windows OS
if using personal laptop:
- Python 3 installed (Anaconda or Spyder recommended)
- 'BeautifulSoup' & 'requests' for Python installed
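For attendees using a personal laptop, one possible way to confirm the two libraries are importable before the session is sketched below. The `REQUIRED` mapping and the `missing_packages` helper are hypothetical names; the only assumption about the libraries themselves is that BeautifulSoup is distributed as the `beautifulsoup4` package and imported as `bs4`.

```python
from importlib.util import find_spec

# Map each pip package name to the module name used when importing it.
REQUIRED = {"beautifulsoup4": "bs4", "requests": "requests"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing:", ", ".join(missing))
    print("Try: pip install " + " ".join(missing))
else:
    print("All workshop libraries are importable.")
```

Running this in Spyder or from a terminal with the same Python 3 installation planned for the workshop gives the most reliable answer.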

To provide the best learning experience possible, we ask that you arrive before the scheduled start time of the workshop. Arriving more than 5 minutes late may make it difficult to catch up with the other participants.

Please email joshua.dull@yale.edu with any questions about this workshop.

This workshop will not be recorded, but session materials will be available upon request.

Date: Friday, November 9, 2018
Time: 1:30pm - 4:00pm
Time Zone: Eastern Time - US & Canada
Location: KBT C27 (Computer Classroom), Center for Science and Social Science Information (CSSSI)
Campus: Science Hill
Categories: Miscellaneous, Digital Humanities, StatLab, Computer Programming

Registration has closed.


Event Organizer

Joshua Dull