PDF & Web Scraping | Houston R Users Meetup
January 8 @ 7:00 pm - 8:30 pm CST
Register here! https://bit.ly/1T7YWbB
We’ll have two talks on this theme:
The web scraping portion of the evening will focus on using the rvest and XML packages to pull data from a variety of web sources. A basic introduction to the topic will be followed by a discussion of the most central commands and examples of scraping various types of websites into clean data sets. Please feel free to bring sites or sources you are interested in scraping.
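As a rough taste of what the rvest part of the evening covers, here is a minimal sketch, not the speaker's material. It assumes the rvest package is installed, and the HTML snippet and its values are made up for illustration. A real script would pass a URL to read_html() instead of a string.

```r
library(rvest)

# Made-up page content standing in for a real website (illustration only).
html <- "<html><body>
  <h1>Sample readings</h1>
  <table>
    <tr><th>site</th><th>value</th></tr>
    <tr><td>A</td><td>1.5</td></tr>
    <tr><td>B</td><td>2.5</td></tr>
  </table>
</body></html>"

# Parse the document; read_html() also accepts a URL.
page <- read_html(html)

# Select a node with a CSS selector and extract its text.
title <- html_text(html_node(page, "h1"))

# Convert the <table> node into a data frame.
readings <- html_table(html_node(page, "table"))
```

The same select-then-extract pattern applies to most static pages; only the CSS selectors change from site to site.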
The PDF scraping portion will focus on extracting information from machine-readable PDFs and cleaning the raw text to isolate specific data. The talk will give an overview of the topic and its use cases, then work through examples of scraping and cleaning actual PDFs.
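The cleaning step might look something like the sketch below, which is an assumption about the general approach rather than the talk's actual code. The page text is made up. In practice it could come from a text extractor such as pdftools::pdf_text(), a package the announcement does not name.

```r
# Made-up text standing in for one extracted PDF page, e.g. the first
# element of pdftools::pdf_text("report.pdf") (package and file are assumptions).
raw_page <- paste(
  "Quarterly Report",
  "",
  "Item        Units    Price",
  "Widget         12     3.50",
  "Gadget          7    10.00",
  "",
  "Page 1 of 1",
  sep = "\n"
)

# Split the page into lines and trim surrounding whitespace.
lines <- trimws(strsplit(raw_page, "\n", fixed = TRUE)[[1]])

# Keep only lines shaped like: name, integer, decimal number.
row_pattern <- "^([A-Za-z]+)\\s+([0-9]+)\\s+([0-9]+\\.[0-9]+)$"
rows <- lines[grepl(row_pattern, lines)]

# Pull each captured group into its own typed column.
items <- data.frame(
  item  = sub(row_pattern, "\\1", rows),
  units = as.integer(sub(row_pattern, "\\2", rows)),
  price = as.numeric(sub(row_pattern, "\\3", rows)),
  stringsAsFactors = FALSE
)
```

The header, title, and page-footer lines drop out because they do not match the row pattern, leaving a two-row data frame.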