Previous articles
Web Scraping for Fun and Profit: A Primer
Web Scraping for Fun and Profit: Part 1
Web Scraping for Fun and Profit: Part 2
Web Scraping for Fun and Profit: Part 3
Next article
Web Scraping for Fun and Profit: Part 5
Web Scraping for Fun and Profit: Part 6
Web Scraping for Fun and Profit: Part 7
Web Scraping for Fun and Profit: Part 8
Recap
In Part 1, we installed Python, pip, and the Requests library, and set up a basic program that fetches the content of this site. In Part 2, we installed BeautifulSoup and used it to parse the page and extract the data we care about. In Part 3, we installed MongoDB and the pymongo library, and used them to determine whether the fetched data is new.
In this part, we will set up a Twilio account for sending ourselves a text message (SMS) when new data arrives.
Why Twilio?
Twilio provides a simple mechanism for delivering SMS messages programmatically out of the box. It is fairly easy to get up and running, and the trial period provides you with a ton of free credit, so long as you’re content with the “Sent from your Twilio trial account - ” prefix showing up in your text messages.
Alternatively, you could use an email account with Python’s smtplib library and send alert texts via email. That is the solution I ended up adopting after my first Twilio number got spam-blocked because I accidentally sent a bunch of texts all at once. Big takeaway: always add a quick sleep between sending text messages, just in case! Also, when fetching the data from the page for the first time, it’s useful to have a command-line option that just seeds the database with the initial entries, rather than sending out a massive flurry of notifications on the first run. We will add this and other options in subsequent posts.
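The sleep-between-texts advice can be sketched roughly as follows. This is a minimal illustration rather than the code used in this series: send_throttled and its send callback are hypothetical names, and the one-second default delay is an assumed value, not a documented Twilio limit.

```python
import time


def send_throttled(bodies, send, delay_seconds=1.0):
    """Call send(body) for each body, pausing between consecutive sends.

    `send` is any callable that delivers one message (hypothetical here).
    Returns the number of messages sent.
    """
    sent = 0
    for body in bodies:
        if sent > 0:
            # Pause before every message except the first one
            time.sleep(delay_seconds)
        send(body)
        sent += 1
    return sent
```

Passing the sending function in as a parameter keeps the throttling logic independent of whether the texts go out through Twilio or through email.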
Create a Twilio account
Head over to twilio.com/try-twilio to create an account. Once you’re set up and logged in, open the Twilio console and take note of your “Account SID” and “Auth Token”. You will need these to connect to Twilio via Python. You can get one free phone number with your Twilio trial account, which you should also take note of for later.
Install twilio-python
Let’s use pip to install twilio-python, Twilio’s helper library for Python.
If using Python 3.x: pip3 install twilio
If using Python 2.x: pip install twilio
Twilio API: SMS basics
What you need to know to get the basics working here is pretty simple.
1) Instantiate a Twilio client passing in your “Account SID” and “Auth Token” as constructor parameters.
Alternatively, calling the TwilioRestClient constructor without these parameters makes it look for the TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN variables in the current environment, which is the preferred method. Undoubtedly, I’ll come back around to such things as configuring your environment/shell in future posts. :)
2) Create a new message with parameters “to”, “from_”, and “body”.
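The two steps above might look something like the sketch below. It assumes twilio-python 6.0 or later, where the client class is twilio.rest.Client (older releases exposed it as TwilioRestClient). The helper names make_client and send_sms are my own for illustration, and the phone numbers in the usage comment are placeholders.

```python
import os


def make_client(account_sid=None, auth_token=None):
    """Build a Twilio REST client, falling back to environment variables."""
    # Imported here so this module still loads where twilio is not installed.
    from twilio.rest import Client  # named TwilioRestClient before 6.0

    account_sid = account_sid or os.environ["TWILIO_ACCOUNT_SID"]
    auth_token = auth_token or os.environ["TWILIO_AUTH_TOKEN"]
    return Client(account_sid, auth_token)


def send_sms(client, to, from_, body):
    """Create (send) one SMS and return the resulting message object."""
    # "from_" has a trailing underscore because "from" is a Python keyword.
    return client.messages.create(to=to, from_=from_, body=body)


# Example usage (placeholder numbers; from_ must be your Twilio number):
# client = make_client()
# message = send_sms(client, "+15555550100", "+15555550199", "Hello!")
# print(message.sid)
```

Reading the credentials from the environment keeps them out of the source file, which matters once the script goes under version control.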
A brief note on character encodings
In theory, Twilio recognizes the character encoding you need and performs this encoding automatically, with full Unicode support… but in practice, I’ve noticed some inconsistencies in this regard. If you don’t anticipate needing non-ASCII characters, you can strip them out or replace them with another character before handing the text to Twilio.
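One way to do that replacement, as a small sketch assuming Python 3 strings (to_ascii is a name made up for this example, not something from the Twilio library):

```python
def to_ascii(text, replacement="?"):
    """Return text with every non-ASCII character swapped for replacement.

    Pass replacement="" to strip those characters instead.
    """
    return "".join(ch if ord(ch) < 128 else replacement for ch in text)


# Example: an accented letter and an en dash both become "?"
# to_ascii("caf\u00e9 \u2013 $5")  ->  "caf? ? $5"
```

If you only ever want to drop the characters, text.encode("ascii", "ignore").decode("ascii") does the same job with the standard codec machinery.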
Putting it all together
We’ll create a simple method for formatting an extracted post for SMS delivery, and pass the returned value as the body of the message.
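Here is a rough sketch of what that could look like. The shape of an extracted post is an assumption on my part: the example treats it as a dict with hypothetical "title" and "url" keys, so swap in whatever fields your scraper actually produces. The client argument is expected to be a Twilio client exposing messages.create.

```python
def format_post(post):
    """Format an extracted post (a dict) as a short SMS body.

    The "title" and "url" keys are assumptions for this sketch.
    """
    title = post.get("title", "(untitled)")
    url = post.get("url", "")
    # strip() drops the trailing newline when there is no URL
    return "New post: {0}\n{1}".format(title, url).strip()


def notify(client, post, to, from_):
    """Send the formatted post as the body of one SMS."""
    return client.messages.create(to=to, from_=from_, body=format_post(post))
```

Keeping the formatting separate from the sending makes it easy to check the message text without actually spending any Twilio credit.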
If you’re following my advice on version control, stage your changes for commit with “git add scraper.py”, and commit them with something like “git commit -m ‘Format new listing for SMS delivery and send via Twilio’”.
Hey, we’ve got a basic, functional alert program set up now! We’ll want to automate the execution of the program, either by running the main logic in a perpetual loop that sleeps for a good while after each iteration, or by using a service such as cron to execute the program at specified times for you. We’ll also want to refactor the code to make it more extensible for doing this kind of thing for a wide variety of sites and data. These will be covered in coming posts. Stay tuned!
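As a preview of the perpetual-loop option, a bare-bones sketch might look like this. The names poll and check_once are hypothetical, the ten-minute interval is just an assumed default, and max_iterations exists only so the loop can be bounded while experimenting.

```python
import time


def poll(check_once, interval_seconds=600, max_iterations=None):
    """Call check_once() repeatedly, sleeping between iterations.

    Leave max_iterations as None to run until interrupted.
    Returns the number of iterations completed.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        check_once()
        iterations += 1
        # Skip the final sleep when a bounded run has finished
        if max_iterations is None or iterations < max_iterations:
            time.sleep(interval_seconds)
    return iterations
```

Here check_once would wrap the fetch, parse, compare, and notify steps built up over the previous parts.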