In earlier posts (here and here) I discussed how to use the Python requests and BeautifulSoup libraries to access and scrape a website. This time I am going to use Selenium to make a simple Gmail Autoresponder that replies to mail from a certain sender. Before I discuss how to do it, a few words about Selenium and why it is going to make our lives easier.
Advantages of Selenium
What do you gain by using Selenium instead of a lightweight solution based on Python requests and BeautifulSoup?
- Selenium automates browser activity by simulating clicks and other events, which makes it easier to reach information that only becomes available after JavaScript runs on the page.
- Since it automates a real browser, AJAX-based sites are easier to access.
- It takes care of low-level things like setting cookies, session handling, setting the user agent, etc. If you are using Python requests, you have to take care of these things yourself.
- A crawler based on Python requests may be blocked by a site that labels it a *bot*, unless you go to the extreme of replicating exactly the request a browser would send. With Selenium this is much less of an issue, because to the server it looks like a regular request generated by a user's browser (Firefox, Chrome, etc.).
- If you are trying to automate a complex web app like Gmail, it is easier to do in Selenium than with other libraries, which is why I find it more developer friendly.
- Unlike most Python libraries, Selenium is available in other languages as well. So if you write code in PHP, Java, VB, Perl, JavaScript, or Ruby, you can port code from one language to another without much effort.
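To make the cookie and user-agent point concrete, here is a rough sketch of the bookkeeping a hand-rolled HTTP client has to do. It uses the standard library's urllib instead of requests (my substitution, so the snippet runs without extra packages). The URL and the user-agent string are placeholders, and no request is actually sent.

```python
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, Request, build_opener

# Placeholder values for illustration only (assumptions, not from the post).
URL = 'https://example.com/'
USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0'

# A cookie jar has to be created and wired into the opener by hand;
# a real browser driven by Selenium keeps track of cookies for you.
cookies = CookieJar()
opener = build_opener(HTTPCookieProcessor(cookies))

# The User-Agent header also has to be set explicitly, otherwise the
# server sees urllib's default "Python-urllib/x.y" value.
request = Request(URL, headers={'User-Agent': USER_AGENT})

# opener.open(request) would send the request; it is left out here so
# the sketch stays offline.
print(request.get_header('User-agent'))
```

With Selenium none of this setup is needed, because the browser itself sends its usual headers and stores cookies between page loads.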
OK, without further ado, let’s get into the code.
Gmail Autoresponder
I am going to write a simple Gmail Autoresponder. It looks for a certain sender name in the inbox and, if it finds a match, replies with a predefined message. It has a couple of caveats: it assumes that the Gmail account does not have 2-Step Verification enabled and that the user is entering the correct username and password.
As in any other Python script, I first need to import the required libraries:
    from selenium import webdriver
    from time import sleep
sleep() is needed to add a few delays so that Google does not get mad.
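Fixed delays are the simplest option, but they either waste time or turn out too short when a page loads slowly. One alternative is to poll for a condition with a timeout. The helper below is only a sketch of that idea: `wait_until` and its parameters are names I made up for illustration, and Selenium ships its own `WebDriverWait` class that serves the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or the timeout expires.

    Returns the truthy value, or None if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        # Pause between checks so we do not hammer the browser or the site.
        time.sleep(interval)
```

In the script this could be called as `wait_until(lambda: 'Inbox' in driver.title, timeout=15)`, assuming the page title contains the word "Inbox" once Gmail has loaded.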
Before writing the functions, let me set a few constants that control the entire script:
    driver = None
    driver = webdriver.Firefox()
    SENDER = 'adnan'
    GMAIL_USER = '<Your Gmail ID>'
    GMAIL_PASSWORD = '<Gmail Password>'
    MESSAGE = 'I will get back to you soon. \n Thanks'
The first line is obvious. In the second line I am telling Selenium which browser to use to run this script; you can see the list of supported platforms here. SENDER is the name of the person whose mail is being monitored. GMAIL_USER and GMAIL_PASSWORD hold the credentials of the account that is being monitored. First, write a method to log in.
    def login_google():
        is_logged_in = False
        google_login = 'https://accounts.google.com/Login#identifier'
        try:
            driver.get(google_login)
            sleep(5)
            html = driver.page_source.strip()
            # email box
            user_name = driver.find_element_by_id('Email')
            if user_name:
                user_name.send_keys(GMAIL_USER)
            next = driver.find_element_by_id('next')
            if next:
                next.click()
            # give em rest
            sleep(5)
            # now enter passwd
            user_pass = driver.find_element_by_id('Passwd')
            if user_pass:
                user_pass.send_keys(GMAIL_PASSWORD)
            # rest again
            sleep(3)
            sign_in = driver.find_element_by_id('signIn')
            if sign_in:
                sign_in.click()
            # rest again
            sleep(3)
            html = driver.page_source.strip()
            is_logged_in = True
        except Exception as ex:
            print(str(ex))
            is_logged_in = False
        finally:
            return is_logged_in
driver.get() is similar to requests.get(): it loads a URL. Once the page is loaded, we can get its source by calling driver.page_source. Now that we have the HTML, we need to extract the required info. Selenium does not stop you from using BeautifulSoup here, but that makes little sense when Selenium provides a similar facility, so let’s use its own methods to access DOM elements.
    user_name = driver.find_element_by_id('Email')
    if user_name:
        user_name.send_keys(GMAIL_USER)
Here the Google account login page is loaded, and then the text box for the username/email is located. Once you have it, you just enter your Gmail user. send_keys() simply types the text into the input box.
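One detail worth knowing about this lookup: as far as I know, Selenium raises an exception (NoSuchElementException) when an element is missing, rather than returning None, so the `if user_name:` check never sees a falsy value. Below is a small sketch of a helper that handles the missing-element case explicitly. `type_into`, `FakeElement` and `FakeDriver` are hypothetical names; the fake classes exist only so the snippet runs without a browser.

```python
def type_into(driver, element_id, text):
    """Type text into the element with the given id.

    Returns True on success and False if the element cannot be found.
    """
    try:
        element = driver.find_element_by_id(element_id)
    except Exception as ex:  # the lookup raises when nothing matches
        print('Could not find #%s: %s' % (element_id, ex))
        return False
    element.send_keys(text)
    return True


# --- Hypothetical stand-ins so the sketch runs without a browser ---
class FakeElement:
    def __init__(self):
        self.typed = ''

    def send_keys(self, text):
        self.typed += text


class FakeDriver:
    def __init__(self, elements):
        self.elements = elements

    def find_element_by_id(self, element_id):
        if element_id not in self.elements:
            raise LookupError('no element with id ' + element_id)
        return self.elements[element_id]


email_box = FakeElement()
fake_driver = FakeDriver({'Email': email_box})
print(type_into(fake_driver, 'Email', 'someone@example.com'))
print(type_into(fake_driver, 'Passwd', 'secret'))
```

With the real driver from this script, the call would look like `type_into(driver, 'Email', GMAIL_USER)`.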
    next = driver.find_element_by_id('next')
    if next:
        next.click()
Google shows a NEXT button before asking for the password, so we need to access the button element and click it. So, this is how you log in. Once you are logged in, you need to monitor your INBOX.
    def access_gmail():
        try:
            driver.get('http://gmail.com')
            sleep(5)
            m = driver.find_elements_by_css_selector('.UI table > tbody > tr')
            for a in m:
                # compare case-insensitively on both sides
                if SENDER.lower() in a.text.lower():
                    a.click()
                    break
            # take rest
            sleep(5)
            reply = driver.find_element_by_css_selector('.amn > span')
            sleep(5)
            if reply:
                reply.click()
                sleep(1)
                editable = driver.find_element_by_css_selector('.editable')
                if editable:
                    editable.click()
                    editable.send_keys(MESSAGE)
                    send = driver.find_elements_by_xpath('//div[@role="button"]')
                    for s in send:
                        if s.text.strip() == 'Send':
                            s.click()
                            break  # stop after the first Send button
        except Exception as ex:
            print(str(ex))
        finally:
            return True
    m = driver.find_elements_by_css_selector('.UI table > tbody > tr')
Selenium lets you locate elements with both XPath and CSS selectors. Here I used a CSS selector to iterate over the messages.
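The sender check inside that loop is a plain substring test on each row's visible text. Here is a minimal sketch of that check pulled out into its own function and made case-insensitive on both sides. `find_row_from` and `FakeRow` are hypothetical names, and the sample row texts are invented; the function only relies on each row having a `.text` attribute, as Selenium elements do.

```python
def find_row_from(rows, sender):
    """Return the first row whose visible text mentions sender, or None."""
    wanted = sender.lower()
    for row in rows:
        if wanted in row.text.lower():
            return row
    return None


class FakeRow:
    """Hypothetical stand-in for a Selenium element; only .text is used."""

    def __init__(self, text):
        self.text = text


rows = [FakeRow('Google  Security alert'), FakeRow('Adnan  Re: Lunch?')]
match = find_row_from(rows, 'adnan')
print(match.text if match else 'no match')
```

Keeping the match separate from the click makes it easy to try out against sample text before pointing it at a live inbox.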
That’s it. As I said, a few things still need to be taken care of to make this a *solid* Gmail Autoresponder. Selenium also throws a few exceptions, such as StaleElementReferenceException, which I have ignored for the sake of simplicity. TBH, if I needed a serious autoresponder, I’d go the API route. This is a toy example for educational purposes only.
As usual, the code is available on GitHub.
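If you do want to deal with stale-element errors, one common approach is simply to retry the action a few times. The helper below is a sketch under that assumption, not something the script above uses; `retry` and its parameters are names I picked for illustration.

```python
import time


def retry(action, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call action() until it succeeds or the attempts run out.

    attempts should be at least 1. The last exception is re-raised
    if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise
            # Give the page a moment to settle before looking again.
            time.sleep(delay)
```

With Selenium you would pass a function that looks the element up again on every call, for example `retry(lambda: driver.find_element_by_css_selector('.amn > span').click())`, and narrow `exceptions` to the stale-element exception class imported from `selenium.common.exceptions`.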