In this post of ScrapingTheFamous, I am going to write a scraper that will scrape data from eBay. eBay is an online auction site where people put up listings to sell their stuff, often via auction.
Like before, we will be writing two scripts: one to fetch listing URLs and store them in a text file, and the other to parse those links. The extracted data will be stored in JSON format for further processing.
I will be using the Scraper API service for scraping purposes, which frees me from all worries about IP blocking and rendering dynamic sites since it takes care of everything.
The first script fetches the listings of a category. So let's do it!
import requests
from bs4 import BeautifulSoup

if __name__ == '__main__':
    API_KEY = None
    links_file = 'links.txt'
    links = []

    # Read the Scraper API key from a local file
    with open('API_KEY.txt', encoding='utf8') as f:
        API_KEY = f.read().strip()

    # Category page to scrape: Computer Components & Parts
    URL_TO_SCRAPE = 'https://www.ebay.com/b/Computer-Components-Parts/175673/bn_1643095'
    payload = {'api_key': API_KEY, 'url': URL_TO_SCRAPE, 'render': 'false'}
    r = requests.get('http://api.scraperapi.com', params=payload, timeout=60)

    if r.status_code == 200:
        html = r.text.strip()
        soup = BeautifulSoup(html, 'lxml')

        # Each listing is an anchor inside an .s-item container
        entries = soup.select('.s-item a')
        for entry in entries:
            href = entry.get('href', '')
            # Product pages contain 'p/' in their URL
            if 'p/' in href:
                listing_url = href.replace('&rt=nc#UserReviews', '')
                links.append(listing_url)

        if len(links) > 0:
            with open(links_file, 'a+', encoding='utf8') as f:
                f.write('\n'.join(links))
            print('Links stored successfully.')
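If you want to double-check what got stored before moving on, a minimal sketch like the one below reads the file back and prints a few URLs. It assumes links.txt sits next to the script, which is how the code above writes it.

# Minimal sketch: read the stored listing URLs back from links.txt
# (assumes the file was created by the script above in the same folder)
with open('links.txt', encoding='utf8') as f:
    listing_urls = [line.strip() for line in f if line.strip()]

print(f'{len(listing_urls)} listing URLs loaded')
for url in listing_urls[:5]:
    print(url)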
Now let's write the parse script for parsing individual listing information. Do remember that I am not going to explain each and every line of the script; I have already made many tutorials about that, and you can check them all here.
So below is the parse script:
import requests
from bs4 import BeautifulSoup

if __name__ == '__main__':
    record = {}
    price = title = seller = image = None

    # Read the Scraper API key from a local file
    with open('API_KEY.txt', encoding='utf8') as f:
        API_KEY = f.read().strip()

    URL_TO_SCRAPE = 'https://www.ebay.com/p/5034585650?iid=202781903791'
    payload = {'api_key': API_KEY, 'url': URL_TO_SCRAPE, 'render': 'false'}
    r = requests.get('http://api.scraperapi.com', params=payload, timeout=60)

    if r.status_code == 200:
        html = r.text
        soup = BeautifulSoup(html, 'lxml')

        # Product title
        title_section = soup.select('.product-title')
        if title_section:
            title = title_section[0].text.strip()

        # Seller name, stripping the surrounding boilerplate text
        seller_section = soup.select('.seller-persona')
        if seller_section:
            seller = seller_section[0].text.replace('Sold by', '').replace('Positive feedbackContact seller', '')
            seller = seller[:-6]

        # Price
        price_section = soup.select('.display-price')
        if price_section:
            price = price_section[0].text

        # Main product image
        image_section = soup.select('.vi-image-gallery__enlarge-link img')
        if image_section:
            image = image_section[0]['src']

        record = {
            'title': title,
            'price': price,
            'seller': seller,
            'image': image,
        }
        print(record)
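Since the plan was to store the data in JSON format for further processing, here is a rough sketch of how the two scripts can be tied together. Note that parse_listing() is a hypothetical wrapper, just the parsing code above moved into a function that takes a URL and the API key and returns the record dict, and listings.json is simply a file name I picked for the output.

import json

# Rough sketch: loop over all stored links and dump the records to JSON.
# parse_listing(url, api_key) is a hypothetical wrapper around the parsing
# code above that returns the record dict for a single listing.
with open('API_KEY.txt', encoding='utf8') as f:
    API_KEY = f.read().strip()

with open('links.txt', encoding='utf8') as f:
    urls = [line.strip() for line in f if line.strip()]

records = []
for url in urls:
    record = parse_listing(url, API_KEY)  # hypothetical helper, see note above
    if record:
        records.append(record)

with open('listings.json', 'w', encoding='utf8') as f:
    json.dump(records, f, indent=2)
print(f'Stored {len(records)} records in listings.json')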
When I run the script it prints the following:
{ 'title': 'AMD Ryzen 3 3200G - 3.6GHz Quad Core (YD3200C5FHBOX) Processor', 'price': '$99.99', 'seller': 'best_buy\xa0(698388)97.2% ', 'image': 'https://i.ebayimg.com/images/g/ss8AAOSwsbhdmy2e/s-l640.jpg' }
Pretty straightforward.
Conclusion
In this post, you learned how you can easily scrape eBay data by using Scraper API in Python. You can enhance this script as per your needs, for instance by turning it into a price monitoring script, as sketched below.
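A very rough price monitoring check could look like this. The TARGET_PRICE threshold is a made-up value of mine, and record is assumed to come from the parse script above.

# Rough sketch of a price alert, assuming `record` comes from the parse
# script above and TARGET_PRICE is a threshold you choose yourself.
TARGET_PRICE = 90.0

price_text = record.get('price') or ''
try:
    current_price = float(price_text.replace('$', '').replace(',', ''))
except ValueError:
    current_price = None

if current_price is not None and current_price <= TARGET_PRICE:
    print(f'Price dropped to {current_price}, time to buy!')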
Writing scrapers is an interesting journey, but you can hit a wall if the site blocks your IP. As an individual, you can't afford expensive proxies either. Scraper API provides an affordable and easy-to-use API that lets you scrape websites without any hassle. You do not need to worry about getting blocked, because Scraper API uses proxies by default to access websites. On top of that, you do not need to worry about Selenium either, since Scraper API provides a headless browser facility too (see the small snippet below). I have also written a post about how to use it.
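For instance, rendering a JavaScript-heavy page should just be a matter of flipping the render flag in the same payload pattern we used earlier, assuming API_KEY and URL_TO_SCRAPE are set as in the scripts above.

# Same request as before, but asking Scraper API to render the page
# in a headless browser by setting render to 'true'.
payload = {'api_key': API_KEY, 'url': URL_TO_SCRAPE, 'render': 'true'}
r = requests.get('http://api.scraperapi.com', params=payload, timeout=60)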
Click here to sign up with my referral link, or enter promo code adnan10 to get a 10% discount. If you do not get the discount, just let me know via email on my site and I will surely help you out.