Have you ever wondered how to pull out useful information from websites without the hassle? BeautifulSoup is your go-to tool for scraping HTML data effortlessly. In this article, we’ll walk you through the basics of web scraping using BeautifulSoup. No prior experience is needed! With its simple syntax and straightforward approach, you’ll quickly grasp the essentials of parsing HTML and extracting data from web pages. Join us as we explore the world of web scraping in a beginner-friendly way. By the end, you’ll be equipped with the skills to gather valuable insights from any website with ease. Let’s dive in and uncover the magic of BeautifulSoup together! BeautifulSoup Overview You…
-
Scraping dynamic websites using Scraper API and Python
Learn how to efficiently and easily scrape modern JavaScript-enabled websites or Single Page Applications without installing a headless browser and Selenium. In the last post of the scraping series, I showed you how you can use Scraper API, an online data extractor, to scrape websites through proxies so that your chance of getting blocked is reduced. Today I am going to show how you can use Scraper API to scrape websites that use AJAX to render data with the help of JavaScript, Single Page Applications (SPAs), or websites built with frameworks like ReactJS, AngularJS, or VueJS. I will be working on the same code I had written in the introductory post. Let’s work on a simple example. There is a website that tells you your IP, called HttpBin. If you load…
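As a rough illustration of the idea in this excerpt, here is a minimal sketch of how a request for a JavaScript-rendered page might be prepared. It assumes Scraper API's GET endpoint at `api.scraperapi.com` with the `api_key`, `url`, and `render` query parameters; the function name `build_render_url` and the placeholder key are made up for this example and are not taken from the original post. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumption: Scraper API's GET endpoint; check the service docs before relying on it.
SCRAPER_API_ENDPOINT = "http://api.scraperapi.com/"


def build_render_url(api_key, target_url):
    """Return a Scraper API request URL that asks for JavaScript rendering."""
    # render=true is assumed to be the flag that makes the service execute
    # the page's JavaScript before returning the HTML.
    params = {"api_key": api_key, "url": target_url, "render": "true"}
    return SCRAPER_API_ENDPOINT + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder; https://httpbin.org/ip echoes the caller's IP.
print(build_render_url("YOUR_API_KEY", "https://httpbin.org/ip"))
```

The resulting URL can be passed to any HTTP client, for example `requests.get(...)`, so the scraper itself never has to run a browser.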
-
Creating an e-commerce bot to buy online items with ScrapingBee and Python
I wrote about ScrapingBee a couple of years ago, when I gave a brief intro to the service. ScrapingBee is a cloud-based scraping service that provides both headless-browser scraping and lightweight, plain HTTP request-based scraping. Recently I discovered that they offer some cool features that other online services do not. What are those features? I thought I would explore and explain them with a real use case. I used Python to automate the Daraz group’s shopping website, a famous e-commerce service in Asian countries like Pakistan, Nepal, Bangladesh, and Sri Lanka. I am automating DarazPK since I am in Pakistan. You can view the demo…
-
Develop AliExpress Scraper in Python with Scraper API
This is another post in ScrapeTheFamous, a series in which I parse some famous websites and discuss my development process. The posts use Scraper API for parsing, which frees me from worrying about blocking and rendering dynamic sites, since Scraper API takes care of everything. In this post, we are going to scrape AliExpress. AliExpress is a Chinese B2C portal for buying stuff. The script I am going to make consists of two parts, or rather, two functions: fetch and parse. The fetch function will accept a category and return the links of all individual items, and the parse function will parse an individual entry and return a few data points in…
-
Develop Google scraper in Python with Scraper API
This is another post in ScrapeTheFamous, a series in which I parse some famous websites and discuss my development process. The posts use Scraper API for parsing, which frees me from worrying about blocking and rendering dynamic sites, since Scraper API takes care of everything. This post is about scraping Google search results. The script will accept a keyword and return results across multiple pages. The data will be stored in a text file in JSON format. The code that parses the results is pretty straightforward and is given below: def google_scraper(query, start=0): records = [] try: URL_TO_SCRAPE = "http://www.google.com/search?q=" + query.replace(' ', '+') +…
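The code in the excerpt above is cut off, so the following is only a sketch of the general idea of building one search URL per result page, not a reconstruction of the original function. It assumes that result pages are selected with a `start` offset in steps of 10; the helper name `build_search_urls` and its `pages` and `per_page` parameters are hypothetical.

```python
from urllib.parse import quote_plus


def build_search_urls(query, pages=3, per_page=10):
    """Return one search URL per result page for the given keyword."""
    # quote_plus turns spaces into '+' and also escapes characters such as
    # '&' that a plain str.replace(' ', '+') would leave untouched.
    base = "http://www.google.com/search?q=" + quote_plus(query)
    # Assumption: the `start` offset selects the page, 10 results at a time.
    return [f"{base}&start={page * per_page}" for page in range(pages)]


for url in build_search_urls("web scraping", pages=2):
    print(url)
```

Each URL would then be fetched and parsed in turn, with the records from every page collected into one list.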
-
Create eBay Scraper in Python using Scraper API
Learn how to create an eBay data scraper in Python to fetch item details and price. In this post of ScrapingTheFamous, I am going to write a scraper that will scrape data from eBay. eBay is an online auction site where people put up listings to sell stuff by auction. Like before, we will be writing two scripts: one to fetch listing URLs and store them in a text file, and the other to parse those links. The data will be stored in JSON format for further processing. I will be using the Scraper API service for parsing, which frees me from worrying about blocking and rendering dynamic sites, since it takes care of everything. The first script is for fetching the listings of a category.…
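The two-script flow described in this excerpt can be sketched roughly as follows. This is a minimal outline under stated assumptions, not the code from the post: the function names `save_urls` and `parse_saved_urls` are hypothetical, and `parse_listing` is a stand-in callback for whatever code actually requests and parses a listing page.

```python
import json
from pathlib import Path


def save_urls(urls, path):
    """Step 1: write one listing URL per line, skipping blanks and duplicates."""
    seen = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.append(url)
    Path(path).write_text("\n".join(seen) + "\n", encoding="utf-8")
    return len(seen)


def parse_saved_urls(url_path, json_path, parse_listing):
    """Step 2: read the URL file, parse each URL, and dump the records as JSON."""
    urls = Path(url_path).read_text(encoding="utf-8").split()
    # parse_listing is a hypothetical callback that takes a URL and returns a dict.
    records = [parse_listing(url) for url in urls]
    Path(json_path).write_text(json.dumps(records, indent=2), encoding="utf-8")
    return records
```

Keeping the URL list in a plain text file between the two steps means the parsing step can be rerun or resumed without fetching the category pages again.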
-
Create Amazon Scraper in Python using Scraper API
Learn how to create an Amazon scraper in Python to scrape product details like price, ASIN, etc. In this post of ScrapingTheFamous, I am going to write a scraper that will scrape data from Amazon. I do not need to tell you what Amazon is. You are here because you already know about it 🙂 So, we are going to write two different scripts: the first, fetch.py, will fetch the URLs of individual listings and save them in a text file. The second, parse.py, will have a function that takes an individual listing URL, scrapes the data, and saves it in JSON format. I will be using the Scraper API service for parsing, which frees me from worrying about blocking and rendering dynamic sites, since it…
-
Create your first Web scraper in Go with goQuery
A beginner’s tutorial for writing web scrapers in the Go language for Yelp. Planning to write a book about Web Scraping in Python. Click here to give your feedback. I have been covering web scraping on this blog for a long time, but the posts were mostly in Python; be it requests, Selenium, or the Scrapy framework, all were based on Python. But scraping is not limited to a specific language. Any language that provides APIs or libraries for an HTTP client and an HTML parser can be used for web scraping. Go also gives you the ability to write web scrapers. Go is a compiled, statically typed language and could be very beneficial for writing efficient and…
-
Create your first web scraper with ScrapingBee API and Python
Learn how to use a cloud-based scraping API to scrape web pages without getting blocked. In this post, I am going to discuss another cloud-based scraping tool that takes care of many of the issues you usually face while scraping websites. This platform has been introduced by ScrapingBee, a cloud-based scraping tool. What is ScrapingBee? If you visit their website, you will find something like below: ScrapingBee API handles headless browsers and rotates proxies for you. As it suggests, it offers you everything you need to deal with the issues you usually come across while writing your scrapers, especially the availability of proxies and headless scraping. No installation of web drivers for Selenium, yay! Development: ScrapingBee is based on a REST API, hence it can…
-
Develop Airbnb Parser in Python
So I am starting a new scraping series called ScrapeTheFamous, in which I will be parsing some famous websites and discussing my development process. The posts will use Scraper API for parsing, which frees me from worrying about blocking and rendering dynamic sites, since Scraper API takes care of everything. Anyway, the first post is about Airbnb. We will be scraping some important data points from it. We will scrape a list of rental URLs, then fetch the data and store it in JSON format. So let’s start! The URL we…