  • Getting started with Celery and Python

    In this post, I am going to talk about Celery, what it is, and how it is used. What is Celery? From the official website: Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system. Wikipedia says: Celery is an open source asynchronous task queue or job queue which is based on distributed message passing. While it supports scheduling, its focus is on operations in real time. In short, Celery is good at handling asynchronous or long-running tasks that can be deferred and do not require real-time interaction. It can also…

  • Develop Ali Express Scraper in Python with Scraper API

    This is another post in the ScrapeTheFamous series, in which I parse some famous websites and discuss my development process. The posts use Scraper API for parsing, which frees me from worrying about blocking and rendering dynamic sites, since Scraper API takes care of both. In this post, we are going to scrape AliExpress, a Chinese B2C shopping portal. The script I am going to write consists of two functions: fetch and parse. fetch accepts a category and returns the links of all individual items, while parse parses an individual entry and returns a few data points in…

  • Develop Google scraper in Python with Scraper API

    This is another post in the ScrapeTheFamous series, in which I parse some famous websites and discuss my development process. The posts use Scraper API for parsing, which frees me from worrying about blocking and rendering dynamic sites, since Scraper API takes care of both. This post is about scraping Google search results. The script accepts a keyword and returns results across multiple pages. The data is stored in a text file in JSON format. The code that parses the results is pretty straightforward and is given below: def google_scraper(query, start=0): records = [] try: URL_TO_SCRAPE = "http://www.google.com/search?q=" + query.replace(' ', '+') +…

  • Getting started with CCXT Crypto Exchange Library and Python

    I already used the CCXT library in my Airflow-related post here. In this post, I am specifically going to discuss the library and how you can use it to pull different kinds of data from exchanges or for trading automation. The demo can be seen here. What is CCXT? The CryptoCurrency eXchange Trading Library, aka CCXT, is a JavaScript / Python / PHP library for cryptocurrency trading and e-commerce with support for many bitcoin/ether/altcoin exchange markets and merchant APIs. It connects with more than 100 exchanges. One of the best features of this library is that it is exchange agnostic, that is, whether you use Binance or FTX, the signature of routines are…

  • Visualizing Python modules and dependencies with Neo4j

    I am taking a short break from the Blockchain Programming series and writing this post because I found the topic pretty interesting. The other day I found a tweet (which unfortunately I forgot to bookmark and can’t find anymore) about visualizing Python modules in Neo4j. Guido, the creator of Python, had responded to that tweet. The tweet stuck in my mind, and I thought it was a great excuse to explore Neo4j. I had been thinking of exploring some Graph Databases other than Neo4j. For some weird reason, I had been ignoring Neo4j for a long time, most probably because of the Java thing which I do not like at all. I…

  • Develop and deploy your first Ethereum Smart Contract with Python
    Learn how to write a basic smart contract in Solidity and then integrate it with a Python app using Web3.py

    This post is going to be a bit longer as I am going to cover multiple concepts. I will be covering the following: smart contracts and how they work on the Ethereum blockchain; the basics of the Solidity programming language and how to use online and existing IDEs to write and test contracts; using Truffle and Ganache to set up an Ethereum development environment; and using Web3.py to integrate a smart contract with Python applications. What is a Smart Contract? According to Investopedia: A smart contract is a self-executing contract with the terms of the agreement between buyer and seller being directly written into lines of code. The code and the agreements contained therein exist…

  • Create a crypto Telegram bot in Python using Yahoo Finance API
    A step-by-step guide to creating a Telegram bot in Python.

    So I was exploring the Telegram APIs for a project someone asked me to work on. The script was actually a cron job that would send messages on a daily basis. While working on it, I found that you could come up with your own commands that pull data from some remote API and display the results to Telegram users. I saw this as an opportunity for my next blog post, which I am writing here 🙂 Telegram is very similar to WhatsApp for communication, and it is quite popular among crypto lovers. Many crypto traders use both Telegram and Discord to send crypto and stock signals to their…

  • Getting started with Protobuffer and Python

    In this post, I am going to talk about Protocol Buffers and how you can use them in Python for passing messages across networks. Protocol Buffers, or Protobuf for short, are used for data serialization and deserialization. Before I discuss Protobuf, I would like to discuss data serialization and deserialization first. Data Serialization and Deserialization: According to Wikipedia, serialization is the process of translating a data structure or object state into a format that can be stored (for example, in a file or memory data buffer) or transmitted (for example, over a computer network) and reconstructed later (possibly in a different computer environment). In simple words, you convert simple and complex data structures and objects into byte…

  • Getting started with Elasticsearch 7 in Python

    I wrote about Elasticsearch almost 3 years ago, in June 2018. Since then, a new Elasticsearch version has launched with some new features and changes. I’ll be repeating some concepts in this post so that one does not have to go back to the old post to learn about them. So, let’s begin! What is Elasticsearch? Elasticsearch (ES) is a distributed and highly available open-source search engine that is built on top of Apache Lucene. It is written in Java and is thus available on many platforms. You store unstructured data in JSON format, which also makes it a NoSQL database. So, unlike other NoSQL databases, ES also provides…

  • Using Sitemap to write efficient web scrapers
    A step-by-step guide to writing web scrapers without using extra resources.

    This post is part of the Scraping Series. When you start developing a scraper to scrape loads of records, your first step is usually to go to the page where all the listings are available. You go page by page, fetch individual URLs, store them in a DB or a file, and then start parsing. Nothing wrong with that. The only issue is the waste of resources. Say there are 100 records in a certain category and each page has 10 records. Ideally, you will write a scraper that goes page by page and fetches all the links. Then you will switch to the next category and repeat the process.…
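
    The excerpt stops before the sitemap approach itself is shown. As a rough sketch only, assuming the target site publishes a standard sitemaps.org XML sitemap, the item URLs could be read from it in a single pass instead of crawling listing pages. The sample XML, the example.com URLs, and the extract_urls helper are all hypothetical and are not taken from the original post.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content (assumption). A real scraper would download
# the sitemap from the target site instead of using an inline string.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/item/1</loc></url>
  <url><loc>https://example.com/item/2</loc></url>
  <url><loc>https://example.com/category/shoes</loc></url>
</urlset>
"""


def extract_urls(sitemap_xml, must_contain=""):
    """Return every <loc> URL in the sitemap that contains `must_contain`."""
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if must_contain in url]


# Keep only the individual item pages and skip category pages.
item_urls = extract_urls(SAMPLE_SITEMAP, must_contain="/item/")
print(item_urls)
```

    With the URLs collected this way, the parsing step can start right away, without first requesting every listing page in every category.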