Getting started with Apache Kafka in Python

In this post, I am going to discuss Apache Kafka and how Python programmers can use it to build distributed systems. What is Apache Kafka? Apache Kafka is an open-source streaming platform that was initially built by LinkedIn. It was later handed over to the Apache Software Foundation and open-sourced in 2011. According to Wikipedia: […]

Getting started with Elasticsearch in Python

In this post, I am going to discuss Elasticsearch and how you can integrate it with different Python apps. What is Elasticsearch? Elasticsearch (ES) is a distributed and highly available open-source search engine built on top of Apache Lucene. It is written in Java and is therefore available on many platforms. You store […]

5 strategies to write unblock-able web scrapers in Python

Readers of my scraping series often contact me to ask how they can write scrapers that don’t get blocked. It is very difficult to write a scraper that NEVER gets blocked, but you can extend the life of your web scraper by implementing a few strategies. Today I am going to […]

Getting started with Python and IPFS

In this post I am going to discuss how you can use the decentralized IPFS in your Python apps for storing different kinds of data. What is IPFS? From Wikipedia: InterPlanetary File System (IPFS) is a protocol and network designed to create a content-addressable, peer-to-peer method of storing and sharing hypermedia in a distributed file system. […]

Introduction to Exploratory Data Analysis in Python

Recently I finished up the Python Graph series, which used Matplotlib to represent data in different types of charts. In this post I am giving a brief intro to exploratory data analysis (EDA) in Python with the help of pandas and matplotlib. What is exploratory data analysis? According to Wikipedia: In statistics, exploratory data analysis (EDA) is an approach […]

Implementing beanstalk to create a scaleable web scraper

Image Credit (http://blog.hqc.sk.ca/wp-content/uploads/2012/12/Queue-2012-12-11.jpg) Queues are often used to make applications scalable by offloading data and processing it later. In this post I am going to use the Beanstalk queue management system in Python. Before I get into the real task, allow me to give a brief intro to Beanstalk. What is Beanstalk? From the official website: Beanstalk […]
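The idea of offloading work to a queue and processing it later can be sketched with nothing but Python's standard library. This is only an illustration of the pattern, not Beanstalk itself: it uses an in-process `queue.Queue` and a single worker thread, and the names `worker` and `run_jobs` are hypothetical helpers made up for this example.

```python
import queue
import threading


def worker(jobs, results):
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        url = jobs.get()
        if url is None:
            # Sentinel: no more work, so stop the worker.
            jobs.task_done()
            break
        # Stand-in for real work such as fetching and parsing a page.
        results.append(f"processed {url}")
        jobs.task_done()


def run_jobs(urls):
    """Producer side: enqueue every URL, then wait for the worker to drain the queue."""
    jobs = queue.Queue()
    results = []
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    for url in urls:
        jobs.put(url)
    jobs.put(None)  # tell the worker to stop
    jobs.join()     # block until every queued item is marked done
    thread.join()
    return results


if __name__ == "__main__":
    print(run_jobs(["page-1", "page-2"]))
```

A dedicated queue server differs from this sketch mainly in that the producer and the workers can live in separate processes or on separate machines, which is where the scalability comes from.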

Develop database driven applications in Python with Peewee

Most applications these days interact with a database, especially RDBMS-based engines (DB engines that support SQL). Like other languages, Python provides native and third-party libraries for interacting with databases. Normally you have to write SQL queries for CRUD operations. That’s OK […]
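To make the point about hand-written SQL concrete, here is a minimal sketch of CRUD operations using Python's built-in `sqlite3` module. It does not use Peewee; the `person` table, its columns, and the `crud_demo` function are assumptions invented for this illustration.

```python
import sqlite3


def crud_demo():
    """Run one create, read, update, and delete cycle against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )

    # Create
    conn.execute(
        "INSERT INTO person (name, city) VALUES (?, ?)", ("Alice", "Karachi")
    )

    # Read
    row = conn.execute(
        "SELECT name, city FROM person WHERE name = ?", ("Alice",)
    ).fetchone()

    # Update
    conn.execute(
        "UPDATE person SET city = ? WHERE name = ?", ("Lahore", "Alice")
    )
    updated_city = conn.execute(
        "SELECT city FROM person WHERE name = ?", ("Alice",)
    ).fetchone()[0]

    # Delete
    conn.execute("DELETE FROM person WHERE name = ?", ("Alice",))
    remaining = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]

    conn.close()
    return row, updated_city, remaining


if __name__ == "__main__":
    print(crud_demo())
```

Every operation here is a SQL string assembled by hand, which is the kind of repetition an ORM is meant to take off your plate.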

How to automate your deployment and SSH activities with Fabric

It is not uncommon for developers to interact with remote servers. Besides FTP clients, many use terminals or consoles to carry out different tasks. SSH is usually used to connect to remote servers and execute commands; from running git to starting a web or DB server, almost everything can be done by using […]