
Python is now a very popular programming language, and web crawlers are one of its most typical applications. Below, Shenlong shares several efficient Python crawler frameworks; see if you have used any of them.

1.Scrapy

Scrapy is an application framework designed to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, including data mining, information processing, and archiving historical data.
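
As a minimal sketch of what a Scrapy spider looks like (modeled on the official tutorial; the target site quotes.toscrape.com and the CSS selectors are only illustrative), you subclass scrapy.Spider and yield structured items from the parse callback:

```python
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com/"]

    def parse(self, response):
        # Yield one structured item per quote block on the page
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text::text").get(),
                "author": quote.css("small.author::text").get(),
            }
        # Follow the pagination link and parse it with the same callback
        next_page = response.css("li.next a::attr(href)").get()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)
```

Running `scrapy runspider quotes_spider.py -o quotes.json` would crawl the pages and write the extracted items to a JSON file.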

2.PySpider

PySpider is a powerful web crawler system written in Python. You write scripts in a browser-based interface, schedule jobs, and view crawl results in real time; results can be stored in common databases on the back end, and tasks can be scheduled to run periodically with configurable priorities.
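
A script written in PySpider's web interface follows roughly this handler pattern (based on the project's default template; the URL and the scheduling intervals are placeholders):

```python
from pyspider.libs.base_handler import *


class Handler(BaseHandler):
    crawl_config = {}

    @every(minutes=24 * 60)
    def on_start(self):
        # Re-schedule the entry page once a day
        self.crawl("http://example.com/", callback=self.index_page)

    @config(age=10 * 24 * 60 * 60)
    def index_page(self, response):
        # Queue every outgoing link for the detail callback
        for each in response.doc('a[href^="http"]').items():
            self.crawl(each.attr.href, callback=self.detail_page)

    def detail_page(self, response):
        # Returned dicts are stored by the configured result backend
        return {"url": response.url, "title": response.doc("title").text()}
```

The `@every` and `@config` decorators are what provide the periodic scheduling and re-crawl control mentioned above.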

3.Crawley

Crawley crawls website content at high speed, supports both relational and non-relational databases, and can export the extracted data to JSON and XML.
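
For a rough idea of the programming model, Crawley's documentation pairs a crawler class (which walks the site) with scraper classes (which pull data out of matching pages). The sketch below follows that pattern; the class names, URL pattern, and XPath are illustrative assumptions and should be checked against the current Crawley docs:

```python
from crawley.crawlers import BaseCrawler
from crawley.scrapers import BaseScraper
from crawley.extractors import XPathExtractor


class TitleScraper(BaseScraper):
    # Pages whose URLs match these patterns are handed to scrape()
    matching_urls = ["%"]

    def scrape(self, response):
        # response.html is the parsed document; extract fields here
        print(response.html.xpath("//title/text()"))


class SiteCrawler(BaseCrawler):
    start_urls = ["https://example.com/"]
    scrapers = [TitleScraper]
    max_depth = 1
    extractor = XPathExtractor
```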

4.Portia

Portia is a visual scraping tool from Scrapinghub. It provides a visual web interface where you can create extraction rules without any programming knowledge, simply by clicking on the data you want to extract in an annotated copy of the page.

5.Beautiful Soup

Beautiful Soup is a Python library for extracting data from HTML and XML files. It lets you navigate, search, and modify the parse tree using your favorite parser, and it can save you hours or even days of work.
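
A small example of that workflow (requests is used here only to fetch the HTML, and the built-in html.parser can be swapped for lxml or html5lib):

```python
import requests
from bs4 import BeautifulSoup

html = requests.get("https://example.com/").text

# Parse the markup with the built-in parser
soup = BeautifulSoup(html, "html.parser")

# Navigate and search the parse tree
print(soup.title.string)
for link in soup.find_all("a"):
    print(link.get("href"), link.get_text(strip=True))
```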

6.Grab

Grab is a Python framework for building web scrapers. With Grab, you can build scraping tools of varying complexity, from simple five-line scripts to complex asynchronous crawlers that handle millions of pages. Grab provides an API for performing network requests and processing the received content, for example interacting with the DOM tree of an HTML document.
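
A minimal sketch of that API (the URL is a placeholder): go() performs the network request, and g.doc exposes the parsed DOM for XPath queries.

```python
from grab import Grab

g = Grab()

# Perform the network request
resp = g.go("https://example.com/")
print(resp.code)

# Query the DOM tree of the fetched HTML document with XPath
print(g.doc.select("//title").text())
```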

Reprinted by: Shenlong Proxy IP. Source: Zhihu