[This article is reposted from Lybyte]

Python is popular today, and so many people are learning it, because it is easy to learn, powerful, and backed by a very active community with plenty of learning material. The language is used in many fields: automated testing, operations, crawlers, data analysis, machine learning, finance, back-end development, cloud computing, and game development.

A while ago I had some idle time, and my interest in Python kept growing. Just reading about a language is not much use without hands-on practice, so I searched for some good open source libraries and read a few blogs along the way. Here is a summary to share with you.

If you are learning Python, chances are you started with a crawler. After all, there are plenty of crawler tutorials and open source projects online.

In short, when a browser requests a web page, the process takes place in the following four steps:

  • Find the IP address corresponding to the domain name.

  • Send a request to the server at that IP address.

  • The server responds to the request and sends back the web content.

  • The browser parses the web content.

    If you are interested, search online for a more detailed description of each step.
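The four steps above can be sketched with nothing but the standard library. This is a minimal illustration, not how a real crawler should be written: it assumes plain HTTP on port 80, ignores HTTPS, redirects, and chunked transfer encoding, and the names `build_request`, `fetch`, and `extract_title` are made up for this example.

```python
import socket
from html.parser import HTMLParser
from urllib.parse import urlsplit


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_request(host, path="/"):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def fetch(url, timeout=10.0):
    """Fetch a plain-HTTP URL and return (raw headers, raw body bytes)."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    # Step 1: find the IP address for the domain name.
    ip_address = socket.gethostbyname(host)
    # Step 2: send a request to the server at that IP address.
    with socket.create_connection((ip_address, port), timeout=timeout) as conn:
        conn.sendall(build_request(host, parts.path or "/"))
        # Step 3: read what the server sends back until it closes the connection.
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    return head.decode("iso-8859-1"), body


def extract_title(html):
    """Step 4: parse the web content (here, only the page title)."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()
```

Calling `fetch("http://example.com/")` needs network access; `build_request` and `extract_title` can be tried offline. In practice a library from the list below handles all of this for you.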

So which libraries do you need to know to write a crawler?

General:

  1. urllib – Network library (standard library).
  2. requests – Network library.
  3. Grab – Network library (based on PycURL).
  4. PycURL – Network library (a binding to libcurl).
  5. urllib3 – Python HTTP library with thread-safe connection pooling and file post support.
  6. httplib2 – Network library.
  7. RoboBrowser – A simple, Pythonic library for browsing the web without a standalone browser.
  8. MechanicalSoup – A Python library for automating interaction with websites.
  9. mechanize – Stateful, programmable web browsing library.
  10. socket – Low-level networking interface (standard library).
  11. Unirest for Python – Unirest is a set of lightweight HTTP libraries available in multiple languages.
  12. hyper – HTTP/2 client for Python.
  13. PySocks – An updated and actively maintained version of SocksiPy, with bug fixes and some extra features. It works as a drop-in replacement for the socket module.
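As a quick taste of the first entry, here is a minimal sketch of fetching a page with the standard-library `urllib.request`. The helper name `fetch_text` and the `example-crawler/0.1` User-Agent string are placeholders chosen for this example; third-party libraries such as requests offer a friendlier interface for the same job.

```python
import urllib.request


def fetch_text(url, user_agent="example-crawler/0.1", timeout=10.0):
    """Download a URL and return its body decoded to text."""
    # Many sites reject requests without a User-Agent, so send one explicitly.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Use the charset declared by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

For example, `fetch_text("https://example.com/")` would return the HTML of that page as a string, assuming the machine has network access.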

Text processing

Libraries for parsing and manipulating plain text.

  • difflib – (Python standard library) helps compute differences between sequences.
  • Levenshtein – Quickly calculates Levenshtein distance and string similarity.
  • fuzzywuzzy – Fuzzy string matching.
  • esmre – Regular expression accelerator.
  • ftfy – Automatically fixes broken Unicode text and makes it more consistent.
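Of the text-processing entries, difflib ships with Python, so it is easy to try. The sketch below shows three common uses: a similarity ratio, fuzzy lookup in a word list, and a line-by-line diff. The wrapper names `similarity`, `closest`, and `line_diff` are invented for this example.

```python
import difflib


def similarity(a, b):
    """Return a ratio between 0.0 and 1.0 describing how alike two strings are."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def closest(word, candidates, limit=3, cutoff=0.6):
    """Return up to `limit` candidates that resemble `word`, best match first."""
    return difflib.get_close_matches(word, candidates, n=limit, cutoff=cutoff)


def line_diff(old, new):
    """Return a unified diff of two multi-line strings as a list of lines."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )
```

A crawler might use `similarity` to detect near-duplicate page titles, or `line_diff` to see what changed between two downloads of the same page.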

Natural language processing

Libraries for working with human languages.

  • NLTK – A leading platform for building Python programs that work with human language data.
  • Pattern – A web mining module for Python. It includes tools for natural language processing, machine learning, and more.
  • TextBlob – Provides a consistent API for common natural language processing tasks. It stands on the shoulders of NLTK and Pattern.
  • jieba – Chinese word segmentation tool.
  • SnowNLP – Chinese text processing library.
  • loso – Another Chinese word segmentation library.

Asynchronous

Libraries for asynchronous network programming.

  • asyncio – (Python standard library, Python 3.4+) asynchronous I/O, event loops, coroutines, and tasks.
  • Twisted – An event-driven networking engine.
  • Tornado – A web framework and asynchronous networking library.
  • Pulsar – Event-driven concurrency framework for Python.
  • Diesel – Greenlet-based event I/O framework for Python.
  • gevent – A coroutine-based Python networking library that uses greenlet.
  • Eventlet – Asynchronous framework with WSGI support.
  • Tomorrow – Magic decorator syntax for asynchronous code.
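To show what coroutines and the event loop look like in practice, here is a small asyncio sketch. It does not touch the network: `fake_fetch` is a stand-in that simulates I/O with `asyncio.sleep`, and the job names and delays are invented for the example. It needs Python 3.7 or later for `asyncio.run`.

```python
import asyncio


async def fake_fetch(name, delay):
    """Pretend to download a page; the sleep stands in for waiting on the network."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def crawl_all(jobs):
    """Run all jobs concurrently and return results in the order the jobs were given."""
    tasks = [fake_fetch(name, delay) for name, delay in jobs]
    return await asyncio.gather(*tasks)


def run(jobs):
    """Start an event loop, run the crawl, and return its results."""
    return asyncio.run(crawl_all(jobs))
```

Because the waits overlap, `run([("page-a", 0.02), ("page-b", 0.01)])` takes roughly as long as the slowest job rather than the sum of both.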

Queues

  • Celery – Asynchronous task queue/job queue based on distributed message passing.
  • Huey – Small multi-threaded task queue.
  • MRQ – Mr. Queue, a distributed worker task queue in Python using Redis and gevent.
  • RQ – Lightweight task queue manager based on Redis.
  • simpleq – A simple, infinitely scalable queue based on Amazon SQS.
  • python-gearman – Python API for Gearman.
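The libraries above distribute work across processes or machines. The underlying producer/worker pattern can be sketched in a single process with the standard-library `queue` and `threading` modules. This is a simplified stand-in, not the API of Celery or RQ, and `run_workers` is a name made up for this example.

```python
import queue
import threading


def run_workers(items, handler, num_workers=3):
    """Process every item with `handler` on a small pool of worker threads."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this worker
                tasks.task_done()
                break
            try:
                outcome = handler(item)
            except Exception as exc:  # keep the worker alive, record the failure
                outcome = exc
            with results_lock:
                results.append((item, outcome))
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for thread in threads:
        thread.join()
    return results
```

For a crawler, `items` would be URLs and `handler` a download function. Results arrive in completion order, so sort them if order matters.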

E-mail

Libraries for sending and parsing email.

  • django-celery-ses: Django email backend with AWS SES and Celery.
  • Envelopes: An email library designed for humans.
  • Flanker: An email address and MIME parsing library.
  • Imbox: Python IMAP library.
  • inbox.py: Python SMTP server.
  • Inbox: An open source email toolkit.
  • Lamson: Pythonic SMTP application server.
  • Mailjet: Mailjet API implementation for bulk mail, statistics, and other functions.
  • marrow.mailer: High-performance, extensible mail delivery framework.
  • Modoboa: A mail hosting and management platform with a modern, minimalist web UI.
  • pyzmail: Creates, sends, and parses email.
  • Talon: Mailgun library for extracting message quotations and signatures.
  • yagmail: A Gmail/SMTP client that aims to make sending email as simple as possible.
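Before reaching for one of these, it helps to know that the standard library can already build, parse, and send messages. The sketch below uses `email.message.EmailMessage` and `smtplib`; the function names, addresses, and the SMTP host passed to `send` are placeholders for this example, not taken from any of the libraries listed.

```python
import smtplib
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Create a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def parse_message(raw_bytes):
    """Parse raw message bytes and return the main fields as a dict."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "body": body_part.get_content() if body_part else "",
    }


def send(msg, host, port=25):
    """Send a message through an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)
```

A round trip such as `parse_message(build_message(...).as_bytes())` works offline; `send` needs a reachable SMTP server and, for most providers, authentication that this sketch leaves out.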

URL processing

Libraries for parsing URLs.

  • furl: A small Python library that makes manipulating URLs simple.
  • purl: A simple, immutable URL class with a concise API for querying and processing.
  • pyshorteners: A pure Python URL shortening library.
  • shorturl: Python implementation for generating short URLs and bit.ly-like short links.
  • webargs: A library for parsing HTTP request parameters, with built-in support for popular web frameworks including Flask, Django, Bottle, Tornado, and Pyramid.
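Two URL chores come up in almost every crawler: changing a query parameter (for pagination, say) and turning relative links into absolute ones. Here is a minimal sketch using the standard-library `urllib.parse`; the helper names `set_query_param` and `absolute_links` are invented for this example, and libraries like furl wrap the same ideas in a more convenient API.

```python
from urllib.parse import parse_qs, urlencode, urljoin, urlsplit, urlunsplit


def set_query_param(url, key, value):
    """Return `url` with the query parameter `key` set to `value`."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # maps each name to a list of values
    query[key] = [str(value)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def absolute_links(base_url, hrefs):
    """Resolve links found on a page against the URL of that page."""
    return [urljoin(base_url, href) for href in hrefs]
```

For example, stepping through a paginated listing is just a loop that calls `set_query_param(url, "page", n)` for each page number.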

Python has far more libraries than can be listed here; search the web for more details.

Python has many web development frameworks, and Django is the largest and most widely used. A number of companies use Django, such as Xfox, Xcom, etc. web.py and Flask are both very easy to use. Tornado is known for its asynchronous high performance and its beautifully written source code; both Zhihu and Quora use it.

Some frameworks for web development:

  1. Django
  2. Flask
  3. web2py
  4. Tornado
  5. CherryPy

Finally
I wish you happy and fast learning.
If you found this helpful, a like or a bookmark is much appreciated! ❤️
You are also welcome to follow Lili, a programmer who is going bald but can help you get stronger.
That is all for today. I am Lybyte-Lili, an interesting soul! See you next time!
