Simulating logins to major websites in Python, plus a few crawlers

  • For practice only; the code is commented in detail.

  • This project studies and shares simulated login methods for major websites, along with a few crawlers. It will continue to be updated…

  • I worked overtime yesterday to refactor and test some of the old code. Most of it works.

  • Stars are welcome.

Simulated login for some common websites

  • If there is a website you still cannot log in to, even with Selenium + WebDriver, please open an issue.
  1. requests
  2. selenium
  3. rsa
  4. phantomjs

Project address

GitHub

About

Simulated login is generally done either by logging in directly or with Selenium + WebDriver. Some websites, such as Qzone and Bilibili, are hard to log in to directly but relatively easy with Selenium.

Although Selenium is used for the login itself, we can save the cookies after logging in and hand them to requests or Scrapy for data collection, which keeps the crawl fast.
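
The cookie handoff described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `cookies_for_requests` and `build_session` and the sample cookie values are made up for the example. It assumes the cookies arrive in the list-of-dicts shape that Selenium's `driver.get_cookies()` returns.

```python
def cookies_for_requests(selenium_cookies):
    """Flatten Selenium-style cookie dicts into a {name: value} mapping.

    Hypothetical helper for illustration. Expects the list-of-dicts shape
    returned by driver.get_cookies(); entries without a name are skipped.
    """
    return {
        cookie["name"]: cookie.get("value", "")
        for cookie in selenium_cookies
        if cookie.get("name")
    }


def build_session(selenium_cookies):
    """Create a requests.Session that carries the browser's login cookies."""
    import requests  # imported here so the helper above needs no third-party package

    session = requests.Session()
    session.cookies.update(cookies_for_requests(selenium_cookies))
    return session


# Made-up sample in the same shape as driver.get_cookies().
sample_cookies = [
    {"name": "sessionid", "value": "abc123", "domain": ".example.com", "path": "/"},
    {"name": "csrftoken", "value": "xyz789", "domain": ".example.com", "path": "/"},
]
flat_cookies = cookies_for_requests(sample_cookies)
print(flat_cookies)
```

In real use, you would log in with Selenium first, call `build_session(driver.get_cookies())`, and send all later requests through the returned session. Flattening drops each cookie's domain and path, which is usually acceptable when crawling a single site.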

Completed

  • Facebook
  • Twitter
  • Weibo web version
  • Zhihu
  • Qzone
  • CSDN
  • Taobao
  • Baidu
  • Guokr
  • JD (JingDong)
  • 163 Mail
  • Lagou
  • Bilibili
  • Douban
  • Baidu2
  • Liepin
  • WeChat web version
  • GitHub
  • Tuchong

Pull request tips

  • Pull requests are welcome.

Known problems

  • Some captchas still have to be entered manually; I will try to fix this later.
  • Broken code: code stops working when a website changes its policy or page layout. Please open an issue. If you have already fixed it, feel free to submit a PR. Thank you!

In addition

  • If there is a website you still cannot log in to, even with Selenium + WebDriver, please open an issue.
  • If the repo is helpful, please give it a star.

Suggestions welcome

  1. After working on the project for a while, I found problems with its code style, ease of use, extensibility, and readability. The next priority is therefore to refactor the code so that it is easier to build small features of your own on top of it.
  2. If you think a site's login is representative, feel free to suggest it in an issue.

Tests

Bilibili automatic login works normally in testing, with a 98% success rate

WeChat web version

Tuchong crawler

Details

  • Please see the project address for details. Stars are welcome!

Finally

  • Please go easy on me, experts; this junior can only take so much.