Before installing the common Python libraries on Windows, the Anaconda Python development environment should already be installed. Once Anaconda is installed, installing other libraries is easy. The pip installation tool ships with the Python installation package and can be found in the Scripts directory of the Python installation. Installing these common libraries is the foundation of Python crawler development.
1. urllib and re library installation
These two libraries are part of Python's standard library. As long as Python is correctly installed, you can import them directly. In Python interactive mode, verify as follows:
>>> import urllib
>>> import urllib.request
>>> urllib.request.urlopen('http://www.baidu.com')
<http.client.HTTPResponse object at 0x0000024222C09240>
>>> import re
>>>
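Beyond checking that the imports succeed, the two libraries can be tried together without any network access. The following is a minimal sketch, assuming a made-up HTML snippet (the `html` string and its links are hypothetical, not taken from a real page): `re` extracts the link targets and `urllib.parse` turns them into absolute URLs.

```python
import re
from urllib.parse import urljoin, urlparse

# Hypothetical HTML fragment used only for illustration
html = '<a href="/news">News</a> <a href="http://www.baidu.com/map">Map</a>'
base_url = 'http://www.baidu.com'

# The non-greedy group captures the value of each href attribute
hrefs = re.findall(r'href="(.*?)"', html)

# Resolve relative links against the base URL
absolute = [urljoin(base_url, href) for href in hrefs]

# Collect the host names that the links point to
hosts = {urlparse(url).netloc for url in absolute}

print(absolute)
print(hosts)
```

A regular expression is enough for a quick check like this; for real pages, the parsing libraries installed in the later sections are more robust.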
2. requests library installation
Run pip3 install requests in a DOS (command prompt) window to install the requests library, then verify it in Python interactive mode:
>>> import requests
>>> requests.get('http://www.baidu.com')
<Response [200]>
>>>
3. Installation of Selenium library
Selenium is mainly used to drive a browser for automated testing and for debugging pages rendered by JavaScript.
Run pip3 install selenium to install the library (pip3 uninstall selenium removes it). Test whether the installation is correct:
>>> import selenium
>>> from selenium import webdriver
>>> driver = webdriver.Chrome()
If this fails, you need to install the ChromeDriver driver: download it, unzip it, and place it in a directory that is on the PATH environment variable, such as the Python installation directory.
DevTools listening on ws://127.0.0.1:12052/devtools/browser/1f2faef9-0748-40f0-b955-9e41362ce55e
>>> driver = webdriver.Chrome()
DevTools listening on ws://127.0.0.1:12722/devtools/browser/5ba65a50-df4a-47fd-b2d6-d313578d539d
>>> driver.get('http://www.baidu.com')  # the opened browser navigates to the Baidu home page
>>> driver.page_source  # returns the source code of the current Baidu page
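Whether the driver executable is actually reachable through PATH can be checked before starting the browser. This is a small sketch using only the standard library; the helper name `find_driver` is hypothetical, and `chromedriver` is assumed to be the name of the unzipped executable.

```python
import shutil


def find_driver(executable_name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(executable_name)


path = find_driver('chromedriver')
if path:
    print('chromedriver found at', path)
else:
    print('chromedriver is not on PATH')
```

The same helper works for any other command-line tool that has to be on PATH.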
4. PhantomJS installation
PhantomJS is a headless (no-interface) browser driven from the command line. It complements Selenium, which otherwise opens a visible browser window.
1. Download the PhantomJS installation package from the official website, phantomjs.org/download.ht…
2. Unzip it to the installation directory of your choice and add its bin directory to the PATH environment variable
3. Run phantomjs in DOS to check whether the configuration is successful:
C:\Users\Robot_CHEN>phantomjs
phantomjs>
4. Test the installation together with Selenium (Selenium 4.x no longer includes webdriver.PhantomJS, so this step needs an older Selenium release):
>>> import selenium
>>> from selenium import webdriver
>>> driver = webdriver.PhantomJS()  # note the difference from webdriver.Chrome()
>>> driver.get('http://www.baidu.com')
>>> driver.page_source
5. lxml library installation
lxml is a web page parsing library that supports XPath. Run pip3 install lxml to install it.
In Python interactive mode, verify the installation with import lxml.
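To go one step past the import check, the XPath support can be tried on a small document. The sketch below assumes lxml is installed; the HTML fragment is made up for illustration.

```python
from lxml import etree

# Hypothetical HTML fragment used only for illustration
html = """
<html><body>
  <ul>
    <li><a href="/a">First</a></li>
    <li><a href="/b">Second</a></li>
  </ul>
</body></html>
"""

# etree.HTML parses an HTML string and returns the root element
tree = etree.HTML(html)

# XPath expressions select attribute values and text nodes
links = tree.xpath('//li/a/@href')
texts = tree.xpath('//li/a/text()')

print(links)
print(texts)
```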
6. BeautifulSoup web page parsing library installation
BeautifulSoup is used here with lxml as its parser, so install the lxml library first.
Run pip3 install beautifulsoup4 to install it.
Test the installation:
>>> from bs4 import BeautifulSoup  # BeautifulSoup is imported from the bs4 module
>>> soup = BeautifulSoup('<html></html>', 'lxml')
>>>
7. pyquery web page parsing library installation
Run pip3 install pyquery to install it, then test it:
>>> from pyquery import PyQuery as pq
>>> doc = pq('<html></html>')
>>> doc = pq('<html>Hello World</html>')
>>> result = doc('html').text()
>>> result
'Hello World'
>>>
8. PyMySQL library installation
Run pip3 install pymysql to install the driver library for the MySQL database. After the installation is complete, you can operate the MySQL database from Python code and perform CRUD operations.
import pymysql  # import the MySQL driver
# Open a database connection
db = pymysql.connect(host="localhost", user="root",
                     password="123456", database="mydatabase", port=3306)
# Get a cursor using the cursor() method
cur = db.cursor()
# 1. Query operation
# Write the SQL query; replace emp3 with your own table name
sql = "select * from emp3"
try:
    cur.execute(sql)  # execute the SQL statement
    results = cur.fetchall()  # get all records of the query
    print("id", "name", "password")
    # Iterate over the results
    for row in results:
        id = row[0]
        name = row[1]
        password = row[2]
        print(id, name, password)
except Exception as e:
    raise e
finally:
    db.close()
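The query above covers only the "read" part of CRUD. The sketch below walks through create, update, read and delete with the same cursor/execute/fetchall pattern. To stay runnable without a MySQL server it uses the standard-library sqlite3 module as a stand-in, not PyMySQL itself; both follow the Python DB-API, but sqlite3 uses ? as the parameter placeholder while PyMySQL uses %s. The column layout of emp3 is an assumption based on the printed header above.

```python
import sqlite3

# Stand-in for pymysql.connect(...): an in-memory SQLite database
db = sqlite3.connect(':memory:')
cur = db.cursor()
try:
    # Create the table and insert one row (column layout is assumed)
    cur.execute('CREATE TABLE emp3 (id INTEGER PRIMARY KEY, name TEXT, password TEXT)')
    cur.execute('INSERT INTO emp3 (name, password) VALUES (?, ?)', ('jack', 'secret'))

    # Update: parameters are passed separately, never formatted into the SQL string
    cur.execute('UPDATE emp3 SET password = ? WHERE name = ?', ('changed', 'jack'))
    db.commit()

    # Read
    cur.execute('SELECT id, name, password FROM emp3')
    rows = cur.fetchall()
    print(rows)

    # Delete, then count what is left
    cur.execute('DELETE FROM emp3 WHERE name = ?', ('jack',))
    db.commit()
    cur.execute('SELECT COUNT(*) FROM emp3')
    remaining = cur.fetchone()[0]
    print(remaining)
finally:
    db.close()
```

With PyMySQL the structure stays the same: swap the connect call for pymysql.connect(...) and the ? placeholders for %s.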
9. pymongo installation and operating MongoDB data
Run pip install pymongo to install it.
import pymongo
client = pymongo.MongoClient('localhost')
db = client['mymongodb']
coll = db['mycoll']
mydict = {"name": "RUNOOB", "alexa": "10000"}
coll.insert_one(mydict)
print(coll)
The test result is as follows:
Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'mymongodb'), 'mycoll')
10. redis installation
Run pip install redis to install it.
import redis
result = redis.Redis('localhost', 6379)
result.set('name', 'jack')
print(result.get('name'))  # b'jack'
11. Flask installation, mainly used for building web applications
The Flask documentation can be viewed at docs.jinkan.org/docs/flask/
Run pip install flask to install Flask, then verify with import flask in Python interactive mode.
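A slightly stronger check than the bare import is to define a tiny application and request a page from it. The sketch below assumes Flask is installed; the route and the returned text are made up for illustration. It uses Flask's built-in test client, so no server is started and no port is opened.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/')
def index():
    # Hypothetical home page used only for this check
    return 'Hello World'


# The test client sends a request to the app in-process
with app.test_client() as client:
    response = client.get('/')
    status = response.status_code
    body = response.get_data(as_text=True)

print(status, body)
```

To serve the page in a real browser instead, app.run() starts Flask's development server.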
12. Django installation, a web server framework
Run pip install django to install it, then verify with import django in Python interactive mode.
13. Jupyter installation, a powerful notebook
Run pip install jupyter to install it. If you use Anaconda, Jupyter is already installed by default. It is mainly used for writing code and documentation in the browser, and is very powerful and convenient.
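The individual import checks from the sections above can be combined into one script. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and the list holds import names, which sometimes differ from the pip package names (for example bs4 for beautifulsoup4). jupyter_core is assumed here as the import name to probe for Jupyter.

```python
import importlib.util

# Import names for the libraries covered in this article
LIBRARIES = [
    'urllib', 're', 'requests', 'selenium', 'lxml', 'bs4', 'pyquery',
    'pymysql', 'pymongo', 'redis', 'flask', 'django', 'jupyter_core',
]


def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in LIBRARIES:
    status = 'OK' if is_installed(name) else 'MISSING'
    print(f'{name:<14}{status}')
```

Any library reported as MISSING can then be installed with pip as described in its section.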