Hello, I am Xiao Wu 🧐

Life is too short. Learn Python!

For a beginner at web scraping, ordinary websites are relatively easy to crawl.

Open the F12 developer tools, capture the network traffic, and then reproduce the request in code.

You also often need to copy headers, cookies, and so on.

That is too much trouble. Is there an easier way?

Yes: curl.

Converting curl to Python

curl is an open-source file transfer tool that works on the command line using URL syntax. It supports uploading and downloading files.

Take the Maoyan (Cat's Eye) movie chart as an example. First, in Google Chrome, open the Network panel, right-click the captured request, and copy it as cURL (bash).

Then open any conversion website, such as *curl.trillworks.com/*.

Paste what you just copied into the left side of the site, and the corresponding Python Requests code is generated on the right.
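To get a feel for what such a converter does, here is a rough sketch that pulls the URL and the `-H` headers out of a copied command using only the standard library. It is not how curl.trillworks.com or curl2py is actually implemented; the function name `parse_curl` and the sample command are made up for illustration, and it assumes the URL comes right after `curl`, as it does in commands copied from the browser.

```python
import shlex


def parse_curl(curl_cmd: str) -> dict:
    """Extract the URL and -H headers from a simple curl command (illustrative only)."""
    # Join bash line continuations (backslash + newline) before tokenizing.
    tokens = shlex.split(curl_cmd.replace("\\\n", " "))
    if not tokens or tokens[0] != "curl":
        raise ValueError("expected a command starting with 'curl'")

    url, headers = None, {}
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-H", "--header") and i + 1 < len(tokens):
            # A header looks like "Name: value"; split on the first colon.
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
        elif not tok.startswith("-") and url is None:
            url = tok  # first bare argument is taken to be the URL
            i += 1
        else:
            i += 1  # skip options this sketch does not understand
    return {"url": url, "headers": headers}


# Hypothetical command in the shape produced by "Copy as cURL (bash)"
sample = """curl 'https://example.com/board' \\
  -H 'User-Agent: Mozilla/5.0' \\
  -H 'Accept: text/html' \\
  --compressed"""

parsed = parse_curl(sample)
print(parsed["url"])
print(parsed["headers"])
```

A real converter also has to handle cookies, request bodies, and many more options, which is exactly why it is convenient to let a tool do it.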

After that, all that is left is to pick a suitable way to extract the data, such as regular expressions, BeautifulSoup, XPath, or CSS selectors.

That’s a lot easier.

But we still don't want to switch back and forth between the browser and a website, so it is better to do everything directly in Python code.

Using it in Python code

First, install the filestools library:

pip install filestools --index-url=https://mirrors.aliyun.com/pypi/simple -U

The library bundles four tools, all of which have been migrated into filestools, so installing this one library gives you all four. You only need to import the module for the tool you want to use.

One of them was recommended in an earlier post: watermarking images with 2 lines of Python.

The curl2py command converts the curl command to Python code.

The help documents are as follows:

E:\>curl2py -h
usage: curl2py [-h] [-f FILE] [-o OUT] [-t] [-c]

The result will be saved to the clipboard

optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  Specifies the file containing the curl command; the result is saved to a py script of the same name
  -o OUT, --out OUT     Save location of the generated py script
  -t, --tmp             Whether to save the py script as tmp.py in the current directory
  -c, --copy            Always copy the result to the clipboard

But what we are going to do today is call it directly from Python.

Take the command you just copied as cURL (bash) and paste it into the code below.

from curl2py.curlParseTool import curlCmdGenPyScript

curl_cmd = """paste the command copied as cURL (bash) here"""
output = curlCmdGenPyScript(curl_cmd)
print(output)

After running it, you get the same result as with the conversion website described above.
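If you would rather not copy the printed code by hand, the result can be written straight to a file. The sketch below assumes that `curlCmdGenPyScript` returns the generated code as a string (which the `print(output)` call above suggests). The helper name `save_generated_script` is hypothetical, and a stand-in converter is used so the example runs without filestools installed.

```python
import tempfile
from pathlib import Path
from typing import Callable


def save_generated_script(curl_cmd: str,
                          converter: Callable[[str], str],
                          out_path) -> Path:
    """Run `converter` on the curl command and write the generated code to `out_path`."""
    code = converter(curl_cmd)
    path = Path(out_path)
    path.write_text(code, encoding="utf-8")
    return path


def fake_converter(cmd: str) -> str:
    # Stand-in for curlCmdGenPyScript, used only to keep this demo self-contained.
    return f"# generated from: {cmd}\n"


with tempfile.TemporaryDirectory() as tmp:
    saved = save_generated_script("curl 'https://example.com'",
                                  fake_converter,
                                  Path(tmp) / "out.py")
    saved_text = saved.read_text(encoding="utf-8")

print(saved_text)
```

With filestools installed, the intended call would be `save_generated_script(curl_cmd, curlCmdGenPyScript, "maoyan.py")`, where the output file name is just an example.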

Paste the generated code into your editor, add the selectors, and run it.

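import re

# `response` is the object returned by the generated requests code
response.encoding = 'utf-8'

# Regular-expression matching
# NOTE: the HTML tags inside these patterns were lost in the original text; they are
# reconstructed here assuming Maoyan's chart-page markup and may need adjusting.
re_name = re.findall(r'data-val="{movieId:.*?}">(.*?)</a>', response.text)
re_star = re.findall(r'<p class="star">\s*(.*?)\s*</p>', response.text, re.S)
re_releasetime = re.findall(r'<p class="releasetime">(.*?)</p>', response.text)
re_integer = re.findall(r'<i class="integer">(.*?)</i>', response.text)
re_fraction = re.findall(r'<i class="fraction">(.*?)</i>', response.text)

# Join the integer and fractional parts of each score
score = []
for n in range(len(re_integer)):
    score.append(re_integer[n] + re_fraction[n])

# Print one line per movie: name, score, cast, release time
for i in range(len(re_name)):
    content = re_name[i] + ' ' + score[i] + ' ' + re_star[i] + ' ' + re_releasetime[i]
    print(content)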

The Maoyan chart data has been scraped successfully.
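The matching step can also be tried offline, without sending any request. The sketch below runs the same kind of regular expressions against a small hand-written HTML fragment. The markup, the class names, and the movie data are all made up for illustration and only assume the general shape of a chart entry, so the patterns will likely need adjusting for the real page.

```python
import re

# Hand-written fragment imitating one chart entry (hypothetical markup and data).
sample_html = """
<dd>
  <p class="name"><a href="/films/1" data-val="{movieId:1}">Example Movie</a></p>
  <p class="star">
      Starring: Actor A, Actor B
  </p>
  <p class="releasetime">Release date: 2000-01-01</p>
  <p class="score"><i class="integer">9.</i><i class="fraction">5</i></p>
</dd>
"""

names = re.findall(r'data-val="\{movieId:.*?\}">(.*?)</a>', sample_html)
# re.S lets "." cross line breaks; \s* trims the whitespace around the cast line.
stars = re.findall(r'<p class="star">\s*(.*?)\s*</p>', sample_html, re.S)
dates = re.findall(r'<p class="releasetime">(.*?)</p>', sample_html)
integers = re.findall(r'<i class="integer">(.*?)</i>', sample_html)
fractions = re.findall(r'<i class="fraction">(.*?)</i>', sample_html)

# The score is split across two tags, so glue the halves back together.
scores = [whole + frac for whole, frac in zip(integers, fractions)]

# zip stops at the shortest list, which avoids an IndexError if one pattern misses.
rows = [" ".join(parts) for parts in zip(names, scores, stars, dates)]
for row in rows:
    print(row)
```

Testing patterns on a saved fragment like this makes it easier to tell a broken regular expression apart from a blocked or changed page.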

If you use Jupyter Notebook, you can define custom code blocks with the Snippets plug-in and save the curl2py code as a reusable template. Other editors have similar features, which you can look up yourself. This saves a great deal of time!

In addition, curl2py converts not only GET requests into requests.get() calls, but also POST requests into requests.post() calls.
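How does a converter know which of the two to emit? With curl itself, an explicit `-X`/`--request` sets the method, and passing a body with `-d`/`--data` (or one of its variants) switches the request to POST. The sketch below applies that rule. It illustrates curl's behavior rather than curl2py's actual logic, and `infer_method` is a hypothetical helper name.

```python
import shlex

# curl options that attach a request body and therefore imply POST
DATA_FLAGS = {"-d", "--data", "--data-raw", "--data-binary", "--data-urlencode"}


def infer_method(curl_cmd: str) -> str:
    """Guess the HTTP method of a curl command (illustrative only)."""
    tokens = shlex.split(curl_cmd.replace("\\\n", " "))
    method = None
    has_data = False
    for i, tok in enumerate(tokens):
        if tok in ("-X", "--request") and i + 1 < len(tokens):
            method = tokens[i + 1].upper()  # an explicit method always wins
        elif tok in DATA_FLAGS:
            has_data = True
    if method:
        return method
    return "POST" if has_data else "GET"


print(infer_method("curl 'https://example.com/board'"))
print(infer_method("curl 'https://example.com/api' --data-raw 'page=1'"))
print(infer_method("curl -X PUT 'https://example.com/api' -d 'page=1'"))
```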

Note: if you are also interested in the reverse conversion, that is, turning Requests code back into a curl command, check out the curlify module.
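To show the idea behind the reverse direction, here is a minimal standard-library sketch that assembles a curl command from a method, a URL, and a headers dict. It is not curlify's implementation; `to_curl` here is a stand-in written for this post, and the URL and header are example values.

```python
import shlex
from typing import Optional


def to_curl(method: str, url: str, headers: dict, data: Optional[str] = None) -> str:
    """Build a shell-safe curl command string (illustrative only)."""
    parts = ["curl", "-X", method.upper()]
    for name, value in headers.items():
        parts += ["-H", f"{name}: {value}"]
    if data is not None:
        parts += ["-d", data]
    parts.append(url)
    # shlex.quote adds quotes only where the shell needs them.
    return " ".join(shlex.quote(p) for p in parts)


cmd = to_curl("GET", "https://example.com/board", {"User-Agent": "Mozilla/5.0"})
print(cmd)
```

With curlify installed, the real thing is a one-liner: `curlify.to_curl(response.request)`.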

If you want to learn more about the Filestools library, check out the following website: pypi.org/project/fil…