Resume Template Download

    • Tools to prepare
    • Project idea analysis
    • Easy source sharing

Tools to prepare

Development environment: Win10, Python 3.7. Development tools: PyCharm, Chrome.

Project idea analysis

Find the hyperlink to each details page and the name of each resume template

Extract parameter information



When using XPath, note that the raw source code of the web page may differ from the page the browser renders. Write the XPath expressions against the raw source returned by the request, because that is the text the data is extracted from.
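
To illustrate, here is a minimal sketch that runs the listing XPath against a small hand-written HTML string standing in for the raw page source. The sample markup, the `example.com` links, and the `extract_entries` helper are assumptions made for illustration, not a copy of the real page; in practice you would pass in `response.text`.

```python
from lxml import etree

# Hypothetical stand-in for the raw source returned by the server.
SAMPLE_SOURCE = """
<html><body>
  <div class="box col3 ws_block">
    <a href="//example.com/jianli/1.htm"><img alt="Sample resume A"></a>
  </div>
  <div class="box col3 ws_block">
    <a href="//example.com/jianli/2.htm"><img alt="Sample resume B"></a>
  </div>
</body></html>
"""


def extract_entries(page_source):
    """Return (href, name) pairs found in the raw page source."""
    tree = etree.HTML(page_source)
    entries = []
    for a in tree.xpath("//div[@class='box col3 ws_block']/a"):
        hrefs = a.xpath('./@href')
        names = a.xpath('./img/@alt')
        # Skip anchors that lack either the link or the image alt text.
        if hrefs and names:
            entries.append((hrefs[0], names[0]))
    return entries


entries = extract_entries(SAMPLE_SOURCE)
print(entries)
```

If the function returns an empty list for a real page even though the elements are visible in the browser, the markup was probably generated after loading and is not present in the source.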

    html_data = etree.HTML(page) 
    a_list = html_data.xpath("//div[@class='box col3 ws_block']/a")  
    for a in a_list:
        resume_href = 'https:' + a.xpath('./@href')[0]  
        resume_name = a.xpath('./img/@alt')[0]  
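
The snippet above prepends `'https:'` because the `href` values are protocol-relative (they start with `//`). As an alternative sketch, the standard library's `urljoin` resolves protocol-relative, root-relative, and already absolute links alike. The `absolute_url` helper and the `example.htm` path are hypothetical; the base URL is simply the first listing page used later in this post.

```python
from urllib.parse import urljoin

# Listing page the hrefs were found on; used only as the base for resolution.
BASE_URL = 'https://sc.chinaz.com/jianli/free_2.html'


def absolute_url(href, base=BASE_URL):
    """Resolve an href against the page it was found on."""
    return urljoin(base, href)


# Three ways the same (made-up) link might appear in the page source.
protocol_relative = absolute_url('//sc.chinaz.com/jianli/example.htm')
root_relative = absolute_url('/jianli/example.htm')
already_absolute = absolute_url('https://sc.chinaz.com/jianli/example.htm')
print(protocol_relative, root_relative, already_absolute)
```

All three calls resolve to the same absolute URL, so the loop does not need to know which form the site uses.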

Enter the Details page

Find the address of the corresponding details page

Extract the download address of the rar

        resume_tree = etree.HTML(resume_page)  
        resume_link = resume_tree.xpath('//ul[@class="clearfix"]/a/@href')[0]
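
Indexing with `[0]` raises `IndexError` when the XPath matches nothing, for example if a details page has no download link. One possible guard is sketched below; `first_or_none` is a hypothetical helper, and the lists stand in for the results of `resume_tree.xpath(...)`.

```python
def first_or_none(matches):
    """Return the first XPath match, or None when the list is empty."""
    return matches[0] if matches else None


# Lists standing in for XPath results; the link value is made up.
found = first_or_none(['//example.com/files/sample.rar'])
missing = first_or_none([])

if missing is None:
    print('No download link found, skipping this template')
```

Inside the loop, a `None` result can be handled with `continue` so that one incomplete page does not stop the whole run.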

Easy source sharing

    import os

    import requests
    from lxml import etree

    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0',
    }

    # Make sure the output folder exists before writing any files
    os.makedirs('./moban', exist_ok=True)

    for i in range(2, 10):
        url = f'https://sc.chinaz.com/jianli/free_{i}.html'  # URL of listing page i
        response = requests.get(url=url, headers=headers)
        html_data = etree.HTML(response.text)
        a_list = html_data.xpath("//div[@class='box col3 ws_block']/a")
        for a in a_list:
            new_url = 'https:' + a.xpath('./@href')[0]  # details page address
            name = a.xpath('./img/@alt')[0]  # resume template name
            res = requests.get(url=new_url)  # request the details page
            resume_tree = etree.HTML(res.text)
            resume_url = resume_tree.xpath('//ul[@class="clearfix"]/a/@href')[0]
            result = requests.get(url=resume_url, headers=headers).content  # obtain binary data
            path = './moban/' + name + '.rar'
            with open(path, 'wb') as fp:
                fp.write(result)  # save the file
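
Because the file name comes straight from the image `alt` text, it may contain characters that Windows does not allow in file names. The following is a minimal sketch of one way to build a safe save path; `target_path` and `save_bytes` are hypothetical helpers, not part of the original script, and the `./moban` folder and `.rar` suffix are taken from the source above.

```python
import re
from pathlib import Path

# Characters that Windows does not allow in file names.
_INVALID = re.compile(r'[\\/:*?"<>|]')


def target_path(name, folder='./moban', suffix='.rar'):
    """Build a save path from a template name, replacing invalid characters."""
    safe = _INVALID.sub('_', name).strip() or 'untitled'
    return Path(folder) / (safe + suffix)


def save_bytes(path, data):
    """Create the parent folder if needed, then write the bytes."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)


# The name below is made up to show the replacement.
example = target_path('a/b:c')
print(example.name)
```

In the loop, `save_bytes(target_path(name), result)` could then replace the manual path concatenation and the `open` call.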
