Article | so-and-so rice \

Source: Python Technology “ID: pythonall”

What is worth buying is a website that shares information about e-commerce offers. Each offer comes with a number and a comment on whether the offer is worth it, which adds some trust to consumers. Xiaobian also often find some daily necessities, but often will miss the large discount content. Today we will push the preferential information to QQ. \

The server chooses Tencent cloud function, which can run script tasks regularly. It is free and a good assistant for crawler scripts.

Push to QQ select QMSG sauce, because its operation is very simple, after logging in 1 minutes will learn, use HTTP way push content to QQ.

crawling

Capture and analysis began in the whistleblower interface of the website, and the ID of the whistleblower was used as a parameter to change the content of the capture page.

This page is relatively simple, just find the A tag hyperlink under div with class pandect-content-stuff and span content with class pandect-content-time.

import requests
from bs4 import BeautifulSoup
import time

userAgent = {
            "User-Agent""Mozilla / 5.0 (Windows NT 10.0; Win64; X64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.72 Safari/537.36"
        }

def parse_html(event, context):
    now = time.time()
    authorIds = ['1222805984']
    for author in authorIds:
        url = 'https://zhiyou.smzdm.com/member/' + author + '/baoliao/'


        html_content = requests.get(url, headers = userAgent).content

        soup = BeautifulSoup(html_content, 'html.parser', from_encoding='utf-8')
        infos = soup.find_all(name='div',attrs={'class''pandect-content-stuff'})


        for info in infos:
            a = info.find(name='div', attrs={'class''pandect-content-title'}).a
            t = info.find(name='span', attrs={'class''pandect-content-time'}).text # push only5Content_time = time.mktime(time.strptime('2021 -' + t + '00'."%Y-%m-%d %H:%M:%S"))
            if((now - content_time) < 5 * 60):
                content = a.text.strip() + '\r\n' + a['href']
                push_qmsg(content)


def push_qmsg(msg):
    key = 'xxx'
    url = 'https://qmsg.zendee.cn/send/' + key
    msg = {'msg':  msg}
    requests.post(url, params=msg)
Copy the code

Use Tencent Cloud functions

Prepare the cloud functions directory

Put the bs folder of the third-party BeautifulSoup module in the same directory as the script files,

PIP install BeautifulSoup4 -t SMZDM /Copy the code

use

In the custom create https://console.cloud.tencent.com/scf/list-create?rid=1&ns=default pages, and the function code and trigger configuration panel write below.

Click the Finish button to open the function code screen, and test whether the deployment is correct. The successful test indicates that the deployment has been successful and you can grab the discount information every 5 minutes.

conclusion

With what is worth buying crawler to help everyone use Tencent cloud function for free, save money and effort spent in crawler server. If this article is useful to you, please give it a thumbs up and read it again.

PS: Reply “Python” within the public number to enter the Python novice learning exchange group, together with the 100-day plan!

Old rules, brothers still remember, the lower right corner of the “watching” click, if you feel the content of the article is good, remember to share moments to let more people know!

[Code access ****]

Identify the qr code at the end of the text, reply: 210421