This article is published by the NetEase Cloud Community with the authorization of its author, Zhang Gengyuan.
Ever since the company's Yixin official account gained a "today's menu" query, I have gradually formed the habit of checking the menu at each canteen window before going to eat, and then deciding where to go.
But the more I use this feature, the more inconvenient I find it. At present, querying the menu through the official account takes the following steps:
1. Open Yixin
2. Open the NetEase Genie official account
3. Tap Convenient Services
4. Tap Today's Menu
5. Wait for the entry link to today's menu to be returned
6. Tap the entry link to view today's menu
For a routine that has to be performed at least twice a day, this process is passive and rather cumbersome. Steps 5 and 6 in particular need network access; when the phone's connection is unstable (weak WiFi or 4G signal, which is common while riding or waiting for the elevator), any step can hang and the query fails. Some colleagues, for various reasons, do not follow the NetEase Genie official account and so cannot check today's menu at all.
So I wondered: is there an easier way, in which the daily menu is actively pushed to the phone and can be viewed with a single tap?
1. Tap the push message
2. View today's menu
Having had the idea, I set about building it.
The work breaks down into three steps:
1. Data capture
2. Data processing
3. Data push
Each step is described below.
Data capture
To crawl the menu data, I first needed to know where it is served from. I have never developed for official accounts myself, but in my experience the articles such accounts publish are generally plain HTTP pages. Finding the source of the daily menu data therefore means finding the pattern in those pages' addresses.
There are many ways to capture a phone's network requests. The most convenient is to run a tool like tcpdump in the background on the phone and then open Today's Menu in Yixin to capture the traffic. However, my phone runs iOS and is not jailbroken, so the sandbox makes this impossible.
The method I finally used was to run a mitmproxy service on a computer on the same network and set the phone's HTTP proxy to that computer's address. After opening the today's menu link in Yixin, mitmproxy shows the phone's HTTP access records, which include the URL of today's menu that we want to crawl.
The URL of today's menu turns out to follow this pattern:

    http://numenplus.yixin.im/singleNewsWap.do?companyId=1&materialId=${id}

There is only one variable, ${id}, a positive integer that appears to be the article ID. The daily menus we want to crawl all live at these links, alongside other promotional articles published by the NetEase Genie official account. The pages can be fetched with a plain HTTP GET, with no additional processing.
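The capture step above could be scripted rather than read off the mitmproxy console. The following is a minimal sketch of a mitmproxy addon script, assuming the URL pattern shown above; the file name `log_menu_urls.py`, the helper `material_id` and the list `seen_ids` are made up for illustration and are not part of the original project.

```python
# log_menu_urls.py: hypothetical addon script, loaded with `mitmdump -s log_menu_urls.py`
from urllib.parse import parse_qs, urlsplit

TARGET_HOST = "numenplus.yixin.im"
seen_ids = []  # article IDs observed so far (illustrative only)


def material_id(url):
    """Return the materialId query parameter of an article URL, or None."""
    parts = urlsplit(url)
    if parts.hostname != TARGET_HOST:
        return None
    values = parse_qs(parts.query).get("materialId", [])
    try:
        return int(values[0])
    except (IndexError, ValueError):
        return None


def request(flow):
    """Hook that mitmproxy calls for each client request passing through the proxy."""
    url = flow.request.pretty_url
    mid = material_id(url)
    if mid is not None:
        seen_ids.append(mid)
        print("article request: %s (materialId=%d)" % (url, mid))
```

Filtering on the host name keeps the log free of the phone's unrelated background traffic, which is usually most of what a proxy sees.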
I studied the article IDs but found no pattern in how the ID of a today's menu article is generated. My guess is that the IDs are generated on Yixin's back end and the mobile client has no direct way to derive them. So I simply crawl every ID in turn and check the content: if it is today's menu, the article is processed; if not, it is ignored. That gives us the data source for today's menu.
I am most familiar with Python, so I implemented it in Python:

    import logging

    import requests

    LOG = logging.getLogger(__name__)


    def http_get(url, timeout=3):
        """GET url; return the response on HTTP 200, otherwise None."""
        try:
            # timeout must be passed by keyword: the second positional
            # parameter of requests.get() is params
            res = requests.get(url, timeout=timeout)
        except requests.RequestException:
            LOG.exception("Failed to GET: %s", url)
            return None
        return res if res.status_code == 200 else None


    def fetch(start, step=300):
        last_id = start
        for i in range(start, start + step):
            url = ("http://numenplus.yixin.im/singleNewsWap.do?"
                   "companyId=1&materialId=%d" % i)
            response = http_get(url)
            if not response:
                continue
            # handle menu data here

Data processing
Data processing involves two main tasks:
1. As mentioned above, checking whether the crawled article is today's menu
2. Parsing the HTML content to extract the menu information we want
The first task is relatively simple and can be handled with a keyword regular expression match. For example, if the article contains the words "Today's menu", we treat it as today's menu.
The second task is a little more involved. We need to extract the text from the crawled HTML source and then derive the menu's date and its breakfast, lunch and dinner entries. This can also be done with slightly more complex regular expressions.
For the HTML handling I used BeautifulSoup, a well-known third-party Python library that is very easy to use:
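Getting the menu content:

    from bs4 import BeautifulSoup as BS


    def _parse(self, content):
        """Parse raw HTML; return None on failure or for an error page."""
        try:
            bs = BS(content, "html.parser")
        except Exception:
            LOG.exception("Failed to parse content: %s", content)
            return None
        # an element with class "m-error" presumably marks an error page
        if bs.find_all(class_="m-error"):
            return None
        return bs


    def _handle_menu(bs):
        """Return the article body element (id="divCNT"), or None."""
        # find() returns None rather than raising when nothing matches
        content = bs.find(id="divCNT")
        if content is None:
            LOG.warning("Failed to get content")
        return content
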
Before parsing, the content is raw HTML markup. After parsing, all the HTML tags have been stripped out and only the text remains.
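To make the before and after concrete, here is a small self-contained sketch of the same idea. It deliberately swaps BeautifulSoup for the standard library's `html.parser` so that it runs without third-party packages; the class `TextExtractor`, the function `extract_text` and the sample page are all invented for illustration, and the real article markup will differ.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TextExtractor(HTMLParser):
    """Collect the text found inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # nesting depth inside the target element; 0 means outside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, target_id="divCNT"):
    """Return the tag-free text of the element whose id is target_id."""
    parser = TextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Invented sample page standing in for a crawled article.
SAMPLE = ('<html><body><div id="nav">Home</div>'
          '<div id="divCNT"><p>今日菜单 3月14日</p><p>早餐<br>粥</p></div>'
          '</body></html>')

print(extract_text(SAMPLE))
```

Running it prints the three text fragments inside `divCNT`, one per line and free of markup; the navigation text outside that element is dropped.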
Checking whether it is today's menu:

    import re


    def _is_menu(text):
        # \u4eca\u65e5\u83dc\u5355 => 今日菜单 ("Today's menu")
        # re itself interprets the \uXXXX escapes in a raw pattern (Python 3.3+)
        return bool(re.search(r"\u4eca\u65e5\u83dc\u5355", text, re.UNICODE))

Extracting the menu date:

    import datetime


    def _handle_date(content):
        # \u6708 => 月 (month), \u65e5 => 日 (day)
        res = re.findall(r"(\d+)\u6708(\d+)\u65e5", content.text, re.UNICODE)
        if not res:
            LOG.warning("Failed to parse date")
            return None
        month, day = (int(i) for i in res[0])
        # the article gives only month and day, so assume the current year
        year = datetime.datetime.now().year
        return datetime.datetime(year, month, day)

Extracting the breakfast, lunch and dinner contents:

    def _menu_to_text(content):
        # \u65e9\u9910 => 早餐 (breakfast)
        # \u4e2d\u9910 => 中餐 (lunch)
        # \u665a\u9910 => 晚餐 (dinner)
        # \u591c\u5bb5 => 夜宵 (midnight snack)
        text = content.get_text()
        res = re.findall(r"\u65e9\u9910([\s\S]+)\u4e2d\u9910([\s\S]+)"
                         r"\u665a\u9910([\s\S]+)\u591c\u5bb5",
                         text, re.UNICODE | re.MULTILINE)
        if not res:
            LOG.warning("Failed to match menu")
            return None
        # BREAKFAST, LUNCH and SUPPER are key constants defined elsewhere
        menu = {}
        menu[BREAKFAST] = res[0][0]
        menu[LUNCH] = res[0][1]
        menu[SUPPER] = res[0][2]
        return menu

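The three checks above can be tried offline on a piece of sample text. The sketch below combines them into one hypothetical function, `parse_menu`; the sample text is invented, the dictionary keys are plain strings rather than the project's constants, and the patterns are written with literal Chinese characters instead of `\u` escapes purely for readability.

```python
import datetime
import re

# Invented sample of what the tag-free article text might look like.
SAMPLE = (
    "今日菜单\n"
    "3月14日\n"
    "早餐\n粥、包子\n"
    "中餐\n红烧肉、青菜\n"
    "晚餐\n面条\n"
    "夜宵\n水饺\n"
)

DATE_RE = re.compile(r"(\d+)月(\d+)日")
MENU_RE = re.compile(r"早餐([\s\S]+)中餐([\s\S]+)晚餐([\s\S]+)夜宵")


def parse_menu(text, year=None):
    """Return (date, meals) for a menu article, or None if it is not one."""
    if "今日菜单" not in text:
        return None
    date_match = DATE_RE.search(text)
    menu_match = MENU_RE.search(text)
    if not date_match or not menu_match:
        return None
    # The article carries no year, so fall back to the current one.
    year = year or datetime.date.today().year
    month, day = (int(g) for g in date_match.groups())
    breakfast, lunch, supper = (g.strip() for g in menu_match.groups())
    meals = {"breakfast": breakfast, "lunch": lunch, "supper": supper}
    return datetime.date(year, month, day), meals


print(parse_menu(SAMPLE, year=2016))
```

Because each meal heading acts as the delimiter for the previous section, the approach depends on the headings appearing in this fixed order; an article that omits one of them simply fails to match and is skipped.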
Data push
With crawling and processing of today's menu solved, what remains is pushing the menu content to the phone.
From my research, the usable third-party push services on iOS include Pushover, Pushbullet, Boxcar and Amazon SNS.
Amazon SNS does not provide a ready-made client, so it was ruled out first. Pushover looks like the best option, but it charges a one-time license fee of $5 per mobile client, so it was ruled out as well. All things considered, I chose Pushbullet: it is full-featured, free, well documented and supported on all platforms.
All that is needed is an HTTP POST request written according to Pushbullet's API documentation:

    # PUSHBULLET_API, PUSHBULLET_TOKEN and PUSHBULLET_CHANNEL are
    # configuration constants defined elsewhere
    def send_notification(subject, content, channel=PUSHBULLET_CHANNEL):
        try:
            res = requests.post(
                "%s/pushes" % PUSHBULLET_API,
                headers={"Access-Token": PUSHBULLET_TOKEN},
                data={"title": subject, "body": content,
                      "type": "note", "channel_tag": channel},
                timeout=30)
        except requests.RequestException:
            LOG.exception("Failed to send notification")
        else:
            if res.status_code != 200:
                LOG.warning("Error when pushing notification")

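The article does not show how the parsed menu dictionary becomes the `subject` and `content` passed to `send_notification`. One possible way to bridge that gap is sketched below; the function `format_notification`, the string values of the key constants and the English labels are all assumptions made for illustration, not the project's actual code.

```python
import datetime

# Hypothetical stand-ins for the project's dictionary-key constants.
BREAKFAST, LUNCH, SUPPER = "breakfast", "lunch", "supper"


def format_notification(date, menu):
    """Build a (subject, body) pair from a parsed menu dictionary."""
    subject = "Menu for %s" % date.strftime("%Y-%m-%d")
    sections = []
    for key, label in ((BREAKFAST, "Breakfast"), (LUNCH, "Lunch"), (SUPPER, "Supper")):
        dishes = menu.get(key, "").strip()
        if dishes:  # skip meals that have no dishes listed
            sections.append("%s:\n%s" % (label, dishes))
    return subject, "\n\n".join(sections)
```

The result could then be handed over with `send_notification(*format_notification(date, menu))`. Keeping the formatting separate from the HTTP call also makes it easy to reuse the same text for another delivery channel.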
The menu arrives on the phone as a push notification, and Pushbullet's PC and Mac clients receive it as well.
Putting the snippets above together gives a small project that fetches and pushes today's menu. The final runnable code, which can also send the menu by email, is here:

    https://g.hz.netease.com/hzzhanggy/what2eat2day_ntes

Automation
The whole crawl-and-push process is now written. The last thing to do is automate it, so that all we need to do at each meal is glance at the push message on the phone.
In practice this means turning the crawl and the push into scheduled tasks. I implemented this with systemd timers:
run.sh, a wrapper that runs the script inside the virtualenv:

    #!/bin/bash
    BASE=/home/stanzgy/workspace/what2eat2day_ntes
    $BASE/.venv/bin/python $BASE/fetch.py "$@"

menu_fetch.service, the service file that fetches today's menu:

    [Unit]
    Description=Fetch NetEase menu today

    [Service]
    Type=oneshot
    ExecStart=/home/stanzgy/workspace/what2eat2day_ntes/run.sh f

    [Install]
    WantedBy=multi-user.target

menu_fetch.timer, the timer file that triggers the fetch:

    [Unit]
    Description=Fetch NetEase menu everyday

    [Timer]
    OnCalendar=Mon-Fri *-*-* 10:00:00
    Unit=menu_fetch.service

    [Install]
    WantedBy=multi-user.target

The timer configuration for pushing today's menu is similar to the above and differs only in the command-line argument passed in, so it is omitted here. The end result is a push notification with the day's menu before each meal.