Python automatically finds articles you like

When I'm slacking off, I inevitably end up wandering around Juejin and CSDN, looking for articles to read.

There are many articles and videos on the front page, but which ones do we like?

Sorting through them by hand is tiring, and I'm so used to slacking off that I'm too lazy to do it myself.

So I opened Eclipse Theia.

The problem is simple: essentially, use Python to go through all the articles. Even so, it took me two days and grew into a fairly big open source project. Then again, it was my first time posting code on GitHub. Github.com/wjhtwx/pyth…

The approach

Set up favourite.json to store favorite keywords, links.json to store the links that were found, forbid.json to store prohibited sites, and sites.json to store the sites to crawl. printc is used for colored output. visit_articles.py is the main file. The process:

  • Start.
  • Check whether the JSON file is intact.
  • Start visiting the sites listed in sites.json.
  • Check whether the current page has been visited to avoid loops.
  • Store the pages you like in links.json so they can be opened together later.
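The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the JSON file names come from the article, but the default file contents, the function names `load_json_files` and `crawl`, the `max_pages` limit, and the hypothetical `get_page` callable (which stands in for the requests and beautifulsoup4 work) are all assumptions.

```python
import json
from collections import deque
from pathlib import Path

# File names come from the article; the default contents are assumptions.
DEFAULTS = {
    "favourite.json": [],  # favorite keywords
    "links.json": [],      # links found so far
    "forbid.json": [],     # sites that must not be visited
    "sites.json": [],      # sites to start crawling from
}


def load_json_files(folder):
    """Load each JSON file, recreating any that is missing or corrupt."""
    data = {}
    for name, default in DEFAULTS.items():
        path = Path(folder) / name
        try:
            data[name] = json.loads(path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            # The file is not intact: reset it to an empty default.
            path.write_text(json.dumps(default), encoding="utf-8")
            data[name] = list(default)
    return data


def crawl(start_urls, keywords, forbidden, get_page, max_pages=50):
    """Breadth-first crawl returning URLs whose title contains a keyword.

    get_page(url) is a hypothetical callable returning (title, links).
    """
    visited = set()
    liked = []
    queue = deque(start_urls)
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited or any(site in url for site in forbidden):
            continue  # avoid loops and skip prohibited sites
        visited.add(url)
        title, links = get_page(url)
        if any(word.lower() in title.lower() for word in keywords):
            liked.append(url)
        queue.extend(links)
    return liked
```

Passing `get_page` in as a parameter keeps the loop itself free of network code, so it can be tried out with a small dictionary of fake pages before pointing it at real sites.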

Required environment

python3: must be CPython; some modules may not work under PyPy3.
colorama: displays colored text; printc is a wrapper around it.
requests: fetches pages.
beautifulsoup4: parses HTML.
lxml: the parser that beautifulsoup4 relies on.
urllib: part of the Python standard library; parses URLs.
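To illustrate the URL-parsing role that urllib plays here, the sketch below uses only `urllib.parse` from the standard library. The helper names `normalize_link` and `is_forbidden` are hypothetical, and treating the entries of forbid.json as host names is an assumption; the actual project may store and check them differently.

```python
from urllib.parse import urldefrag, urljoin, urlparse


def normalize_link(page_url, href):
    """Turn an href found on page_url into an absolute URL without a fragment."""
    absolute = urljoin(page_url, href)
    return urldefrag(absolute).url


def is_forbidden(url, forbidden_hosts):
    """Return True if the URL's host is a forbidden host or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return any(host == bad or host.endswith("." + bad) for bad in forbidden_hosts)
```

Normalizing links before comparing them helps the "already visited" check, since `page.html` and `page.html#comments` would otherwise count as two different pages.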

File directory

readme

printc: https://github.com/wjhtwx/python_requests_articles_finding/blob/visit_articles/printc.md

json: https://github.com/wjhtwx/python_requests_articles_finding/blob/visit_articles/json.md

Github.com/wjhtwx/pyth…

The code is well commented and can be viewed directly on GitHub.

Result of a run

Next time we’ll look at how the code works.