Two third-party packages are used:
- superagent: a lightweight, flexible HTTP client for Node.js
- cheerio: a jQuery-like HTML parser for Node.js
Simple code
const superagent = require('superagent')
const cheerio = require('cheerio')
const fs = require('fs')

// Fetch the page, then hand the HTML to cheerio
superagent.get('https://www.acfun.cn/v/list123/index.htm').then(res => {
  const $ = cheerio.load(res.text)
  let el = $('body').find('img')
  el.map((index, i) => {
    // Lazy-loaded images keep the real URL in data-original
    let img_src = $(i).attr('data-original') || $(i).attr('src'),
        news_title = $(i).attr('alt') || ''
    if (!img_src) return
    // Stream each image straight into the ./image folder
    superagent.get(img_src).pipe(fs.createWriteStream('./image/' + index + '.jpg'))
  })
}, function (error) {
  console.log(error)
})
Approach
Use superagent to request the page, and let cheerio parse the HTML and pull out the image URLs we want.
Then use superagent again to fetch each image as a stream and write it to a file.
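One detail the snippet above glosses over: fs.createWriteStream fails with ENOENT if the ./image folder does not exist yet. Here is a minimal sketch of the download step that creates the folder first (the image URL is a placeholder, not from the original post):

const superagent = require('superagent')
const fs = require('fs')

// Create the output folder up front; createWriteStream does not create directories
if (!fs.existsSync('./image')) {
  fs.mkdirSync('./image')
}

// Fetch one image as a stream and pipe it straight to disk
superagent
  .get('https://example.com/some-image.jpg') // placeholder URL
  .pipe(fs.createWriteStream('./image/0.jpg'))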
Note
Some sites block bare requests like this, so try it against your target site first.
A simple workaround is to add request headers (User-Agent, Referer, and so on) through superagent's .set() method, as sketched below.
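A minimal sketch of that idea, reusing the listing page from above (the header values are illustrative, not from the original post):

const superagent = require('superagent')

superagent
  .get('https://www.acfun.cn/v/list123/index.htm')
  .set('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)') // look like a normal browser
  .set('Referer', 'https://www.acfun.cn/')                        // some sites also check the referer
  .then(res => {
    console.log(res.status)
  }, error => {
    console.log(error)
  })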
GitHub: github.com/CHU295/node…