Preface

Idle at home over the weekend, I was scrolling through WeChat on my phone and decided it was time to change my avatar. While looking through avatar pictures online, it occurred to me that, as a programmer, I could crawl those pictures and build a WeChat mini program around them. Anyone who knows the basics can follow along, so here is a write-up to share with everyone.

Contents

  • Install Node and download the dependencies
  • Set up the server
  • Request the page we want to crawl and return JSON

Install Node

To get started, download Node from the official website, nodejs.org/zh-cn/, and install it. Then run the following command:

node -v

If the installation succeeded, the installed version number is printed.

Next, use Node to print hello world. Create a new file named index.js and enter the following:

console.log('hello world')

Run this file

node index.js

It prints hello world to the console.

Setting up the server

Create a new folder named node.

First, install the Express dependency:

npm install express 

Create a new file named demo.js in that folder.



Require the installed express package in demo.js:

const express = require('express');
const app = express();
app.get('/index', function(req, res) {
    res.end('111')
})
var server = app.listen(8081, function() {
    var host = server.address().address
    var port = server.address().port
    console.log("Application instance, access at http://%s:%s", host, port)

})

Run node demo.js and the simple server is up. Requesting http://localhost:8081/index returns 111.
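
For comparison, the same /index route can be sketched with only Node's built-in http module, which shows roughly what Express is handling for us. This is an illustration rather than part of the project, and the handleRequest name is made up here.

```javascript
const http = require('http');

// Hypothetical handler: answers GET /index with '111', like the Express route,
// and replies 404 to everything else.
function handleRequest(req, res) {
    // Ignore any query string when matching the path.
    const path = req.url.split('?')[0];
    if (req.method === 'GET' && path === '/index') {
        res.statusCode = 200;
        res.end('111');
    } else {
        res.statusCode = 404;
        res.end('Not Found');
    }
}

// Build the server; call server.listen(8081) to start serving on the same port as demo.js.
const server = http.createServer(handleRequest);
```

Express adds routing, query parsing, and helpers such as res.json on top of this, which is why the rest of the walkthrough keeps using it.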



Request the page we want to crawl

First, install the three dependencies used for crawling:

npm install superagent
npm install superagent-charset
npm install cheerio

SuperAgent is a lightweight, progressive HTTP request library with a readable API and a gentle learning curve. In Node.js it is built on the native HTTP request API, so it can be used to send HTTP requests from a Node.js environment.

superagent-charset extends SuperAgent so that a response can be decoded with a specified character encoding, which prevents garbled characters in the crawled data.
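
To see why the charset matters, here is a small sketch that decodes the same bytes two ways. It uses Node's built-in TextDecoder as a stand-in rather than superagent-charset itself, and it assumes a Node build with full ICU (the default in recent official releases) so that the gb2312 label is available.

```javascript
// The bytes of the text '中文' as a gb2312-encoded page would send them.
const bytes = Uint8Array.from([0xd6, 0xd0, 0xce, 0xc4]);

// Decoding with the matching charset recovers the text.
const decoded = new TextDecoder('gb2312').decode(bytes);

// Decoding as UTF-8 yields replacement characters, the kind of garbling the plugin avoids.
const garbled = new TextDecoder('utf-8').decode(bytes);

console.log(decoded);
console.log(garbled === decoded); // false
```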

Cheerio is a fast, flexible implementation of core jQuery designed for the server. Once the dependencies are installed, you can import them:

var superagent = require('superagent');
var charset = require('superagent-charset');
charset(superagent);
const cheerio = require('cheerio');

With the dependencies imported, the page we will request is https://www.qqtn.com/tx/weixintx_1.html.



Declare the base address variable:

const baseUrl = 'https://www.qqtn.com/'
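
The page address above follows the pattern tx/<type>tx_<page>.html under the base address. Since the type and page number come from the request in the complete code, a minimal sketch of building the address with some basic validation might look like this. The buildPageUrl name and the validation rules are assumptions for illustration, not part of the project.

```javascript
const baseUrl = 'https://www.qqtn.com/';

// Hypothetical helper: build the list-page address for an avatar type and page number.
// Assumed rules: type is lowercase letters only, page is a positive integer.
function buildPageUrl(type = 'weixin', page = '1') {
    if (!/^[a-z]+$/.test(type)) {
        throw new Error('invalid type: ' + type);
    }
    if (!/^[1-9]\d*$/.test(String(page))) {
        throw new Error('invalid page: ' + page);
    }
    return `${baseUrl}tx/${type}tx_${page}.html`;
}

console.log(buildPageUrl()); // https://www.qqtn.com/tx/weixintx_1.html
```

Rejecting unexpected values before building the address keeps arbitrary query input out of the outgoing request.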

With that in place, it is time to send the request. Here is the complete code for demo.js:

var superagent = require('superagent');
var charset = require('superagent-charset');
charset(superagent);
var express = require('express');
var cheerio = require('cheerio');
var baseUrl = 'https://www.qqtn.com/';
var app = express();
app.get('/index', function(req, res) {
    // Set the response headers to allow cross-origin requests
    res.header('Access-Control-Allow-Origin', '*');
    res.header('Access-Control-Allow-Methods', 'PUT, GET, POST, DELETE, OPTIONS');
    res.header('Access-Control-Allow-Headers', 'X-Requested-With, Content-Type');
    // Read the avatar type and page number from the query string, with defaults
    var type = req.query.type || 'weixin';
    var page = req.query.page || '1';
    var route = `tx/${type}tx_${page}.html`;
    // The page is encoded as gb2312, so use .charset('gb2312').
    // Most web pages are UTF-8 and can use .charset('utf-8') directly.
    superagent.get(baseUrl + route)
        .charset('gb2312')
        .end(function(err, sres) {
            var items = [];
            if (err) {
                console.log('ERR: ' + err);
                res.json({ code: 400, msg: String(err), data: items });
                return;
            }
            var $ = cheerio.load(sres.text);
            $('div.g-main-bg ul.g-gxlist-imgbox li a').each(function(idx, element) {
                var $element = $(element);
                var $subElement = $element.find('img');
                var thumbImgSrc = $subElement.attr('src');
                items.push({
                    title: $(element).attr('title'),
                    href: $element.attr('href'),
                    thumbSrc: thumbImgSrc
                });
            });
            res.json({ code: 200, msg: "", data: items });
        });
});
var server = app.listen(8081, function() {

    var host = server.address().address
    var port = server.address().port

    console.log("Application instance, access at http://%s:%s", host, port)

})
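
The href and thumbSrc values are passed through exactly as they appear in the page. If the site uses relative or protocol-relative links (an assumption, not verified here), a client such as a mini program would need absolute addresses. A minimal sketch using Node's built-in URL class, with a hypothetical toAbsolute helper:

```javascript
const baseUrl = 'https://www.qqtn.com/';

// Hypothetical helper: resolve a scraped link against the base address.
// Returns undefined for missing attributes so callers can filter them out.
function toAbsolute(link, base = baseUrl) {
    if (!link) {
        return undefined;
    }
    return new URL(link, base).href;
}

// Example items shaped like the ones pushed in demo.js (values are made up).
const items = [
    { title: 'a', href: '/tx/1.html', thumbSrc: '//pic.example.com/a.jpg' },
    { title: 'b', href: 'https://www.qqtn.com/tx/2.html', thumbSrc: undefined }
];

const normalized = items.map(function(item) {
    return {
        title: item.title,
        href: toAbsolute(item.href),
        thumbSrc: toAbsolute(item.thumbSrc)
    };
});

console.log(normalized);
```

Links that are already absolute pass through unchanged, so the helper is safe to apply to every item.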

Run demo.js and request http://localhost:8081/index. The response is JSON whose data array holds the title, href, and thumbSrc of each avatar.

That completes a simple Node crawler. If you found it useful, please star the project as a sign of your recognition and support. Thank you.

Project address: github.com/Mr-MengBo/R…