Lately all my friends have been house-hunting, glued to the major housing price websites every day. After a few days of browsing, they complained to me that they still had no concrete picture of housing prices across Hangzhou. Sharp as ever, I caught the scent of a requirement: with someone as capable as me around, how could I let them keep browsing like this?

The finished result looks like this:


Enough talk! I will keep watch over your housing prices, and it's time to bring out the tools of my trade.


First, a look at our gear and the mission

Great, we built the basic skeleton with Nuxt, then added the files we needed. The final structure of the whole project is as follows:

The first step is to optimize the server/index.js generated by Nuxt

  • We are going to create a Server class
  • Extract the middleware and insert our middleware during server creation

The code is as follows:

import Koa from 'koa';
import { Nuxt, Builder } from 'nuxt';
import R from 'ramda';
import { resolve } from 'path'

// Import and Set Nuxt.js options
let config = require('../nuxt.config.js')
// Compare NODE_ENV, not the whole process.env object, which never equals a string
config.dev = !(process.env.NODE_ENV === 'production')
const host = process.env.HOST || '127.0.0.1'
const port = process.env.PORT || 4000
const MIDDLEWARES = ['database', 'crawler', 'router']
const r = path =>resolve(__dirname,path)

class Server {
  constructor(){
    this.app = new Koa();
    this.useMiddleWares(this.app)(MIDDLEWARES)
  }

  useMiddleWares(app){
    // Load the different middlewares
    return R.map(R.compose(
        R.map( i =>i(app)),
        require,
        i => `${r('./middlewares')}/${i}`
    ))
  }

  async start () {
    // Instantiate nuxt.js
    const nuxt = new Nuxt(config)
    // Build in development
    if (config.dev) {
      const builder = new Builder(nuxt)
      await builder.build()
    } 
    this.app.use(async (ctx, next) => {
      await next()
      ctx.status = 200 // koa defaults to 404 when it sees that status is unset
      return new Promise((resolve, reject) => {
        ctx.res.on('close', resolve)
        ctx.res.on('finish', resolve)
        nuxt.render(ctx.req, ctx.res, promise => {
          promise.then(resolve).catch(reject)
        })
      })
    })
    this.app.listen(port, host)
    console.log('Server listening on ' + host + ':' + port) // eslint-disable-line no-console
  }
}

const app = new Server();

app.start()
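To make the `R.map(R.compose(...))` pipeline in `useMiddleWares` easier to follow, here is a rough plain-JavaScript sketch of the same idea: turn each middleware name into a path, load the module, and call every exported function with the app. The `load` parameter, `fakeModules` and `fakeApp` are stand-ins invented for this sketch so that it runs without real files; in the project, `load` would simply be `require`.

```javascript
// Plain-JavaScript sketch of the loading step, without ramda.
// `load` stands in for `require` so the sketch runs without real files.
function useMiddlewares (app, dir, load) {
  return names => names.map(name => {
    // 1. Build the module path and load the module
    const mod = load(`${dir}/${name}`)
    // 2. Call every exported function with the app instance
    return Object.keys(mod).map(key => mod[key](app))
  })
}

// Hypothetical stand-ins for ./middlewares/database.js and ./middlewares/router.js
const fakeModules = {
  './middlewares/database': { database: app => app.loaded.push('database') },
  './middlewares/router': { router: app => app.loaded.push('router') }
}
const fakeApp = { loaded: [] }

useMiddlewares(fakeApp, './middlewares', path => fakeModules[path])(['database', 'router'])
console.log(fakeApp.loaded.join(', '))
```

The order of the names matters: each middleware is applied in the order it is listed, so the database is ready before the crawler and router are attached.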

Next, we need to create start.js in the root directory

  • The project uses ES6+ syntax such as decorators, so Babel is needed to transpile it. I took the lazy route here and registered Babel directly in start.js.
  • Import index.js from server

The code is as follows:

const { resolve } = require('path')
const r = path => resolve(__dirname, path)

require('babel-core/register')({
  'presets': [
    'stage-3',
    ['latest-node', { "target": "current" }]
  ],
  'plugins': [
    'transform-decorators-legacy',
    ['module-alias', [
      { 'src': r('./server'), 'expose': '~' },
      { 'src': r('./server/database'), 'expose': 'database' }
    ]]
  ]
})

require('babel-polyfill')
require('./server/index')

The groundwork is done, so now we can happily get down to business ~~~

Let’s analyze the page

  • Page address
  • Information to crawl

With that settled, we can start crawling.

Bring out our treasures:

import cheerio from 'cheerio' // jQuery for Node, helps us parse the page

Let’s analyze the idea first:

  • First we request the page address. The data spans more than one page, so we need a loop, and we use the next-page link on the page to tell whether we have reached the last page.
  • We use class names to pick out the data we want from the page.
  • I filter the data here: incomplete records are simply discarded.
  • Data clean-up: the scraped text is full of whitespace and brackets, which we don't want.
  • The sleep method is basically a timer. We rest for 1 second after each page and then continue, to avoid sending too many requests and getting our IP banned.
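The clean-up step described above can be tried in isolation. This is a minimal sketch, not the crawler itself: `cleanHouse` and the sample record are invented for illustration, while the field names (`name`, `adress`, `huxing`) follow the crawler code. It strips all whitespace, which suits the Chinese text on the page, and pulls the district out of the square brackets at the start of the address.

```javascript
// Sketch of the clean-up step on one scraped record.
// The sample record below is invented for illustration, not real crawled data.
function cleanHouse (raw) {
  const stripSpaces = text => text.replace(/\s+/g, '')
  // The district sits inside square brackets at the start of the address
  const match = raw.adress.match(/\[(.*)\]/)
  if (!match) return null // incomplete record: drop it
  const rest = raw.adress.slice(raw.adress.lastIndexOf(']') + 1)
  return {
    name: stripSpaces(raw.name),
    area: stripSpaces(match[1]),
    adress: stripSpaces(rest),
    huxing: stripSpaces(raw.huxing)
  }
}

const sampleRecord = {
  name: ' SampleGarden ',
  adress: '[ Xihu ]  Wenyi Road 1',
  huxing: ' 3rooms \n 4rooms '
}
const cleanedRecord = cleanHouse(sampleRecord)
console.log(cleanedRecord)
```

Returning `null` for a record without a bracketed district is one way to implement "incomplete data is discarded"; the caller can then filter out the `null` entries.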

Here is an example of a file:

import cheerio from 'cheerio'
import rp from 'request-promise'
import R from 'ramda'
import _ from 'lodash'
import { writeFileSync } from 'fs'
import { resolve } from 'path';

const sleep = time => new Promise(resolve => setTimeout(resolve,time)) // take a short rest
let _house = [];
let _area = ''
let _areaDetail= [];
export const gethouse = async ( page = 1,area = '') =>{
    const options={
        uri:`https://hz.fang.anjuke.com/loupan/${area}/p${page}/`,
        transform: body => cheerio.load(body),
    }
    console.log("Crawling " + options.uri);
    const $ = await rp(options)
    let house = [];
    
    // Use a regular function here, not an arrow function, so that `this` is the current element
    $(".key-list .item-mod").each(function(){
        const name = $(this).find(".infos .lp-name .items-name").text();
        const adress =  $(this).find(".address .list-map").text();
        const huxing = $(this).find(".huxing").text();
        const favorPos = $(this).find(".favor-pos .price-txt").text();
        const aroundPrice = $(this).find(".favor-pos .around-price").text();
        house.push({
            name,
            huxing,
            favorPos,
            aroundPrice,
            adress
        })
    })

    // Refine the data. R.compose runs right to left: filter first, then map.
    const fn = R.compose(
        R.map((house) =>{
            const r1 = house.huxing.replace(/\s+/g,""); // remove whitespace
            const r2 = house.aroundPrice.replace(/\s+/g,"");
            // The page text is Chinese; "价" ("price") is assumed to be the
            // character right before the number in the price label
            const index1 = r2.indexOf("价");
            const index2 = r2.lastIndexOf("/");
            const price = r2.slice(index1+1,index2-1)
            const reg = /[^\[]*\[(.*)\][^\]]*/;
            const r3 = house.adress.match(reg);
            const i = house.adress.lastIndexOf("]") + 1;
            house.adress = house.adress.slice(i).replace(/\s+/g,"");
            house.huxing = r1;
            house.aroundPrice = price;
            house.area = r3[1]

            return house
        }),
        R.filter(house => house.name && house.adress && house.huxing && house.favorPos && house.aroundPrice)
    )
    house = fn(house);
    _house = _.union(_house,house)
    if($('.next-page').attr('href')){
        //writeFileSync("./static/House.json",JSON.stringify(_house,null,2),'utf-8')
        console.log(`${area}: ${_house.length} records so far`)
        await sleep(1000);
        page++;
        await gethouse(page,_area)
    }else{
        console.log("Done crawling! " + _house.length)

        return _house
    }
}

// We now have the sub-areas of each district; next, fetch the house prices for each sub-area
export const getAreaDetail = async () =>{
    const area = require(resolve(__dirname,'../database/json/AreaDetail.json'))
    for(let i = 0; i<area.length; i++){
        let areaDetail = area[i]['areaDetail'];
        _areaDetail = _.union(_areaDetail,areaDetail)
        for(let j = 0; j< areaDetail.length; j++){
            _house=[]
            console.log(`Crawling ${areaDetail[j].text}`)
            _area = areaDetail[j]._id
            console.log(_area)
            await gethouse(1,_area)
            if(_house.length >0){
                areaDetail[j]['house'] = _house
            }
        }
    }
    writeFileSync("./server/database/json/detailHouse.json",JSON.stringify(area,null,2),'utf-8')
}
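The crawler above pages through the results recursively and collects them in the module-level `_house` variable. The same "fetch, collect, rest, next page" idea can also be written as a loop that returns its results. The sketch below is an iterative variant, not the author's version: `crawlAll`, `pause` and `fakeFetchPage` are names invented here, and the fake page function replaces the real HTTP request so the example runs offline.

```javascript
// Iterative sketch of the crawl loop: fetch a page, collect its items,
// pause, and stop when the page reports that there is no next page.
const pause = ms => new Promise(resolve => setTimeout(resolve, ms))

async function crawlAll (fetchPage, delayMs) {
  const items = []
  let page = 1
  while (true) {
    const { list, hasNext } = await fetchPage(page)
    items.push(...list)
    if (!hasNext) break
    await pause(delayMs) // rest between pages to avoid hammering the site
    page++
  }
  return items
}

// Hypothetical stand-in for the real request: three pages of two items each
const fakeFetchPage = async page => ({
  list: [`house-${page}-a`, `house-${page}-b`],
  hasNext: page < 3
})

const crawlDone = crawlAll(fakeFetchPage, 10).then(items => {
  console.log(`collected ${items.length} items`)
  return items
})
```

Returning the collected items instead of writing to shared state makes each area's crawl independent, so the caller does not have to reset a global array before every run.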

Next, add crawler.js to the middlewares folder

  • It imports the crawler logic from the crawler files and then runs the methods inside

The code is as follows:

export const database = async app =>{
    /**
     * Import the crawler logic
     */
    const area = require('../crawler/area')
    const house = require('../crawler/house')
    const areaHouse = require('../crawler/areaHouse')
    const detailhouse = require('../crawler/detailHouse')
    /**
     * If there is no JSON file locally, uncomment the calls below to crawl the data
     */
    // await area.getarea()
    // await area.getAreaDetail()
    // await house.gethouse()
    // await areaHouse.getAreaDetail()
    // await detailhouse.getAreaDetail()
}

At this point, you can happily open the database/json folder and see the data you have crawled

  • I did not rush to put the data into the database at this stage; I took the JSON and rendered it with ECharts first
  • I am not very familiar with the ECharts API, so I practiced with the JSON first to see which data I actually need
  • I finished the front-end code at this stage, so later I only need to write the asynchronous requests. Working in this order gives me more confidence
  • Also note that in Nuxt I added a plugins folder to manage third-party plugins

The code is as follows:

nuxt.config.js in the root directory

module.exports = {
  /*
  ** Headers of the page
  */
  head: {
    title: 'starter',
    meta: [
      { charset: 'utf-8' },
      { name: 'viewport', content: 'width=device-width, initial-scale=1' },
      { hid: 'description', name: 'description', content: 'Nuxt.js project' }
    ],
    link: [
      { rel: 'icon', type: 'image/x-icon', href: '/favicon.ico' }
    ]
  },
  /*
  ** Global CSS
  */
  css: ['~static/css/main.css'],
  /*
  ** Customize the progress-bar color
  */
  loading: { color: '#3B8070' },
  /*
   ** Build configuration
   */
  build: {
    /*
     ** Run ESLINT on save
     */
    extend (config, ctx) {
      // if (ctx.isClient) {
      //   config.module.rules.push({
      //     enforce: 'pre',
      //     test: /\.(js|vue)$/,
      //     loader: 'eslint-loader',
      //     exclude: /(node_modules)/
      //   })
      // }
    },
    vendor: ['~/plugins/echart']
  },
  plugins: ['~/plugins/echart']
}

plugins/echart.js

import Vue from 'vue'
import echarts from 'echarts'
Vue.prototype.$echarts = echarts


pages/minHouse.vue

<template>
<div>
  <section class="container">
     <a @click="turnBack" class="back">Back</a>
     <div id="myChart" :style="{width: 'auto', height: '300px'}"></div>
  </section>
</div>
</template>

<script>
  import { mergeSort } from '../util/index'
  import Footer from '../components/layouts/Footer'
  import Header from '../components/layouts/Header'
  import {
    getAreaList,
    getAreaHouseList,
    getDetailList
  } from '../serverApi/area'

  export default {
    name: 'hello',
    data() {
      return {
        xAxis: [], // x-axis data
        rate: [], // y-axis data
        AreaHouse: [], // all data
        myChart: '', // chart instance
        _id:[],
        detail:[]
      }
    },
    created() {
    this.getAreaHouse()
    },
    mounted() {
      /**
       * Initialize the echarts instance based on the prepared DOM
       */
      this.myChart = this.$echarts.init(document.getElementById('myChart'))
      this.clickBar()
    },
    methods: {
      /**
       * Back-button logic: show the top-level data again
       */
      turnBack() {
        this.formateData(this.AreaHouse);
        this.drawLine()
      },
      /**
       * Click interaction on a bar
       */
      clickBar() {
        let that = this
        this.myChart.on('click', function(params){ ... })
      },
      /**
       * Get the detail data of one area
       */
      async getDetail({param}) {
        await getDetailList(param).then((data)=>{
          if(data.code === 0){
            this.detail = data.area.areaDetail;
            this.formateData(this.detail);
            this.drawLine()
          }
        })
      },
      /**
       * Get the prices of the large areas
       */
      async getAreaHouse(){
        await getAreaHouseList().then((data)=>{
          if(data.code === 0){
            this.AreaHouse = data.areaHouse;
            this.formateData(this.AreaHouse);
            this.drawLine()
          }
        })
      },
      /**
       * Format the data for the chart
       */
      formateData(data) {
        let textAry = [],_id=[],rate=[];
        for (let i = 0; i < data.length; i++) {
          textAry.push(data[i]['text'])
          _id.push(data[i]['_id'])
          let sortAry = mergeSort(data[i]['house'])
          data[i]['house'] = sortAry
          rate.push(sortAry[0]['aroundPrice'])
        }
        this.xAxis = textAry
        this._id = _id
        this.rate = rate
      },
      drawLine() {
        /**
         * Draw the chart
         */
        ...
      }
    },
    components:{
       'my-footer': Footer,
       'my-header': Header
     }
  }
</script>


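`formateData` relies on a `mergeSort` helper imported from `../util/index`, which is not shown here. Since the page is called minHouse and the chart uses `sortAry[0]['aroundPrice']`, one plausible reading is that the helper sorts the houses by `aroundPrice`, lowest first. The sketch below is written under that assumption only: `mergeSortByPrice` and the sample data are hypothetical, and the real helper may differ.

```javascript
// Hypothetical sketch of a merge sort helper; the real ../util/index is not shown.
// Assumption: houses are sorted by numeric aroundPrice, lowest first.
function mergeSortByPrice (houses) {
  if (houses.length <= 1) return houses.slice()
  const mid = Math.floor(houses.length / 2)
  const left = mergeSortByPrice(houses.slice(0, mid))
  const right = mergeSortByPrice(houses.slice(mid))
  const merged = []
  let i = 0
  let j = 0
  while (i < left.length && j < right.length) {
    // <= keeps houses with equal prices in their original order (stable sort)
    if (Number(left[i].aroundPrice) <= Number(right[j].aroundPrice)) {
      merged.push(left[i++])
    } else {
      merged.push(right[j++])
    }
  }
  // One side is exhausted; append whatever remains of the other
  return merged.concat(left.slice(i), right.slice(j))
}

// Invented sample data; aroundPrice is a string, as it is after crawling
const sampleHouses = [
  { name: 'A', aroundPrice: '32000' },
  { name: 'B', aroundPrice: '18000' },
  { name: 'C', aroundPrice: '25000' }
]
const sortedHouses = mergeSortByPrice(sampleHouses)
console.log(sortedHouses.map(h => h.name).join(' '))
```

Converting with `Number` matters because the crawled prices are strings, and comparing strings directly would order '9000' after '18000'.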

At this point, we have completed half of the project. What remains is extracting the routes, defining the interfaces, and storing the JSON data in the database. Take a break: if you have read (and followed along) this far, you deserve a round of applause. How about…

Ah ha ha ha ha ha ha ha ha ha ha