Original article: https://xeblog.cn/articles/9

Preface

This blog uses a front-end/back-end separated architecture with an independent front-end project: the back end exposes a data API, the front end calls it via Ajax and renders the data the API returns. This mode is very bad for SEO, because getting at the data requires executing JS, and the Baidu spider cannot execute JS. So even if Baidu indexed my blog, no real data would show up, only the placeholder text hard-coded in the HTML. PS: The Google spider is said to be able to execute JS, but apparently only under certain conditions. I won't go into the details here (because I can't either).

The picture below shows what Google had indexed for my website before optimization.

How to optimize?

See prerender.io for more information about Prerender.

Install Prerender

Prerender is a Node.js-based program, so a Node.js environment is required before installing it. The process of installing Node.js is not covered here... hahaha.

Install and start Prerender

If you don't have a Git environment, download the project from GitHub as an archive instead.

git clone https://github.com/prerender/prerender.git
cd prerender
npm install
# Start server.js; it listens on port 3000 by default
node server.js

Execute the following command. If rendered page data is returned, congratulations, Prerender started successfully.

curl http://localhost:3000/<full URL of your website>

Forever daemon

A Node.js application stops running when its command window is closed, so we need to add it to a daemon to keep it working... forever. (Working all the time is hard; such dedication deserves a year-end bonus, haha.)

What is forever?

A simple CLI tool for ensuring that a given script runs continuously (i.e. forever).

Installing forever

For details, please go to github.com/foreverjs/f…

npm install forever -g   # install forever globally
forever start server.js  # start the app
forever list             # list all running services
forever stop server.js   # stop the app
forever restartall       # restart all applications

We simply go to the prerender root directory and run forever start server.js... From then on it will serve dutifully in the background.

Nginx configuration

We need to handle requests from spiders such as Baidu and Google separately: spider requests are proxied to Prerender, while requests from normal users go straight to the original address.

The main configuration is as follows

location / {
	# whether this request needs to be proxied to Prerender
	set $prerender 0;
	# Prerender proxy address
	set $prerender_url "http://127.0.0.1:3000";

	# Check whether the request comes from a spider; if so, it must be proxied
	if ($http_user_agent ~* "baiduspider|Googlebot|360Spider|Bingbot|Sogou Spider|Yahoo! Slurp China|Yahoo! Slurp|twitterbot|facebookexternalhit|rogerbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator") {
		set $prerender 1;
	}

	if ($prerender = 1) {
		proxy_pass $prerender_url;
		rewrite ^(.*)$ /https://$host$1 break;
	}
}
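As a sanity check outside Nginx, the same case-insensitive User-Agent match can be sketched in plain shell. This is only an illustration: the spider list is abridged here, and the `ua` value is just an example Googlebot User-Agent string.

```shell
# Minimal sketch of the spider check: case-insensitive regex match
# on the User-Agent string (abridged spider list).
ua="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
if printf '%s' "$ua" | grep -qiE 'baiduspider|googlebot|bingbot|sogou|slackbot'; then
  echo "route to Prerender"
else
  echo "serve directly"
fi
```

A normal browser User-Agent would fall through to the else branch and be served directly.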

Once configured, reload the configuration file using nginx -s reload

Now let's test the effect.

Test

First, test with a plain curl request: curl <full URL of your website>

As shown, no real data is parsed

Now use curl -H 'User-Agent: Googlebot' <full URL of your website> to test as a spider.

Parse succeeded!

REST-style article URLs

URLs of the /articles/?id=xxx style are not spider-friendly; spiders prefer /articles/xxx.

I implemented the REST style using the rewrite functionality of Nginx. The main configuration is as follows

rewrite ^(.*)/articles/(\d+)$ /articles/?id=$2 break;
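To see what this regex does, here is a hypothetical dry run of the same mapping using sed (`[0-9]+` standing in for PCRE's `\d+`; the path `/articles/123` is just an example):

```shell
# Trace the rewrite: REST-style path -> query-string path
echo "/articles/123" | sed -E 's|^(.*)/articles/([0-9]+)$|/articles/?id=\2|'
# → /articles/?id=123
```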

Then modify the proxy block so that when a spider accesses a REST-style URL, it is rewritten to the real address:

/articles/xxx -> /articles/?id=xxx

Nginx's URL rewriting forwards the request to the real address internally, without changing the URL in the browser's address bar.

# Spider access handling
if ($prerender = 1) {
	proxy_pass $prerender_url;
	rewrite ^(.*)/articles/(\d+)$ /https://$host/articles/?id=$2 break;
	rewrite ^(.*)$ /https://$host$1 break;
}
# Normal user access to a REST address
rewrite ^(.*)/articles/(\d+)$ /articles/?id=$2 break;
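Similarly, the spider-side rewrite can be traced with sed. Here `xeblog.cn` is just an assumed value of `$host`, and `/articles/123` an example path:

```shell
# Trace the spider-side rewrite: REST path -> prefixed Prerender path
host="xeblog.cn"
echo "/articles/123" | sed -E "s|^(.*)/articles/([0-9]+)\$|/https://$host/articles/?id=\2|"
# → /https://xeblog.cn/articles/?id=123
```

The resulting path is exactly what the `proxy_pass` sends to Prerender, which then fetches and renders the real page.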

Test again

Final SEO Effect

The effect on Google is particularly good; the Google spider is very diligent, truly dedicated! (PS: The Baidu spider is too lazy... I suspect that problem can only be solved with money!)