Preface
Some time ago I did SEO optimization for a project, and this is my summary. SPA applications are commonly built with Vue/React, but SEO for a plain single-page application is naturally poor. There are techniques that can improve it, such as pre-rendering, but each comes with its own drawbacks. Even so, this has not slowed the momentum of frameworks like Vue/React; many products gain popularity through other strengths without relying on SEO at all.
If a project has hard requirements for SEO and first-screen loading speed, uses technologies like Vue/React, and wants to keep the extra development effort to a minimum, the more direct approach is to use a server-rendering framework: Nuxt.js for Vue, Next.js or Gatsby for React.
That said, learning a new framework is an additional cost. Still, SSR is used in real-world development, so it is at least worth understanding. I currently work with the React stack, so I only know the React SSR frameworks. If you are interested, you can read my two articles:
- Getting Started with NextJs (V9.5)
- A hands-on guide to getting started with Gatsby
Therefore, this article does not discuss SEO optimization for single-page applications; it is about SEO optimization for server-side rendered (SSR) / statically generated (SSG) websites.
This article covers both traditional SEO optimization and SEO optimization based on Gatsby.
Server-side rendering (SSR) vs. static site generation (SSG)
With server-side rendering, the client sends a request to the server, which dynamically generates the HTML content at runtime and returns it to the client. With static generation, the HTML is produced at build time; when a request comes in, the stored static HTML is sent straight back to the client.
In general, static sites are faster because no server-side processing is needed at request time, but the downside is that any change to the data requires a full rebuild; server-side rendering, on the other hand, processes data dynamically and does not need a complete rebuild.
For Vue/React, the main motivation behind their SSR/SSG frameworks is SEO and first-screen loading speed.
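To make the difference concrete, here is a rough sketch in Next.js terms (my own illustration, not taken from any specific project; the API URL and field names are made up). The same page becomes statically generated or server-rendered depending on which data-fetching function it exports:

import React from 'react'

// SSG: getStaticProps runs once at build time; the generated HTML is reused for every request.
// (A real page exports either getStaticProps or getServerSideProps, not both.)
export async function getStaticProps() {
  const res = await fetch('https://example.com/api/posts') // hypothetical API
  return { props: { posts: await res.json() } }
}

// SSR: getServerSideProps runs on every request, so the HTML is generated at request time.
// export async function getServerSideProps() {
//   const res = await fetch('https://example.com/api/posts')
//   return { props: { posts: await res.json() } }
// }

export default function Posts({ posts }) {
  return (
    <ul>
      {posts.map((post) => (
        <li key={post.id}>{post.title}</li>
      ))}
    </ul>
  )
}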
How search engines work
Behind a search engine sits a huge database that stores a large number of keywords, each of which maps to many URLs. These URLs are collected from the Internet by programs known as "search engine spiders" or "web crawlers".
These "spiders" crawl the Internet from link to link, analyzing page content and extracting keywords to add to the database; if a spider decides a page is spam or duplicated content, it discards it and moves on. When a user searches, the URLs related to the keyword are retrieved and shown to the user.
When a user searches for, say, "front end", every page containing the keyword "front end" is found, and the pages are ranked by a specific algorithm before the results are returned. The keyword may appear in the title, description, keywords, content, or even links. Of course, ads may come first, as you know.
One keyword maps to many sites, so there is a ranking problem: the site that best matches the keyword comes first. When a "spider" crawls page content and extracts keywords, there is a catch: the "spider" has to be able to understand the content. If the content is Flash, JS and the like, it cannot parse it and gets confused, even if the keywords match well. Conversely, if the site content can be recognized by the search engine, the search engine will raise the site's weight and its friendliness toward the site. This process is called SEO (Search Engine Optimization).
SEO purpose
Make the site easier for the major search engines to crawl and index, and increase its friendliness to search engines, so that it ranks higher when users search for the relevant keywords, increasing the product's exposure and traffic.
SEO optimization methods
Here we mainly cover the optimizations the front end can take part in. Many SEO methods are commonly introduced: controlling the number of links on the home page, flattening the directory hierarchy, optimizing the site structure and layout, writing pagination navigation properly. In practice, however, day-to-day front-end development does not play the role of overall site architect; we can only cooperate, and most of these decisions are made at the start of the project.
For example, news and media websites pay far more attention to SEO, and such companies usually have an SEO department or SEO engineers. As mentioned above, the page keywords and descriptions are provided by them. Some optimization methods we rarely get to touch will not be discussed in detail here; look them up if you are interested.
Webpage TDK tags
- Title: the title of the current page (highlight the key points; different pages should not share the same title)
- Description: a description of the current page (just list a few keywords; do not pile up too many)
- Keywords: the keywords of the current page (a concise summary of the page content)
The TDK of each page should be different, which requires distilling core keywords from the product's business.
Since the TDK differs from page to page, it needs to be set dynamically. In React, react-helmet can be used to set the head tags:
import React from 'react'
import { Helmet } from 'react-helmet'

const GoodsDetail = ({ title, description, keywords }) => {
  return (
    <div className='application'>
      <Helmet>
        <title>{title}</title>
        <meta name='description' content={`${description}`} />
        <meta name='keywords' content={`${keywords}`} />
      </Helmet>
      <div>content...</div>
    </div>
  )
}
The above is just a demo; in a real project the content inside Helmet would be extracted into a separate component.
Next.js provides a built-in Head component for the same purpose: import Head from 'next/head'.
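A minimal sketch of the Next.js version, mirroring the react-helmet demo above (the prop names are assumed):

import React from 'react'
import Head from 'next/head'

// The built-in next/head component plays the same role as react-helmet
const GoodsDetail = ({ title, description, keywords }) => (
  <>
    <Head>
      <title>{title}</title>
      <meta name="description" content={description} />
      <meta name="keywords" content={keywords} />
    </Head>
    <div>content...</div>
  </>
)

export default GoodsDetail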
Semantic tags
Choose appropriate HTML5 tags according to the structure of the content so that the markup is as semantic as possible; using semantic tags such as header, footer, section, aside, article and nav helps crawlers parse the page better.
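Below is a small sketch of a page layout using these tags (the structure is made up purely for illustration):

import React from 'react'

// Each area of the page uses a tag that describes its role,
// which makes the document structure easier for crawlers to understand.
const Page = () => (
  <>
    <header>
      <nav>{/* site navigation links */}</nav>
    </header>
    <main>
      <article>
        <h1>Article title</h1>
        <section>{/* article body */}</section>
      </article>
      <aside>{/* related links / sidebar */}</aside>
    </main>
    <footer>{/* copyright and contact information */}</footer>
  </>
)

export default Page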
Use H1 to H6 tags properly
An H1 tag should appear at most once on a page, and H2 is usually used for secondary headings or article subheadings. If H3 to H6 are used, they should be nested level by level in order, without skipping or reversing levels.
For example, an H1 tag is usually placed around the logo on the home page; if the design only shows a logo image without text, the H1 text can be visually hidden by setting its font size to 0:
<h1>
<img src="logo.png" alt="jacky" />
<span>Jacky's personal blog</span>
</h1>
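A minimal sketch of that hidden-text technique (an inline style is used here only for brevity; a CSS class achieves the same thing):

import React from 'react'

// The brand text stays in the DOM for crawlers but is visually hidden
// by setting its font-size to 0.
const Logo = () => (
  <h1>
    <img src="logo.png" alt="jacky" />
    <span style={{ fontSize: 0 }}>Jacky's personal blog</span>
  </h1>
)

export default Logo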
The alt attribute of images
In general, the alt attribute should not be left empty unless the image is purely decorative and carries no real information. Add the alt attribute to img tags so that spiders can pick up information about the image; when the network fails or the image URL is broken, the alt text is displayed in place of the image.
<img src="dog.jpg" width="300" height="200" alt="Husky" />
The title attribute of a tags
Similarly, the title attribute of an a tag is essentially tooltip text: when the mouse hovers over the link, the prompt text appears. Adding this attribute also brings a small SEO benefit.
<a
  href="https://github.com/Jacky-Summer/personal-blog"
  title="Learn more about Jacky's personal blog"
>
  Learn more
</a>
404 pages
A 404 page is, first of all, good user experience: users are not left with inexplicable error messages. It is also friendly to spiders: instead of stopping because of a page error, the crawler can go back and crawl other pages of the site.
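In Gatsby, for instance, a custom 404 page is simply a component placed at src/pages/404.js; the sketch below assumes a standard Gatsby project:

// src/pages/404.js - Gatsby serves this page for paths that match no other page
import React from 'react'
import { Link } from 'gatsby'

const NotFoundPage = () => (
  <main>
    <h1>Page not found</h1>
    <p>
      Sorry, the page you are looking for does not exist.{' '}
      <Link to="/">Back to the home page</Link>
    </p>
  </main>
)

export default NotFoundPage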
Nofollow: tell crawlers not to follow links
Nofollow can be used in two ways:
- Use a meta tag to tell crawlers that none of the links on the page need to be followed:
<meta name="robots" content="nofollow" />
- Use it on an a tag to tell the crawler not to follow that particular link:
<a href="https://www.xxxx?login" rel="nofollow">Log in</a>
It is usually used on a tags, and mainly serves three purposes:
- Concentrating page weight: the "spider" assigns a certain weight to each page. Setting rel='nofollow' concentrates the page's weight and distributes it to the links that actually matter, and it keeps the crawler away from meaningless pages that would hurt crawl efficiency (and once a spider follows an external link, it does not come back).
- Paid links: to prevent paid links from influencing Google's search result rankings, Google recommends marking them with the nofollow attribute.
- Untrusted content: most commonly, links planted in spam messages and blog comments to gain external links; nofollow prevents the page from pointing to garbage pages and sites.
Create the robots.txt file
The robots.txt file consists of one or more rules. Each rule forbids (or allows) a specific crawler to access a specified file path on the site.
User-agent: *
Disallow: /admin/
Sitemap: http://www.xxxx.com/sitemap.xml
The key fields:
- User-agent: the name of the crawler the rule applies to
- Disallow: directories or pages that must not be crawled
- Allow: directories or pages that may be crawled
- Sitemap: the location of the site's sitemap
User-agent: * makes the rule apply to all search engines. Using a specific crawler name instead, such as User-agent: Baiduspider for Baidu or User-agent: Googlebot for Google, lets you set different rules for different search engines.
For examples, see Baidu's robots.txt and JD.com's robots.txt.
The robots file is the first thing a search engine requests when it visits a website; it then crawls the site according to the rules set in the file, with Allow and Disallow guiding which directories and files the crawler may access.
It is mainly used to keep your site from receiving too many requests and to tell search engines which pages should and should not be crawled. If some pages of the site would be useless to users, block them with Disallow. This achieves targeted SEO: expose useful links to crawlers while protecting sensitive or worthless files.
Even if you want everything on the site to be crawled, set up an empty robots file. The robots file is the first thing a spider requests when crawling a site; if the file does not exist, the server logs a 404 error on every spider visit, and with multiple search engines crawling the site that means many 404 errors. So it is good practice to put a robots.txt file in the site root in any case.
An empty robots.txt file:
User-agent: *
Disallow:
If you want a more detailed understanding of the robots.txt file, have a look at:
- About robots.txt
- About robots.txt and SEO: Everything you need to know
When many directories are involved, people usually use an online tool to generate robots.txt dynamically, such as a robots.txt generator.
Create a sitemap
When a site first goes live, few external links point to it and crawlers may not find all of its pages; or the pages may not link to each other well, so the crawler easily misses some of them. This is where a sitemap comes in handy.
A sitemap is a file that lists the site's sections and links, allowing search engines to fully index the site's URLs, understand how weight is distributed among them and when content is updated, and improving crawl efficiency. A sitemap file may contain at most 50,000 URLs and must not exceed 10 MB.
Sitemap files come in an HTML format (for users) and an XML format (for search engines), the latter being the most common. An XML sitemap uses six tags, the key ones being the link address (loc), the last modification time (lastmod), the update frequency (changefreq) and the index priority (priority).
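As a sketch of what those tags look like in practice, the snippet below builds a urlset from a hypothetical list of pages (the URLs and values are made up):

// Each <url> entry uses the four key tags: loc, lastmod, changefreq, priority
const pages = [
  { loc: 'http://www.xxxx.com/', lastmod: '2021-01-01', changefreq: 'daily', priority: '1.0' },
  { loc: 'http://www.xxxx.com/blog/seo', lastmod: '2021-01-01', changefreq: 'weekly', priority: '0.8' },
]

const sitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${pages
  .map(
    (page) => `  <url>
    <loc>${page.loc}</loc>
    <lastmod>${page.lastmod}</lastmod>
    <changefreq>${page.changefreq}</changefreq>
    <priority>${page.priority}</priority>
  </url>`
  )
  .join('\n')}
</urlset>`

console.log(sitemap)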
How does a crawler know whether the site provides a sitemap file? As mentioned above, its path is placed in robots.txt.
So first look for robots.txt in the site root. For example, Tencent's robots.txt contains the following:
User-agent: *
Disallow:
Sitemap: http://www.qq.com/sitemap_index.xml
It gives the sitemap path, and the sitemap file looks like this (only part of it is listed):
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://news.qq.com/news_sitemap.xml.gz</loc>
    <lastmod>2011-11-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://finance.qq.com/news_sitemap.xml.gz</loc>
    <lastmod>2011-11-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://sports.qq.com/news_sitemap.xml.gz</loc>
    <lastmod>2011-11-15</lastmod>
  </sitemap>
</sitemapindex>
- loc: the permanent URL of a page, which can be static or dynamic
- lastmod: when the page content was last modified. This tag is optional; search engines use it, together with changefreq, to decide whether to re-crawl the content at loc
After the website is built, the sitemap can be generated automatically, for example with a sitemap generation tool.
Structured data
Structured data is a standardized format that helps Google understand a web page by giving it explicit clues about the page's meaning. It is usually provided in JSON-LD format.
<html>
  <head>
    <title>Party Coffee Cake</title>
    <script type="application/ld+json">
      {
        "@context": "https://schema.org/",
        "@type": "Recipe",
        "name": "Party Coffee Cake",
        "author": {
          "@type": "Person",
          "name": "Mary Stone"
        },
        "nutrition": {
          "@type": "NutritionInformation",
          "calories": "512 calories"
        },
        "datePublished": "2018-03-10",
        "description": "This coffee cake is awesome and perfect for parties.",
        "prepTime": "PT20M"
      }
    </script>
  </head>
  <body>
    <h2>Party coffee cake recipe</h2>
    <p>
      This coffee cake is awesome and perfect for parties.
    </p>
  </body>
</html>
The page declares its category ("Recipe"), author, publication date, description, preparation time and so on. With this markup, there is a chance that Google Search will show these details as rich results, making it easier for users to find results that carry the key information.
There are various fields for describing a "recipe"; you simply look up the fields and fill them in the required format.
Because this optimization is specific to the Google search engine, it is usually set up by sites whose users are not limited to China. Besides structured data, another Google-specific optimization is AMP pages; if you are interested, see AMP.
Google also provides a Structured Data Testing Tool, which lets you enter the URL of a site and check whether structured data has been set up on it.
Performance optimization
For example, reducing HTTP requests, controlling page size, lazy loading, making good use of caching, and so on. There are many techniques, all aimed at improving load speed and user experience; they are not specific to SEO but are things development should do anyway (a small lazy-loading sketch follows below).
Because when a website is too slow or times out, the spider simply leaves.
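As one small example, the sketch below combines native image lazy loading with code splitting via React.lazy (the component names are made up):

import React, { lazy, Suspense } from 'react'

// Hypothetical heavy component, split into its own bundle and loaded on demand
const Comments = lazy(() => import('./Comments'))

const ArticlePage = () => (
  <article>
    {/* The browser defers loading the image until it is near the viewport */}
    <img src="cover.jpg" alt="Article cover" width="600" height="400" loading="lazy" />
    <Suspense fallback={<div>Loading comments...</div>}>
      <Comments />
    </Suspense>
  </article>
)

export default ArticlePage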
SEO optimization under Gatsby
Gatsby itself uses static generation, so its SEO is already decent, but SEO optimization still has to be done.
Knowing the optimization methods above, how do we apply them in Gatsby? The Gatsby community is quite strong and has many plugins, so several of the items above can be configured and generated quickly.
gatsby-plugin-robots-txt
Configure it in gatsby-config.js
module.exports = {
siteMetadata: {
siteUrl: 'https://www.xxxxx.com'
},
plugins: ['gatsby-plugin-robots-txt']
};
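With no options, the plugin generates a permissive robots.txt. A fuller configuration might look like the sketch below; the host/sitemap/policy options are written from memory, so check the plugin's documentation for your version:

module.exports = {
  siteMetadata: {
    siteUrl: 'https://www.xxxxx.com',
  },
  plugins: [
    {
      resolve: 'gatsby-plugin-robots-txt',
      options: {
        host: 'https://www.xxxxx.com',
        sitemap: 'https://www.xxxxx.com/sitemap.xml',
        // Mirrors the hand-written robots.txt example earlier in the article
        policy: [{ userAgent: '*', allow: '/', disallow: ['/admin/'] }],
      },
    },
  ],
}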
gatsby-plugin-sitemap
Configure it in gatsby-config.js
{
resolve: `gatsby-plugin-sitemap`,
options: {
sitemapSize: 5000,
},
},
Webpage TDK
Both the standard Gatsby scaffolding and the official docs provide an seo.js file, which gives us a ready-made way to set the TDK:
import React from 'react'
import PropTypes from 'prop-types'
import { Helmet } from 'react-helmet'
import { useStaticQuery, graphql } from 'gatsby'

function SEO({ description, lang, meta, title }) {
  const { site } = useStaticQuery(
    graphql` query { site { siteMetadata { title description author } } } `
  )

  const metaDescription = description || site.siteMetadata.description

  return (
    <Helmet
      htmlAttributes={{ lang }}
      title={title}
      meta={[
        { name: `description`, content: metaDescription },
        { property: `og:title`, content: title },
        { property: `og:description`, content: metaDescription },
        { property: `og:type`, content: `website` },
        { name: `twitter:card`, content: `summary` },
        { name: `twitter:creator`, content: site.siteMetadata.author },
        { name: `twitter:title`, content: title },
        { name: `twitter:description`, content: metaDescription },
      ].concat(meta)}
    />
  )
}

SEO.defaultProps = {
  lang: `en`,
  meta: [],
  description: ``,
}

SEO.propTypes = {
  description: PropTypes.string,
  lang: PropTypes.string,
  meta: PropTypes.arrayOf(PropTypes.object),
  title: PropTypes.string.isRequired,
}

export default SEO
Then import seo.js in the page template files and pass in each page's variables to set the TDK and other head information.
Structured data
For example, if the project is presented as news articles, you can set up three types of structured data (the data types and fields cannot be made up; you need to Google the matching schema): the article detail page, the article list page, and a company/organization introduction.
First create ./src/components/JsonLd.js in the project, which wraps the outer script tag into a reusable component:
import React from 'react'
import { Helmet } from 'react-helmet'
function JsonLd({ children }) {
return (
<Helmet>
<script type="application/ld+json">{JSON.stringify(children)}</script>
</Helmet>
)
}
export default JsonLd
./src/utils/json-ld/article.js
- Structured data description for the article detail page:
const articleSchema = ({ url, headline, image, datePublished, dateModified, author, publisher }) => ({
  '@context': 'http://schema.org',
  '@type': 'Article',
  mainEntityOfPage: {
    '@type': 'WebPage',
    '@id': url,
  },
  headline,
  image,
  datePublished,
  dateModified,
  author: {
    '@type': 'Person',
    name: author,
  },
  publisher: {
    '@type': 'Organization',
    name: publisher.name,
    logo: {
      '@type': 'ImageObject',
      url: publisher.logo,
    },
  },
})

export default articleSchema
./src/utils/json-ld/item-list.js
- Structured data description for the article list page:
const itemListSchema = ({ itemListElement }) => ({
  '@context': 'http://schema.org',
  '@type': 'ItemList',
  itemListElement: itemListElement.map((item, index) => ({
    '@type': 'ListItem',
    position: index + 1,
    ...item,
  })),
})

export default itemListSchema
./src/utils/json-ld/organization.js
- Structured data description for the company/organization:
const organizationSchema = ({ name, url }) => ({
  '@context': 'http://schema.org',
  '@type': 'Organization',
  name,
  url,
})

export default organizationSchema
Then import them in the corresponding pages. For example, in the article detail page we import the matching schema file, which looks roughly like this:
// ...
import JsonLd from '@components/JsonLd'
import SEO from '@components/SEO'
import articleSchema from '@utils/json-ld/article'

const DetailPage = ({ data }) => {
  // Process data and pull out the relevant fields
  return (
    <Layout>
      <SEO
        title={meta_title || title}
        description={meta_description}
        keywords={meta_keywords}
      />
      <JsonLd>
        {articleSchema({
          url,
          headline: title,
          datePublished: first_publication_date,
          dateModified: last_publication_date,
          author: siteMetadata.title,
          publisher: {
            name: siteMetadata.title,
            logo: 'xxx',
          },
        })}
      </JsonLd>
      <Container>
        <div>content...</div>
      </Container>
    </Layout>
  )
}
It is normal if the code above looks confusing; that is only because you do not yet know the content of structured data, and it becomes clear once you skim the documentation.
The Lighthouse performance tool
Install the Lighthouse extension from the Chrome Web Store (recent versions of Chrome also ship it inside DevTools), open DevTools (F12) on your site and click Generate report, and it will produce a report for the site.
Below the scores, the report lists suggestions you can follow to improve the code in terms of performance, SEO and so on.
That is the end of the article; I hope it helps you understand SEO a little. SEO exploration does not stop at the examples above; there are all kinds of optimization methods, and the ones listed here are just the more common ones. In the end, I think SEO optimization comes down to attracting more users to click on and use the site. If the site's content quality and user experience are good, and the performance is solid, users will promote it themselves after using it, which is undoubtedly far stronger than simple SEO tweaks.
Reference articles
- Front-end SEO optimization
- P.S. My personal tech blog lives in a GitHub repository; if you find it useful, a star is welcome and will encourage me to keep writing ~