Introduction to the
SEO, robots.txt, search engine optimization
In the vast world of the Internet:
- The Internet is the universe
- Websites are galaxies
- Web pages are planets
- Web content is everything on those planets
The search engine crawlers (spiders) roaming the Internet are like space rovers, which is rather romantic when you think about it. Each galaxy has its own rules, and a rover that ignores them should beware of being destroyed by the galaxy's automatic defenses.
I once imagined that the world was made of code, which is quite an interesting idea: many supernatural events could be explained as bugs. It came up one evening while chatting with some imaginative classmates, and when I get the chance I would like to find time to build out a code-based world view.
Rover rule
At the entrance to each galaxy, that is, the root directory of the website, there is a robots.txt file, also known as the rover rules, which records the rules rovers are expected to obey. The rover rules are more of a convention than an enforced protocol: nothing guarantees that every crawler will follow them.
When they have no content of their own to publish, many companies and individuals use crawlers to scrape data from other people's sites. Crawlers that obey the rules can still be called rovers, but those that crawl unscrupulously where they are not allowed are called pirate ships. Sites being crawled try to identify these pirate ships, or apply access rate limits, to protect themselves.
List of rules
In robots.txt, User-agent specifies which rovers a group of rules applies to. An asterisk means the rules apply to all rovers, as in User-agent: *. Restrictions can also target a specific rover, such as Baidu's, with User-agent: Baiduspider. Below each User-agent line come the corresponding allow and disallow rules:
- Allow rule
Allow: followed by a path rule tells rovers which links should be crawled.
- Disallow rule
Disallow: followed by a path rule tells rovers which links should not be crawled.
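To see how a rover might read such a rule list, here is a minimal sketch using Python's standard urllib.robotparser module. The rules in ROBOTS_TXT, the rover name SomeRover, and the helper may_crawl are made up for illustration; they are not the real rules of any site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, written only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts

User-agent: Baiduspider
Allow: /posts
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(agent: str, url: str) -> bool:
    """Return True if the parsed rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    checks = [
        ("Baiduspider", "https://pushme.top/posts/1"),
        ("Baiduspider", "https://pushme.top/users"),
        ("SomeRover", "https://pushme.top/drafts"),
        ("SomeRover", "https://pushme.top/users"),
    ]
    for agent, url in checks:
        print(agent, url, may_crawl(agent, url))
```

A rover that has its own User-agent group follows that group; every other rover falls back to the * group.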
Path rules
A path rule is built from the pathname and query string of a URL, combined with the * and $ symbols: * matches any sequence of characters, and $ marks the end of the URL. Here are a few examples:
- User list
https://pushme.top/users
is expressed by the path /users
- Comments on a post
https://pushme.top/posts/1/comments
is expressed by the path /posts/*/comments
- Style files
https://pushme.top/assets/styles/main.css
is expressed by the path /assets/styles/*.css$
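The path rules above can be sketched as a small matcher. This is a hand-rolled illustration, not any crawler's actual implementation: as far as I know, Python's standard urllib.robotparser compares paths by plain prefix and does not interpret * and $, so the sketch translates a rule into a regular expression instead. The function names path_rule_to_regex and rule_matches are hypothetical.

```python
import re


def path_rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a compiled regex.

    Assumed semantics: `*` matches any run of characters, a trailing `$`
    anchors the end of the path, and otherwise the rule is a prefix.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def rule_matches(rule: str, path: str) -> bool:
    """Return True if `path` is covered by the path rule `rule`."""
    return path_rule_to_regex(rule).match(path) is not None


if __name__ == "__main__":
    print(rule_matches("/users", "/users"))
    print(rule_matches("/posts/*/comments", "/posts/1/comments"))
    print(rule_matches("/assets/styles/*.css$", "/assets/styles/main.css"))
    print(rule_matches("/assets/styles/*.css$", "/assets/styles/main.css.map"))
```

The last check shows what $ is for: without it, main.css.map would also match, because rules are otherwise treated as prefixes.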
For more details on URLs, see URL Explosion
Galaxy recommendation
A sitemap tells rovers which pages of the site are worth visiting. It is declared with a Sitemap: line, for example Sitemap: https://pushme.top/sitemap.xml.
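A rover can read the recommendation back out of the file. The sketch below assumes Python 3.8 or newer, where RobotFileParser.site_maps() is available; the Disallow line is a made-up example rule.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /drafts",  # hypothetical rule, for illustration only
    "Sitemap: https://pushme.top/sitemap.xml",
])

# site_maps() returns None when no Sitemap: line is present.
sitemaps = parser.site_maps() or []
print(sitemaps)
```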
Odd and even rule
Websites, like roads in real life, can impose odd-even style traffic restrictions. Rovers and pirate-ship crawlers both consume server resources, and if they take up too much, normal users can no longer access the website. The odd-even rule therefore limits how often a rover may visit:
Crawl-delay: n
Wait n seconds between requests.
Request-rate: x/n
Crawl at most x pages every n seconds.
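A well-behaved rover could turn these two directives into a pause between requests. This is a minimal sketch assuming Python 3.8 or newer; the values 5 and 1/1 are sample numbers, and seconds_between_requests is a hypothetical helper that simply takes the stricter of the two limits.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Crawl-delay: 5",     # sample value
    "Request-rate: 1/1",  # sample value: 1 page per 1 second
])


def seconds_between_requests(agent: str) -> float:
    """Return the pause a polite rover should leave between two requests."""
    delay = parser.crawl_delay(agent) or 0
    rate = parser.request_rate(agent)  # named tuple (requests, seconds) or None
    per_request = rate.seconds / rate.requests if rate and rate.requests else 0
    # Obey whichever directive asks for the longer wait.
    return max(delay, per_request)


if __name__ == "__main__":
    print(seconds_between_requests("SomeRover"))
```

A real rover would then sleep for that long (for example with time.sleep) before each request.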
Juejin's rover rules
Now that we have covered the overall structure of the rover rules, let's read Juejin's rover rules together. Visit https://juejin.im/robots.txt and you will see the following:
User-agent: *
Request-rate: 1/1
Crawl-delay: 5
Disallow: /timeline
Disallow: /submit-entry
Disallow: /new-entry
Disallow: /edit-entry
Disallow: /notification
Disallow: /subscribe/subscribed
Disallow: /user/settings
Disallow: /reset-password
Disallow: /drafts
Disallow: /editor
Disallow: /user/invitation
Disallow: /user/wallet
Disallow: /entry/*/view$
Disallow: /auth
Disallow: /oauth
Disallow: /zhuanlan/*?sort=newest
Disallow: /zhuanlan/*?sort=comment
Disallow: /search
Disallow: /equation
As you can see, Juejin's rules are fairly loose: they limit the request rate and list the pages that should not be visited. There are no specific restrictions for Baidu's or Google's rovers, so you could also write a rover of your own to crawl part of Juejin's content, for example today's Boiling Point posts.
SEO related content
- Little secret of H1 Mason
- At the beginning of SEO experience
- Img large
- A thousand miles of marriage
- Throw herself
- Rover rule
Other
Robots file generators are simple and easy to use.
Xiao Er has only covered the parts of SEO that are easy to put into practice, and the SEO discussion ends here. Semantic tags also help SEO, but they are hard to get right in practice; if Xiao Er finds a simple, easy-to-understand approach, it will be added to this article.
This article is an original contribution by PushMeTop.