In the past few months, we’ve seen a surge in site owners trying to use 404 and other 4xx codes to reduce how often Googlebot crawls their site. Please don’t do this. It is better to read our documentation on this subject and follow the recommendations.
Googlebot ignores any content received from URLs that return a 4xx status code and does not index such URLs. As a result, pages with this status code may be excluded from Google Search, says SearchEngines. And if the robots.txt file itself returns such an error, Google treats it as if the file does not exist at all, and the crawler will proceed as if there were no crawling restrictions. The only exception is status code 429 ("Too Many Requests"): it is the one code that tells the crawler the server is overloaded and it should slow down.
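Since a robots.txt that returns a 4xx makes Google behave as if the file does not exist, it is worth verifying what status code your server actually serves for it. Below is a minimal sketch in Python (standard library only; the URL and function name are illustrative, not from the article):

```python
import urllib.request
import urllib.error


def http_status(url: str) -> int:
    """Return the HTTP status code the server sends for the given URL.

    urllib raises HTTPError for 4xx/5xx responses, so we catch it and
    report the code instead of treating it as a failure.
    """
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


# Hypothetical usage: check that robots.txt is not served with a 4xx.
# print(http_status("https://example.com/robots.txt"))
```

A 200 here means Google will read and obey the rules in the file; a 404 or other 4xx means it will crawl as if no robots.txt exists.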
To limit how often a site is crawled by Googlebot, it is recommended to:
- Adjust site crawl frequency in Google Search Console;
- If you urgently need to reduce the crawl rate for a short period of time (for example, a few hours or days), serve an informational page with an HTTP status code of 500, 503, or 429 instead of the content. However, Google does not recommend using this option for more than two days: URLs that return these status codes for several days may be excluded from the Google index.
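The second option above, serving a temporary error page instead of content, can be sketched with Python's standard library. This is a minimal illustration under assumed conditions (in practice the rule would live in your web server or CDN configuration, and the port is hypothetical):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class OverloadedHandler(BaseHTTPRequestHandler):
    """Answers every request with 503 so crawlers temporarily back off."""

    def do_GET(self):
        # 503 Service Unavailable signals a temporary condition;
        # Retry-After hints when the client may try again (here: one hour).
        body = b"Temporarily overloaded, please retry later.\n"
        self.send_response(503)
        self.send_header("Retry-After", "3600")
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


# To run locally (blocks the process):
# HTTPServer(("127.0.0.1", 8503), OverloadedHandler).serve_forever()
```

Note the two-day limit mentioned above still applies: this response should be a short-lived stopgap, not a permanent configuration, or the affected URLs risk dropping out of the index.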
NIX Solutions reminds us that Google has also updated its documentation on links for webmasters: the section now includes guidance on the correct use of anchor text, as well as of internal and external links.