Everything You Need To Know About The X-Robots-Tag HTTP Header

SEO, in its most fundamental sense, relies upon one thing above all others: search engine spiders crawling and indexing your site.

But almost every website has pages that you don’t want to include in this exploration.

For example, do you really want your privacy policy or internal search pages showing up in Google results?

In a best-case scenario, these do nothing to actively drive traffic to your site, and in a worst-case, they could be diverting traffic from more important pages.

Fortunately, Google allows webmasters to tell search engine bots what pages and content to crawl and what to ignore. There are several ways to do this, the most common being a robots.txt file or the meta robots tag.

We have an excellent and detailed explanation of the ins and outs of robots.txt, which you should definitely read.

But in high-level terms, it’s a plain text file that lives in your website’s root and follows the Robots Exclusion Protocol (REP).
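To make that concrete, here is a minimal robots.txt sketch; the path and sitemap URL are hypothetical:

User-agent: *
Disallow: /internal-search/
Sitemap: https://www.example.com/sitemap.xml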

Robots.txt gives crawlers instructions about the site as a whole, while meta robots tags provide directions for specific pages.

Some meta robots tags you might use include index, which tells search engines to add the page to their index; noindex, which tells them not to add a page to the index or include it in search results; follow, which instructs a search engine to follow the links on a page; nofollow, which tells it not to follow links; and a whole host of others.
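For example, placed in a page’s <head>, a meta robots tag combining two of these directives looks like this:

<meta name="robots" content="noindex, nofollow">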

Both robots.txt and meta robots tags are useful tools to keep in your toolbox, but there’s also another way to instruct search engine bots to noindex or nofollow: the X-Robots-Tag.

What Is The X-Robots-Tag?

The X-Robots-Tag is another way for you to control how your webpages are crawled and indexed by spiders. Sent as part of the HTTP header response for a URL, it controls indexing for an entire page, as well as for specific elements on that page.
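For example, here’s a sketch of what an HTTP response carrying the tag might look like; the content type is illustrative:

HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex, nofollow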

And whereas using meta robots tags is fairly straightforward, the X-Robots-Tag is a bit more complicated.

But this, of course, raises the question:

When Should You Use The X-Robots-Tag?

According to Google, “Any directive that can be used in a robots meta tag can also be specified as an X-Robots-Tag.”

While you can set robots.txt-related directives in the headers of an HTTP response with both the meta robots tag and X-Robots-Tag, there are certain situations where you would want to use the X-Robots-Tag – the two most common being when:

  • You want to control how your non-HTML files are being crawled and indexed.
  • You want to serve directives site-wide instead of on a page level.

For example, if you want to block a specific image or video from being crawled, the HTTP response approach makes this easy.

The X-Robots-Tag header is also useful because it allows you to combine multiple tags within an HTTP response, or use a comma-separated list of directives.

Maybe you don’t want a certain page to be cached and want it to be unavailable after a particular date. You can use a combination of the “noarchive” and “unavailable_after” tags to instruct search engine bots to follow these instructions.
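On an Apache server, for instance, that combination might be set like this; the date shown is purely illustrative:

Header set X-Robots-Tag "noarchive, unavailable_after: 25 Jun 2023 15:00:00 PST"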

Essentially, the power of the X-Robots-Tag is that it is much more flexible than the meta robots tag.

The advantage of using an X-Robots-Tag with HTTP responses is that it allows you to use regular expressions to apply crawl directives to non-HTML content, as well as apply parameters on a larger, global level.
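As a sketch, assuming an Apache server with mod_headers enabled, a single regular expression can apply a directive to several non-HTML file types at once; the extensions here are just examples:

<FilesMatch "\.(pdf|docx?|xlsx?)$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>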

To help you understand the difference between these directives, it’s useful to categorize them by type. That is, are they crawler directives or indexer directives?

Here’s a handy cheat sheet to explain:

Crawler Directives

Robots.txt – uses the user-agent, allow, disallow, and sitemap directives to specify where on-site search engine bots are allowed and not allowed to crawl.

Indexer Directives

Meta robots tag – allows you to specify and prevent search engines from showing particular pages on a site in search results.

Nofollow – allows you to specify links that should not pass on authority or PageRank.

X-Robots-Tag – allows you to control how specified file types are indexed.

Where Do You Put The X-Robots-Tag?

Let’s say you want to block specific file types. An ideal approach would be to add the X-Robots-Tag to an Apache configuration or a .htaccess file.

The X-Robots-Tag can be added to a site’s HTTP responses in an Apache server configuration via the .htaccess file.

Real-World Examples And Uses Of The X-Robots-Tag

So that sounds great in theory, but what does it look like in the real world? Let’s take a look.

Let’s say we wanted search engines not to index .pdf file types. This configuration on Apache servers would look something like the below:

<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>

In Nginx, it would look like the below:

location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}

Now, let’s look at a different scenario. Let’s say we want to use the X-Robots-Tag to block image files, such as .jpg, .gif, .png, etc., from being indexed. You could do this with an X-Robots-Tag that would look like the below:

<Files ~ "\.(png|jpe?g|gif)$">
  Header set X-Robots-Tag "noindex"
</Files>

Please note that understanding how these directives work and the impact they have on one another is crucial.

For example, what happens if both the X-Robots-Tag and a meta robots tag are present when crawler bots discover a URL?

If that URL is blocked from crawling in robots.txt, the crawler never fetches the page, so any indexing and serving directives on it cannot be discovered and will not be followed.

If directives are to be followed, then the URLs containing them cannot be disallowed from crawling.

Checking For An X-Robots-Tag

There are a few different methods that can be used to check for an X-Robots-Tag on the site.

The easiest way to check is to install a browser extension that will show you X-Robots-Tag information about the URL.

Screenshot of Robots Exclusion Checker, December 2022

Another plugin you can use to determine whether an X-Robots-Tag is being used, for example, is the Web Developer plugin.

By clicking on the plugin in your browser and navigating to “View Response Headers,” you can see the various HTTP headers being used.
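You can also check from the command line. Assuming curl is installed, a quick request like the one below will print a URL’s response headers (the URL is a stand-in for your own):

curl -I https://www.example.com/sample.pdf

Any X-Robots-Tag directives will appear alongside the other HTTP headers in the output.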

Another method that can be used at scale, in order to pinpoint issues on websites with a million pages, is Screaming Frog.

After running a site through Screaming Frog, you can navigate to the “X-Robots-Tag” column.

This will show you which sections of the site are using the tag, along with which specific directives.

Screenshot of Screaming Frog report, X-Robots-Tag, December 2022

Using X-Robots-Tags On Your Website

Understanding and controlling how search engines interact with your website is the cornerstone of search engine optimization. And the X-Robots-Tag is a powerful tool you can use to do just that.

Just be aware: It’s not without its risks. It is very easy to make a mistake and deindex your entire site.

That said, if you’re reading this piece, you’re probably not an SEO beginner.

So long as you use it wisely, take your time, and check your work, you’ll find the X-Robots-Tag to be a useful addition to your arsenal.