
Meta Robots Tag: How to Use It

Why it is important for SEO

By Joe Smith. Published 2 years ago. 9 min read.

Robots meta directives (sometimes referred to as robots meta tags) are pieces of code that provide instructions on how robots should crawl or index the content of a website's pages.

While the directives in the robots.txt file give robots suggestions on how to crawl a website's pages, robots meta directives provide firmer instructions on how to crawl and index the content of a given page.

What are these robots meta tags for?

Meta robots tags give crawlers instructions on how to crawl and index the information they find on a specific webpage. When these directives are discovered by crawlers, their parameters serve as strong suggestions about crawlers' indexing behavior.

Unfortunately, as with the instructions placed in the robots.txt file, crawlers are not required to follow your directions, so it's a safe bet that some malicious bots will ignore your instructions and "swallow" the content of your pages without any qualms.

Indeed, it is worth remembering that robots meta tags are not a security mechanism: if you have private information that you do not want to be publicly available, choose a more secure approach, such as password protection, to prevent visitors and robots from accessing these confidential pages.

The 2 types of robots meta tags

There are two types of tags:

those that are part of the HTML page ("robots")

those that the web server sends as HTTP headers ("X-Robots-Tag")

The same parameters (like "noindex" and "nofollow") can be used by both the robots meta tag and the X-Robots-Tag: the only difference is how these parameters are communicated to robots.

Meta robots tag

The robots meta tag is part of the HTML code of a web page. It appears as code elements in the <head> section of a web page, such as:

<meta name="robots" content="[PARAMETER]">

While <meta name="robots" content="[PARAMETER]"> is the standard form, you can also give instructions to a specific crawler by replacing "robots" with the name of that crawler's user agent.

For example, if you want to specify a specific directive for Googlebot, you can use the following code:

<meta name="googlebot" content="[PARAMETER]">

Want to use more than one directive on a page? As long as they're aimed at the same crawler, multiple directives can be included in a single meta tag – just separate them with commas.

Here is an example:

<meta name="robots" content="noimageindex, nofollow, nosnippet">

As we will see below, this piece of code tells robots not to index the images on the page, not to follow any of its links, and not to display an excerpt from the page when it appears in search results.

If you want to give different instructions to different search crawlers, you will need to use separate tags that will address each crawler.
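To make the per-crawler behavior concrete, here is a minimal Python sketch of how a crawler might read these tags from a page. It is an illustration only: the class name RobotsMetaParser, the helper read_robots_meta, and the list of recognized crawler names are assumptions, and real crawlers apply their own parsing rules.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect robots directives from <meta> tags, grouped by crawler name."""

    def __init__(self, crawler_names=("robots", "googlebot")):
        super().__init__()
        # Names treated as crawler targets; this list is an assumption.
        self.crawler_names = {name.lower() for name in crawler_names}
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name not in self.crawler_names:
            return
        # Directives are comma-separated and not case sensitive.
        values = [v.strip().lower() for v in attr.get("content", "").split(",")]
        self.directives.setdefault(name, []).extend(v for v in values if v)


def read_robots_meta(html):
    """Return a dict mapping crawler name -> list of directives."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = """<head>
<meta name="robots" content="noimageindex, nofollow, nosnippet">
<meta name="googlebot" content="noindex">
</head>"""
print(read_robots_meta(page))
# {'robots': ['noimageindex', 'nofollow', 'nosnippet'], 'googlebot': ['noindex']}
```

Each crawler name gets its own list, which mirrors the rule that different crawlers need separate tags.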

X-Robots-Tag

While the meta robots tag allows you to control indexing behavior at the page level, the X-Robots-Tag is sent in the HTTP header to control the indexing of a page as a whole, as well as very specific elements of a page.

Although you can use the X-Robots-Tag to apply the same indexing directives as the robots meta tag, it offers more flexibility and functionality than the latter.

Indeed, the X-Robots-Tag allows the use of regular expressions, the application of indexing directives to non-HTML files, and the application of parameters at a global level.
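As a rough illustration of the regular-expression idea, the Python sketch below maps URL paths to a header value. The patterns, the chosen directives, and the function name x_robots_tag_for are assumptions made for the example; in practice this logic usually lives in the web server configuration rather than in application code.

```python
import re

# Hypothetical rules: (URL path pattern, X-Robots-Tag value).
RULES = [
    (re.compile(r"\.(pdf|docx?)$", re.IGNORECASE), "noindex, nofollow"),
    (re.compile(r"\.(png|jpe?g|gif)$", re.IGNORECASE), "noindex"),
]


def x_robots_tag_for(path):
    """Return the X-Robots-Tag value for a path, or None if no rule matches."""
    for pattern, value in RULES:
        if pattern.search(path):
            return value
    return None


for path in ("/files/report.pdf", "/img/logo.png", "/index.html"):
    print(path, "->", x_robots_tag_for(path))
```

The first matching rule wins, so more specific patterns should be listed before broader ones.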

To use the X-Robots-Tag, you must have access to your website's header.php, .htaccess, or server configuration file. From one of these files, add the X-Robots-Tag header for your specific server configuration, including any parameters. Here are some examples of what the X-Robots-Tag allows you to do:

Control indexing of non-HTML content (like video)

Block the indexing of a particular element of a page (like an image or a video), but not that of the page itself.

Control indexing if you don't have access to a page's HTML code (especially the <head> section) or if your site uses a global header that can't be changed.

Add rules to determine if a page should be indexed or not (for example, if a user has commented more than 20 times, index their profile page).
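The last item in the list, a rule that decides per page whether it should be indexed, could be sketched as a tiny Python WSGI application. Everything specific here is hypothetical: the COMMENT_COUNTS data, the /profile/<username> URL layout, and the app name profile_app are placeholders, and only the threshold of 20 comments comes from the example above.

```python
from html import escape

COMMENT_THRESHOLD = 20  # from the example: more than 20 comments -> index

# Hypothetical data store: username -> number of comments posted.
COMMENT_COUNTS = {"alice": 35, "bob": 3}


def profile_app(environ, start_response):
    """Tiny WSGI app serving /profile/<username> pages."""
    username = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if COMMENT_COUNTS.get(username, 0) <= COMMENT_THRESHOLD:
        # Not active enough: ask crawlers to keep the profile out of the index.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [f"<h1>{escape(username)}</h1>".encode("utf-8")]
```

To try it locally, the app can be served with wsgiref.simple_server.make_server from the standard library. Profiles above the threshold are sent without the header, so the default (index, follow) applies.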

What are the robots meta tag parameters?

Below are the parameters search engine crawlers understand and follow (or not) when used in robots meta tags.

These parameters are not case sensitive. Note, however, that some search engines may follow only a subset of them, or treat some directives slightly differently.

All

This is the default value, so you don't need to specify it: it tells the search engine that it may index the page and follow its links.

Follow

Even if the page is not indexed, the crawler should follow all the links on the page and pass link equity to the linked pages. This is also a default, so it does not need to be specified.

Noindex

Tells a search engine not to index a page. With this setting, the page will not appear in search results, but the links on it will still be followed by the crawler.

"Subscribers only" pages are one use case for this tag: you don't want search engines indexing paid content, but the pages they link to can still benefit from their authority.

You can use the noindex robots meta tag to keep duplicate content out of the index. Likewise, if two or more URLs on your website lead to the same page, this setting prevents crawlers from deciding for themselves which URL to prioritize (or penalize). You remain in control of your SEO.

Nofollow

Tells the crawler not to follow the links on a page and therefore not to pass on link equity. These links may be contained in navigation buttons, images or other resources.

You can also apply nofollow to individual links (rel="nofollow"), for example in your blog comments, to prevent spammers from taking advantage of your content. Consider also using it on paid links in banners and advertisements, as well as on the logos of your customers and partners.

None

Equivalent to using the noindex and nofollow tags together. You are telling the crawler to completely ignore the page. It will not be indexed and the links it contains will not be followed.

This tag is useful for outdated pages that you want to update or those under construction, for example.
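The relationship between all, none, noindex and nofollow can be summarized in a few lines of Python. This is a sketch of the semantics described above, not any search engine's actual implementation, and the function name expand_directives is an assumption.

```python
def expand_directives(content):
    """Translate a robots `content` string into index/follow permissions."""
    directives = {part.strip().lower() for part in content.split(",") if part.strip()}
    if "none" in directives:
        # "none" is shorthand for "noindex, nofollow".
        directives.update({"noindex", "nofollow"})
    # "all", "index" and "follow" are defaults, so only restrictions matter.
    return {
        "index": "noindex" not in directives,
        "follow": "nofollow" not in directives,
    }


print(expand_directives("none"))     # {'index': False, 'follow': False}
print(expand_directives("noindex"))  # {'index': False, 'follow': True}
print(expand_directives("all"))      # {'index': True, 'follow': True}
```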

Noimageindex

As we saw earlier, this setting tells the crawler not to index the images on a page, which can limit the use of your images without your prior permission. You can also set the meta name to googlebot-image to specifically address Google's image crawler.

However, note that images may still be indexed if links from other pages direct crawlers to them.

Noarchive

Search engines should not display a cached link to this page in search results. This keeps users from viewing a stored copy of content that you want to control, although the live page itself remains accessible.

You can use this tag for paid landing pages or internal documents. It is also useful for news sites that reserve some of their content for subscribers or run a paywall.

Nocache

Same as noarchive, but only used by Internet Explorer and Firefox.

Nosnippet

Tells a search engine not to display an excerpt from this page (such as the meta description) in search results. With this tag, the page's data cannot appear in SERP rich snippets.

Instead of displaying the metadata you selected, search engines may choose to display another snippet, not always relevant to your SEO strategy.

You can also mark specific parts of the text that you don't want used in snippets with the data-nosnippet HTML attribute.

Max-snippet: [number]

With this setting, you tell the crawler the maximum number of characters to show in SERP snippets. It does not prevent your content from being indexed.

Replace [number] with the maximum number of characters you want to allow in a text snippet for this search result.

A value of 0 is equivalent to the nosnippet instruction: search engines won't show any of your content as snippets in the SERPs.

A value of -1 leaves it to Google to determine the length of the snippet. No limit is applied in this case.

Note that this directive no longer applies if you use structured data on your page. Apart from images and videos, this setting applies to all other types of results (Google Images, Google Assistant, Discover, etc.).
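The three cases for max-snippet (a positive number, 0, and -1) can be sketched in Python as follows. The function apply_max_snippet is hypothetical and only illustrates the rule; search engines choose and cut snippets in their own way.

```python
def apply_max_snippet(text, max_snippet):
    """Trim a candidate snippet according to a max-snippet value."""
    if max_snippet == 0:
        return ""    # same effect as nosnippet
    if max_snippet < 0:
        return text  # -1: no limit, the engine picks the length itself
    return text[:max_snippet]


candidate = "Robots meta tags tell crawlers how to index a page."
print(apply_max_snippet(candidate, 16))  # Robots meta tags
print(apply_max_snippet(candidate, 0) == "")  # True
print(apply_max_snippet(candidate, -1) == candidate)  # True
```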

Unavailable_after [DATE and TIME]

Search engines should no longer index this page after a particular date.

Since the crawler may still visit the page occasionally, it may remain in the index, but with less chance of ranking well.

Keep in mind that a page does not stop being worth crawling just because its content no longer changes. Reserve this tag for pages that genuinely expire, such as event pages or time-limited job postings.

Example: <meta name="robots" content="unavailable_after: 2020-09-21">
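A crawler-side check for this directive might look like the Python sketch below. It assumes the date is written in ISO format (YYYY-MM-DD), as in the example above; other date formats, and the function name is_expired, are outside what this sketch handles.

```python
from datetime import date


def is_expired(content, today):
    """Return True if `content` carries an unavailable_after date before `today`."""
    for part in content.split(","):
        key, _, value = part.partition(":")
        if key.strip().lower() == "unavailable_after":
            # Assumption: the value is an ISO date such as 2020-09-21.
            return today > date.fromisoformat(value.strip())
    return False


content = "unavailable_after: 2020-09-21"
print(is_expired(content, date(2020, 9, 20)))  # False
print(is_expired(content, date(2020, 9, 22)))  # True
```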

Notranslate

When this instruction is not specified, Google may display a link next to the result to help users view translated content on your page.

If you don't want to offer a translation for this page in search results, use this directive.

Max-image-preview: [parameter]

This statement is used to set the maximum size of an image preview for this page in search results. If you don't specify the max-image-preview statement, Google will display a preview image with the default size.

Three values are accepted:

“none”: no image preview should be displayed

“standard”: a default image preview can be displayed

“large”: a large image preview may appear

The values standard and none allow you to prevent the use of large thumbnails when displaying AMP pages in search results.

3 mistakes to avoid when using robots meta tags

To take advantage of meta robots tags and improve your SEO, here are 3 mistakes to avoid.


1. Bad typography

Crawlers recognize attributes, values, and parameters in both upper and lower case. However, it is recommended to put your tags in lowercase to improve their readability, especially at the code level.

Also get into the habit of including commas and spaces to make your parameters easier to decipher.

2. The use of conflicting tags

Using conflicting tags results in indexing errors.

For example, if you have several meta tags like <meta name="robots" content="follow"> and <meta name="robots" content="nofollow">, only "nofollow" will be considered.

Why? Because crawlers favor the more restrictive value.

Since "follow" is a default value, avoid specifying it. Just use "nofollow" where you want to prevent links from being followed.
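The "most restrictive value wins" rule can be expressed as a short Python sketch. The mapping RESTRICTIVE_OVER and the function resolve_conflicts are illustrative assumptions that cover only the index/follow pairs discussed in this article.

```python
# Each restrictive value overrides its permissive counterpart.
RESTRICTIVE_OVER = {"noindex": "index", "nofollow": "follow"}


def resolve_conflicts(tag_contents):
    """Merge the content values of several robots meta tags on one page."""
    merged = set()
    for content in tag_contents:
        merged.update(p.strip().lower() for p in content.split(",") if p.strip())
    for restrictive, permissive in RESTRICTIVE_OVER.items():
        if restrictive in merged:
            merged.discard(permissive)
    return sorted(merged)


print(resolve_conflicts(["follow", "nofollow"]))  # ['nofollow']
```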

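3. Confusing noindex and disallow

Noindex prevents robots from indexing a page, not from crawling it! To prevent your page from being crawled, you must apply the Disallow directive in robots.txt.

To remove a page from the index, add noindex first and let crawlers revisit the page; only then add a Disallow rule in robots.txt. If the page is disallowed first, crawlers cannot fetch it and will never see the noindex directive.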
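The difference between crawling and indexing can be checked with Python's standard urllib.robotparser module: a crawler that is not allowed to fetch a page never gets to read the noindex tag inside it. The robots.txt content, the example.com URLs, and the helper can_see_noindex below are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks crawling of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def can_see_noindex(url, user_agent="Googlebot"):
    """A crawler can only read a page's noindex tag if it may fetch the page."""
    return parser.can_fetch(user_agent, url)


print(can_see_noindex("https://example.com/private/report.html"))  # False
print(can_see_noindex("https://example.com/blog/post.html"))       # True
```

A page under /private/ could therefore stay in the index despite carrying a noindex tag, because the tag is never read.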

Our tip

There you go, you now know all about using robots meta tags. If you are not sure how to take advantage of them to improve your SEO, do not hesitate to call on a professional!


About the Creator

Joe Smith

Joe Smith is a professional researcher and writer who works across different niches to engage visitors, producing writing that is useful, error-free, and adheres to SEO standards.

Industries

  • Automotive
  • Business
  • Moving and Storage
  • Rustic Wood
  • Tech
  • Travel etc.
