The Last-Modified header speeds up the indexing of new pages significantly.

SEO Myths: All about the Last-Modified header

In the field of search engine optimization (SEO) there are a lot of different myths circulating. Some of them have a basis, some came from nowhere. In this note we will look at one of them: the use of the Last-Modified response header.

Some time ago we received a document entitled “Ingate Recommendations for Web Studios on Promoted Sites.” And one of the “recommendations” was the following:

After a redesign or on a new site being developed, the date of the last modification of the site pages (Last Modified) must be indicated.

To add information about the date of the last modification of pages to a site in PHP, you need to insert a script at the very beginning of the source code of each page:


<?php
header("Last-Modified: " . date("D, d M Y H:i:s", time()) . " GMT");
?>

It was this utter nonsense and frankly crazy code that prompted me to write this note. Here I will try to explain what Last-Modified is, why it is needed and how browsers and search engines use it.

What is Last-Modified

When transmitting information to the client (a browser or a search robot), the web server sends quite a lot of additional data. It can be viewed in the browser console, for example:

configure the server to issue correct response headers (for example, if the page does not exist, issue a 404 error, and if an If-Modified-Since request is received, then issue a 304 code if the page has not been changed since the date specified in the request).

You can also see that if the server ignores a conditional GET request, the exchange is no different from a regular request. That is, a Last-Modified header carrying the current time, and incorrectly formed at that (hello, Ingate!), is not needed at all!

So is Last-Modified necessary or not?

In general, yes. But it is important to understand that it is not the header itself that plays a role, but the entire conditional request scenario, which the site must implement in full. Only then will we get fast site indexing.

But it is often very difficult to implement this in a ready-made CMS. This may require quite significant changes to the code of the CMS itself.

For a number of CMSs, however, this can be achieved by enabling page caching. If the CMS caches pages, creating and serving what are essentially static files, then the web server itself will respond correctly to conditional requests. For example, in WordPress this can be achieved with the WP Super Cache plugin:

Let's check it in action. I enabled this plugin, opened the browser in incognito mode and made two requests for the same page. It is clearly seen that the second response is the correct one, 304 Not Modified:

Instead of a conclusion

Thus, we have dealt with the Last-Modified header. First, it must convey information about the date and time the document was actually modified. Secondly, the server’s response to a conditional request with the If-Modified-Since header is extremely important.

Well, listen less to SEOs who don’t know the basics of how the Internet works.

Why is this post in the SEO section? Last-Modified, as search engines claim, is a very important HTTP header that indicates the date of the document's last modification, that is, the date of the last change to the page.

Accordingly, if this header is absent, or rather is not returned, the site is deprived of some advantages. In particular, here is what I have read on the Internet about the benefits of Last-Modified:

  1. The speed of indexing new pages improves, and in 1 visit the robot can pick up more pages to index.
  2. The speed of re-indexing of pages to which you have made changes improves. This is very useful, but without this header it will take longer for your edits to be recorded.

In principle, this is already enough to make you want to check this header and, if necessary, configure it.

How to check last modified?

There are several tools; the one I liked most is http://www.tools.seo-auditor.com.ru/if-modified-since/
Here I just need to enter the address of the home page or of any article and then select the search robot, Yandex.

Last-Modified was found on my website, which is great. But initially it wasn't there, so how did I set it up?


How to configure last modified?

To be honest, at first nothing helped me, perhaps because my server runs nginx. I installed AddHeaders, a plugin that sets all the useful HTTP headers, including Last-Modified, but it did not help, although about a year ago it had successfully enabled this header on my site.

I also installed the premium Clearfy plugin on this blog, which I have written about. It is a useful thing, and it also has an option for setting Last-Modified.

I activated the option, but the check showed that the header was still not returned. In the end, everything was resolved by contacting the plugin's technical support: I described my server configuration and they gave me specific advice, namely to go to the server control panel and disable this and that. No sooner said than done, and now the header is returned.

I think adding a header will have a positive effect on my sites.

A universal solution: the AddHeaders plugin will most likely suit you if you have an Apache server. If you have nginx, try disabling SSI in the domain settings and activating the plugin again.

“In particular, the content of the response that the server gives to the “if-modified-since” request is important. The Last-Modified header must indicate the correct date the document was last modified."

Let's check how things work with Last-Modified in various CMSs.

# telnet www.example.com 80

and enter the following:

GET /index.html HTTP/1.0
User-Agent: Mozilla/5.0
From: something.somewhere.net
Accept: text/html,text/plain,application/*
Host: www.example.com
If-Modified-Since: Wed, 19 Oct 2005 10:50:00 GMT

If the server returns 304 (Not Modified), then it supports If-Modified-Since and the page has not been modified since the given date. Code 200 (OK) means either that the page has changed or that the server ignores the conditional request.
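The same check can be prepared programmatically. Below is a minimal sketch in Python (not the telnet session itself) that only builds the raw request text and interprets the status code; the function names build_conditional_get and interpret_status are hypothetical, and actually sending the bytes over a socket is left out.

```python
from datetime import datetime, timezone
from email.utils import formatdate


def build_conditional_get(host: str, path: str, since_ts: float) -> bytes:
    """Return the raw bytes of a GET request carrying If-Modified-Since."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        # usegmt=True produces the "Wed, 19 Oct 2005 10:50:00 GMT" form
        f"If-Modified-Since: {formatdate(since_ts, usegmt=True)}",
        "",  # the two empty items give the blank line
        "",  # that terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def interpret_status(status: int) -> str:
    """Explain what a status code says about conditional request support."""
    if status == 304:
        return "not modified since the given date; conditional requests are supported"
    if status == 200:
        return "page changed, or the server ignores If-Modified-Since"
    return "other response; no conclusion about If-Modified-Since"


since = datetime(2005, 10, 19, 10, 50, tzinfo=timezone.utc).timestamp()
request = build_conditional_get("www.example.com", "/index.html", since)
print(request.decode("ascii"))
```

A 200 on its own proves little: to tell "changed" from "ignored", repeat the request with a date that is certainly later than the last change.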

If-Modified-Since check in C#

You can check how If-Modified-Since works using the following C# code:

private HttpWebResponse GetPage()
{
    string url = @"http://.....";
    // Place the web request to the server by specifying the URL
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    // No need for

Using this method you can verify that Joomla always returns StatusCode=200 (OK), regardless of the value of request.IfModifiedSince.

Checking If-Modified-Since via Yandex service

If in Yandex Webmaster we click on the “Check server response” button, then we get here:

Here again you can see that the site, and accordingly WordPress without the WP Super Cache plugin, does not add the Last-Modified header.

Well, we’ve sorted out the CMS, but how does Yandex itself work?

Here we can give the following example: today is July 7, 2011, the content in Joomla was updated on June 20, 2011, and Yandex has a version dated June 11, 2011 in its cache, although the robot has visited more than once since that date. In this case, Yandex picks up updates with a very significant delay. The question is, why?

Here is what Platon Shchukin says about this:

As the robot crawls the site, it will also crawl the specified page, after which, once the search databases are updated, the page will be updated in the search results. We are working to make this happen as quickly as possible.

For your part, you can also help the robot index the site faster by using the following recommendations:

Syntax

If-Modified-Since: <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT

Directives

<day-name>

One of "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", or "Sun" (case-sensitive).

<day>

2 digit day number, e.g. "04" or "23".

<month>

One of "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" (case-sensitive).

<year>

4 digit year number, e.g. "1990" or "2016".

<hour>

2 digit hour number, e.g. "09" or "23".

<minute>

2 digit minute number, e.g. "04" or "59".

<second>

2 digit second number, e.g. "04" or "59".

GMT

Greenwich Mean Time. HTTP dates are always expressed in GMT, never in local time.

Examples

If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT
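A header value in this format can be turned into a point in time with Python's standard library. The sketch below is only an illustration; parse_if_modified_since is a hypothetical helper name, and email.utils.parsedate_to_datetime is more lenient than the strict HTTP grammar above, so it should not be treated as a validator.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_if_modified_since(value: str):
    """Return an aware UTC datetime, or None if the value cannot be parsed."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError
        return None
    if parsed.tzinfo is None:
        # No usable zone information: HTTP dates are always GMT
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(parse_if_modified_since("Wed, 21 Oct 2015 07:28:00 GMT"))
print(parse_if_modified_since("not a date"))
```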

Specifications

Specification: RFC 7232, section 3.3: If-Modified-Since
Title: Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests

Browser compatibility

If-Modified-Since is listed as fully supported in Chrome, Edge (from version 12), Firefox, Internet Explorer, Opera, Safari, Android WebView, Firefox for Android, Opera for Android, Safari on iOS and Samsung Internet.


Hello, dear blog readers! We continue the topic of one of the most important SEO factors. This article touches on what can be called the intricacies of internal optimization, since we will talk about the response code that search engines and visitors receive when they access a page.

Correct server response

Although this is a fairly small detail in building and optimizing a site as a whole, it is very important. Specifically, a page that has not changed since the last visit of a robot or a person should return code 304, which means the page is unchanged. When the server returns this code, the page body is neither generated nor sent: the PHP scripts that build it do not run, and the client takes the page from its cache, which significantly reduces the load on the server and speeds up page loading for the user.

Thus, by setting up the correct responses from our server, we kill at least five birds with one stone:

  • We speed up page loading for visitors (people).
  • We reduce the load on the server.
  • The date of the page's latest update will be shown in the search results (in Yandex for sure), which can attract the user's attention, especially if the date is recent.
  • Site pages will take part in search engines' sorting by date.
  • We significantly speed up the indexing of your site by search engines!

For some reason the last point seems the sweetest to me (since it affects SEO and increases the credibility of your site with search engines), although the other points are without doubt extremely important too.

How to configure 304 and 200 server responses?

We have already said that in response to a request for an unchanged page the server must return 304 Not Modified. But what code should the server return if the client accesses the page for the first time or accesses a changed page? In such cases the server must return the status 200 OK. There is no need to send this code explicitly: if everything is in order with the page, the server returns 200 by default.

Therefore, we only need to take care of the 304 code, since the server will not send it without our intervention. The Last-Modified header and the If-Modified-Since request header will help us with this.

The Last-Modified header

Last-Modified is a header that we send using PHP. It contains the exact time of the page's last change, to the second. We store that time as a Unix time stamp, the generally accepted measure of time for this purpose, and convert it to a date string when sending the header.

A Unix time stamp is the number of seconds that have passed since the beginning of the Unix era: January 1, 1970, 00:00:00 GMT. At the time of writing this sentence, the Unix time stamp is 1370597447 seconds, which is June 7, 2013, 09:30:47 GMT (+00:00).
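The figure quoted above can be checked in a couple of lines. This is just an illustrative Python sketch of the conversion in both directions; the variable names are arbitrary.

```python
from datetime import datetime, timezone

stamp = 1370597447

# Seconds since 1970-01-01 00:00:00 GMT -> calendar date in GMT
moment = datetime.fromtimestamp(stamp, tz=timezone.utc)
print(moment.isoformat())       # 2013-06-07T09:30:47+00:00

# And back again: calendar date -> Unix time stamp
print(int(moment.timestamp()))  # 1370597447
```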

That is, all we need to do is send a header from PHP with the name Last-Modified and the desired date:

header("Last-Modified: " . gmdate("D, d M Y H:i:s", $last_modified_time) . " GMT");

Here header() is the function for sending an HTTP header, Last-Modified is what we send, and immediately after the colon comes its value:

gmdate("D, d M Y H:i:s", $last_modified_time) . " GMT"

The Last-Modified value is produced by the gmdate() function, which takes a variable I invented, $last_modified_time (you can call it whatever you want). The variable $last_modified_time holds the time of the last change as a Unix time stamp, and gmdate() formats that date properly in Greenwich Mean Time.

For clarity, here is an example: if we pass the value 1365003142 to gmdate(), the output will be: Wed, 03 Apr 2013 15:32:22.
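For readers who want to verify that value outside PHP, here is a rough Python equivalent of the gmdate() call plus the " GMT" suffix. The helper name http_date is made up for this illustration; email.utils.formatdate with usegmt=True is the standard-library call that produces this date format.

```python
from email.utils import formatdate


def http_date(unix_ts: float) -> str:
    """Format a Unix time stamp as an HTTP date, always in GMT."""
    return formatdate(unix_ts, usegmt=True)


print("Last-Modified: " + http_date(1365003142))
# Last-Modified: Wed, 03 Apr 2013 15:32:22 GMT
```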

Now that we have learned how the whole process works, the question may arise: “Do we have to specify the last modified time manually for each page?” The answer is “Yes!” Personally, I do exactly that: doing it manually is the most reliable option. For this blog, however, I have allowed for other changes too. For example, when a new comment appears on a page, the time it was added is written to $last_modified_time, so that search engines can index new comments and know that the site is “live.” Every site is different, and you will have to come up with your own algorithm for determining the date a page was last modified, or always specify it manually.

Let me emphasize again, my algorithm is as follows:

1) I set the creation date of the material manually; if I change something in the article (typos or additions), I again manually enter the new time of the last update.

2) If a visitor adds a comment, the time the comment was added is written to $last_modified_time automatically, without my involvement, since this is in fact the date the page was last modified.

What I did not take into account: the right column of the site holds the latest articles, recommended and top 10 blocks. They change constantly, and for all pages at once. If every change to the right column (automatic or manual, it doesn't matter) changed the page's last-modified date, the whole point of this exercise would be lost. I decided that these changes are not worth tracking or including in $last_modified_time, since they bring no SEO benefit.
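The rule described above boils down to "take the latest of the article edit time and the comment times, and ignore the sidebar". A minimal sketch of that rule in Python follows; the function and parameter names are hypothetical and say nothing about how the blog actually stores its data.

```python
def page_last_modified(article_edited_ts, comment_timestamps):
    """Return the Unix time stamp to report in Last-Modified for one page.

    Only the article text and its comments count. Sidebar blocks that change
    on every page at once are deliberately not an input here.
    """
    return max([article_edited_ts, *comment_timestamps])


# Article edited at 1365003142, two later comments: the newest comment wins
print(page_last_modified(1365003142, [1365100000, 1370597447]))

# No comments yet: the manually entered article time is used
print(page_last_modified(1365003142, []))
```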

As I wrote before, I can't tell you exactly how to automate the last modified date of a page, but I will tell you how NOT to do it!

Errors when specifying the last modified date

The first thing that comes to mind for most people is to send in the header the modification date of the file that holds the page content. I keep article texts in files rather than in a database, so for me this might seem like an excellent way to avoid entering the Unix time stamp manually every time. But no! In my experience, most hosting providers, and maybe even all of them, report a file's creation date as its last-change date and do not take subsequent changes into account.

I think the consequences in such cases are clear to you. One popular Ukrainian hosting provider (and I think it is not the only one) writes something like this in its FAQ: “Instead of the date the file was last modified, use the function time(), which returns the current time in Unix time stamp format.” This is so absurd you could shoot yourself on the spot! And this hosting provider is considered “one of the best”; after reading this, I immediately wanted to become their client.

This is simply anti-SEO. Think for yourself: a search engine comes to your page and looks: “Wow! The page last changed just now, I guessed the right moment to come, great!” A couple of days later it comes to the same page: “Look, it just changed again, what a coincidence... Wait, why don’t I see any changes? Okay, I’ll come another time.” It comes again: “Well, no, guys, this is no longer funny, you definitely can’t be trusted.” Such is the fairy tale :)

And then people wonder why their search results are not what they would like. It is because the site lacks ordinary trust. Just like in the parable “About the Shepherd and the Wolves.”

So, we’ve sorted out the main errors: you cannot specify the current time and I do not recommend specifying the file modification time. Now let's continue to look at how it all works.
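Why the current time is useless can also be shown with a toy simulation. The Python sketch below replays three visits by a client that sends back the Last-Modified value it received earlier; status_for and simulate_visits are illustrative names, and the comparison rule is the simple "greater than or equal" check used later in this article.

```python
def status_for(if_modified_since_ts, last_modified_ts):
    """304 if the client's copy is at least as new as the page, else 200."""
    if if_modified_since_ts is not None and if_modified_since_ts >= last_modified_ts:
        return 304
    return 200


def simulate_visits(last_modified_at, visit_times):
    """Replay visits; last_modified_at(now) is the date the server reports."""
    remembered = None  # the Last-Modified value the client has stored
    statuses = []
    for now in visit_times:
        reported = last_modified_at(now)
        status = status_for(remembered, reported)
        statuses.append(status)
        if status == 200:
            remembered = reported  # the client stores the new date
    return statuses


visits = [1000, 2000, 3000]
print(simulate_visits(lambda now: now, visits))  # time(): [200, 200, 200]
print(simulate_visits(lambda now: 500, visits))  # real date: [200, 304, 304]
```

With the current time the page looks freshly changed on every visit, so a 304 is never possible and the whole page is transferred each time.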

Setting up the Last-Modified header is exactly one third of the work. We still have to respond to the If-Modified-Since request and enable page caching. Neither step takes much time or many lines of code.

If-Modified-Since is a client request to your server in which the client asks: “has the page changed since my last visit?” If the page has not changed, we must stop further generation of the page with the die command.

In this case the body of the page must not start rendering; all of this happens BEFORE anything is output to the page! At the same time, the server must return the 304 Not Modified response to the client, thereby saying that the page should be taken from the cache. Let's get straight to the point:

if (isset($_SERVER["HTTP_IF_MODIFIED_SINCE"]) && strtotime($_SERVER["HTTP_IF_MODIFIED_SINCE"]) >= $last_modified_time) {
    header("HTTP/1.1 304 Not Modified");
    die;
}
header("Last-Modified: " . gmdate("D, d M Y H:i:s", $last_modified_time) . " GMT");

So, in the first line we check whether the request carried an If-Modified-Since header (PHP exposes it as HTTP_IF_MODIFIED_SINCE) and, at the same time, whether the number of seconds in it is greater than or equal to $last_modified_time. If it is, the date the client sent is no earlier than the date of the last page change, from which we draw the purely logical conclusion that the page has not changed for this client. So in the second line we send the 304 Not Modified response, and in line 3 we kill (stop) the execution of all scripts on the page. In other words, we stop generating it.

If the client did not send If-Modified-Since, or the date in it is earlier than the page's last modification, then we send the code 200 OK (the default) and, in the fifth line, the up-to-date modification date of the page to replace the one the client had.
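The same decision can be written as a small pure function, which makes the branches easy to test in isolation. This is a Python sketch of the logic described above, not the code that runs on the blog; conditional_response is a hypothetical name, the request headers are assumed to arrive as a plain dict with exactly this capitalization, and an unparseable date is treated as if no header had been sent.

```python
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def conditional_response(request_headers, last_modified_ts):
    """Return (status, response_headers) for a page changed at last_modified_ts."""
    response_headers = {"Last-Modified": formatdate(last_modified_ts, usegmt=True)}
    raw = request_headers.get("If-Modified-Since")
    if raw is not None:
        try:
            parsed = parsedate_to_datetime(raw)
            if parsed.tzinfo is None:
                parsed = parsed.replace(tzinfo=timezone.utc)  # HTTP dates are GMT
            since_ts = parsed.timestamp()
        except (TypeError, ValueError):
            since_ts = None  # malformed date: answer as for a normal request
        if since_ts is not None and since_ts >= int(last_modified_ts):
            return 304, response_headers  # the body must not be sent
    return 200, response_headers


changed_at = 1365003142  # Wed, 03 Apr 2013 15:32:22 GMT
print(conditional_response({}, changed_at))
print(conditional_response({"If-Modified-Since": "Wed, 03 Apr 2013 15:32:22 GMT"}, changed_at))
print(conditional_response({"If-Modified-Since": "Tue, 02 Apr 2013 10:00:00 GMT"}, changed_at))
```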

I have told you everything you need to know about If-Modified-Since and how the code works, except what the strtotime() function does:

strtotime($_SERVER["HTTP_IF_MODIFIED_SINCE"])

An attentive reader will already have guessed that this function converts an ordinary date string into a Unix time stamp. Since $last_modified_time is stored as a Unix time stamp, both values must be brought to the same unit before they can be compared.

And lastly, all we have to do is enable caching, this is done using the following lines:

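header("Cache-Control: public");
// Expires must be an HTTP date in GMT, so gmdate() is used rather than date("r")
header("Expires: " . gmdate("D, d M Y H:i:s", time() + 10800) . " GMT");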

Here the number 10800 is the time (in seconds) for which we want to cache the page, that is, 3 hours in this example.
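As a cross-check of the arithmetic, the caching headers can be computed like this. The Python sketch below is an illustration only: cache_headers is a made-up helper, and the max-age directive is my addition (it is not in the PHP above) that expresses the same 3 hours without depending on the clocks of client and server agreeing.

```python
import time
from email.utils import formatdate


def cache_headers(lifetime_seconds, now=None):
    """Build caching headers that let clients keep the page for lifetime_seconds."""
    now = time.time() if now is None else now
    return {
        "Cache-Control": f"public, max-age={lifetime_seconds}",
        # Expires must be an HTTP date in GMT, never local time
        "Expires": formatdate(now + lifetime_seconds, usegmt=True),
    }


# 10800 seconds = 3 hours; a fixed "now" of 0 makes the result reproducible
print(cache_headers(10800, now=0))
# {'Cache-Control': 'public, max-age=10800', 'Expires': 'Thu, 01 Jan 1970 03:00:00 GMT'}
```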

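And as always, for those who got lost, here is everything in full, as it is done on my blog:

<?php
// $last_modified_time is assumed to be set above: Unix time stamp of the page's last change
if (isset($_SERVER["HTTP_IF_MODIFIED_SINCE"]) && strtotime($_SERVER["HTTP_IF_MODIFIED_SINCE"]) >= $last_modified_time) {
    header("HTTP/1.1 304 Not Modified");
    die; /* killed everything below */
}
header("Last-Modified: " . gmdate("D, d M Y H:i:s", $last_modified_time) . " GMT");
?>
And off went the rest of the page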

I think you may have noticed that this whole Last-Modified story is an analogue of the <lastmod> tag in sitemap.xml. The difference is that lastmod is only informational and advisory, whereas no one can argue with your server's responses. Naturally, the lastmod in a sitemap often differs from the Last-Modified header, but from now on they should be the same for you! After all, we did not study all this just to end up like the unfortunate webmasters who never got further than sitemap.xml.
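One way to keep the two values identical is to derive both from the same stored time stamp. The Python sketch below shows the idea; last_modified_pair is a hypothetical helper, and the sitemap value is written in the W3C date-time form with an explicit +00:00 offset, which is one of the forms the sitemap format accepts.

```python
from datetime import datetime, timezone
from email.utils import formatdate


def last_modified_pair(unix_ts):
    """Return (Last-Modified header value, sitemap lastmod element) for one time stamp."""
    header_value = formatdate(unix_ts, usegmt=True)
    moment = datetime.fromtimestamp(unix_ts, tz=timezone.utc)
    lastmod = moment.strftime("%Y-%m-%dT%H:%M:%S+00:00")
    return header_value, f"<lastmod>{lastmod}</lastmod>"


header_value, lastmod_tag = last_modified_pair(1365003142)
print(header_value)  # Wed, 03 Apr 2013 15:32:22 GMT
print(lastmod_tag)   # <lastmod>2013-04-03T15:32:22+00:00</lastmod>
```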

Personally, at the moment I don't use the lastmod tag in my sitemaps at all. Maybe I will reconsider later, but for now I don't see the point in being that meticulous when the Last-Modified headers are correct :)

And finally, you can check that Last-Modified and If-Modified-Since work correctly using this service: click .

Thank you for your attention, special thanks to the ever-growing number of subscribers, for me this is the greatest incentive to write on the blog more often. So, whoever has not yet subscribed to new articles, welcome!