Wednesday, 2 September 2015

HOW DISPATCHER RETURNS DOCUMENTS : How DISPATCHER caching Works ?



file



The Dispatcher always requests the document directly from the AEM instance in the following cases:

  • If the HTTP method is not GET. Other common methods are POST for form data and HEAD for the HTTP header.
  • If the request URI contains a question mark "?". This usually indicates a dynamic page, such as a search result, which does not need to be cached.
  • The file extension is missing. The web server needs the extension to determine the document type (the MIME-type).
  • The authentication header is set (this can be configured)
Determining if a document is cached
The Dispatcher stores the cached files on the web server as if they were part of a static website. If a user requests a cacheable document the Dispatcher checks whether that document exists in the web server's file system:
  • if the document is cached, Dispatcher returns the file.
  • if it is not cached, the Dispatcher requests the document from the AEM instance.
Determining if a document is up-to-date
To find out if a document is up to date, the Dispatcher performs two steps:

  1. It checks whether the document is subject to auto-invalidation. If not, the document is considered up-to-date.
  2. If the document is configured for auto-invalidation, the Dispatcher checks whether it is older or newer than the last change available. If it is older, the Dispatcher requests the current version from the AEM instance and replaces the version in the cache.
Methods for Caching
The Dispatcher has two primary methods for updating the cache content when changes are made to the website.
  • Content Updates remove the pages that have changed, as well as files that are directly associated with them.
  • Auto-Invalidation automatically invalidates those parts of the cache that may be out of date after an update. i.e. it effectively flags relevant pages as being out of date, without deleting anything.
Content Updates
In a content update, one or more AEM documents change. AEM sends a syndication request to the Dispatcher, which updates the cache accordingly:
  1. It deletes the modified file(s) from the cache.
  2. It deletes all files that start with the same handle from the cache. For example, if the file /en/index.html is updated, all the files that start with /en/index. are deleted. This mechanism allows you to design cache-efficient sites, especially in regard to picture navigations.
  3. It touches the so-called statfile; this updates the timestamp of the statfile to indicate the date of the last change.
The following points should be noted:
  • Content Updates are typically used in conjunction with an authoring system which "knows" what must be replaced.
  • Files that are affected by a content update are removed, but not replaced immediately. The next time such a file is requested, the Dispatcher fetches the new file from the AEM instance and places it in the cache, thereby overwriting the old content.
  • Typically, automatically generated pictures that incorporate text from a page are stored in picture files starting with the same handle - thus ensuring that the association exists for deletion. For example, you may store the title text of the page mypage.html as the picture mypage.titlePicture.gif in the same folder. This way the picture is automatically deleted from the cache each time the page is updated, so you can be sure that the picture always reflects the current version of the page.
  • You may have several statfiles, for example one per language folder. If a page is updated, AEM looks for the next parent folder containing a statfile, and touches that file.
Auto-invalidation
Auto-invalidation automatically invalidates parts of the cache - without physically deleting any files. At every content update, the so-called statfile is touched, so its timestamp reflects the last content update.
The Dispatcher has a list of files that are subject to auto-invalidation. When a document from that list is requested, the Dispatcher compares the date of the cached document with the timestamp of the statfile:
  • if the cached document is newer, the Dispatcher returns it.
  • if it is older, the Dispatcher retrieves the current version from the AEM instance.

Again, certain points should be noted:
  • Auto invalidation is typically used when the inter-relations are complex e.g. for HTML pages. These pages contain links and navigation entries, so they usually have to be updated after a content update. If you have automatically generated PDF or picture files, you may choose to auto-invalidate those too.
  • Auto-invalidation does not involve any action by the dispatcher at update time, except for touching the statfile. However, touching the statfile automatically renders the cache content obsolete, without physically removing it from the cache.






No comments:

Post a Comment