
Caching strategies for the Kreatio CMS: 1. Introduction

  • 15 January 2020 by Senthil Kumar

All websites do some form of caching. Caching plays a major role in improving the speed and performance of a website, and it reduces the infrastructure required to run the site. Site content is cached at many different places on the internet, all the way from the origin server to the user's browser. One of the key elements of your caching strategy is deciding how long each item or file is cached. It is important to understand that not every item can be cached forever. In fact, the elements of a site are not even cached for the same duration! How long you cache an element depends, among other things, on how frequently you expect it to change. For example, the site logo rarely changes, so you could cache it for, say, 10 years!
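As a rough illustration of choosing a lifetime per item, the sketch below maps a few kinds of content to a `Cache-Control` header value. The names `CACHE_LIFETIMES` and `cache_control_header`, the content kinds, and every duration except the 10-year logo example are assumptions made for illustration, not Kreatio settings.

```python
# Illustrative lifetimes in seconds. These numbers are assumptions,
# not Kreatio defaults; only the 10-year logo comes from the text.
ONE_MINUTE = 60
ONE_DAY = 24 * 60 * ONE_MINUTE
ONE_YEAR = 365 * ONE_DAY

CACHE_LIFETIMES = {
    "logo": 10 * ONE_YEAR,      # rarely changes
    "article": ONE_DAY,         # rarely edited after publishing
    "listing": 5 * ONE_MINUTE,  # changes whenever new articles appear
}


def cache_control_header(kind: str) -> str:
    """Build a Cache-Control value for a kind of content.

    Unknown kinds are not cached at all, which is the safe default.
    """
    max_age = CACHE_LIFETIMES.get(kind)
    if max_age is None:
        return "no-store"
    return f"public, max-age={max_age}"
```

Calling `cache_control_header("listing")` yields `public, max-age=300`, while an unrecognised kind falls back to `no-store`.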

What can be cached?

  • All HTML pages
  • Images, videos and other media content
  • JS, CSS and other supporting files

Caching HTML pages

HTML caching is mainly used to improve page speed. If the HTML takes a long time to load, it delays every other request the page depends on. Most content is rarely edited once it has been published, so the HTML for an article can be cached for a longer period. However, an article page as displayed on your website also includes many other components and widgets, and the data inside these keeps changing. The frequency of change differs from one widget or component to another. For example, the most read widget might change more frequently than a related articles widget. Therefore each widget or component has to be cached separately.

At the same time, some of these components are used across many pages of the website, so once a component is cached, its content can be reused across those pages. Keeping these two needs in mind, the caching infrastructure for the Kreatio CMS has been designed to cache individual pages and the components inside them separately.

Page cache

As mentioned earlier, the entire page can be cached. If the same request reaches the server again, the server can simply serve the previously processed page without doing any additional processing. This in turn increases the throughput of the server and shortens the response time for the requested page.

In Kreatio, we use a web publishing layer with a web server to serve static content as well as to handle dynamic HTML requests. This layer caches the page. The caching headers set on a response tell the system whether to cache that particular page and for how long to keep the cached content.

Article pages and static pages can be cached for a longer period because they hardly change once published. Listing pages, which change frequently as new articles are published or articles are curated using Ranked lists, are cached for a shorter period. Responses with Set-Cookie headers are not cached. The assumption is that if you are setting anything in a cookie, the content is user specific and cannot be cached for general use.
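The rules in this section can be sketched as a small decision function. This is only an illustration: the function name `page_cache_ttl`, the page type labels, and the durations in `PAGE_TTLS` are assumptions, and the real publishing layer makes this decision from its own caching headers and configuration.

```python
from typing import Mapping, Optional

# Assumed lifetimes in seconds, chosen only to show "longer" vs "shorter".
PAGE_TTLS = {
    "article": 24 * 60 * 60,  # rarely changes once published
    "static": 24 * 60 * 60,
    "listing": 5 * 60,        # changes as new articles are published
}


def page_cache_ttl(page_type: str, response_headers: Mapping[str, str]) -> Optional[int]:
    """Return how long to cache a page in seconds, or None to skip caching."""
    # HTTP header names are case-insensitive, so normalise before checking.
    header_names = {name.lower() for name in response_headers}
    if "set-cookie" in header_names:
        # A cookie being set is treated as a sign of user-specific content.
        return None
    # Unknown page types are not cached.
    return PAGE_TTLS.get(page_type)
```

With these assumed values, an article response without cookies is cached for a day, while the same response carrying a `Set-Cookie` header is not cached at all.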

Subscriber only content and action cache

Traditional web servers are optimized to serve static content. They do not do much processing or logical execution. This makes it challenging to cache subscriber-restricted content (paywall content), because we first need to know whether the user has the right to view the content, and that involves processing. A web server has no way to know this: it does not understand the application data and does not run the complex logic needed to find out whether the subscriber has access to specific content. So restricted content is not cached in the web server. Instead, we rely on the server-side web application framework to cache it. The framework checks whether the subscriber has access to the requested resource and then serves the appropriate content from its own framework cache. It therefore skips the actual work of rendering the template and only checks whether the subscriber has access.
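A minimal sketch of this idea, assuming the "appropriate content" is either the full page or a teaser: the access check runs on every request, but the template is rendered only once per page and variant. The `ActionCache` class, the `render` callback and the variant names are hypothetical and do not describe the actual framework API.

```python
from typing import Callable, Dict, Tuple


class ActionCache:
    """Caches rendered output per (path, variant) inside the application."""

    def __init__(self, render: Callable[[str, str], str]) -> None:
        self._render = render  # expensive template rendering (hypothetical)
        self._store: Dict[Tuple[str, str], str] = {}

    def serve(self, path: str, has_access: bool) -> str:
        # The access decision is made on every request...
        variant = "full" if has_access else "teaser"
        key = (path, variant)
        # ...but rendering happens only on a cache miss.
        if key not in self._store:
            self._store[key] = self._render(path, variant)
        return self._store[key]
```

Here `has_access` stands in for whatever subscription check the application performs; two subscribers requesting the same article trigger one render, and a non-subscriber gets a separately cached teaser.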

Component Cache

The content of each component is cached by the Kreatio CMS. While rendering the template, if a component is already cached, the CMS picks it from the cache and does not render and execute that piece of the template again. Component caches are stored so that all application nodes in your hardware cluster use a common cache storage. If one node caches a component, it is available to the other nodes as well. This maintains cache consistency; otherwise, each node might have its own cached version.

In the Kreatio CMS, each component can be cached globally or specific to a page's context. For example, the most read component, footer, header menu, etc. are the same on all pages, so they can be cached globally. Once generated, these component caches can be used on any page.

If the cache is specific to a page context, it is available only to pages with the same page context. For example, we have taxonomy listing pages such as category pages and section pages. A component using an automated ranked list for a particular taxonomy, taken from the current page, is cached within the context of that taxonomy. So two different taxonomies will have two different cached versions of the same component.
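One way to picture global and page-context caching is through the cache key. In the sketch below, a plain dictionary stands in for the common cache storage shared by all application nodes. The `ComponentCache` class, its key format and the component names are assumptions for illustration, not the Kreatio implementation.

```python
from typing import Callable, Dict, Optional


class ComponentCache:
    """Component cache backed by a store shared between application nodes."""

    def __init__(self, store: Dict[str, str]) -> None:
        # In a real cluster this would be a shared cache service;
        # a dict passed to every node plays that role here.
        self._store = store

    @staticmethod
    def key(component: str, page_context: Optional[str] = None) -> str:
        # No context means the component is cached globally.
        scope = "global" if page_context is None else f"context:{page_context}"
        return f"component:{component}:{scope}"

    def fetch(
        self,
        component: str,
        render: Callable[[], str],
        page_context: Optional[str] = None,
    ) -> str:
        cache_key = self.key(component, page_context)
        if cache_key not in self._store:
            # Render only on a miss; every node sees the stored result.
            self._store[cache_key] = render()
        return self._store[cache_key]
```

A footer fetched without a context is rendered once and reused by every node and page, whereas a ranked list fetched with the context `"sports"` gets a different key, and so a different cached version, from the same component on a `"politics"` page.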

Cache Expiry

A caching strategy is only as effective as our ability to expire the cache efficiently. Otherwise, we might end up with a lot of unhappy users being served stale content. We will see how this is done in the Kreatio CMS in the next piece.

Caching strategies for the Kreatio CMS: 2. Managing cache expiry