Cache or not to cache

Last Updated on 30/05/2021 by Patryk Bandurski

For high-workload applications, it is important to manage resources efficiently. One technique that can reduce resource usage is caching. In RESTful services, this technique is typically applied to GET methods. However, it may be used in other cases where the result of fetching a particular resource can be reused. In this article, I will extend the previously designed REST service by adding caching to two operations. Mule provides the Cache Scope component for this. Apart from describing it, I will show you possible obstacles and how to handle them.

Cache Scope

Introduction

Cache Scope allows you to reuse previously saved responses for given requests. You may put as many event processors within the Cache Scope as you like. The payload after the last event processor is treated as the response. The main benefits of this scope are saving resources and handling messages more quickly.

Keep in mind that by default this scope uses an in-memory caching strategy. The drawback is that when Mule starts caching large payloads, it may reach the memory limit and throw a Java heap space error. In that case the default strategy should be replaced, which can be done by creating an object-store caching strategy.

Cache or not to cache, that is the question

It may be surprising at first, but not every payload is cached, and you may not even notice it. Mule distinguishes between repeatable and non-repeatable payload streams. A repeatable stream can be read an unlimited number of times, whereas a non-repeatable stream can be read only once. You can find a more detailed comparison in the Repeatable vs Non-Repeatable Streams article in the MuleSoft documentation.

Caching strategy

Default caching strategy

The Cache Scope behavior is depicted in the UML diagram above. When a payload/request enters the Cache Scope, Mule checks whether it is a repeatable or a non-repeatable stream. For a non-repeatable stream, all event processors within the Cache Scope are executed and the response is not cached anywhere. For a repeatable payload, a key is generated using the SHA-256 algorithm. This key identifies our payload/request. The key is then compared with the keys already saved in the Object Store. If the key already exists, we have a cache hit and Mule returns the cached value from the Object Store. On a cache miss, all event processors within the Cache Scope are executed and the result is saved to the Object Store.
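
For a GET request the payload is typically empty, so a key derived from the payload alone may not distinguish requests well. A minimal sketch of an explicit key, assuming the keyGenerationExpression attribute of the caching strategy; the strategy name and the choice of request path plus query string as the key are illustrative and not taken from the original project:

```xml
<!-- Hypothetical strategy name. The key expression is an assumption for illustration:
     it builds the cache key from the HTTP listener attributes instead of the payload. -->
<ee:object-store-caching-strategy name="Caching_Strategy"
        doc:name="Caching Strategy"
        keyGenerationExpression="#[attributes.requestPath ++ '?' ++ (attributes.queryString default '')]" />
```

This only works if the attributes still come from the HTTP listener at the point where the Cache Scope is entered.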

Implementation

Below you can find the flow implementing the GET method for the /accounts resource. During that call, we query the MongoDB service and then return slightly transformed data. As agreed, we cache the response from the MongoDB database. This is fairly simple, because we only need to wrap the Find documents operation in the Cache Scope. To do so, right-click on the event processor and select Wrap in… and then Cache from the drop-down menu. Here is the result:

Caching MongoDB query

MongoDB setup

To test the application, you should set up a MongoDB cluster with a database and a sample collection. I have described here how to do it free of charge.

It works … but wait, it doesn’t work

With everything in place, it is time to test it.

First call

GET /api/accounts HTTP/1.1
Host: localhost:8081

Result:

We received a 200 status code with a JSON body.

Second call

GET /api/accounts HTTP/1.1
Host: localhost:8081

Result:

We received a 500 status code with the following message: Cannot open a new cursor on a closed stream. So our service’s cache is not working properly. But why?

Implicit transformations

Mule 4 revolutionized transformations. In earlier versions, we needed to transform the payload explicitly. For example, to perform an XPath operation, it was sometimes necessary to perform a DOM to XML transformation. Another example is the JSON to Map transformation, used to access properties easily. Another commonly used transformation was Object to String, used mainly to consume a stream and store the extracted value in a String. Now you may forget about all of this: Mule does it behind the scenes for you. I think this is an outstanding improvement.

Do you remember the default strategy? Cache Scope works only with repeatable payloads. So let us see what kind of payload we get from MongoDB’s Find documents operation. We get a ManagedCursorStreamProvider. This is a non-repeatable payload, so we need to fix it to make the Cache Scope work.

Cache that works

Now that we know why it does not work, we can fix it. In Mule 3.x, I would use Object to String to easily consume the stream, or the MEL expression message.payloadAs(java.lang.String). However, neither method is available in Mule 4: the first because implicit transformations were introduced and transformation message processors no longer exist, the second because MEL has been replaced by DataWeave. In Mule 4, the way to resolve the issue is to use a repeatable stream, if the connector supports it. I added a Transform Message component with a type transformation, and it solves the problem.
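
The exact script is not shown in this post, so the following is only a sketch of what such a fix might look like. It assumes the Transform Message sits inside the Cache Scope right after Find documents and converts the cursor result to JSON; the config name MongoDB_Config, the collection name accounts and the find-documents attributes are assumptions and may differ between connector versions:

```xml
<ee:cache doc:name="Cache">
    <!-- Assumed connector configuration and collection name -->
    <mongo:find-documents doc:name="Find documents"
                          config-ref="MongoDB_Config"
                          collectionName="accounts" />
    <!-- Consume the cursor and emit a JSON payload that the scope can store -->
    <ee:transform doc:name="Transform Message">
        <ee:message>
            <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
payload]]></ee:set-payload>
        </ee:message>
    </ee:transform>
</ee:cache>
```

The point of the design is that the last processor inside the scope produces the value that gets cached, so the conversion has to happen before the scope ends.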

Conditional caching

We should allow clients to decide whether they want cached responses or not. There is a dedicated header for that, called Cache-Control. It is used to specify and control caching mechanisms in both requests and responses. When the client sends

Cache-Control: no-cache

it instructs the service that the caching mechanism should not be used. In the Cache Scope properties, under the Filter section, you can write a condition that a message needs to fulfill in order to be cached, like the one below:

#[attributes.headers.'cache-control' != 'no-cache']

With this condition, we cache only requests whose Cache-Control header is different from no-cache.
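
In the XML view, the same filter might look roughly like the sketch below. It assumes the filterExpression attribute on the Cache Scope element and a hypothetical strategy name; check the generated XML in your own project before relying on it:

```xml
<!-- Caching_Strategy is a hypothetical name for the referenced caching strategy -->
<ee:cache doc:name="Cache"
          cachingStrategy-ref="Caching_Strategy"
          filterExpression="#[attributes.headers.'cache-control' != 'no-cache']">
    <!-- event processors whose result should be cached go here -->
</ee:cache>
```

When the client omits the header entirely, the left-hand side evaluates to null, which is also different from 'no-cache', so such requests are still cached.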

Time To Live (TTL)

The client should know how long a resource will remain valid. We can inform the consumer by returning a header such as:

  • Cache-Control: max-age=seconds containing a number of seconds before the resource is considered stale
  • Expires: containing the date-time value

Examples:

Cache-Control: max-age=120
Expires: Mon, 12 Oct 2015 07:28:00 GMT

We need to do two things. First, we need to configure the Cache Scope to expire messages. Second, we need to create the Expires header and return it in each response. To enable TTL, follow these steps:

  • Add the ObjectStore module to the Mule project
  • Click on the Cache Scope and select the plus sign next to Reference to a strategy
  • In the General tab:
    • next to Object Store, select Edit inline
    • provide an Alias, like Cache_Object_Store
    • uncheck Persistent
    • in Entry ttl, provide the number of seconds for which a response should be considered valid
Custom Caching Strategy
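
The steps above should produce a global element roughly like the following sketch. The strategy name is hypothetical, and the 120-second TTL is chosen here only to match the two-minute Expires value used later in this article:

```xml
<!-- Caching_Strategy is a hypothetical name; the alias follows the steps above -->
<ee:object-store-caching-strategy name="Caching_Strategy" doc:name="Caching Strategy">
    <!-- Non-persistent private store whose entries expire after 120 seconds -->
    <os:private-object-store alias="Cache_Object_Store"
                             persistent="false"
                             entryTtl="120"
                             entryTtlUnit="SECONDS" />
</ee:object-store-caching-strategy>
```

The Cache Scope then points at this element through its strategy reference.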

With the strategy in place, we need to add a new Transform Message within the Cache Scope. Since it is the payload that gets cached, we need to compose a payload that carries metadata containing the Expires header alongside the content. Paste the following code:

%dw 2.0
output application/json
---
{
	Metadata: {
		Expires: (now() + |PT2M|) as String {format: "EEE, dd MMM yyyy HH:mm:ss z"}
	},
	Content: payload
}

In the Expires line, we add two minutes to the current date and time and then format the result as an HTTP date timestamp. Go back to the HTTP listener and add the following element within it:

<http:response statusCode="#[vars.httpStatus default 200]">
	<http:headers><![CDATA[#[output application/java
---
{
	"Expires" : vars.outboundHeaders.Expires
}]]]></http:headers>
</http:response>
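
The listener reads the header from vars.outboundHeaders, so something after the Cache Scope has to populate that variable from the cached payload. How the original project does this is not shown in this post; one possible sketch, assuming the Metadata/Content structure produced by the DataWeave script above:

```xml
<!-- Placed right after the Cache scope. The variable name matches the listener
     response; Metadata and Content are the fields of the cached payload. -->
<set-variable variableName="outboundHeaders" value="#[payload.Metadata]" />
<!-- Return only the original content to the client -->
<set-payload value="#[output application/json --- payload.Content]" />
```

The variable must be set before the payload is replaced, because the second step discards the Metadata part.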

Final flow:

Final flow with Caching Scope

Source Code

Code is available at GitHub.


7 thoughts on “Cache or not to cache”

  1. Great article Patryk with a good walk through and explanation! Indeed, MEL is not available anymore in Mule 4 and DataWeave becomes the main expression language. Just one note, you might want to look into using wiremock to mock the external service call (it’s simple and made for stubbing HTTP based calls): http://wiremock.org/docs/running-standalone/.

  2. Hello,

    Great post! Just one correction. Starting with Mule 4.1 cache scope does cache stream contents. Repeatable streaming is leveraged by this component. Furthermore, it also works with auto-paging operations such as sfdc:query or db:select.

    Thanks!

    1. Hi,
      I appreciate your comment. I have now tested it using the Mule 4.1 runtime and you are right: stream content is cached by default. If you set the Streaming Strategy to Non repeatable stream, the content won’t be cached.
      Patryk

  3. Thanks for the great article.
    When I imported your project, I see the error below on the os:private-object-store element.
    How do I resolve this?

    Element: private-object-store is not allowed to be child of element Caching Strategy

    Thanks.

    1. Hi, thanks for the comment. I have just returned from holidays. I will take a look and respond promptly 🙂

    2. Hi,
      I found the possible culprit. Could you check whether the ObjectStore [v1.1.0] dependency exists in Package Explorer? Sometimes the newest Anypoint Studio does not correctly load all dependencies present in a pom file. I received the same error when I manually removed the ObjectStore module from my project.
      Hope it helps.
