Do you trust your cache? To meet end-user demands and speed up content delivery, content caching by web servers and content delivery networks (CDNs) has become a vital part of the modern web. Explaining how this can create data security vulnerabilities first requires asking another question.

Namely, how does microservice architecture work? This architectural style divides the monolithic model into independent, distributed services. That way, you can deploy and scale them separately. This makes a difference when it comes to data security, but also requires DevOps and security teams to adopt new security patterns and practices.

Developers used to build applications with a monolithic architecture, i.e., one large system with a single, large codebase. Monolithic applications and services were tightly coupled, which made scaling and code maintenance rather difficult. This led to the move from monolithic to microservice architecture, which allows teams to be more agile and cost-effective and to scale their systems more easily.

The microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating through lightweight mechanisms, often HTTP resource application programming interfaces (APIs).

The Data Security Concerns Around Content Caching

A scalable web caching solution helps to save bandwidth and deliver a better user experience to a product's clients.

For example, a CDN uses proxy servers in multiple locations to deliver content faster. These servers retain copies of rich media and other content.

Web browsers cache HTML files, JavaScript and images in order to load websites more quickly, while DNS servers cache DNS records for faster lookups. CDN servers cache content to reduce latency.

CDNs are servers that sit between your end-user and your server. Each of these servers will cache your content according to the cache rules you set in the various HTTP headers.

Web Cache

A cache is a hardware or software component for the temporary storage of frequently accessed static content. Web caches sit between the user and the application server, where they save and serve copies of certain responses.

The system performs web caching by retaining HTTP responses and web resources in the cache so that it can fulfill future requests from the cache rather than from the origin servers.

Cache Control

Caching is handled by the server via the cache-control headers. These headers specify instructions for caching mechanisms in both requests and responses.

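Standard cache-control directives that the client can use in an HTTP request: max-age, max-stale, min-fresh, no-cache, no-store, no-transform and only-if-cached.

Standard cache-control directives the server can use in an HTTP response: max-age, s-maxage, must-revalidate, proxy-revalidate, no-cache, no-store, no-transform, public and private.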

Types of Cache Directives

  • Public: Any cache may store the response, even if the response would normally be non-cacheable.
  • Private: Only a browser’s cache may store the response, even if the response would normally be non-cacheable.
  • No-cache: Any cache may store the response, even if the response would normally be non-cacheable. However, the stored response must always be validated with the origin server before it is reused.
  • No-store: No cache may store the response. A good way to disable caching of a resource is to send the no-store response header.
  • Max-age=<seconds>: This is the maximum amount of time the system considers a resource fresh. Unlike Expires, this directive is relative to the time of the request.
  • S-maxage=<seconds>: This overrides max-age or the Expires header, but only for shared caches (e.g., proxies). Private caches ignore this.
  • Must-revalidate: This indicates that once a resource becomes stale, caches must not use their stale copy without successful validation with the origin server.
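To make the directives above concrete, here is a minimal Python sketch of how a shared cache might read a Cache-Control response header. The function names are illustrative, not part of any library, and the logic is deliberately simplified: it ignores quoted values that contain commas, response status codes and the Authorization header, all of which a real cache has to consider.

```python
def parse_cache_control(header_value):
    """Parse a Cache-Control header value into a dict of lowercase directives."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without an argument (e.g. "public") map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_may_store(header_value):
    """Simplified rule: a shared cache must skip no-store and private responses."""
    directives = parse_cache_control(header_value)
    return "no-store" not in directives and "private" not in directives


def shared_cache_freshness(header_value):
    """Return the freshness lifetime in seconds for a shared cache, or None."""
    directives = parse_cache_control(header_value)
    # s-maxage takes precedence over max-age for shared caches.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return None


print(shared_cache_may_store("private, max-age=60"))
print(shared_cache_freshness("max-age=60, s-maxage=300"))
```

Note that no-cache does not prevent storage in this sketch; it only means the stored copy has to be revalidated before reuse, which matches the directive list above.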

Web Cache Poisoning and Cache Keys

With web cache poisoning, an attacker exploits the behavior of a web server and cache so they serve a harmful HTTP response to other users. Whenever a cache receives a request for a resource, it needs to decide whether it has a copy of this exact resource already saved and can reply with that or if it needs to forward the request to the application server.

Caches tackle this problem using the concept of cache keys: a predefined subset of an HTTP request’s components that the cache uses to identify the resource being requested. Two requests with the same cache key are treated as equivalent. Consider the sample request below.

GET /totally/real/site?isItForReal=true HTTP/1.1
User-Agent: Mozilla/5.0…
Accept: */*
Cookie: language=en;

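Note: Caches typically use the request line (the path and query string in the sample above), together with the Host header, as the cache key. Components of the request that are not included in the cache key, such as the User-Agent, Accept and Cookie headers here, are said to be ‘unkeyed’.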
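As a rough illustration of a cache key, the following Python sketch builds one from the method, host, path and query string only. The function name `cache_key` and the host `example.com` are assumptions made for the example (the sample request above does not show a Host header), and real caches let operators configure which components are keyed.

```python
from urllib.parse import urlsplit


def cache_key(method, host, target, headers=None):
    """Build a cache key from the keyed components only.

    `headers` is accepted but deliberately ignored: in this sketch every
    header other than Host is unkeyed.
    """
    parts = urlsplit(target)
    return (method.upper(), host.lower(), parts.path, parts.query)


# Hypothetical host; the request target is taken from the sample request.
key = cache_key(
    "GET",
    "example.com",
    "/totally/real/site?isItForReal=true",
    {"User-Agent": "Mozilla/5.0", "Accept": "*/*", "Cookie": "language=en;"},
)
print(key)
```

Two requests that differ only in their cookies or other unkeyed headers produce the same tuple, so the cache treats them as the same resource.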

Cache Keys and HTTP Requests

To explain cache keys further, consider the two HTTP requests below. A cache that does not key on the Cookie header treats them as equivalent, even though the first request asks for the response in English (en) and the second asks for it in Polish (pl).

Request 1

GET /blog/post.php?mobile=1 HTTP/1.1
User-Agent: Mozilla/5.0.
Cookie: language=en;
Connection: close

Request 2

GET /blog/post.php?mobile=1 HTTP/1.1
User-Agent: Mozilla/5.0.
Cookie: language=pl;
Connection: close

The response served to the second user (Request 2) will be in the wrong language, since the cache saved the English response generated for the first user (Request 1). Hence, any difference in the response triggered by an unkeyed input may be stored and served to other users. When a threat actor deliberately manipulates an unkeyed input, such as an HTTP header, so that a harmful response is cached, the result is a basic web cache poisoning attack.
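The language mix-up can be reproduced with a toy model. The sketch below is a hypothetical origin and cache written for illustration only; the cache is keyed on the request target alone, so the Cookie header is unkeyed.

```python
def origin(target, headers):
    """Hypothetical origin: picks the page language from the Cookie header."""
    cookie = headers.get("Cookie", "")
    language = "pl" if "language=pl" in cookie else "en"
    return f'<html lang="{language}">post for {target}</html>'


class ToyCache:
    """A cache keyed on the request target only; every header is unkeyed."""

    def __init__(self, backend):
        self.backend = backend
        self.store = {}

    def get(self, target, headers):
        if target not in self.store:
            # Cache miss: forward to the origin and keep the response.
            self.store[target] = self.backend(target, headers)
        # Cache hit: the headers of this request are never consulted.
        return self.store[target]


cache = ToyCache(origin)
first = cache.get("/blog/post.php?mobile=1", {"Cookie": "language=en;"})
second = cache.get("/blog/post.php?mobile=1", {"Cookie": "language=pl;"})
print(second)
```

The second caller asked for Polish but receives the stored English page, because the only input that differed was unkeyed.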

Spotting Web Cache Poisoning

There are several ways to manipulate what a cache stores that may allow web cache poisoning.

For the basic poisoning technique, the first step is to identify an unkeyed input. This can be done manually or with an automated tool. The Burp Suite extension Param Miner can also be used to identify unkeyed inputs.
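A manual check for unkeyed inputs can be sketched roughly as follows. This is not how Param Miner works internally; it is a simplified illustration. The function name, the `send` callback (which the caller supplies to perform the HTTP request and return the body) and the `cb` cache-buster parameter are all assumptions for the example.

```python
import uuid


def find_unkeyed_headers(send, path, candidate_headers):
    """Probe candidate headers and report which ones are reflected or cached.

    `send(target, headers)` is a caller-supplied function that performs the
    request and returns the response body as a string.
    """
    findings = {}
    for name in candidate_headers:
        # A unique cache buster gives each probe its own cache entry, so the
        # test does not poison the page that real users receive.
        separator = "&" if "?" in path else "?"
        target = f"{path}{separator}cb={uuid.uuid4().hex}"
        canary = f"canary{uuid.uuid4().hex[:12]}"

        first = send(target, {name: canary})
        if canary not in first:
            continue  # the header does not influence the response

        # Repeat the request without the header. If the canary is still
        # there, the response was cached and the header is unkeyed.
        second = send(target, {})
        findings[name] = "cached" if canary in second else "reflected"
    return findings
```

A result of "cached" suggests the header is both reflected and unkeyed, which is the combination the basic poisoning technique relies on.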

The Unknown Header Method

Attackers can also use the basic poisoning technique via an unknown header. This method takes advantage of how a modified HTTP request, carrying either a poisoned header or a payload injected into an existing header (for example, X-Forwarded-Host), affects the application response.

GET /en?cb=1 HTTP/1.1
X-Forwarded-Host: <unkeyedparamvalue>

HTTP/1.1 200 OK
Cache-Control: public, no-cache

<meta property="og:image" content="https://<unkeyedparamvalue>/cms/social.png" />

In the above exchange, the application has used the unkeyed X-Forwarded-Host header to generate an Open Graph URL inside a meta tag. <unkeyedparamvalue> can be any input that is reflected in the response, as the next example shows.

GET /en?vulnerablerequest=1 HTTP/1.1
X-Forwarded-Host: A."><script>alert(1)</script>

HTTP/1.1 200 OK
Cache-Control: public, no-cache

<meta property="og:image" content="https://A."><script>alert(1)</script>"/>

In the modified response, the attacker has injected a simple cross-site scripting payload through the unkeyed input. Once the poisoned response is cached, the arbitrary JavaScript code executes in the browser of whoever views the page.

More generally, the unkeyed input can be anything from a query string parameter to a cookie or an unknown header. Param Miner can identify the X-Forwarded-Host header shown in the above request as an unkeyed header.
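The reflection itself can be illustrated with a short Python sketch. The two render functions are hypothetical stand-ins for the application's template code; the second applies `html.escape` from the standard library before placing the header value in the attribute.

```python
import html


def render_og_image_unsafe(forwarded_host):
    """Reflects the header value into the attribute without any encoding."""
    return f'<meta property="og:image" content="https://{forwarded_host}/cms/social.png" />'


def render_og_image_escaped(forwarded_host):
    """Encodes quotes and angle brackets before reflecting the header value."""
    safe_host = html.escape(forwarded_host, quote=True)
    return f'<meta property="og:image" content="https://{safe_host}/cms/social.png" />'


payload = 'A."><script>alert(1)</script>'
print(render_og_image_unsafe(payload))
print(render_og_image_escaped(payload))
```

Escaping stops the payload from breaking out of the attribute, but it only addresses the script injection. An attacker could still supply a host name of their choosing, so the header remains an unkeyed input that is best ignored or validated.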

Another Data Security Concern: Unkeyed Cookie

Applications are at risk of web cache poisoning through an unkeyed cookie when cookies are not included in the cache key. If the cookie value is reflected in the response, an attacker can inject an arbitrary string into the cookie value to poison the cache.

In this example, the value of the unkeyed cookie ‘fehost’ is reflected in the response. The attacker injects a string into the ‘fehost’ cookie to poison the web cache. Next, the poisoned response is served from the cache to legitimate users of the website.
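One way to reason about this is to ask what happens when the reflected cookie is added to the cache key. The sketch below is an illustration under assumptions: the function names and cookie values are invented, and only the ‘fehost’ cookie name comes from the example above.

```python
def parse_cookie_header(cookie_header):
    """Split a Cookie header into a name-to-value dict (simplified parsing)."""
    cookies = {}
    for pair in cookie_header.split(";"):
        name, separator, value = pair.strip().partition("=")
        if separator:
            cookies[name] = value
    return cookies


def cache_key_with_cookies(target, cookie_header, keyed_cookies=("fehost",)):
    """Key on the target plus the cookies that influence the response."""
    cookies = parse_cookie_header(cookie_header)
    keyed = tuple((name, cookies.get(name)) for name in keyed_cookies)
    return (target, keyed)


user_key = cache_key_with_cookies("/", "fehost=prod-cache-01; session=abc")
attacker_key = cache_key_with_cookies("/", 'fehost=injected"-string; session=xyz')
print(user_key != attacker_key)
```

With the cookie keyed, the attacker's response is stored under a separate entry and is not served to other users. The trade-off is a lower cache hit rate, so keying is usually limited to the cookies that actually change the response.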

Web Cache Poisoning in CDN

Cache-poisoned denial-of-service (CPDoS) is another threat to data security from web cache poisoning. This attack poisons the CDN cache. By manipulating certain request headers, the attacker forces the origin server to return a ‘bad request’ error that is stored in the CDN’s cache. Thus, every request for that resource that comes after the attack will get an error page until the cached entry expires. One common variant is HTTP header oversize (HHO) in a CDN.

HTTP Header Oversize

HHO CPDoS attacks work when a web application uses a cache that accepts a larger header size limit than the origin server. The attacker sends an HTTP GET request with a header that is larger than the size supported by the origin server but smaller than the size supported by the cache. The cache forwards the request, the origin server rejects it with an error, and the cache stores that error response.
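The size mismatch can be modeled in a few lines of Python. Everything here is a toy: the two limits, the header name and the classes are assumptions chosen for illustration, and the cache is written with the flaw that makes the attack possible (it stores error responses).

```python
CACHE_HEADER_LIMIT = 20_480   # bytes accepted by the hypothetical cache
ORIGIN_HEADER_LIMIT = 8_192   # bytes accepted by the hypothetical origin


def header_size(headers):
    """Approximate the wire size: name, ': ', value and CRLF for each header."""
    return sum(len(name) + len(value) + 4 for name, value in headers.items())


def origin(target, headers):
    if header_size(headers) > ORIGIN_HEADER_LIMIT:
        return 400, "Bad Request"
    return 200, f"content of {target}"


class ToyCdn:
    def __init__(self):
        self.store = {}

    def get(self, target, headers):
        if header_size(headers) > CACHE_HEADER_LIMIT:
            # Rejected at the edge; nothing reaches the origin or the cache.
            return 431, "Request Header Fields Too Large"
        if target not in self.store:
            # Flaw: the origin's error response is stored like any other.
            self.store[target] = origin(target, headers)
        return self.store[target]


cdn = ToyCdn()
attack = cdn.get("/index.html", {"X-Oversized-Header": "A" * 10_000})
victim = cdn.get("/index.html", {"Accept": "*/*"})
print(attack, victim)
```

The oversized header fits under the cache's limit but exceeds the origin's, so the error is stored and the later, perfectly normal request receives it as well.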

Impact and Mitigation

The data security impact of a web cache poisoning attack depends on what the attacker can get cached and on the amount of traffic the affected page receives. It can be used to create stored cross-site scripting, open redirects and DoS attacks, depending on what parts of the application are at risk.

But there are ways to mitigate this. The most robust defense against cache poisoning is to disable caching. The best method to achieve this is via the Cache-Control header below:

Cache-Control: no-store, max-age=0

In addition, avoid taking input from headers and cookies. Identify unkeyed inputs in your application and disable them if you can. Lastly, patch client-side vulnerabilities, even if they seem unexploitable. That will help you lock down openings for web cache poisoning to increase your overall data security.
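As one possible way to apply the no-store header consistently, the sketch below wraps a WSGI application so that every response carries it. The middleware name is an assumption, and in practice the header is often set in the web server or CDN configuration instead.

```python
def no_store_middleware(app):
    """Wrap a WSGI app so every response carries Cache-Control: no-store, max-age=0."""

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any Cache-Control header set by the app, then add ours.
            kept = [(name, value) for name, value in headers
                    if name.lower() != "cache-control"]
            kept.append(("Cache-Control", "no-store, max-age=0"))
            return start_response(status, kept, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def demo_app(environ, start_response):
    """Hypothetical application that asks for public caching."""
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Cache-Control", "public, max-age=600")])
    return [b"ok"]


application = no_store_middleware(demo_app)
```

Disabling caching for every response gives up the performance benefits described earlier, so a narrower variant would apply the wrapper only to routes whose output depends on headers or cookies.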
