Scalable Web Cache Using Consistent Hashing

Sarthak Singhal
8 min read · Jan 14, 2021

Modern Problem

A key performance measure for the World Wide Web is the speed with which content is served to users. As traffic on the Web increases, users face growing delays and failures in data delivery. The two main causes are:

  • congested networks
  • swamped servers

Data travels slowly through congested networks. Swamped servers (those facing more simultaneous requests than their resources can support) either refuse to serve certain requests or serve them very slowly. Network congestion and server swamping are common because network and server infrastructure has not kept pace with the tremendous growth in Internet use. Servers and networks can become swamped suddenly and without warning. For example, a site mentioned as "trending" on the evening news may have to deal with a ten-thousand-fold increase in traffic the next day.

Solution? Caching

Caching has been employed to improve the efficiency and reliability of data delivery over the Internet. A nearby cache can serve a page (if cached) quickly even if the originating server is swamped or the network path to it is congested. Widespread use of caches also engenders a general good: if requests are intercepted by nearby caches, fewer reach the origin server, reducing server load and network traffic to the benefit of all users.

Drawbacks of using a single cache

Serving a group of users on a network with a single shared caching machine has several drawbacks:

  • If the caching machine fails, all users are cut off from the cache and must once again request pages directly from the origin server
  • While running, a single cache is limited in the number of users it can serve
  • It may become a bottleneck during periods of intense use
  • Due to limited storage, the cache will suffer "false misses" when requests are repeated for objects it was forced to evict for lack of space

Multiple cache usage

To achieve fault tolerance, scalability, and aggregation of larger numbers of requests, several solutions have been proposed that use systems of several cooperating caches. The advantages of using multiple cache servers are:

  • Failure of one or more servers doesn't affect the functioning of the others, which can continue serving requests normally
  • Multiple caches can handle more client requests because of the increased resources
  • The chance of false misses decreases

Orthodox Hashing

One of our desired goals when using multiple cache servers is that data be distributed evenly among all of them, so that no single server becomes a bottleneck for the entire system. This is why we use hash functions: they tend to distribute their inputs uniformly among all possible locations. Of course, cache servers will come up and go down over time, and this changing number of running servers makes normal hash schemes difficult to use. A normal hash scheme is generally given by:

Server Index = H(x) % N, where N = number of running servers
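As a concrete sketch of this scheme (the hash function choice, MD5 here, and the example URL are illustrative, not prescribed by the article):

```python
import hashlib

def server_index(url: str, n_servers: int) -> int:
    # Hash the URL to a large integer, then fold it into [0, N) with modulo.
    h = int(hashlib.md5(url.encode()).hexdigest(), 16)
    return h % n_servers

# With 4 running cache servers, every client computes the same index:
print(server_index("http://example.com/index.html", 4))  # a value in 0..3
```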

Rehashing

Now if we want to use this scheme, we need to tackle one problem. If servers keep coming up and going down, N keeps changing, and under such a scheme essentially every URL needs to be rehashed to a new cache. If this is not done:

  • We would keep getting cache misses, because the data would now live in one cache while we search for it in another
  • This would place a heavy load on the origin server, defeating the purpose of introducing the cache
  • It would also waste cache space, since multiple copies of the same object would be stored on different cache servers

Considering the above problems, it is necessary to rehash. But rehashing can be a herculean task: it requires either scheduled system downtime to update the mappings, or read replicas of the existing system to service queries during the migration. In other words, a lot of pain and wasted time.
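A small experiment makes the scale of the problem concrete. The sketch below (assuming the same modulo scheme as above, with made-up URLs) counts how many keys change caches when the cluster grows from 5 to 6 servers: roughly N/(N+1) of them, about 83% here.

```python
import hashlib

def server_index(url: str, n: int) -> int:
    return int(hashlib.md5(url.encode()).hexdigest(), 16) % n

urls = [f"http://example.com/page/{i}" for i in range(10_000)]
# Grow the cluster from 5 to 6 servers and count keys whose cache changes.
moved = sum(server_index(u, 5) != server_index(u, 6) for u in urls)
print(f"{moved / len(urls):.0%} of URLs map to a different cache")  # ~83%
```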

Drawbacks

Even rehashing might not solve our problem in a distributed setting. The trouble with simple rehashing in a distributed system is that state is stored on each node; a small change in cluster size, for example, can trigger a huge amount of work to reshuffle all the data around the cluster. As the cluster grows, this becomes unsustainable, because the amount of work required for each hash change grows linearly with cluster size.

The problem is further exacerbated by the asynchronous propagation of information through the Internet. At any one time, different clients will have different information about which caches are up or down (referred to as the client's view). Many different views can therefore pervade the system at once, which has potential drawbacks:

  • If each view causes a URL to map to a different cache, we will soon find that each URL is stored in all caches
  • Furthermore, with these multiple views in place it becomes difficult to argue that all caches will receive the same amount of load

Consistent Hashing

All these problems can be handled by a distribution scheme that does not depend directly on the number of cache servers. Consistent hashing is one such scheme: it distributes data across a set of nodes in a way that minimizes the remapping of data when nodes are added or removed. It is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table by assigning each of them a position on an abstract circle, or hash ring. This allows servers and objects to scale without affecting the overall system. It also does away with all inter-cache communication, yet allows the caches to behave together as one coherent system. Some advantages of consistent hashing over other cache schemes are:

  • Since all clients contact the same cache for a given page, the caching system suffers only one miss per page, regardless of the number of cooperating caches, rather than one miss per cache per page.
  • The miss rate is further reduced because we do not make redundant copies of a page in different caches (assuming clients share the same view).
  • This also leaves more space for other pages in the cache, increasing the chance of a hit.

Setup

The setup for consistent hashing is as follows:

  • Creating the hash key space:- Suppose we have a hash function that generates 32-bit integer hash values. We can represent this key space as an array of integers with 2³² slots. We'll call the first slot x0 and the last slot x(2³² − 1).
  • Representing the hash space as a ring:- Imagine that these generated integers are placed on a ring, such that the last value wraps around to the first.
  • Placing cache servers on the hash ring:- Using our hash function, we map each cache server to a specific point on the ring.

Determining cache server for data points

Using the same hash function, we map each data point/key to a specific place on the ring. After placing a key on the hash ring, we move in one fixed direction, clockwise or anti-clockwise, and search for the nearest cache server; the data is placed on the first server we find, as in the sketch below.
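A minimal sketch of this lookup in Python (the MD5-based ring positions and the color-named servers are illustrative assumptions; the 32-bit space matches the setup above):

```python
import bisect
import hashlib

def h32(key: str) -> int:
    # 32-bit position on the ring, used for servers and keys alike.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class HashRing:
    def __init__(self, servers):
        self._ring = sorted((h32(s), s) for s in servers)
        self._points = [p for p, _ in self._ring]

    def server_for(self, key: str) -> str:
        # Move clockwise from the key's position to the first server,
        # wrapping around past the end of the ring.
        i = bisect.bisect_right(self._points, h32(key)) % len(self._ring)
        return self._ring[i][1]

ring = HashRing(["blue", "green", "purple"])
print(ring.server_for("http://example.com/index.html"))
```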

Removing a cache server

A server might go down in production, but consistent hashing ensures this has minimal effect on our cache system. Say the green server goes down. The data belonging to it is redistributed between the blue and purple servers, while the data belonging to blue and purple remains untouched. When green went down, its keys had to find the next-closest server in the chosen direction, which redistributed green's data. For a key mapped to any other server, however, that server is still the nearest one on the ring even after green is gone, so no redistribution takes place.
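Continuing the HashRing sketch above (the URL set is made up), we can check this directly: when green is removed, every key that moves is a key that previously lived on green.

```python
urls = [f"http://example.com/page/{i}" for i in range(10_000)]

before = HashRing(["blue", "green", "purple"])
after = HashRing(["blue", "purple"])          # green has gone down

for u in urls:
    if before.server_for(u) != after.server_for(u):
        # Only keys that lived on green are redistributed.
        assert before.server_for(u) == "green"
```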

Adding a cache server

A server may also come up (or come back) in production, and again consistent hashing ensures it has minimal effect on our cache system. Say we add an orange server and it gets mapped between the blue and green servers. Only the data points lying between these two servers need to be remapped, since only for them can the nearest server change: keys that fall between blue and the new orange server now map to orange instead of green. On average we need to redistribute only K/N data points, where K is the total number of data points and N is the total number of servers.
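Again continuing the HashRing sketch, we can measure the remapped fraction when a server is added. With so few servers and no replicas the exact figure varies, but it stays far below the ~83% that the modulo scheme produced.

```python
urls = [f"http://example.com/page/{i}" for i in range(10_000)]

before = HashRing(["blue", "green", "purple"])
after = HashRing(["blue", "green", "purple", "orange"])  # one server added

moved = sum(before.server_for(u) != after.server_for(u) for u in urls)
print(f"{moved / len(urls):.0%} of keys remapped")  # on average about K/N
```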

Server Replication (Virtual Node)

With only a few servers, it may happen that most data points hash to the arc between the same two servers, overloading one of them. To ensure an even distribution of data among the cache servers, we assign not one but many positions to each server on the hash ring. The number of copies depends on the situation and may even differ per server: if server S1 were twice as powerful as the others, we could give it twice as many copies. As the number of replicas on the ring increases, the load on each server becomes more and more uniform. In real systems the number of replicas is quite large (often more than 100).
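One common way to implement virtual nodes (the "server#replica" labeling below is an assumption of this sketch, not something the article prescribes) is to hash several labels per server onto the ring and leave the lookup unchanged:

```python
import bisect
import hashlib

def h32(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class ReplicatedRing:
    def __init__(self, weights):
        # weights maps server name -> number of virtual nodes.
        self._ring = sorted(
            (h32(f"{server}#{i}"), server)
            for server, weight in weights.items()
            for i in range(weight)
        )
        self._points = [p for p, _ in self._ring]

    def server_for(self, key: str) -> str:
        i = bisect.bisect_right(self._points, h32(key)) % len(self._ring)
        return self._ring[i][1]

# S1 is twice as powerful as the others, so it gets twice the virtual nodes:
ring = ReplicatedRing({"S1": 200, "S2": 100, "S3": 100})
print(ring.server_for("http://example.com/index.html"))
```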

Handling multiple views

Consider a system with m caching machines and c clients, each client with a view consisting of an arbitrary half of the caching machines. There is a theorem stating that if O(log m) copies of each caching machine are made, and the copies and URLs are mapped to the ring using a good basic hash function, then the following properties hold:

  • Balance:- In any one view, URLs are distributed uniformly over the caching machines in that view.
  • Load:- Over all views, no machine gets more than O(log c) times the average number of URLs.
  • Spread:- No URL is stored in more than O(log c) caches.

Do check out the Git repo below for a performance analysis of consistent versus orthodox hashing, along with the code for the same.
