Podcast Notes: Cache Only with Mike Hurwitz
These are my rough notes from my podcast with Mike Hurwitz about caching. By rough I mean they're not all full sentences. If you read them, I hope you get something out of them, but if you don't, that's OK; they're just my notes and I don't care to take much time editing them. Thank you, enjoy the show xoxo
What is caching? Why do we care?
In real world systems, we have to worry about databases taking too long to give us data/it may take a while to query. Also, there may be many readers of some data (like reading posts on a social network). Having a different form of the post that you can access quickly may be beneficial. And not having to go back to the source of truth when a copy is good enough lessens the burden on the primary datastore.
What are the main things you achieve with caching?
Latency and bandwidth are the big ones.
You may want to have data in a different shape too. You may need to make five calls to get a piece of data, and storing it that way in the cache may help.
Caching is supposed to be ephemeral, and not the source of truth. So losing the cache should never mean sacrificing correctness, but you may lose up-time.
Rules around caching
There are a lot of rules around what makes cache operations safe.
Depending on your application, the cache may need to be reasonably up to date. For example, your bank statement shouldn't be read from a stale cache. The lifecycle and guarantees of an application lead to different usage patterns for the cache.
You may need to replace the cache each time you make a write, for example.
Is cache always in-memory?
Not always. CDNs are caches sitting in front of the web or applications out on a network. You can store images somewhere other than memory, and it's still considered a cache.
You can have tiers of caching depending on how latency tolerant you are. But at some point, the goal is to get it in memory somewhere.
What patterns do we use to access the cache?
Access patterns rule everything. So it depends on what your access patterns are.
When caching, we are saying that there is a value associated with a key, and that value can be represented in various ways. The value may be mutable or immutable. Where it can go wrong is when you assume the value will always be the same, and you are wrong.
Is key/value typical for caching?
For generic caches, this is typically the case. In memcache, for example, the value is a blob, so whatever you put in the blob is up to you. Some other caches are more aware of what is in the blob and hold metadata on it.
Weird things can happen if the value has mutators/methods on it. You can be loading something that is stale.
Details of caching and the different policies?
What are you putting in the cache? -- defines your read story
Are you removing or replacing things?
Are you ejecting things because of time or resources?
Whenever I read anything, I will write it back to the cache. Check cache, if it isn't there, go to system of record, if I find a value, write to cache so next read doesn't go to primary datastore. Good for most cases.
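That read path is the cache-aside pattern. Here's a minimal sketch of it in Go; the `Cache` type and `loadFromPrimary` callback are illustrative names, not anything from the episode.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache-aside read path: check the cache first, fall back to the system of
// record on a miss, then populate the cache so the next read is cheap.
type Cache struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewCache() *Cache {
	return &Cache{data: make(map[string]string)}
}

func (c *Cache) Get(key string, loadFromPrimary func(string) (string, bool)) (string, bool) {
	c.mu.RLock()
	v, ok := c.data[key]
	c.mu.RUnlock()
	if ok {
		return v, true // cache hit: the primary datastore is never touched
	}
	v, ok = loadFromPrimary(key) // cache miss: go to the system of record
	if !ok {
		return "", false
	}
	c.mu.Lock()
	c.data[key] = v // write back so the next reader hits the cache
	c.mu.Unlock()
	return v, true
}

func main() {
	primary := map[string]string{"post:1": "hello world"}
	calls := 0
	load := func(k string) (string, bool) {
		calls++
		v, ok := primary[k]
		return v, ok
	}
	c := NewCache()
	c.Get("post:1", load)
	c.Get("post:1", load) // second read served from cache
	fmt.Println("primary reads:", calls) // prints: primary reads: 1
}
```

The second `Get` never touches the primary map, which is exactly the "lessen the burden on the primary datastore" benefit from earlier.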
What about when system of record can't handle that amount of read volume? Not just when I read, but also when written to primary datastore, also write to cache. Then you will find you write records that nobody ever reads.
Is a write-through or write-back cache appropriate? Or maybe even a read-through cache? Lots to consider!!
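As a sketch of the write-through variant described above: every write goes to the primary store and to the cache in the same operation, so readers never see a stale entry for keys written this way. The in-memory `primary` map is a stand-in for a real database; this is an illustration, not the implementation discussed on the show.

```go
package main

import "fmt"

// Write-through: writes update the system of record and the cache together.
type WriteThrough struct {
	primary map[string]string // stand-in for the system of record
	cache   map[string]string
}

func NewWriteThrough() *WriteThrough {
	return &WriteThrough{primary: map[string]string{}, cache: map[string]string{}}
}

func (w *WriteThrough) Put(key, val string) {
	w.primary[key] = val // durable write first
	w.cache[key] = val   // then refresh the cache so readers see the new value
}

func (w *WriteThrough) Get(key string) (string, bool) {
	if v, ok := w.cache[key]; ok {
		return v, true
	}
	v, ok := w.primary[key]
	return v, ok
}

func main() {
	w := NewWriteThrough()
	w.Put("a", "1")
	v, _ := w.Get("a")
	fmt.Println(v) // prints: 1
}
```

The tradeoff is the one the notes call out: every write lands in the cache, including records nobody ever reads.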
Assuming that your data can go stale, one option is proactive deletion. If something changes, even if not often, and you need to be responsive to it, you can proactively change it in the cache when the something is changed/updated. Another way is to set a time to live (TTL) on the something. Let's say it's evicted every 10 minutes to keep the something fresh. But what if the something is read once every hour? Then your cache is useless. So your read frequencies determine the usefulness of the eviction policy you set.
Proactive deletion/replacement lets you have a long TTL. If the data doesn't change, you can keep it in the cache a while; if it does, your proactive replacement will update it, ensuring the cache isn't stale.
"If you can ever take the easy way out with caching, do it"
Cache placement / replacement
Another reason to evict from cache is capacity. If you are running out of RAM, you may need to throw some things out. You can pick something random to delete, and it works surprisingly well, especially if you aren't going to be at capacity often.
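Random eviction really is only a few lines. In Go, map iteration order is unspecified, so taking the first key from a `range` loop is a cheap stand-in for picking a random victim; this is my sketch, not a recommendation from the episode.

```go
package main

import "fmt"

// evictOneRandom removes one arbitrary entry, relying on Go's unspecified
// map iteration order as a cheap source of randomness.
func evictOneRandom(cache map[string]string) {
	for k := range cache {
		delete(cache, k)
		return // evict exactly one entry
	}
}

// put inserts a key, evicting a random entry first if the cache is at capacity.
func put(cache map[string]string, capacity int, key, val string) {
	if _, exists := cache[key]; !exists && len(cache) >= capacity {
		evictOneRandom(cache)
	}
	cache[key] = val
}

func main() {
	cache := map[string]string{}
	for i := 0; i < 5; i++ {
		put(cache, 3, fmt.Sprintf("k%d", i), "v")
	}
	fmt.Println(len(cache)) // prints: 3 -- never exceeds capacity
}
```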
What if you know the item you will use last? Then you could be smarter than random, but it's also kind of impossible to know. But it gives us somewhere to go, because we can do somewhere between random and having perfect knowledge of cache usage.
So, you can also implement least recently used (LRU). The disadvantage is that it's hard to implement, because you need to maintain a list of what has been recently used and update it each time that changes.
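That bookkeeping is usually a map for O(1) lookup plus a doubly linked list ordered by recency. A sketch using Go's standard library (this is a generic textbook LRU, not the cache discussed later):

```go
package main

import (
	"container/list"
	"fmt"
)

type lruEntry struct {
	key string
	val string
}

// LRU: items map gives O(1) lookup; order keeps recency, front = most
// recently used. The cost is a MoveToFront on every single access.
type LRU struct {
	cap   int
	order *list.List
	items map[string]*list.Element
}

func NewLRU(cap int) *LRU {
	return &LRU{cap: cap, order: list.New(), items: map[string]*list.Element{}}
}

func (l *LRU) Get(key string) (string, bool) {
	el, ok := l.items[key]
	if !ok {
		return "", false
	}
	l.order.MoveToFront(el) // record the access: this is the bookkeeping
	return el.Value.(*lruEntry).val, true
}

func (l *LRU) Put(key, val string) {
	if el, ok := l.items[key]; ok {
		el.Value.(*lruEntry).val = val
		l.order.MoveToFront(el)
		return
	}
	if l.order.Len() >= l.cap {
		oldest := l.order.Back() // least recently used lives at the back
		l.order.Remove(oldest)
		delete(l.items, oldest.Value.(*lruEntry).key)
	}
	l.items[key] = l.order.PushFront(&lruEntry{key: key, val: val})
}

func main() {
	l := NewLRU(2)
	l.Put("a", "1")
	l.Put("b", "2")
	l.Get("a")      // touch "a" so "b" becomes least recently used
	l.Put("c", "3") // evicts "b"
	_, ok := l.Get("b")
	fmt.Println(ok) // prints: false
}
```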
Another interesting approach is least frequently used (LFU). It's about how often you read an item.
This is a huge area of research!
Keeping products up-to-date. Some products are really big, like 1500 attributes big. Deserialization became an issue. Protobuf didn't give an opportunity to reuse memory buffers. So the main reason was to avoid memory churn on every call.
We have a really high rate of requests that usually want the same products in a relatively short amount of time. And the products aren't changing as quickly as the requests come in.
"it's always about the red shorts"
This could save ~1.8 million requests to the database per second. But then how do you clear out the memory in the cache? Proactive, time-based deletion would be best. If we know the order of access via LRU, given the read patterns, we can use that to throw items out of the cache and free up memory. The memory may not be full within the eviction window, but it helps keep the service from crashing by running out of memory. You can also do it in terms of the number of objects/bytes stored instead of time. For example, when there are 100,000 items in the cache, clean up and remove the least recently used.
Why did you write your own thing?
* Handle concurrent access. There are two different data structures -- a map & a priority queue -- so concurrency control is important.
* Proactive TTL/expiry. A nice-to-have.
Most of the cache implementations out there had either one or the other, but not both. Hashicorp has a good implementation, but it didn't support the proactive eviction that was needed.
The lazy part is not doing bookkeeping for the top 25%, because those items are safe and won't be removed. This saves a lot of activity, because when the cache is less than 25% full, which is most of the time, we can skip the bookkeeping.
Lookups in the map are constant time, and the values are objects, not blobs, because Go doesn't have generics. Then there is also the priority queue. When an item is accessed, we look up its position in the queue. If it's below the top 25%, the bookkeeping is done to move it up. If not, we don't bother.
This means the writer/exclusive lock isn't needed nearly as often. And most of the time, this lock is never needed, because most read items are in the top 25%.
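To make the locking idea concrete, here is a rough sketch of the pattern: reads normally take only the shared read lock, and the exclusive write lock (plus the recency bookkeeping) is taken only when the touched item isn't already near the most-recently-used end. "Near the top" is approximated here with a global access counter rather than a true position in a priority queue, so this shows the locking pattern only, not the real lazylru internals.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
	"sync/atomic"
)

type lazyEntry struct {
	key     string
	val     string
	touched int64 // access-counter value at last bookkeeping
}

type LazyCache struct {
	mu    sync.RWMutex
	cap   int
	clock int64 // advanced atomically on every access
	order *list.List
	items map[string]*list.Element
}

func NewLazyCache(cap int) *LazyCache {
	return &LazyCache{cap: cap, order: list.New(), items: map[string]*list.Element{}}
}

func (c *LazyCache) Get(key string) (string, bool) {
	now := atomic.AddInt64(&c.clock, 1)
	c.mu.RLock()
	el, ok := c.items[key]
	if !ok {
		c.mu.RUnlock()
		return "", false
	}
	e := el.Value.(*lazyEntry)
	val := e.val
	// Crude "top 25%" stand-in: if touched recently, skip the bookkeeping
	// entirely and never take the exclusive lock.
	recent := now-atomic.LoadInt64(&e.touched) < int64(c.cap)/4
	c.mu.RUnlock()
	if !recent {
		c.mu.Lock()
		c.order.MoveToFront(el) // no-op if the item was evicted meanwhile
		atomic.StoreInt64(&e.touched, now)
		c.mu.Unlock()
	}
	return val, true
}

func (c *LazyCache) Put(key, val string) {
	now := atomic.AddInt64(&c.clock, 1)
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		e := el.Value.(*lazyEntry)
		e.val = val
		atomic.StoreInt64(&e.touched, now)
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.cap {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*lazyEntry).key)
	}
	c.items[key] = c.order.PushFront(&lazyEntry{key: key, val: val, touched: now})
}

func main() {
	c := NewLazyCache(100)
	c.Put("a", "1")
	v, ok := c.Get("a")
	fmt.Println(v, ok) // prints: 1 true
}
```

The hot path for a recently touched item is RLock, map lookup, RUnlock: no list manipulation and no exclusive lock, which is where the savings come from.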
"Wu-Tang, and caching, are for the children"