Everybody struggles: A caching tale

I am currently working on a tool that automates finding external links on Wikipedia that may no longer serve as valid references to an article because the link is broken. The tool requests every external link on a Wikipedia page and flags the links whose response status code is not 200. An important part of optimizing this tool, and something I had never implemented in a project before, is caching. For this project, I'm required to save the result for each link after a request is made and return the cached result if the same link is checked again within 24 hours.
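To make the checking step concrete, here is a minimal sketch of it in Python using only the standard library. It is not the tool's actual code: the names `fetch_status` and `find_suspect_links` are made up for illustration, and the sketch assumes that a link with no response at all should be flagged too.

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        # urlopen follows redirects, so a redirected link reports its final status.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (OSError, ValueError, http.client.HTTPException):
        return None  # DNS failure, refused connection, timeout, or malformed URL


def find_suspect_links(urls, fetch=fetch_status):
    """Return (url, status) pairs for every link whose status is not 200."""
    suspects = []
    for url in urls:
        status = fetch(url)
        if status != 200:
            suspects.append((url, status))
    return suspects
```

Passing the fetch function in as a parameter keeps the flagging logic separate from the network call, which also makes it easy to try out with made-up status codes.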

What is caching?

Caching is the process of temporarily storing data so that it can be accessed quickly, eliminating the need to recompute or refetch it every time. This temporary data store is called a cache. The most common caching mechanism in web applications is an in-memory key-value store such as Redis or Memcached.
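The idea can be sketched without any external tool. The example below is a toy in-memory cache in Python whose entries expire after a fixed time; `TTLCache` and `cached_call` are illustrative names, not part of any library, and the clock is passed in only so that expiry can be tried out without waiting.

```python
import time


class TTLCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}  # key -> (value, time it was stored)

    def get(self, key):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at >= self.ttl_seconds:
            del self._entries[key]  # expired: drop it so it gets recomputed
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self.clock())


def cached_call(cache, key, compute):
    """Return the cached value for key, calling compute() only on a miss.

    Limitation of this sketch: a computed value of None is never cached.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value)
    return value
```

With `TTLCache(ttl_seconds=24 * 60 * 60)`, the first lookup for a key does the work and every later lookup within 24 hours reuses the stored result.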

After reading about caching and understanding what it entails, the next step was to decide which tool I'd use to implement it. I researched the available tools and settled on Redis for the following reasons:

  • It stores data in memory, so read and write operations are extremely fast

  • It supports a variety of data structures such as strings, lists, and hashes

  • Toolforge, the hosting environment for Wikimedia developers, has a Redis instance already set up for use

  • Redis has an active community, plus detailed free courses at Redis University to help you get started
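For the 24-hour requirement, Redis can expire a key on its own: the `SETEX` command stores a value together with a time-to-live in seconds. Below is a rough sketch of how a link status could be cached that way in Python. It assumes `client` is an object with redis-py style `get` and `setex` methods, that `fetch_status` is a hypothetical function returning an integer status code, and that the `linkstatus:` key prefix is just an example naming scheme.

```python
DAY_IN_SECONDS = 24 * 60 * 60


def cached_status(client, url, fetch_status, ttl_seconds=DAY_IN_SECONDS):
    """Return the status code for url, reusing a cached value if one exists.

    client       -- assumed to offer redis-py style get() and setex()
    fetch_status -- hypothetical callable: url -> integer status code
    """
    key = "linkstatus:" + url  # example key scheme, not a required format
    cached = client.get(key)
    if cached is not None:
        # Redis hands values back as bytes or str, so convert to int.
        return int(cached)
    status = fetch_status(url)
    # SETEX stores the value and lets Redis delete it after ttl_seconds.
    client.setex(key, ttl_seconds, status)
    return status
```

With the redis-py package installed, `client` could be something like `redis.Redis(host="localhost", port=6379)` for a local instance; the connection details for a hosted instance depend on the environment.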

Entering an unexpected learning detour...

Up until deciding to use Redis, I'd never used it in any project, so I now had to dive into understanding how to implement caching with it. My learning style with any technology is that I need to understand the fundamentals that make it work. For this, I find that reading documentation from the source is better than, say, watching a video on YouTube (even though a video could be faster).

To get started with Redis, I took a course on Redis University: RU204, Storing, Querying, and Indexing JSON at Speed. This course covered how to set up Redis and run fast JSON queries. While setting up Redis, I discovered that I could not use it locally because my operating system (Windows) was not supported, which was kind of a bummer. Docker was the recommended workaround for Windows users.

Docker is a tool that lets you create an isolated environment (a container) in which an executable package (an image) runs, eliminating the need for tedious setup and configuration. As with Redis, I'd never actually used Docker in a project before, and at this point the learning curve was steepening. Before I could proceed with Redis, I needed to understand how Docker works and how I'd set it up in my project. I found this 3-hour course on YouTube by Nana to be very helpful.

Conclusion

As this tool evolves, the journey's intricacies contribute not only to the success of the project but also to my personal and professional development. The struggles encountered in the pursuit of knowledge and implementation are not just obstacles but stepping stones, shaping a more resilient and capable developer.