I thought it’d be fun to revisit some of the best books and blogs I read this year.

  • Designing Data-Intensive Applications by Martin Kleppmann. One of the most recommended books for learning distributed systems. I think it gives a pretty comprehensive overview of the fundamentals. I would use it more as a reference book and read chapters based on interest.
  • Bitcask. This was one of the first papers I read this year. It’s a pretty short read and the database is quite straightforward to implement. Would recommend this as the first paper for anyone looking to learn more about databases.
  • Delta Lake. A coworker recommended this to me, and it was very interesting to see the recent shift towards systems built on object storage, like WarpStream. I didn’t finish reading it, though.
  • In Search of an Understandable Consensus Algorithm. I mostly read this while implementing Raft for the MIT 6.5840 labs. Also not a very difficult read, but you need to scrutinize the details to not mess up the implementation.
    • Also wanted to mention this series of blog posts that helped me better understand Raft.
  • Keeping CALM: When Distributed Consistency is Easy. The paper introduces the concept of monotonic programs and shows that these programs have a consistent, coordination-free distributed implementation. I’d never made this connection before, but it makes a lot of sense. That said, these results don’t really help us find implementations for these problems.
  • Amazon’s Dynamo. This was also a highly recommended paper alongside Google’s Spanner. I just finished reading it and it definitely reinforced a lot of concepts and ideas for distributed systems, particularly about load distribution and availability vs. consistency trade-offs.
  • Cinnamon: Using Century Old Tech to Build a Mean Load Shedder is a series of blog posts from Uber on building a load shedder for graceful degradation. I had a lot of fun reading this and trying to replicate it with my own load shedder.
  • Google Prequal. This was a fairly recent paper and the headline is that you should not be balancing load based on CPU or memory usage because they are lagging indicators. Instead, you should be balancing on more real-time signals like requests-in-flight and latency. Very unexpected read.
  • Deterministic Simulation Testing. This is less of a single book or blog, and more of an overarching theme. I got super interested in the ability to deterministically reproduce concurrency bugs.
    • Antithesis has some really good blog posts on all things deterministic simulation testing, including beating video games with it.
    • The sled page gives a pretty good overview of how someone might implement DST for their systems.
  • The One Billion Row Challenge in Go. Was fun reading about the various performance optimizations you can do to process a billion rows in Go. No SIMD, but hoping to explore this more next year.
  • Gossip Glomers is a series of distributed systems challenges from Fly.io. They were fun and a good introduction to distributed systems, but I would have preferred something more difficult like the 6.5840 labs.

This was definitely not everything I read this year, but writing two-sentence reviews for everything would take far too long. So, I’ve included some other good reads that I won’t be writing more about (in no particular order).