Here's what I think are the most intriguing bits:
- One pod should not affect another pod [great for testing new versions of a cluster] (p. 3)
- Correlated failures are common [a switch failure may affect lots of computers] (p. 5)
- It must be easy to host the entire service on a single system (p. 7)
- Soft delete only [great for debugging] (p. 9)
- Avoiding latencies is the toughest problem (p. 10)
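
To make the "soft delete only" point a bit more concrete, here is a minimal sketch of how I picture it. This is my own toy illustration, not code from the paper: the `SoftDeleteStore` class, the `Record` type, and the `deleted_at` field are all hypothetical names, and a real service would keep this in its database rather than in memory. The idea is simply that a delete marks the row instead of removing it, so the data stays around for debugging or recovery.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class Record:
    key: str
    value: str
    # Hypothetical marker field: None means the record is live.
    deleted_at: Optional[datetime] = None


class SoftDeleteStore:
    """Toy in-memory store (hypothetical) that never physically removes rows."""

    def __init__(self) -> None:
        self._rows: Dict[str, Record] = {}

    def put(self, key: str, value: str) -> None:
        # Writing a key creates a fresh, live record.
        self._rows[key] = Record(key, value)

    def get(self, key: str) -> Optional[str]:
        # Normal reads skip records that are marked as deleted.
        row = self._rows.get(key)
        if row is None or row.deleted_at is not None:
            return None
        return row.value

    def delete(self, key: str) -> bool:
        # Mark the record instead of dropping it; the data is still there.
        row = self._rows.get(key)
        if row is None or row.deleted_at is not None:
            return False
        row.deleted_at = datetime.now(timezone.utc)
        return True

    def undelete(self, key: str) -> bool:
        # Recovery path: clear the marker to bring the record back.
        row = self._rows.get(key)
        if row is None or row.deleted_at is None:
            return False
        row.deleted_at = None
        return True
```

With this shape, an accidental `delete("user:1")` can be reversed with `undelete("user:1")`, and someone debugging can still inspect the marked rows. Actually purging old marked rows would be a separate, deliberate step that this sketch leaves out.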