MySites Development Blog: June 2008

James Hamilton of Windows Live Services wrote a compelling paper, "On Designing and Deploying Internet-Scale Services" (PDF).

Here's what I think are the most intriguing bits:

- Never shut down your services normally. (p. 2)
- One pod should not affect another pod [great for testing new versions of a cluster] (p. 3)
- Correlated failures are common [a switch failure may affect lots of computers] (p. 5)
- It must be easy to host the entire service on a single system (p. 7)
- Soft delete only [great for debugging] (p. 9)
- Avoiding latencies is the toughest problem (p. 10)
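
To make the "soft delete only" point concrete, here is a minimal sketch of what it could look like in code. This is my own illustration, not something taken from the paper: the `SoftDeleteStore` class, its method names, and the `deleted_at` field are all hypothetical. The idea is simply that a delete marks the record instead of erasing it, so the data is still there when you need to debug or undo.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class Record:
    key: str
    value: str
    deleted_at: Optional[datetime] = None  # None means the record is live


class SoftDeleteStore:
    """Hypothetical in-memory store that marks records deleted instead of erasing them."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def put(self, key: str, value: str) -> None:
        # Writing a key always produces a live record.
        self._records[key] = Record(key, value)

    def get(self, key: str) -> Optional[str]:
        # Soft-deleted records are hidden from normal reads.
        record = self._records.get(key)
        if record is None or record.deleted_at is not None:
            return None
        return record.value

    def delete(self, key: str) -> bool:
        # Mark the record with a timestamp; the data itself stays in place.
        record = self._records.get(key)
        if record is None or record.deleted_at is not None:
            return False
        record.deleted_at = datetime.now(timezone.utc)
        return True

    def undelete(self, key: str) -> bool:
        # Recovery is just clearing the marker.
        record = self._records.get(key)
        if record is None or record.deleted_at is None:
            return False
        record.deleted_at = None
        return True

    def deleted_keys(self) -> List[str]:
        # Handy when debugging: list everything that was "deleted".
        return [k for k, r in self._records.items() if r.deleted_at is not None]
```

A real service would also need some policy for eventually purging old soft-deleted records, which this sketch leaves out.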

Sunday, June 15, 2008

James Hamilton's Take on Cluster Computing
