The more things I can do with you, the more things I have to think about recovering from.[2] Handling failures is an important theme in distributed systems design. Failure is the defining difference between distributed and local programming, so you have to design distributed systems with the expectation of failure. Imagine asking people, "If the probability of something happening is one in 10, how often would it happen?" This issue is discussed in the following excerpt of an interview with Ken Arnold. Ken, a research scientist at Sun, is one of the original architects of Jini and was a member of the architectural team that designed CORBA.

The network has to be repaired or you have to come up, because maybe what happened was not a network failure but you died. For one thing, it puts a multiplier on the value of simplicity. "In a cubic foot of air, those things happen all the time." When you design distributed systems, you have to say, "Failure happens all the time." So when you design, you design for failure. The other is the message never got to you because the network broke before it arrived.
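The ambiguity described here can be illustrated with a small simulation. This is only a sketch under stated assumptions: the names `Fault`, `Server`, `call`, and `observe` are hypothetical, and the "reply lost" case is assumed to be the counterpart of the "message never got to you" case mentioned above. No real network is involved; the faults are injected by hand.

```python
import enum


class Fault(enum.Enum):
    """Hypothetical failure modes injected into a simulated call."""
    NONE = "none"
    REQUEST_LOST = "request_lost"  # the message never reached the server
    REPLY_LOST = "reply_lost"      # the server did the work, the reply vanished


class Timeout(Exception):
    """Raised when no reply arrives; the caller learns nothing else."""


class Server:
    def __init__(self):
        self.applied = 0  # number of requests actually processed

    def handle(self, request):
        self.applied += 1
        return f"ok:{request}"


def call(server, request, fault):
    """Simulate one remote call under the chosen fault."""
    if fault is Fault.REQUEST_LOST:
        raise Timeout()  # server never saw the request
    reply = server.handle(request)
    if fault is Fault.REPLY_LOST:
        raise Timeout()  # server state changed, but the caller cannot know
    return reply


def observe(fault):
    """Return what the caller sees and what really happened on the server."""
    server = Server()
    try:
        outcome = call(server, "debit", fault)
    except Timeout:
        outcome = "timeout"
    return outcome, server.applied


if __name__ == "__main__":
    for fault in Fault:
        print(fault.value, observe(fault))
```

In both failure cases the caller observes the same timeout, yet the server has processed the request in one case and not in the other. That is why recovery has to be designed in rather than added afterwards, and why every extra operation adds another case to recover from.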