
How to Handle Microservice Failures: The Ambiguous Timeout Dilemma

🧩 The Classic Problem: When a Service Doesn’t Respond

Imagine the following flow in a service-oriented system:

Client → the-api → the-upstream → Database

1️⃣ The client calls the-api/v1/test
2️⃣ the-api calls the-upstream
3️⃣ the-upstream writes data into its own database
4️⃣ It replies OK to the-api
5️⃣ the-api stores the result in its database and replies OK to the client

Everything works fine until step 4 (D) fails because of a network partition or timeout. Now, the-api doesn’t know whether the-upstream actually wrote the data. You can’t safely commit your local write, and you can’t confidently reply to the client. ...
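The failure at step 4 can be reproduced with a small in-memory sketch. Everything here is a hypothetical stand-in and not the services' real code: the `Upstream` and `Api` classes, the `UpstreamTimeout` exception, the two failure flags, and the `"UNKNOWN"` outcome are names assumed for illustration. The sketch only demonstrates the ambiguity; it does not try to resolve it.

```python
class UpstreamTimeout(Exception):
    """Hypothetical error: the-api gave up waiting for the-upstream's reply."""


class Upstream:
    """Stand-in for the-upstream plus its own database (a plain list)."""

    def __init__(self, fail_before_write=False, lose_reply=False):
        self.db = []
        self.fail_before_write = fail_before_write  # request never arrives
        self.lose_reply = lose_reply                # write happens, reply is lost

    def handle(self, payload):
        if self.fail_before_write:
            raise UpstreamTimeout("request never reached the-upstream")
        self.db.append(payload)  # step 3: the-upstream writes to its database
        if self.lose_reply:
            # Step 4 fails: the write is durable, but the OK never gets back.
            raise UpstreamTimeout("reply from the-upstream was lost")
        return "OK"


class Api:
    """Stand-in for the-api plus its own database (a plain list)."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.db = []

    def handle_test(self, payload):
        try:
            reply = self.upstream.handle(payload)  # step 2
        except UpstreamTimeout:
            # From here, both failure modes look identical to the-api.
            return "UNKNOWN"
        self.db.append((payload, reply))  # step 5: local write
        return "OK"


if __name__ == "__main__":
    scenarios = {
        "request lost": Upstream(fail_before_write=True),
        "reply lost": Upstream(lose_reply=True),
    }
    for name, upstream in scenarios.items():
        outcome = Api(upstream).handle_test({"id": 1})
        print(name, "| the-api sees:", outcome, "| upstream rows:", len(upstream.db))
```

In both scenarios the-api observes the same timeout, yet only one of them left a row in the-upstream's database. That difference is invisible to the caller, which is exactly why neither the local commit nor the reply to the client is safe.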

October 6, 2025 · 5 min · Frederico Gago