Durable Execution

How would you change the way you code if your app couldn't fail? Durable Execution is crash-proof execution. Its abstraction ensures your code keeps running, even when conditions would normally cause it to fail.

A Durable Execution system keeps working when networks fail or an instance of your app crashes. It tracks progress and state of your workflows, so your work isn't lost and your processes keep running. Whether your app faces a service outage or unexpected shutdown, Durable Execution makes sure it picks up where it left off, with no progress lost. This reliability lets your app handle disruptions to keep delivering results.

Business logic focus

With Durable Execution, you focus on your workflows and business logic, not on handling errors. The following code is real and works:

[Figure: Sample showing minimal code for a long-running process]

You end up with streamlined code that's more durable:

  • Cleaner code. Move abnormal condition handling out of your logic. You won't need it with Durable Execution.

  • Run forever. Don’t worry about crashes or system outages, even over years or decades. Temporal mitigates them.

  • Runs under every condition. Durable Execution separates oversight, like progress tracking, from your running code instances. When things go wrong, you can wait for them to resolve, or move processing to other systems, regions, or data centers.

  • Deploy and run at the same time. Durable Execution makes sure that each time your code runs, it follows the original logic and pathway. Ship updates and patches without changing outcomes for your existing long-running processes.

You gain these advantages by adopting the Temporal Service into your applications.

Temporal and Durable Execution

When using Temporal, Durable Execution separates your work's state and progress (called your "Event History") from its code. This abstracted oversight (called "orchestration") takes place on a central server. It uses a persistent state and progress data store. That means if your computing breaks, your workflows won't.
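
The separation of progress from code can be illustrated with a toy model. The sketch below is not Temporal's implementation or API: `run_workflow`, `EventHistory`, and the `crash_after` parameter are hypothetical names, and a plain Python list stands in for the persistent data store. It assumes each step has a unique name and only shows the idea that recorded progress outlives the process that produced it.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a durable store: (step name, recorded result).
# A real system would persist this outside the worker process.
EventHistory = List[Tuple[str, str]]


def run_workflow(
    steps: List[Tuple[str, Callable[[], str]]],
    history: EventHistory,
    crash_after: int = -1,
) -> List[str]:
    """Run steps in order, skipping any step already recorded in history.

    crash_after simulates a worker crash once that many new steps have run.
    """
    recorded: Dict[str, str] = dict(history)
    results: List[str] = []
    executed = 0
    for name, action in steps:
        if name in recorded:
            # Replay: reuse the stored result instead of redoing the work.
            results.append(recorded[name])
            continue
        if executed == crash_after:
            raise RuntimeError("simulated worker crash")
        result = action()
        history.append((name, result))  # record progress before moving on
        results.append(result)
        executed += 1
    return results


calls: List[str] = []  # tracks which steps actually executed


def make_step(name: str) -> Callable[[], str]:
    def action() -> str:
        calls.append(name)
        return f"{name} done"
    return action


steps = [(n, make_step(n)) for n in ("submit", "approve", "pay")]
history: EventHistory = []

try:
    run_workflow(steps, history, crash_after=2)  # dies before "pay"
except RuntimeError:
    pass  # the worker is gone, but the history survived

# A fresh run replays "submit" and "approve" from history and only runs "pay".
results = run_workflow(steps, history)
```

In this model, each step's action runs exactly once across both attempts, because the second run reads the first two results from the history rather than calling the actions again.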

Temporal's approach offers specific advantages:

  • Separation of management and execution. The Temporal Service isn't tied to specific task workers or computing platforms.
  • Scale as needed. Durable Execution scales with your business. Each execution is a unique progress abstraction. Add more computing resources to match your needs. This lets you manage additional work without affecting the consistency or reliability of your execution process.
  • Reduce latency. Durable Execution is fast and reliable. It processes tasks quickly and efficiently, ensuring short and predictable response times.

These features combine to provide responsive and reliable services. They resolve problems so you don't have to hard-code fixes into your business logic.

Self-healing and catastrophes

Imagine developing a system to handle reimbursements for your employees. Now, consider ways your process might get blocked -- and resolved. For example:

  • Your finance manager goes on vacation and can't approve a reimbursement. Set a time-out policy and use alternate routing (another coworker) or messaging ("Hey, I'll be out of the office") so every reimbursement gets addressed in time.

  • Your direct deposit with the reimbursed funds failed. For example, there might be an outage at the recipient's bank. After setting a retry policy that won't overload the API provider’s capacity, your process can keep trying until the deposit works. After giving the provider time to recover, you can run your code again and succeed.

  • The printer for paper checks is jammed or out of paper. Not every employee opts into direct deposit. You may need someone to manually walk over and take care of the printer issue before the check can be cut and sent. Once resolved, they can sign off to confirm the check printing task was completed.

With Durable Execution, any problem that recovers over time isn’t really a problem. You have a built-in way to retry your task later. Durable Execution keeps your tasks alive and moving, whether fully automated or integrated with human actions. It doesn't matter if your problems originate with computing, API calls, machinery, or personnel. Durable Execution is built to keep processes moving forwards, regardless.
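
A retry policy like the one in the direct deposit example can be sketched as a loop with exponential backoff. This is a minimal illustration, not Temporal's retry mechanism: `retry_with_backoff` and its parameter names are assumptions made for this example, and `flaky_deposit` is a hypothetical task that fails twice before succeeding. The `sleep` argument is injectable so the example runs without real delays.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    task: Callable[[], T],
    max_attempts: int = 5,
    initial_delay: float = 1.0,
    backoff: float = 2.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call task until it succeeds, waiting longer after each failure."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            # Broad catch for illustration; real code would list retryable errors.
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            # Grow the wait, capped, so the provider has time to recover.
            delay = min(delay * backoff, max_delay)
    raise ValueError("max_attempts must be at least 1")


attempts = {"count": 0}
waits: List[float] = []  # records requested delays instead of sleeping


def flaky_deposit() -> str:
    """Hypothetical deposit call that fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("bank unavailable")
    return "deposited"


outcome = retry_with_backoff(flaky_deposit, sleep=waits.append)
```

The capped, growing delay is what keeps retries from overloading a provider that is still recovering.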

To be clear, not all tasks heal over time. For example, one of your service providers might go out of business. Retrying your API calls won't get you anywhere if that happens. That's why Durable Execution is designed to handle catastrophes as well as intermittent issues.

When you run into outlier cases where something is truly broken, you need a solution like Temporal. With Temporal, you can patch your code to use a new provider and safely deploy your fixes. You can "replay" your flow's execution history to pick up real-world changes. This allows it to complete your process without losing or repeating work.

Temporal capably handles both the self-healing and catastrophic scenarios. To opt in, you need to be aware of the restrictions that allow Temporal to work its magic.

Temporal requirements

Temporal's use of Durable Execution depends on a few critical factors to ensure you won’t lose or repeat work. Temporal uses a technique known as History Replay, which depends on the following:

  • A durable store: Event History must be saved durably using your server's persistent store. A workflow run, or its abstract execution, must persist forever or until you explicitly decide you no longer need it.

  • Idempotency: Idempotency means you design tasks so that running them more than once has the same effect as running them once. An idempotent approach prevents process duplication, like withdrawing money twice or accidentally shipping extra orders. Run-once actions maintain data integrity and prevent costly errors. Idempotency keeps repeated operations from producing additional effects, protecting your processes from accidental or duplicated actions.

  • Determinism: Durable Execution stores and tracks every workflow as an abstract entity. If you need to restart the process under extreme circumstances, that process must align with the original run. You can't change a random number or a real measurement (like temperature, time, or location) from the first run. If you do, you can't just pick up from where you left off because the work no longer matches the earlier history.

    Durable Execution requires your workflow code to be deterministic. Every time it runs or is replayed, the outcomes must be the same. This is the only way centralized control can provide all of Durable Execution's features.

    Does this mean you can’t use random numbers or run your work on different days or in different environments? Of course not. It means your code must reliably pick up from where it left off without changing the past in any logical way. This is called determinism. It ensures that given the same starting conditions, your workflows behave identically during each execution. Your results are reliable and assured.
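
One way to picture how random values and determinism coexist is to record the value the first time it is produced and reuse the recorded value on every replay. The sketch below is a toy model under that assumption, not Temporal's API: `ReplayContext`, `side_effect`, and `needs_audit` are hypothetical names, and a Python list stands in for the Event History.

```python
import random
from typing import Callable, List


class ReplayContext:
    """Toy replay helper: records nondeterministic values on the first run."""

    def __init__(self, history: List[float]) -> None:
        self.history = history  # stand-in for a durable Event History
        self._cursor = 0        # position of the next value to replay

    def side_effect(self, produce: Callable[[], float]) -> float:
        if self._cursor < len(self.history):
            # Replay: return what the original run saw, not a fresh value.
            value = self.history[self._cursor]
        else:
            # First execution: produce the value and record it.
            value = produce()
            self.history.append(value)
        self._cursor += 1
        return value


def needs_audit(ctx: ReplayContext) -> bool:
    """Hypothetical workflow step: audit roughly half of reimbursements."""
    roll = ctx.side_effect(random.random)
    return roll < 0.5


history: List[float] = []
first = needs_audit(ReplayContext(history))     # draws and records a value
replayed = needs_audit(ReplayContext(history))  # reuses the recorded value
```

Calling `random.random()` directly inside `needs_audit` would let a replay take a different branch than the original run. Routing it through the recorded history keeps both runs on the same path.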

With Temporal's prerequisites in place, you're ready to start adopting Durable Execution into your "can't fail" applications.

Temporal and Durable Execution

Durable Execution offers a powerful solution for building reliable and scalable applications. It ensures that your workflows continue seamlessly, even when facing failures or disruptions. Durable Execution is:

  • Stateful and persistent: Durable Execution tracks progress and maintains state even when your service restarts or experiences failures. It stores checkpoints in external databases and logs, ensuring your system handles outages or crashes without losing progress.

  • Fault tolerant: Durable Execution handles failures automatically, keeping tasks running even when parts of your system go down. When a failure occurs, it recovers tasks without interrupting your entire application.

  • Designed to separate concerns: Durable Execution splits oversight (task orchestration) from infrastructure management. Focus your app's logic on business processes and application-level concerns, like managing fraud alerts or insufficient funds in a banking app, and not on status recovery. Durable Execution handles state and errors related to platform issues, such as network outages or infrastructure failures, so you don't have to.

  • Won't repeat work: Durable Execution ensures tasks are not repeated unnecessarily. When a task fails, it retries it using policies designed to ensure success without duplicating work. This keeps the process consistent, eliminating redundant work even when errors arise. You won't be sending out seven pizzas when the customer ordered just one.

  • Naturally recoverable: Even in worst-case scenarios, Durable Execution recovers execution without losing progress. Moving to new hardware or service center deployments won't interrupt your workflows.

  • Inherently observable: Durable Execution makes the state, health, and progress of your app fully visible. It tracks tasks in real time, so you see progress, failures, and retries as they happen.
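
The "won't repeat work" property depends on tasks that are safe to retry. A common way to get there is an idempotency key: the caller sends the same key with every retry, and the receiver performs the action only the first time it sees that key. The sketch below assumes a hypothetical in-memory `PaymentService`. A real service would keep its completed-key records in durable storage.

```python
from typing import Dict


class PaymentService:
    """Hypothetical payment endpoint that deduplicates by idempotency key."""

    def __init__(self) -> None:
        self.total_paid = 0
        self._completed: Dict[str, str] = {}  # idempotency key -> receipt

    def pay(self, idempotency_key: str, amount: int) -> str:
        if idempotency_key in self._completed:
            # Seen before: return the original receipt without paying again.
            return self._completed[idempotency_key]
        self.total_paid += amount
        receipt = f"receipt-{len(self._completed) + 1}"
        self._completed[idempotency_key] = receipt
        return receipt


service = PaymentService()
# A retry after a lost response reuses the same key.
first_receipt = service.pay("reimbursement-1001", 50)
retry_receipt = service.pay("reimbursement-1001", 50)
```

Both calls return the same receipt and the amount is paid once, so a retry after a timeout or lost response does not double-pay.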

These features work together to make sure your process will keep moving forward and complete successfully. Temporal's implementation of Durable Execution, whether you're self-hosting or using our world-class Temporal Cloud service, provides the solution.

Durable Execution helps you build reliable and scalable applications. It keeps your workflows running smoothly, even through system failures or disruptions. By separating your application logic from task orchestration, Durable Execution ensures that your processes are consistent, reliable, and error-free.

With automatic recovery, Durable Execution guarantees that tasks complete without losing or repeating work. It simplifies your code, lets you scale easily, and ensures that your app can handle any challenges along the way. Durable Execution makes sure your critical processes keep moving forward, no matter what.

Getting started with Temporal helps ensure your work is reliable, efficient, and scalable.