Basic Error Handling
Errors in workflows are handled with standard try-catch statements. Log the error and re-throw it if necessary to put the workflow in a failed state.
- Log errors with logger.error()
- Fail the workflow with throw error
- Failed workflows are automatically retried by the Worker
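
The pattern can be sketched as follows. This is a minimal illustration, not the workflow library's own API: `logger` is a local stand-in, and `processOrder` and `orderWorkflow` are hypothetical names.

```typescript
// Stand-in logger (assumption); a real workflow would use the logger its runtime provides.
const logger = {
  error: (message: string, error: unknown): void => console.error(message, error),
};

// Hypothetical step that always fails, used only to exercise the catch block.
async function processOrder(orderId: string): Promise<void> {
  throw new Error(`Order ${orderId} could not be processed`);
}

// Workflow body: log the error, then re-throw it so the workflow ends in a failed state.
async function orderWorkflow(orderId: string): Promise<void> {
  try {
    await processOrder(orderId);
  } catch (error) {
    logger.error("Order workflow failed", error);
    throw error; // without this, the workflow would appear to succeed
  }
}
```

Omitting the `throw` turns the catch block into an error sink, which is the behavior the next section uses on purpose for optional steps.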
Step-Level Error Handling
Not all steps are required. Some steps can fail without stopping the workflow. Wrap these optional steps in try-catch to absorb errors.
- Cache failure (data must still be saved)
- Notification failure (order must still complete)
- Logging failure (business logic continues)
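
One way to express this is a small wrapper that absorbs errors from optional steps. The helper `runOptional`, the workflow `saveWorkflow`, and the stand-in `logger` are hypothetical names introduced for this sketch.

```typescript
// Stand-in logger (assumption).
const logger = {
  warn: (message: string, error: unknown): void => console.warn(message, error),
};

// Runs an optional step. Returns true on success, false if the error was absorbed.
async function runOptional(name: string, step: () => Promise<void>): Promise<boolean> {
  try {
    await step();
    return true;
  } catch (error) {
    logger.warn(`Optional step "${name}" failed; continuing`, error);
    return false;
  }
}

// Hypothetical workflow: saving is required, caching and notification are optional.
async function saveWorkflow(
  save: () => Promise<void>,
  updateCache: () => Promise<void>,
  notify: () => Promise<void>,
): Promise<{ cached: boolean; notified: boolean }> {
  await save(); // required: an error here propagates and fails the workflow
  const cached = await runOptional("cache", updateCache);
  const notified = await runOptional("notification", notify);
  return { cached, notified };
}
```

Returning a flag per optional step keeps the failure visible to the caller even though it does not stop the workflow.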
Retry Patterns
Manual Retry
External API or network requests can fail temporarily. In such cases, multiple retries can improve success rates.
- Maximum 3 attempts
- 5 second wait between attempts
- Throw error if all attempts fail
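
A minimal sketch of the manual retry loop described above, with the 3 attempts and 5 second wait as defaults. The function name `retry` and its parameters are illustrative, not part of any library.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries `operation` up to `maxAttempts` times, waiting `delayMs` between attempts.
async function retry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 5000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < maxAttempts) await sleep(delayMs);
    }
  }
  throw lastError; // all attempts failed
}
```

A call such as `await retry(() => fetch(url))` would use the defaults; passing a shorter `delayMs` is convenient in tests.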
Exponential Backoff
Exponential backoff gradually increases the delay between retries, preventing server overload while improving retry success rates.
- 1st failure -> 2 second wait
- 2nd failure -> 4 second wait
- 3rd failure -> 8 second wait
- 4th failure -> 16 second wait
Benefits:
- Provides time to recover from temporary overload
- Distributes server load
- Improves retry success rate
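
The delay schedule above doubles after each failure, which can be sketched as follows. The names `backoffDelayMs` and `retryWithBackoff` are hypothetical, and the limit of 5 attempts is an assumption inferred from the four waits listed. The `sleep` function is injectable so the schedule can be checked without real waiting.

```typescript
// Delay after the nth failure (1-based): 2 s, 4 s, 8 s, 16 s with the default base.
function backoffDelayMs(failureCount: number, baseMs = 2000): number {
  return baseMs * 2 ** (failureCount - 1);
}

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 5, // assumption: four waits imply up to five attempts
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // No wait after the final attempt.
      if (attempt < maxAttempts) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

Production implementations often add random jitter to each delay so that many clients do not retry at the same instant; that refinement is left out here.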
Compensating Transactions
In distributed transactions, when some operations fail, already completed operations must be rolled back. This is called a Compensating Transaction.
- Track completion status of each operation
- Check completed operations when failure occurs
- Cancel operations in reverse order
- Re-throw the error
Examples of compensating actions:
- Payment refund
- Inventory restoration
- Reservation cancellation
- File deletion
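
The rollback procedure can be sketched as a list of operations, each paired with its compensating action. The `Operation` interface and `runWithCompensation` function are hypothetical names for this illustration.

```typescript
interface Operation {
  name: string;
  run: () => Promise<void>;
  compensate: () => Promise<void>; // undoes the effect of `run`
}

async function runWithCompensation(operations: Operation[]): Promise<void> {
  const completed: Operation[] = []; // tracks which operations finished
  try {
    for (const op of operations) {
      await op.run();
      completed.push(op);
    }
  } catch (error) {
    // Undo the completed operations in reverse order.
    for (const op of completed.reverse()) {
      try {
        await op.compensate();
      } catch (compensationError) {
        // Design choice for this sketch: log and keep compensating the rest.
        console.error(`Compensation for "${op.name}" failed`, compensationError);
      }
    }
    throw error; // re-throw the original failure
  }
}
```

The operation that failed is not compensated, since it never completed. A failed compensation is only logged here; a real system would likely need to alert someone or retry it.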
Timeout Handling
If an external API doesn’t respond, the workflow could wait indefinitely. Set a timeout to fail after a certain duration.
- Short timeout (5-10 seconds): Fast APIs
- Medium timeout (30-60 seconds): Standard APIs
- Long timeout (5-10 minutes): File processing
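
A common way to add a timeout to any promise is to race it against a timer. The helper `withTimeout` and the `TimeoutError` class below are hypothetical names for this sketch.

```typescript
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Operation timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Rejects with TimeoutError if `promise` does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).then(
    (value) => {
      clearTimeout(timer);
      return value;
    },
    (error) => {
      clearTimeout(timer);
      throw error;
    },
  );
}
```

For example, `await withTimeout(callApi(), 10_000)` would apply a short timeout to a hypothetical `callApi`. Note that this only stops waiting: the underlying request keeps running unless it is cancelled separately.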
Error Type-Based Handling
Use different handling strategies based on error types. Network errors should be retried, but data validation errors should fail immediately.

| Error Type | Handling | Examples |
|---|---|---|
| Transient errors | Retry | Network, timeout, 503 |
| Permanent errors | Fail immediately | Validation, auth, 404 |
| Partial failures | Selective handling | Some data corrupted |
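
The first two rows of the table can be sketched as a classifier plus a retry loop that consults it. Everything here is an assumption made for illustration: the error shape (`status`, `code`), the specific codes treated as transient, and the choice to treat unknown errors as permanent.

```typescript
type ErrorKind = "transient" | "permanent";

// Assumed markers of transient failures; extend to match your own services.
const TRANSIENT_STATUS = new Set<number>([503]);
const TRANSIENT_CODES = new Set<string>(["ECONNRESET", "ETIMEDOUT"]);

function classify(error: unknown): ErrorKind {
  const e = (error ?? {}) as { status?: number; code?: string };
  if (e.status !== undefined && TRANSIENT_STATUS.has(e.status)) return "transient";
  if (e.code !== undefined && TRANSIENT_CODES.has(e.code)) return "transient";
  return "permanent"; // validation, auth, 404, and anything unrecognized
}

// Retries transient errors up to `maxAttempts`; permanent errors fail immediately.
async function runStep<T>(operation: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (classify(error) === "permanent" || attempt >= maxAttempts) throw error;
      // A real implementation would wait here before the next attempt.
    }
  }
}
```

Defaulting unknown errors to "permanent" fails fast on anything unexpected; the opposite default retries more aggressively and may suit flaky dependencies better.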
Practical Examples
1. Email Sending with Dead Letter Queue
If email sending fails 3 times, add it to a Dead Letter Queue for manual processing later.
- 3 retry failures
- Add to DLQ (process later)
- Notify administrator
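
A minimal sketch of this flow, using an in-memory array as a stand-in for a real queue. The types `EmailJob` and `DeadLetter`, the function `sendWithDlq`, and the injected `send` and `notifyAdmin` callbacks are all hypothetical.

```typescript
interface EmailJob {
  to: string;
  subject: string;
  body: string;
}

interface DeadLetter {
  job: EmailJob;
  reason: string;
  failedAt: Date;
}

// In-memory stand-in for a durable queue (assumption).
const deadLetterQueue: DeadLetter[] = [];

// Returns true if the email was sent, false if it ended up in the DLQ.
async function sendWithDlq(
  job: EmailJob,
  send: (job: EmailJob) => Promise<void>,
  notifyAdmin: (message: string) => Promise<void>,
  maxAttempts = 3,
): Promise<boolean> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await send(job);
      return true;
    } catch (error) {
      lastError = error; // a real implementation would wait between attempts
    }
  }
  const reason = lastError instanceof Error ? lastError.message : String(lastError);
  deadLetterQueue.push({ job, reason, failedAt: new Date() }); // process later
  await notifyAdmin(`Email to ${job.to} moved to DLQ: ${reason}`);
  return false;
}
```

The function returns instead of throwing after the final failure, so the workflow can continue while the message waits for manual processing.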
2. API Call with Circuit Breaker
When an external API keeps failing, protect the system by blocking requests with a Circuit Breaker.
- Closed: Normal operation
- Open: Blocked after 5 failures (for 1 minute)
- Half-Open: Retry after 1 minute
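
The three states can be sketched as a small class. The `CircuitBreaker` class below is a hypothetical illustration with the 5-failure threshold and 1-minute open period as defaults; the clock is injectable so the state transitions can be exercised without waiting.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly openMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 5, openMs = 60000, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.openMs = openMs;
    this.now = now;
  }

  getState(): BreakerState {
    // Once the open period has elapsed, allow a trial request.
    if (this.state === "open" && this.now() - this.openedAt >= this.openMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  async call<T>(operation: () => Promise<T>): Promise<T> {
    if (this.getState() === "open") {
      throw new Error("Circuit is open; request blocked");
    }
    try {
      const result = await operation();
      this.failures = 0; // any success closes the circuit again
      this.state = "closed";
      return result;
    } catch (error) {
      this.failures++;
      // A failed trial request, or too many failures, opens the circuit.
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

This sketch counts consecutive failures and does not limit how many trial requests run concurrently in the half-open state; both are simplifications.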
3. File Upload with Partial Retry
When uploading multiple files, retry each file independently so that some failures don’t prevent others from succeeding.
- Each file has independent steps
- 3 retries per file
- Workflow succeeds even if some fail
- Returns list of failed files
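
A minimal sketch of per-file retries. The function `uploadAll` and the injected `upload` callback are hypothetical, and "3 retries" is interpreted here as three attempts in total, which is an assumption.

```typescript
interface UploadResult {
  uploaded: string[];
  failed: string[];
}

// Uploads each file with its own retry budget; one file's failure never affects another.
async function uploadAll(
  files: string[],
  upload: (file: string) => Promise<void>,
  maxAttempts = 3,
): Promise<UploadResult> {
  const result: UploadResult = { uploaded: [], failed: [] };
  for (const file of files) {
    let succeeded = false;
    for (let attempt = 1; attempt <= maxAttempts && !succeeded; attempt++) {
      try {
        await upload(file);
        succeeded = true;
      } catch (error) {
        // Swallow the error: this file is retried, other files are unaffected.
      }
    }
    (succeeded ? result.uploaded : result.failed).push(file);
  }
  return result; // the workflow succeeds; the caller inspects `failed`
}
```

Files are processed one after another to keep the example short. Since the function never throws on upload errors, the caller decides what to do with the `failed` list, for example reporting it or scheduling another run.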