Retry Policy
When a webhook delivery fails, SeloraX automatically retries with an escalating backoff schedule. This page covers the retry schedule, what counts as a failure, delivery states, the dead letter queue, and how subscriptions are automatically deactivated after repeated failures.
Retry Schedule
Each webhook delivery gets up to 5 retries after the initial attempt, for a total of 6 delivery attempts. Retries use an escalating backoff schedule designed to handle both brief outages and longer downtime:
Attempt 1 --- Initial delivery (immediate)
|
| (1 minute)
v
Attempt 2 --- 1st retry
|
| (5 minutes)
v
Attempt 3 --- 2nd retry
|
| (30 minutes)
v
Attempt 4 --- 3rd retry
|
| (2 hours)
v
Attempt 5 --- 4th retry
|
| (12 hours)
v
Attempt 6 --- 5th retry (final)
| Attempt | Delay After Previous | Cumulative Time |
|---|---|---|
| 1 | Immediate | 0s |
| 2 | 1 minute | ~1 min |
| 3 | 5 minutes | ~6 min |
| 4 | 30 minutes | ~36 min |
| 5 | 2 hours | ~2.5 hr |
| 6 | 12 hours | ~14.5 hr |
Total delivery window: approximately 14.5 hours from the initial attempt to the final retry. This gives your endpoint ample time to recover from extended outages.
Each individual attempt has a 15-second timeout. If your endpoint does not respond with a status code within 15 seconds, the attempt is marked as failed.
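The cumulative times in the table follow directly from the backoff delays. A minimal Python sketch that reproduces them is shown here; the names `BACKOFF_DELAYS` and `attempt_offsets` are illustrative and not part of the SeloraX API, and the calculation ignores the time each request itself takes:

```python
from datetime import timedelta
from itertools import accumulate

# Delays between consecutive attempts, taken from the schedule above.
BACKOFF_DELAYS = [
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=12),
]


def attempt_offsets(delays=BACKOFF_DELAYS):
    """Return the earliest offset of each attempt from the initial delivery."""
    # initial=timedelta(0) represents attempt 1, which is sent immediately.
    return list(accumulate(delays, initial=timedelta(0)))


for number, offset in enumerate(attempt_offsets(), start=1):
    print(f"Attempt {number}: T+{int(offset.total_seconds())}s")
```

The last offset is 52,560 seconds, roughly 14.6 hours, which is in line with the approximate 14.5-hour window quoted above.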
What Counts as a Failure
A delivery attempt is considered failed if any of the following occur:
| Condition | Example |
|---|---|
| Non-2xx HTTP response | 400 Bad Request, 500 Internal Server Error, 403 Forbidden |
| Timeout | Your endpoint takes longer than 15 seconds to respond |
| Connection error | DNS resolution failure, connection refused, TLS handshake error |
A delivery attempt is considered successful if your endpoint returns any 2xx status code (200, 201, 202, 204, etc.) within the 15-second window.
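The success rule can be expressed as a small predicate. This is a sketch of the rule as described on this page, not SeloraX's actual implementation; the function name and the choice to represent "no HTTP response" as `None` are assumptions made for illustration:

```python
from typing import Optional

TIMEOUT_SECONDS = 15.0  # per-attempt timeout stated above


def attempt_succeeded(status_code: Optional[int], elapsed_seconds: float) -> bool:
    """Return True only for a 2xx status received within the timeout.

    status_code is None when no HTTP response arrived at all
    (DNS failure, connection refused, TLS handshake error, timeout).
    """
    if status_code is None:
        return False
    # Assumption: a response at exactly 15 seconds still counts as on time.
    if elapsed_seconds > TIMEOUT_SECONDS:
        return False
    return 200 <= status_code <= 299
```

Note that under this rule a redirect (3xx) counts as a failure just like a 4xx or 5xx response, since only 2xx codes are treated as success.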
Delivery States
Deliveries progress through the following states:
| Status | Meaning |
|---|---|
| delivered | A 2xx response was received. No further retries needed. |
| retrying | The delivery attempt failed but more retries are available. |
| failed | All 6 attempts exhausted without a successful delivery. The event is moved to the dead letter queue. |
State Transitions
Attempt 1 succeeds -> delivered
Attempt 1 fails -> retrying -> Attempt 2 succeeds -> delivered
                            -> Attempt 2 fails -> retrying -> ...
                                                           -> Attempt 6 fails -> failed (dead letter)
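The transitions above reduce to one decision per attempt. A minimal sketch follows, with a hypothetical function name and the status strings taken from the table:

```python
MAX_ATTEMPTS = 6  # 1 initial delivery + 5 retries


def next_status(attempt_number: int, succeeded: bool) -> str:
    """Return the delivery status after the given attempt completes."""
    if not 1 <= attempt_number <= MAX_ATTEMPTS:
        raise ValueError(f"attempt_number must be 1..{MAX_ATTEMPTS}")
    if succeeded:
        return "delivered"
    if attempt_number < MAX_ATTEMPTS:
        return "retrying"
    # The sixth failure is terminal: the event goes to the dead letter queue.
    return "failed"
```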
Dead Letter Queue
When all 6 delivery attempts are exhausted, the event enters the dead letter queue. This means:
- The delivery record is permanently marked as failed
- The event payload is preserved in the delivery log for 30 days
- The subscription's failure_count is incremented by 1 (once per failed event, not per attempt)
- No further automatic retries will be made for this specific event
You can view failed deliveries in the delivery logs and use manual retry to re-deliver them after fixing the underlying issue.
Failure Counting
The subscription failure_count only increments once per failed event delivery (after all 6 attempts are exhausted), not once per failed attempt. This prevents a single bad event from rapidly accumulating failures.
Delivery Logging
Every delivery attempt is logged with the following details:
| Field | Description |
|---|---|
| delivery_id | Unique identifier for this delivery record |
| subscription_id | The subscription this delivery belongs to |
| event_topic | The event topic (e.g., order.status_changed) |
| event_id | The unique event identifier |
| target_url | The URL the payload was sent to |
| response_status | HTTP status code returned (or null on connection error) |
| attempt_number | Which attempt this was (1-6) |
| status | delivered, retrying, or failed |
| duration_ms | How long the request took in milliseconds |
| error_message | Error description if the attempt failed |
| created_at | When the attempt was made |
| completed_at | When the delivery reached a terminal state |
Delivery logs are retained for 30 days and then automatically cleaned up.
Automatic Subscription Deactivation
To protect both the platform and your infrastructure, subscriptions are automatically deactivated after repeated failures:
- Threshold: 20 consecutive failed event deliveries (across different events, not retries of one event)
- Effect: The subscription's is_active flag is set to 0. No further deliveries will be attempted.
- Counter reset: The failure counter resets to 0 on any successful delivery. A single success clears all accumulated failures.
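The counting rules above can be modeled in a few lines. This is an illustrative sketch of the described behavior, not the platform's code; the `Subscription` class and `record_event_outcome` function are hypothetical, and `is_active` is modeled as a boolean rather than the 0/1 flag used by the API:

```python
from dataclasses import dataclass

DEACTIVATION_THRESHOLD = 20  # consecutive failed events


@dataclass
class Subscription:
    failure_count: int = 0
    is_active: bool = True


def record_event_outcome(sub: Subscription, delivered: bool) -> None:
    """Update the counter once per event, after its delivery reaches a terminal state."""
    if not sub.is_active:
        return  # deactivated subscriptions receive no deliveries
    if delivered:
        sub.failure_count = 0  # a single success clears accumulated failures
        return
    sub.failure_count += 1
    if sub.failure_count >= DEACTIVATION_THRESHOLD:
        sub.is_active = False
```

Calling `record_event_outcome` once per event, rather than once per attempt, mirrors the rule that a single event can contribute at most one failure regardless of how many of its attempts failed.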
Deactivated Subscription
When a subscription is auto-deactivated, events published during the inactive period are not queued. You will need to manually sync any missed data after reactivating the subscription.
How to Check Subscription Status
GET /api/apps/v1/webhooks
{
  "data": [
    {
      "subscription_id": 5,
      "event_topic": "order.status_changed",
      "target_url": "https://your-app.com/webhooks",
      "is_active": 0,
      "failure_count": 20,
      "last_failure_at": "2024-02-28T12:30:00.000Z",
      "last_success_at": "2024-02-27T08:15:00.000Z"
    }
  ]
}
How to Reactivate
If your subscription has been deactivated, fix the underlying issue (endpoint down, URL changed, certificate expired, etc.) and then update the subscription:
PUT /api/apps/v1/webhooks/:subscription_id
{
  "is_active": true
}
The failure counter is reset when the subscription is reactivated. The next delivery attempt will start fresh.
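The check-then-reactivate flow can be scripted. The sketch below only parses a listing response and builds the corresponding PUT requests; it does not send anything, because authentication and the host name are not covered on this page. The function name `reactivation_requests` is hypothetical:

```python
import json

BASE_PATH = "/api/apps/v1/webhooks"


def reactivation_requests(listing_body: str):
    """Return (method, path, json_body) tuples for each inactive subscription."""
    listing = json.loads(listing_body)
    pending = []
    for sub in listing["data"]:
        if not sub["is_active"]:  # the API reports inactive as 0
            path = f"{BASE_PATH}/{sub['subscription_id']}"
            pending.append(("PUT", path, json.dumps({"is_active": True})))
    return pending


sample = '{"data": [{"subscription_id": 5, "is_active": 0, "failure_count": 20}]}'
print(reactivation_requests(sample))
```

Fix the underlying endpoint problem before sending these requests; otherwise the subscription is likely to be deactivated again after another 20 consecutive failed events.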
Manual Retry (Admin)
Store administrators can manually retry a specific delivery through the admin API:
POST /api/apps/webhooks/deliveries/:delivery_id/retry
This re-publishes the original event through the Inngest queue, creating a new delivery pipeline with its own retry chain. The original delivery record is not modified.
Use cases:
- Your endpoint was temporarily down and you want to replay missed events
- A deployment issue caused failures that have since been fixed
- Debugging a specific event that failed to process
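Because a manual retry re-publishes the original event, your endpoint may receive an event it has already seen. One common way to handle this is to deduplicate on the event identifier. The sketch below assumes your handler has access to the event's `event_id` (the field shown in the delivery log); how that identifier is carried in the webhook payload or headers is not specified on this page, and the `IdempotentHandler` class is purely illustrative:

```python
from typing import Callable, Set


class IdempotentHandler:
    """Run the wrapped handler at most once per event_id (in-memory sketch)."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        self._seen: Set[str] = set()

    def handle(self, event_id: str, payload: dict) -> bool:
        """Return True if the event was processed, False if it was a duplicate."""
        if event_id in self._seen:
            return False
        self._handler(payload)
        # Record the id only after the handler succeeds, so an exception
        # leaves the event eligible for processing on the next delivery.
        self._seen.add(event_id)
        return True
```

An in-memory set is lost on restart, so a real deployment would more likely keep seen identifiers in a database or cache with an expiry.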
Timeline Example
Here is a complete timeline showing what happens when an event is published and the first few delivery attempts fail, followed by a successful attempt:
T+0.0s     Event published (order.status_changed)
T+0.1s     Fan-out: 1 active subscription found
T+0.2s     Attempt 1 -> POST https://your-app.com/webhooks
T+15.2s    Attempt 1 TIMEOUT (15s) -> status: retrying
           failure_count unchanged (retries remaining)
T+75.2s    Attempt 2 (retry after 1 min)
T+75.5s    Attempt 2 -> 502 Bad Gateway -> status: retrying
T+375.5s   Attempt 3 (retry after 5 min)
T+375.8s   Attempt 3 -> Connection refused -> status: retrying
T+2175.8s  Attempt 4 (retry after 30 min)
T+2176.1s  Attempt 4 -> 200 OK (300ms) -> status: delivered
           failure_count: 0 (reset on success)
If all attempts fail:
T+0.0s     Attempt 1 -> 500 Internal Server Error -> retrying
T+60s      Attempt 2 -> 500 Internal Server Error -> retrying
T+360s     Attempt 3 -> Timeout -> retrying
T+2160s    Attempt 4 -> Connection refused -> retrying
T+9360s    Attempt 5 -> 503 Service Unavailable -> retrying
T+52560s   Attempt 6 (FINAL) -> 500 Internal Server Error -> failed
           Event enters dead letter queue
           failure_count +1 (now 1 out of 20 threshold)
Summary
| Parameter | Value |
|---|---|
| Max retries | 5 (6 total attempts) |
| Backoff delays | 1min, 5min, 30min, 2hr, 12hr |
| Timeout per attempt | 15 seconds |
| Total delivery window | ~14.5 hours |
| Auto-disable threshold | 20 consecutive failed events |
| Failure counter reset | On any successful delivery |
| Log retention | 30 days |