Retry Policy

When a webhook delivery fails, SeloraX automatically retries with an escalating backoff schedule. This page covers the retry schedule, what counts as a failure, delivery states, the dead letter queue, and how subscriptions are automatically deactivated after repeated failures.

Retry Schedule

Each webhook delivery gets up to 5 retries after the initial attempt, for a total of 6 delivery attempts. Retries use an escalating backoff schedule designed to handle both brief outages and longer downtime:

Attempt 1 --- Initial delivery (immediate)
    |
    |  (1 minute)
    v
Attempt 2 --- 1st retry
    |
    |  (5 minutes)
    v
Attempt 3 --- 2nd retry
    |
    |  (30 minutes)
    v
Attempt 4 --- 3rd retry
    |
    |  (2 hours)
    v
Attempt 5 --- 4th retry
    |
    |  (12 hours)
    v
Attempt 6 --- 5th retry (final)

Attempt  Delay After Previous  Cumulative Time
1        Immediate             0s
2        1 minute              ~1 min
3        5 minutes             ~6 min
4        30 minutes            ~36 min
5        2 hours               ~2.5 hr
6        12 hours              ~14.5 hr

Total delivery window: approximately 14.5 hours from the initial attempt to the final retry. This gives your endpoint ample time to recover from extended outages.
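
The cumulative times follow directly from summing the delays. The sketch below reproduces that arithmetic; the names RETRY_DELAYS and attempt_offsets are illustrative, not part of any SeloraX API, and the offsets ignore the time each attempt itself takes (up to the 15-second timeout).

```python
from datetime import timedelta
from itertools import accumulate

# Delays before attempts 2 through 6, taken from the schedule above.
RETRY_DELAYS = [
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=12),
]


def attempt_offsets():
    """Return the scheduled offset of each attempt from the initial delivery."""
    # initial=timedelta(0) adds attempt 1, which is sent immediately.
    return list(accumulate(RETRY_DELAYS, initial=timedelta(0)))


for number, offset in enumerate(attempt_offsets(), start=1):
    print(f"Attempt {number}: T+{int(offset.total_seconds())}s")
```

The last offset is 52560 seconds (14 hours 36 minutes), which is the delivery window quoted above.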

Each individual attempt has a 15-second timeout. If your endpoint does not respond with a status code within 15 seconds, the attempt is marked as failed.


What Counts as a Failure

A delivery attempt is considered failed if any of the following occur:

Condition              Example
Non-2xx HTTP response  400 Bad Request, 500 Internal Server Error, 403 Forbidden
Timeout                Your endpoint takes longer than 15 seconds to respond
Connection error       DNS resolution failure, connection refused, TLS handshake error

A delivery attempt is considered successful if your endpoint returns any 2xx status code (200, 201, 202, 204, etc.) within the 15-second window.
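
As a rough sketch, the rules above reduce to one predicate. The function name and its parameters are hypothetical; a status_code of None stands in for any attempt that produced no HTTP response (DNS failure, connection refused, TLS error).

```python
from typing import Optional

TIMEOUT_SECONDS = 15.0  # per-attempt timeout stated above


def is_successful(status_code: Optional[int], elapsed_seconds: float) -> bool:
    """Classify one delivery attempt using the documented failure conditions."""
    if status_code is None:
        # No HTTP response at all: connection error.
        return False
    if elapsed_seconds > TIMEOUT_SECONDS:
        # The endpoint took longer than the timeout to respond.
        return False
    # Any 2xx status counts as success; everything else is a failure.
    return 200 <= status_code <= 299


print(is_successful(204, 0.3))   # 2xx within the window
print(is_successful(403, 0.1))   # non-2xx response
print(is_successful(None, 0.0))  # connection error
```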


Delivery States

Deliveries progress through the following states:

Status     Meaning
delivered  A 2xx response was received. No further retries needed.
retrying   The delivery attempt failed but more retries are available.
failed     All 6 attempts exhausted without a successful delivery. The event is moved to the dead letter queue.

State Transitions

Attempt 1 succeeds  ->  delivered
Attempt 1 fails     ->  retrying  ->  Attempt 2 succeeds  ->  delivered
                                  ->  Attempt 2 fails     ->  retrying  ->  ...
                                                                        ->  Attempt 6 fails  ->  failed (dead letter)
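
The transitions can be read as a function of the attempt number and its outcome. This is a minimal sketch of that reading, not SeloraX's implementation; state_after_attempt is a hypothetical name.

```python
MAX_ATTEMPTS = 6  # 1 initial delivery + 5 retries


def state_after_attempt(attempt_number: int, succeeded: bool) -> str:
    """Return the delivery status once the given attempt has finished."""
    if not 1 <= attempt_number <= MAX_ATTEMPTS:
        raise ValueError(f"attempt_number must be between 1 and {MAX_ATTEMPTS}")
    if succeeded:
        return "delivered"
    if attempt_number < MAX_ATTEMPTS:
        return "retrying"
    # Final attempt failed: the event goes to the dead letter queue.
    return "failed"


print(state_after_attempt(1, succeeded=False))
print(state_after_attempt(4, succeeded=True))
print(state_after_attempt(6, succeeded=False))
```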

Dead Letter Queue

When all 6 delivery attempts are exhausted, the event enters the dead letter queue. This means:

  • The delivery record is permanently marked as failed
  • The event payload is preserved in the delivery log for 30 days
  • The subscription's failure_count is incremented by 1 (once per failed event, not per attempt)
  • No further automatic retries will be made for this specific event

You can view failed deliveries in the delivery logs and use manual retry to re-deliver them after fixing the underlying issue.

Failure Counting

The subscription failure_count only increments once per failed event delivery (after all 6 attempts are exhausted), not once per failed attempt. This prevents a single bad event from rapidly accumulating failures.


Delivery Logging

Every delivery attempt is logged with the following details:

Field            Description
delivery_id      Unique identifier for this delivery record
subscription_id  The subscription this delivery belongs to
event_topic      The event topic (e.g., order.status_changed)
event_id         The unique event identifier
target_url       The URL the payload was sent to
response_status  HTTP status code returned (or null on connection error)
attempt_number   Which attempt this was (1-6)
status           delivered, retrying, or failed
duration_ms      How long the request took in milliseconds
error_message    Error description if the attempt failed
created_at       When the attempt was made
completed_at     When the delivery reached a terminal state

Delivery logs are retained for 30 days and then automatically cleaned up.
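
If you export delivery logs for analysis, a small filter can pick out the deliveries that ended in the dead letter queue. The sketch below assumes each log entry is a dict keyed by the field names above; the sample records and their delivery_id values are made up for illustration, and the actual export format may differ.

```python
def dead_lettered_ids(records):
    """Return the delivery_id values whose log shows the terminal 'failed' status."""
    ids = []
    for record in records:
        if record["status"] == "failed" and record["delivery_id"] not in ids:
            ids.append(record["delivery_id"])
    return ids


# Illustrative records only; real entries carry every field listed above.
sample_log = [
    {"delivery_id": 101, "attempt_number": 5, "status": "retrying"},
    {"delivery_id": 101, "attempt_number": 6, "status": "failed"},
    {"delivery_id": 102, "attempt_number": 1, "status": "delivered"},
]

print(dead_lettered_ids(sample_log))
```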


Automatic Subscription Deactivation

To protect both the platform and your infrastructure, subscriptions are automatically deactivated after repeated failures:

  • Threshold: 20 consecutive failed event deliveries (across different events, not retries of one event)
  • Effect: The subscription's is_active flag is set to 0. No further deliveries will be attempted.
  • Counter reset: The failure counter resets to 0 on any successful delivery. A single success clears all accumulated failures.
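
The counter rules can be modeled in a few lines. This is a simplified sketch under the rules listed above; the Subscription class and record_event_outcome are hypothetical names, and the function is meant to be called once per event, after its final attempt.

```python
from dataclasses import dataclass

DEACTIVATION_THRESHOLD = 20  # consecutive failed events


@dataclass
class Subscription:
    failure_count: int = 0
    is_active: bool = True


def record_event_outcome(sub: Subscription, delivered: bool) -> None:
    """Apply the final outcome of one event (after all retries) to the subscription."""
    if not sub.is_active:
        return  # deactivated subscriptions receive no deliveries
    if delivered:
        sub.failure_count = 0  # a single success clears all accumulated failures
        return
    sub.failure_count += 1
    if sub.failure_count >= DEACTIVATION_THRESHOLD:
        sub.is_active = False


sub = Subscription()
for _ in range(19):
    record_event_outcome(sub, delivered=False)
record_event_outcome(sub, delivered=True)
print(sub)  # counter back at 0, still active
```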

Deactivated Subscription

When a subscription is auto-deactivated, events published during the inactive period are not queued. You will need to manually sync any missed data after reactivating the subscription.

How to Check Subscription Status

GET /api/apps/v1/webhooks

Example response:

{
  "data": [
    {
      "subscription_id": 5,
      "event_topic": "order.status_changed",
      "target_url": "https://your-app.com/webhooks",
      "is_active": 0,
      "failure_count": 20,
      "last_failure_at": "2024-02-28T12:30:00.000Z",
      "last_success_at": "2024-02-27T08:15:00.000Z"
    }
  ]
}
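
A monitoring script could scan that response for deactivated subscriptions. The sketch below only parses a response body shaped like the example above; fetching it (base URL, authentication) is left out because those details are not covered on this page, and inactive_subscriptions is a hypothetical helper.

```python
import json


def inactive_subscriptions(response_body: str):
    """Return the subscriptions in a webhook list response that are not active."""
    payload = json.loads(response_body)
    # `not` handles is_active given as 0/1 or as a JSON boolean.
    return [sub for sub in payload["data"] if not sub["is_active"]]


# Response body copied from the example above.
sample_response = """
{
  "data": [
    {
      "subscription_id": 5,
      "event_topic": "order.status_changed",
      "target_url": "https://your-app.com/webhooks",
      "is_active": 0,
      "failure_count": 20,
      "last_failure_at": "2024-02-28T12:30:00.000Z",
      "last_success_at": "2024-02-27T08:15:00.000Z"
    }
  ]
}
"""

for sub in inactive_subscriptions(sample_response):
    print(sub["subscription_id"], sub["event_topic"], sub["failure_count"])
```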

How to Reactivate

If your subscription has been deactivated, fix the underlying issue (endpoint down, URL changed, certificate expired, etc.) and then update the subscription:

PUT /api/apps/v1/webhooks/:subscription_id
{
  "is_active": true
}

Reactivating the subscription resets its failure counter to 0, so deliveries for new events start with a fresh count toward the deactivation threshold.
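
The reactivation call can be prepared with Python's standard library as sketched below. The base URL is a placeholder and build_reactivation_request is a hypothetical helper; authentication is not described on this page, so any required headers are passed in by the caller rather than assumed here.

```python
import json
import urllib.request


def build_reactivation_request(base_url, subscription_id, extra_headers=None):
    """Build (but do not send) the PUT request that reactivates a subscription."""
    url = f"{base_url.rstrip('/')}/api/apps/v1/webhooks/{subscription_id}"
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})  # e.g. whatever auth header your app uses
    body = json.dumps({"is_active": True}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder host; substitute the real API base URL.
request = build_reactivation_request("https://api.example.com", 5)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```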


Manual Retry (Admin)

Store administrators can manually retry a specific delivery through the admin API:

POST /api/apps/webhooks/deliveries/:delivery_id/retry

This re-publishes the original event through the Inngest queue, creating a new delivery pipeline with its own retry chain. The original delivery record is not modified.

Use cases:

  • Your endpoint was temporarily down and you want to replay missed events
  • A deployment issue caused failures that have since been fixed
  • Debugging a specific event that failed to process

Timeline Example

Here is a complete timeline showing what happens when an event is published and the first few delivery attempts fail, followed by a successful attempt:

T+0.0s     Event published (order.status_changed)
T+0.1s     Fan-out: 1 active subscription found
T+0.2s     Attempt 1 -> POST https://your-app.com/webhooks
T+15.2s    Attempt 1 TIMEOUT (15s) -> status: retrying
           failure_count unchanged (retries remaining)

T+75.2s    Attempt 2 (retry after 1 min)
T+75.5s    Attempt 2 -> 502 Bad Gateway -> status: retrying

T+375.5s   Attempt 3 (retry after 5 min)
T+375.8s   Attempt 3 -> Connection refused -> status: retrying

T+2175.8s  Attempt 4 (retry after 30 min)
T+2176.1s  Attempt 4 -> 200 OK (300ms) -> status: delivered
           failure_count: 0 (reset on success)

If all attempts fail:

T+0.0s      Attempt 1 -> 500 Internal Server Error -> retrying
T+60s       Attempt 2 -> 500 Internal Server Error -> retrying
T+360s      Attempt 3 -> Timeout -> retrying
T+2160s     Attempt 4 -> Connection refused -> retrying
T+9360s     Attempt 5 -> 503 Service Unavailable -> retrying
T+52560s    Attempt 6 (FINAL) -> 500 Internal Server Error -> failed
            Event enters dead letter queue
            failure_count +1 (now 1 of the 20 consecutive failures needed to deactivate)

Summary

Parameter               Value
Max retries             5 (6 total attempts)
Backoff delays          1 min, 5 min, 30 min, 2 hr, 12 hr
Timeout per attempt     15 seconds
Total delivery window   ~14.5 hours
Auto-disable threshold  20 consecutive failed events
Failure counter reset   On any successful delivery
Log retention           30 days