Webhook Events & Responses
This section describes the structure of webhook events, the event types your system can receive, and how to respond to and process them.
Event Structure
All webhook events follow a consistent JSON structure:
{
  "event": "event.name",
  "timestamp": "2024-12-09T10:30:00Z",
  "data": {
    // Event-specific data
  }
}
Common Fields
| Field | Type | Description |
|---|---|---|
| event | string | Event type (e.g., crawl.completed) |
| timestamp | string | ISO 8601 timestamp when event occurred |
| data | object | Event-specific payload data |
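As a rough illustration of the common fields above, a receiver could check the envelope before dispatching on the event type. This is a minimal sketch, not part of the API: `parseEvent` and its return shape are hypothetical, and it assumes only the three documented fields.

```javascript
// Hypothetical helper: validates the three common envelope fields.
function parseEvent(rawBody) {
  let body;
  try {
    body = JSON.parse(rawBody);
  } catch (err) {
    return { ok: false, error: 'Body is not valid JSON' };
  }
  if (body === null || typeof body !== 'object' || Array.isArray(body)) {
    return { ok: false, error: 'Body must be a JSON object' };
  }

  const { event, timestamp, data } = body;
  if (typeof event !== 'string' || event.length === 0) {
    return { ok: false, error: 'Missing or invalid "event"' };
  }
  // Date.parse returns NaN for strings it cannot interpret as a date.
  if (typeof timestamp !== 'string' || Number.isNaN(Date.parse(timestamp))) {
    return { ok: false, error: 'Missing or invalid "timestamp"' };
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return { ok: false, error: 'Missing or invalid "data"' };
  }

  return { ok: true, event, occurredAt: new Date(timestamp), data };
}
```

A handler could return 400 Bad Request whenever `ok` is false and dispatch on `event` otherwise.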
HTTP Headers
Every webhook request includes these headers:
Content-Type: application/json
X-LinkHealth-Signature: t=1702123456,v1=abc123...
X-LinkHealth-Event: crawl.completed
User-Agent: LinkHealthMonitor-Webhook/1.0
| Header | Description |
|---|---|
| X-LinkHealth-Signature | HMAC-SHA256 signature for verification |
| X-LinkHealth-Event | Event type (same as event in body) |
| User-Agent | Identifies requests from our system |
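The signature header shown above has the form `t=<timestamp>,v1=<signature>`. The sketch below shows one way such a header might be parsed and checked. The parsing follows the documented header format, but the signed-payload layout is an assumption: this page does not specify it, so the example guesses that `v1` is the hex HMAC-SHA256 of `<t>.<raw body>`. Confirm the real scheme in the Security Guide before relying on it.

```javascript
const crypto = require('node:crypto');

// Splits "t=1702123456,v1=abc123" into { t: '1702123456', v1: 'abc123' }.
function parseSignatureHeader(header) {
  const parts = {};
  for (const piece of String(header || '').split(',')) {
    const idx = piece.indexOf('=');
    if (idx > 0) {
      parts[piece.slice(0, idx).trim()] = piece.slice(idx + 1).trim();
    }
  }
  return parts;
}

// ASSUMPTION: v1 is the hex HMAC-SHA256 of `${t}.${rawBody}`, keyed with the
// webhook secret. The actual signed-payload format is defined in the
// Security Guide and may differ.
function verifySignature(rawBody, header, secret) {
  const { t, v1 } = parseSignatureHeader(header);
  if (!t || !v1) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${t}.${rawBody}`)
    .digest('hex');

  const a = Buffer.from(expected, 'utf8');
  const b = Buffer.from(v1, 'utf8');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

Whatever the exact scheme, the comparison should run against the raw request bytes rather than a re-serialized object, and should use a constant-time comparison such as `crypto.timingSafeEqual`.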
Event Types
1. Crawl Started
Event: crawl.started
When: A new crawl has been initiated and is beginning to process URLs.
Payload:
{
  "event": "crawl.started",
  "timestamp": "2024-12-09T10:30:00Z",
  "data": {
    "crawl_id": "550e8400-e29b-41d4-a716-446655440000",
    "domain": "example.com",
    "domain_id": "123e4567-e89b-12d3-a456-426614174000",
    "url": "https://example.com",
    "max_pages": 500,
    "depth_limit": 3,
    "started_at": "2024-12-09T10:30:00Z"
  }
}
Data Fields:
| Field | Type | Description |
|---|---|---|
| crawl_id | string (UUID) | Unique identifier for this crawl |
| domain | string | Domain being crawled |
| domain_id | string (UUID) | Domain identifier |
| url | string | Starting URL for the crawl |
| max_pages | number | Maximum pages to crawl (based on plan) |
| depth_limit | number | Maximum crawl depth |
| started_at | string (ISO 8601) | When the crawl started |
Example Use Cases:
- Send notification: "Crawl started for example.com"
- Update dashboard status to "In Progress"
- Log crawl start in your database
2. Crawl Completed
Event: crawl.completed
When: A crawl has finished successfully and all results are available.
Payload:
{
  "event": "crawl.completed",
  "timestamp": "2024-12-09T10:35:00Z",
  "data": {
    "crawl_id": "550e8400-e29b-41d4-a716-446655440000",
    "domain": "example.com",
    "domain_id": "123e4567-e89b-12d3-a456-426614174000",
    "url": "https://example.com",
    "started_at": "2024-12-09T10:30:00Z",
    "completed_at": "2024-12-09T10:35:00Z",
    "duration_seconds": 300,
    "stats": {
      "total_pages": 247,
      "pages_crawled": 247,
      "total_links": 1523,
      "broken_links": 12,
      "external_links": 89,
      "internal_links": 1434,
      "redirects": 5,
      "errors": 12
    }
  }
}
Data Fields:
| Field | Type | Description |
|---|---|---|
| crawl_id | string (UUID) | Unique identifier for this crawl |
| domain | string | Domain that was crawled |
| domain_id | string (UUID) | Domain identifier |
| url | string | Starting URL |
| started_at | string (ISO 8601) | When crawl started |
| completed_at | string (ISO 8601) | When crawl completed |
| duration_seconds | number | Total crawl duration in seconds |
| stats.total_pages | number | Total pages found |
| stats.pages_crawled | number | Pages successfully crawled |
| stats.total_links | number | Total links discovered |
| stats.broken_links | number | Number of broken links found |
| stats.external_links | number | External links |
| stats.internal_links | number | Internal links |
| stats.redirects | number | Redirect chains found |
| stats.errors | number | HTTP errors (4xx, 5xx) |
Example Use Cases:
- Send success notification with stats
- Update database with crawl results
- Trigger report generation
- Send Slack message: "✅ Crawl complete: 12 broken links found"
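For the notification use cases above, the `stats` object can be condensed into a one-line summary. The following is a minimal sketch: `summarizeCrawl` is a hypothetical helper, and the broken-link percentage is a derived value that is not part of the payload.

```javascript
// Hypothetical helper: builds a one-line summary from a crawl.completed payload.
function summarizeCrawl(data) {
  const { domain, duration_seconds: seconds, stats } = data;

  // Derived metric (not in the payload): share of discovered links that are broken.
  const rate = stats.total_links > 0
    ? (stats.broken_links / stats.total_links) * 100
    : 0;

  const minutes = Math.floor(seconds / 60);
  const rest = seconds % 60;

  return (
    `✅ Crawl complete for ${domain}: ${stats.broken_links} broken links ` +
    `out of ${stats.total_links} (${rate.toFixed(1)}%), ` +
    `${stats.pages_crawled}/${stats.total_pages} pages in ${minutes}m ${rest}s`
  );
}
```

The returned string can be used as the text of a chat message or an email subject line.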
3. Crawl Failed
Event: crawl.failed
When: A crawl encountered an error and could not complete.
Payload:
{
  "event": "crawl.failed",
  "timestamp": "2024-12-09T10:32:00Z",
  "data": {
    "crawl_id": "550e8400-e29b-41d4-a716-446655440000",
    "domain": "example.com",
    "domain_id": "123e4567-e89b-12d3-a456-426614174000",
    "url": "https://example.com",
    "started_at": "2024-12-09T10:30:00Z",
    "failed_at": "2024-12-09T10:32:00Z",
    "error": "Connection timeout",
    "error_code": "TIMEOUT",
    "pages_crawled": 45
  }
}
Data Fields:
| Field | Type | Description |
|---|---|---|
| crawl_id | string (UUID) | Unique identifier for this crawl |
| domain | string | Domain being crawled |
| domain_id | string (UUID) | Domain identifier |
| url | string | Starting URL |
| started_at | string (ISO 8601) | When crawl started |
| failed_at | string (ISO 8601) | When crawl failed |
| error | string | Human-readable error message |
| error_code | string | Machine-readable error code |
| pages_crawled | number | Pages crawled before failure |
Common Error Codes:
| Code | Description |
|---|---|
| TIMEOUT | Crawl exceeded time limit |
| DNS_ERROR | Domain not found or DNS resolution failed |
| CONNECTION_REFUSED | Server refused connection |
| SSL_ERROR | SSL/TLS certificate error |
| ROBOT_BLOCKED | Blocked by robots.txt |
| RATE_LIMIT | Rate limited by server |
| UNKNOWN_ERROR | Unexpected error occurred |
Example Use Cases:
- Send alert notification
- Log error for debugging
- Trigger automatic retry logic
- Update status to "Failed" in dashboard
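If you build automatic retry logic on top of `crawl.failed`, it usually helps to separate errors that may clear up on their own from those that need human attention. The sketch below is one possible policy, not a documented rule. The set of transient codes, the attempt limit, and the `planFailureAction` helper are all assumptions made for illustration.

```javascript
// ASSUMPTION: which codes count as transient is a policy choice for this sketch.
const TRANSIENT_CODES = new Set(['TIMEOUT', 'CONNECTION_REFUSED', 'RATE_LIMIT']);

// Hypothetical helper: decides whether to re-trigger a crawl or alert a human.
// `attemptsSoFar` is the number of retries your own system has already made.
function planFailureAction(data, attemptsSoFar, maxAttempts = 3) {
  if (TRANSIENT_CODES.has(data.error_code) && attemptsSoFar < maxAttempts) {
    return { action: 'retry', reason: `${data.error_code} is treated as transient` };
  }
  return { action: 'alert', reason: `${data.error_code}: ${data.error}` };
}
```

Codes such as ROBOT_BLOCKED or SSL_ERROR generally point to a configuration problem on the target site, so this sketch alerts on them rather than retrying.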
4. Broken Links Found
Event: broken_links.found
When: A completed crawl detected broken links on the site.
Payload:
{
  "event": "broken_links.found",
  "timestamp": "2024-12-09T10:35:00Z",
  "data": {
    "crawl_id": "550e8400-e29b-41d4-a716-446655440000",
    "domain": "example.com",
    "domain_id": "123e4567-e89b-12d3-a456-426614174000",
    "broken_link_count": 12,
    "total_pages": 247,
    "severity": "medium",
    "top_broken_links": [
      {
        "url": "https://example.com/missing-page",
        "status_code": 404,
        "found_on_pages": 3
      },
      {
        "url": "https://example.com/old-resource",
        "status_code": 410,
        "found_on_pages": 1
      }
    ]
  }
}
Data Fields:
| Field | Type | Description |
|---|---|---|
| crawl_id | string (UUID) | Unique identifier for this crawl |
| domain | string | Domain with broken links |
| domain_id | string (UUID) | Domain identifier |
| broken_link_count | number | Total broken links found |
| total_pages | number | Total pages crawled |
| severity | string | low, medium, or high |
| top_broken_links | array | Up to 5 most common broken links |
Severity Levels:
| Severity | Criteria |
|---|---|
| low | 1-5 broken links |
| medium | 6-20 broken links |
| high | 21+ broken links |
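The thresholds in the table above can be expressed as a small function, which is handy for unit tests or for recomputing severity from `broken_link_count` in your own reports. This is a sketch of the documented ranges only: `severityForCount` is a hypothetical name, and the payload's `severity` field remains the authoritative value.

```javascript
// Hypothetical helper mirroring the severity table: 1-5 low, 6-20 medium, 21+ high.
function severityForCount(brokenLinkCount) {
  // The table starts at 1, so anything else has no severity here.
  if (!Number.isInteger(brokenLinkCount) || brokenLinkCount < 1) return null;
  if (brokenLinkCount <= 5) return 'low';
  if (brokenLinkCount <= 20) return 'medium';
  return 'high';
}
```

With the sample payload above, a `broken_link_count` of 12 falls in the medium range, matching the `severity` value shown.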
Example Use Cases:
- Send urgent alert for high severity
- Create task in project management tool
- Send email with broken link report
- Trigger automated fix workflow
Responding to Webhooks
Success Response
Return a 2xx status code to indicate successful receipt:
// Node.js/Express
res.status(200).json({ received: true });

# Python/Flask
return jsonify({'received': True}), 200

// PHP
http_response_code(200);
echo json_encode(['received' => true]);
Recommended: Return 200 OK as quickly as possible, then process the event asynchronously.
Response Timeout
Your endpoint must respond within 30 seconds; otherwise the request times out and is retried.
Best Practice:
app.post('/webhook', async (req, res) => {
  // 1. Verify the signature against the raw request body
  //    (req.rawBody is captured by the express.json verify hook
  //    shown in the complete handler example)
  const signature = req.headers['x-linkhealth-signature'];
  if (!verifySignature(req.rawBody, signature, process.env.WEBHOOK_SECRET)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }

  // 2. Return success immediately
  res.status(200).json({ received: true });

  // 3. Process asynchronously
  processWebhookAsync(req.body).catch(err => {
    console.error('Webhook processing error:', err);
  });
});
Error Responses
Return appropriate status codes for different error scenarios:
| Status Code | When to Use |
|---|---|
| 200 OK | Successfully received and processed |
| 401 Unauthorized | Invalid signature |
| 400 Bad Request | Malformed payload |
| 500 Internal Server Error | Processing error (will retry) |
| 503 Service Unavailable | Temporarily unavailable (will retry) |
Example Error Handling:
app.post('/webhook', async (req, res) => {
  try {
    // Verify the signature against the raw request body
    // (req.rawBody is captured by the express.json verify hook
    // shown in the complete handler example)
    const signature = req.headers['x-linkhealth-signature'];
    if (!verifySignature(req.rawBody, signature, process.env.WEBHOOK_SECRET)) {
      return res.status(401).json({ error: 'Invalid signature' });
    }

    // Validate payload
    if (!req.body.event || !req.body.data) {
      return res.status(400).json({ error: 'Invalid payload' });
    }

    // Process event
    await processEvent(req.body);

    return res.status(200).json({ received: true });
  } catch (error) {
    console.error('Webhook error:', error);
    return res.status(500).json({ error: 'Internal error' });
  }
});
Handling Failures
Automatic Retries
If your endpoint returns a non-2xx status code or times out, we'll automatically retry:
Retry Schedule:
- First failure → Retry after 1 minute
- Second failure → Retry after 5 minutes
- Third failure → Retry after 15 minutes
- Final failure → Delivery is marked as failed and no further retries are made
Exponential Backoff: Each retry waits longer than the previous one.
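To get a feel for how long a delivery can stay in flight, the schedule above can be turned into concrete times. This sketch assumes each delay is measured from the previous failed attempt and that each attempt fails instantly. Neither detail is stated here, so treat the result as an approximation. `retryTimes` is a hypothetical helper.

```javascript
// Delays from the retry schedule above, in minutes.
const RETRY_DELAYS_MINUTES = [1, 5, 15];

// ASSUMPTION: each delay counts from the previous failed attempt.
// Returns the ISO 8601 times at which the three retries would be sent.
function retryTimes(firstFailureAt) {
  let cursor = new Date(firstFailureAt).getTime();
  return RETRY_DELAYS_MINUTES.map((minutes) => {
    cursor += minutes * 60 * 1000;
    return new Date(cursor).toISOString();
  });
}
```

Under these assumptions, a delivery that first fails at 10:30:00 is retried at 10:31, 10:36, and 10:51, so the same event can arrive up to about 21 minutes after the original attempt. Any deduplication records should be kept at least that long.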
Idempotency
Webhooks may be delivered multiple times (due to retries or network issues). Make your webhook handler idempotent.
Bad Example (Not Idempotent):
// ❌ This will create duplicates on retry
await db.insert('crawls', { id: data.crawl_id, status: 'completed' });
Good Example (Idempotent):
// ✅ This handles duplicates gracefully
await db.upsert('crawls', { id: data.crawl_id, status: 'completed' }, { conflict: 'id' });
Using Deduplication:
// In-memory only: entries are lost on restart and are not shared between instances
const processedEvents = new Set();

app.post('/webhook', async (req, res) => {
  // Several event types share one crawl_id, so key on event name + crawl_id
  const eventId = `${req.body.event}:${req.body.data.crawl_id}`;

  // Check if already processed
  if (processedEvents.has(eventId)) {
    console.log('Duplicate event, skipping');
    return res.status(200).json({ received: true, duplicate: true });
  }

  // Process event
  await processEvent(req.body);

  // Mark as processed
  processedEvents.add(eventId);

  return res.status(200).json({ received: true });
});
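Two practical details are worth keeping in mind with in-memory deduplication. Several event types (for example crawl.started and crawl.completed) carry the same `crawl_id`, so the key usually needs to include the event name, and a plain Set grows without bound. The sketch below shows one way to handle both with a time-limited store. The `SeenEvents` class, its key format, and the one-hour default are illustrative assumptions, not part of the API.

```javascript
// Hypothetical time-limited deduplication store.
class SeenEvents {
  // `now` is injectable so the expiry logic can be tested without waiting.
  constructor(ttlMs = 60 * 60 * 1000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.expiries = new Map(); // key -> expiry time in ms
  }

  // Returns true the first time a key is seen within the TTL, false for repeats.
  markIfNew(event, data) {
    const key = `${event}:${data.crawl_id}`;
    const current = this.now();

    // Drop expired entries so the map does not grow without bound.
    for (const [k, expiresAt] of this.expiries) {
      if (expiresAt <= current) this.expiries.delete(k);
    }

    if (this.expiries.has(key)) return false;
    this.expiries.set(key, current + this.ttlMs);
    return true;
  }
}
```

A handler would call `markIfNew(req.body.event, req.body.data)` and skip processing when it returns false. With more than one server instance, the same idea applies, but the store would need to live in a shared database or cache.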
Debugging Failed Deliveries
View delivery logs in the dashboard:
- Go to Settings → API → Webhooks
- Click on your webhook
- View Recent Deliveries
Each delivery shows:
- ✅ Status code (200, 401, 500, etc.)
- ⏱️ Response time in milliseconds
- 📄 Response body (truncated to 1000 chars)
- ❌ Error message (if failed)
- 🔄 Retry count
Processing Examples
Example 1: Send Slack Notification
async function handleCrawlCompleted(data) {
  const { domain, stats } = data;

  const message = {
    text: `Crawl completed for ${domain}`,
    blocks: [
      {
        type: 'section',
        text: { type: 'mrkdwn', text: `✅ *Crawl Complete: ${domain}*` }
      },
      {
        type: 'section',
        fields: [
          { type: 'mrkdwn', text: `*Pages:* ${stats.total_pages}` },
          { type: 'mrkdwn', text: `*Broken Links:* ${stats.broken_links}` },
          { type: 'mrkdwn', text: `*Duration:* ${Math.round(data.duration_seconds / 60)}m` }
        ]
      }
    ]
  };

  await fetch(process.env.SLACK_WEBHOOK_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(message)
  });
}
Example 2: Update Database
async function handleCrawlEvent(event, data) {
  switch (event) {
    case 'crawl.started':
      // ON CONFLICT keeps a redelivered event from failing on a duplicate key
      // (PostgreSQL syntax; assumes id is the primary key)
      await db.query(
        'INSERT INTO crawls (id, domain, status, started_at) VALUES ($1, $2, $3, $4) ' +
          'ON CONFLICT (id) DO NOTHING',
        [data.crawl_id, data.domain, 'running', data.started_at]
      );
      break;

    case 'crawl.completed':
      await db.query(
        'UPDATE crawls SET status = $1, completed_at = $2, stats = $3 WHERE id = $4',
        ['completed', data.completed_at, JSON.stringify(data.stats), data.crawl_id]
      );
      break;

    case 'crawl.failed':
      await db.query(
        'UPDATE crawls SET status = $1, error = $2 WHERE id = $3',
        ['failed', data.error, data.crawl_id]
      );
      break;
  }
}
Example 3: Trigger Automated Actions
async function handleBrokenLinks(data) {
  // High severity = create urgent ticket
  if (data.severity === 'high') {
    await createJiraTicket({
      title: `URGENT: ${data.broken_link_count} broken links on ${data.domain}`,
      priority: 'High',
      description:
        `Crawl found ${data.broken_link_count} broken links.\n\n` +
        `Top issues:\n${data.top_broken_links
          .map(link => `- ${link.url} (${link.status_code})`)
          .join('\n')}`
    });
  }

  // Always send email report
  await sendEmailReport({
    to: 'team@example.com',
    subject: `Broken Links Report: ${data.domain}`,
    crawlId: data.crawl_id,
    stats: data
  });
}
Complete Handler Example
Here's a complete webhook handler with all best practices:
const express = require('express');
const crypto = require('crypto');

const app = express();

// Middleware to preserve raw body for signature verification
app.use(express.json({
  verify: (req, res, buf) => {
    req.rawBody = buf.toString('utf8');
  }
}));

// Webhook handler
app.post('/webhooks/seo-crawler', async (req, res) => {
  try {
    // 1. Verify signature
    const signature = req.headers['x-linkhealth-signature'];
    if (!verifySignature(req.rawBody, signature, process.env.WEBHOOK_SECRET)) {
      console.error('Invalid signature');
      return res.status(401).json({ error: 'Invalid signature' });
    }

    // 2. Validate payload
    const { event, data } = req.body;
    if (!event || !data) {
      return res.status(400).json({ error: 'Invalid payload' });
    }

    // 3. Return success immediately
    res.status(200).json({ received: true });

    // 4. Process asynchronously
    processWebhook(event, data).catch(err => {
      console.error('Webhook processing error:', err);
    });
  } catch (error) {
    console.error('Webhook handler error:', error);
    res.status(500).json({ error: 'Internal error' });
  }
});

async function processWebhook(event, data) {
  console.log(`Processing event: ${event}`, data);

  switch (event) {
    case 'crawl.started':
      await handleCrawlStarted(data);
      break;
    case 'crawl.completed':
      await handleCrawlCompleted(data);
      break;
    case 'crawl.failed':
      await handleCrawlFailed(data);
      break;
    case 'broken_links.found':
      await handleBrokenLinks(data);
      break;
    default:
      console.warn('Unknown event type:', event);
  }
}

function verifySignature(payload, signatureHeader, secret) {
  // See Security Guide for implementation
  // ...
}

app.listen(3000, () => {
  console.log('Webhook server running on port 3000');
});
Testing Events
Using the Test Feature
- Go to Settings → API → Webhooks
- Click Test on your webhook
- A test event is sent immediately
Test Event Payload:
{
  "event": "test",
  "timestamp": "2024-12-09T10:30:00Z",
  "data": {
    "message": "This is a test webhook",
    "webhook_id": "wh_abc123"
  }
}
Manual Testing
Use curl to simulate webhooks locally:
curl -X POST http://localhost:3000/webhook \
  -H "Content-Type: application/json" \
  -H "X-LinkHealth-Signature: t=1702123456,v1=..." \
  -H "X-LinkHealth-Event: crawl.completed" \
  -d '{
    "event": "crawl.completed",
    "timestamp": "2024-12-09T10:30:00Z",
    "data": {
      "crawl_id": "test-123",
      "domain": "example.com",
      "stats": {
        "total_pages": 100,
        "broken_links": 5
      }
    }
  }'
FAQ
Q: Can I filter which events I receive?
A: Yes! When creating a webhook, select only the events you want. You can update event subscriptions at any time.
Q: How long are delivery logs retained?
A: Delivery logs are kept for 30 days.
Q: What happens if my endpoint is down?
A: We'll retry 3 times with exponential backoff. If all retries fail, the delivery is marked as failed and you can view it in delivery logs.
Q: Can I replay a failed webhook?
A: Not currently. You'll need to use the API to fetch the crawl data directly.
Q: Are webhooks sent in order?
A: Webhooks are sent as events occur, but network conditions may cause them to arrive out of order. Use timestamps to determine event sequence.
Q: How many webhooks can I create?
A: The Agency plan includes up to 10 webhook endpoints.
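On the ordering question above: one way to use timestamps is to remember the newest lifecycle event applied for each crawl and ignore anything older. The sketch below is a minimal in-memory illustration. `CrawlStateTracker` is a hypothetical name, and it only tracks the crawl.started, crawl.completed, and crawl.failed events.

```javascript
// Hypothetical tracker: remembers the newest lifecycle event applied per crawl.
class CrawlStateTracker {
  constructor() {
    this.latest = new Map(); // crawl_id -> { ts, event }
  }

  // Returns true if the event was applied, false if it was ignored.
  apply(payload) {
    // Only crawl.* lifecycle events change the tracked status.
    if (!payload.event.startsWith('crawl.')) return false;

    const ts = Date.parse(payload.timestamp);
    if (Number.isNaN(ts)) return false;

    const id = payload.data.crawl_id;
    const prev = this.latest.get(id);
    // Ignore events older than the one already applied for this crawl.
    if (prev !== undefined && ts < prev.ts) return false;

    this.latest.set(id, { ts, event: payload.event });
    return true;
  }

  // Returns the name of the newest applied event, or null for unknown crawls.
  statusOf(crawlId) {
    const entry = this.latest.get(crawlId);
    return entry ? entry.event : null;
  }
}
```

With this approach, a crawl.started event that arrives after crawl.completed for the same crawl is dropped instead of resetting the status to running.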
Related Documentation
- Webhook Overview - Introduction and concepts
- Setup Guide - Creating and configuring webhooks
- Security Guide - Signature verification (required reading!)
- API Reference - REST API documentation