HTTP Errors
Understand common HTTP error status codes (4xx/5xx) shown in crawl results, with linkable anchors per code.
Overview
When we crawl a link, the destination website responds with an HTTP status code.
This page explains what each code usually means, why your site (or a third-party site) might return it, and what you can do.
For external links, you may not be able to fix the destination. In that case, you can update or remove the link on your site.
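As a rough illustration of how a status code maps to the two families covered on this page, the sketch below uses Python's standard `http.HTTPStatus` enum. The helper name `describe_status` is hypothetical and is not part of our crawler.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short label such as '404 Not Found (client error)'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Codes that are not registered in the enum have no standard phrase.
        phrase = "Unknown"
    if 400 <= code <= 499:
        kind = "client error"
    elif 500 <= code <= 599:
        kind = "server error"
    else:
        kind = "not an error"
    return f"{code} {phrase} ({kind})"


for code in (404, 503):
    print(describe_status(code))
```

A 4xx code generally points at the request or the link itself, while a 5xx code points at the destination server.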
Quick links
- 400 Bad Request
- 401 Unauthorized
- 403 Forbidden
- 404 Not Found
- 405 Method Not Allowed
- 408 Request Timeout
- 409 Conflict
- 410 Gone
- 413 Payload Too Large
- 429 Too Many Requests
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
400 Bad Request
Meaning: The server couldn't process the request because it considered the request malformed or invalid.
Common causes:
- The URL is malformed
- The destination rejects unusual query parameters
What to do:
- Try opening the URL in a browser
- If it's your site, check server logs for the request path
- If it's an external link, replace or remove it
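Before opening the URL in a browser, a quick structural check can catch the most obvious malformed links. This is a minimal sketch using Python's standard `urllib.parse`; the function name `url_problems` and the specific checks are assumptions chosen for illustration, and a URL that passes can still be rejected by the destination.

```python
from urllib.parse import urlsplit


def url_problems(url: str) -> list[str]:
    """Return a list of obvious structural problems; empty means none found."""
    problems = []
    # Unencoded spaces, tabs, or newlines often come from copy-paste errors.
    if any(ch.isspace() for ch in url):
        problems.append("contains whitespace")
    try:
        parts = urlsplit(url)
    except ValueError:
        return problems + ["cannot be parsed"]
    if parts.scheme not in ("http", "https"):
        problems.append("scheme is not http or https")
    if not parts.netloc:
        problems.append("missing host")
    return problems


print(url_problems("https://example.com/a b"))
```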
401 Unauthorized
Meaning: The resource requires authentication, and the request did not include valid credentials.
Common causes:
- The page is behind a login
- The resource requires an auth token/cookie
What to do:
- If the page should be public, remove the auth requirement
- If it's intentionally private, treat it as expected
- For external links, avoid linking to login-only pages unless intended
403 Forbidden
Meaning: The server understood the request but refused to authorize it.
Common causes:
- WAF / bot protection blocked the crawler
- IP/geo restrictions
- Private/admin areas that deny public access
What to do:
- If it's your site, allowlist our crawler (or relax WAF rules for public pages)
- Confirm the URL is meant to be publicly accessible
- For external links, consider replacing the URL with a public alternative
404 Not Found
Meaning: The page doesn't exist at that URL.
Common causes:
- Broken internal link
- Content moved/renamed without a redirect
What to do:
- Update the link to the correct URL
- Add a 301 redirect from the old URL to the new one
- Remove links to deleted pages
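The redirect step above can be sketched as a simple lookup from old paths to new ones. The paths, the `REDIRECTS` map, and the `resolve` function below are all hypothetical; in practice this logic usually lives in your web server, CMS, or framework configuration.

```python
# Hypothetical map of moved pages: old path -> new path.
REDIRECTS = {
    "/old-pricing": "/pricing",
    "/blog/2019/launch": "/blog/launch",
}
# Hypothetical set of pages that currently exist.
EXISTING_PAGES = {"/", "/pricing", "/blog/launch"}


def resolve(path: str) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a GET request to `path`."""
    if path in EXISTING_PAGES:
        return 200, {}
    if path in REDIRECTS:
        # 301 signals that the move is permanent.
        return 301, {"Location": REDIRECTS[path]}
    return 404, {}


print(resolve("/old-pricing"))
```

With a map like this in place, links to the old URL lead to the new page instead of returning 404.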
405 Method Not Allowed
Meaning: The URL exists, but the server doesn't allow the HTTP method.
Common causes:
- Misconfigured routing
- An endpoint that expects only POST/PUT, etc.
What to do:
- Make sure public pages respond to GET requests
- If it's your site, fix routing/middleware rules
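To make the routing rule concrete, here is a minimal sketch of a method check for a public page. The function name `check_method` and the default allowed methods are assumptions for illustration. The one firm detail is that a 405 response is required by the HTTP specification to carry an `Allow` header listing the supported methods.

```python
def check_method(
    method: str, allowed: tuple[str, ...] = ("GET", "HEAD")
) -> tuple[int, dict[str, str]]:
    """Return (status, headers) depending on whether `method` is allowed."""
    if method in allowed:
        return 200, {}
    # Tell the client which methods the URL does accept.
    return 405, {"Allow": ", ".join(allowed)}


print(check_method("POST"))
```

If a public page returns 405 for GET, the `Allow` header in the response is a quick way to see which methods the route was actually configured for.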
408 Request Timeout
Meaning: The server gave up waiting for the complete request to arrive.
Common causes:
- Slow origin responses
- Intermittent network issues
What to do:
- Re-run the crawl to see if it was transient
- If it's your site, investigate slow endpoints and backend performance
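If you want to check by hand whether an error was transient, a small retry loop with increasing delays is one way to do it. The sketch below is an assumption-heavy illustration: `fetch` is a hypothetical callable that returns a status code, and the `TRANSIENT` set is a judgment call, not a list our crawler uses.

```python
import time
from typing import Callable

# Status codes that are often temporary (an assumption, adjust as needed).
TRANSIENT = {408, 429, 500, 502, 503, 504}


def fetch_with_retries(
    fetch: Callable[[], int],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `fetch` until it returns a non-transient status or attempts run out."""
    status = fetch()
    for attempt in range(1, attempts):
        if status not in TRANSIENT:
            break
        # Wait base_delay, then twice that, then four times that, and so on.
        sleep(base_delay * 2 ** (attempt - 1))
        status = fetch()
    return status
```

If the final status is still an error after several spaced-out attempts, the problem is probably not transient.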
409 Conflict
Meaning: The request conflicts with the current state of the resource on the server.
Common causes:
- Uncommon for standard page requests; can indicate a misbehaving endpoint
What to do:
- If it's your site, inspect server logs for the URL
410 Gone
Meaning: The resource was permanently removed.
What to do:
- Remove the link or redirect it to a relevant replacement
413 Payload Too Large
Meaning: The server refused the request because the request body was larger than it is willing to process.
Common causes:
- Typically related to uploads; uncommon for a normal page request
What to do:
- If it's your site and this is happening on a normal page URL, review proxy/WAF settings
429 Too Many Requests
Meaning: The server is rate limiting the client because it received too many requests in a short period.
Common causes:
- The destination site blocks rapid crawling
What to do:
- If it's your site, allowlist the crawler or relax limits for public pages
- If it's an external site, re-run later or replace the link
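Some servers include a `Retry-After` header with a 429 response, given either as a number of seconds or as an HTTP date. Not every site sends it, so treat the sketch below as optional guidance on how long "later" might be. The function name `retry_after_seconds` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Parse a Retry-After header value into a delay in seconds."""
    value = value.strip()
    if value.isdigit():
        # Form 1: a non-negative number of seconds.
        return float(value)
    try:
        # Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # Unrecognized format.
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means there is no need to wait.
    return max(0.0, (when - now).total_seconds())
```

A return value of `None` means the header was missing or unreadable, in which case waiting a few minutes before re-running is a reasonable default.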
500 Internal Server Error
Meaning: The server encountered an unexpected condition, such as an unhandled exception, and could not complete the request.
What to do:
- If it's your site, check server logs for the failing URL
- Retry later if the issue was transient
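Checking server logs for failing URLs can be sketched as a small filter over access-log lines. This example assumes your server writes logs in the Common Log Format. The sample lines are made up for illustration, and the `failing_paths` helper is hypothetical.

```python
import re

# Matches the request line and status field of a Common Log Format entry.
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def failing_paths(log_lines, min_status=500):
    """Return {path: count} for requests whose status is >= min_status."""
    counts = {}
    for line in log_lines:
        match = LINE_RE.search(line)
        if match and int(match.group("status")) >= min_status:
            path = match.group("path")
            counts[path] = counts.get(path, 0) + 1
    return counts


# Made-up sample lines, not real log data.
sample = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /checkout HTTP/1.1" 500 1234',
    '203.0.113.5 - - [10/Oct/2024:13:55:40 +0000] "GET /about HTTP/1.1" 200 5120',
    '203.0.113.9 - - [10/Oct/2024:13:56:02 +0000] "GET /checkout HTTP/1.1" 500 1234',
]
print(failing_paths(sample))
```

Paths that fail repeatedly are the best candidates to investigate first in your application's error logs.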
502 Bad Gateway
Meaning: A proxy/load balancer received an invalid response from an upstream server.
What to do:
- If it's your site, check the health of the reverse proxy, load balancer, and origin server
503 Service Unavailable
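Meaning: The server is temporarily unable to handle the request, usually because of overload or maintenance.
What to do:
- Re-run the crawl later
- If it's your site, check whether the crawl overlapped a maintenance window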
504 Gateway Timeout
Meaning: A proxy/load balancer timed out waiting for an upstream server.
What to do:
- If it's your site, investigate upstream performance and review proxy timeout settings