
404 Errors and SEO: How to Find, Prioritise, and Fix Broken Pages

Not all 404s are equal. Some are harmless; others are bleeding link equity and killing crawl coverage. Here's how to audit 404s, decide which ones to fix, and handle soft 404s that Google counts as crawl waste.

Marcus Webb · 6 min read · April 17, 2026

SEO consultant, 9 years' experience, formerly Head of SEO at two Series B startups

A 404 error means a URL returns an HTTP status code of 404 (Not Found) — the server is confirming the page doesn't exist. In SEO terms, 404s matter when they have inbound links or internal links pointing to them, when they appear in your sitemap, or when they represent pages that previously ranked and still receive organic clicks. An isolated 404 on a page nobody has ever linked to is meaningless. A 404 on a URL with 40 backlinks is actively losing you authority.

Hard 404s vs. soft 404s

A hard 404 returns HTTP status 404. A soft 404 returns HTTP status 200 (OK) but displays a 'page not found' message — the server technically says the page exists, but the content tells users and Google it doesn't. Soft 404s are worse than hard 404s from an SEO perspective: Google crawls and processes the page (consuming crawl budget) but eventually identifies it as having no real content, which can depress your site's overall quality signals.
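The distinction can be sketched as a small classifier over a fetched status code and body. This is only a heuristic for your own audits, not Google's detection method, and the phrase list is an assumption you would tune to the wording your templates actually use.

```python
import re

# Assumed phrases; adjust to match your own "not found" templates.
NOT_FOUND_PHRASES = ("page not found", "post not found", "no results found")


def classify_response(status: int, body: str) -> str:
    """Label a fetched page as 'hard_404', 'soft_404', 'ok', or 'other'."""
    if status == 404:
        return "hard_404"
    if status == 200:
        # Crudely strip tags and collapse whitespace so phrases match visible text.
        text = re.sub(r"<[^>]+>", " ", body).lower()
        text = " ".join(text.split())
        if any(phrase in text for phrase in NOT_FOUND_PHRASES):
            return "soft_404"
        return "ok"
    return "other"
```

Phrase matching will produce false positives on pages that legitimately discuss 404 errors (this article, for one), so treat the output as a list of candidates to review by hand.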

⚠️ Warning

Soft 404s are one of the most common causes of crawl budget waste on large sites. CMS platforms frequently generate soft 404s for deleted posts (showing a 200 with 'Post not found' content), empty search result pages (/search?q=nonsense returning 200), and empty category/tag pages. Check your GSC Pages report for the 'Soft 404' exclusion reason.

Finding 404s that actually matter

Not all 404s need fixing. Prioritise by impact: fix 404s that have backlinks (highest priority), fix 404s that appear in your XML sitemap, fix 404s that have internal links pointing to them, and fix 404s that appear in GSC Performance data with click history. Orphan 404s — pages that nobody has ever linked to, internal or external — can be left alone or allowed to expire.

  • GSC → Pages → filter 'Not found (404)': shows 404s Google has crawled. The Pages report has no impressions column, so check these URLs in the Performance report to find any that were previously ranking.
  • Ahrefs Site Explorer → Pages → Best by Links → filter for 404: shows your 404s with the most backlinks — these are the highest priority.
  • Screaming Frog → Response Codes → 4XX: shows 404s reachable via internal links — these are bleeding internal link equity.
  • Sitemap audit: cross-reference your sitemap URLs against your live site — any sitemap URL returning 404 needs immediate action.
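Once the exports from the tools above are merged into one record per URL, the priority order can be expressed as a sort. A minimal sketch, assuming you have already joined the data yourself; the `DeadUrl` fields are hypothetical names, not columns from any tool's export.

```python
from dataclasses import dataclass


@dataclass
class DeadUrl:
    url: str
    backlinks: int = 0        # external links, e.g. from a backlink tool export
    in_sitemap: bool = False  # URL is still listed in the XML sitemap
    internal_links: int = 0   # inlinks found by a site crawler
    clicks: int = 0           # historical clicks from GSC Performance


def priority(d: DeadUrl) -> int:
    """Lower number means fix sooner. Tiers follow the order in the text."""
    if d.backlinks > 0:
        return 1
    if d.in_sitemap:
        return 2
    if d.internal_links > 0:
        return 3
    if d.clicks > 0:
        return 4
    return 5  # orphan 404: can be left alone


def prioritise(dead: list[DeadUrl]) -> list[DeadUrl]:
    # Within a tier, larger backlink, inlink, and click counts come first.
    return sorted(
        dead,
        key=lambda d: (priority(d), -d.backlinks, -d.internal_links, -d.clicks),
    )
```

Anything that lands in tier 5 is an orphan and can stay a 404.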

Fix 1: 301 redirect to the most relevant live URL

If the 404 URL has backlinks or a history of organic traffic, 301-redirect it to the most relevant live page. 'Most relevant' is the page that most closely matches what the original URL was about. If no closely relevant page exists, redirect to the closest parent category or section. As a last resort, redirect to the homepage, but this is the weakest option: Google may treat a redirect to an unrelated page as a soft 404, so use it only when no topically relevant alternative exists.
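The fallback order can be sketched as a lookup that tries a hand-picked target first, then walks up the URL path to the nearest live parent, and only then falls back to the homepage. This assumes your URL structure mirrors your topical hierarchy, which is not true of every site; `pick_redirect_target` and its arguments are illustrative names.

```python
from typing import Dict, Optional, Set


def pick_redirect_target(
    dead_path: str,
    live_paths: Set[str],
    manual: Optional[Dict[str, str]] = None,
) -> str:
    """Return the path a dead URL should 301 to."""
    # 1. A hand-picked 'most relevant' page always wins, if it is still live.
    if manual and manual.get(dead_path) in live_paths:
        return manual[dead_path]

    # Index live paths without their trailing slash so /blog and /blog/ match.
    live = {(p.rstrip("/") or "/"): p for p in live_paths}

    # 2. Walk up the path to the closest live parent section.
    parts = dead_path.strip("/").split("/")
    while len(parts) > 1:
        parts.pop()
        candidate = "/" + "/".join(parts)
        if candidate in live:
            return live[candidate]

    # 3. Last resort: the homepage.
    return "/"
```

In practice the manual map should cover every URL with meaningful backlinks; the parent-path fallback is there for the long tail.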

Fix 2: Restore the page

If the 404 was caused by accidental deletion (a CMS post erroneously unpublished, or a URL changed without a redirect), restoring the page at the original URL is usually better than a redirect. The restored page keeps its URL, its ranking history, and its backlink signal intact. Check your CMS trash, version history, or a Wayback Machine archive before assuming restoration isn't possible.

Fix 3: Return a true 410 for intentionally removed pages

A 410 (Gone) status code tells Google the page has been permanently and intentionally removed. Google may drop a 410 from the index somewhat sooner than a 404, which it can keep re-crawling for weeks in case the page comes back, although in practice the difference is often small. Use 410 for pages you've deliberately discontinued with no redirect destination: discontinued product pages, deleted user profiles, old campaign landing pages. Don't use 410 for pages that were deleted by mistake.
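How you return a 410 depends on your stack. As one illustration, here is a minimal WSGI middleware sketch in Python that answers 410 for a fixed set of retired paths and passes everything else through. The paths in `GONE_PATHS` are hypothetical, and most sites would set this in their web server or CMS configuration instead.

```python
# Hypothetical list of deliberately retired URLs.
GONE_PATHS = {"/campaigns/summer-2019", "/products/widget-v1"}


def gone_middleware(app, gone_paths=GONE_PATHS):
    """Wrap a WSGI app so retired paths return 410 Gone."""

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path in gone_paths:
            body = b"This page has been permanently removed."
            start_response(
                "410 Gone",
                [
                    ("Content-Type", "text/plain; charset=utf-8"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Anything not on the list is handled by the real application.
        return app(environ, start_response)

    return wrapped
```

Keeping the list explicit is what separates a deliberate 410 from an accidental one: a URL only gets the status if someone put it there on purpose.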

Fixing internal links pointing to 404s

After identifying 404s via Screaming Frog, use the 'Inlinks' tab for each 404 URL to see every internal page linking to it. Update each of those internal links to point to the correct live URL — either the page you restored, the redirect destination, or another relevant live page. This is more valuable than the server-side redirect because it fixes the source of the broken equity flow rather than just patching the symptom.
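If your content is stored as HTML you can edit in bulk, the link updates can be scripted. A rough sketch, assuming links use double-quoted `href` attributes and that you have a mapping from each broken URL to its replacement; `rewrite_links` is an illustrative helper, not part of any CMS API.

```python
import re

# Matches double-quoted href attributes only; other quoting styles are ignored.
HREF = re.compile(r'href="([^"]*)"')


def rewrite_links(html: str, replacements: dict) -> tuple:
    """Swap broken href targets for live ones. Returns (new_html, links_changed)."""
    changed = 0

    def swap(match):
        nonlocal changed
        old = match.group(1)
        if old in replacements:
            changed += 1
            return f'href="{replacements[old]}"'
        return match.group(0)  # leave every other link untouched

    new_html = HREF.sub(swap, html)
    return new_html, changed
```

For example, `rewrite_links('<a href="/old-guide">Guide</a>', {"/old-guide": "/guides/seo"})` returns the updated markup and a count of 1. Review a diff before writing changes back, since a regex cannot see context such as links inside code samples.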

💡 Tip

Run a broken internal link audit monthly. New 404s are created constantly by CMS updates, URL restructuring, and plugin changes. A monthly Screaming Frog crawl filtered for 4XX responses catches these before they accumulate into a crawl budget problem.
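To make the monthly check quick, compare this month's crawl against the last one and look only at what changed. A small sketch, assuming each crawl has been reduced to a mapping of URL to status code (the export format itself is up to your crawler):

```python
def new_404s(previous_crawl: dict, current_crawl: dict) -> list:
    """URLs that return 404 now but did not return 404 in the previous crawl."""
    return sorted(
        url
        for url, status in current_crawl.items()
        if status == 404 and previous_crawl.get(url) != 404
    )
```

URLs that were already 404 last month are filtered out, so the list contains only breakage introduced since the previous audit.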


💡 Tip

SEOdisaster Level 1 includes a scenario where a site migration silently created 300+ broken internal links — all returning soft 404s with a 200 status. You'll crawl the site, identify the pattern, and fix the highest-impact links before the next Googlebot visit.
