linkinator Tutorial: Find Broken Links in Next.js Sites

Broken links are one of those problems that feel minor until you look at them systematically. A 404 on an internal page signals to Google that your site structure is inconsistent — even a handful of broken internal links can affect crawlability. A broken link in your documentation tells a developer you don't test your own content. An external dependency that quietly returned 410 six months ago is now part of every page that links to it, and nobody noticed.

I started using linkinator after discovering that a staging deploy of a client's site had 23 broken links — all introduced during a CMS migration that "went fine." We only caught them because someone happened to click through a few pages manually. That's not a process. This is the kind of quality gap I look for when I audit a codebase before taking on a rescue project.

Quick Start

You don't need to install anything. Run it against any publicly accessible URL:

npx linkinator https://iurii.rogulia.fi

That's it. It fetches the page, parses every <a href>, <link href>, and <img src>, follows them, and reports back. For a simple page it takes seconds. For a site with hundreds of pages you'll want more control.

The Flags That Actually Matter

`--recurse`

Without this flag, linkinator only checks the links on the page you give it — not the pages those links point to. For anything beyond a single-page check, you need recursion:

npx linkinator https://iurii.rogulia.fi \
  --recurse

This crawls the entire site. It follows every internal link it finds, building up a queue. For a site with 50–100 pages this is fast (under a minute). For a large docs site or an e-commerce catalog you'll want to combine it with --timeout and --concurrency to avoid hammering the server.

One thing to know: by default --recurse stays within the origin you give it. It won't crawl linkedin.com when it finds your LinkedIn link — it will check that the URL responds, but it won't recurse into it. That behavior is correct and what you want.
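The origin rule is easy to picture with a few lines of code. This is only an illustration of the behavior described above, not linkinator's source; the function name `shouldRecurse` and the LinkedIn profile URL are made up for the example.

```typescript
// Illustration only: would a crawler that "stays within the origin"
// follow the links on this page, or just check that the URL responds?
function shouldRecurse(startUrl: string, linkUrl: string): boolean {
  // URL.origin is scheme + host + port, e.g. "https://iurii.rogulia.fi"
  return new URL(linkUrl).origin === new URL(startUrl).origin;
}

const startUrl = "https://iurii.rogulia.fi";
console.log(shouldRecurse(startUrl, "https://iurii.rogulia.fi/blog")); // true
console.log(shouldRecurse(startUrl, "https://www.linkedin.com/in/someone")); // false
```

Both URLs still get a status check; only the first one has its own links queued.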

`--silent`

By default, linkinator prints every single URL it checks, including the successful ones. On a site with hundreds of pages this is hundreds of lines of output that don't tell you anything useful.

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent

--silent suppresses all successful (2xx) responses and only shows failures. This is the flag I always use in CI — I want to see what broke, not a wall of green checkmarks.

`--timeout`

External services occasionally take longer than expected to respond. By default there is no timeout — requests wait indefinitely. In CI you almost always want to set one:

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --timeout 15000

The value is in milliseconds. I typically use 15000 (15 seconds) in CI, which is generous enough for slow third-party docs but prevents the check from hanging on a dead server.

`--retry`

Some servers respond with 429 (Too Many Requests) or 503 (Service Unavailable) when you hit them repeatedly in quick succession. Without retry logic, these get reported as failures even though the links are valid.

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --retry

With --retry enabled, linkinator backs off and retries 429 responses that include a Retry-After header before marking them as failures. This dramatically reduces false positives in CI, especially when your site links to GitHub, npm, or major documentation sites that have rate limiting.
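To make the back-off concrete, here is a sketch of how a client can turn a `Retry-After` response header into a wait time. It assumes the rate-limited server sends that header in one of its two standard forms (a number of seconds or an HTTP date). `retryDelayMs` is a hypothetical helper written for this post, not part of linkinator.

```typescript
// Hypothetical helper: convert a Retry-After header value into milliseconds to wait.
// The header is either delay-seconds ("30") or an HTTP date.
function retryDelayMs(retryAfter: string, nowMs: number): number | null {
  const value = retryAfter.trim();
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000; // delay-seconds form
  }
  const dateMs = Date.parse(value); // HTTP-date form
  if (Number.isNaN(dateMs)) {
    return null; // unparseable header: the caller decides whether to give up
  }
  return Math.max(0, dateMs - nowMs); // never wait a negative amount
}

const requestTime = Date.parse("Wed, 21 Oct 2015 07:28:00 GMT");
console.log(retryDelayMs("30", requestTime)); // 30000
console.log(retryDelayMs("Wed, 21 Oct 2015 07:28:30 GMT", requestTime)); // 30000
console.log(retryDelayMs("soon", requestTime)); // null
```

A retrying client would sleep for the returned time and then repeat the request, which is why a rate-limited link ends up as a pass instead of a false failure.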

You can pair it with --retry-errors, which also retries 5xx responses and requests that fail without a status code, such as the intermittent 503s mentioned above:

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --retry \
  --retry-errors

`--concurrency`

How many requests to make simultaneously. The default is 100, which is more than most setups need. Against a site you own (or localhost) a moderate explicit value keeps the load predictable:

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --concurrency 25

If you're running against a site you don't control, keep it low — 5 or 10 — to avoid triggering their rate limiting or looking like a DoS attack.
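If it helps to see what a concurrency cap means in practice, here is a minimal sketch of one. It is my own illustration, not how linkinator implements the flag; `mapWithConcurrency` and the stand-in worker are hypothetical.

```typescript
// Sketch of a concurrency cap: run `worker` over `items`,
// with never more than `limit` calls in flight at once.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0; // index of the next item to hand out

  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim an item before awaiting anything
      results[index] = await worker(items[index]);
    }
  }

  // Start at most `limit` runners; each pulls items until the queue is empty.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage with a stand-in worker; a real checker would issue an HTTP request here.
mapWithConcurrency(["/a", "/b", "/c"], 2, async (path) => path.toUpperCase())
  .then((out) => console.log(out)); // logs the upper-cased paths in input order
```

With a limit of 2, the third path only starts once one of the first two has finished. That is the whole effect of a lower --concurrency value: the same work, spread out so the target server sees fewer simultaneous requests.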

`--skip`

Not every URL that appears on your page is worth checking. Social media links get blocked by some CDNs. LinkedIn specifically returns 999 for bot requests. Certain documentation sites are notoriously flaky.

--skip takes a comma-separated list of regex patterns:

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --skip "linkedin.com,twitter.com,x.com,facebook.com"

The same flag handles URLs that require authentication: there's no point failing the CI check on links that intentionally return 401 to crawlers.
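Because the patterns are regular expressions, an unescaped dot matches any character and an unanchored pattern matches anywhere in the URL. The sketch below assumes each pattern is compiled with `new RegExp(pattern)` and tested against the full URL, which is my understanding of the flag rather than something I have verified for every version. `isSkipped` is a hypothetical helper and the Dropbox URL is made up.

```typescript
// Hypothetical helper mirroring the assumed matching rule:
// a URL is skipped when any pattern, compiled as a RegExp, matches it.
function isSkipped(url: string, patterns: string[]): boolean {
  return patterns.some((pattern) => new RegExp(pattern).test(url));
}

const loosePatterns = ["x.com"]; // unanchored, dot unescaped
const strictPatterns = ["^https?://(www\\.)?x\\.com(/|$)"]; // anchored to the host

console.log(isSkipped("https://x.com/someone", loosePatterns)); // true
console.log(isSkipped("https://www.dropbox.com/s/file", loosePatterns)); // true, probably not intended
console.log(isSkipped("https://www.dropbox.com/s/file", strictPatterns)); // false
console.log(isSkipped("https://x.com/someone", strictPatterns)); // true
```

Short patterns like `x.com` are convenient, but they can silently exclude unrelated domains from the check. Anchoring the pattern costs a few extra characters and avoids that.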

`--format`

The default output is plain text, which is fine for local use. For CI pipelines where you want to parse the output, or for reports, use JSON:

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --format json

CSV is also available if you want to pipe results into a spreadsheet or another tool:

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --format csv \
  > broken-links.csv

JSON output gives you the full picture: status code, source URL (the page that contains the broken link), and the target URL that failed. This is useful when you have a lot of failures and need to understand where they're coming from.
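Here is a small sketch of how that report can be grouped by the page that contains each broken link. The field names (`passed`, `links`, `url`, `status`, `state`, `parent`) and the `"BROKEN"` state value are my assumption about the report shape, so check them against the output of your linkinator version. The sample data is invented.

```typescript
// Assumed shape of the JSON report; verify against your linkinator version.
interface LinkResult {
  url: string;
  status?: number;
  state: string; // assumed values: "OK", "BROKEN", "SKIPPED"
  parent?: string; // page on which the link was found
}

interface Report {
  passed: boolean;
  links: LinkResult[];
}

// Group broken target URLs by the page that contains them.
function brokenByPage(report: Report): Map<string, string[]> {
  const grouped = new Map<string, string[]>();
  for (const link of report.links) {
    if (link.state !== "BROKEN") continue;
    const page = link.parent ?? "(start URL)";
    const targets = grouped.get(page) ?? [];
    targets.push(`${link.status ?? "no status"} ${link.url}`);
    grouped.set(page, targets);
  }
  return grouped;
}

// Made-up sample data for illustration.
const sampleReport: Report = {
  passed: false,
  links: [
    { url: "https://example.com/ok", status: 200, state: "OK", parent: "https://example.com/" },
    { url: "https://example.com/old", status: 404, state: "BROKEN", parent: "https://example.com/" },
    { url: "https://example.com/gone", status: 410, state: "BROKEN", parent: "https://example.com/blog" },
  ],
};

brokenByPage(sampleReport).forEach((targets, page) => {
  console.log(page);
  targets.forEach((target) => console.log(`  ${target}`));
});
```

In a real pipeline the report would come from `JSON.parse` on the saved output instead of an inline object.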


The Best Feature: URL Rewriting

This is the one most people don't know about, and it changes how you use the tool entirely.

The problem with checking links before deployment: your site isn't at https://iurii.rogulia.fi yet — it's at http://localhost:3000. But the links in your content reference the production domain. If you generate a sitemap, internal links, or use absolute URLs in your MDX content, they all point to the real domain.

--url-rewrite-search and --url-rewrite-replace solve this by rewriting URL prefixes on the fly:

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --url-rewrite-search "https://iurii.rogulia.fi" \
  --url-rewrite-replace "http://localhost:3000"

When linkinator encounters any URL matching the search pattern, it substitutes the replacement before making the request. Every internal link gets checked against your local server. External links are checked as-is against the real internet.

This means you can catch broken internal links in a local dev environment or a staging server before you deploy to production. In my workflow, I run this before every deployment. If the check fails, the deployment doesn't proceed.
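The rewrite itself is nothing more than a search-and-replace on each URL before the request goes out. A minimal sketch, assuming the search value is compiled as a regular expression (`rewriteUrl` is a hypothetical helper, not linkinator's code):

```typescript
// Illustration of the rewrite step, assuming the search value is compiled as a
// regular expression and applied to each URL before the request is made.
function rewriteUrl(url: string, searchPattern: string, replacement: string): string {
  return url.replace(new RegExp(searchPattern), replacement);
}

const productionOrigin = "https://iurii.rogulia.fi";
const localOrigin = "http://localhost:3000";

console.log(rewriteUrl("https://iurii.rogulia.fi/blog/post", productionOrigin, localOrigin));
// http://localhost:3000/blog/post
console.log(rewriteUrl("https://github.com/JustinBeckwith/linkinator", productionOrigin, localOrigin));
// https://github.com/JustinBeckwith/linkinator (no match, left as-is)
```

Internal URLs end up pointing at the local server, and everything else passes through untouched.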

If your version of linkinator accepts repeated flags, you can chain multiple rewrites by repeating both, one search/replace pair per rewrite:

--url-rewrite-search "https://iurii.rogulia.fi" \
--url-rewrite-replace "http://localhost:3000"

npm Scripts

Add it to your package.json so you don't have to remember the full incantation:

{
  "scripts": {
    "check-links": "linkinator https://iurii.rogulia.fi --recurse --silent --retry --skip \"linkedin.com,twitter.com,x.com\"",
    "check-links:local": "linkinator https://iurii.rogulia.fi --recurse --silent --retry --url-rewrite-search \"https://iurii.rogulia.fi\" --url-rewrite-replace \"http://localhost:3000\" --skip \"linkedin.com,twitter.com,x.com\""
  }
}

Note: for local checks you need the dev server running first. I typically combine this with concurrently in a pre-deploy script.

CI Integration

Here's a real GitHub Actions workflow I use. It builds the site, starts the production server, waits for it to be ready, runs linkinator against it, and fails the build if any broken links are found:

name: Check Links

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  linkinator:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Build site
        run: npm run build

      - name: Start production server
        run: npm start &
        env:
          PORT: 3000

      - name: Wait for server
        run: npx wait-on http://localhost:3000 --timeout 60000

      - name: Check links
        run: |
          npx linkinator https://iurii.rogulia.fi \
            --recurse \
            --silent \
            --retry \
            --timeout 15000 \
            --concurrency 10 \
            --url-rewrite-search "https://iurii.rogulia.fi" \
            --url-rewrite-replace "http://localhost:3000" \
            --skip "linkedin.com,twitter.com,x.com,facebook.com"

The wait-on package is the reliable way to wait for a server to be ready before running checks. Without it you'll get intermittent failures where linkinator starts before the server has fully initialized.

For static sites (Next.js with output: 'export'), keep the build step and replace the start step with npx serve ./out, backgrounded the same way.

The build runs as its own step because the server has no valid output until next build has finished. The sequence matters: install → build → start → wait → check.

One strategic note: blocking the build on broken links works well for internal links — those are entirely within your control and should never be broken. External links are a different story. A third-party documentation site going down at 2am will fail your CI, block your deploy, and have nothing to do with your code. When that happens regularly, teams start ignoring failures altogether — which defeats the point. A common pattern is to run two separate jobs: a blocking check for internal links only (using --skip to exclude external domains), and a non-blocking nightly workflow for external links that posts results somewhere visible without stopping deploys.
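For the blocking internal-only job, listing every external domain by hand does not scale. One option is a single negative-lookahead pattern that matches every URL that does not start with an allowed origin. The sketch below builds such a pattern. It assumes --skip compiles its values as JavaScript regular expressions, and both helper names are hypothetical. I list the production and the local origin because I have not checked whether skipping happens before or after URL rewriting.

```typescript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a pattern that matches every URL NOT starting with one of the allowed
// origins. Passing it to --skip is the assumed way to keep a check internal-only.
function externalSkipPattern(allowedOrigins: string[]): string {
  const allowed = allowedOrigins.map(escapeRegExp).join("|");
  return `^(?!(?:${allowed}))`;
}

const internalOnly = externalSkipPattern(["https://iurii.rogulia.fi", "http://localhost:3000"]);
console.log(internalOnly);
// ^(?!(?:https://iurii\.rogulia\.fi|http://localhost:3000))

const matcher = new RegExp(internalOnly);
console.log(matcher.test("https://github.com/vercel/next.js")); // true: would be skipped
console.log(matcher.test("https://iurii.rogulia.fi/blog")); // false: would be checked
```

The nightly external job can then run the usual command without this pattern and report its results without blocking anything.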

Dealing with False Positives

Every real-world site generates some false positives. Here's how I handle the most common ones:

LinkedIn, Twitter/X, and Facebook. These platforms actively block automated requests. LinkedIn returns 999 (not a real HTTP status). Just skip them unconditionally:

--skip "linkedin.com,twitter.com,x.com,facebook.com,instagram.com"

GitHub rate limiting. If your content links to many GitHub repositories or issues, you'll hit rate limits. Either skip GitHub, or add --retry and run with lower concurrency. If you need reliable checking of GitHub links, you can pass a token via --header:

--header "Authorization: Bearer $GITHUB_TOKEN"

Auth-protected routes. Any page that returns 401 or 403 to an unauthenticated request will show as a failure. Skip the entire auth path:

--skip "example.com/admin,example.com/dashboard,example.com/api"

Anchor fragments (#section-id). Linkinator skips fragment validation by default. If you want it to verify that #section-id actually exists on the target page, add --check-fragments. Be aware this generates false positives on fragments that are valid but dynamically rendered, like tabs or accordions — so it's usually not worth enabling in CI.
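The false positives are easier to understand once you see what a fragment check has to work with: the HTML the server returned, before any client-side JavaScript has run. A rough sketch, with a regex standing in for a real HTML parser and invented markup (`hasFragmentTarget` is a hypothetical helper):

```typescript
// Rough illustration: collect id attributes from static HTML and test a fragment.
// A regex is enough for a sketch; a real checker would use an HTML parser.
function hasFragmentTarget(html: string, fragment: string): boolean {
  const ids = new Set<string>();
  const idAttribute = /\sid\s*=\s*["']([^"']+)["']/g;
  let match: RegExpExecArray | null;
  while ((match = idAttribute.exec(html)) !== null) {
    ids.add(match[1]);
  }
  return ids.has(fragment.replace(/^#/, ""));
}

// Server-rendered markup: the tab panels only get their ids after hydration.
const serverHtml = '<h2 id="install">Install</h2><div class="tabs"></div>';
console.log(hasFragmentTarget(serverHtml, "#install")); // true
console.log(hasFragmentTarget(serverHtml, "#tab-yarn")); // false, although it works in a browser
```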

Flaky third-party services. Some documentation sites (I'm looking at you, certain cloud provider docs) return 503 intermittently. --retry helps here, but if a domain is consistently unreliable, just skip it. A flaky external doc site is not your problem to fix.

What It Won't Catch

Linkinator checks that URLs respond with a non-error status code. It doesn't verify the content at that URL is what you expect. A link to a blog post that was replaced with a redirect to the homepage will show as 200. A documentation page whose content has changed will show as 200. It doesn't validate that the href attribute actually takes you where the surrounding text says it does.

It also won't catch dynamically rendered links that don't appear in the initial HTML — if your React component builds a URL in a useEffect and appends it to the DOM client-side, linkinator won't see it. For that you'd need something like Playwright.

These are acceptable limitations. For the cost of adding one check to your CI pipeline, you catch the straightforward failures: internal routes you renamed, external URLs that went dark, documentation links that returned 404 when the library dropped support.

The Actual Payoff

Before I added this to the CI pipeline for this site, I had a few broken links to external documentation pages that had been quietly returning 404 for weeks. Nobody reported them. They didn't cause errors. They just silently degraded the experience for anyone who clicked them.

Running this took 40 seconds and found all of them:

npx linkinator https://iurii.rogulia.fi \
  --recurse \
  --silent \
  --retry

The URL rewrite flags (--url-rewrite-search and --url-rewrite-replace) are the part that changed my pre-deploy workflow. Instead of deploying and then checking, I check locally, fix locally, and deploy once. The difference in feedback cycle time is significant: finding a broken link in CI after deployment means opening a second PR, waiting for another build, and deploying again. Finding it locally takes 10 seconds to fix and move on.

Link checking is a safeguard, not a strategy. It doesn't prevent broken links from being created — it catches them before they reach users. The primary defense is good process: consistent URL structure, redirects when you rename routes, and content reviews after migrations. But process is imperfect, and linkinator is cheap insurance for when it slips.

If you're building a content-heavy site, a documentation portal, or anything where link integrity matters for SEO or user experience, add this to your pipeline. The setup takes 15 minutes. The alternative is hoping nobody clicks the wrong link.

For site architecture and developer experience questions beyond tooling — get in touch. I'm available for freelance projects and longer-term engagements. If your project needs MVP development with proper CI/CD baked in from the start, that's the work I do.
