Why it matters
A distributor builds its parts catalog as a React app. When a non-rendering AI crawler requests a page, the server returns an empty HTML shell and a bundle of JavaScript the bot never runs. All 50,000 SKUs sit behind that shell, so they are uncrawlable. The catalog looks complete in a browser but is invisible to the engine. Server-side rendering changes this. The HTML arrives already populated with part numbers, specs, and cross-references, so those SKUs become retrievable. Crawlability for AI bots is the gate before anything else: an engine cannot chunk, embed, or cite a page it could not fetch and parse.
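To make the difference concrete, here is a minimal sketch of what a parser that never executes JavaScript can extract from each kind of response. The two HTML strings, the part number HC-2040, and the spec values are hypothetical stand-ins, not real catalog data, and real crawlers likely use more elaborate extraction than this.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a non-rendering parser would see in the raw HTML."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical client-side-rendered shell: the catalog only exists after JS runs.
CSR_SHELL = """<!doctype html>
<html><head><title>Parts Catalog</title></head>
<body><div id="root"></div><script src="/static/bundle.js"></script></body></html>"""

# Hypothetical server-rendered page: the same markup arrives already populated.
SSR_PAGE = """<!doctype html>
<html><head><title>Parts Catalog</title></head>
<body><div id="root"><h1>Hydraulic Cylinder HC-2040</h1>
<table><tr><th>Bore</th><td>2.0 in</td></tr>
<tr><th>Stroke</th><td>40 in</td></tr></table></div>
<script src="/static/bundle.js"></script></body></html>"""

print(repr(visible_text(CSR_SHELL)))
print(repr(visible_text(SSR_PAGE)))
```

In this sketch the shell yields only the page title, while the server-rendered page yields the part number and spec values along with it.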
How to check it
Request your page the way a non-rendering bot does and read what comes back, not what the browser paints. A few concrete checks:
- Fetch the raw HTML with curl and confirm the SKU data, specs, and body copy are present in the source, not injected later by JavaScript.
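- Confirm robots.txt and robots meta tags do not block the named AI user-agent tokens (GPTBot, ClaudeBot, PerplexityBot, Google-Extended). Note that Google-Extended is a robots.txt control token, not a separate crawler, so it will not appear as a user-agent in server logs.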
- Check that part-number and category URLs return 200, not soft-404 shells or redirect chains.
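The first two checks above can be sketched with Python's standard library. This is an illustration under assumptions, not a complete audit: the helper names `blocked_agents` and `missing_from_source`, the sample robots.txt, the example.com URL, and the sample strings are all hypothetical, and the standard-library parser may not match every crawler's own interpretation of robots.txt. It also does not cover robots meta tags or status codes.

```python
from urllib.robotparser import RobotFileParser

# Tokens named in the checklist above.
AI_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def blocked_agents(robots_txt: str, url: str, agents=AI_AGENTS) -> list[str]:
    """Return the agents that this robots.txt text disallows for the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


def missing_from_source(raw_html: str, required: list[str]) -> list[str]:
    """Return the required strings that do not appear in the raw HTML source."""
    return [text for text in required if text not in raw_html]


# Hypothetical robots.txt that blocks one AI crawler outright.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cart/
"""

# Hypothetical raw response body, e.g. what `curl` would print for a product page.
SAMPLE_HTML = '<div id="root"><h1>HC-2040</h1><td>Bore</td></div>'

print(blocked_agents(SAMPLE_ROBOTS, "https://example.com/parts/hc-2040"))
print(missing_from_source(SAMPLE_HTML, ["HC-2040", "Bore", "Stroke"]))
```

With the sample inputs, the first call reports GPTBot as blocked and the second reports that "Stroke" is absent from the source. In a real check, the robots.txt text and the raw HTML would come from plain HTTP fetches of the live site rather than inline strings.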
In practice
A hydraulics distributor finds its cylinder pages absent from AI answers. A curl request returns a near-empty shell, so the bot sees no bore, stroke, or pressure ratings. The team moves the catalog to server-side rendering and keeps client-side hydration for interactivity. The next crawl ingests the full spec tables. Render the page server-side first, then worry about chunking and citations.