Technical & structural
These terms cover the plumbing that decides whether an AI engine can read your catalog and cite it correctly: how crawlers reach your pages, how content gets chunked and grounded, how product schema and entities make a part number machine-legible, and how RAG and llms.txt feed that content into generated answers. Get the structure wrong and your SKUs never make the answer.
11 terms
AI crawler
An AI crawler is a bot that fetches web pages to feed AI systems, either for model training (GPTBot, ClaudeBot) or for live answer retrieval (OAI-SearchBot, PerplexityBot). Blocking retrieval bots via robots.txt or bot protection removes your content from AI search citations; blocking training bots only keeps it out of future models.
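To make the training-versus-retrieval distinction concrete, here is a minimal sketch that checks which bots a robots.txt file admits, using Python's standard urllib.robotparser. The robots.txt content and the example.com URL are hypothetical, and the sketch only covers robots.txt rules, not bot protection at the CDN or firewall level.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block a training bot, allow a retrieval bot.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""


def allowed_bots(robots_txt: str, url: str, bots: list[str]) -> dict[str, bool]:
    """Report, per bot name, whether robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


result = allowed_bots(
    ROBOTS_TXT,
    "https://example.com/parts/abc-123",  # hypothetical product page
    ["GPTBot", "OAI-SearchBot", "PerplexityBot"],
)
print(result)
```

Under these rules the training bot is refused while both retrieval bots may fetch the page, which is the configuration the definition describes as keeping content citable but out of future models.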
Content chunking for retrieval
Content chunking for retrieval is the practice of splitting a page into self-contained passages so a retrieval system can pull one accurate, complete unit into an AI answer. Each chunk should hold a whole fact, such as a full spec row, rather than a sentence cut off mid-value.
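As an illustration, the sketch below turns a spec table into one passage per row and repeats the product name and part number in each passage so that every chunk stands on its own. The function name, part number, and spec values are hypothetical; real chunking pipelines vary widely.

```python
def chunk_spec_rows(
    product_name: str,
    part_number: str,
    spec_rows: list[tuple[str, str]],
) -> list[str]:
    """Return one self-contained passage per spec row.

    Each passage carries the product identity plus a whole label/value
    pair, so no chunk ends mid-value or loses track of which part it
    describes.
    """
    return [
        f"{product_name} {part_number}: {label} is {value}."
        for label, value in spec_rows
    ]


chunks = chunk_spec_rows(
    "Ball bearing",
    "ABC-123",  # hypothetical part number
    [("bore diameter", "25 mm"), ("outer diameter", "52 mm")],
)
for chunk in chunks:
    print(chunk)
```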
Crawlability for AI bots
Crawlability for AI bots is whether AI crawlers can actually fetch and parse a page's content. It is the AI-specific case of general crawlability: JavaScript-rendered catalogs often return an empty shell to non-rendering crawlers, leaving the page's data invisible until server-side rendering exposes it in the HTML.
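One rough way to see what a non-rendering crawler sees is to extract the visible text from the raw HTML without running any JavaScript. The sketch below does this with Python's standard html.parser; the two HTML strings are hypothetical stand-ins for a client-rendered shell and a server-rendered page.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of script and style tags."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a crawler could read without executing JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical responses for the same product URL.
JS_SHELL = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
SERVER_RENDERED = "<html><body><h1>Bearing ABC-123</h1><p>Bore: 25 mm</p></body></html>"

print(repr(visible_text(JS_SHELL)))
print(repr(visible_text(SERVER_RENDERED)))
```

The shell yields an empty string, while the server-rendered version exposes the part number and spec directly in the HTML.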
Entity SEO
Entity SEO is the practice of optimizing for entities (the things an engine knows about) and their relationships in a knowledge graph, rather than for keyword strings. The goal is for engines to understand your brand and products as defined entities they can store, disambiguate, and cite.
Grounding
Grounding is tying an AI model's generated output to specific external sources — retrieved, looked up, or supplied at query time — so the answer reflects, and can cite, real documents and data instead of unverified model memory. Retrieval-augmented generation (RAG) is the leading grounding method, but not the only one.
Knowledge graph
A knowledge graph is a structured network of entities (people, products, brands, standards) and the labeled relationships between them, which search and AI engines use to connect facts and answer questions. It stores knowledge as nodes and edges rather than as loose documents.
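The nodes-and-edges idea can be sketched with plain subject/relation/object triples. The entities and relation names below are hypothetical, and production knowledge graphs use dedicated stores and vocabularies rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical facts as (subject, relation, object) triples.
TRIPLES = [
    ("ABC-123", "is_a", "ball bearing"),
    ("ABC-123", "manufactured_by", "Example Bearings"),
    ("ABC-123", "fits", "Pump P-200"),
    ("Pump P-200", "manufactured_by", "Example Pumps"),
]


def build_graph(triples):
    """Index labeled edges by their source node."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def related(graph, entity, relation):
    """Follow one labeled edge type out of an entity."""
    return [obj for rel, obj in graph.get(entity, []) if rel == relation]


graph = build_graph(TRIPLES)
print(related(graph, "ABC-123", "fits"))
```

Answering "what does ABC-123 fit?" becomes a lookup along a labeled edge instead of a search through loose documents.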
LLM seeding
LLM seeding is the practice of deliberately placing brand and product information on sources large language models tend to ingest or cite, such as Wikipedia, Wikidata, Reddit, and trade directories, so the brand is already present when an engine assembles an answer. The term was popularized by Locomotive Agency.
Product schema for industrial SKUs
Product schema for industrial SKUs is the application of schema.org/Product markup to technical parts, using mpn, gtin, and additionalProperty fields to encode part numbers, specifications, and compatibility so AI engines can read and answer questions about a SKU. It is the industrial-catalog use of generic Product structured data.
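A minimal sketch of such markup, generated in Python, might look like the following. The part number, GTIN placeholder, and spec values are hypothetical, and expressing compatibility as an additionalProperty entry is one possible modeling choice rather than a required pattern.

```python
import json


def product_jsonld(name, mpn, gtin, properties):
    """Build schema.org/Product data with mpn, gtin, and additionalProperty.

    `properties` holds (name, value, unit) tuples; unit may be None.
    """
    additional = []
    for prop_name, value, unit in properties:
        entry = {"@type": "PropertyValue", "name": prop_name, "value": value}
        if unit is not None:
            entry["unitText"] = unit
        additional.append(entry)
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "mpn": mpn,
        "gtin": gtin,
        "additionalProperty": additional,
    }


data = product_jsonld(
    name="Ball bearing ABC-123",
    mpn="ABC-123",           # hypothetical manufacturer part number
    gtin="00000000000000",   # placeholder, not a real GTIN
    properties=[
        ("Bore diameter", 25, "mm"),
        ("Outer diameter", 52, "mm"),
        ("Compatible with", "Pump P-200", None),
    ],
)

# Embed the JSON-LD in the page inside a script tag.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(data, indent=2)
    + "\n</script>"
)
print(snippet)
```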
Retrieval-augmented generation (RAG)
RAG (retrieval-augmented generation) is the technique where an AI model fetches relevant external documents at query time and grounds its answer in them, instead of relying only on training data. It is why fresh, crawlable web content can be cited by AI search without being in a model's training set.
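The retrieve-then-ground flow can be sketched without any model call. The example below ranks a few hypothetical pages by simple word overlap with the query and assembles a prompt that cites the retrieved source. Real systems typically use embedding or hybrid search rather than word overlap; the ranking here is a stand-in chosen only to keep the sketch self-contained.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word set; hyphenated part numbers stay whole."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def retrieve(query: str, documents: list[dict], k: int = 1) -> list[dict]:
    """Rank documents by word overlap with the query (toy retriever)."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, passages: list[dict]) -> str:
    """Ground the question in retrieved passages, keeping their URLs."""
    sources = "\n".join(f"[{p['url']}] {p['text']}" for p in passages)
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {query}"


# Hypothetical crawled pages.
DOCUMENTS = [
    {"url": "https://example.com/parts/abc-123",
     "text": "ABC-123 ball bearing: bore 25 mm, outer diameter 52 mm."},
    {"url": "https://example.com/parts/xyz-9",
     "text": "XYZ-9 timing belt: length 900 mm, width 20 mm."},
    {"url": "https://example.com/about",
     "text": "We have supplied industrial parts since 1990."},
]

query = "What is the bore of the ABC-123 bearing?"
top = retrieve(query, DOCUMENTS, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

The prompt that would go to the model contains the product page's text and URL, which is how a page can be cited without ever having been in the training set.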
Structured data for AI
Structured data for AI is schema.org markup (usually JSON-LD) that labels page facts like price, MPN, and availability so machines can read them. Its proven value is Product rich results and clean entity disambiguation; whether AI answer engines ingest the JSON-LD directly is unproven, so it should not be overclaimed.
llms.txt
llms.txt is a proposed Markdown file at a site's root (e.g. /llms.txt), introduced by Answer.AI in 2024, that lists a site's key pages in an LLM-friendly form. No major AI provider has committed to it; Google explicitly declined, though Perplexity reportedly uses it. Studies to date have measured no citation lift. Treat it as low-cost, low-certainty hygiene.
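For a sense of what the file contains, here is a minimal sketch that renders one. It assumes the layout described in the proposal as commonly summarized (a title heading, a short blockquote summary, then sections of Markdown links); the site name, URLs, and descriptions are hypothetical.

```python
def render_llms_txt(site_name: str, summary: str, sections: dict) -> str:
    """Render an llms.txt-style Markdown document.

    `sections` maps a section heading to a list of
    (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


llms_txt = render_llms_txt(
    "Example Parts Co.",  # hypothetical site
    "Industrial bearings and belts with full spec tables.",
    {
        "Products": [
            ("Ball bearing ABC-123",
             "https://example.com/parts/abc-123",
             "Specs and compatibility"),
        ],
    },
)
print(llms_txt)
```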