Technical Findings
Technical findings identify infrastructure and configuration issues on your website that directly affect whether AI systems can access, understand, and reference your content. These findings are the foundation of AI visibility -- if AI crawlers cannot reach your content, no amount of content optimization will help.
Each finding is classified by severity (blocker, warning, or info), includes a description of the issue, and provides a specific recommendation for resolution.
Severity levels
Blockers
Blockers are critical issues that prevent AI systems from accessing your content entirely. They have the highest impact on your health score (-20 points per blocker) and should be resolved immediately.
A single blocker can render your entire website invisible to one or more AI platforms, regardless of how good your content is.
Warnings
Warnings are significant issues that reduce your AI visibility effectiveness but do not completely prevent access. They carry moderate health score penalties (-3 to -5 points) and should be addressed as a priority.
Warnings typically indicate missing signals or configurations that AI systems use to better understand and prioritize your content.
Info
Info-level findings are opportunities for improvement rather than problems. They carry no health score penalty but represent best practices that can enhance your AI visibility.
Blocker: AI crawler blocks in robots.txt
Severity: Blocker (critical)
Health score impact: -20 points
Affects all keywords: Yes
What is detected
The diagnosis checks your website's robots.txt file for Disallow rules targeting AI crawler user agents. The following crawlers are checked:
| Crawler | Operator | AI Platform |
|---|---|---|
| GPTBot | OpenAI | ChatGPT, GPT-based applications |
| ClaudeBot | Anthropic | Claude |
| Google-Extended | Google | Gemini, AI Overviews |
| Amazonbot | Amazon | Alexa, Amazon AI services |
| FacebookBot | Meta | Meta AI |
| Bytespider | ByteDance | TikTok AI features |
What it means
When a robots.txt rule blocks an AI crawler, that AI platform cannot index any content on your website. The platform will have no direct knowledge of your pages, products, services, or expertise. It may still reference your company based on external sources (news articles, directories, Wikipedia), but it cannot cite or recommend your own content.
Blocking multiple crawlers compounds the problem. If you block GPTBot, ClaudeBot, and Google-Extended, three of the largest AI platforms are unable to access your site.
How to resolve
Edit your website's robots.txt file to allow AI crawlers access:
# Allow AI crawlers
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: Amazonbot
Allow: /
User-agent: FacebookBot
Allow: /
User-agent: Bytespider
Allow: /
If you want to allow AI crawlers access to most of your site but restrict certain sections (e.g., private or sensitive areas), use targeted Disallow rules:
User-agent: GPTBot
Disallow: /private/
Disallow: /internal/
Allow: /
GEO/SEO considerations
- Blocking AI crawlers does NOT affect traditional search engine rankings (Googlebot and Bingbot are separate from Google-Extended and GPTBot).
- Some organizations block AI crawlers as a policy decision. If this is intentional, be aware that it significantly limits AI visibility.
- After unblocking crawlers, it may take days to weeks for AI systems to re-index your content. The impact will not be immediate.
- Review your robots.txt periodically as new AI crawlers emerge.
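After editing robots.txt, it is worth confirming that the rules actually permit each AI crawler. A minimal sketch using only Python's standard library is shown below; the `blocked_ai_crawlers` helper is a hypothetical name, and the user-agent list mirrors the table above and will need updating as new crawlers appear.

```python
# Sketch: verify which AI crawlers a robots.txt file blocks.
# Standard library only; crawler list mirrors the table above.
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "Google-Extended",
    "Amazonbot", "FacebookBot", "Bytespider",
]

def blocked_ai_crawlers(robots_txt: str, url: str = "/") -> list[str]:
    """Return the AI crawler user agents that may NOT fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # parse() alone leaves the "loaded" timestamp unset, and can_fetch()
    # then answers False for everything; modified() marks rules as read.
    parser.modified()
    return [ua for ua in AI_CRAWLERS if not parser.can_fetch(ua, url)]

# Example: a robots.txt that blocks GPTBot but allows everyone else.
robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_ai_crawlers(robots))  # ['GPTBot']
```

Running this against your live robots.txt (fetched over HTTPS) before and after the change gives a quick regression check that no AI crawler is accidentally left blocked.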
Warning: Missing llms.txt
Severity: Warning
Health score impact: -3 points
Affects all keywords: Yes
What is detected
The diagnosis checks for the presence of an llms.txt or llms-full.txt file at the root of your website (e.g., https://yoursite.com/llms.txt).
What it means
The llms.txt file is a convention that allows website owners to provide AI systems with explicit guidance about which pages are most important. It acts as a curated index of your most valuable content, helping AI systems prioritize what to read and reference.
Without an llms.txt file, AI systems must discover your content through general crawling, which may miss important pages or fail to identify which pages are most authoritative.
How to resolve
Create an llms.txt file in your website's root directory. The file should:
- Start with a Markdown H1 heading (your company or site name).
- Include a brief description of your company.
- List your most important pages with URLs and brief descriptions.
Example:
# Acme Corporation
Acme Corporation is a leading provider of cloud infrastructure solutions for enterprise customers.
## Core Pages
- [About Us](https://acme.com/about): Company history, mission, and leadership team
- [Cloud Platform](https://acme.com/platform): Overview of our cloud infrastructure platform
- [Enterprise Solutions](https://acme.com/enterprise): Solutions for enterprise customers
- [Case Studies](https://acme.com/case-studies): Customer success stories and results
- [Documentation](https://acme.com/docs): Technical documentation and API reference
## Services
- [Managed Cloud](https://acme.com/services/managed-cloud): Fully managed cloud hosting
- [Migration Services](https://acme.com/services/migration): Cloud migration assistance
- [Support Plans](https://acme.com/services/support): Enterprise support options
GEO/SEO considerations
- llms.txt is a relatively new convention and adoption is growing among AI providers. Implementing it early positions you ahead of competitors.
- Keep the file updated when you add or restructure important pages.
- Include only your most valuable pages (10--30 URLs is typical). This is not a sitemap -- it is a curated recommendation.
- Use descriptive link text and brief descriptions to give AI systems context about each page.
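The structural rules above (H1 first, linked page list, roughly 10 to 30 curated URLs) can be checked mechanically. The sketch below is illustrative, not an official llms.txt validator; the `lint_llms_txt` name and the exact rules are assumptions based on the conventions described in this section.

```python
# Sketch: minimal structural lint for llms.txt content, following the
# conventions above (H1 title, Markdown-linked page list, curated size).
import re

def lint_llms_txt(text: str) -> list[str]:
    """Return a list of problems found in llms.txt content."""
    problems = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append('first line should be a Markdown H1 ("# Site Name")')
    # Markdown links of the form [text](https://...)
    links = re.findall(r"\[([^\]]+)\]\((https?://[^)]+)\)", text)
    if not links:
        problems.append("no Markdown links to important pages found")
    elif len(links) > 30:
        problems.append("more than 30 URLs; keep the list curated")
    return problems

sample = "# Acme Corporation\n\n- [About Us](https://acme.com/about): Company history"
print(lint_llms_txt(sample))  # []
```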
Warning: Missing Organization schema
Severity: Warning
Health score impact: -5 points
Affects all keywords: Yes
What is detected
The diagnosis checks whether your website includes Organization schema markup (Schema.org/Organization) in JSON-LD format.
What it means
Organization schema is the most important structured data type for AI visibility. It provides AI systems with machine-readable information about your company's identity: name, URL, logo, description, contact details, social profiles, and more.
Without Organization schema, AI systems must infer your company's identity from unstructured content, which may be incomplete or ambiguous. This weakens your entity signals and can lead to confusion with similarly-named companies.
How to resolve
Add Organization schema to your homepage (and optionally to every page) using JSON-LD format:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "Acme Corporation",
"url": "https://acme.com",
"logo": "https://acme.com/logo.png",
"description": "Leading provider of cloud infrastructure solutions",
"foundingDate": "2010",
"address": {
"@type": "PostalAddress",
"streetAddress": "123 Tech Street",
"addressLocality": "San Francisco",
"addressRegion": "CA",
"postalCode": "94105",
"addressCountry": "US"
},
"contactPoint": {
"@type": "ContactPoint",
"telephone": "+1-555-0100",
"contactType": "customer service"
},
"sameAs": [
"https://linkedin.com/company/acme",
"https://twitter.com/acme"
]
}
</script>
GEO/SEO considerations
- Organization schema benefits both traditional SEO (Knowledge Panel) and AI visibility (entity recognition).
- Include as many properties as possible: name, URL, logo, description, address, contact, social profiles, founding date, founders, and number of employees.
- If your company operates as a LocalBusiness, use that type instead of (or in addition to) Organization.
- Validate your schema using Google's Rich Results Test before deploying.
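Before reaching for an external validator, a quick local check can confirm the JSON-LD parses and carries the key identity properties. This is a sketch under assumptions: the `check_organization` helper and its property list follow the example above rather than any formal Schema.org requirement.

```python
# Sketch: check a JSON-LD snippet for core Organization properties.
# The RECOMMENDED list follows this section's example, not a formal spec.
import json

RECOMMENDED = ["name", "url", "logo", "description", "sameAs"]

def check_organization(jsonld: str) -> list[str]:
    """Return the recommended Organization properties that are missing."""
    data = json.loads(jsonld)  # raises ValueError on malformed JSON
    if data.get("@type") != "Organization":
        return ["@type is not Organization"]
    return [prop for prop in RECOMMENDED if prop not in data]

snippet = (
    '{"@context": "https://schema.org", "@type": "Organization", '
    '"name": "Acme Corporation", "url": "https://acme.com"}'
)
print(check_organization(snippet))  # ['logo', 'description', 'sameAs']
```

A check like this fits well in a CI step so schema regressions are caught before deployment, with Google's Rich Results Test as the final validation.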
Warning: HTTPS issues
Severity: Warning
Health score impact: -5 points
Affects all keywords: Yes
What is detected
The diagnosis checks whether your website is consistently served over HTTPS (secure HTTP).
What it means
HTTPS is a baseline requirement for modern web credibility. AI systems may deprioritize or refuse to index content served over insecure HTTP connections. Some AI crawlers will not follow HTTP URLs at all.
How to resolve
- Obtain an SSL/TLS certificate for your domain (free options include Let's Encrypt).
- Configure your web server to redirect all HTTP requests to HTTPS.
- Update internal links to use HTTPS URLs.
- Ensure no mixed content warnings (HTTP resources loaded on HTTPS pages).
- Update your sitemap.xml and canonical tags to use HTTPS URLs.
GEO/SEO considerations
- HTTPS is a confirmed Google ranking signal and equally important for AI visibility.
- Mixed content (some resources loaded over HTTP on an HTTPS page) can trigger warnings even if the base URL is HTTPS.
- After migrating to HTTPS, set up 301 redirects from HTTP to HTTPS to preserve link equity.
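Mixed content is the most common leftover after an HTTPS migration. A rough way to surface it is to scan rendered HTML for `http://` resource references, as in the sketch below; the `find_insecure_urls` helper is a hypothetical name, and a production audit would use a real HTML parser rather than a regex.

```python
# Sketch: flag insecure http:// resource references in an HTML page,
# the usual cause of mixed-content warnings on an HTTPS site.
import re

INSECURE = re.compile(r'(?:src|href)\s*=\s*["\'](http://[^"\']+)["\']', re.I)

def find_insecure_urls(html: str) -> list[str]:
    """Return http:// URLs referenced by src/href attributes."""
    return INSECURE.findall(html)

page = (
    '<img src="http://acme.com/logo.png">'
    '<a href="https://acme.com/about">About</a>'
)
print(find_insecure_urls(page))  # ['http://acme.com/logo.png']
```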
Warning: Missing sitemap.xml
Severity: Warning
Health score impact: -3 points
Affects all keywords: Yes
What is detected
The diagnosis checks for the presence of a sitemap.xml file at the standard locations (/sitemap.xml or as specified in robots.txt).
What it means
A sitemap.xml file tells crawlers (both search engines and AI systems) which pages exist on your website, when they were last updated, and how important they are relative to each other. Without a sitemap, crawlers must discover pages by following links, which may miss orphaned or deep pages.
How to resolve
- Generate a sitemap.xml file listing all public pages on your website.
- Include the <lastmod> tag for each URL so crawlers know when content was last updated.
- Submit the sitemap to Google Search Console and Bing Webmaster Tools.
- Reference the sitemap in your robots.txt file: Sitemap: https://yoursite.com/sitemap.xml
- Keep the sitemap automatically updated when pages are added, removed, or modified.
GEO/SEO considerations
- Most CMS platforms (WordPress, Shopify, etc.) generate sitemaps automatically. Verify that your CMS has this feature enabled.
- Large sites may need multiple sitemaps organized in a sitemap index.
- Include only canonical, indexable URLs in the sitemap. Do not include redirects, error pages, or noindex pages.
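If your CMS does not generate a sitemap, a minimal one with <lastmod> entries can be built with the standard library. The sketch below is illustrative: the `build_sitemap` helper and its URL-to-date mapping are assumptions, and the URLs are placeholders.

```python
# Sketch: build a minimal sitemap.xml with <lastmod> entries using
# the standard library. URLs and dates are placeholders.
import xml.etree.ElementTree as ET

def build_sitemap(urls: dict[str, str]) -> str:
    """urls maps page URL -> last-modified date (YYYY-MM-DD)."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for loc, lastmod in urls.items():
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = loc
        ET.SubElement(entry, "lastmod").text = lastmod
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            + ET.tostring(urlset, encoding="unicode"))

sitemap = build_sitemap({"https://acme.com/": "2024-01-15"})
```

In practice the URL-to-date mapping would come from your CMS or build pipeline so the sitemap stays current automatically.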
Info: FAQ schema opportunities
Severity: Info (no penalty)
Health score impact: None
Affects all keywords: Varies
What is detected
The diagnosis identifies pages on your website that contain FAQ-like content (question-and-answer patterns) but do not have FAQPage schema markup applied.
What it means
FAQ content is highly valuable for AI visibility. AI systems frequently extract FAQ-structured content for direct answers to user queries. When FAQ content exists but lacks schema markup, AI systems may still detect it from the HTML structure, but schema markup makes extraction significantly more reliable.
How to resolve
Add FAQPage schema markup to pages with FAQ content:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What services do you offer?",
"acceptedAnswer": {
"@type": "Answer",
"text": "We offer cloud infrastructure, managed hosting, and migration services for enterprise customers."
}
},
{
"@type": "Question",
"name": "How much does your service cost?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Our pricing starts at $99/month for basic plans. Enterprise pricing is customized based on usage and requirements."
}
}
]
}
</script>
GEO/SEO considerations
- FAQ schema can trigger rich results in Google search (FAQ rich snippets), providing both SEO and AI visibility benefits.
- Only add FAQ schema to genuine FAQ content. Do not fabricate questions just for schema purposes.
- Keep FAQ content concise and factual. AI systems prefer clear, direct answers.
- Update FAQ content regularly to ensure answers remain accurate.
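When you already maintain Q&A pairs in a CMS or data file, the FAQPage markup above can be generated rather than hand-written, which keeps schema and visible content in sync. A minimal sketch follows; the `build_faq_jsonld` name is an assumption, but the output matches the structure shown in this section.

```python
# Sketch: turn question/answer pairs into FAQPage JSON-LD matching
# the structure of the example above.
import json

def build_faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Return a JSON-LD string for a FAQPage with the given Q&A pairs."""
    doc = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(doc, indent=2)

jsonld = build_faq_jsonld([
    ("What services do you offer?",
     "We offer cloud infrastructure, managed hosting, and migration services."),
])
```

Wrap the resulting string in a `<script type="application/ld+json">` tag when rendering the page.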
Finding attributes
Each technical finding in the diagnosis interface includes:
| Attribute | Description |
|---|---|
| Type | blocker, warning, or info |
| Title | Short description of the finding |
| Description | Detailed explanation of the issue and its impact |
| Affects all keywords | Whether this finding impacts every keyword (true for most technical issues) or only specific ones |
| Recommendation | Specific action to resolve the finding |
The foundation principle
Technical accessibility is the foundation of AI visibility. The relationship between technical findings and content optimization is hierarchical:
- Layer 1: Technical access -- AI crawlers must be able to reach your content. Blockers in this layer negate everything above it.
- Layer 2: Content presence -- Content must exist for each target keyword. Without content, there is nothing to optimize.
- Layer 3: Content quality -- Content must be structured and written for AI readability (GEO optimization).
- Layer 4: External signals -- Authority, backlinks, and brand presence amplify your content's reach.
Fixing a technical blocker can unlock improvements across all other layers simultaneously, making technical findings the highest-leverage actions in most cases.
Relationship to other diagnosis components
- Technical findings generate penalties that reduce the health score.
- Blocker-level findings generate critical recommendations in the technical access category.
- Technical issues affect the entire keyword set, influencing all classifications in the gap analysis.
- The perspective flow shows the downstream impact of technical issues across all keyword paths.