Measuring AEO ROI: The Metrics That Tell You If It's Working
If you're investing in schema, FAQ blocks, llms.txt, and comparison pages, how do you know it's working? The four metrics that matter, the right measurement window, and a weekly review template.
AEO investment — schema markup, FAQ blocks, llms.txt, comparison pages — typically takes 4–8 weeks to show measurable results. The challenge is knowing what to measure and separating AEO signal from organic SEO noise.
This article covers the four metrics that matter, the right measurement window, and a weekly review template you can run in 30 minutes.
Why Standard SEO Metrics Don’t Capture AEO Results
Traditional SEO is measured in rankings, clicks, and organic sessions. AEO is measured in citations, AI-source revenue, and share-of-voice — none of which appear in Search Console or GA4’s default reports.
The mismatch causes two errors:
False negative: Your AEO work generates $600/month in ChatGPT-sourced revenue — but GA4 labels it Unassigned, Search Console doesn’t track AI citations, and the investment looks like it produced nothing.
False positive: Organic Google traffic grew 15% the month you launched AEO content. You attribute the lift to AEO when it was seasonal demand.
Without AI-specific metrics, you can’t separate the signals.
The Four Metrics That Matter
1. AI-Source Revenue (Primary)
What: Shopify orders where the originating session was an AI engine.
How to track: Inxy dashboard (order-level) or GA4 custom channel groups + Shopify revenue (session-level estimate, understates by 2–3×).
Baseline: Set it before implementing AEO changes. If you start in June, May AI-source revenue is your baseline, even if it reads $0 (which means prior AI revenue was invisible to your tracking, not necessarily absent).
Realistic targets at 90 days for a $30K/month Shopify store:
- With Inxy attribution: $400–$800/month AI-source revenue
- With GA4 custom channels only: GA4 will show $150–$300 (same revenue, undercounted)
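The session-level approach above comes down to grouping sessions by referrer. A minimal sketch of that grouping follows, assuming you can export each order with the referrer URL of its originating session. The hostname list and the `(referrer_url, order_total)` input shape are illustrative assumptions, not GA4's or Shopify's actual export format, and referrer matching will miss any AI-originated session that arrives without a referrer.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple
from urllib.parse import urlparse

# Assumed, non-exhaustive map of AI-engine referrer hostnames.
# Verify against the referrers that actually show up in your own data.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
}


def classify_ai_source(referrer_url: str) -> Optional[str]:
    """Return the AI engine name for a referrer URL, or None if unmatched."""
    host = (urlparse(referrer_url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host)


def ai_source_revenue(orders: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum order totals per AI engine; orders are (referrer_url, order_total)."""
    totals: Dict[str, float] = defaultdict(float)
    for referrer_url, order_total in orders:
        engine = classify_ai_source(referrer_url)
        if engine is not None:
            totals[engine] += order_total
    return dict(totals)
```

Run it on the baseline month first, so later months are compared against numbers produced by the same rules.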
2. AI Citation Count (Leading Indicator)
What: How many times your pages are cited in AI responses to relevant queries.
How to track:
- Manual (free): ask ChatGPT, Perplexity, and Claude 10–20 queries in your product category each week. Record who gets cited.
- Tools: Otterly.ai, Peec AI, or LLMrefs track citation frequency across engines.
Why it’s leading: Citation count changes 2–4 weeks before revenue changes. If citations grow in week 4, expect AI-source revenue to grow in weeks 6–8.
What queries to use: Bottom-funnel only.
Good: “best moissanite engagement ring under $500”, “moissanite vs lab diamond for daily wear”
Bad: “what is moissanite” — educational citations don’t convert.
3. Share of AI Voice (Competitive)
What: Your citations as a percentage of total category citations.
Formula: (Your citations ÷ Total citations across competitors for your query set) × 100
How to track: Run the same fixed query set (the 10–20 queries you use for citation tracking) every week. Record every citation, yours and competitors’. After 4 weeks, calculate share.
Why it matters: You might grow in absolute citation count while losing share to a competitor investing more aggressively. Share of AI Voice tells you whether you’re winning the category race, not just improving in isolation.
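The formula above is straightforward to compute from a citation log. Here is a minimal sketch, assuming you record one brand name per citation observed across your query set. The function name and the input shape are illustrative, not part of any tool mentioned here.

```python
from collections import Counter
from typing import Iterable


def share_of_ai_voice(citations: Iterable[str], brand: str) -> float:
    """Percentage of all recorded citations that belong to `brand`.

    `citations` holds one brand name per citation observed, covering
    both your own brand and competitors.
    """
    counts = Counter(citations)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # nothing recorded yet
    return 100 * counts[brand] / total


# Example: 4 weeks of spot-checks produced these citations.
log = ["yourstore", "rival_a", "rival_a", "rival_b", "yourstore"]
print(share_of_ai_voice(log, "yourstore"))  # 2 of 5 citations -> 40.0
```

Tracking the same log for each competitor shows who is gaining when your own share drops.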
4. AEO Content Coverage (Operational)
What: Percentage of your high-intent pages that are AEO-complete.
AEO-complete checklist for a page:
- FAQPage schema, 5+ Q&A pairs, matching visible content ✓
- Article schema with valid dateModified (within 6 months) ✓
- At least one comparison table (where relevant) ✓
- Listed in llms.txt ✓
- 2+ external authority links ✓
Target: 80%+ of your top 20 landing pages AEO-complete within 60 days.
Why it’s operational: Coverage is a leading indicator of leading indicators. Low coverage explains why citations aren’t growing yet.
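The checklist can be turned into a coverage percentage with a small audit script. The sketch below assumes you fill in one record per landing page by hand or from a crawl. The `PageAudit` fields are hypothetical names, and "within 6 months" is approximated as 183 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

FRESHNESS_WINDOW = timedelta(days=183)  # assumption: "6 months" as 183 days


@dataclass
class PageAudit:
    url: str
    faq_pairs: int                    # FAQPage Q&A pairs matching visible content
    date_modified: Optional[date]     # Article schema dateModified, if present
    needs_comparison_table: bool      # the "where relevant" judgment call
    has_comparison_table: bool
    in_llms_txt: bool
    external_authority_links: int


def is_aeo_complete(page: PageAudit, today: date) -> bool:
    """Apply every checklist item; all must pass."""
    fresh = (
        page.date_modified is not None
        and today - page.date_modified <= FRESHNESS_WINDOW
    )
    table_ok = page.has_comparison_table or not page.needs_comparison_table
    return (
        page.faq_pairs >= 5
        and fresh
        and table_ok
        and page.in_llms_txt
        and page.external_authority_links >= 2
    )


def coverage_percent(pages: List[PageAudit], today: date) -> float:
    """Share of audited pages that are AEO-complete, as a percentage."""
    if not pages:
        return 0.0
    complete = sum(is_aeo_complete(p, today) for p in pages)
    return 100 * complete / len(pages)
```

Feeding it your top 20 landing pages gives the number to compare against the 80% target.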
The Right Measurement Window
| Week | What typically happens |
|---|---|
| 1–2 | Schema installed, pages crawled by AI bots |
| 3–4 | First citation appearances (spot-check manually) |
| 4–6 | Citation count measurably up vs baseline |
| 6–8 | AI-source revenue begins appearing in attribution data |
| 8–12 | Share of AI Voice trend becomes visible |
| 12+ | Compounding effect as more pages are optimized |
Don’t evaluate AEO at week 2. Stores that declare it “doesn’t work” after 3 weeks are measuring at the wrong point.
Use 4-week rolling averages rather than week-over-week comparisons. AI search volume fluctuates with news cycles and product launches — rolling averages reveal the underlying trend.
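A 4-week rolling average is the mean of the current week and the three before it. A minimal sketch, assuming weekly AI-source revenue is kept as a plain list in chronological order (the function name is illustrative):

```python
from typing import List, Optional


def rolling_average(values: List[float], window: int = 4) -> List[Optional[float]]:
    """Trailing mean over `window` entries; None until the window is full."""
    averages: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)  # not enough history yet
        else:
            chunk = values[i + 1 - window : i + 1]
            averages.append(sum(chunk) / window)
    return averages


weekly_revenue = [100, 200, 300, 400, 500]
print(rolling_average(weekly_revenue))
# [None, None, None, 250.0, 350.0]
```

The first three weeks return no value on purpose, so a partial window is never mistaken for a trend.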
Weekly Review Template (30 Minutes)
5 min — Revenue check
Open Inxy or GA4 AI Search channel. Record this week’s AI-source revenue and order count. Update your tracker: Week | Revenue | Orders | 4-week rolling avg.
10 min — Citation spot-check
Run 5 queries from your tracking list in ChatGPT and Perplexity. Record: cited / not cited / competitor cited. Update citation tracker.
5 min — Content hygiene
- New pages published this week → add to llms.txt
- Existing pages updated → bump dateModified
- Inxy installed new FAQ blocks → note the page
10 min — Monthly deep-dive (first Monday only)
- 30-day AI-source revenue vs prior 30 days
- Citation count change
- AEO coverage %
- Top 3 pages by AI-source revenue — are they fully optimized?
- Top 3 coverage gaps — high-traffic pages with no AEO optimization
The Common Failure Pattern
The most common failure is investing in schema installation but not content quality. Schema without citable content produces schema without citations. An FAQ block with 3 generic questions and vague answers won’t get extracted, because the AI engine reads the quality of the answer, not just the presence of the schema.
The combination that works: schema + high-quality extractable content (stat-anchored, self-contained answers). See Citable Quote Engineering for the specific patterns.
Next: Inxy vs GA4 vs Triple Whale vs Polar — what each tool actually shows for AI search attribution.