Alex Haliday from AirOps took a refreshingly practical approach at Tech SEO Connect: five questions his team gets asked constantly, with direct yes-or-no answers and the reasoning behind each.
With 20 years of building companies and growth teams—including the zero-to-one SEO story at Masterclass—Haliday has the credibility to cut through the noise. His talk was the perfect complement to the more theoretical presentations: here’s what to actually do on Monday morning.
Question 1: Should You Create Markdown Versions of Pages?
The short answer: No, with rare exceptions.
The idea behind markdown alternatives is appealing: take your heavy HTML pages with all their JavaScript and boilerplate, and create lean, well-structured markdown versions optimized for AI retrieval. Some teams have experimented with .md versions of pages or real-time switching based on user agent (serving different content to GPTBot vs. Chrome).
The theory is solid—context windows are finite, LLM focus matters, and stripping down content should help AI find what it needs. This is exactly how you’d build your own RAG system.
The problem: there’s no evidence it improves anything. The RAG pipelines already do an excellent job converting well-structured HTML into markdown or equivalent formats. Both community tests and Google’s own statements confirm this doesn’t move the needle.
Haliday acknowledged two scenarios where teams consider this approach anyway. First, organizational politics: some pages are nightmares to change because of all the stakeholders involved, so teams create markdown alternatives to iterate faster on AI optimization without touching the main page. Second, content expansion: teams want to format answers differently for citation likelihood without updating the primary page.
His warning on that second use case: once you create different content pathways between humans and agents, you’re stepping on landmines. Traditional cloaking guidelines haven’t been explicitly updated for this, but the spirit is clear—consistency matters. There’s also user confusion risk: someone sees a passage in ChatGPT, clicks through, and Chrome tries to highlight text that doesn’t exist on the actual page.
Recommendation: Don’t use markdown-only versions unless you have absolutely no alternative. Stick to classic best practices—clean pages, strong heading hierarchies, remove div bloat.
Question 2: Should You Adopt llms.txt?
The short answer: No, not today.
The spirit of llms.txt is good—we want to provide high-quality information to models. But in its current implementation, it’s solving a problem that doesn’t exist.
For llms.txt to be helpful, it would need to be adopted and supported by the major players, and it would need to mean that fan-out queries prioritize linked pages. That’s simply not the case. When fan-out queries happen, they see the same SERP results a human would. File type doesn’t matter.
Haliday also called out a trend he’s seen: “I’ve also seen some really crazy stuff, like people putting ‘Hey, AI, click this’ in their footer. Don’t do that.” In 99.9% of cases, the retrieval process is one hop—it’s not doing deep exploration of your site.
What about ChatGPT’s agent mode? “I have never seen a real human being use that for anything useful,” Haliday said. Don’t optimize for edge cases.
What’s Coming That Might Change This
Haliday shared some forward-looking developments worth watching. OpenAI’s Agentic Commerce Protocol lets brands submit rich product descriptions directly to ChatGPT—no RAG retrieval needed. There’s reason to believe this will expand beyond products to events and other entity types.
The Apps SDK, which lets brands build in-stream experiences within ChatGPT, creates interesting strategic decisions about what to build in-feed versus on your own site. App submission opens at the end of 2025; initially only installed apps render, but in six to nine months you’ll be able to discover new apps. “There will be a new optimization challenge for us all to contend with,” Haliday predicted.
Question 3: Should You Go Deeper with Schema?
The short answer: Yes, absolutely.
This was the recurring theme of the entire conference, and Haliday reinforced it: schema creates clarity around the entities a page represents. Products, articles, Q&A pairs—all worthwhile.
Even if you’re skeptical about runtime usage, there’s a training data argument: Web Data Commons extracts schema markup from Common Crawl, and that structured data feeds training corpora. Clear, unambiguous entity definitions with freshness signals, pricing, and ratings are valuable regardless of when they’re consumed.
Haliday shared his priority schema types. The less obvious high-value ones: SpeakableSpecification (for voice), VideoObject, SoftwareApplication, and Dataset. The more common ones: Product, Article, FAQPage, LocalBusiness, HowTo, Event, and Organization.
One he wished he’d included: ImageObject. “The nice thing about doing image schema tags is that you actually get a really fantastic Google Image boost if people are searching for infographics. We’ve seen tremendous success in traffic coming through from Google Images by doing high quality schema for images.”
His framing: think about hardening the definition of entities your content contains—products, people, places, FAQs, events. The more you do that, the better future indexing and runtime models can understand relationships and disambiguate.
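To make “hardening entities” concrete, here is a minimal sketch of JSON-LD for a product page, generated with Python’s `json` module. The product details, image, price, and ratings are invented for illustration; they are not from the talk, and your real markup should reflect your actual page content.

```python
import json

# Hypothetical product page: harden the Product entity and give the hero
# image its own ImageObject so it is eligible for Google Images traffic.
product_schema = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Standing Desk",
    "description": "Height-adjustable standing desk with memory presets.",
    "image": {
        "@type": "ImageObject",
        "contentUrl": "https://example.com/img/desk-hero.jpg",
        "caption": "Example Standing Desk in a home office",
        "width": 1200,
        "height": 800,
    },
    "offers": {
        "@type": "Offer",
        "price": "499.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.7",
        "reviewCount": "212",
    },
}

# Emit as a JSON-LD <script> block for the page <head>.
json_ld = (
    '<script type="application/ld+json">\n'
    + json.dumps(product_schema, indent=2)
    + "\n</script>"
)
print(json_ld)
```

The point of nesting `ImageObject` rather than passing a bare image URL is exactly the disambiguation Haliday describes: the caption and dimensions turn an anonymous file into a defined entity.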
Question 4: Should You Analyze Server Logs for AI Traffic?
The short answer: Yes, this is gold.
“If AI is constantly spending time looking at your pages, you should probably be spending time updating them,” Haliday said. Simple logic, but most teams aren’t doing this.
The problem with traditional tracking (GA and similar tools) is that it relies on JavaScript, and AI crawlers don’t execute JavaScript. You need CDN-level or web server-level tracking to see how agents traverse your pages.
Server logs give you the bottom-up view: a heat map of crawl behavior showing which pages are heavily trafficked by AI bots. But that’s only half the picture. You also need the top-down view: for your high-priority queries, which pages are being cited?
“You’re really trying to triangulate a picture of how agents are visiting your pages and how agents are using your pages for RAG and runtime retrieval.”
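As a rough illustration of the bottom-up view, here is a Python sketch that builds a crawl heat map from standard combined-format access logs. The user-agent substrings are examples of published AI crawler names, not an exhaustive or authoritative list; check each vendor’s documentation for current strings, and adjust the regex if your log format differs.

```python
import re
from collections import Counter

# Substrings that identify well-known AI crawlers/fetchers in the
# User-Agent header (illustrative, non-exhaustive).
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
             "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]

# Combined log format: path is the second token of the quoted request
# line; the user agent is the last quoted field.
LINE_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+)[^"]*".*"([^"]*)"$')

def ai_crawl_heatmap(log_lines):
    """Count requests per path, keeping only AI user agents."""
    hits = Counter()
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        path, ua = m.groups()
        if any(agent in ua for agent in AI_AGENTS):
            hits[path] += 1
    return hits

sample = [
    '1.2.3.4 - - [01/Nov/2025:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '5.6.7.8 - - [01/Nov/2025:10:00:01 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/130.0"',
]
print(ai_crawl_heatmap(sample))  # only the GPTBot hit counts
```

Sorting that counter descending gives you the heat map: the pages AI bots care about most, which per Haliday are the pages you should be updating.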
Question 5: Should You Use Answer Blocks?
The short answer: Yes, definitely.
LLMs consistently extract short, declarative answer chunks. They’re looking for authoritative, fresh snippets they can include in responses. Formatting pages with Q&A sections, TLDRs, and summaries at the top is a strong strategy.
Haliday’s framing: “You can think of the agent as sitting between you and your end customer. You want to make sure you’re setting them up for success with prepared snippets they can take and include in their responses.”
Specifics: add first-paragraph definitions, FAQ blocks, key fact sections, and step-by-step blocks. Make sure they’re well-structured from an HTML standpoint and succinctly answer the question. Freshness signals on the answer help. Baking authority into the first sentence helps. General page hygiene matters.
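As a sketch of what an answer block’s HTML might look like, here is a small Python helper that renders one with a question heading, a short declarative answer first, and a visible freshness signal. The class names and copy are illustrative, not any published standard.

```python
from datetime import date

def answer_block(question, answer, updated=None):
    """Render a citation-friendly Q&A block: the question as a heading,
    a short declarative answer up front, and a freshness signal."""
    updated = updated or date.today().isoformat()
    return (
        '<section class="answer-block">\n'
        f"  <h2>{question}</h2>\n"
        f"  <p>{answer}</p>\n"
        f'  <p class="updated">Last updated: '
        f'<time datetime="{updated}">{updated}</time></p>\n'
        "</section>"
    )

html = answer_block(
    "What is an answer block?",
    "An answer block is a short, declarative snippet formatted so an "
    "AI agent can lift it directly into a response.",
    updated="2025-11-01",
)
print(html)
```

The structure mirrors the advice above: well-formed HTML, the authoritative claim in the first sentence, and a machine-readable `<time>` element carrying the freshness signal.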
Bonus Question: Can You Capitalize on LLM URL Hallucinations?
The short answer: Yes, if you have enough traffic.
LLMs love to make up URL paths. Studies show about 1% of URLs that ChatGPT tries to retrieve don’t actually exist. If you can track those hallucinated URLs in your server logs, they represent opportunities to create pages that will “just work.”
The caveat: not everyone has sufficient traffic for this to be worthwhile. But for larger sites or those with higher hallucination rates, it’s free traffic waiting to be captured. And the data suggests this problem isn’t going away.
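Assuming combined-format access logs again, a Python sketch of surfacing those opportunities: paths that AI agents request and that 404 repeatedly. The user-agent list and log format are assumptions to adapt to your stack, and the repeat-hit threshold is an arbitrary starting point.

```python
import re
from collections import Counter

# Illustrative AI user-agent substrings; verify against vendor docs.
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
             "ClaudeBot", "PerplexityBot"]

# Combined log format: capture path, status code, and user agent.
LINE_RE = re.compile(r'"(?:GET|HEAD) (\S+)[^"]*" (\d{3}) .*"([^"]*)"$')

def hallucinated_paths(log_lines, min_hits=3):
    """Paths AI agents request that 404 — candidate pages to create."""
    misses = Counter()
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        path, status, ua = m.groups()
        if status == "404" and any(a in ua for a in AI_AGENTS):
            misses[path] += 1
    # Only surface paths requested repeatedly; a one-off is likely noise.
    return [(p, n) for p, n in misses.most_common() if n >= min_hits]

miss = ('9.9.9.9 - - [01/Nov/2025:11:00:00 +0000] '
        '"GET /docs/api-pricing HTTP/1.1" 404 162 "-" "ChatGPT-User/1.0"')
hit = ('9.9.9.9 - - [01/Nov/2025:11:00:05 +0000] '
       '"GET /docs HTTP/1.1" 200 900 "-" "ChatGPT-User/1.0"')
sample = [miss] * 3 + [hit]
print(hallucinated_paths(sample))  # [('/docs/api-pricing', 3)]
```

A path that shows up here with real volume is a page LLMs already believe exists; creating it makes the hallucinated link “just work.”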
My Takeaways
What I appreciated about Haliday’s talk was the decisiveness. After two days of frameworks and theory, he gave us a checklist: do this, don’t do that, here’s why.
The quick summary:
- Markdown alternatives: Skip it. The RAG pipelines already handle HTML-to-markdown conversion. Focus on clean, well-structured pages instead.
- llms.txt: Not yet. Keep an eye on it, but don’t invest time today. And definitely don’t put “Hey AI, click this” in your footer.
- Schema: Yes, go deeper. Prioritize based on your content types, but this is table stakes now. Don’t forget ImageObject for Google Images traffic.
- Server logs: Yes, analyze AI bot traffic. Combine bottom-up crawl data with top-down citation analysis to get the full picture.
- Answer blocks: Yes, add them. TLDRs, FAQs, key facts, step-by-step blocks. Spoon-feed the intermediary.
- Hallucinated URLs: Track them in logs if you have the traffic volume. Free page opportunities.
The forward-looking piece about OpenAI’s Agentic Commerce Protocol and the Apps SDK is worth watching. App store optimization for ChatGPT might sound far-fetched, but Haliday’s timeline—six to nine months until app discovery is live—makes it feel imminent.
As Haliday put it, “We are all students of the space.” But some students are doing their homework more rigorously than others. This talk was a good reminder to stay practical while the theory catches up.