Researchers at Carnegie Mellon University published a paper in October 2025 that did something nobody had done systematically before: reverse-engineer the content preferences of Gemini, GPT, and Claude into a concrete list of what makes a document more likely to be cited in an AI-generated answer. I built a tool yesterday that applies every rule to every page on this site automatically.

CMU extracted 14 AI citation preference rules. Here's what they found

The paper (What Generative Search Engines Like and How to Optimize Web Content Cooperatively — Wu, Zhong, Kim & Xiong, arXiv:2510.11438) tested Gemini, GPT-4, and Claude across thousands of documents and distilled 14 preference rules. Three appear identically across all three engines, making them the highest-confidence rules in the set:

1. State the key conclusion at the beginning. Every AI engine tested penalises documents that build up to an answer; the engine needs to be able to extract the conclusion from the first sentence.

2. Cite authoritative sources for every factual claim. This just means explicit inline attribution (e.g. "ASIC register, AFSL #123456" beats "fully licensed").

3. Eliminate promotional language. Superlatives and marketing copy don't just fail to help. The paper found they actively degrade citation probability scores.

What to do this week: Open your "About" or bio page. Find the first paragraph. If it doesn't contain your AFSL number, your registered scope of advice, and one specific, verifiable credential in the first two sentences — rewrite it. That's the single highest-leverage content change you can make right now.

Source: arXiv:2510.11438 — Wu, Zhong, Kim & Xiong (Carnegie Mellon University / Vody), October 2025

I built a 3-script pipeline to apply all 14 rules automatically

The AutoGEO code exposes the 14 rules as a rewriter API. I wrote three scripts to operationalise it for LogitRank:

1. extract-for-autogeo.ts — reads every blog post from my website, strips the HTML, and writes each one to a plain-text file in an input folder.

2. run-autogeo.py — feeds each plain-text file to Claude Sonnet 4.6 via the Anthropic API with a prompt built from all 14 rules, and saves the rewritten document to an output folder (a sketch of this step follows the list).

3. apply-autogeo.ts — converts the rewritten markdown back to HTML and patches it directly into the blog post.
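Here's the sketch of that middle step. My actual script is Python; this TypeScript version uses the official Anthropic SDK instead, and the folder names, model id and RULES array are placeholders rather than the real pipeline (only the three universal rules are spelled out).

```typescript
import Anthropic from "@anthropic-ai/sdk";
import { readdir, readFile, writeFile } from "node:fs/promises";
import path from "node:path";

// The client reads ANTHROPIC_API_KEY from the environment.
const client = new Anthropic();

// The three universal rules from the paper; the remaining eleven would follow.
const RULES = [
  "State the key conclusion in the first sentence.",
  "Cite an authoritative source inline for every factual claim.",
  "Eliminate promotional language and superlatives.",
  // ...
];

const SYSTEM = [
  "Rewrite the document so it satisfies every rule below.",
  "Preserve every fact; do not invent new claims.",
  ...RULES.map((rule, i) => `${i + 1}. ${rule}`),
].join("\n");

async function rewriteAll(inputDir = "autogeo-input", outputDir = "autogeo-output") {
  for (const file of await readdir(inputDir)) {
    const text = await readFile(path.join(inputDir, file), "utf8");

    const msg = await client.messages.create({
      model: "claude-sonnet-4-5", // placeholder: point at whichever Sonnet release you use
      max_tokens: 8192,
      system: SYSTEM,
      messages: [{ role: "user", content: text }],
    });

    // Concatenate the text blocks of the response into one markdown document.
    const rewritten = msg.content
      .map((block) => (block.type === "text" ? block.text : ""))
      .join("");

    await writeFile(path.join(outputDir, file.replace(/\.txt$/, ".md")), rewritten, "utf8");
  }
}

rewriteAll().catch(console.error);
```

The same loop works for any folder of plain-text pages, not just blog posts.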

The full pipeline ran across 48 blog posts in a single terminal session. The rules themselves were extracted by the CMU researchers from real AI engine behaviour, not from a model's self-assessment.

What to do this week: You don't need to automate this. Pick your highest-traffic page and manually apply the three universal rules: move the answer to sentence one, add an explicit inline citation to ASIC or a regulatory source, and remove any superlatives or promotional phrases. That's the manual version of what the pipeline does.

Neutral, factual content is a ranking signal

The paper found that promotional rewrites degraded "engine utility scores", the researchers' composite measure of how much an AI engine benefits from citing a given source. AI engines are pattern-matching for objective, attributable facts, and marketing copy produces noise in that pattern.

For financial planners, this matters in a specific way. The AFSL compliance requirement to avoid misleading or deceptive conduct (s12DA ASIC Act) already pushes you toward factual, accurate language. You're not being asked to sacrifice your marketing voice for AI visibility. You're being told that the content your compliance team already insists on is mechanically better suited for AI citation than the copy your marketing firm writes.

What to do this week: Pull any service page on your site and count the sentences a compliance officer would flag. Now count the sentences that contain a specific, verifiable fact. If the second count is lower than the first, that gap is a rough measure of how under-optimised the page is for AI citation.

Source: arXiv:2510.11438 — Section 4.3, Promotional Tone and Engine Utility
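If you want a repeatable version of that count, here's a rough heuristic sketch. It is not the CMU paper's method: it simply flags sentences containing common superlatives as likely compliance problems and treats sentences containing a number, an AFSL/ASIC reference or a citation marker as verifiable facts, and both word lists are purely illustrative.

```typescript
// Rough heuristic for the "flagged vs verifiable" count above.
// Not the paper's method; the word lists are illustrative only.
const PROMO = /\b(best|leading|premier|unrivalled|trusted|award-winning|world-class)\b/i;
const FACT = /(\d|AFSL|ASIC|arXiv:)/;

function auditPage(text: string) {
  // Naive sentence split — good enough for a rough count.
  const sentences = text.split(/(?<=[.!?])\s+/).filter((s) => s.trim().length > 0);
  const flagged = sentences.filter((s) => PROMO.test(s)).length;
  const verifiable = sentences.filter((s) => FACT.test(s)).length;
  return { sentences: sentences.length, flagged, verifiable };
}

console.log(
  auditPage(
    "We are the leading advice firm in Brisbane. " +
      "Advice is provided under AFSL 123456, registered with ASIC."
  )
);
// → { sentences: 2, flagged: 1, verifiable: 1 }
```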

When it comes to internal optimisation, the gap between "what AI engines cite" and "what most AFSL firms publish" is largely a content structure issue. Most financial planners have the underlying facts; they just bury them under taglines and build-up. The AutoGEO pipeline makes that gap measurable and fixable at scale.
