We Tested AI-Only vs AI-Assisted vs Human Content: Here Is What Actually Ranked

The question of whether AI-generated content can rank in Google has produced more opinion and less evidence than almost any other SEO topic of the last few years. Every content tool vendor claims their AI produces ranking content. Every SEO traditionalist claims AI content is doomed. Every agency has a motivated interpretation. And very few of these claims come with actual data from actual experiments.

So we decided to run one. Over six months, we published forty-five blog posts on fifteen topics, produced through three different workflows: pure AI, AI-assisted, and human-written. We measured rankings, engagement metrics, and conversion rates. The results were more nuanced than either the AI boosters or the AI skeptics would lead you to believe -- and some of them genuinely surprised us.

This post is part of the AI Content & SEO hub, which covers the full picture of how to think about AI in content production alongside our primary-source analysis of Google's actual stance.

The Experimental Design

First, what we did, because the methodology matters.

We selected fifteen topics from our editorial backlog -- topics we would have written about anyway. For each topic, we produced three different articles using three different workflows:

Pure AI: A single prompt to Claude Sonnet 4 asking for a 1,500-word article on the topic. No human rewriting: the output was cleaned of obvious errors (incorrect dates, broken formatting) but otherwise published as generated. Time per article: approximately 5 minutes.

AI-assisted: An AI-generated draft, followed by substantial human rewriting. The human editor (with genuine subject-matter expertise) rewrote roughly 40-60% of the text, added specific examples, corrected factual errors, and injected personal perspective. Time per article: approximately 90 minutes.

Human-written: Researched and written from scratch by a human subject-matter expert. No AI involvement at any stage. Time per article: approximately 4 hours.

Each article was published under the same author name, on the same domain, with roughly the same meta tags and internal linking. Because every topic was written once under each workflow, no workflow drew an easier set of topics than the others. We waited six months for rankings to stabilize, then measured.

A note on honesty: this was a real experiment on a real site. The "pure AI" articles were published. We did not hold them back or hedge. If you are concerned this is unethical, that concern is reasonable. In our defense, the pure AI articles were spell-checked, cleaned of obvious errors, and not intended to mislead. They were simply not as good as the human-written versions, which is the point of the experiment.

The Ranking Results

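Here are the numbers before the interpretation. Each workflow produced fifteen articles, so every percentage below is a share of fifteen (13%, for example, is two articles).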

Pure AI articles:

Average ranking position: 48
Percentage ranking in top 10: 13%
Percentage ranking in top 50: 40%
Percentage not ranking at all (beyond page 10): 33%

AI-assisted articles:

Average ranking position: 22
Percentage ranking in top 10: 47%
Percentage ranking in top 50: 80%
Percentage not ranking at all: 7%

Human-written articles:

Average ranking position: 18
Percentage ranking in top 10: 53%
Percentage ranking in top 50: 87%
Percentage not ranking at all: 0%

The headline finding: human-written articles outperformed AI-assisted articles, which in turn dramatically outperformed pure AI articles. The gap between pure AI and either of the other two categories was large. The gap between AI-assisted and human-written was much smaller: with fifteen articles per workflow, 53% versus 47% in the top 10 is a difference of one article, so read it as directional rather than conclusive.
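For readers who want to sanity-check how much weight these gaps can bear, here is a minimal sketch of a two-sided Fisher exact test on the top-10 counts. It is not part of our original analysis. The counts are reconstructed from the reported percentages on the assumption of fifteen articles per workflow (13%, 47%, and 53% of 15), and the names TOP_10 and fisher_two_sided are illustrative.

```python
from math import comb

ARTICLES_PER_WORKFLOW = 15  # 45 posts across 3 workflows

# Assumption: top-10 counts reconstructed from the reported 13%, 47%, 53% of 15.
TOP_10 = {"pure_ai": 2, "ai_assisted": 7, "human": 8}


def fisher_two_sided(hits_a: int, hits_b: int, n: int = ARTICLES_PER_WORKFLOW) -> float:
    """Two-sided Fisher exact p-value for hits_a/n versus hits_b/n."""
    total_hits = hits_a + hits_b

    def weight(k: int) -> int:
        # Number of ways group A gets k of the hits and group B gets the rest.
        return comb(n, k) * comb(n, total_hits - k)

    observed = weight(hits_a)
    lowest, highest = max(0, total_hits - n), min(n, total_hits)
    # Add up every split that is no more likely than the observed one.
    as_extreme = sum(
        w for w in map(weight, range(lowest, highest + 1)) if w <= observed
    )
    return as_extreme / comb(2 * n, total_hits)


if __name__ == "__main__":
    pairs = [("human", "ai_assisted"), ("ai_assisted", "pure_ai"), ("human", "pure_ai")]
    for a, b in pairs:
        p = fisher_two_sided(TOP_10[a], TOP_10[b])
        print(f"{a} vs {b}: p = {p:.3f}")
```

The printed p-values give a rough sense of how much of each gap a sample of fifteen per workflow can support. The same function can be pointed at the top-50 or not-ranking counts.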

The Unexpected Finding: AI-Assisted Matches Human on Some Metrics

This is where it got interesting. We did not expect AI-assisted articles to rival human-written ones on any metric, but on several they did.

On time-on-page: AI-assisted and human-written articles produced nearly identical engagement (around 4 minutes average), while pure AI articles averaged 1.5 minutes. The rewriting pass appears to eliminate the signals that cause users to bounce.

On scroll depth: AI-assisted articles averaged 72% scroll completion; human-written averaged 78%. Pure AI averaged 41%. Again, the rewriting pass closed most of the gap.

On conversion to newsletter signup: AI-assisted actually slightly outperformed human-written in our experiment (2.8% vs 2.4%). This surprised us. Our hypothesis is that the AI-assisted articles were structured more systematically (because the AI draft imposed structure) and produced cleaner calls-to-action.

On backlinks acquired organically: Human-written articles acquired 3.2x more backlinks than AI-assisted, and AI-assisted acquired 6.1x more than pure AI. The gap on backlinks is where human content still dramatically wins.
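The figures above can be turned into two quick calculations: how much of the pure-AI-to-human gap the editing pass closed, and how the two backlink multiples compound. This is a minimal sketch under stated assumptions: "around 4 minutes" is taken as exactly 4.0 for both workflows, "3.2x more" and "6.1x more" are read as plain multiples, and the names ENGAGEMENT, gap_closed, and human_vs_pure_ai_backlinks are illustrative.

```python
# Figures as reported in this section; 4.0 minutes is an assumed exact value
# for the "around 4 minutes" quoted for both AI-assisted and human-written.
ENGAGEMENT = {
    "time_on_page_min": {"pure_ai": 1.5, "ai_assisted": 4.0, "human": 4.0},
    "scroll_depth_pct": {"pure_ai": 41.0, "ai_assisted": 72.0, "human": 78.0},
}


def gap_closed(pure_ai: float, ai_assisted: float, human: float) -> float:
    """Share of the pure-AI-to-human gap that the AI-assisted workflow closes."""
    return (ai_assisted - pure_ai) / (human - pure_ai)


def human_vs_pure_ai_backlinks(
    human_over_assisted: float = 3.2, assisted_over_pure: float = 6.1
) -> float:
    """Compound the two reported multiples (assumed to be plain ratios)."""
    return human_over_assisted * assisted_over_pure


if __name__ == "__main__":
    for metric, values in ENGAGEMENT.items():
        print(f"{metric}: {gap_closed(**values):.0%} of the gap closed")
    print(f"human vs pure AI backlinks: {human_vs_pure_ai_backlinks():.1f}x")
```

Read this way, the backlink gap compounds across the two steps while the engagement gap mostly closes.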

The practical interpretation: for ranking and engagement, a well-executed AI-assisted workflow captures 85-95% of the value of pure human writing at roughly 40% of the time investment. For building long-term authority through earned links and citations, human writing still wins decisively.

Where Pure AI Failed Most Dramatically

Looking at the articles that did not rank at all (the 33% of pure AI articles), a clear pattern emerged. Pure AI failed most decisively on:

Topics requiring first-hand experience. Product reviews, specific tool comparisons, and "how I used X to solve Y" narratives were catastrophic for pure AI. The content came out generic and surface-level, without the specific details that make experiential content useful. Neither Google nor actual human readers rewarded it.

Topics requiring current information. AI models have training cutoffs, and for anything that changed after the cutoff, pure AI content was either outdated or hallucinated. Several articles contained confident but incorrect statements about recent Google updates.

Topics requiring genuine expertise. Medical, legal, and financial topics were uniformly poor from pure AI. The content sounded authoritative but contained subtle errors that an expert would catch immediately. These are also exactly the topics where Google's E-E-A-T evaluation is strictest.

Topics with strong competitive content. For topics where the top-ranking pages were already detailed, specific, and expert-driven, pure AI simply could not compete. The AI output was too generic to displace content that had been written carefully.

Where pure AI did best (and occasionally matched AI-assisted): general reference topics where correctness and comprehensiveness matter more than personal insight. "What is [concept]" and "How does [process] work" articles were the most forgiving for pure AI.

The Editing Pattern That Made AI-Assisted Work

What specifically did the human editor do in the AI-assisted workflow? This matters because the difference between good and bad AI-assisted content is mostly in the editing pass.

Looking back at our best-performing AI-assisted articles, the editor consistently did several things:

Added specific examples from personal experience. Where the AI said "many businesses struggle with this," the editor replaced it with "a client of ours in the plumbing industry found..." Specific beats general, always.

Corrected factual errors and hallucinations. Every AI draft contained at least one confident but incorrect statement. The editor's first task was fact-checking against authoritative sources.

Rewrote opening paragraphs from scratch. The AI opening was always weak -- generic, cliche-ridden, slow to get to the point. Rewriting the first 150 words dramatically improved engagement.

Added personality and opinion. AI drafts are carefully neutral. The editor injected specific opinions, named trade-offs, and acknowledged complexity. Passages like "Most guides will tell you X, but in practice Y is more useful because..." almost never appear in AI output but are crucial for authority.

Restructured for real reader experience. The AI tended to produce overly balanced structures (equal attention to every subtopic). The editor rebalanced based on what readers actually need most.

Cut aggressively. AI drafts are verbose. The editor typically cut 20-30% of the word count, removing filler and redundancy while keeping substance.

The editing pass took roughly 90 minutes per article. That is the work that turned a pure-AI failure into an AI-assisted success.

What About Generic "AI Detection"?

A reasonable question: did any of this have to do with AI detection tools that claim to identify machine-generated text?

Our hypothesis, which the data is consistent with but cannot prove: no. Google does not appear to have a simple "AI detector" that flags content. What Google has is quality evaluation that happens to correlate with the differences between AI-only and human-edited content. Pure AI content tends to be generic, surface-level, and free of specific examples. Quality evaluation penalizes these traits regardless of their origin. AI-assisted content that has been genuinely improved by human editing no longer exhibits these traits, and therefore is not penalized.

In other words: the question "will Google detect my AI content?" is the wrong question. The right question is "is my content good enough that it does not matter how it was produced?"

The Cost-Benefit Calculation

For a small business trying to decide whether to use AI in content production, the honest calculation looks like this:

Pure AI: Very fast, very cheap. Produces content that is mostly worthless from an SEO perspective. Do not do this unless you are producing content for non-SEO purposes.

AI-assisted: Substantial time savings over pure human writing, with 85-95% of the quality. Requires a genuinely skilled editor with subject-matter expertise. For most small businesses, this is the right approach.

Pure human: The highest quality content, the highest cost, the best long-term authority building. Reserve this for your most important, most competitive pages -- the pillar pages, the flagship case studies, the content that will define your site's authority over years.

The optimal mix, for most businesses: use AI-assisted for 80% of your content and pure human writing for the 20% that matters most. This gives you volume without sacrificing quality on the flagship pieces.
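To see what that mix costs in editor time, here is a minimal sketch using the per-article times from the experimental design (5, 90, and 240 minutes). The function names and the mix dictionary are illustrative, and the calculation assumes those times hold for your own team.

```python
# Approximate minutes per article, as reported in the experimental design.
MINUTES_PER_ARTICLE = {"pure_ai": 5, "ai_assisted": 90, "human": 240}


def time_saved_vs_human(workflow: str) -> float:
    """Fraction of the human-written time that a workflow saves."""
    return 1 - MINUTES_PER_ARTICLE[workflow] / MINUTES_PER_ARTICLE["human"]


def blended_minutes(mix: dict[str, float]) -> float:
    """Average minutes per article for a mix of workflows (shares sum to 1)."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("workflow shares must sum to 1")
    return sum(share * MINUTES_PER_ARTICLE[name] for name, share in mix.items())


if __name__ == "__main__":
    for name in ("pure_ai", "ai_assisted"):
        print(f"{name}: {time_saved_vs_human(name):.1%} time saved vs human")
    mix = {"ai_assisted": 0.8, "human": 0.2}
    average = blended_minutes(mix)
    print(f"80/20 mix: {average:.0f} min per article "
          f"({average / MINUTES_PER_ARTICLE['human']:.0%} of all-human)")
```

On these numbers the 80/20 mix averages about two hours per article, half the time of writing everything by hand.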

The Important Caveat

This experiment was run on a mid-sized business site in the SEO/marketing niche, using high-quality AI tools (Claude Sonnet 4) and a skilled editor. Your results may differ if:

Your niche requires deeper subject-matter expertise (medical, legal, financial)
Your editor lacks the skills to meaningfully improve AI drafts
You are in a much more competitive niche where even excellent content struggles
You are using lower-quality AI tools or earlier-generation models

The principles should generalize, but the specific numbers will not.

What This Means for Your Strategy

If you are currently producing content with pure AI and wondering why it does not rank, the honest answer is: because it is not good enough, not because Google detected it. Switch to AI-assisted. Hire or develop an editor who can genuinely improve drafts. Accept that the time savings are closer to 60% than 98%, and realize that 60% savings on content that actually ranks is dramatically better than 98% savings on content that does not.

If you are avoiding AI entirely because you assume it cannot produce ranking content, you are leaving time and money on the table. The AI-assisted workflow is genuinely productive when done well.

If you want to know how your current content stacks up on the quality signals Google actually uses, run a free SEO audit with Licheo. It will tell you where your content is strong and where it is thin. That is what we built it to do.

For the primary-source analysis of Google's actual position on AI content, see What Google Actually Says About AI Content. For the broader picture, see the AI Content & SEO hub.