
Why Your “Direct Traffic” in GA4 Isn’t What It Used to Be

A Three-Site Analysis of How AI Tools Are Reshaping One of Marketing’s Most Trusted Metrics

Executive Summary

Something changed in Google Analytics in mid-2025, and it hasn’t reverted. What marketers call “Direct traffic” — visits with no tracked source — is being flooded by a new kind of visitor: people who clicked a link inside ChatGPT, Claude, Perplexity, Gemini, or another AI tool and landed on your site. Because AI tools don’t pass along referral information the way Google Search does, those visits show up as Direct in your reports.

We first spotted this on smamarketing.net. But that raised an important question: is this happening everywhere, or is one site just unusually exposed?

This study answers that question across three websites managed by SMA Marketing, all measured with the same tools during the same time window: smamarketing.net (a B2B marketing agency content site), dragonplate.com (a B2B composites e-commerce site), and element6composites.com (a B2B specialty composites services site).

The findings confirm the original discovery — but reframe it. AI-driven distortion of Direct traffic isn’t a uniform problem. It’s a spectrum. The same underlying mechanism produces dramatically different outcomes depending on what kind of site you have.


What We Found

  • Direct traffic is being distorted by AI on all three sites — but the severity ranges from barely noticeable to nearly complete.
  • smamarketing.net is the extreme case: the engaged human cohort in Direct has almost entirely disappeared, replaced by people clicking AI citations.
  • dragonplate.com is split in two: real customers still arrive and buy, but roughly half the Direct channel is AI-related noise.
  • element6composites.com looks like Direct did in 2023 — mostly deliberate human visits, healthy engagement.
  • Three simple metrics — computable from any GA4 export in about 30 minutes — can tell you exactly where your site falls on this spectrum.
  • AI isn’t just sending traffic; it appears to be sending pre-qualified visitors. On smamarketing.net, Direct (now dominated by AI citation clicks) produced all 5 high-intent chat conversations recorded in a 90-day window, from 51 sessions, while paid Google ads produced none from 26.

Background

The original analysis of smamarketing.net found that something structural shifted in GA4’s Direct channel starting July 30, 2025. It wasn’t a volume problem — Direct sessions only dropped 15% year-over-year. It was a composition problem.

The segment of Direct visitors who actually read content (sessions averaging 15–30 seconds) dropped from 42% of all Direct sessions in 2025 to 2% in 2026. In their place: a flood of visits that bounce within a second or two on deep, specific pages — the exact behavior of someone who clicked a link from an AI chatbot, confirmed the answer, and left.

The mechanism: when someone clicks a citation in ChatGPT or Perplexity, the AI tool doesn’t tell your analytics where the visitor came from. That click lands in GA4 as “Direct.”

That original report was explicit about one limitation: it was one site. This study is a replication.


The Three Sites

Data window: January 1 – May 20, 2026 (140 days)

| Property | Direct Sessions | Sessions/Day | Direct Revenue | Unique Landing Pages |
| --- | --- | --- | --- | --- |
| smamarketing.net | 12,181 | 87 | $0 | 1,038 |
| dragonplate.com | 50,947 | 364 | $204,923 | 3,577 |
| element6composites.com | 2,839 | 20 | $0 | 77 |

Server-side crawl logs (tracking which bots visited, and how often) were collected for the same window and cross-referenced with the GA4 data to identify which AI systems are driving the behavior.


How We Measured It

Because the three sites are very different in size, industry, and traffic volume, we needed metrics that could be compared on the same scale. We developed three composition indices — each one a percentage of total Direct sessions — that work across any site type.

Real-Direct Index — the share of Direct sessions where a visitor spent at least 15 seconds on the page. This is the closest proxy for “a real human who intentionally came to read something.” A falling number means that the cohort is being diluted.

LLM-Shaped Index — the share of Direct sessions where a visitor bounced in under 5 seconds. This is the signature of an AI citation click: someone who got an answer from an AI, clicked the source link to verify it, confirmed the answer, and left.

AI-to-Real Direct Ratio — LLM-Shaped Index divided by the Real-Direct Index. A ratio above 1.0 means AI-shaped traffic now outweighs genuine human visitors in your Direct channel. Higher numbers indicate more severe distortion.

These three numbers give a comparable, defensible view of channel health regardless of how big or small a site is.
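
As a minimal sketch of the arithmetic behind the three indices, the function below takes the five engagement-bucket shares (in percent) and returns the Real-Direct Index, the LLM-Shaped Index, and their ratio. The function and class names are illustrative and not part of any GA4 tooling. The example input is the DragonPlate column from the Finding 1 table.

```python
from dataclasses import dataclass
from typing import Mapping

# Bucket labels mirror the engagement buckets used in this report.
BUCKETS = ("0-1s", "1-5s", "5-15s", "15-30s", "30s+")


@dataclass
class DirectIndices:
    real_direct: float  # % of Direct sessions at 15 seconds or more
    llm_shaped: float   # % of Direct sessions under 5 seconds
    ai_to_real: float   # llm_shaped divided by real_direct


def composition_indices(shares: Mapping[str, float]) -> DirectIndices:
    """Compute the three composition indices from bucket shares in percent."""
    missing = [b for b in BUCKETS if b not in shares]
    if missing:
        raise ValueError(f"missing buckets: {missing}")
    real_direct = shares["15-30s"] + shares["30s+"]
    llm_shaped = shares["0-1s"] + shares["1-5s"]
    # With no engaged sessions at all, the ratio is unbounded.
    ratio = float("inf") if real_direct == 0 else llm_shaped / real_direct
    return DirectIndices(round(real_direct, 1), round(llm_shaped, 1), round(ratio, 2))


if __name__ == "__main__":
    dragonplate = {"0-1s": 40.0, "1-5s": 10.2, "5-15s": 16.5, "15-30s": 13.9, "30s+": 19.4}
    print(composition_indices(dragonplate))
    # DirectIndices(real_direct=33.3, llm_shaped=50.2, ai_to_real=1.51)
```

The 5–15 second bucket counts toward neither index, which is why the two indices do not sum to 100%.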


Findings

Finding 1: All Three Sites Are Affected — But at Wildly Different Levels

| Engagement Bucket | SMA Marketing | DragonPlate | Element 6 |
| --- | --- | --- | --- |
| 0–1 seconds (instant bounce) | 39.6% | 40.0% | 10.0% |
| 1–5 seconds | 22.5% | 10.2% | 14.1% |
| 5–15 seconds | 35.0% | 16.5% | 22.0% |
| 15–30 seconds (engaged read) | 2.3% | 13.9% | 42.1% |
| 30+ seconds (deep read) | 0.5% | 19.4% | 11.7% |
| Real-Direct Index (15s+) | 2.8% | 33.3% | 53.8% |
| LLM-Shaped Index (<5s) | 62.1% | 50.2% | 24.1% |
| AI-to-Real Direct Ratio | 22.1 | 1.51 | 0.45 |

The three sites span the entire spectrum within the same time window. This rules out two simpler explanations — that the problem is uniform everywhere, or that smamarketing.net was just an anomaly. Instead, the same underlying mechanism produces very different results depending on the site.

Finding 2: Three Distinct Archetypes Emerge

Severe Dilution — smamarketing.net

The Real-Direct Index has collapsed to 2.8%. The engaged-reader segment that made up 42% of Direct in 2025 now represents just 2%. With an AI-to-Real Ratio of 22.1, LLM-shaped traffic outnumbers genuine human visits by more than 20 to 1. Direct on this site is, in practical terms, now an AI citation channel — with a small residual human population.

Bimodal — dragonplate.com

At a glance, DragonPlate sits in the middle. But the engagement breakdown tells a more interesting story: the channel has split into two separate populations. Forty percent of Direct sessions bounce in under a second. Another 19% engage for more than 30 seconds. The middle has been hollowed out.

The top landing pages make this visible. The /login page received 638 Direct sessions averaging 38.5 seconds of engagement — returning customers coming back to buy. The /kevlar-composite-products page received 729 sessions averaging 0.6 seconds — AI citation bounces. Direct produced $204,923 in revenue over the 140-day window, so real customers are still there. They’re just sharing the channel with a lot of noise.

Pre-Disruption — element6composites.com

An AI-to-Real Ratio of 0.45 and a Real-Direct Index of 53.8% are roughly what Direct looked like across the web in 2023 and early 2024. Only 10% of Direct sessions bounce instantly. The largest engagement bucket — 42% of sessions — is the 15–30 second read. Direct here still means what marketers have historically assumed it means.

Finding 3: Crawl Logs Reveal Which AI Systems Are Driving What

The server logs show that AI bots are crawling all three sites, but the mix of which AI systems varies significantly, and that variation explains the differences in Direct distortion.

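| AI Agent | SMA (per day) | DragonPlate (per day) | Element 6 (per day) |
| --- | --- | --- | --- |
| ChatGPT-User | ~150 | ~20 | ~44 |
| OAI-SearchBot | ~12 | ~69 | ~5 |
| GPTBot | ~5 | ~27 | ~3 |
| ClaudeBot | ~0 | ~104 | ~5 |
| Amazonbot | minimal | ~303 | ~0 |
| Meta external agent | ~50 | ~20 | ~20 |
| AI-related total/day | ~466 | ~616 | ~101 |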

A few patterns stand out:

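ChatGPT-User volume tracks most closely with Direct distortion. SMA Marketing receives about 150 ChatGPT-User requests per day and shows the most severe dilution. Element 6 receives about 44 per day and is the least affected. DragonPlate, at about 20 per day, is the exception and is discussed next. ChatGPT-User is the agent OpenAI uses to fetch a page on behalf of a person using ChatGPT, so its volume is a proxy for how often a site is surfaced in ChatGPT conversations, the same conversations that produce the citation clicks driving the Direct distortion.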

DragonPlate’s distortion has a different cause. On that site, ClaudeBot and Amazonbot dominate AI traffic — both primarily crawling product catalog data at scale, not following editorial citations. That’s a different behavior with a different signature in GA4.

On these three sites, OpenAI’s agents concentrate on editorial content, while Anthropic’s and Amazon’s concentrate on product catalogs. ClaudeBot visited DragonPlate 14,719 times in the study window and visited smamarketing.net just 4 times. Amazonbot visited DragonPlate 43,018 times and barely touched the other two sites. AI players appear to have distinct appetites based on site type.

Finding 4: A Major Coordinated Crawl Event Hit DragonPlate in April 2026

Something notable happened on dragonplate.com in April 2026 that didn’t occur on the other two sites. Both ClaudeBot and Amazonbot spiked dramatically — and at the same time.

  • ClaudeBot: 1,407 visits in March → 8,893 in April (6x increase) → 2,083 in May
  • Amazonbot: 2,804 in March → 14,535 in April (5x increase) → 18,742 in May (still rising)

Neither spike has a counterpart on smamarketing.net or element6composites.com. The most straightforward explanation: multiple AI systems ran large-scale catalog ingestion against high-value e-commerce data sources around the same time — independently or in coordination.

The practical takeaway for e-commerce sites: AI crawl load is no longer a steady background hum. It’s event-driven and can spike dramatically without warning. Monthly monitoring is the minimum; daily monitoring is better.
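
One way to turn that monitoring into a routine check is to flag any month where an agent's visits grow by some multiple of the previous month. The sketch below uses the ClaudeBot and Amazonbot monthly counts reported above. The function name and the 3x threshold are assumptions chosen for illustration, not values from the study. Note that the May figures cover only part of the month, since the window ends May 20.

```python
from typing import Dict, List, Tuple


def month_over_month_spikes(
    monthly_counts: Dict[str, int], threshold: float = 3.0
) -> List[Tuple[str, float]]:
    """Return (month, multiple) pairs where visits grew at least `threshold`x.

    `monthly_counts` must be in chronological order (dicts keep insertion order).
    The default threshold of 3.0 is a hypothetical starting point.
    """
    spikes = []
    months = list(monthly_counts.items())
    for (_, previous), (month, current) in zip(months, months[1:]):
        if previous > 0 and current / previous >= threshold:
            spikes.append((month, round(current / previous, 1)))
    return spikes


if __name__ == "__main__":
    claudebot = {"2026-03": 1407, "2026-04": 8893, "2026-05": 2083}
    amazonbot = {"2026-03": 2804, "2026-04": 14535, "2026-05": 18742}
    print(month_over_month_spikes(claudebot))  # [('2026-04', 6.3)]
    print(month_over_month_spikes(amazonbot))  # [('2026-04', 5.2)]
```

The same function works on daily counts if you pass dates as keys, which fits the recommendation to monitor daily where possible.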

Finding 5: AI Citation Visitors Are Pre-Qualified

Despite the distorted-looking engagement metrics, the Direct channel on smamarketing.net produced something unexpected when we looked at downstream intent data.

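Over a 90-day window using on-site chat instrumentation, Direct generated all 5 of the high-intent conversations recorded: 5 of 51 sessions, a 9.8% high-intent rate. Paid Google generated 0 of 26. With a zero on the paid side, a strict multiple is undefined; we summarize the gap as roughly a 10x quality multiple, on a small sample.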

The interpretation: AI tools like ChatGPT are doing the screening. They surface a source, vouch for it, and forward a smaller — but meaningfully better — cohort of visitors who already trust the answer they got. The engagement metrics look bad because most of those visitors got what they needed quickly. But the ones who needed more showed up ready to buy.

This finding is currently limited to smamarketing.net, since the other two sites don’t have the same intent-scoring instrumentation. Extending this measurement across the portfolio is a clear next step.


The Direct Dilution Spectrum: A Classification Framework

Based on the three archetypes observed, we propose a four-tier classification of Direct channel health.

| Tier | AI-to-Real Ratio | Real-Direct Index | What It Means |
| --- | --- | --- | --- |
| Stable | Below 0.5 | Above 50% | Direct still reflects deliberate human visits. Standard reporting is valid. |
| Emerging Dilution | 0.5–2.0 | 30–50% | AI-shaped traffic is becoming visible but isn’t dominant yet. Start tracking composition. |
| Bimodal | 1.0–3.0 (with high 30s+ share) | 30–40% with split distribution | Two distinct populations sharing the channel. Volume metrics are misleading; you need to segment by engagement. |
| Severe Dilution | Above 5.0 | Below 10% | AI-shaped traffic dominates. Reporting Direct as a volume number is functionally meaningless. Engagement segmentation is mandatory. |

Element 6 is Stable. DragonPlate is Bimodal. SMA Marketing is Severe Dilution.

These indices can be computed from any GA4 Direct landing-page report in under 30 minutes. You don’t need custom instrumentation to classify your site — you need the right export and a simple spreadsheet.
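
The four-tier table can be expressed as a small classifier. This is a sketch under stated assumptions, not a definitive rule set: the table's Emerging Dilution and Bimodal ranges overlap, so the code checks Bimodal first, and the table does not say what counts as a "high 30s+ share", so the 15% cutoff below is hypothetical. Sites that fall between tiers (for example, a ratio between 3.0 and 5.0) are returned as "Unclassified".

```python
def classify_direct(
    ai_to_real: float,
    real_direct_pct: float,
    deep_read_pct: float = 0.0,
    bimodal_deep_read_min: float = 15.0,  # hypothetical cutoff for "high 30s+ share"
) -> str:
    """Map the composition indices onto the four-tier table.

    ai_to_real      -- LLM-Shaped Index divided by Real-Direct Index
    real_direct_pct -- Real-Direct Index, in percent
    deep_read_pct   -- share of Direct sessions in the 30s+ bucket, in percent
    """
    if ai_to_real > 5.0 and real_direct_pct < 10.0:
        return "Severe Dilution"
    if ai_to_real < 0.5 and real_direct_pct > 50.0:
        return "Stable"
    # Assumption: Bimodal takes precedence where its range overlaps Emerging Dilution.
    if (
        1.0 <= ai_to_real <= 3.0
        and 30.0 <= real_direct_pct <= 40.0
        and deep_read_pct >= bimodal_deep_read_min
    ):
        return "Bimodal"
    if 0.5 <= ai_to_real <= 2.0 and 30.0 <= real_direct_pct <= 50.0:
        return "Emerging Dilution"
    return "Unclassified"


if __name__ == "__main__":
    print(classify_direct(0.45, 53.8, 11.7))  # Stable (Element 6)
    print(classify_direct(1.51, 33.3, 19.4))  # Bimodal (DragonPlate)
    print(classify_direct(22.1, 2.8, 0.5))    # Severe Dilution (SMA Marketing)
```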


What Determines Where a Site Falls on the Spectrum?

With only three properties, we can’t make statistical claims. But the qualitative differences between the sites point to five factors worth testing at larger scale.

1. How often do people ask AI tools about your topic? SMA Marketing publishes content about SEO, AI search, and structured data — topics that come up constantly in AI conversations. Element 6 publishes composites engineering content, which generates far fewer AI queries. The more your content answers common questions, the more citation traffic you’ll attract.

2. Is your brand recognized within AI-cited topics? SMA Marketing is one of the few agencies publishing original research on GEO and AI search, making it a named, citable source. Element 6 is one of many composites suppliers with lower name recognition in AI query contexts.

3. Is your content easy for AI to extract and cite? Content with clear structure — defined terms, statistics, lists, explicit answers to questions — is easier for AI systems to pull from and cite. Product catalog pages are harder to cite in a question-and-answer context. Engineering specs fall somewhere in between.

4. Do you have a baseline of intent-driven returning visitors? E-commerce sites like DragonPlate have customers who come back regularly to buy. That population is largely immune to AI dilution because purchasing behavior hasn’t shifted to AI mediation at scale yet. Their Direct revenue shows this. Information-only sites don’t have that buffer.

5. How large is your content surface? SMA Marketing has 1,038 unique Direct landing pages. DragonPlate has 3,577. Element 6 has just 77. More pages mean more citation opportunities — and more total citation traffic. But even a small site can show high per-page citation density, which is actually an opportunity signal, not a problem.


What This Means for Practitioners

Stop reporting Direct as a single number — especially across a portfolio. Different sites require completely different reporting frames. A site in Severe Dilution and a site in Stable cannot be evaluated using the same channel metrics.

The three composition indices are the new minimum for Direct reporting. Real-Direct Index, LLM-Shaped Index, and AI-to-Real Ratio give you a defensible, comparable view at very low cost. They should be tracked monthly.

Know which AI systems are targeting your site — and why. If you’re an editorial or content-driven site, OpenAI’s ChatGPT-User is likely your primary source of citation traffic. If you’re an e-commerce or product catalog site, Anthropic’s ClaudeBot and Amazon’s Amazonbot are your primary AI visitors — and they behave very differently. Crawl log monitoring is now part of the marketing analyst’s toolkit.

Pre-disruption isn’t a safe position — it means AI hasn’t discovered you yet. A Stable score like Element 6’s isn’t something to protect; it’s a signal that the site hasn’t become visible to AI systems yet. For most B2B businesses, that visibility is a strategic goal, not a threat.

Bimodal sites need to serve two different audiences. DragonPlate’s Direct channel contains real buyers arriving to transact and AI-citation visitors arriving to verify a claim. These are different people with different needs. Designing the page experience for one works against the other.

Severe-dilution sites need to move their conversion measurement off engagement metrics. When 62% of Direct sessions leave within five seconds of landing, page-level engagement data is mostly measuring noise. On-site chat, lead-magnet conversion tracking, and downstream pipeline attribution become your real signal layers. As the SMA Marketing chat data shows, those signals can reveal a high-quality audience hiding inside a metric that looks broken.


Limitations

N=3. Three properties are not a statistical sample. The dilution spectrum, four-tier classification, and five drivers are candidate frameworks — not established findings. They need 10–50 property validations to be defensible.

All three properties share an agency. This controls for measurement consistency, but these aren’t independent observations.

Crawl log windows differ across sites. SMA Marketing’s logs cover the original inflection point in August–September 2025. The other two sites’ logs begin after that inflection. We can observe their current state but not their transition.

No year-over-year comparison for the new sites. The composition shift was directly measurable on SMA Marketing because we had both 2025 and 2026 data. For DragonPlate and Element 6, we only have a 2026 YTD snapshot.

Intent and chat data is only available for one property. The Direct vs. paid Google quality gap (5 high-intent conversations from 51 sessions vs. 0 from 26) is a small-sample, one-site finding and may not generalize.

Causal inference is observational. We haven’t run controlled experiments. The driver framework is consistent with the data, but hasn’t been rigorously tested.


How to Classify Your Own Site (30-Minute Guide)

  1. Pull a GA4 Landing Page report filtered to Direct (Session default channel group = Direct) for any recent 90+ day window.
  2. For each row, note the average engagement time and group pages into buckets: 0–1s, 1–5s, 5–15s, 15–30s, and 30s+.
  3. Compute the share of total Direct sessions in each bucket.
  4. Calculate your Real-Direct Index (15s+ share), LLM-Shaped Index (<5s share), and AI-to-Real Ratio.
  5. Classify your site using the four-tier table above.
  6. Pull crawl logs for the same window. Identify visits from AI agents: ChatGPT-User, GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, Perplexity-User, Amazonbot, meta-externalagent, Google-CloudVertexBot. (Google-Extended is a robots.txt control token rather than a request user agent, so it is unlikely to appear in logs.) Compare per-day rates to understand which AI systems are targeting your site.
  7. If you have on-site chat or another intent-scoring layer, segment intent by source to see if the quality signal is there.
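
Steps 2 and 3 can be sketched in a few lines of Python instead of a spreadsheet. The column names below (`landing_page`, `sessions`, `avg_engagement_seconds`) are assumptions; rename them to match your actual GA4 export. Bucket edges are treated as half-open (a page averaging exactly 5.0 seconds goes in 5–15s), which is also an assumption. The two sample rows are the DragonPlate landing pages cited in Finding 2 and are far too few to classify a site; they only show the mechanics.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, Mapping

# (upper bound in seconds, label); anything at or above 30s falls into "30s+".
BUCKET_EDGES = [(1.0, "0-1s"), (5.0, "1-5s"), (15.0, "5-15s"), (30.0, "15-30s")]
LABELS = [label for _, label in BUCKET_EDGES] + ["30s+"]


def bucket_for(seconds: float) -> str:
    """Assign an average engagement time to one of the five buckets."""
    for upper, label in BUCKET_EDGES:
        if seconds < upper:
            return label
    return "30s+"


def bucket_shares(rows: Iterable[Mapping[str, str]]) -> Dict[str, float]:
    """Session-weighted share of Direct sessions per bucket, in percent."""
    sessions_by_bucket: Counter = Counter()
    for row in rows:
        label = bucket_for(float(row["avg_engagement_seconds"]))
        sessions_by_bucket[label] += int(row["sessions"])
    total = sum(sessions_by_bucket.values())
    if total == 0:
        raise ValueError("no sessions in export")
    return {label: round(100 * sessions_by_bucket[label] / total, 1) for label in LABELS}


if __name__ == "__main__":
    sample_export = io.StringIO(
        "landing_page,sessions,avg_engagement_seconds\n"
        "/login,638,38.5\n"
        "/kevlar-composite-products,729,0.6\n"
    )
    print(bucket_shares(csv.DictReader(sample_export)))
    # {'0-1s': 53.3, '1-5s': 0.0, '5-15s': 0.0, '15-30s': 0.0, '30s+': 46.7}
```

Because each landing page contributes all of its sessions to the bucket of its average engagement time, the result is a page-level approximation of session behavior, which is what a standard Landing Page report supports.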

Repeat monthly. This is a dynamic shift — single-point measurements understate the trajectory, and the April 2026 crawl event on DragonPlate is evidence that meaningful changes can happen within a single month on high-value properties.
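
For the crawl-log step (step 6), a rough sketch of counting AI agent hits is shown below. It assumes each log line contains the raw user-agent string somewhere in the line, as in common web server access log formats; your log layout may differ. The sample lines are synthetic and exist only to exercise the function. Matching is by case-insensitive substring, which is simple but can miscount if a spoofed user agent is present.

```python
from collections import Counter
from typing import Dict, Iterable

# Agent tokens from step 6. Google-Extended is omitted here because it is a
# robots.txt control token and is not expected to appear as a request user agent.
AI_AGENT_TOKENS = [
    "ChatGPT-User", "GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot",
    "Perplexity-User", "Amazonbot", "meta-externalagent", "Google-CloudVertexBot",
]


def count_ai_agents(log_lines: Iterable[str]) -> Counter:
    """Count log lines per AI agent token (first matching token wins)."""
    counts: Counter = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in AI_AGENT_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break
    return counts


def per_day(counts: Counter, days: int) -> Dict[str, float]:
    """Convert raw counts into per-day rates over a window of `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return {agent: round(n / days, 1) for agent, n in counts.items()}


if __name__ == "__main__":
    # Synthetic example lines, not taken from the study's logs.
    sample = [
        '203.0.113.5 - - [02/Apr/2026:10:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
        '203.0.113.6 - - [02/Apr/2026:10:00:01 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"',
        '203.0.113.7 - - [02/Apr/2026:10:00:02 +0000] "GET /c HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/125.0"',
    ]
    counts = count_ai_agents(sample)
    print(per_day(counts, days=1))
```

Running this each month over the same log source gives the per-day rates needed to compare agents across sites and to spot changes in the mix.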


Conclusion

The Direct channel in GA4 is undergoing a structural change that is real, persistent, and driven by AI citation behavior. Across three different properties — measured with the same tools in the same window — we observed states ranging from essentially undisturbed to almost entirely AI-driven.

A small set of composition metrics, computable from any GA4 export in minutes, captures the full spectrum and produces directly comparable readouts across very different site types. Crawl logs explain the mechanism and show which AI players are responsible for which kinds of activity.

Two operational messages: First, stop reporting Direct as if it still means what it meant in 2024. Calibrate your reporting to the tier your site is actually in. Second, recognize that different AI players target different types of sites — OpenAI is following editorial content, Anthropic and Amazon are ingesting product catalogs, and Google ran its indexing burst last summer. Different sites have different AI exposures, and different responses are needed.

What the field still needs is a larger-N validation study. The three-property dataset here supports a working framework — it doesn’t establish one. A 10–50 property study across distinct site archetypes would convert this from a case study into a methodology. We intend to pursue that. Practitioners interested in contributing properties to that study are invited to make contact.


Data and analysis: SMA Marketing. Properties: smamarketing.net (subject), dragonplate.com (under management, All Red Corp), element6composites.com (under management, All Red Corp). Measurement methodology and monthly tracking template available on request.

Want to know how AI is affecting your Direct channel and what to do about it?

Contact SMA Marketing today!