Amazon’s AI Scraping Exposed: How To Protect Your Products

If you run an e-commerce site and sell exclusively through your own platform, you assume your products are yours. But what happens when a billion-dollar corporation scrapes your entire catalog, builds a storefront in your name, and starts intercepting your customers without ever asking? That’s exactly what we uncovered for one of our clients. Here’s the full story, what it means for your brand, and what you can do about it right now.

The Discovery

We were working with a client who sells exclusively through their own website, with no Amazon listings or third-party storefronts. For years, we’ve helped that site maintain strong search visibility in a competitive niche.

On February 10th, while reviewing data in Known Agents (formerly Dark Visitors), we spotted a bot we had never seen before. It went from zero activity to over 3,000 page requests in a single day. This wasn’t Googlebot. It wasn’t ChatGPT. It was Amazon’s product discovery bot, and it scraped the client’s entire product catalog in one hit.

Within days, those products appeared on Amazon under a storefront the client never created. The page was labeled “Shop Other Stores Directly” and framed as a benefit to shoppers. The client had no idea it existed. They outrank Amazon for every one of those products in traditional search. Amazon didn’t ask, didn’t send a consent letter, and didn’t credit the source. If you clicked through to the product, a box appeared with a link to the client’s site, but without referral attribution. There was no way to track it. Amazon made it very clear they’d prefer you stay on their site.

The Timeline

Here’s how it unfolded, start to finish:

  • Before February 10th: Amazon’s bot had never visited the site. There was zero historical activity.
  • February 10th: Amazon’s product discovery bot spiked from zero to over 3,000 page requests in a single day, scraping the entire product catalog.
  • Shortly after: Products appeared on Amazon under a “Shop Other Stores Directly” storefront that the client never created.
  • Our response: Using Known Agents (formerly Dark Visitors), we identified the bot, located the storefront, and documented everything.
  • Cease-and-desist letter sent: We submitted a detailed letter requesting removal and an explanation of Amazon’s actions.
  • Amazon’s response: A “thank you for opting out” email. There was no acknowledgment of wrongdoing or explanation. Just an opt-out confirmation for a program the client never joined.

Black Hat Marketing

Let’s call this what it is. This wasn’t indexing. This wasn’t helpful discovery. It was a calculated grab, and it checks every box for black hat marketing. And it’s harmful to SEO.

Unauthorized scraping. More than 3,000 bot requests in a single day from a crawler that had never visited the site before. It took copyrighted content and proprietary product images without permission.

Traffic hijacking. The goal was to intercept customers who would have found this brand organically and keep them on Amazon instead. Amazon makes more money when purchases are made on its platform. That’s the whole point.

Referral stripping. When Amazon sends users from your product listing to your site, it doesn’t pass referral data. That traffic shows up as direct in GA4 and in any BigQuery export of that data, and Search Console, which only reports on Google Search, never sees it. You’re blind to it. You can’t track it, attribute it, or make decisions based on it.

Competitive asymmetry. Amazon’s AI assistant Rufus doesn’t even recommend the scraped products. It recommends products sold through the Amazon marketplace. So Amazon used the client’s content to capture the audience, and then handed that audience directly to their competitors.

The Referral Stripping Problem

This one deserves its own conversation because it’s a bigger deal than most people realize.

Attribution is already harder than ever to get right. When Amazon sends users from your scraped storefront to your site without passing referral data, that traffic lands as direct in GA4 and in any BigQuery export built on it, and it never appears in Search Console at all. You can’t trace it back to its source. You’re losing visibility at a critical stage of the funnel and have no idea it’s happening.

Most brands would look at a spike in direct traffic and assume something good is going on. The reality is you’re blind. You’re losing attribution, and you don’t even know it unless you’re actively monitoring bot activity and doing the kind of backwards research we did here.
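One way to avoid taking a direct-traffic spike at face value is to compare each day’s direct share against your own baseline. The sketch below is a minimal illustration, not a GA4 integration: the input shape (a `date` plus a `sessions` dictionary keyed by channel), the function names, and the threshold `k` are all assumptions made for this example, and the numbers in the usage note are invented.

```python
from statistics import mean, pstdev


def direct_share(day):
    """Fraction of a day's sessions that arrived with no referrer ("direct")."""
    total = sum(day["sessions"].values())
    return day["sessions"].get("direct", 0) / total if total else 0.0


def unusual_direct_days(baseline_days, recent_days, k=3.0):
    """Return dates whose direct share exceeds baseline mean + k standard deviations.

    baseline_days and recent_days are lists of dicts shaped like
    {"date": "...", "sessions": {"direct": 20, "organic": 80}} (hypothetical shape).
    """
    shares = [direct_share(day) for day in baseline_days]
    limit = mean(shares) + k * pstdev(shares)
    return [day["date"] for day in recent_days if direct_share(day) > limit]
```

For example, with baseline days at 20%, 22%, and 18% direct, a recent day at 21% passes quietly while a day at 40% is flagged. A flagged day does not prove stripped referrals; it is a prompt to cross-check that date against your bot logs.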

What This Means for SEO

This case has real implications for how you think about SEO and GEO (generative engine optimization). It hurts visibility, attribution, and data integrity.

Bot Traffic Is Largely Invisible

Most analytics platforms are still catching up. GA4 filters some bot traffic, but not all of it. Microsoft Clarity handles it better, but even that isn’t perfect. By some industry estimates, roughly half of all web traffic today is non-human, and if you’re not actively monitoring it, your data is already compromised.

Amazon Targets Market Leaders

The client ranked number one for their core product terms. A term like “carbon fiber sheets” at 3,600 searches per month with strong purchase intent may seem small to a billion-dollar company, but aggregated across thousands of scrapes over time, the math adds up fast.

As far as we can tell, Amazon did not target second- and third-tier competitors. We checked the competitors ranking in positions two and three and found no scraped storefronts for either. Amazon appears to be going after the top-ranking sites specifically. If you’re a market leader in your niche, you’re likely a higher-priority target than you think.

Consent Is Inverted

This is the core ethical problem. The opt-out model shouldn’t come after the fact. You don’t have the right to take someone’s content, build a storefront around it, and then offer them an exit after the damage is done. Nobody signed up for this program. If Amazon had reached out first and offered proper referral attribution as part of the deal, there’s a chance brands might have tested it. But that’s not what happened. Amazon placed the burden entirely on the brand being scraped, which is backwards.

Attribution Erosion Is a Growing Problem

When scraped content gets republished on another platform, none of the views it gets there register in your GA4 or Search Console data, and any visitors it does send arrive without referral data. You lose visibility into a critical stage of the funnel and have no idea it’s happening unless you’re monitoring bot activity.

4 Defensive Actions to Take Right Now

1. Monitor Your Bot Traffic

Tools like Known Agents (formerly Dark Visitors) make it straightforward to identify which crawlers are visiting your site, how often, and what they’re doing. If you’re not tracking this, you’re flying blind. Set it up now.
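If you also have raw server logs, you can cross-check what a monitoring tool reports. The sketch below is a minimal example that assumes your server writes the common "combined" access log format; the regular expression, the function names, and the threshold are assumptions for illustration, not part of Known Agents. It counts daily requests per watched user-agent token and flags any bot whose first-ever day of activity is already a large burst, which is the zero-to-thousands pattern described above.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Assumes combined log format, e.g.:
# 203.0.113.5 - - [01/Jan/2024:00:00:01 +0000] "GET /p HTTP/1.1" 200 512 "-" "agent"
LOG_LINE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def daily_counts(lines, tokens):
    """Count requests per day for each watched user-agent token."""
    counts = defaultdict(Counter)  # token -> {date: request count}
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        day = datetime.strptime(match["day"], "%d/%b/%Y").date()
        agent = match["agent"].lower()
        for token in tokens:
            if token.lower() in agent:
                counts[token][day] += 1
    return counts


def first_seen_spikes(counts, threshold):
    """Return (token, day, hits) for bots whose first observed day meets the threshold."""
    spikes = []
    for token, per_day in counts.items():
        first_day = min(per_day)
        if per_day[first_day] >= threshold:
            spikes.append((token, first_day, per_day[first_day]))
    return spikes
```

To use it, pass the open log file as `lines` along with the tokens you care about, for example `daily_counts(open_log_file, ["Amazonbot"])`, then call `first_seen_spikes` with a threshold that suits your normal traffic. Matching on user-agent strings only catches bots that identify themselves honestly.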

2. Block Unauthorized Bots

Once you identify bots you don’t want on your site, block them using your robots.txt file or server-level rules. We blocked Amazon’s bot and their product discovery bot entirely. Be selective here. There are legitimate AI crawlers you want to allow if you’re building visibility in AI search. Block the ones abusing your content.
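Before deploying new robots.txt rules, it helps to confirm they block what you intend and nothing else. The sketch below uses Python’s standard `urllib.robotparser` to test a draft rule set. `Amazonbot` is the user-agent token Amazon publishes for its crawler; `ExampleProductDiscoveryBot` is a placeholder, so substitute whatever token your own logs show. The rules themselves and the example.com URL are illustrative assumptions, not the exact file we deployed.

```python
from urllib.robotparser import RobotFileParser

# Draft rules to test. "ExampleProductDiscoveryBot" is a placeholder token,
# not a real crawler name; replace it with the token seen in your logs.
ROBOTS_TXT = """\
User-agent: Amazonbot
Disallow: /

User-agent: ExampleProductDiscoveryBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if the given rules disallow user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/products/widget"
    for agent in ("Amazonbot", "ExampleProductDiscoveryBot", "Googlebot"):
        print(agent, "blocked" if is_blocked(ROBOTS_TXT, agent, page) else "allowed")
```

Keep in mind that robots.txt is a request, and it only works if the crawler chooses to honor it. Server-level rules are what actually enforce a block.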

3. Document Everything

Keep timestamps, screenshots, and records of every unauthorized crawl. This is your evidence trail for legal or regulatory action if it comes to that. Don’t skip this step.
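A lightweight way to keep that trail consistent is an append-only log with one record per piece of evidence. The sketch below is one possible approach under our own assumptions: the field names, the JSON Lines layout, and the function names are invented for illustration. Each record stores a UTC timestamp and a SHA-256 hash of the screenshot or log excerpt, so you can later show the file hasn’t changed since you captured it.

```python
import hashlib
import json
from datetime import datetime, timezone


def evidence_record(description, source_url, artifact_bytes):
    """Build one evidence entry with a UTC timestamp and a hash of the artifact."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "description": description,
        "source_url": source_url,
        "sha256": hashlib.sha256(artifact_bytes).hexdigest(),
    }


def append_record(log_path, record):
    """Append the entry as one JSON line; existing lines are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")
```

In practice you would read the screenshot file in binary mode, pass its bytes to `evidence_record`, and append the result to a log kept alongside the original files. Whether such a log is sufficient for legal purposes is a question for counsel.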

4. Check Amazon Right Now

If you’re in e-commerce, go to Amazon and search for your brand. Look for a “Shop Other Stores Directly” section. If your products are there, look up Amazon’s opt-out process. There is an email address you can contact to request removal. It’s frustrating that the burden falls on you, but the faster you act, the better.

The Bigger Picture

Amazon has paid billions in fines and kept moving. Your individual site is a data source to them, not a business relationship. Their growth teams are actively looking for tactics like this, and most brands never find out it’s happening.

But sunlight is a disinfectant. When enough brands document this, share it, and push back, industry pressure builds. If you find this happening to your site, don’t stay quiet. Build your own case study. Raise the flag. The more visible this becomes, the harder it is to ignore.

If you’re an e-commerce brand and want help auditing your bot traffic, identifying unauthorized scraping, or building a strategy to protect and grow your visibility, reach out to our team. We’d love to help.

Until next time, happy marketing.
