09 · Case study · Erobold

Erobold. In intimates, the buying decision is trust before fit.

Intimates is a category where shoppers don’t experiment lightly. Reviews, ratings, and fit notes (the whole social-proof stack) are what tip a query into a purchase. On Erobold, Shopthru.OS ran the deepest single-merchant audit the platform has executed and put the Review Agent in the lead.

Industry: Apparel
Specialists in motion: Review Agent
Headline outcome: 5,475 executions
01
Measures

Measured 5,475 LLM executions and 709 ground-truth field comparisons.

The Review Agent ran the deepest cohort the platform has executed on a single merchant: 5,475 LLM queries across providers, surfacing exactly when AI engines mention reviews, ratings, fit guidance, and sizing for Erobold SKUs. A further 709 ground-truth checks confirmed which AI claims matched the live product detail pages (PDPs) and which had drifted.
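A ground-truth field comparison of this kind can be sketched in a few lines. This is a minimal illustration, not the Review Agent's implementation: the field names (`rating`, `review_count`, `sizing`), the `FieldComparison` type, and the sample values are all hypothetical, and the normalisation rule (ignore case and extra whitespace) is an assumption.

```python
from dataclasses import dataclass


def _normalise(value: str) -> str:
    # Collapse whitespace and ignore case so cosmetic differences are not counted as drift.
    return " ".join(value.split()).casefold()


@dataclass
class FieldComparison:
    sku: str
    field: str
    ai_value: str
    live_value: str

    @property
    def matches(self) -> bool:
        return _normalise(self.ai_value) == _normalise(self.live_value)


def compare_fields(sku: str, ai_claims: dict, live_pdp: dict) -> list:
    """Compare every field the AI engine asserted against the live PDP."""
    comparisons = []
    for field, ai_value in ai_claims.items():
        # A field missing from the live page compares against an empty string.
        live_value = live_pdp.get(field, "")
        comparisons.append(FieldComparison(sku, field, str(ai_value), str(live_value)))
    return comparisons


# Hypothetical data for one SKU; not taken from the Erobold audit.
ai_claims = {"rating": "4.6", "review_count": "212", "sizing": "Runs small"}
live_pdp = {"rating": "4.6", "review_count": "198", "sizing": "runs  small"}

results = compare_fields("SKU 01", ai_claims, live_pdp)
drifted = [c.field for c in results if not c.matches]
print(drifted)  # ['review_count']
```

Each `FieldComparison` is one ground-truth check, so a run over many SKUs and fields yields a count comparable to the 709 reported here.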

5,475 LLM executions across providers
709 ground-truth field comparisons
15 SKUs deep-audited end-to-end
02
Simulates

Simulated the trust-signal stack: review schema, sentiment match, social proof.

For each PDP the OS modelled three things in parallel: Is review schema present and valid? Are AI engines actually citing the reviews when asked? Does the sentiment AI infers match what shoppers actually wrote? In intimates, where most buying decisions rest on trust, each gap costs more than it would in other categories.
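The first of those checks, whether review schema is present and valid, can be illustrated with a small validator over a PDP's JSON-LD. This is a sketch under stated assumptions: the property names (`aggregateRating`, `ratingValue`, `reviewCount`, `review`, `datePublished`) come from the schema.org vocabulary, but the choice of which fields to treat as required and the `review_schema_gaps` helper are hypothetical, not the platform's actual rules.

```python
import json

# Assumption: these are the rating fields this sketch treats as required.
REQUIRED_RATING_FIELDS = ("ratingValue", "reviewCount")


def review_schema_gaps(json_ld: str) -> list[str]:
    """Return the review-schema fields missing from one PDP's JSON-LD."""
    try:
        data = json.loads(json_ld)
    except json.JSONDecodeError:
        return ["invalid JSON-LD"]
    if not isinstance(data, dict):
        return ["invalid JSON-LD"]

    gaps = []
    rating = data.get("aggregateRating")
    if not isinstance(rating, dict):
        gaps.append("aggregateRating")
    else:
        gaps += [f"aggregateRating.{key}" for key in REQUIRED_RATING_FIELDS if key not in rating]

    reviews = data.get("review") or []
    if isinstance(reviews, dict):  # schema.org allows a single Review object
        reviews = [reviews]
    if not reviews:
        gaps.append("review")
    for index, review in enumerate(reviews):
        if not isinstance(review, dict) or "datePublished" not in review:
            gaps.append(f"review[{index}].datePublished")
    return gaps


# Hypothetical PDP markup; not taken from an Erobold page.
pdp = json.dumps({
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example bralette",
    "aggregateRating": {"@type": "AggregateRating", "ratingValue": 4.6},
    "review": [{"@type": "Review", "reviewBody": "True to size."}],
})
gaps = review_schema_gaps(pdp)
print(gaps)  # ['aggregateRating.reviewCount', 'review[0].datePublished']
```

An empty list would mark the schema signal as strong for that PDP; how a non-empty list maps to drifting versus missing is left open here.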

Trust-signal coverage, 15 SKUs (SKU 01 to SKU 15) × 4 trust signals = 60 cells
Signal: strong · drifting · missing
Review schema: 9 · 3 · 3
AI cites reviews: 9 · 3 · 3
Sentiment match: 8 · 4 · 3
Fit-note coverage: 8 · 4 · 3
Receipt: Trust-signal coverage across 15 SKUs for review schema present, AI cites reviews, sentiment match, and fit-note coverage. Green = signal strong, amber = drift, red = missing.
  • Review schema validated per PDP, surfacing missing review counts, ratings, and dates
  • AI-cited sentiment cross-checked against the actual review corpus
  • Fit-note coverage flagged separately: sizing matters most in this category
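The coverage counts in the receipt can be sanity-checked with a short tally. The sketch below assumes the coverage row reads as strong · drifting · missing for each of the four signals (9 · 3 · 3, 9 · 3 · 3, 8 · 4 · 3, 8 · 4 · 3); the variable names are illustrative only.

```python
# (strong, drifting, missing) counts per signal, copied from the coverage row.
coverage = {
    "Review schema": (9, 3, 3),
    "AI cites reviews": (9, 3, 3),
    "Sentiment match": (8, 4, 3),
    "Fit-note coverage": (8, 4, 3),
}
SKU_COUNT = 15

# Every signal should account for each of the 15 SKUs exactly once.
for signal, counts in coverage.items():
    assert sum(counts) == SKU_COUNT, f"{signal} does not cover every SKU"

# Sum each state across the four signals.
strong, drifting, missing = (sum(column) for column in zip(*coverage.values()))
total_cells = strong + drifting + missing
print(strong, drifting, missing, total_cells)  # 34 14 12 60
```

On that reading, 34 of the 60 cells are strong, and the remaining 26 (14 drifting, 12 missing) are the gaps.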
03
Ships

Shipped 3,100 action items, with the Review Agent leading the trust-signal queue.

P-priority CONTENT_FIX items lead the queue: every PDP that lacks review schema, every product where AI is citing weak or no social proof, every gap between the sentiment AI infers and what the live reviews actually say. 15 SKUs, 3,100 line items, every one ranked.

04
Compounds

Compounded: the trust-signal pass is now a default specialist on every audit in trust-heavy categories.

Erobold proved that visibility alone isn’t enough in categories where the buying decision rides on social proof. The Review Agent’s trust-signal pass (schema + sentiment match + fit-note coverage) is now a default for intimates, beauty, supplements, and any DTC category where reviews carry the cart.

Want this for your catalog?

Book a 30-minute Loop walkthrough.

We’ll run the first measurement live. No deck.