Maier et al. 2025 — SSR for Purchase Intent
Citation
Maier, B.F., Aslak, U., Fiaschi, L., et al. (2025). LLMs Reproduce Human Purchase Intent via Semantic Similarity Elicitation of Likert Ratings. arXiv:2510.08338v3.
Core Contribution
Introduces Semantic Similarity Rating (SSR): a method that elicits textual responses from LLMs, then maps them to Likert distributions using embedding similarity to reference anchor statements. Achieves 90% of human test-retest reliability while maintaining realistic response distributions.
Key Framing
The problem isn’t whether LLMs can simulate human survey responses; it’s how we ask them. Direct Likert elicitation produces narrow distributions that regress toward the scale midpoint. Textual elicitation followed by embedding-based mapping produces human-like variance.
This reframes apparent LLM limitations as elicitation method artifacts.
Method Summary
- Prompt LLM with demographic persona + product concept
- Ask purchase intent question, elicit free-text response
- Embed response using text-embedding model
- Compare to five reference anchor statements (one per Likert point)
- Generate probability distribution over Likert scale based on cosine similarity
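The mapping step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the three-dimensional vectors stand in for real embedding-model output, and the normalization (subtract the minimum similarity, then rescale to sum to 1) is one simple choice assumed here. The names `ssr_distribution`, `anchors`, and `response` are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def ssr_distribution(response_vec, anchor_vecs, epsilon=1e-9):
    """Map one embedded free-text response to a distribution over Likert points.

    Assumption: similarities are shifted by their minimum and normalized.
    epsilon only guards against division by zero when all similarities are equal.
    """
    sims = np.array([cosine_similarity(response_vec, a) for a in anchor_vecs])
    shifted = sims - sims.min() + epsilon
    return shifted / shifted.sum()

# Toy stand-ins for the embeddings of five anchor statements (Likert 1..5).
anchors = np.array([
    [1.0, 0.0, 0.0],   # 1: "definitely would not buy"
    [0.8, 0.2, 0.0],   # 2
    [0.5, 0.5, 0.0],   # 3
    [0.2, 0.8, 0.0],   # 4
    [0.0, 1.0, 0.0],   # 5: "definitely would buy"
])

# Toy stand-in for the embedding of an enthusiastic free-text response.
response = np.array([0.05, 0.95, 0.0])

pmf = ssr_distribution(response, anchors)
expected_rating = float(np.dot(pmf, np.arange(1, 6)))
print(pmf.round(3), round(expected_rating, 2))
```

Keeping the full distribution, rather than collapsing each response to a single integer, is what lets the aggregated output retain spread across the scale.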
Key Findings
- Direct Likert Rating (DLR): High correlation attainment (~80%) but poor distributional match (KS similarity ~0.26-0.39). Responses cluster around “3”
- Follow-up Likert Rating (FLR): Better but still narrow distributions
- SSR: ~90% correlation attainment with KS similarity >0.85
- Demographic conditioning essential: Without personas, correlation attainment drops to ~50%
- Synthetic consumers show less positivity bias than humans, providing wider discriminative range
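The two metrics in these findings can be illustrated with a short sketch. The definitions are assumptions made for illustration, since this note does not spell out the formulas: KS similarity is taken as 1 minus the largest gap between two cumulative Likert distributions, and correlation attainment as the synthetic-human correlation of per-survey mean ratings divided by a human test-retest ceiling. All numbers below are made up.

```python
import numpy as np

def ks_similarity(pmf_a, pmf_b):
    """1 minus the Kolmogorov-Smirnov distance between two Likert distributions."""
    cdf_a = np.cumsum(pmf_a)
    cdf_b = np.cumsum(pmf_b)
    return 1.0 - float(np.max(np.abs(cdf_a - cdf_b)))

def correlation_attainment(synthetic_means, human_means, human_ceiling):
    """Pearson correlation of per-survey means, relative to a human ceiling.

    human_ceiling is assumed to be a test-retest correlation measured separately.
    """
    r = float(np.corrcoef(synthetic_means, human_means)[0, 1])
    return r / human_ceiling

# Hypothetical distributions over Likert points 1..5.
human = np.array([0.05, 0.10, 0.25, 0.35, 0.25])
narrow = np.array([0.00, 0.05, 0.90, 0.05, 0.00])  # clustered on "3"
broad = np.array([0.04, 0.12, 0.24, 0.36, 0.24])   # close to the human shape

print(ks_similarity(human, narrow))  # low: mass piled on the midpoint
print(ks_similarity(human, broad))   # high: similar spread

# Hypothetical per-survey mean purchase intent.
human_means = np.array([3.2, 3.8, 4.1, 3.5])
synthetic_means = np.array([3.0, 3.7, 4.2, 3.4])
print(correlation_attainment(synthetic_means, human_means, human_ceiling=0.9))
```

The pairing matters because a method can rank products correctly (high attainment) while still producing an unrealistic distribution (low KS similarity), which is the pattern reported for DLR.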
Data
57 consumer surveys on personal care products (Colgate-Palmolive), 9,300 human respondents, 150-400 participants per survey.
Limitations Acknowledged
- Reference statement sets require manual design
- Some demographic patterns (gender, region, ethnicity) not consistently replicated
- Bounded by LLM training data coverage of domain
Transferable Insights
- Elicitation design shapes output quality: applies to any structured output task
- Two-stage elicitation (generate text → map to structure) preserves richness while enabling quantification
- Embedding space as semantic bridge between natural language and structured formats
- Correlation attainment as validation metric for synthetic data quality
Extracted Content
Atoms:
- 05-atom—direct-likert-regression-problem
- 06-atom—semantic-similarity-rating
- 05-atom—demographic-persona-necessity
- 03-atom—correlation-attainment-metric
Molecules: