Maier et al. 2025 — SSR for Purchase Intent

Citation

Maier, B.F., Aslak, U., Fiaschi, L., et al. (2025). LLMs Reproduce Human Purchase Intent via Semantic Similarity Elicitation of Likert Ratings. arXiv:2510.08338v3.

Core Contribution

Introduces Semantic Similarity Rating (SSR), a method that elicits free-text responses from LLMs and then maps them to Likert distributions via embedding similarity to reference anchor statements. SSR attains 90% of human test-retest reliability while maintaining realistic response distributions.

Key Framing

The problem isn’t whether LLMs can simulate human survey responses; it’s how we ask them. Direct Likert elicitation produces narrow, regression-to-mean distributions. Textual elicitation + embedding-based mapping produces human-like variance.

This reframes apparent LLM limitations as elicitation method artifacts.

Method Summary

  1. Prompt LLM with demographic persona + product concept
  2. Ask purchase intent question, elicit free-text response
  3. Embed response using text-embedding model
  4. Compare to five reference anchor statements (one per Likert point)
  5. Generate probability distribution over Likert scale based on cosine similarity
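
Steps 3-5 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the toy 3-dimensional vectors stand in for real embeddings, the names `ssr_distribution`, `ANCHORS`, and `eps` are hypothetical, and the shift-by-minimum normalization is an assumption about how similarities become probabilities.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def ssr_distribution(response_vec, anchor_vecs, eps=1e-9):
    """Map one embedded response to probabilities over the Likert points.

    Assumption: similarities are shifted by their minimum, then normalized
    to sum to 1. The paper's exact normalization may differ.
    """
    sims = [cosine_similarity(response_vec, a) for a in anchor_vecs]
    lowest = min(sims)
    shifted = [s - lowest + eps for s in sims]
    total = sum(shifted)
    return [s / total for s in shifted]

# Hypothetical toy embeddings of the five anchor statements (Likert 1..5).
# A real pipeline would obtain these from a text-embedding model.
ANCHORS = [
    [1.0, 0.0, 0.0],
    [0.8, 0.2, 0.0],
    [0.5, 0.5, 0.0],
    [0.2, 0.8, 0.0],
    [0.0, 1.0, 0.0],
]

# Hypothetical embedding of a free-text response that leans positive.
response = [0.05, 0.95, 0.0]

pmf = ssr_distribution(response, ANCHORS)
expected_rating = sum(p * r for r, p in enumerate(pmf, start=1))
print([round(p, 3) for p in pmf], round(expected_rating, 2))
```

Returning a full distribution rather than a single argmax rating is what lets per-respondent outputs be averaged into a survey-level Likert distribution.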

Key Findings

  • Direct Likert Rating (DLR): High correlation attainment (~80%) but poor distributional match (KS similarity ~0.26-0.39). Models cluster around “3”
  • Follow-up Likert Rating (FLR): Better but still narrow distributions
  • SSR: 90% correlation attainment + KS similarity >0.85
  • Demographic conditioning essential: Without personas, correlation attainment drops to ~50%
  • Synthetic consumers show less positivity bias than humans, providing wider discriminative range
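
The two metrics in these findings can be illustrated with a short sketch. The definitions here are assumptions made for illustration: KS similarity is taken as 1 minus the largest gap between two cumulative Likert distributions, and correlation attainment as the synthetic-vs-human correlation of per-survey mean ratings divided by a human test-retest correlation. All numbers are made up, and the function names are hypothetical.

```python
import math
from itertools import accumulate

def ks_similarity(pmf_a, pmf_b):
    """Assumed definition: 1 - max |CDF_a - CDF_b| over the Likert points."""
    cdf_a = list(accumulate(pmf_a))
    cdf_b = list(accumulate(pmf_b))
    return 1.0 - max(abs(x - y) for x, y in zip(cdf_a, cdf_b))

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def correlation_attainment(synthetic_means, human_means, retest_means):
    """Assumed definition: synthetic-vs-human correlation as a fraction of
    the human test-retest correlation (the practical ceiling)."""
    return pearson(synthetic_means, human_means) / pearson(human_means, retest_means)

# Made-up Likert distributions: one clustered on "3", one human-like.
narrow = [0.0, 0.1, 0.8, 0.1, 0.0]
human = [0.05, 0.10, 0.20, 0.35, 0.30]
print(round(ks_similarity(narrow, human), 2))

# Made-up mean purchase intent per survey (four surveys).
human_means = [3.9, 4.2, 3.5, 4.0]
retest_means = [3.8, 4.3, 3.6, 3.9]   # hypothetical second human sample
synthetic_means = [3.4, 3.9, 3.0, 3.6]
print(round(correlation_attainment(synthetic_means, human_means, retest_means), 2))
```

Normalizing by the test-retest correlation matters because two independent human samples do not correlate perfectly either, so a raw correlation of 1.0 is not the right target for synthetic data.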

Data

57 consumer surveys on personal care products (Colgate-Palmolive), 9,300 human respondents, 150-400 participants per survey.

Limitations Acknowledged

  • Reference statement sets require manual design
  • Some demographic patterns (gender, region, ethnicity) not consistently replicated
  • Bounded by LLM training data coverage of domain

Transferable Insights

  1. Elicitation design shapes output quality: applies to any structured output task
  2. Two-stage elicitation (generate text → map to structure) preserves richness while enabling quantification
  3. Embedding space as semantic bridge between natural language and structured formats
  4. Correlation attainment as validation metric for synthetic data quality

Extracted Content

Atoms:

Molecules: