Implicit Reasoning for Large Language Model-based Generative Recommendation
Abstract
Representing items with Semantic IDs disrupts the natural-language reasoning of LLM-based generative recommenders; a lightweight implicit reasoning approach outperforms explicit methods while reducing computational cost.
Large Language Models (LLMs) are increasingly adopted as backbones for Generative Recommendation (GR), promising access to pretrained world knowledge. Yet how to reliably invoke this knowledge for GR remains poorly understood. A key obstacle is that LLM-based GR typically represents items with Semantic IDs (SIDs), which disrupt the LLM's natural-language reasoning interface because these tokens are unseen during pretraining. Existing approaches address this with expensive multi-stage pipelines that ground SIDs and elicit explicit rationales, but they offer limited insight into when and why each stage is necessary. In this work, we systematically decompose explicit reasoning training pipelines for LLM-based GR, revealing three key limitations: weakened world-knowledge verbalization, misalignment between the SID and natural-language token embedding spaces, and sensitivity to rationale quality, all of which hurt explicit reasoning performance. To circumvent these issues, we propose PauseRec, a lightweight implicit reasoning paradigm tailored for GR. PauseRec avoids costly reasoning-trace acquisition and reasoning-alignment training, which yields several benefits: (1) it outperforms standard explicit CoT methods by up to 6.22%, (2) it reduces training cost by up to 65% in GPU hours, and (3) it speeds up inference by up to 71.3%. These results position PauseRec as a lightweight alternative to explicit rationale generation, enabling more effective and efficient LLM-based GR.
Community
A lightweight implicit reasoning method for generative recommendation
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- LASAR: Latent Adaptive Semantic Aligned Reasoning for Generative Recommendation (2026)
- Reinforced Preference Optimization for Reasoning-Augmented Recommendations (2026)
- TwiSTAR: Think Fast, Think Slow, Then Act, Generative Recommendation with Adaptive Reasoning (2026)
- RRCM: Ranking-Driven Retrieval over Collaborative and Meta Memories for LLM Recommendation (2026)
- ReGuLaR: Relation-Grounded Latent Reasoning for Large Vision-Language Models (2026)
- LaME: Learning to Think in Latent Space for Multimodal Embedding via Information Bottleneck (2026)
- CIRF: Tokenizing Chain-of-Thoughts into Reusable Functional Units for Efficient Latent Reasoning in Large Language Models (2026)