Rakuten Deploys Custom LLMs to Power E-Commerce Search

💡 Japanese tech giant Rakuten integrates proprietary large language models into its e-commerce search engine, aiming to revolutionize product discovery for millions of users.

Rakuten, operator of one of Japan's largest e-commerce platforms, has announced the integration of its proprietary large language models directly into its core search infrastructure. The move positions Rakuten as one of the first major global e-commerce players to deploy in-house LLMs at scale for product discovery, a strategy that could reshape how its more than 100 million registered users find and purchase goods online.

Where many online retailers rely on partnerships with external AI providers or on traditional keyword-matching algorithms, Rakuten is betting on models built and fine-tuned internally. The initiative reflects a growing trend among non-AI-native tech companies to develop sovereign AI capabilities rather than depend on third-party solutions from OpenAI, Google, or Meta.

Key Facts at a Glance

  • Proprietary models: Rakuten has developed its own family of LLMs, reportedly trained on vast multilingual e-commerce datasets spanning Japanese, English, and other Asian languages
  • Search overhaul: The new AI-powered search replaces traditional keyword-matching with semantic understanding, enabling natural language product queries
  • Scale: Rakuten's marketplace serves over 100 million registered users across Japan and international markets
  • Timeline: The rollout is expected to proceed in phases throughout 2025, starting with Rakuten Ichiba, its flagship Japanese marketplace
  • Investment: Rakuten has reportedly invested hundreds of millions of dollars in AI infrastructure over the past 2 years
  • Competitive edge: The company aims to reduce search abandonment rates by up to 30% through improved query understanding

Semantic Search Replaces Keyword Matching

Traditional e-commerce search engines rely heavily on keyword matching, a method that often fails when shoppers use conversational language or describe products in non-standard ways. A search for 'something to keep my coffee warm at my desk' shares almost no words with a listing titled, say, 'USB mug warmer', so a legacy platform would typically return few or no relevant results. Rakuten's LLM-powered system is designed to close that gap.

The new search engine processes queries using natural language understanding (NLU), interpreting the intent behind a shopper's words rather than simply matching them to product titles and descriptions. This means the system can parse complex, multi-attribute queries and return contextually relevant results even when the exact keywords don't appear in the product listing.
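The contrast between the two approaches can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the four-item catalog, the hand-assigned "concept" vectors, and the query vector stand in for what a trained embedding model would produce, and nothing below reflects Rakuten's actual implementation.

```python
import math

# Hypothetical mini-catalog. Each vector is hand-assigned along four made-up
# concept axes: [hot drinks, desk use, heating, clothing].
CATALOG = {
    "USB mug warmer plate": [0.9, 0.8, 0.9, 0.0],
    "Insulated travel tumbler": [0.8, 0.3, 0.4, 0.0],
    "Wool desk-chair cushion": [0.0, 0.7, 0.3, 0.2],
    "Waterproof hiking jacket": [0.0, 0.0, 0.2, 0.9],
}

STOPWORDS = {"something", "to", "my", "at", "a", "the", "for"}


def keyword_search(query: str) -> list[str]:
    """Return titles that contain every non-stopword query term verbatim."""
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    return [
        title for title in CATALOG
        if all(term in title.lower().split() for term in terms)
    ]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec: list[float], top_k: int = 2) -> list[str]:
    """Rank titles by similarity between the query vector and each item vector."""
    ranked = sorted(CATALOG, key=lambda t: cosine(query_vec, CATALOG[t]), reverse=True)
    return ranked[:top_k]


QUERY = "something to keep my coffee warm at my desk"
# Assumed embedding of QUERY; a real system would compute this with a model.
QUERY_VEC = [0.9, 0.7, 0.8, 0.0]

print(keyword_search(QUERY))       # no title contains all of: keep, coffee, warm, desk
print(semantic_search(QUERY_VEC))  # the mug warmer ranks first
```

The keyword path finds nothing because no title contains the shopper's exact words, while the vector path ranks the mug warmer first because it is closest in meaning.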

Rakuten's approach draws on techniques similar to those used by Google's Search Generative Experience and Amazon's recently introduced Rufus shopping assistant. However, Rakuten's differentiator lies in its decision to train and operate its own models rather than call an external provider's API. That does not necessarily mean starting from a blank slate: the company's publicly released Rakuten AI 7B model, for example, was built by continued training of Mistral AI's open Mistral-7B model.

Why Rakuten Built Its Own Models

The decision to develop proprietary LLMs rather than license external ones reflects several strategic considerations that are increasingly common among large-scale platform companies.

Data sovereignty stands as a primary motivator. Rakuten's marketplace generates enormous volumes of transactional data, search logs, seller descriptions, and customer reviews. Training models on this proprietary data creates a competitive moat that general-purpose LLMs like GPT-4o or Claude 3.5 Sonnet, which have no access to it, cannot easily replicate.

Language specialization is another critical factor. While major Western LLMs perform well in English, they often struggle with the nuances of Japanese — a language with complex character systems (kanji, hiragana, katakana) and context-dependent meaning. Rakuten's models are natively multilingual, with particular strength in Japanese e-commerce terminology.

Cost control also plays a role. At Rakuten's scale — processing millions of search queries daily — API calls to external LLM providers would quickly become prohibitively expensive. Running inference on proprietary models deployed on owned infrastructure can reduce per-query costs by 60-80% compared to commercial API pricing, according to industry estimates.
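To make the 60-80% figure concrete, here is a back-of-the-envelope sketch. The query volume (5 million per day) and the API price ($0.002 per query) are hypothetical inputs chosen only for illustration; neither comes from Rakuten. The calculation also ignores the fixed cost of buying and operating the hardware.

```python
def daily_cost_range(queries_per_day, api_cost_per_query, savings=(0.60, 0.80)):
    """Return (api_cost, self_hosted_max, self_hosted_min) in dollars per day.

    `savings` is the assumed per-query reduction range quoted in the text.
    """
    api_cost = queries_per_day * api_cost_per_query
    low_saving, high_saving = savings
    self_hosted_max = api_cost * (1 - low_saving)   # only 60% saved
    self_hosted_min = api_cost * (1 - high_saving)  # full 80% saved
    return api_cost, self_hosted_max, self_hosted_min


# Hypothetical inputs, not figures from Rakuten.
api, hosted_max, hosted_min = daily_cost_range(5_000_000, 0.002)
print(f"External API: ${api:,.0f}/day")
print(f"Self-hosted:  ${hosted_min:,.0f} to ${hosted_max:,.0f}/day")
```

Under these assumed inputs, a $10,000 daily API bill would fall to somewhere between $2,000 and $4,000 in per-query terms.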

Technical Architecture Behind the Integration

While Rakuten has not disclosed the full technical specifications of its LLMs, industry analysts and patent filings suggest the company is using a retrieval-augmented generation (RAG) architecture tailored for e-commerce applications.

The system likely operates in multiple stages:

  • Query understanding: The LLM first interprets the shopper's intent, extracting entities like product categories, attributes, price ranges, and use cases
  • Semantic retrieval: Vector embeddings are used to match the interpreted query against a product catalog indexed in a vector database
  • Re-ranking: A secondary model re-ranks results based on relevance, purchase probability, and personalization signals
  • Response generation: For certain query types, the system can generate natural language summaries or product comparisons alongside traditional grid results

This multi-stage pipeline mirrors architectures deployed by companies like Perplexity and You.com in web search, but adapted specifically for the structured data environment of e-commerce. The RAG approach allows the system to ground its responses in real product data, which reduces (though does not eliminate) the hallucination problems that plague general-purpose LLMs.
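The first three stages described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Rakuten's undisclosed design: the `Product` class, the three-axis toy embeddings, the `purchase_prob` signal, and the 0.7/0.3 blending weights are all hypothetical, and the response-generation stage is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class Product:
    title: str
    embedding: list[float]  # toy vector; a real system would use learned embeddings
    purchase_prob: float    # hypothetical conversion/personalization signal in [0, 1]


# Hypothetical catalog. Vector axes are (coffee, desk, warm).
AXES = ("coffee", "desk", "warm")
CATALOG = [
    Product("USB mug warmer", [1.0, 1.0, 1.0], 0.10),
    Product("Insulated tumbler", [1.0, 0.0, 1.0], 0.30),
    Product("Desk lamp", [0.0, 1.0, 0.0], 0.20),
]


def understand_query(text: str) -> list[float]:
    """Stage 1 stand-in: map the query onto the toy axes (an LLM would do this)."""
    tokens = set(text.lower().split())
    return [1.0 if axis in tokens else 0.0 for axis in AXES]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], top_k: int = 2):
    """Stage 2: nearest neighbours by cosine similarity (a vector database's job)."""
    scored = [(cosine(query_vec, p.embedding), p) for p in CATALOG]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def rerank(candidates, relevance_weight: float = 0.7):
    """Stage 3: blend relevance with the purchase signal; the weight is arbitrary."""
    def score(pair):
        relevance, product = pair
        return relevance_weight * relevance + (1 - relevance_weight) * product.purchase_prob
    return [product for _, product in sorted(candidates, key=score, reverse=True)]


def search(text: str) -> list[str]:
    """Run the three stages in order and return product titles."""
    return [p.title for p in rerank(retrieve(understand_query(text)))]


print(search("keep my coffee warm at my desk"))
```

Keeping the stages as separate functions mirrors the staged description: each can be replaced independently, for example by swapping the toy `retrieve` for a call to a real vector index.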

Rakuten has also reportedly invested in custom inference hardware, including partnerships with chip makers to optimize model performance on servers in its data centers across Japan and Southeast Asia.

Industry Context: The E-Commerce AI Arms Race

Rakuten's move arrives amid an intensifying race among global e-commerce platforms to embed AI deeply into the shopping experience. The competitive landscape is shifting rapidly.

Amazon launched its Rufus AI shopping assistant in early 2024, using a combination of proprietary and third-party models to answer product questions and guide purchasing decisions. Shopify introduced Sidekick, an AI-powered assistant for merchants. Alibaba has deployed its Tongyi Qianwen LLM across Taobao and Tmall for product recommendations and customer service.

In the broader retail tech space, companies like Instacart, Mercado Libre, and Zalando have all announced AI-enhanced search and discovery features within the past 12 months. The consensus among industry analysts is that AI-powered search will become table stakes for major e-commerce platforms by 2026.

What sets Rakuten apart is its commitment to full vertical integration — owning the models, the training data, and the deployment infrastructure. This mirrors the approach taken by tech giants like Apple with its on-device AI strategy, prioritizing control over convenience.

What This Means for Sellers and Shoppers

For sellers on Rakuten's marketplace, the shift to LLM-powered search carries significant implications. Product listings optimized purely for keyword density may become less effective, while those with rich, descriptive natural language content could see improved visibility.

Sellers should anticipate several changes:

  • Product descriptions matter more than ever — detailed, conversational descriptions will be better understood by semantic search
  • Attribute completeness becomes critical, as the LLM extracts structured data from listings to match against complex queries
  • Review quality may influence search ranking, as the model can parse sentiment and extract product insights from customer feedback
  • Long-tail discoverability improves, meaning niche products may gain visibility they previously lacked under keyword-based systems

For shoppers, the experience should feel more intuitive. Users can describe what they want in natural language — 'a lightweight waterproof jacket for spring hiking under $100' — and receive relevant results without needing to navigate complex filter menus or guess the right keywords.
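What "understanding" such a query means in practice is turning free text into the structured filters a shopper would otherwise set by hand. The sketch below uses regular expressions and small hard-coded vocabularies as a stand-in for the LLM. The `QueryFilters` class and both vocabularies are hypothetical, and the stand-in ignores the use case ("spring hiking") that a real model would also capture.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical vocabularies; an LLM would not need fixed word lists.
KNOWN_ATTRIBUTES = {"lightweight", "waterproof", "insulated"}
KNOWN_CATEGORIES = {"jacket", "boots", "backpack"}


@dataclass
class QueryFilters:
    category: Optional[str] = None
    attributes: list[str] = field(default_factory=list)
    max_price: Optional[float] = None


def parse_query(query: str) -> QueryFilters:
    """Extract a category, attributes, and a price ceiling from free text."""
    text = query.lower()
    filters = QueryFilters()

    # Price ceiling written as "under $<amount>".
    price = re.search(r"under \$(\d+(?:\.\d+)?)", text)
    if price:
        filters.max_price = float(price.group(1))

    # Walk the words once, keeping known attributes and the first known category.
    for token in re.findall(r"[a-z]+", text):
        if token in KNOWN_ATTRIBUTES:
            filters.attributes.append(token)
        elif token in KNOWN_CATEGORIES and filters.category is None:
            filters.category = token
    return filters


parsed = parse_query("a lightweight waterproof jacket for spring hiking under $100")
print(parsed)
```

For the example query this yields the category `jacket`, the attributes `lightweight` and `waterproof`, and a price ceiling of 100.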

The potential reduction in search abandonment rates — Rakuten targets a 30% improvement — could translate directly to higher conversion rates and increased gross merchandise value (GMV) across the platform.

Challenges and Risks Ahead

Hallucination risks remain a concern, even in constrained e-commerce environments. If the LLM misinterprets a query or surfaces irrelevant products, it could erode shopper trust more quickly than a simple 'no results found' page.

Latency is another challenge. LLM inference is computationally expensive, and e-commerce shoppers expect search results in under 200 milliseconds. Rakuten will need to balance model complexity with response speed, likely relying on aggressive caching, model distillation, and hardware acceleration to meet performance targets.

There are also regulatory considerations. As AI-driven systems increasingly determine which products gain visibility, questions about algorithmic fairness and transparency will inevitably arise. The European Union's Digital Services Act and Japan's evolving AI governance framework may require platforms to disclose how AI influences search rankings.

Looking Ahead: Rakuten's Broader AI Ambitions

The search integration appears to be just one component of Rakuten's larger AI strategy. The company has signaled ambitions to deploy LLMs across its entire ecosystem, which spans financial services (Rakuten Bank, Rakuten Card), telecommunications (Rakuten Mobile), and digital content (Rakuten Viki, Rakuten Kobo).

Cross-platform personalization — using AI to create a unified understanding of each user across shopping, banking, and entertainment — represents a significant opportunity. Few companies outside of Google and Apple possess this breadth of consumer data.

Rakuten's CEO Hiroshi "Mickey" Mikitani has publicly stated that AI is the company's top strategic priority for 2025 and beyond. If the e-commerce search integration proves successful, expect rapid expansion into customer service automation, dynamic pricing, and personalized marketing.

The global e-commerce industry generates over $6 trillion in annual revenue. As AI transforms how consumers discover and evaluate products, companies that control their own AI infrastructure — rather than renting it — may hold a decisive advantage. Rakuten's bet on proprietary LLMs is a calculated gamble that the future of online retail belongs to those who own the intelligence layer, not just the marketplace.