New Method Reverse-Engineers Closed-Source LLM Parameter Counts Through Knowledge Capacity

💡 A new arXiv paper proposes 'Incompressible Knowledge Probes,' a method that leverages information-theoretic lower bounds on fact storage to externally estimate the parameter scale of closed-source LLMs, offering a new tool for industry transparency.

Closed-Source Model Parameter Counts Remain a Mystery as Researchers Propose a Novel Estimation Approach

Leading AI labs increasingly decline to disclose the parameter counts of their closed-source large language models, so the true size of models like GPT-4 and Claude remains a matter of industry speculation. Existing estimates typically rely on inference economics: inferring parameter counts indirectly from inference speed, hardware configurations, and serving architecture. These estimates are sensitive to external factors such as hardware type, batching strategy, and serving-stack design, and they are often off by more than 2x, which raises serious reliability concerns.
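To illustrate why the inference-economics route is fragile, here is a minimal sketch of one common back-of-envelope heuristic. It is an assumption made for illustration, not a method taken from the paper: in memory-bandwidth-bound decoding at batch size 1, every weight is read once per generated token, so the parameter count is roughly bandwidth divided by (tokens per second times bytes per parameter). The function name and all numbers are hypothetical.

```python
def params_from_decode_speed(tokens_per_sec: float,
                             mem_bandwidth_bytes_per_sec: float,
                             bytes_per_param: float) -> float:
    """Back-of-envelope parameter estimate for bandwidth-bound decoding.

    Assumes (hypothetically) batch size 1 and that every weight is read
    once per generated token, so:
        tokens_per_sec ~= bandwidth / (params * bytes_per_param)
    """
    if min(tokens_per_sec, mem_bandwidth_bytes_per_sec, bytes_per_param) <= 0:
        raise ValueError("all inputs must be positive")
    return mem_bandwidth_bytes_per_sec / (tokens_per_sec * bytes_per_param)


# Same observed speed, two different guesses about the serving stack.
observed_tokens_per_sec = 50.0   # hypothetical measurement
bandwidth = 2.0e12               # hypothetical: 2 TB/s of memory bandwidth

estimate_fp16 = params_from_decode_speed(observed_tokens_per_sec, bandwidth, 2.0)
estimate_int8 = params_from_decode_speed(observed_tokens_per_sec, bandwidth, 1.0)

print(f"assuming 16-bit weights: {estimate_fp16:.2e} parameters")
print(f"assuming 8-bit weights:  {estimate_int8:.2e} parameters")
```

Changing only the assumed weight precision doubles the estimate, and that is before any uncertainty about hardware, batching, or parallelism is taken into account.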

A recent arXiv paper (arXiv:2604.24827v1) introduces an intrinsic estimation method, Incompressible Knowledge Probes, which attempts to infer how large a model is from how much it knows.

Core Principle: Information-Theoretic Lower Bounds on Knowledge Storage

The method rests on an information-theoretic principle: storing F facts requires at least F ÷ (bits per parameter) weight parameters, with F expressed as the total number of bits those facts carry. In other words, the more facts a model can accurately recall, the higher the lower bound on its parameter count.
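The bound can be written out with the units made explicit. The symbols are introduced here for illustration and are not taken from the paper: a model recalls n facts of b bits each, so the total factual content is F = n·b bits; β is the assumed maximum number of bits a single parameter can store; and N is the parameter count.

```latex
N \;\ge\; \frac{F}{\beta} \;=\; \frac{n \, b}{\beta}
```

Doubling the number of recalled facts, or halving the assumed storage capacity per parameter, doubles the lower bound on N.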

The research team translated this principle into an actionable experimental framework. They constructed large-scale factual knowledge probing datasets, systematically measured the number of facts a target model could correctly answer, and then combined this with coding efficiency upper bounds from information theory to compute a lower-bound estimate of the model's parameter count.
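The steps described above can be sketched as a short calculation. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name, the idea of extrapolating from a uniformly sampled probe set to a larger fact pool, and every numeric value are hypothetical.

```python
import math


def min_params_from_probe(num_correct: int,
                          num_probes: int,
                          fact_pool_size: int,
                          bits_per_fact: float,
                          max_bits_per_param: float) -> int:
    """Lower-bound parameter count implied by probe accuracy.

    Illustrative assumptions:
      - the probes are a uniform random sample from a pool of
        `fact_pool_size` incompressible facts,
      - each fact carries `bits_per_fact` bits,
      - one parameter stores at most `max_bits_per_param` bits.
    """
    if num_probes <= 0 or not 0 <= num_correct <= num_probes:
        raise ValueError("need num_probes > 0 and 0 <= num_correct <= num_probes")
    if fact_pool_size < 0 or bits_per_fact <= 0 or max_bits_per_param <= 0:
        raise ValueError("pool size must be >= 0; bit quantities must be > 0")

    recall_rate = num_correct / num_probes        # measured on the probe set
    facts_known = recall_rate * fact_pool_size    # extrapolated to the whole pool
    total_bits = facts_known * bits_per_fact      # information the model must store
    return math.ceil(total_bits / max_bits_per_param)


# Hypothetical inputs chosen only to show the arithmetic.
bound = min_params_from_probe(
    num_correct=500,
    num_probes=1000,
    fact_pool_size=1_000_000_000,
    bits_per_fact=40.0,
    max_bits_per_param=2.0,
)
print(f"parameter count is at least {bound:,}")
```

Because the coding-efficiency term is an upper bound on what a parameter can store, dividing by it can only underestimate the true parameter count, which is what makes the result a lower bound.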

Unlike inference economics approaches, this method's advantage lies in its intrinsic nature — it depends solely on the model's own knowledge capacity rather than external hardware and deployment assumptions, theoretically providing tighter and more reliable estimation bounds.

Technical Analysis: Why This Approach Is More Reliable

From a technical perspective, the method's key innovations include the following:

1. Leveraging incompressibility to guarantee valid lower bounds. The researchers carefully designed "incompressible" factual probes — knowledge points that lack exploitable statistical patterns, preventing models from "cheating" through learned shortcuts or pattern compression. Each fact must occupy real parameter capacity for storage, ensuring the rigor of the lower-bound estimate.

2. Independence from external assumptions. Traditional inference economics methods require assumptions about what GPUs the provider uses, what quantization precision is employed, and what parallelism strategies are deployed. Any deviation in these assumptions can cause estimation results to shift dramatically. The knowledge probe method completely bypasses these uncertainties.

3. A scalable black-box testing framework. The entire testing process only requires sending queries to the target model via API and collecting responses, with no need for any internal model access — making it naturally suited for evaluating closed-source commercial models.
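The black-box workflow in point 3 might look like the sketch below. Everything here is assumed for illustration: `ask` is a stand-in for whatever API client is actually used, exact-match scoring after light normalization is one possible grading rule (the article does not say how the paper scores answers), and the "facts" are invented placeholders.

```python
from typing import Callable, Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def count_recalled_facts(ask: Callable[[str], str],
                         probes: Iterable[Tuple[str, str]]) -> Tuple[int, int]:
    """Send each question through `ask` and count exact-match answers.

    `ask` only needs to map a prompt string to a response string, so no
    internal model access is required. Returns (num_correct, num_probes).
    """
    num_correct = 0
    num_probes = 0
    for question, expected in probes:
        num_probes += 1
        if normalize(ask(question)) == normalize(expected):
            num_correct += 1
    return num_correct, num_probes


# Stand-in "model" for demonstration: a lookup table that knows two of three facts.
_fake_memory = {
    "What is the serial number of unit A?": "QX-4471",
    "What is the serial number of unit B?": "LM-0923",
}


def fake_model(prompt: str) -> str:
    return _fake_memory.get(prompt, "I don't know")


probes = [
    ("What is the serial number of unit A?", "qx-4471"),
    ("What is the serial number of unit B?", "LM-0923"),
    ("What is the serial number of unit C?", "ZR-7750"),
]

correct, total = count_recalled_facts(fake_model, probes)
print(f"{correct}/{total} probes answered correctly")
```

Keeping the model behind a plain callable means the same scoring loop can be pointed at any provider's API without changes.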

Industry Significance: Advancing AI Transparency and Fair Competition

The significance of this research extends well beyond the technical level. Parameter count is currently a key proxy for an AI system's scale, cost, and capability, and a reference point for investors, regulators, and users evaluating AI products. When closed-source labs decline to disclose it, the resulting information asymmetry can undermine fair competition and effective regulation in the industry.

If this method can be validated as sufficiently accurate in practice, it would provide independent third-party organizations with a low-cost, high-credibility model-scale auditing tool, helping drive the entire AI industry toward greater transparency.

Outlook and Limitations

The method still faces several challenges. First, it yields a lower bound on parameter count rather than a precise value: a model may have far more parameters than fact storage alone requires, with the surplus capacity devoted to generalization, reasoning, and other capabilities. Second, techniques such as knowledge distillation and Mixture of Experts (MoE) architectures may change how many bits each parameter effectively stores, which calls for further theoretical and experimental validation.

Nevertheless, this research opens a new direction for understanding and measuring large models from the perspective of knowledge capacity, offering a creative solution for model evaluation in the era of closed-source AI that merits continued attention.