ICY Tech and SEMIFIVE Tape Out Asia's First 8nm eMRAM Edge AI Chip
SEMIFIVE, a South Korean chip design services company, and ICY Tech (寒序科技), a Chinese spintronics chip developer, have successfully taped out a next-generation edge AI system-on-chip (SoC) using Samsung Foundry's 8LPU process with embedded MRAM technology. The milestone marks Asia's first commercial deployment of 8nm eMRAM technology, signaling a potentially significant shift in how edge AI devices handle memory-intensive inference workloads.
The announcement, made on May 7, also represents SEMIFIVE's first ASIC design project leveraging eMRAM — a non-volatile memory technology that could reshape the economics of on-device AI processing.
Key Takeaways
- First in Asia: The tape-out is the first commercial deployment of 8nm eMRAM technology on the continent
- Samsung 8LPU process: The chip leverages Samsung Foundry's mature yet capable 8nm low-power process node
- Processing Near Memory (PNM): ICY Tech's architecture addresses the bandwidth bottleneck that plagues edge AI inference
- 2B parameter support: The SoC can run models with up to 2 billion parameters entirely on-device
- Non-volatile advantage: eMRAM eliminates the need for periodic refresh cycles required by DRAM, cutting power consumption
- Higher density than SRAM: Smaller cell sizes enable more memory capacity in the same die area
What Makes eMRAM a Game-Changer for Edge AI
Embedded Magnetic Random Access Memory (eMRAM) represents a fundamentally different approach to on-chip data storage. Conventional DRAM stores data as electrical charge that leaks over time and must be refreshed constantly. eMRAM instead uses a magnetic tunnel junction (MTJ), which stores each bit as the relative magnetization of two ferromagnetic layers and reads it back as a high or low electrical resistance.
This distinction has direct practical consequences. Because the magnetic state persists when the supply is removed, eMRAM retains data without power, which makes it non-volatile. Eliminating refresh cycles lowers power consumption, a critical advantage for battery-powered edge devices.
Compared to SRAM, which is traditionally used for on-chip caches and buffers, eMRAM achieves significantly higher data density thanks to smaller cell sizes. Where a typical 6-transistor SRAM cell occupies a relatively large silicon footprint, an MTJ-based eMRAM cell is far more compact. This means designers can pack more memory onto the same die, enabling larger on-chip buffers without ballooning chip area or cost.
For edge AI applications — where every milliwatt and every square millimeter matters — this combination of non-volatility, low power, and high density positions eMRAM as an increasingly attractive alternative to legacy memory technologies.
Processing Near Memory Tackles the Bandwidth Bottleneck
One of the most technically interesting aspects of ICY Tech's new SoC is its adoption of a Processing Near Memory (PNM) architecture. This design philosophy places compute logic as close as physically possible to the memory arrays, dramatically reducing the energy and latency costs of shuttling data back and forth across a chip.
The bandwidth bottleneck is arguably the single biggest challenge facing edge AI inference today. Running even modest large language models (LLMs) on-device requires moving enormous volumes of weight parameters and activations between memory and processing units. In traditional architectures, this data movement — not the actual computation — dominates both power consumption and latency.
PNM architectures attack this problem at its root. By co-locating compute and memory, ICY Tech's design minimizes data transfer distances and enables higher effective bandwidth without requiring wider, more power-hungry memory buses. The result is a chip that can support 2 billion parameter models running entirely at the edge — a capability that would have been impractical with conventional architectures at this process node.
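To put rough numbers on that bottleneck, the sketch below estimates the weight-read bandwidth needed during token-by-token text generation. It assumes every weight is read once per generated token and ignores activations, caching, and any on-chip reuse. The weight precisions and token rates are illustrative assumptions, not figures published by ICY Tech or SEMIFIVE.

```python
def required_bandwidth_gb_s(num_params: float, bits_per_weight: int,
                            tokens_per_s: float) -> float:
    """Approximate weight-read bandwidth (decimal GB/s) for autoregressive decoding.

    Simplifying assumption: each weight is read exactly once per generated token.
    """
    bytes_per_token = num_params * bits_per_weight / 8
    return bytes_per_token * tokens_per_s / 1e9


PARAMS = 2e9  # 2 billion parameters, the figure quoted in the article

for bits in (16, 8, 4):       # hypothetical weight precisions
    for rate in (5, 20):      # hypothetical generation rates, tokens per second
        bw = required_bandwidth_gb_s(PARAMS, bits, rate)
        print(f"{bits:>2}-bit weights at {rate:>2} tok/s: ~{bw:.0f} GB/s")
```

Under these assumptions, even 8-bit weights at 20 tokens per second call for about 40 GB/s of sustained reads, which helps explain why shortening the path between memory and compute matters more than adding raw compute.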
This 2B parameter threshold is significant. Models in this range, such as the smaller and distilled variants of Llama and Qwen, can handle a wide range of practical NLP tasks including summarization, classification, translation, and basic conversational AI, all without cloud connectivity. Microsoft's Phi-2, at 2.7 billion parameters, sits just above this limit.
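For a sense of scale, the storage taken up by the weights alone can be estimated from the parameter count and the number of bits per weight. The sketch below does that arithmetic. The precisions are assumptions chosen for illustration, since the announcement does not say what precision the chip uses.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Storage needed for the model weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 2e9  # 2 billion parameters, as quoted in the article

for bits in (16, 8, 4):  # hypothetical weight precisions
    print(f"{bits:>2}-bit weights: {weight_footprint_gb(PARAMS, bits):.1f} GB")
```

At these assumed precisions the weights occupy roughly 4 GB, 2 GB, and 1 GB respectively, before counting activations or any runtime buffers.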
Why Samsung's 8LPU Process Was the Right Choice
The decision to use Samsung Foundry's 8LPU (8nm Low Power Ultimate) process node reflects a pragmatic approach to edge AI chip design. While leading-edge nodes like 3nm and 4nm grab headlines, the 8nm node offers a compelling balance of performance, power efficiency, and cost-effectiveness for edge applications.
Samsung's 8LPU process is a mature, well-characterized node with established design rules and reliable yields. For a first-of-its-kind eMRAM integration, this maturity reduces risk significantly. Attempting to integrate a novel memory technology on a bleeding-edge process node would compound the engineering challenges and potentially jeopardize the project.
Moreover, edge AI chips operate under fundamentally different constraints than data center accelerators. They do not need the raw transistor density of a 3nm process. Instead, they need:
- Low standby and active power consumption for battery or energy-harvested operation
- Sufficient compute density to run inference on models up to a few billion parameters
- Cost-effective manufacturing to enable deployment at scale across IoT and embedded devices
- Reliable integration of heterogeneous technologies like eMRAM alongside standard logic
Samsung's 8LPU checks all these boxes, making it an ideal foundation for ICY Tech's PNM-based edge AI SoC.
SEMIFIVE's Growing Role in the AI Chip Ecosystem
For SEMIFIVE, this tape-out represents a strategic expansion into emerging memory technologies. Founded in 2019, the Seoul-based company has built its reputation as a platform-based ASIC design services provider, helping clients navigate the complexity of modern chip design without maintaining a full in-house semiconductor team.
The successful eMRAM integration demonstrates SEMIFIVE's ability to handle non-standard design challenges — a differentiator in an increasingly competitive chip design services market. As more startups and established companies seek custom silicon for AI workloads, firms like SEMIFIVE that can integrate novel technologies will command premium positions in the value chain.
This project also highlights the growing cross-border collaboration in Asia's semiconductor ecosystem. Despite geopolitical tensions that have complicated some technology partnerships, the ICY Tech-SEMIFIVE collaboration shows that technical cooperation between Chinese AI chip designers and Korean semiconductor service providers continues to produce results.
The broader chip design services market has been expanding rapidly, driven by the proliferation of AI workloads across edge, mobile, and IoT segments. According to industry estimates, the global ASIC market is projected to exceed $30 billion by 2028, with edge AI representing one of the fastest-growing segments.
Industry Context: The Race to Bring AI to the Edge
The ICY Tech-SEMIFIVE tape-out arrives amid an industry-wide push to move AI inference from the cloud to edge devices. Major players across the semiconductor landscape are investing heavily in this transition:
- Qualcomm has integrated NPUs into its Snapdragon mobile processors capable of running 7B+ parameter models on smartphones
- MediaTek has embedded AI accelerators in its Dimensity chipsets targeting mid-range and flagship devices
- Intel continues to push its Meteor Lake and subsequent architectures with dedicated AI tiles
- Apple has steadily expanded the Neural Engine in its M-series and A-series chips
- Startup ecosystem players like Hailo, Syntiant, and now ICY Tech are targeting specialized edge AI niches
What distinguishes ICY Tech's approach is the use of eMRAM as a core architectural element rather than a supplementary feature. Most edge AI chips today rely on conventional SRAM for on-chip memory and external LPDDR for larger model storage. By integrating eMRAM directly into the SoC with a PNM architecture, ICY Tech is betting on a more tightly coupled compute-memory paradigm.
This approach aligns with broader industry trends toward in-memory computing and near-memory processing, which academic researchers and companies like Samsung, SK Hynix, and TSMC have been exploring for years. ICY Tech's successful tape-out suggests these concepts are now moving from research labs to commercial silicon.
What This Means for Developers and Businesses
For developers building edge AI applications, this chip could open new possibilities in several key areas. On-device inference with 2B parameter models enables privacy-preserving AI processing — data never leaves the device, addressing regulatory requirements under frameworks like GDPR and emerging AI governance rules.
The non-volatile nature of eMRAM also has practical implications for deployment. Devices can retain model weights in memory even when powered off, enabling instant-on AI capabilities without the boot-time latency associated with loading models from flash storage into volatile SRAM or DRAM. This is particularly valuable for industrial IoT, automotive, and security applications where response time is critical.
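The boot-time cost that non-volatile weights avoid can be approximated as model size divided by storage read speed. The sketch below works through that estimate. The 4-bit precision and both flash read speeds are hypothetical values picked for illustration, not measurements of any real device.

```python
def load_time_s(num_params: float, bits_per_weight: int,
                read_mb_per_s: float) -> float:
    """Seconds needed to copy the model weights from storage into working memory."""
    model_bytes = num_params * bits_per_weight / 8
    return model_bytes / (read_mb_per_s * 1e6)


PARAMS = 2e9  # 2 billion parameters, as quoted in the article
BITS = 4      # hypothetical weight precision

# Hypothetical sustained read speeds for external flash, in MB/s.
for label, speed in (("slower flash", 100), ("faster flash", 1000)):
    seconds = load_time_s(PARAMS, BITS, speed)
    print(f"{label} at {speed} MB/s: ~{seconds:.0f} s to load weights")
```

With these assumed numbers the load step takes between about 1 and 10 seconds, a delay that disappears if the weights are already resident in non-volatile on-chip memory at power-up.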
Businesses deploying edge AI at scale stand to benefit from the cost and power advantages of eMRAM-based SoCs. Lower power consumption extends battery life and reduces thermal management requirements, while higher memory density potentially reduces bill-of-materials costs by eliminating the need for external memory chips.
Looking Ahead: From Tape-Out to Commercial Deployment
A successful tape-out is a critical milestone, but it is not the finish line. ICY Tech and SEMIFIVE must now validate the silicon through extensive testing, characterize performance and power across operating conditions, and work with potential customers to develop reference designs and software stacks.
Several key questions remain for the months ahead:
- Yield and reliability: How well does the eMRAM integration perform at volume manufacturing? MTJ-based memories have historically faced endurance and retention challenges that must be thoroughly validated.
- Software ecosystem: What frameworks and toolchains will support model deployment on this architecture? Compatibility with popular inference frameworks like ONNX Runtime, TensorFlow Lite, or proprietary SDKs will be essential.
- Commercial timeline: When will production-grade chips be available for design-in by OEMs and system integrators?
- Competitive positioning: How will this chip's performance-per-watt compare against established edge AI solutions from Qualcomm, MediaTek, and specialized startups?
If ICY Tech can answer these questions favorably, its eMRAM-based edge AI SoC could carve out a meaningful niche in the rapidly expanding market for on-device AI processing. The tape-out shows the design is ready for fabrication. The next challenges are proving it works in silicon and then proving it works in the market.
For the broader semiconductor industry, this milestone is an early sign that eMRAM is becoming commercially viable for advanced SoC integration. As AI models continue to proliferate across every category of electronic device, memory technologies that can deliver higher density, lower power, and non-volatility simultaneously will become increasingly important. Asia's first 8nm eMRAM tape-out may well be remembered as an early inflection point in that transition.
📌 Source: GogoAI News (www.gogoai.xin)
🔗 Original: https://www.gogoai.xin/article/icy-tech-and-semifive-tape-out-asias-first-8nm-emram-edge-ai-chip