Agentic AI ignites efficiency race amid memory crunch
![SK hynix's 12-layer HBM4 memory chips on display at the SK AI Summit in Seoul on Nov. 3 [YONHAP]](https://koreajoongangdaily.joins.com/data/photo/2026/04/25/8819ff65-8fa1-4ad0-8f10-ad082fcb1622.jpg)
SK hynix's 12-layer HBM4 memory chips on display at the SK AI Summit in Seoul on Nov. 3 [YONHAP]
What price do we pay when AI takes on human tasks? As the saying goes, “There is no free lunch.”
Artificial intelligence is consuming memory chips at an explosive pace, and for now there is no viable substitute.
Memory chips provide the high-speed storage AI models need to perform computations. As their performance improves, the amount of data that can be fed into AI models increases and processing speeds rise. They are also considered essential for delegating multi-process tasks to AI agents, a type of AI system that can carry out tasks on its own based on instructions.
But that demand is quickly creating a bottleneck.
Rihard Jarc, chief investment officer at U.S. tech-focused investment firm New Era Funds, claimed that an analysis of the source code of Anthropic’s Claude showed AI’s memory consumption was higher than expected.
“AI coding agents consumed up to 93 to 129GB of memory during active sessions, with idle processes still using around 15GB each," Jarc said on X on March 31.
![A smartphone displaying the logo of the U.S. artificial intelligence safety and research company Anthropic [AFP/YONHAP]](https://koreajoongangdaily.joins.com/data/photo/2026/04/25/ad5c8c3c-d689-4160-a4f9-40c077ed7b2b.jpg)
A smartphone displaying the logo of the U.S. artificial intelligence safety and research company Anthropic [AFP/YONHAP]
At those peak levels, a single agent can occupy about 40 percent of a 288-gigabyte HBM4 memory stack, an indication of how much memory these systems require. If multiple AI agents are deployed, total memory demand rises substantially.
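A back-of-the-envelope check, using only the figures reported above, shows where the roughly 40 percent share comes from. This is a minimal Python sketch; the 288 GB figure is the HBM4 stack capacity cited in this article, and the variable names are illustrative.

```python
# Back-of-the-envelope check using only the figures reported above.
HBM4_STACK_GB = 288        # capacity of the HBM4 stack cited in the article
PEAK_USE_GB = (93, 129)    # reported range for an active coding-agent session
IDLE_USE_GB = 15           # reported footprint of a single idle agent process

for peak in PEAK_USE_GB:
    print(f"{peak} GB peak -> {peak / HBM4_STACK_GB:.0%} of one stack")
# 93 GB peak -> 32% of one stack
# 129 GB peak -> 45% of one stack

# Idle processes add up too: ten parked agents already hold 150 GB.
print(f"10 idle agents -> {10 * IDLE_USE_GB} GB")
```

The midpoint of that 32 to 45 percent range lands near the "about 40 percent" share cited above, and the idle-agent line shows why multi-agent deployments compound the problem even when agents are not actively working.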
Demand for AI agents themselves is also rising rapidly.
The number of AI agents in use worldwide reached 28.5 million last year and is expected to grow to 2.2 billion by 2030, according to global market research firm Statista. "Multi-agent" systems, in which a single user runs several AI agents at once, are also gaining traction.
Anthropic has reportedly begun shifting its pricing model from time-based subscriptions to usage-based fees, as demand outstrips the company's available memory capacity.
“Even processing text data alone is already straining memory capacity, and demand will surge further when video and images are included," Kim Tae-ho, chief technology officer at Nota AI, said.
AI has options when it comes to computing chips: GPUs can be swapped out for neural processing units (NPUs) or Google's tensor processing units (TPUs). For memory semiconductors, however, the dominant option is high bandwidth memory (HBM), a 3-D-stacked, high-performance memory interface.
As a result, companies are increasingly focused on reducing how much memory AI models use.
Google's TurboQuant, introduced in March through the company's official blog, reflects this shift. The technology improves AI efficiency by addressing the memory bottlenecks that occur when large language models process data. Google plans to present a full paper on TurboQuant for peer review at the International Conference on Learning Representations, which runs from Thursday to Monday in Brazil.
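The article does not detail how TurboQuant works, but its name points to quantization, a standard way to cut memory use by storing model values at lower numerical precision. The sketch below illustrates that generic idea with simple 8-bit quantization; it is an assumption-laden illustration of the technique family, not Google's implementation, and all function names are hypothetical.

```python
import numpy as np

# Illustrative sketch of plain 8-bit quantization, the general family of
# techniques the name TurboQuant suggests. NOT Google's algorithm; it only
# shows why lower precision shrinks the memory footprint.
def quantize_int8(x: np.ndarray) -> tuple[np.ndarray, float]:
    scale = float(np.abs(x).max()) / 127.0   # per-tensor scale (assumes x is nonzero)
    q = np.round(x / scale).astype(np.int8)  # 1 byte per value instead of 4
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

x = np.random.randn(1024, 1024).astype(np.float32)  # stand-in for cached model data
q, scale = quantize_int8(x)
print(f"memory: {x.nbytes} -> {q.nbytes} bytes (4x smaller)")
print(f"max reconstruction error: {np.abs(x - dequantize(q, scale)).max():.4f}")
```

Dropping from 32-bit floats to 8-bit integers cuts memory four-fold at the cost of a small reconstruction error, which is the basic trade-off any quantization scheme, including more sophisticated ones, is designed to manage.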
Korean startups are also joining the race to improve memory efficiency.
AI startup Nota AI has developed technology that compresses the data exchanged between AI models and accelerators to reduce memory usage. MakinaRocks and ActionPower, startups specializing in AI transformation, the business of adopting and integrating artificial intelligence into company operations, are focusing on reducing memory use to enable "on-device" AI, which runs directly on a device rather than in the cloud.
“To run AI agents on devices like laptops and smartphones, which cannot accommodate large memory chips, minimizing memory usage is the best approach," Lee Ji-hwa, chief technology officer at ActionPower, said.
The industry expects a shift toward AI models designed around how much memory they consume.
“AI developers are currently focused on building state-of-the-art models to dominate the market, but as adoption accelerates, cheaper and lighter models will emerge, leading to market segmentation," an industry source said.
This article was originally written in Korean and translated by a bilingual reporter with the help of generative AI tools. It was then edited by a native English-speaking editor. All AI-assisted translations are reviewed and refined by our newsroom.
BY OH HYEON-WOO. [lee.jian@joongang.co.kr]