
DeepSeek releases new AI tech that makes large models run on cheaper hardware

Updated 2026.01.13 22:57 GMT+8
Gong Zhe


DeepSeek, the team behind some of the world's most powerful open-weight AI models, on Tuesday dropped a new paper it co-authored that could change how we think about AI's memory use. The research, personally spearheaded by founder Liang Wenfeng, introduces a way to run massive models using far less of the precious video memory they normally require.

The secret sauce is a tech called "conditional memory." Much like DeepSeek's previous work with Mixture-of-Experts (MoE), this tech is all about efficiency. It separates an AI's "logic" from its "knowledge," allowing the bulk of the model's data to be stored on cheaper, more accessible hardware.
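
To make the general idea concrete, here is a minimal, hypothetical sketch in PyTorch of what separating "logic" from "knowledge" can look like: a large lookup table stays in ordinary CPU RAM, and only the few rows a given query needs are moved onto the GPU. The names, sizes and structure below are illustrative assumptions, not DeepSeek's actual Engram code.

```python
import torch

# Illustrative sketch, not DeepSeek's Engram implementation: keep the model's
# "knowledge" (a big lookup table) in ordinary CPU RAM, while the "logic"
# (the neural network itself) sits in faster accelerator memory.

# Hypothetical knowledge store: 100,000 entries of 256 floats (~100 MB) held in
# CPU RAM. In a real system this table could be far larger, since it never has
# to fit in video memory.
knowledge_table = torch.randn(100_000, 256)

# The "logic" part: a small network placed on the GPU when one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
logic = torch.nn.Linear(256, 256).to(device)

def answer(knowledge_ids: torch.Tensor) -> torch.Tensor:
    # Fetch only the rows the current query needs, then move just those rows
    # to the accelerator. The full table never leaves cheap CPU memory.
    needed_rows = knowledge_table[knowledge_ids].to(device)
    return logic(needed_rows)

# A query that touches three entries of the knowledge table.
print(answer(torch.tensor([3, 14, 15_926])).shape)  # torch.Size([3, 256])
```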

What's more, the tech enables near-instant searching of the model's knowledge base. While today's mainstream solution, retrieval-augmented generation (RAG), often feels clunky and slow, DeepSeek's method returns what it needs almost immediately. It's like having a library where, the moment you think of a question, the right book opens in your hand to the correct page.

DeepSeek has released the official code for this tech under the name "Engram."

"Engram enables the model to effectively scale its knowledge capacity … allowing the model to perform better on knowledge-intensive tasks while maintaining high training and inference efficiency," the paper said.

For users, this means the future of AI will likely be cheaper, faster and much better at remembering what you said fifty prompts ago.
