
DeepSeek Unveils V3.2-Exp: A Game-Changer for Long-Context AI Tasks

DeepSeek's new model slashes costs for long-context tasks. DSA makes it 6× cheaper at 128k. Plus, it's now fully open source.



DeepSeek AI has unveiled DeepSeek-V3.2-Exp, an update to its language model that promises significant improvements in efficiency and cost for long-context tasks.

The new version, designed as a drop-in replacement in Retrieval-Augmented Generation (RAG) and long-document pipelines, introduces DeepSeek Sparse Attention (DSA). DSA splits the attention path into two tiers: a lightweight 'indexer' that scores the cached tokens, and sparse attention computed only over the subset the indexer selects. The result is a substantial reduction in decode costs, roughly 6× cheaper at 128k context.
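To make the two-tier idea concrete, here is a minimal, illustrative sketch of sparse decoding, not DeepSeek's implementation: a cheap low-dimensional indexer ranks every cached token, and full attention then runs only over the top-k it keeps. The function names, dimensions, and the top-k budget are assumptions for illustration.

```python
import numpy as np

def sparse_decode_attention(q, keys, values, idx_keys, idx_q, top_k=2048):
    """Illustrative two-tier sparse attention for one decode step.

    Tier 1: a cheap indexer scores every cached token with a small
            projection and keeps only the top_k highest-scoring ones.
    Tier 2: softmax attention is computed over that subset only, so the
            decode cost scales with top_k rather than the full context.
    """
    # Tier 1: lightweight relevance scores over the whole cache (low-dim, cheap)
    index_scores = idx_keys @ idx_q                     # (seq_len,)
    k = min(top_k, index_scores.shape[0])
    selected = np.argpartition(-index_scores, k - 1)[:k]

    # Tier 2: standard scaled-dot-product attention, restricted to the subset
    scores = keys[selected] @ q / np.sqrt(q.shape[0])   # (k,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values[selected]                    # (d_v,)

# Toy usage: 128k cached tokens, but attention touches only 2,048 of them
seq_len, d, d_idx, d_v = 131_072, 64, 16, 64
rng = np.random.default_rng(0)
q        = rng.standard_normal(d)
keys     = rng.standard_normal((seq_len, d))
values   = rng.standard_normal((seq_len, d_v))
idx_q    = rng.standard_normal(d_idx)
idx_keys = rng.standard_normal((seq_len, d_idx))
print(sparse_decode_attention(q, keys, values, idx_keys, idx_q).shape)  # (64,)
```

Because the expensive attention step now touches a fixed-size subset instead of the whole 128k cache, per-token decode cost stops growing with context length, which is where the quoted ~6× saving comes from.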

DeepSeek-V3.2-Exp retains the V3/V3.1 stack, including Mixture-of-Experts (MoE) and Multi-Head Latent Attention (MLA). DSA is implemented under MLA in MQA mode for decoding, so latent key-value (KV) entries are reused across query heads. The indexer is trained to mimic the dense model's attention distribution with a KL-divergence loss, first in a dense warm-up stage and then in a sparse training stage.
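The training objective for the indexer can be sketched as a distillation-style loss: treat the dense model's attention weights as the teacher distribution and pull the indexer's softmaxed scores toward them with KL divergence. This is a simplified sketch under that assumption; the helper names and dimensions are illustrative, not DeepSeek's code.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def indexer_kl_loss(dense_attn_probs, indexer_scores):
    """KL(dense attention || indexer distribution) for one query position.

    The dense attention probabilities act as the teacher target; the
    indexer's logits are softmaxed and pushed toward them, so the cheap
    indexer learns which tokens full attention would have weighted most.
    """
    p = dense_attn_probs            # teacher: dense attention weights
    q = softmax(indexer_scores)     # student: indexer's distribution
    eps = 1e-12
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Toy check: the loss shrinks as the indexer's scores match the dense pattern
rng = np.random.default_rng(0)
dense_logits = rng.standard_normal(512)
target = softmax(dense_logits)
print(indexer_kl_loss(target, rng.standard_normal(512)))  # random indexer: large
print(indexer_kl_loss(target, dense_logits))              # matching scores: ~0
```

In the dense warm-up stage this loss can be computed against full attention everywhere; in the sparse stage the model runs with the indexer's selections while the indexer keeps being aligned to the attention pattern.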

DeepSeek-V3.2-Exp is fully open source under the MIT license. In line with its efficiency gains, DeepSeek has paired the release with an API price cut of more than 50%, so the new version delivers benchmark parity with V3.1 while improving long-context economics.
