DeepSeek Unveils V3.2-Exp: A Game-Changer for Long-Context AI Tasks
DeepSeek AI has unveiled DeepSeek-V3.2-Exp, an experimental update to its language model that promises significant efficiency gains and cost reductions on long-context tasks.
The new version, designed as a drop-in replacement in Retrieval-Augmented Generation (RAG) and long-document pipelines, introduces DeepSeek Sparse Attention (DSA). DSA splits the attention path into two tiers: a lightweight 'indexer' that scores cached tokens, followed by attention computed only over the top-scoring subset. This yields a substantial reduction in decode cost, roughly 6× cheaper at 128k-token context. A minimal sketch of the idea is shown below.
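The following is an illustrative sketch of two-tier sparse attention for a single decode step, not DeepSeek's actual implementation: a cheap indexer scores every cached token, a top-k subset is selected, and standard attention runs only over that subset. All names (`sparse_attention_decode`, `top_k`, the indexer tensors) are assumptions for the example.

```python
import torch

def sparse_attention_decode(q, k_cache, v_cache, idx_q, idx_k, top_k=2048):
    """One decode step: a lightweight indexer picks top_k past tokens,
    then full attention runs only over that subset."""
    # Tier 1: lightweight indexer -- cheap dot-product scores over every cached token.
    scores = idx_k @ idx_q                                   # [seq_len]
    selected = torch.topk(scores, k=min(top_k, scores.numel())).indices

    # Tier 2: standard scaled dot-product attention over the selected subset only.
    k_sub, v_sub = k_cache[selected], v_cache[selected]      # [top_k, d]
    attn = torch.softmax((k_sub @ q) / q.shape[-1] ** 0.5, dim=0)
    return attn @ v_sub                                       # [d]

# Toy usage: 128k cached tokens, but attention touches only 2,048 of them.
seq_len, d, d_idx = 131_072, 128, 32
out = sparse_attention_decode(
    torch.randn(d),
    torch.randn(seq_len, d), torch.randn(seq_len, d),        # dense K/V cache
    torch.randn(d_idx), torch.randn(seq_len, d_idx),         # indexer query/keys
)
```

The design intuition: the indexer works in a much smaller dimension than the main attention, so scanning the whole cache stays cheap, while the expensive attention arithmetic scales with the fixed subset size rather than the full context length.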
DeepSeek-V3.2-Exp retains the V3/V3.1 architecture, including Mixture-of-Experts (MoE) and Multi-Head Latent Attention (MLA). DSA is implemented under MLA, with decoding run in MQA mode so that latent key-value (KV) entries are reused across query heads. The indexer is trained to mimic the dense model's attention distribution with a KL-divergence objective, first in a dense warm-up stage and then in a sparse training stage.
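As a hedged illustration of that training objective, the sketch below shows a KL-divergence loss aligning an indexer's score distribution with a dense attention distribution for one query position. The function name and tensor shapes are assumptions for the example, not the paper's actual code.

```python
import torch
import torch.nn.functional as F

def indexer_kl_loss(indexer_scores, dense_attn):
    """KL(dense_attention || indexer_distribution) for one query position.
    indexer_scores: raw indexer logits over past tokens, shape [seq_len]
    dense_attn:     dense attention weights over the same tokens, shape [seq_len]
    """
    log_p_indexer = F.log_softmax(indexer_scores, dim=-1)
    # kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(log_p_indexer, dense_attn, reduction="sum")

# Toy check: the loss shrinks as the indexer matches the dense distribution.
dense = torch.softmax(torch.randn(1024), dim=-1)
print(indexer_kl_loss(torch.randn(1024), dense))         # random indexer: larger loss
print(indexer_kl_loss(torch.log(dense + 1e-9), dense))   # matched indexer: near zero
```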
DeepSeek-V3.2-Exp is open source, with the release licensed under MIT. In keeping with its pattern of passing efficiency gains on to users, DeepSeek has cut API prices by 50% or more alongside this update. Benchmarks remain on par with V3.1 while long-context economics improve substantially.