DeepSeek V4 set to launch in mid-July with peak hour pricing that doubles current rates

DeepSeek’s V4 series, expected to fully launch in mid-July, will implement a tiered pricing structure that doubles token rates during designated peak periods.

How the pricing works

The peak windows are set from 9:00 to 12:00 and 14:00 to 18:00 Beijing Time. During those hours, both input and output token costs will double across the two V4 models: deepseek-v4-pro and deepseek-v4-flash.

V4-Flash currently sits at roughly $0.14 per million input tokens (on cache misses) and $0.28 per million output tokens. For V4-Pro, peak output pricing lands around 12 yuan per million tokens, approximately $1.76. V4-Flash peak output comes in at about 4 yuan per million tokens, or roughly $0.59.
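As a rough illustration of how the doubling would affect a bill, the sketch below checks whether a request falls inside one of the stated Beijing Time windows and applies a 2x multiplier to the V4-Flash rates quoted above. The function names are hypothetical, and several details are assumptions rather than anything DeepSeek has published: that the windows are half-open (a request at exactly 12:00 or 18:00 counts as off-peak), that they apply every day of the week, and that billing follows the time the request is made.

```python
from datetime import datetime, time, timedelta, timezone

# Beijing Time is UTC+8 year-round (China does not observe daylight saving).
BEIJING = timezone(timedelta(hours=8))

# Peak windows from the article, treated as half-open [start, end) intervals.
# Boundary handling is an assumption; the article does not specify it.
PEAK_WINDOWS = [(time(9, 0), time(12, 0)), (time(14, 0), time(18, 0))]

# Off-peak V4-Flash rates quoted in the article, in USD per million tokens.
FLASH_INPUT_USD = 0.14   # input tokens on cache misses
FLASH_OUTPUT_USD = 0.28  # output tokens
PEAK_MULTIPLIER = 2.0    # both input and output double during peak hours


def is_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware datetime falls in a peak window."""
    local = moment.astimezone(BEIJING).time()
    return any(start <= local < end for start, end in PEAK_WINDOWS)


def estimate_flash_cost(input_tokens: int, output_tokens: int, moment: datetime) -> float:
    """Estimate the USD cost of one V4-Flash request made at the given moment."""
    base = (input_tokens * FLASH_INPUT_USD + output_tokens * FLASH_OUTPUT_USD) / 1_000_000
    return base * (PEAK_MULTIPLIER if is_peak(moment) else 1.0)
```

Under these assumptions, a request with one million input tokens and one million output tokens would cost about $0.42 off-peak and about $0.84 during a peak window.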

DeepSeek says users will receive 24-hour advance notification before any price changes take effect.

What V4 actually brings to the table

The V4 series first appeared as a preview on April 24, 2026. Both models use a Mixture of Experts (MoE) architecture. V4-Pro packs 1.6 trillion total parameters. V4-Flash comes in at 284 billion total parameters. Both support context windows of 1 million tokens and were trained on a dataset exceeding 32 trillion tokens. Both ship under an MIT license.

Why surge pricing matters for the AI industry

Beijing Time peak hours correspond to late evening and overnight in the US (roughly 9 PM to midnight and 2 AM to 6 AM Eastern Daylight Time), which means American developers running workloads during their own business day would hit DeepSeek’s off-peak rates.
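The time-zone arithmetic can be checked with a short sketch. It converts the window boundaries from Beijing Time (UTC+8) to US Eastern Daylight Time (UTC-4), which is the offset in force in July; during standard time the Eastern results would fall one hour earlier. The helper name and the date used are illustrative assumptions, since only the 12-hour offset matters.

```python
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))  # UTC+8, no daylight saving
EDT = timezone(timedelta(hours=-4))     # assumption: US Eastern Daylight Time


def beijing_hour_to_edt(hour: int) -> str:
    """Convert a whole hour in Beijing Time to HH:MM in EDT."""
    # The calendar date is arbitrary; it only anchors the conversion.
    beijing_dt = datetime(2026, 7, 15, hour, 0, tzinfo=BEIJING)
    return beijing_dt.astimezone(EDT).strftime("%H:%M")


for start, end in [(9, 12), (14, 18)]:
    print(
        f"{start:02d}:00-{end:02d}:00 Beijing -> "
        f"{beijing_hour_to_edt(start)}-{beijing_hour_to_edt(end)} EDT"
    )
```

The morning window maps to 21:00-00:00 EDT on the previous calendar day and the afternoon window to 02:00-06:00 EDT, leaving the US business day entirely off-peak.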

Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.
