DeepSeek releases Prover-V2 model with 671 billion parameters
DeepSeek today released a new model, DeepSeek-Prover-V2-671B, on the open-source AI community Hugging Face. The model reportedly has 671 billion parameters and is an upgraded version of the Prover-V1.5 mathematical model released last year. It uses the more efficient safetensors file format and supports multiple computation precisions, making it faster and less resource-intensive to train and deploy. Architecturally, the model is built on DeepSeek-V3 and adopts a Mixture of Experts (MoE) design, with 61 Transformer layers and a 7,168-dimensional hidden layer. It also supports ultra-long contexts, with a maximum position embedding of 163,800, enabling it to handle complex mathematical proofs, and it applies FP8 quantization to reduce model size and improve inference efficiency. (Jinse)
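For readers who want to verify the reported architecture details themselves, the sketch below shows one way to inspect the published configuration using the Hugging Face `transformers` library. The repository id `deepseek-ai/DeepSeek-Prover-V2-671B` and the exact configuration field names are assumptions based on how DeepSeek-V3-style checkpoints are typically published, not details confirmed by this article.

```python
# Minimal sketch (not from the article): inspect the model configuration
# published on Hugging Face and compare it with the figures reported above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "deepseek-ai/DeepSeek-Prover-V2-671B",  # assumed Hugging Face repo id
    trust_remote_code=True,                 # DeepSeek-V3-style checkpoints ship custom model code
)

# Field names below follow DeepSeek-V3 config conventions (an assumption).
print("Transformer layers:", config.num_hidden_layers)            # reported as 61
print("Hidden size:", config.hidden_size)                         # reported as 7,168
print("Max position embeddings:", config.max_position_embeddings) # reported as ~163,800
```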