Try DeepSeek v3 on Vector Robot
We have found that a Vector robot integrated with Wirepod is a great platform for experimenting with the latest Large Language Models (LLMs) released by different research teams. For example, we previously evaluated o1-preview from OpenAI and Llama 3.1 405B from Meta.
DeepSeek v3
DeepSeek v3 is the latest LLM making waves in the news, and many aspects of it are particularly interesting and newsworthy. This post discusses some of its notable features and how you can try the model using Vector + Wirepod.
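Since DeepSeek serves its models through an OpenAI-compatible API, an OpenAI-style endpoint (like the ones Wirepod can be pointed at) can also be exercised directly from a script. Here is a minimal sketch, assuming you have a DeepSeek API key in the DEEPSEEK_API_KEY environment variable; this is a standalone sanity check, not the Wirepod configuration itself:

```python
import os
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint; the base_url and the
# "deepseek-chat" model name follow DeepSeek's published API docs.
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # DeepSeek-V3 is served under this model name
    messages=[
        # Hypothetical prompt just for the demo
        {"role": "system", "content": "You are Vector, a helpful home robot."},
        {"role": "user", "content": "Say hello in one short sentence."},
    ],
)
print(response.choices[0].message.content)
```

If this call returns a sensible reply, the same endpoint and model name should work anywhere an OpenAI-compatible provider can be configured.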
DeepSeek-V3 is a Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token during inference. The Mixture-of-Experts architecture means that although the model is huge (671 billion parameters), only a small subset of it (37B parameters) is used to generate each token. The model itself decides which subset to use, and the selection can differ from token to token. Running inference…
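To make the routing idea concrete, here is a toy sketch of top-k MoE gating in Python. This is not DeepSeek's actual implementation; the expert count, top-k value, and dimensions are arbitrary, and each "expert" is just a random linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)

num_experts = 8  # DeepSeek-V3 uses far more experts; 8 keeps the demo readable
top_k = 2        # experts activated per token
d_model = 16     # token embedding size (arbitrary)

# Each "expert" is a random linear layer in this sketch.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(num_experts)]
gate = rng.normal(size=(d_model, num_experts))  # gating network weights

def moe_forward(token: np.ndarray) -> np.ndarray:
    scores = token @ gate              # one routing score per expert
    top = np.argsort(scores)[-top_k:]  # indices of the k best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()           # softmax over the chosen experts only
    # Only the selected experts do any work; the rest are skipped entirely,
    # which is why a 671B-parameter model can run ~37B parameters per token.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=d_model)
print(moe_forward(token).shape)  # (16,)
```

The key design point mirrored here is that compute per token scales with top_k, not with the total number of experts, even though all expert parameters must still be stored.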