https://huggingface.co/deepseek-ai/DeepSeek-V2.5
deepseek-ai/DeepSeek-V2.5 · Hugging Face
DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions.
V2 has a paper and a published architecture, but V2.5 has neither...
https://github.com/deepseek-ai/DeepSeek-V2
GitHub - deepseek-ai/DeepSeek-V2: DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
The main contribution of the V2 paper is MLA (Multi-head Latent Attention).
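To make the MLA idea concrete, here is a minimal sketch of its core trick as I understand it from the V2 paper: each token's hidden state is compressed into one small latent vector, only that latent is cached, and per-head keys and values are re-expanded from it when attention is computed. This is a simplification, not the actual DeepSeek implementation: it leaves out the decoupled RoPE keys, the query compression, and the output projection. All function names, matrix names, and dimensions below are made up for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random matrix; stands in for learned weights (illustrative only)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def mla_step(h_t, latent_cache, W_q, W_dkv, W_uk, W_uv, n_heads):
    """One decoding step of a simplified MLA-style attention layer."""
    # Down-project the hidden state to a small latent and cache only that.
    c_t = matvec(W_dkv, h_t)
    latent_cache.append(c_t)

    q = matvec(W_q, h_t)
    d_head = len(q) // n_heads

    # Re-expand every cached latent into full-width keys and values.
    keys = [matvec(W_uk, c) for c in latent_cache]
    values = [matvec(W_uv, c) for c in latent_cache]

    out = []
    for h in range(n_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        # Scaled dot-product scores for this head over all cached positions.
        scores = [
            sum(a * b for a, b in zip(q[lo:hi], k[lo:hi])) / math.sqrt(d_head)
            for k in keys
        ]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
        z = sum(exps)
        for j in range(lo, hi):
            out.append(sum(e / z * v[j] for e, v in zip(exps, values)))
    return out


# Toy dimensions (assumptions, far smaller than any real model).
rng = random.Random(0)
d_model, d_latent, n_heads, d_head = 32, 8, 4, 16
W_q = rand_matrix(n_heads * d_head, d_model, rng)
W_dkv = rand_matrix(d_latent, d_model, rng)
W_uk = rand_matrix(n_heads * d_head, d_latent, rng)
W_uv = rand_matrix(n_heads * d_head, d_latent, rng)

latent_cache = []
for _ in range(5):
    h_t = [rng.gauss(0.0, 1.0) for _ in range(d_model)]
    out = mla_step(h_t, latent_cache, W_q, W_dkv, W_uk, W_uv, n_heads)

# Cache size in floats: latent cache vs. a standard per-head K/V cache.
cached_floats = len(latent_cache) * d_latent
full_kv_floats = len(latent_cache) * 2 * n_heads * d_head
```

With these toy sizes, the latent cache holds 8 floats per token, where a standard multi-head K/V cache would hold 2 × 4 × 16 = 128. That cache reduction is the point of MLA. The sketch re-expands keys and values on every step for readability, so it shows the memory saving but not an efficient way to compute it.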