How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals

Set up and run the GPQA-Diamond benchmark on DeepSeek-R1’s distilled models locally to evaluate their reasoning capabilities.


This content originally appeared on Level Up Coding - Medium and was authored by Kenneth Leung

Kenneth Leung | Sciencx (2025-05-13T13:27:49+00:00) How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals. Retrieved from https://www.scien.cx/2025/05/13/how-to-benchmark-deepseek-r1-distilled-models-on-gpqa-using-ollama-and-openais-simple-evals/
