How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals

This content originally appeared on Level Up Coding - Medium and was authored by Kenneth Leung

Set up and run the GPQA-Diamond benchmark locally on DeepSeek-R1's distilled models to evaluate their reasoning capabilities.

APA

Kenneth Leung | Sciencx (2025-05-13T13:27:49+00:00) How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals. Retrieved from https://www.scien.cx/2025/05/13/how-to-benchmark-deepseek-r1-distilled-models-on-gpqa-using-ollama-and-openais-simple-evals/
