How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals

Post date: May 13, 2025
Post author: Kenneth Leung
Post categories: deepseek, generative-ai-tools, large-language-models, llm, openai