Chinese AI Model Promises Gemini 2.5 Pro-level Performance at One-fourth of the Cost

This content originally appeared on HackerNoon and was authored by This Week in AI Engineering

Hello AI Enthusiasts!

Welcome to the Twenty-Fourth edition of "This Week in AI Engineering"!

This week, the spotlight shines on MiniMax, the Chinese AI startup that just released a frontier-level open-weight reasoning model, MiniMax-M1, with some jaw-dropping benchmarks. We also saw Google introduce a new Flash-Lite variant that's faster and cheaper. Meanwhile, Kimi-Dev-72B emerges as one of the strongest open-source coding models ever, targeting real-world debugging workflows with a two-agent architecture.

As always, we’ll wrap things up with under-the-radar tools and releases that deserve your attention.

MiniMax-M1 is INSANE

Chinese startup MiniMax is back in the spotlight with their new open-weight reasoning model, MiniMax-M1, and it is nothing short of impressive. M1 supports a context window of 1 million tokens, putting it in the same class as Gemini 2.5 Pro. But here’s the kicker: thanks to its hybrid Mixture-of-Experts architecture and lightning attention mechanism, it achieves the same reasoning quality as DeepSeek R1 at just 25% of the compute cost. And yes, it’s completely open sourced.

Variants & Benchmarks \n MiniMax-M1 comes in two variants: M1-40K and M1-80K, referring to their token output capacities. Both are built on the 456B parameter MiniMax-Text-01 foundation, with just 45.9B activated per token. That MoE architecture makes inference cheaper and faster.
On AIME 2024, M1-80K scored 86.0% accuracy. It also logged:
65.0% on LiveCodeBench
56.0% on SWE-bench Verified
62.8% on TAU-bench
73.4% on OpenAI MRCR (4-needle version)
These results place it ahead of Qwen3-235B and DeepSeek R1 on long-context and software reasoning tasks.

Training Cost

The most shocking detail is it was trained with just $534,700 worth of compute, using 512 NVIDIA H800 GPUs for three weeks. Compare that to DeepSeek’s $5.6 million or OpenAI’s hundred-million-dollar pipelines, and you realize how aggressively MiniMax is optimizing for cost-efficiency without compromising on performance.

Open Access and Developer Features

MiniMax-M1 includes structured function calling, online search-enabled chatbots, image/video generation, and voice cloning via API. For deployment, it supports vLLM and Transformers-based backends for enterprise-ready serving.
This is a massive win for open-access frontier models, especially for long-context workflows and agent development.

MiniMax Isn’t Done Yet: Meet Hailuo 02

Right after dropping M1, they also released Hailuo 02 , their most advanced text-to-video and image-to-video model yet , and it's turning heads.
With 6-second clips at 768p and native support for detailed prompts, Hailuo delivers physically coherent, visually sharp, and story-driven outputs that rival even Google’s Veo 3.
What really sets it apart is the realistic motion and camera control. Think accurate gravity, collisions, fluid effects. And the pricing’s competitive too. At $0.25 per 6s clip or $0.52 for 10s, it’s cheaper than most closed models with this level of fidelity.
MiniMax also ships an API with Hailuo, making it easier for devs to integrate. If you’re building for VFX, cinematic content, or interactive story tools , this one’s worth a test run.

Gemini 2.5 Flash-Lite: Google’s Cheapest

Google has officially made Gemini 2.5 Pro and Flash generally available for production use. These hybrid reasoning models have already been deployed by partners like Snap, Rooms, and SmartBear. But the real highlight is the new Gemini 2.5 Flash-Lite, now in preview. It’s the fastest and cheapest model in the 2.5 family. Despite that, it outperforms Gemini 2.0 Flash-Lite in coding, math, reasoning, science, and multimodal benchmarks.

Flash-Lite supports:

Tool use via code execution and Google Search
Multimodal input (text, images, audio)
1 million-token context length
Low-latency, high-throughput tasks like classification, translation, and data extraction
The model is now live in Google AI Studio, Vertex AI, and the Gemini app. Early demos include converting PDFs into interactive dashboards and automating analytics reports from unstructured text.
Gemini 2.5 Flash-Lite is a strong contender for real-time AI assistants and high-volume internal tooling.

The Best Open Coding Model Yet?

Moonshot AI’s new Kimi-Dev-72B just hit 60.4% on SWE-bench Verified, making it the strongest open-weight coding model right now. What makes Kimi-Dev different is its dual-agent setup. The model uses two specialized agents:

BugFixer, which identifies and patches faulty code
TestWriter, which generates unit tests to confirm and prevent regressions
Both agents follow a 2-step routine of file localization and precise code edits. The model is trained on over 150B tokens of real-world GitHub issues and PRs, and then fine-tuned with reinforcement learning and a self-play mechanism to handle complex debugging tasks.
What stands out is its outcome-based reward system and curriculum-style training pipeline, which boosts success rates by filtering weak prompts and reinforcing correct solutions.
It’s available on GitHub and Hugging Face with model weights, source code, and full tech report to follow. If you’re building automated code review, debugging, or developer agent tools, this is a serious contender.

AI Video Gets Wild: Kling & Midjourney

If you thought AI video couldn’t get more cinematic, wait till you see this. Chinese startup KlingAI dropped a Studio Ghibli–style short, complete with hand-drawn textures, dreamy movements. They also shared some ASMR videos. The timing, the rhythm, the SFX matches perfectly.
Meanwhile, Midjourney just opened up its V1 video model ,turning any image into a stylized animation. You get to control motion intensity, select “low” or “high” movement, and even tweak the pacing. The only catch is it costs 8x more credits than a regular image gen. But for creators who already love Midjourney’s aesthetic, it might be worth the price.

Tools & Releases YOU Should Know About

Unicorn Platform is an AI-first website builder tailored for indie creators, startups, and SaaS founders. It comes with drag-and-drop templates, AI-powered copywriting, and built-in translation, all optimized for fast deployment. The platform also includes SSL, CDN, SEO tools, and integrations for forms and newsletters. The free plan includes one live site, while paid plans unlock team features and multiple projects.

CodingFleet's Python Code Generator streamlines development by transforming natural language instructions into production-ready code through an intuitive interface. The tool supports 60+ programming languages and frameworks. Users simply describe their requirements in plain English, and CodingFleet delivers clean, documented code snippets with implementation guidance.It's built for developers who want fast, precise outputs across stacks.

**AirCodum **lets developers to seamlessly interact with their coding environment using touch, voice, and custom keyboard commands. With AirCodum, users can transfer files, images, and code snippets between their mobile devices and VS Code effortlessly.

And that wraps up this issue of "This Week in AI Engineering."

Thank you for tuning in! Be sure to share this newsletter with your fellow AI enthusiasts and follow for more weekly updates.

This content originally appeared on HackerNoon and was authored by This Week in AI Engineering

Print Share Comment Cite Upload Translate Updates

APA

This Week in AI Engineering | Sciencx (2025-06-23T03:34:11+00:00) Chinese AI Model Promises Gemini 2.5 Pro-level Performance at One-fourth of the Cost. Retrieved from https://www.scien.cx/2025/06/23/chinese-ai-model-promises-gemini-2-5-pro-level-performance-at-one-fourth-of-the-cost/

MLA

" » Chinese AI Model Promises Gemini 2.5 Pro-level Performance at One-fourth of the Cost." This Week in AI Engineering | Sciencx - Monday June 23, 2025, https://www.scien.cx/2025/06/23/chinese-ai-model-promises-gemini-2-5-pro-level-performance-at-one-fourth-of-the-cost/

HARVARD

This Week in AI Engineering | Sciencx Monday June 23, 2025 » Chinese AI Model Promises Gemini 2.5 Pro-level Performance at One-fourth of the Cost., viewed ,<https://www.scien.cx/2025/06/23/chinese-ai-model-promises-gemini-2-5-pro-level-performance-at-one-fourth-of-the-cost/>

VANCOUVER

This Week in AI Engineering | Sciencx - » Chinese AI Model Promises Gemini 2.5 Pro-level Performance at One-fourth of the Cost. [Internet]. [Accessed ]. Available from: https://www.scien.cx/2025/06/23/chinese-ai-model-promises-gemini-2-5-pro-level-performance-at-one-fourth-of-the-cost/

CHICAGO

" » Chinese AI Model Promises Gemini 2.5 Pro-level Performance at One-fourth of the Cost." This Week in AI Engineering | Sciencx - Accessed . https://www.scien.cx/2025/06/23/chinese-ai-model-promises-gemini-2-5-pro-level-performance-at-one-fourth-of-the-cost/

IEEE

" » Chinese AI Model Promises Gemini 2.5 Pro-level Performance at One-fourth of the Cost." This Week in AI Engineering | Sciencx [Online]. Available: https://www.scien.cx/2025/06/23/chinese-ai-model-promises-gemini-2-5-pro-level-performance-at-one-fourth-of-the-cost/. [Accessed: ]

rf:citation

» Chinese AI Model Promises Gemini 2.5 Pro-level Performance at One-fourth of the Cost | This Week in AI Engineering | Sciencx | https://www.scien.cx/2025/06/23/chinese-ai-model-promises-gemini-2-5-pro-level-performance-at-one-fourth-of-the-cost/ |

Please log in to upload a file.

There are no updates yet.
Click the Upload button above to add an update.

You must be logged in to translate posts. Please log in or register.