Arm, Meta Partner to Improve AI Power Efficiency | October 16, 2025 | By Sheharyar Khan | Categories: ai, ai-efficiency, arm-and-meta, data-centers, executorch, generative-ai-efficiency, neoverse, pytorch
Value Today Means Moving Faster Than the Plan | August 20, 2025 | By hackernoon | Categories: ai, ai-efficiency, ai-in-ui-design, ai-in-ux, digital-opportunities, merging-ai-roles, product-design, product-management
Self-Speculative Decoding Speeds for Multi-Token LLMs | June 6, 2025 | By Large Models (dot tech) | Categories: ai-efficiency, code-generation, inference-optimization, llm-decoding-speed, llm-inference, multi-token-models, multi-token-prediction, self-speculative-decoding
The Hidden Power of “Cherry” Parameters in Large Language Models | March 6, 2025 | By Disproportionate Techstack | Categories: ai-efficiency, ai-model-optimization, cherryq-algorithm, llm-performance, llm-quantization, low-bit-quantization, mixed-precision-training, parameter-heterogeneity
Rethinking AI Quantization: The Missing Piece in Model Efficiency | March 6, 2025 | By Disproportionate Techstack | Categories: ai-efficiency, ai-model-optimization, cherryq-algorithm, llm-performance, llm-quantization, low-bit-quantization, mixed-precision-training, parameter-heterogeneity
The Future of AI Compression: Smarter Quantization Strategies | March 6, 2025 | By Disproportionate Techstack | Categories: ai-efficiency, ai-model-optimization, cherryq-algorithm, llm-performance, llm-quantization, low-bit-quantization, mixed-precision-training, parameter-heterogeneity
The Impact of Parameters on LLM Performance | March 6, 2025 | By Disproportionate Techstack | Categories: ai-efficiency, ai-model-optimization, cherryq-algorithm, llm-performance, llm-quantization, low-bit-quantization, mixed-precision-training, parameter-heterogeneity
Can ChatGPT-Style Models Survive Quantization? | March 6, 2025 | By Disproportionate Techstack | Categories: ai-efficiency, ai-model-optimization, cherryq-algorithm, llm-performance, llm-quantization, low-bit-quantization, mixed-precision-training, parameter-heterogeneity
The Perplexity Puzzle: How Low-Bit Quantization Affects AI Accuracy | March 6, 2025 | By Disproportionate Techstack | Categories: ai-efficiency, ai-model-optimization, cherryq-algorithm, llm-performance, llm-quantization, low-bit-quantization, mixed-precision-training, parameter-heterogeneity
The Science of “Cherry” Parameters: Why Some LLM Weights Matter More | March 6, 2025 | By Disproportionate Techstack | Categories: ai-efficiency, ai-model-optimization, cherryq-algorithm, llm-performance, llm-quantization, low-bit-quantization, mixed-precision-training, parameter-heterogeneity
Quantizing Large Language Models: Can We Maintain Accuracy? | March 6, 2025 | By Disproportionate Techstack | Categories: ai-efficiency, ai-model-optimization, cherryq-algorithm, llm-performance, llm-quantization, low-bit-quantization, mixed-precision-training, parameter-heterogeneity