Comparing Efficiency Strategies for LLM Deployment and Summarizing PowerInfer‑2's Impact
Post date: November 3, 2025
Post author: Writings, Papers and Blogs on Text Models
Post categories: edge-computing, mobile-ai, model-optimization, neural-efficiency, on-device-llm, power-infer, quantization, speculative-decoding

Performance Evaluation of PowerInfer‑2: Offloading, Prefill, and In‑Memory Efficiency
Post date: November 3, 2025
Post author: Writings, Papers and Blogs on Text Models
Post categories: ai-infrastructure, benchmarking, Edge AI, mobile-inference, model-optimization, on-device-llm, power-infer, sparse-computing

Training Deep-Learning Models At Ultra-Scale Using PyTorch
Post date: April 9, 2025
Post author: Sahib Dhanjal
Post categories: ai, deep-learning, model-optimization, model-parallelism, scalability

Experiments
Post date: April 8, 2025
Post author: Machine Ethics
Post categories: computational-efficiency, computer-vision-(cv), early-bird-ticket-hypothesis, language-models, model-optimization, natural-language-processing, transformer-models, vision-transformers

How We Found Early-Bird Subnetworks in Transformers Without Retraining Everything
Post date: April 8, 2025
Post author: Machine Ethics
Post categories: computational-efficiency, computer-vision-(cv), early-bird-ticket-hypothesis, language-models, model-optimization, natural-language-processing, transformer-models, vision-transformers

Transformer Training Optimization via Early-Bird Ticket Analysis
Post date: April 8, 2025
Post author: Machine Ethics
Post categories: computational-efficiency, computer-vision-(cv), early-bird-ticket-hypothesis, language-models, model-optimization, natural-language-processing, transformer-models, vision-transformers