Quick Framework and some Performance Improvements

This content originally appeared on DEV Community and was authored by jefferson otoni lima

I'm very excited to share more improvements implemented in the Quick Framework, developed by @jeffotoni. These updates focus on robustness, performance, and support for modern protocols.

Early mornings improving Quick

As we're using Quick for an AI (Artificial Intelligence) communication project, several needs have emerged, leading to new implementations and improvements in Quick. In this post, I'll walk through some of the improvements made this week.

There's nothing better than putting a framework to the test in practice, and it's fascinating! I'm using it in AI projects, developing native servers to orchestrate flows with LLMs, creating custom connectors, and continuously building RAG solutions. It's been a challenging yet very exciting experience.

Reflections from a curious developer

The path is arduous, long, but incredibly enjoyable. I'm driven by curiosity, and sometimes I can't contain myself. I need to study again, revisit something that was right there with me all along, but only now makes sense, you know? It wasn't the right moment before, perhaps, but it was there, so close and yet so distant. And this belated discovery intrigues me even more.

Despite years of practice and development, I often feel as if I'm seeing the world of technology through a crack, like someone peeking at the "whole" through a keyhole. Reality seems fragmented; we only access bits and pieces, flashes, never the complete whole, and that's intriguing. And perhaps we'll never truly see it.

That's why I decided to create these posts, not only to show in practice what was done for the Quick framework, but also to share some sincere reflections from a developer constantly learning.

What was implemented? 😁

1. Duplicate WriteHeader Protection

We implemented a custom wrapper (responseWriter) that prevents "superfluous response.WriteHeader" errors. Now, multiple calls to WriteHeader are handled silently, avoiding crashes in complex middleware chains.

2. Hijacker Support

The responseWriter now implements the http.Hijacker interface, allowing connection upgrades natively. This enables real-time bidirectional communication directly through Quick.

3. HTTP/2 Server Push

We added support for the http.Pusher interface with the c.Pusher() method, enabling HTTP/2 Server Push to improve performance by proactively sending resources to the client before they are even requested, reducing latency and round-trips.

Example:

q.Get("/", func(c *quick.Ctx) error {
    if pusher, ok := c.Pusher(); ok {
        // Push errors are non-fatal: if the push fails, the client
        // simply fetches the resources normally
        _ = pusher.Push("/static/style.css", nil)
        _ = pusher.Push("/static/app.js", nil)
    }
    return c.Status(200).SendString("<html>...</html>")
})

4. Simplified Server-Sent Events (SSE) for Streaming LLMs

We implemented the c.Flusher() method, which facilitates real-time data streaming, essential for modern applications with Large Language Models (LLMs). It allows you to progressively send tokens as they are generated, creating interactive experiences in both the browser and CLIs.

Real-world use cases:
Streaming responses from ChatGPT, Claude, and Gemini
Real-time deployment logs
Progressive dashboard updates
Server push notifications

Example 1:

q.Post("/ai/chat", func(c *quick.Ctx) error {
    c.Set("Content-Type", "text/event-stream")
    c.Set("Cache-Control", "no-cache")
    c.Set("Connection", "keep-alive")

    flusher, ok := c.Flusher()
    if !ok {
        return c.Status(500).SendString("Streaming not supported")
    }

    // Simulate streaming tokens from the LLM
    tokens := []string{"Hello", " this", " is", " a", " streaming", " response", " from", " AI"}

    for _, token := range tokens {
        // Standard SSE format
        fmt.Fprintf(c.Response, "data: %s\n\n", token)
        flusher.Flush() // Sends the token to the client immediately
        time.Sleep(100 * time.Millisecond) // Simulates LLM latency
    }

    // Signals end of stream
    fmt.Fprintf(c.Response, "data: [DONE]\n\n")
    flusher.Flush()
    return nil
})

Example 2:

q.Get("/events", func(c *quick.Ctx) error {
    c.Set("Content-Type", "text/event-stream")
    c.Set("Cache-Control", "no-cache")

    flusher, ok := c.Flusher()
    if !ok {
        return c.Status(500).SendString("Streaming not supported")
    }

    for i := 0; i < 10; i++ {
        fmt.Fprintf(c.Response, "data: Message %d\n\n", i)
        flusher.Flush()
        time.Sleep(time.Second)
    }
    return nil
})

Client JavaScript (Browser):

// EventSource always issues a GET request, so point it at a GET route
// such as /events above; it cannot call the POST /ai/chat endpoint.
const eventSource = new EventSource('/events');
eventSource.onmessage = (event) => {
    if (event.data === '[DONE]') {
        eventSource.close();
        return;
    }
    document.getElementById('response').innerText += event.data;
};

Client CLI (Go):

resp, err := http.Post("http://localhost:8080/ai/chat", "application/json", body)
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

reader := bufio.NewReader(resp.Body)
for {
    line, err := reader.ReadString('\n')
    // Stop on the end-of-stream marker or when the body is exhausted
    if err != nil || strings.Contains(line, "[DONE]") {
        break
    }
    fmt.Print(strings.TrimPrefix(line, "data: "))
}

It works perfectly with HTTP/2 multiplexing, allowing multiple simultaneous streams on the same connection.

5. Pooling Optimization

The Reset() method has been optimized to reuse the existing wrapper instead of recreating it with each request. This reduces memory allocations in the hot path, improving throughput.

6. Memory Leak Prevention

We implemented full context cleanup in releaseCtx(), including Context and wroteHeader fields, ensuring that no residual state remains between requests.
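These two changes can be sketched together. The type shapes and field names below are hypothetical simplifications; Quick's real context carries more state:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// Simplified, illustrative shapes.
type responseWriter struct {
	http.ResponseWriter
	wroteHeader bool
}

type ctx struct {
	w       *responseWriter
	Context map[string]any
}

var ctxPool = sync.Pool{New: func() any { return &ctx{} }}

// Reset reuses the already-allocated wrapper instead of creating a new
// one per request, keeping this part of the hot path allocation-free.
func (c *ctx) Reset(w http.ResponseWriter) {
	if c.w == nil {
		c.w = &responseWriter{}
	}
	c.w.ResponseWriter = w
	c.w.wroteHeader = false
}

// releaseCtx clears residual state before the context returns to the
// pool, so nothing leaks between requests.
func releaseCtx(c *ctx) {
	c.Context = nil
	c.w.ResponseWriter = nil
	c.w.wroteHeader = false
	ctxPool.Put(c)
}

func main() {
	c := ctxPool.Get().(*ctx)
	c.Reset(nil)
	first := c.w
	c.Reset(nil) // next request: same wrapper instance, no new allocation
	fmt.Println(c.w == first) // prints true
	releaseCtx(c)
}
```

Clearing the fields in releaseCtx matters because a pooled object outlives the request: any reference left behind would pin the previous request's data in memory and could leak into the next handler.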

Impact

βœ… Greater robustness in high-concurrency scenarios
βœ… Native HTTP/2 support
βœ… Server-Sent Events (SSE) made easy with c.Flusher()
βœ… Reduced allocations per request
βœ… Zero breaking changes and 100% backward compatibility

All changes maintain full compatibility with existing code.

πŸ†š SSE vs WebSocket

| Aspect | SSE | WebSocket |
|---|---|---|
| Server CPU | 🟒 Low | 🟑 Medium |
| Server memory | 🟒 2-4 KB/connection | 🟑 8-16 KB/connection |
| Bandwidth | 🟒 Lower overhead | 🟑 Higher overhead |
| Latency | 🟑 ~50 ms | 🟒 ~5-10 ms |
| Implementation | 🟒 Simple | 🟑 Complex |
| Debugging | 🟒 Standard HTTP tools | πŸ”΄ Specific tools |
| Firewall/proxy | 🟒 Standard HTTP | 🟑 Possible issues |
| Bidirectional | πŸ”΄ No (server-to-client only) | 🟒 Yes |
| Protocol | HTTP/1.1 or HTTP/2 | WebSocket (RFC 6455) |
| Parsing | Plain text | Binary frames |
| Handshake | Normal HTTP request | HTTP Upgrade |
| Automatic reconnect | 🟒 Yes (native) | πŸ”΄ Manual |
| Browser support | 🟒 All modern | 🟒 All modern |
| Message overhead | ~45 bytes | ~50+ bytes |
| Ideal for | Notifications, feeds, logs | Chat, games, collaboration |
| Scalability | 🟒 ~10,000+ connections | 🟒 Thousands of connections |
| CDN compatible | 🟒 Yes | 🟑 Limited |
| Backend complexity | 🟒 Low | 🟑 High |

πŸ“Š Performance Comparison - SSE Writing Methods

This benchmark was run to identify the most efficient method for writing SSE events to an http.ResponseWriter. Tests were performed on an Apple M3 Max with Go 1.x, measuring nanoseconds per operation (ns/op), bytes allocated (B/op), and number of allocations (allocs/op).

Benchmark Results

The w.Write([]byte) method performs best at 13.56 ns/op with zero allocations, approximately 4x faster than fmt.Fprint() and 9x faster than io.WriteString().
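The gap is easy to reproduce with the standard testing package. The sketch below uses io.Discard in place of a real ResponseWriter, so absolute numbers will differ from the table, but the ranking holds:

```go
package main

import (
	"fmt"
	"io"
	"testing"
)

// benchWriteBytes measures a direct Write of a pre-formatted byte slice.
func benchWriteBytes(b *testing.B) {
	for i := 0; i < b.N; i++ {
		io.Discard.Write([]byte("data: token\n\n"))
	}
}

// benchFmtFprint measures the same frame written through fmt.Fprint,
// which pays for reflection-based formatting.
func benchFmtFprint(b *testing.B) {
	for i := 0; i < b.N; i++ {
		fmt.Fprint(io.Discard, "data: token\n\n")
	}
}

func main() {
	wb := testing.Benchmark(benchWriteBytes)
	ff := testing.Benchmark(benchFmtFprint)
	fmt.Printf("WriteBytes: %v\nFmtFprint:  %v\n", wb, ff)
}
```

testing.Benchmark lets you run these from a plain main; inside a _test.go file you would export them as Benchmark* functions and use the go test command shown later in this post.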

For large messages (>1 KB), it is recommended to use sync.Pool to reuse buffers, reducing allocations and pressure on the garbage collector.

Recommendations

Development/Debugging: Use fmt.Fprint() for simplicity.
Production (small messages): Use w.Write([]byte()) for maximum performance.
Production (large messages): Use sync.Pool with reused buffers.
High performance (>10,000 requests/s): Combine w.Write() with a buffer pool.

The full benchmark is available in /bench.

| Method | Performance | Allocations | Complexity | Recommendation |
|---|---|---|---|---|
| fmt.Fprint() | 🟑 Medium (53 ns) | 3 allocations | 🟒 Simple | βœ… Development |
| io.WriteString() | 🟑 Slow (116 ns) | 1 allocation | 🟒 Simple | ⚠️ Avoid |
| w.Write([]byte) | 🟒🟒 Excellent (13 ns) | 0 allocations | 🟒 Simple | βœ… Throughput |
| strings.Builder | 🟑 Slow (124 ns) | 1-2 allocations | 🟑 Medium | ⚠️ Avoid |
| Multiple Writes | 🟒 Good (21 ns) | 0 allocations | 🟒 Simple | βœ… Alternative |
| sync.Pool | 🟒🟒 Excellent (40 ns) | 0 allocations | πŸ”΄ Complex | βœ… High performance |
| Unsafe | 🟒🟒 Excellent (21 ns) | 0 allocations | πŸ”΄ Complex | ⚠️ Experts only |

πŸš€ Running the Benchmark

To reproduce the performance tests and validate the results on your machine, run:

go test -bench=. -benchtime=1s -benchmem ctx_bench_test.go

Benchmark Parameters

-bench=. - Run all benchmarks
-benchtime=1s - Run each benchmark for 1 second
-benchmem - Include memory allocation statistics

πŸ“Š Benchmark Results

Test Environment:

OS: macOS (darwin)
Architecture: ARM64
CPU: Apple M3 Max
Go Version: 1.x

Small messages:

| Method | ns/op | ops/sec | B/op | allocs/op | Performance |
|---|---|---|---|---|---|
| WriteBytes | 13.32 | 75.1M | 0 | 0 | πŸ₯‡ Winner |
| MultipleWrites | 20.68 | 48.4M | 0 | 0 | πŸ₯ˆ Excellent |
| Unsafe | 20.98 | 47.7M | 0 | 0 | πŸ₯‰ Great |
| Pooled | 39.00 | 25.6M | 0 | 0 | βœ… Good |
| FmtFprint | 52.69 | 19.0M | 16 | 1 | ⚠️ Slow |
| FmtFprintf | 62.61 | 16.0M | 16 | 1 | ⚠️ Slow |
| IoWriteString | 111.6 | 9.0M | 1024 | 1 | πŸ”΄ Very slow |
| Optimized | 119.7 | 8.4M | 1024 | 1 | πŸ”΄ Very slow |
| StringsBuilder | 122.7 | 8.2M | 1032 | 2 | πŸ”΄ Very slow |

Large messages (>1 KB):

| Method | ns/op | Speedup | B/op | allocs/op | Performance |
|---|---|---|---|---|---|
| PooledLarge | 234.5 | Baseline | 0 | 0 | πŸ₯‡ Winner |
| WriteBytesLarge | 811.2 | 3.46x slower | 9472 | 1 | πŸ”΄ Much slower |

Examples and Source Code

All tests can be accessed here: Quick SSE.

The benchmark source code is available here: bench.

Contributions

Quick is an open-source project in constant evolution. Feedback and contributions are always welcome!
GitHub: Quick

#golang #webframework #performance #opensource #go #quick

