<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Llama.cpp on ZoliBen Csupra(Kabra)</title><link>https://zoliben.com/en/tags/llama.cpp/</link><description>Recent content in Llama.cpp on ZoliBen Csupra(Kabra)</description><generator>Hugo</generator><language>en</language><lastBuildDate>Thu, 23 Apr 2026 12:00:00 +0000</lastBuildDate><atom:link href="https://zoliben.com/en/tags/llama.cpp/index.xml" rel="self" type="application/rss+xml"/><item><title>Qwen 3.6: 35B vs 27B comparison - benchmark results</title><link>https://zoliben.com/en/posts/2026-04-23-qwen-36-35b-vs-27b-benchmark-results/</link><pubDate>Thu, 23 Apr 2026 12:00:00 +0000</pubDate><guid>https://zoliben.com/en/posts/2026-04-23-qwen-36-35b-vs-27b-benchmark-results/</guid><description>&lt;p>I finally summarized all the Qwen 3.6 model test results I gathered over the past few days. I compared two models in detail: the &lt;strong>Qwen3.6-35B-A3B&lt;/strong> (MoE, hybrid attention/delta) and the &lt;strong>Qwen3.6-27B&lt;/strong> (dense, hybrid attention/delta). I ran both as llama.cpp servers with turbo3 KV cache compression on a single RTX 4090.&lt;/p>
&lt;p>If I had to summarize briefly: the 35B-A3B is &lt;strong>3-4x faster&lt;/strong> across the board, while the 27B delivers &lt;strong>better quality&lt;/strong>. It's the classic MoE vs. dense tradeoff, now backed by numbers.&lt;/p></description></item></channel></rss>