When running vLLM with gpt-oss-20b and combo kernels enabled, the log shows about 50 occurrences of the message "ComboKernels: 1 large pointwise nodes are separated". We should check whether this separation misses optimization opportunities, or consider not ...
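
To quantify how often the separation happens, a captured log can be scanned for the message. The following is a minimal sketch, assuming the log is available as plain text and that each occurrence matches the wording quoted above; the function name `count_separations` and the sample log lines are hypothetical and only serve as illustration.

```python
import re
from collections import Counter

# Pattern derived from the message quoted in this issue; the number of
# separated nodes is captured so that counts other than 1 are also tallied.
PATTERN = re.compile(r"ComboKernels: (\d+) large pointwise nodes are separated")


def count_separations(log_text: str) -> Counter:
    """Map 'number of separated nodes' to how many log messages reported it."""
    return Counter(int(m.group(1)) for m in PATTERN.finditer(log_text))


if __name__ == "__main__":
    # Hypothetical sample; a real log would come from the vLLM run.
    sample = "\n".join(
        [
            "ComboKernels: 1 large pointwise nodes are separated",
            "some unrelated log line",
            "ComboKernels: 1 large pointwise nodes are separated",
        ]
    )
    counts = count_separations(sample)
    for nodes, occurrences in sorted(counts.items()):
        print(f"{nodes} separated node(s): {occurrences} message(s)")
```

Grouping by the captured count makes it easy to tell whether every message reports a single separated node, as observed here, or whether larger groups are separated as well.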