When JSON Schema Crashes Your Inference Server: Regex DoS in C++
Feed `std::regex` a pathological pattern and a crafted input, and watch it spiral. Input length 16 characters: 4.48 milliseconds. Input length 18: 18 milliseconds. Input length 20: over a second. The
Modern C++ // dev May 8, 2026 9 min read
LLVM's Flat-Buffer Tree for IR Dominators: O(1) Reads vs O(n) Moves
Compiler optimization passes live and die on tree traversal. LLVM's dominator analysis alone queries ancestor relationships thousands of times per function. A real C++ translation unit with heavy temp
Modern C++ // dev May 5, 2026 9 min read
Compile-Time Unsigned Overflow Detection in C++: From `if`-checks to `constexpr` Assertions
Unsigned overflow wraps. That's the contract. C++ won't catch it, the CPU won't trap it. But your size calculation that wraps to a smaller buffer? That's a memory corruption vulnerability, and it's yo
Modern C++ // dev May 1, 2026 9 min read
Beyond Binary Search: When Interpolation Beats Quaternary, Radix, and SIMD
You have a sorted array. Need to search it. `std::lower_bound`: O(log n), predictable, robust. This is the default solution. But my industry experience breeds skepticism. Every few months, a new paper
Modern C++ // dev Apr 28, 2026 10 min read
Parallel Execution for Loops: C++26's Work-Stealing Scheduler Under the Hood
I've spent enough time staring at `perf stat` output to recognize a pattern: OpenMP's dynamic scheduler reaches **55,593 ops/sec** on Zipf-distributed task costs with 512 tasks. That's about 18 micro
Modern C++ // dev Apr 24, 2026 9 min read
Kernel Fusion on CPU: What llama.cpp's RMS_NORM + MUL Fusion Teaches Us About LLM Performance
llama.cpp PR #22423 landed a kernel fusion for RMS_NORM + MUL in the ggml CPU backend a few weeks ago. The speedup: 1.60×. Consistently. Across dimension sizes, thread counts, even hardware variatio
Modern C++ // dev Apr 21, 2026 7 min read
C++26 Move Semantics: What's New Since CppCon 2025 Basics Talk
If you watched Ben Saks's CppCon 2025 'Back to Basics: Move Semantics' talk, you know what moves are and why the compiler calls them. That talk is solid. C++26 doesn't contradict it. What it does is t
Modern C++ // dev Apr 10, 2026 8 min read
Designing a SIMD Algorithm from Scratch
I manually unrolled a byte-counting loop with four independent accumulators — the textbook ILP optimization — and it ran 2.08x *slower* than the plain loop. The plain loop that GCC had quietly autovec
Modern C++ // dev Mar 31, 2026 10 min read
Profile-Guided Optimization Made Our Code Slower
That's the whole story. I took a virtual-dispatch interpreter loop — the textbook PGO target — instrumented it, trained it on a representative workload, and recompiled. Both GCC 15.2.1 and Clang 21.1.
Modern C++ // dev Mar 10, 2026 8 min read