Archive
2026
- May 10 Unlocking ARM NEON Performance: A Deep Dive into Parallel Prefix Sums [performance]
- May 10 Building Fast, Immutable String-to-Float Array Maps in C++: A 'ConstMap' Inspired Approach [performance]
- May 10 Beyond Sequential Consistency: Practical Performance Gains with C++ Weaker Memory Orderings [concurrency]
- May 10 GCC 16.1 and C++20 Modules: A Practical Guide to Bridging the Tooling Gap [standards]
- May 8 When JSON Schema Crashes Your Inference Server: Regex DoS in C++ [ai-ml]
- May 5 LLVM's Flat-Buffer Tree for IR Dominators: O(1) Reads vs O(n) Moves [performance]
- May 1 Compile-Time Unsigned Overflow Detection in C++: From `if`-checks to `constexpr` Assertions [standards]
- Apr 28 Beyond Binary Search: When Interpolation Beats Quaternary, Radix, and SIMD [performance]
- Apr 24 Parallel execution for loops: C++26's work-stealing scheduler under the hood [concurrency]
- Apr 21 Kernel Fusion on CPU: What llama.cpp's RMS_NORM + MUL Fusion Teaches Us About LLM Performance [ai-ml]
- Apr 17 P3373R2: The Case for a Standardized Low-Latency I/O API [concurrency]
- Apr 14 Compile-Time String Substitution with C++26 Reflection [standards]
- Apr 10 C++26 Move Semantics: What's New Since CppCon 2025 Basics Talk [standards]
- Apr 7 C++23 std::stacktrace: Never Debug Blind Again [standards]
- Apr 3 C++26 Is Finalized: What Shipped, What Didn't, and What It Means [standards]
- Mar 31 Designing a SIMD Algorithm from Scratch [performance]
- Mar 27 C++ Profiles: What, Why, and How at using std::cpp 2026 [standards]
- Mar 20 C++20 Modules: The Tooling Gap [standards]
- Mar 17 Contracts in C++26: What They Check, What They Cost, When to Use Them [standards]
- Mar 13 Anatomy of llama.cpp: How 105K Stars of C++ Runs LLMs on Your Laptop [ai-ml]
- Mar 10 Profile-Guided Optimization Made Our Code Slower [performance]
- Mar 6 Lock-Free Queue Implementations Compared: Correctness, Performance, and the Bugs You'll Ship [concurrency]
- Mar 3 std::expected on Bare Metal: Error Handling Without Exceptions [embedded]
- Feb 27 Cache-Line Archaeology: Finding and Fixing False Sharing in Production [performance]