Needs its own branch; more work remains. The current partial inlining has no measurable effect on performance (5 KB, which is negligible), and the PGO + BOLT optimization path yields better results regardless.
Apply aggressive inlining attributes to BCn decoding logic to eliminate function call overhead during texture decompression. This allows the compiler to better analyze loops for vectorization. Includes portability guards to ensure compatibility across Clang, GCC, and MSVC.
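A minimal sketch of what such a portability guard might look like. The macro name `BCN_FORCE_INLINE` and the helper `bcn_expand_565` are hypothetical and not taken from this patch; only the compiler keywords (`__forceinline`, `__attribute__((always_inline))`) are real.

```c
#include <stdint.h>

/* Hypothetical macro name, for illustration only.
 * MSVC (and clang-cl) accept __forceinline; GCC and Clang accept the
 * always_inline attribute; anything else falls back to plain inline. */
#if defined(_MSC_VER)
#  define BCN_FORCE_INLINE static __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define BCN_FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define BCN_FORCE_INLINE static inline
#endif

/* Hypothetical helper: expand one RGB565 endpoint (as stored in BC1-style
 * blocks) to 8 bits per channel by replicating the high bits into the
 * low bits. A small, hot function like this is a typical inlining target. */
BCN_FORCE_INLINE void bcn_expand_565(uint16_t c, uint8_t out[3])
{
    uint32_t r5 = (c >> 11) & 0x1Fu;
    uint32_t g6 = (c >> 5) & 0x3Fu;
    uint32_t b5 = c & 0x1Fu;

    out[0] = (uint8_t)((r5 << 3) | (r5 >> 2));
    out[1] = (uint8_t)((g6 << 2) | (g6 >> 4));
    out[2] = (uint8_t)((b5 << 3) | (b5 >> 2));
}
```

Keeping the attribute behind a single macro means call sites stay identical on every toolchain, and the forced inlining can be disabled in one place if it turns out not to help.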
Signed-off-by: Collecting <collecting@noreply.localhost>