fix(i2_s): resolve to_float UB + guard BLAS I2_S routing #533

Open
JasonOA888 wants to merge 1 commit into microsoft:main from JasonOA888:fix/issue-468-i2-s-v2
Fixes #468, Fixes #512

Bug 1: to_float callback UB (missing scale param)

The type_traits entry for GGML_TYPE_I2_S registers dequantize_row_i2_s as a ggml_to_float_t callback, but the signatures don't match: the function takes 4 params (x, y, n, i2_scale), while the typedef has only 3. The cast silences the compiler but causes UB: callers pass only 3 arguments, so the 4th argument register contains garbage, producing wrong dequantized weights on ARM and potentially x86.

Fix: Added dequantize_row_i2_s_wrapper() that matches the ggml_to_float_t signature and calls the real dequantizer.
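For readers who want the shape of the fix without opening the submodule diff, here is a minimal standalone sketch. It is not the code from the commit: every `toy_` name is hypothetical, the 2-bit packing is simplified, and the location of the scale (one float stored right after the packed bytes) is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the 3-parameter callback shape described above (x, y, n). */
typedef void (*to_float_fn)(const void *x, float *y, int64_t n);

/* Toy stand-in for the 4-parameter dequantizer. Hypothetical layout:
   four 2-bit codes per byte, code c decodes to (c - 1) * i2_scale. */
static void toy_dequantize_row_i2_s(const uint8_t *x, float *y,
                                    int64_t n, float i2_scale) {
    for (int64_t i = 0; i < n; ++i) {
        int code = (x[i / 4] >> (2 * (i % 4))) & 0x3;
        y[i] = (float)(code - 1) * i2_scale;
    }
}

/* Wrapper whose signature really matches to_float_fn, so no cast is needed.
   ASSUMPTION: the scale is a single float stored immediately after the
   packed payload; the real wrapper may obtain it differently. */
static void toy_dequantize_row_i2_s_wrapper(const void *x, float *y, int64_t n) {
    assert(x != NULL && y != NULL);
    const uint8_t *bytes = (const uint8_t *)x;
    float scale;
    /* memcpy avoids an unaligned float load from the byte buffer. */
    memcpy(&scale, bytes + (n + 3) / 4, sizeof scale);
    toy_dequantize_row_i2_s(bytes, y, n, scale);
}
```

Because the wrapper has exactly the callback's signature, it can be assigned to a `to_float_fn` directly, and the scale always reaches the dequantizer as a real argument instead of whatever happened to be in the 4th argument register.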

Bug 2: BLAS routes I2_S through generic MUL_MAT (segfault on Apple Silicon)

When ubatch >= 32, the BLAS backend claims the generic MUL_MAT path for I2_S because to_float is non-NULL. But I2_S stores an external scale outside the per-row payload, which the generic BLAS dequantize path does not handle. This causes segfaults on Apple Silicon Metal+BLAS.

Fix: Explicitly reject GGML_TYPE_I2_S in the BLAS support checks (MUL_MAT, and also OUT_PROD) so I2_S uses its specialized kernel path.
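The guard can be pictured with the following standalone sketch. It is a simplified model, not the actual ggml-blas.cpp code: the `toy_` types and function are hypothetical, and the real support check looks at more conditions (contiguity, tensor shapes) than the single batch threshold used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ggml type enum and type traits. */
enum toy_type { TOY_TYPE_F32, TOY_TYPE_Q4_0, TOY_TYPE_I2_S };

typedef void (*to_float_fn)(const void *x, float *y, int64_t n);

struct toy_type_traits { to_float_fn to_float; };

/* Placeholder callback so a traits entry can carry a non-NULL to_float. */
static void toy_noop_to_float(const void *x, float *y, int64_t n) {
    (void)x; (void)y; (void)n;
}

/* Simplified model of the BLAS MUL_MAT support check. */
static bool toy_blas_supports_mul_mat(enum toy_type src0_type,
                                      const struct toy_type_traits *traits,
                                      int64_t batch) {
    assert(traits != NULL);
    const int64_t min_batch = 32;

    /* The guard: I2_S is rejected even though its to_float is non-NULL,
       because its scale lives outside the per-row payload. */
    if (src0_type == TOY_TYPE_I2_S) {
        return false;
    }
    if (batch < min_batch) {
        return false;
    }
    /* Generic rule: F32, or any type that can be dequantized to float. */
    return src0_type == TOY_TYPE_F32 || traits->to_float != NULL;
}
```

With the early return in place, a non-NULL `to_float` no longer makes I2_S eligible for the generic path, while other quantized types are handled exactly as before.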

Changes

llama.cpp submodule updated with fixes (commit aa603a3):

  • ggml-quants.c: Added dequantize_row_i2_s_wrapper() with correct 3-param signature
  • ggml-quants.h: Declared the wrapper function
  • ggml.c: Use wrapper instead of cast in type_traits
  • ggml-blas.cpp: Reject GGML_TYPE_I2_S from both MUL_MAT and OUT_PROD support checks

Submodule URL changed to JasonOA888/llama.cpp so the fix commit is publicly reachable (previous PR pointed to a local-only commit).

Testing

  • Wrapper correctly routes to the real dequantizer
  • BLAS guard is minimal (single condition addition, no perf impact)
  • Specialized I2_S kernels (gemv/gemm/vec_dot) unaffected

JasonOA888 force-pushed the fix/issue-468-i2-s-v2 branch 4 times, most recently from 819bb55 to 226b2ab (April 6, 2026 03:36)
JasonOA888 force-pushed the fix/issue-468-i2-s-v2 branch from 226b2ab to a6fdd51 (April 6, 2026 03:52)