
add MLIR → DXIL compute pipeline #409

Merged
makslevental merged 11 commits into main from user/m-levental/directx
May 4, 2026


makslevental commented May 3, 2026

Full flow: MLIR (eudsl-python-extras matmul) → bare-ptr LLVM → lower_mlir_to_dxil (new C++ pass) → DXIL backend → metallib via libmetalirconverter → Metal dispatch with the top-level Argument Buffer encoded from the reflection data → results match A @ B.

DXILLowering.cpp — standalone LLVM IR rewrite:

lowerMLIRToDXIL(Module &M) finds every function with "hlsl.shader" attr and rewrites it.
Per-arg classification: ptr addrspace(1) → UAV if any store else SRV, ptr addrspace(2) → SRV, <3 x i32> → thread id.
For each buffer arg: element type inferred from the first GEP/load/store; @llvm.dx.resource.handlefrombinding inserted at entry, GEP+load chains rewritten to rawbuffer.load + extractvalue 0, GEP+store chains to rawbuffer.store.
For gid: extractelement with constant index K → @llvm.dx.thread.id(i32 K).
Signature rewritten to void () via a fresh function with blocks spliced over and metadata/attrs copied.
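
The per-arg classification rule above can be sketched in a few lines of Python. This only illustrates the decision table as described. The actual pass is C++ operating on LLVM IR, and `classify_arg` together with its string-based type matching is a hypothetical stand-in, not code from DXILLowering.cpp.

```python
import re
from typing import Optional


def classify_arg(llvm_type: str, has_store: bool = False) -> Optional[str]:
    """Hypothetical helper: map an LLVM argument type to a resource class.

    Mirrors the rule in the description:
      ptr addrspace(1) -> UAV if the arg is stored through, else SRV
      ptr addrspace(2) -> SRV
      <3 x i32>        -> thread id
    Returns None for any type the rule does not mention.
    """
    normalized = " ".join(llvm_type.split())  # collapse stray whitespace
    if normalized == "<3 x i32>":
        return "thread_id"
    match = re.fullmatch(r"ptr addrspace\((\d+)\)", normalized)
    if match is None:
        return None
    address_space = int(match.group(1))
    if address_space == 1:
        return "UAV" if has_store else "SRV"
    if address_space == 2:
        return "SRV"
    return None
```

For a matmul kernel, the two inputs would classify as SRV and the output (the only stored-through pointer) as UAV, assuming all three arrive as `ptr addrspace(1)`.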

DXILHelper.cpp — new bindings:

lower_mlir_to_dxil(module) — calls into the pass above.
translate_llvm_to_dxil — runs the DirectX backend, emits a DXContainer.
translate_dxil_to_metallib — returns (metallib_bytes, [(type, space, slot, top_level_offset, size_bytes, name), ...]) via IRShaderReflection so Python knows where each resource sits in the top-level Argument Buffer.
IRShaderStage enum. Links LLVMDirectXCodeGen et al. and libmetalirconverter; force-loads LLVMDirectXCodeGen so the static archive's AsmPrinter TU is pulled into the CAPI dylib.
mlir/dxil.py — re-exports the new bindings and adds add_dxil_module_metadata / mark_as_dxil_compute_kernel helpers.

tests/test_matmul_dxil.py — 32x32 f32 matmul that runs the full pipeline. After lowering, encodes the top-level Argument Buffer from the reflection data (each slot a 24-byte {gpuAddress, resourceID=0, flags=length} descriptor), binds it at kIRArgumentBufferBindPoint = 2, calls useResource:usage: on each buffer for residency, dispatches with numthreads=(8,8,1), verifies against A @ B.
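
As a rough sketch of the Argument Buffer encoding step, the snippet below packs one 24-byte descriptor per reflection entry at its `top_level_offset`. It assumes each descriptor is three little-endian 64-bit fields in the order {gpuAddress, resourceID, flags}, which is inferred from the 24-byte size stated above. `encode_argument_buffer`, `threadgroups`, and the `gpu_addresses`/`lengths` dictionaries are hypothetical names, not taken from the test.

```python
import struct

DESCRIPTOR_SIZE = 24  # bytes per slot, as stated in the description


def encode_argument_buffer(reflection, gpu_addresses, lengths):
    """Pack {gpuAddress, resourceID=0, flags=length} for every resource.

    reflection: tuples shaped like the binding's second return value,
        (type, space, slot, top_level_offset, size_bytes, name).
    gpu_addresses / lengths: hypothetical dicts keyed by resource name.
    """
    total = max(
        (offset + DESCRIPTOR_SIZE for (_, _, _, offset, _, _) in reflection),
        default=0,
    )
    buf = bytearray(total)
    for (_type, _space, _slot, offset, _size, name) in reflection:
        # Assumed layout: three little-endian uint64 fields.
        struct.pack_into("<QQQ", buf, offset, gpu_addresses[name], 0, lengths[name])
    return bytes(buf)


def threadgroups(extent, threads_per_group):
    """Ceiling division: threadgroups needed to cover one dimension."""
    return -(-extent // threads_per_group)
```

With a 32x32 output and numthreads=(8,8,1), `threadgroups(32, 8)` gives 4 per axis, so the dispatch would be a 4x4x1 grid of threadgroups.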

makslevental force-pushed the user/m-levental/directx branch from 9bd8b91 to 5c8025c, May 3, 2026 06:09
makslevental changed the title from "Add DirectX to experimental targets for LLVM" to "add MLIR → DXIL → MetalIR compute pipeline", May 3, 2026
makslevental force-pushed the user/m-levental/directx branch from 3a8b8e8 to 37e8a6b, May 3, 2026 06:22
makslevental force-pushed the user/m-levental/directx branch from 37e8a6b to d3ddedb, May 3, 2026 06:33
makslevental changed the title from "add MLIR → DXIL → MetalIR compute pipeline" to "add MLIR → DXIL compute pipeline", May 3, 2026
makslevental merged commit bddf58f into main, May 4, 2026
205 of 210 checks passed
makslevental deleted the user/m-levental/directx branch, May 4, 2026 02:26