
[Proposal] Optimize phase_encode_kernel using LLVM JIT and CUDA Driver API for 5x+ Speedup #1394

@botszhuang

Description


What

This proposal optimizes phase_encode_kernel by generating the kernel with LLVM Just-In-Time (JIT) compilation and launching it through the CUDA Driver API. An initial measurement suggests a speedup of at least 5x over the current implementation, which would benefit every workload that relies on this kernel.

| Implementation | Kernel execution time (ms) | Speedup |
|---|---|---|
| CUDA C (runtime API) | 0.302787 | 1x |
| LLVM + JIT | 0.0195 | ~15x |

Why

The goal of this optimization is to use the available hardware as fully as possible. Faster execution of phase_encode_kernel, achieved through runtime compilation and direct use of the CUDA driver, would reduce compute time and improve resource utilization across modern hardware backends.

How

  • LLVM + JIT Refactoring: Refactor the existing phase_encode_kernel function to generate LLVM Intermediate Representation (IR) dynamically, allowing for runtime optimization tailored to specific data shapes and parameters.

  • Execution Pipeline: Finalize the execution pipeline to smoothly bind the JIT-compiled LLVM kernel with the CUDA driver, including proper memory mapping.

  • Tentative plan
    Step 1: Bridge LLVM JIT with Rust Core
    Step 2: Load the PTX with the CUDA Driver API
    Step 3: Wire into the Mahout Pipeline
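To illustrate what "tailored to specific data shapes and parameters" could mean, here is a minimal sketch that emits textual LLVM IR with the element count baked in as a constant. Everything in it is an assumption made for illustration: the helper name `emit_specialized_ir`, the kernel name `demo_kernel_n<N>`, and the placeholder body (`out[i] = in[i]`), which is not the real phase_encode_kernel logic. The IR text has not been validated against a specific LLVM version, and a real implementation would more likely build the module through LLVM's C++ API and lower it to PTX with the NVPTX backend.

```cpp
#include <string>

// Hypothetical helper: returns textual LLVM IR for a kernel specialized on
// the element count `n`. Because `n` is a literal in the IR, the bounds
// check compares against a constant instead of a runtime argument.
std::string emit_specialized_ir(unsigned n) {
    const std::string count = std::to_string(n);
    const std::string name = "demo_kernel_n" + count;  // hypothetical name

    std::string ir;
    ir += "target triple = \"nvptx64-nvidia-cuda\"\n\n";
    // NVPTX intrinsics for threadIdx.x, blockDim.x and blockIdx.x.
    ir += "declare i32 @llvm.nvvm.read.ptx.sreg.tid.x()\n";
    ir += "declare i32 @llvm.nvvm.read.ptx.sreg.ntid.x()\n";
    ir += "declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()\n\n";
    ir += "define void @" + name + "(ptr %in, ptr %out) {\n";
    ir += "entry:\n";
    ir += "  %tid = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()\n";
    ir += "  %ntid = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()\n";
    ir += "  %ctaid = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()\n";
    ir += "  %base = mul i32 %ctaid, %ntid\n";
    ir += "  %i = add i32 %base, %tid\n";
    // The specialized constant appears here.
    ir += "  %inb = icmp ult i32 %i, " + count + "\n";
    ir += "  br i1 %inb, label %body, label %exit\n";
    ir += "body:\n";
    ir += "  %idx = zext i32 %i to i64\n";
    ir += "  %src = getelementptr float, ptr %in, i64 %idx\n";
    ir += "  %v = load float, ptr %src\n";
    // Placeholder body: plain copy, standing in for the real computation.
    ir += "  %dst = getelementptr float, ptr %out, i64 %idx\n";
    ir += "  store float %v, ptr %dst\n";
    ir += "  br label %exit\n";
    ir += "exit:\n";
    ir += "  ret void\n";
    ir += "}\n\n";
    // Marks the function as a kernel entry point for the NVPTX backend.
    ir += "!nvvm.annotations = !{!0}\n";
    ir += "!0 = !{ptr @" + name + ", !\"kernel\", i32 1}\n";
    return ir;
}
```

For example, `emit_specialized_ir(1024)` produces a module whose bounds check reads `icmp ult i32 %i, 1024`. In the plan above, the PTX compiled from such a module would then be handed to the CUDA Driver API in Step 2, for instance through `cuModuleLoadData` and `cuModuleGetFunction`.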

Questions

  1. The refactored phase-encode kernel is written in C++ using LLVM and JIT, so it would be stored as a .cpp file. Should it live alongside the original .cu file, or in a separate folder?
  2. The CUDA Driver API calls need to be added. Should they go with the original kernel, or in a separate .rs file?
  3. If Mahout accepts this change, should the kernel be renamed, or should a compile-time switch (something like a define) be used to choose between the two implementations?
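To make the third question concrete, here is one possible shape of a define-based switch on the C++ side. This is only a sketch: the macro name `PHASE_ENCODE_USE_LLVM_JIT` and the function `phase_encode_backend` are hypothetical and do not exist in Mahout today.

```cpp
#include <string>

// Hypothetical compile-time switch: build with -DPHASE_ENCODE_USE_LLVM_JIT
// to select the JIT path; without the define, the existing path is kept.
#ifdef PHASE_ENCODE_USE_LLVM_JIT
constexpr bool kUseLlvmJit = true;
#else
constexpr bool kUseLlvmJit = false;
#endif

// Reports which backend this build would dispatch to.
std::string phase_encode_backend() {
    return kUseLlvmJit ? "llvm-jit" : "cuda-runtime";
}
```

With this layout both kernels keep their own names and the original stays the default, so the JIT path can be enabled per build without touching callers. On the Rust side a Cargo feature could play the same role.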
