Background
Currently, each algorithm has a `cccl_device_<algo>_build` function that compiles a kernel via NVRTC and returns a build result struct holding the loaded `CUlibrary` and related state.
To support ahead-of-time (AoT) compilation — pre-compiling kernels and saving them to disk for use in a different process or on a different machine — the Python layer needs to serialize and deserialize these build result structs. This requires the C layer to expose enough metadata in the structs to make that possible.
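As a rough illustration of the round trip the Python layer would perform, here is a minimal sketch that packs and unpacks the proposed metadata (compute capability, cubin, runtime policy blob, and lowered kernel name). The wire format and function names here are assumptions for illustration, not the actual CCCL API.

```python
import struct

# Hypothetical serializer: the header layout ("<iQQQ") and field order
# are assumptions; only the set of fields mirrors the proposal below.
def serialize_build(cc, cubin, runtime_policy, kernel_name):
    name = kernel_name.encode()
    header = struct.pack("<iQQQ", cc, len(cubin), len(runtime_policy), len(name))
    return header + name + cubin + runtime_policy

def deserialize_build(blob):
    cc, cubin_size, policy_size, name_len = struct.unpack_from("<iQQQ", blob, 0)
    off = struct.calcsize("<iQQQ")
    name = blob[off:off + name_len].decode()
    off += name_len
    cubin = blob[off:off + cubin_size]
    off += cubin_size
    policy = blob[off:off + policy_size]
    return cc, cubin, policy, name
```

The point of the sketch is that everything needed to rebuild the result in another process is plain data — which is exactly why the C structs must expose the sizes and names explicitly.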
Required changes
- New fields in all `*_build_result_t` structs:
  - `int cc` — the compute capability the kernel was compiled for (encoded as major * 10 + minor)
  - `size_t runtime_policy_size` — size of the opaque `runtime_policy` blob, so it can be round-tripped through serialization
  - Per-kernel `char` lowered-name fields — the mangled CUDA kernel names produced by NVRTC, needed to resolve kernels from a cubin via `cuLibraryGetKernel` during deserialization
- Cross-CC build support — when a kernel is compiled for a target CC that doesn't match the current device (e.g. compiling for SM 9.0 on an SM 8.6 machine), `cuLibraryLoadData` returns `CUDA_ERROR_NO_BINARY_FOR_GPU`. Currently this is a fatal error. The build functions should be updated to treat this case as success — returning the cubin and lowered names without a loaded `CUlibrary` — so that the result can be serialized and shipped to a matching device.
Motivation
These changes are purely additive to the C structs and transparent to existing callers. They unblock the Python layer to implement `save()` / `load_algorithm()` for pre-compiled kernel distribution (e.g. shipping pre-compiled kernels in a Python wheel that works across a range of GPU architectures).