Replies: 5 comments
-
There are definitely a few crates for graphics types. These are not that hard to make these days 🤖. This looks basically useful, but it's not Slang types, and I'm leaning towards 100% Slang semantics in Rust: support for alignment / type wrapper correlations with Slang, plus swizzles and things in Rust. Less generic, less redundant, and less confusing for users.
-
struct Meters
{
    float value;
};
struct AltitudeMeters
{
    Meters value;
};
struct FlightData
{
    Meters groundDistance;
    AltitudeMeters altitude;
};
RWStructuredBuffer<FlightData> output;
[shader("compute")]
[numthreads(1, 1, 1)]
void main(uint3 tid : SV_DispatchThreadID)
{
    Meters dist;
    dist.value = 100.0;
    AltitudeMeters alt;
    alt.value = dist;
    FlightData fd;
    fd.groundDistance = dist;
    fd.altitude = alt;
    output[tid.x] = fd;
}

{
  "parameters": [
    {
      "name": "output",
      "binding": {"kind": "descriptorTableSlot", "index": 0},
      "type": {
        "kind": "resource",
        "baseShape": "structuredBuffer",
        "access": "readWrite",
        "resultType": {
          "kind": "struct",
          "name": "FlightData",
          "fields": [
            {
              "name": "groundDistance",
              "type": {
                "kind": "struct",
                "name": "Meters",
                "fields": [
                  {
                    "name": "value",
                    "type": {
                      "kind": "scalar",
                      "scalarType": "float32"
                    },
                    "binding": {"kind": "uniform", "offset": 0, "size": 4, "elementStride": 0}
                  }
                ]
              },
              "binding": {"kind": "uniform", "offset": 0, "size": 4, "elementStride": 0}
            },
            {
              "name": "altitude",
              "type": {
                "kind": "struct",
                "name": "AltitudeMeters",
                "fields": [
                  {
                    "name": "value",
                    "type": {
                      "kind": "struct",
                      "name": "Meters",
                      "fields": [
                        {
                          "name": "value",
                          "type": {
                            "kind": "scalar",
                            "scalarType": "float32"
                          },
                          "binding": {"kind": "uniform", "offset": 0, "size": 4, "elementStride": 0}
                        }
                      ]
                    },
                    "binding": {"kind": "uniform", "offset": 0, "size": 4, "elementStride": 0}
                  }
                ]
              },
              "binding": {"kind": "uniform", "offset": 4, "size": 4, "elementStride": 0}
            }
          ]
        }
      }
    }
  ],
  "entryPoints": [
    {
      "name": "main",
      "stage": "compute",
      "parameters": [
        {
          "name": "tid",
          "semanticName": "SV_DISPATCHTHREADID",
          "type": {
            "kind": "vector",
            "elementCount": 3,
            "elementType": {
              "kind": "scalar",
              "scalarType": "uint32"
            }
          }
        }
      ],
      "threadGroupSize": [1, 1, 1],
      "bindings": [
        {
          "name": "output",
          "binding": {"kind": "descriptorTableSlot", "index": 0}
        }
      ]
    }
  ],
  "bindlessSpaceIndex": 1
}

There is a strong argument for the simplicity of forbidding Rust newtype wrappers around existing wrappers. Ergonomically, the intrinsic type can end up really far away. Our macros will be stuck talking about type chains rather than leaf types, and our witnesses will be too complex. I think single-level wrappers around Rust intrinsics should be as far as we go. Our intrinsic types just need Slang counterparts so that we hit the leaf type in Rust and then look at the leaf type in Slang. Our map is intrinsic Rust to intrinsic Slang. That seems less likely to blow up in most cases.
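Here is a minimal sketch of what "single-level wrappers only" could look like in the type system. The trait names `SlangLeaf` and `SlangWrapper` are made up for illustration, not an existing API. The scalar names are the `scalarType` strings from the reflection JSON above.

```rust
/// Hypothetical trait: a Rust intrinsic that maps directly to a Slang scalar.
pub trait SlangLeaf {
    /// The name as it appears under `scalarType` in the reflection JSON.
    const SCALAR_TYPE: &'static str;
}

impl SlangLeaf for f32 {
    const SCALAR_TYPE: &'static str = "float32";
}

impl SlangLeaf for u32 {
    const SCALAR_TYPE: &'static str = "uint32";
}

/// Hypothetical trait for a single-level newtype. `Leaf` must itself be an
/// intrinsic, so a wrapper around another wrapper cannot implement this.
pub trait SlangWrapper {
    type Leaf: SlangLeaf;
    /// Name of the corresponding Slang struct.
    const SLANG_NAME: &'static str;
}

/// One level of wrapping around an intrinsic: allowed.
#[repr(transparent)]
pub struct Meters(pub f32);

impl SlangWrapper for Meters {
    type Leaf = f32;
    const SLANG_NAME: &'static str = "Meters";
}

/// The leaf is always exactly one hop away, so macros never walk a chain.
pub fn leaf_scalar_type<W: SlangWrapper>() -> &'static str {
    <W::Leaf as SlangLeaf>::SCALAR_TYPE
}
```

With this shape, something like `AltitudeMeters(Meters)` with `type Leaf = Meters` fails to compile because `Meters` does not implement `SlangLeaf`. That is the restriction argued for above, enforced by the trait bound.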
-
In order to use a common descriptor table, it will be valuable to share Slang sources. Using Slang in dependents will require some help for the build scripts. We might be able to use manifest links. If that path becomes cumbersome, we can use TOML configuration to tell build scripts where else to look, but how would they find the crate in a local cache if not via the links trick? Nix builds are another wrinkle. Not hard, but might require some plumbing.
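For reference, a rough sketch of how the links trick passes a path between build scripts. Cargo lets a crate that declares `links = "..."` in its manifest emit metadata from its build script, and direct dependents see it as a `DEP_<LINKS>_<KEY>` environment variable. The links name `slang_shared` and the key `slang_dir` are placeholders I made up, and the hyphen handling in `envify` is my assumption about Cargo's naming, so check it against the Cargo docs.

```rust
use std::env;

/// Upper-case a name for use in an environment variable.
/// Assumption: Cargo also maps '-' to '_' when building the variable name.
pub fn envify(s: &str) -> String {
    s.chars()
        .map(|c| if c == '-' { '_' } else { c.to_ascii_uppercase() })
        .collect()
}

/// Name of the variable a dependent's build script would read.
pub fn dep_env_name(links: &str, key: &str) -> String {
    format!("DEP_{}_{}", envify(links), envify(key))
}

/// Line the crate that owns the Slang sources would print from build.rs.
/// Older Cargo versions use the `cargo:KEY=VALUE` form instead.
pub fn metadata_line(key: &str, value: &str) -> String {
    format!("cargo::metadata={key}={value}")
}

/// Called from a dependent's build.rs: where did the upstream crate put
/// its Slang sources? Returns None outside of a Cargo build.
pub fn shared_slang_dir(links: &str) -> Option<String> {
    env::var(dep_env_name(links, "slang_dir")).ok()
}
```

The upstream build script would `println!` the result of `metadata_line("slang_dir", ...)`, and a dependent would call `shared_slang_dir("slang_shared")` and add the result to its Slang search path.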
-
Building Up the Implementation
Sparse ranges and overlapping ranges were giving me the most pause. The biggest question ✋ is what we want the update methods to look like. While there will be a single witness result defining every byte of all push ranges, this range might also be sparse (interior bytes marked unused) and cannot implicitly be pushed. Split ranges for a single stage are another wrinkle! For now we can simply implement pushing only for whole ranges, but when remixing shaders without rewriting their push ranges, it would be ergonomic to be able to define an extra push method that does not interfere with the range definitions. At first, just being able to name the push methods will also be ergonomic. I'm starting with compute pipelines to get something working for the simple case: single range for a single stage, single composite type, only check length, and name the push method. Developing this alongside stage and pipeline macros is creating a lot of moving parts. ⚙️ 🏙️ 🥽 It's pretty complex type soup. I'll try to update this comment as I figure out the pieces.
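To make the simple compute case concrete, here is a sketch of what "single range, single composite type, only check length, named push method" might boil down to. Every name here (`PushRange`, `BlurParams`, `push_blur_params`) is hypothetical, and nothing talks to Vulkan. The method just returns the offset and bytes that would be handed to the push-constant command.

```rust
use std::mem::size_of;

/// One push-constant range for one stage, as reflection would report it.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PushRange {
    pub offset: u32,
    pub size: u32,
    /// True when interior bytes are marked unused. A sparse range cannot be
    /// pushed implicitly as a whole.
    pub sparse: bool,
}

/// The single composite type pushed by this (made-up) compute pipeline.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct BlurParams {
    pub radius: f32,
    pub sigma: f32,
    pub width: u32,
    pub height: u32,
}

impl BlurParams {
    /// Serialize field by field so no padding bytes are ever read.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(size_of::<Self>());
        out.extend_from_slice(&self.radius.to_ne_bytes());
        out.extend_from_slice(&self.sigma.to_ne_bytes());
        out.extend_from_slice(&self.width.to_ne_bytes());
        out.extend_from_slice(&self.height.to_ne_bytes());
        out
    }
}

pub struct ComputePush {
    pub range: PushRange,
}

impl ComputePush {
    /// The named push method. For now the only check is the length.
    pub fn push_blur_params(&self, p: &BlurParams) -> Result<(u32, Vec<u8>), String> {
        if self.range.sparse {
            return Err("sparse range: whole-range push is not allowed".to_string());
        }
        let have = size_of::<BlurParams>() as u32;
        if have != self.range.size {
            return Err(format!("length mismatch: type is {have} bytes, range is {} bytes", self.range.size));
        }
        Ok((self.range.offset, p.to_bytes()))
    }
}
```

Checking only the length is obviously weak, since two different 16-byte structs would both pass, but it is enough to get the macro plumbing moving before field-level agreement exists.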
-
As I was writing some of the foundation type relations for some macros, it dawned on me that I don't know how to predict which layout rules are in use. Slang can express alternate layouts per block. SPIR-V and probably MSL are fine with multiple layouts.

struct SmallPair
{
    float x;
    float y;
};
// Under std140: SmallPair has 16-byte alignment (struct rounded to vec4)
//   -> array stride = 16 bytes
// Under std430: SmallPair has 4-byte alignment (largest member = float)
//   -> array stride = 8 bytes
// Under scalar: same as std430 in this case
struct LayoutProbeData
{
    float sentinel0;     // offset 0 in every layout
    SmallPair pairs[4];  // offset diverges: 16 (std140) vs 4 (std430)
    float sentinel1;     // offset further diverges
};
ConstantBuffer<LayoutProbeData, Std140DataLayout> cbuf140;
ConstantBuffer<LayoutProbeData, Std430DataLayout> cbuf430;
ConstantBuffer<LayoutProbeData, ScalarDataLayout> cbufscalar;

Here's some extracted reflection data for such a declaration:

cbuf140:
- pairs.uniformStride: 16
- pairs.binding: offset=16, size=64, elementStride=16
- sentinel1.offset: 80
- elementVarLayout.binding.size: 96
cbuf430:
- pairs.uniformStride: 8
- pairs.binding: offset=4, size=32, elementStride=8
- sentinel1.offset: 36
- elementVarLayout.binding.size: 40
cbufscalar:
- pairs.uniformStride: 8
- pairs.binding: offset=4, size=32, elementStride=8
- sentinel1.offset: 36
- elementVarLayout.binding.size: 40

Note there is no indication of what layout rules are in use. This means when the layout checks fail, we won't be sure why. The user should know what layout they declared. We can match that and detect errors. Manual packing is always an option. We're defaulting to scalar layout. Let's not get too excited. To support alternative layouts, we need layout rule parameters on push constants, SSBOs, and UBOs. Those decisions will persist all the way up into the pipeline and the checks between pipelines. I had been concerned we would need to define types from reflection data instead of using reflection data as a source of truth for checking agreement. We will not have to. Not knowing the layout rules in use will mainly make it harder to produce useful error messages. So, no code generation for padded types, no types per pipeline, etc. We're mostly safe.
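As a sanity check on those numbers, here is a small sketch that recomputes the offsets for LayoutProbeData under each rule set. It only models floats, structs of floats, and arrays, which is all the probe needs. Vectors and matrices are deliberately left out, and with only floats involved, scalar layout comes out identical to std430. The `Rules` and `Ty` types are invented for this example.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Rules {
    Std140,
    Std430,
    Scalar,
}

/// Just enough of a type model to describe the probe struct.
pub enum Ty {
    Float,
    Struct(Vec<Ty>),
    Array(Box<Ty>, usize),
}

fn align_up(n: usize, a: usize) -> usize {
    (n + a - 1) / a * a
}

/// (size, alignment) of a type under the given rules.
pub fn size_align(ty: &Ty, r: Rules) -> (usize, usize) {
    match ty {
        Ty::Float => (4, 4),
        Ty::Struct(fields) => {
            let (_, size, align) = struct_layout(fields, r);
            (size, align)
        }
        Ty::Array(elem, n) => {
            let (s, a) = size_align(elem, r);
            // std140 rounds array element alignment up to 16 bytes.
            let a = if r == Rules::Std140 { align_up(a, 16) } else { a };
            let stride = align_up(s, a);
            (stride * *n, a)
        }
    }
}

/// (field offsets, total size, alignment) for a struct with these fields.
pub fn struct_layout(fields: &[Ty], r: Rules) -> (Vec<usize>, usize, usize) {
    let mut off = 0;
    let mut max_align = 1;
    let mut offsets = Vec::new();
    for f in fields {
        let (s, a) = size_align(f, r);
        off = align_up(off, a);
        offsets.push(off);
        off += s;
        max_align = max_align.max(a);
    }
    // std140 also rounds struct alignment up to 16 bytes.
    let align = if r == Rules::Std140 { align_up(max_align, 16) } else { max_align };
    (offsets, align_up(off, align), align)
}

/// Fields of LayoutProbeData: float, SmallPair[4], float.
pub fn probe_fields() -> Vec<Ty> {
    let small_pair = Ty::Struct(vec![Ty::Float, Ty::Float]);
    vec![Ty::Float, Ty::Array(Box::new(small_pair), 4), Ty::Float]
}
```

Calling `struct_layout(&probe_fields(), Rules::Std140)` gives offsets 0, 16, 80 and size 96, and the std430 and scalar cases give 0, 4, 36 and size 40, matching the reflection data. Something along these lines could let an error message say "these offsets look like std140" even though reflection never names the rule set.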
-
The foundation of pipeline input layout agreement is being able to verify type and struct agreement. It's introspection + agreement. Introspection means we know what Slang expects. Agreement means we can lay out our data the way Slang expects.
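A rough sketch of the agreement half, assuming the introspection half has already produced field offsets and a total size. The Rust structs mirror the LayoutProbeData probe from the layout-rules comment, and the expected numbers are the ones reflection reported there. `Reflected`, `rust_layout`, and `agrees` are names I made up for this sketch.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

#[repr(C)]
#[derive(Default, Clone, Copy)]
pub struct SmallPair {
    pub x: f32,
    pub y: f32,
}

#[repr(C)]
#[derive(Default, Clone, Copy)]
pub struct LayoutProbeData {
    pub sentinel0: f32,
    pub pairs: [SmallPair; 4],
    pub sentinel1: f32,
}

/// What introspection tells us: (field name, offset) pairs and total size.
pub struct Reflected {
    pub fields: &'static [(&'static str, usize)],
    pub size: usize,
}

/// What Rust actually does with the #[repr(C)] struct.
pub fn rust_layout() -> (Vec<(&'static str, usize)>, usize) {
    let v = LayoutProbeData::default();
    let base = addr_of!(v) as usize;
    let fields = vec![
        ("sentinel0", addr_of!(v.sentinel0) as usize - base),
        ("pairs", addr_of!(v.pairs) as usize - base),
        ("sentinel1", addr_of!(v.sentinel1) as usize - base),
    ];
    (fields, size_of::<LayoutProbeData>())
}

/// Agreement: same fields, same offsets, same total size.
pub fn agrees(r: &Reflected) -> Result<(), String> {
    let (fields, size) = rust_layout();
    if fields.len() != r.fields.len() {
        return Err("field count differs".to_string());
    }
    for ((name, rust_off), (slang_name, slang_off)) in fields.iter().zip(r.fields.iter()) {
        if name != slang_name || rust_off != slang_off {
            return Err(format!("{name} at {rust_off} in Rust, {slang_name} at {slang_off} in Slang"));
        }
    }
    if size != r.size {
        return Err(format!("size {size} in Rust, {} in Slang", r.size));
    }
    Ok(())
}
```

Against the scalar or std430 reflection (offsets 0, 4, 36, size 40) this passes. Against the std140 reflection (offsets 0, 16, 80, size 96) it fails at `pairs`, which is the kind of error the macros would need to surface.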
TL;DR
We're probably going to aggressively target scalar block layout and std430 to minimize padding, and focus instead just on agreement with field locations within the structs.
Choosing Targets & Reading / Writing Them
There is a decent amount of prior art, but it doesn't seem like a lot of these crates are obviously heavily loved:
Based on recommendations for persistently mapped staging buffers, it's questionable what value std430 packing in CPU space could ever bring. The usual outcome is that we have to assemble a few bytes (oh my!) rather than treat them as a slice temporarily for writing push constants etc.
To just get rid of std140, we can use https://docs.vulkan.org/refpages/latest/refpages/source/VK_KHR_uniform_buffer_standard_layout.html.
There are extensions we need to enable to use even less padding in edge cases:
Using scalars directly may be advantageous for SoA techniques: if something doesn't fit well in a std430 array, we can store it in a separate byte array, turning a padding problem into an SoA layout that is better for other reasons. If there's no performance advantage, we can avoid std430 except for UBOs and push constants.
Default to Scalar Block?
A to_430 implementation would be nice to have. Only the result of that implementation needs to be structurally compatible with the destination. For magic structs that are std430 already, the result is to do nothing. It really seems like the ergonomic way to go for now is to target scalar block layout and then backport support for std430 later as an optimization, which it really is if most hardware can support scalar block.

For magic structures that work with std430 or 8/16-bit storage, we don't care. We absolutely just don't care. They are already fast and convenient from storage buffers. Amirite?
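Roughly what I have in mind for to_430, as a sketch only. The trait `To430` and the `Ray*` structs are invented for illustration. The one layout fact it leans on is that std430 aligns a vec3 to 16 bytes, while scalar block layout aligns it to 4.

```rust
/// Hypothetical conversion trait: only `Out` has to match the std430 layout.
pub trait To430 {
    type Out;
    fn to_430(&self) -> Self::Out;
}

/// Two vec3s packed tightly, as scalar block layout allows: 24 bytes.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct RayScalar {
    pub origin: [f32; 3],
    pub dir: [f32; 3],
}

/// The same data under std430: each vec3 starts on a 16-byte boundary and
/// the struct is padded to a multiple of 16, giving 32 bytes.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Ray430 {
    pub origin: [f32; 3],
    pub _pad0: f32,
    pub dir: [f32; 3],
    pub _pad1: f32,
}

impl To430 for RayScalar {
    type Out = Ray430;
    fn to_430(&self) -> Ray430 {
        Ray430 { origin: self.origin, _pad0: 0.0, dir: self.dir, _pad1: 0.0 }
    }
}

/// A "magic" struct that is already std430-compatible: nothing to do.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SmallPair {
    pub x: f32,
    pub y: f32,
}

impl To430 for SmallPair {
    type Out = SmallPair;
    fn to_430(&self) -> SmallPair {
        *self
    }
}
```

The user-facing type stays the tightly packed one, and the padded twin only exists at the point of upload, which is why this can be backported later without touching the scalar block path.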
Back to Introspection
The Slang compiler can output JSON, or evidently we can talk to libslang directly. Why would we want to talk to libslang if the JSON files can exist? The JSON output is basically a cache. Talking to Slang may require re-ingesting the shader. Due to the build time / macro constraints already present, let's just read the JSON files and rely on normal build.rs behavior. The bulk of the work is figuring out slangc's introspection output.
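For a feel of what the introspection side has to do once the JSON is parsed, here is a sketch of a tiny in-memory model of the struct/scalar subset that slangc emitted for the FlightData example earlier in this thread, plus a walk down to the leaf scalars. The `Type` and `Field` shapes are my own, the JSON parsing itself is left out, and I'm assuming each nested binding offset is relative to its enclosing struct.

```rust
/// Subset of the reflection type tree: only scalars and structs.
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    /// Carries the `scalarType` string, e.g. "float32".
    Scalar(&'static str),
    Struct { name: &'static str, fields: Vec<Field> },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: &'static str,
    pub ty: Type,
    /// `binding.offset`, assumed relative to the enclosing struct.
    pub offset: usize,
}

/// Collect (field path, scalar type, absolute offset) for every leaf.
pub fn leaves(ty: &Type, prefix: &str, base: usize, out: &mut Vec<(String, &'static str, usize)>) {
    match ty {
        Type::Scalar(s) => out.push((prefix.to_string(), *s, base)),
        Type::Struct { fields, .. } => {
            for f in fields {
                let path = if prefix.is_empty() {
                    f.name.to_string()
                } else {
                    format!("{prefix}.{}", f.name)
                };
                leaves(&f.ty, &path, base + f.offset, out);
            }
        }
    }
}

/// The FlightData tree, transcribed by hand from the JSON.
pub fn flight_data() -> Type {
    let meters = Type::Struct {
        name: "Meters",
        fields: vec![Field { name: "value", ty: Type::Scalar("float32"), offset: 0 }],
    };
    let altitude = Type::Struct {
        name: "AltitudeMeters",
        fields: vec![Field { name: "value", ty: meters.clone(), offset: 0 }],
    };
    Type::Struct {
        name: "FlightData",
        fields: vec![
            Field { name: "groundDistance", ty: meters, offset: 0 },
            Field { name: "altitude", ty: altitude, offset: 4 },
        ],
    }
}
```

Running `leaves` over `flight_data()` yields `groundDistance.value` (float32 at 0) and `altitude.value.value` (float32 at 4). However deep the Slang wrappers nest, the check on the Rust side only ever has to meet a leaf scalar at a known offset.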
This seems done and dusted. Comment if you have anything to add.