Disable instrument_if_optimizing in libraries-pgo#8
Open
AndyAyersMS wants to merge 3244 commits into
Open
Conversation
…tnet#126230) This test was disabled because the non-backtring sampling implementation used to enables it was hanging. This fixes that: when all minterms lead to dead-end states, UndoTransition restores original states but the break condition never triggers; added transitionSucceeded tracking to break the loop. The addition of another 18K tests (generated inputs to the "real-world" data set of regexes) highlighted a latent bug in CanReduceLoopBacktrackingToSinglePosition, fixed by changing its iterateNullableSubsequent from true to false. When the post-literal sequence is entirely nullable, any backtrack position can succeed (the nullable part matches empty), so reducing to a single position is incorrect. This caused Compiled/SourceGenerated to reject valid matches for some complex patterns, e.g. `([0-9\w\+]+\.)|([0-9\w\+]+\+)([\(\)]*)` with input `2+_`. The PatternsDataSet_GenerateInputsWithNonBacktracking is now enabled. Fixes dotnet#79373 --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Revise to handle small typed indices; review feedback.
…NVOKE_PROLOG, GT_PINVOKE_EPILOG, GT_EMITNOP (dotnet#126282) ## Description `PSEUDORANDOM_NOP_INSERTION` was a JIT spraying mitigation for partial trust scenarios. The define has been commented out for years and partial trust is not coming back. Remove it and all associated dead code, including the now-unused `GT_PINVOKE_PROLOG`, `GT_PINVOKE_EPILOG`, and `GT_EMITNOP` GenTree node types. **Removed across 18 files:** - **`target.h`**: The commented-out `#define` - **`emit.h`/`emit.cpp`**: `emitNextNop`, `emitRandomNops`, `emitInInstrumentation` fields; `emitEnableRandomNops()`/`emitDisableRandomNops()`/`emitNextRandomNop()` methods; random NOP insertion logic in `emitAllocAnyInstr` - **`compiler.h`/`compiler.cpp`**: `compChecksum`, `compRNG` fields; `adler32()` and `getMethodBodyChecksum()` functions - **`gtlist.h`**: `GT_EMITNOP`, `GT_PINVOKE_PROLOG`, `GT_PINVOKE_EPILOG` GTNODE definitions - **`lower.cpp`**: `GT_PINVOKE_PROLOG` node creation in `LowerNonvirtPinvokeCall`; stale comment referencing the feature - **`codegenxarch.cpp`**, **`codegenarmarch.cpp`**, **`codegenloongarch64.cpp`**, **`codegenriscv64.cpp`**: `GT_PINVOKE_PROLOG` case handlers and `emitDisableRandomNops()` calls - **`codegenwasm.cpp`**: `GT_PINVOKE_PROLOG` case handler - **`lsraarm.cpp`**: `GT_PINVOKE_PROLOG` from BuildSimple case list - **`liveness.cpp`**: `GT_PINVOKE_PROLOG` and `GT_PINVOKE_EPILOG` from side-effecting node list - **`gentree.cpp`**: All three node types from `TryGetUse`, `UseEdgeIterator`, and `gtDispLeaf` switch statements - **`compiler.h`/`compiler.hpp`**: All three node types from switch case lists - **`instr.cpp`**: `INS_lock` NOP delay workaround - **`debug/di/module.cpp`**: NOP sled skipping logic and dead `skipBytes` variable --- 🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. <a href="https://gh.io/cca-advanced-security">Learn more about Advanced Security.</a> <!-- START COPILOT CODING AGENT TIPS --> --- ⌨️ Start Copilot coding agent tasks without leaving your editor — available in [VS Code](https://gh.io/cca-vs-code-docs), [Visual Studio](https://gh.io/cca-visual-studio-docs), [JetBrains IDEs](https://gh.io/cca-jetbrains-docs) and [Eclipse](https://gh.io/cca-eclipse-docs). --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: EgorBo <523221+EgorBo@users.noreply.github.com>
> [!NOTE] > This PR description was AI/Copilot-generated. Extracted from dotnet#126269 I wonder if we should move all Polyfils for NS/NETF to just one common place for all libraries. Currently, libs just define their own in-place. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
HANDLE_LEAK JsonDocument.Parse(json, new JsonDocumentOptions() {
AllowTrailingCommas = true }) becomes out of scope and is not disposed
A copy of the string is extracted from the object
(jsonDocument.RootElement.....Value.EnumerateObject()->Value.GetString())
, so it is not needed when the method completes. 'using' operator is
safe here.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Co-authored-by: alex <spiridonov@keysystems.ru>
…26170) These are causing failures in pri1 test builds
This pull request updates the following dependencies [marker]: <> (Begin:4247a230-8931-4538-5b64-08d8d8feb47e) ## From https://github.com/dotnet/icu - **Subscription**: [4247a230-8931-4538-5b64-08d8d8feb47e](https://maestro.dot.net/subscriptions?search=4247a230-8931-4538-5b64-08d8d8feb47e) - **Build**: [20260318.1](https://dev.azure.com/dnceng/internal/_build/results?buildId=2929628) ([306754](https://maestro.dot.net/channel/8297/github:dotnet:icu/build/306754)) - **Date Produced**: March 18, 2026 1:24:27 PM UTC - **Commit**: [f510e59f6829c081b33b86cb51197db2e387d1a0](dotnet/icu@f510e59) - **Branch**: [dotnet/main](https://github.com/dotnet/icu/tree/dotnet/main) [DependencyUpdate]: <> (Begin) - **Dependency Updates**: - From [11.0.0-alpha.1.26167.1 to 11.0.0-alpha.1.26168.1][1] - Microsoft.NETCore.Runtime.ICU.Transport [1]: dotnet/icu@fb85cf3...f510e59 [DependencyUpdate]: <> (End) [marker]: <> (End:4247a230-8931-4538-5b64-08d8d8feb47e) Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Backport of dotnet#125970 to main fixes test failure like ``` Starting Azure Pipelines Test Run host-windows-x86-Debug @ windows.11.amd64.client ##[error].packages\microsoft.dotnet.helix.sdk\10.0.0-beta.26167.104\tools\Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(47,5): error : (NETCORE_ENGINEERING_TELEMETRY=Build) Creator is forbidden when using authenticated access. ``` --------- Co-authored-by: wfurt <tweinfurt@yahoo.com>
By assigning correct asset type to non-DLL runtime files. The _AddRuntimeLibsToPublishAssets target was setting AssetType="runtime" on all files from RuntimeFiles and LibrariesRuntimeFiles, including .pdb, .so, .dbg, and .dwarf files. The SDK's _ComputeAssembliesToPostprocessOnPublish uses AssetType="runtime" to stamp PostprocessAssembly=true, which caused these non-PE files to be passed as -r: references to ILC, resulting in BadImageFormatException: Unknown file format. Only .dll files should be classified as AssetType="runtime" (managed assemblies). All other extensions are now classified as AssetType="native". Fixes dotnet#126117
…otnet#125874) Noticed that e.g. `DOTNET_JitHashBreak=9b29601d` hits the error path below, which hits an assert due to `%ws`. Fix the format string, and then also fix the parsing so that it works properly when the upper bit is set.
Three fixes: - Skip reading compile results for superpmi replay. If the collection has compile results we never want to reuse them for superpmi replay. Various recording functions do not work properly when there already are existing compile results -- e.g. `recAllocMem` will keep the old instruction buffer instead of the new one from the replay. - Support passing a directory to -mch_files, and allow .mc files passed to -mch_files - Fix diffs report generation when a collection has zero succeeding contexts. Previously we would hit a division-by-zero error.
This is a small GCInfo size reduction (~0.1%). However, the main benefit is that this removes diffs from being introduced by `IGF_EXTEND` IGs, which are supposed to be 'transparent'. Q: should this be done in `emitGenNoGCLst`? A: probably not since the method has contracts related to various code region 'kinds'. We can only coalesce across regions of the same 'kind', which would be more complicated than this change and not as general.
The leg didn't trigger on dotnet#126282 and now it's broken.
…et#126203) This change allows the analyzer to recognize type equality patterns even when long-form or short-parameter opcodes are used. Also updated the file to use modern C# pattern matching.
fixes dotnet#126107 Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
With the JIT baseline now at SSE4.2, these fallback implementations are no longer reachable.
## Description
`fgOptimizeDelegateConstructor` asserts `targetMethodHnd == nullptr`
when `ldftnToken` is null. This fires when
`UnsafeAccessorKind.Constructor` targets a delegate type — the generated
IL uses `newobj Del::.ctor(object, IntPtr)` without a preceding `ldftn`,
so `ldftnToken` is null. However, after multi-level inlining, the
function pointer argument can become a `GT_FTN_ADDR` node, setting
`targetMethodHnd` non-null via pattern matching.
```csharp
[UnsafeAccessor(UnsafeAccessorKind.Constructor)]
extern static Del Construct(object o, IntPtr p);
// Triggers: Assert failed 'targetMethodHnd == nullptr' during 'Morph - Inlining'
Construct(null, (nint)Resolver<Program>.Resolve())();
```
Changes:
- **Remove overly strict assert** — `targetMethodHnd` can legitimately
be non-null when `ldftnToken` is null (pattern matching found a
`GT_FTN_ADDR` from an inlined `ldftn` in a different IL stream)
- **Guard R2R path against null `ldftnToken`** — the `else if (oper ==
GT_FTN_ADDR)` branch dereferences `ldftnToken->m_token`; added `&&
ldftnToken != nullptr` to prevent null dereference
- **Add regression test** —
`Verify_ConstructDelegateFromFunctionPointer` constructs a delegate via
`UnsafeAccessor` with a function pointer from a static virtual interface
method
<!-- START COPILOT ORIGINAL PROMPT -->
<details>
<summary>Original prompt</summary>
>
> ----
>
> *This section details on the original issue you should resolve*
>
> <issue_title>JIT assert with UnsafeAccessor and delegate
ctor</issue_title>
> <issue_description>### Description
>
> Using UnsafeAccessor to make a delegate from a pointer to a static
virtual ends up with a JIT assert.
>
> ### Reproduction Steps
>
> ```cs
> using System;
> using System.Runtime.CompilerServices;
>
> internal unsafe class Program : IInterface
> {
> public delegate object Del();
>
> [UnsafeAccessor(UnsafeAccessorKind.Constructor)]
> extern static Del Construct(object o, IntPtr p);
>
> public static void Main()
> {
> Construct(null, (nint)Resolver<Program>.Resolve())();
> }
>
> public static class Resolver<T> where T : IInterface
> {
> public static delegate*<object> Resolve() => &T.A;
> }
> }
>
> public interface IInterface
> {
> public static virtual object A() => null;
> }
> ```
>
> ### Expected behavior
>
> Works
>
> ### Actual behavior
>
> ```
> Assert failure(PID 47564 [0x0000b9cc], Thread: 42584 [0xa658]):
Assertion failed 'targetMethodHnd == nullptr' in
'Program:Main():System.Void' during 'Morph - Inlining' (IL size 17; hash
0xb6bd7cc8; FullOpts)
>
> File: H:\Projects\dotnet\runtime\src\coreclr\jit\flowgraph.cpp:1109
> Image:
H:\Projects\dotnet\runtime\artifacts\bin\coreclr\windows.x64.Checked\CoreRun.exe
> ```
>
> ### Regression?
>
> No idea
>
> ### Known Workarounds
>
> _No response_
>
> ### Configuration
>
> Main Windows x64
>
> ### Other information
>
> _No response_</issue_description>
>
> ## Comments on the Issue (you are @copilot in this section)
>
> <comments>
> </comments>
>
</details>
<!-- START COPILOT CODING AGENT SUFFIX -->
- Fixes dotnet#125866
<!-- START COPILOT CODING AGENT TIPS -->
---
📱 Kick off Copilot coding agent tasks wherever you are with [GitHub
Mobile](https://gh.io/cca-mobile-docs), available on iOS and Android.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: EgorBo <523221+EgorBo@users.noreply.github.com>
Co-authored-by: Egor Bogatov <egorbo@gmail.com>
…BookkeepingMemoryRegions, GetGCFreeRegions) (dotnet#124926) Implements three `ISOSDacInterface13` APIs natively in the cDAC by extending the `IGC` contract: `GetHandleTableMemoryRegions`, `GetGCBookkeepingMemoryRegions`, and `GetGCFreeRegions`. These APIs previously delegated to legacy DAC or returned `E_NOTIMPL`. ## Description ### SOSMemoryEnum - Added `SOSMemoryEnum` implementation following the `SOSHandleEnum` pattern, with built-in `#if DEBUG` validation against the legacy DAC enum (validates `Start`, `Size`, `ExtraData`, `Heap` fields on `Next()`, `Skip()`, `Reset()`, and `GetCount()`) ### IGC Contract Extensions - Extended `IGC` contract with three new memory region enumeration methods - Safety caps as named constants matching native DAC: `MaxHandleTableRegions` (8192), `MaxBookkeepingRegions` (32), `MaxSegmentListIterations` (2048) - `GetGCFreeRegions` uses `TryReadGlobal` for `CountFreeRegionKinds` to gracefully handle non-regions GC builds - Added null guard for `heap.FreeRegions` value in server GC path - Bookkeeping loop includes underflow guard (`next > cardTableInfoSize`) matching native DAC `daccess.cpp:8313` ### Data Types & Descriptors - Added `CardTableInfo` and `RegionFreeList` data classes - Extended `GCHeap` type descriptors with optional fields (`FreeableSohSegment`, `FreeableUohSegment`, `FreeRegions`) conditional on GC build configuration (`BACKGROUND_GC`, `USE_REGIONS`) - Native datadescriptor changes for new globals and types ### Documentation & Tests - Updated `docs/design/datacontracts/GC.md` with new API types, data descriptors, globals, constants, and pseudocode following cDAC best practices (field-access patterns using `target.ReadPointer(addr + /* offset */)` instead of `Read<DataType>()` syntax) - Added comprehensive unit tests (`GCMemoryRegionTests.cs`) covering single/multiple segments, empty buckets, linked segments, bookkeeping entries, and free regions - Added dump tests in `ServerGCDumpTests.cs` and `WorkstationGCDumpTests.cs` ### Merge Conflict Resolution - Resolved merge conflicts with `origin/main` in `GC.md` and `GC_1.cs`, integrating the new `GetHandleExtraInfo` implementation from main alongside the memory region APIs - Fixed merge issue where extra unrelated files were pulled into the diff by resetting ~130 non-PR files back to match `origin/main` - Removed duplicate `HandleSegmentSize` constant introduced by the merge ### CI Fix - Investigated CI failures showing "action_required" status (awaiting maintainer approval) rather than actual build failures - Reset 12 additional extra non-PR files that were still present in the diff back to `origin/main`, ensuring the diff contains exactly 16 PR-specific files - Verified all 1362 cDAC tests pass (including 32 GC memory region tests) and managed projects build with 0 errors <!-- START COPILOT CODING AGENT TIPS --> --- 💬 We'd love your input! Share your thoughts on Copilot coding agent in our [2 minute survey](https://gh.io/copilot-coding-agent-survey). --------- Co-authored-by: Max Charlamb <maxcharlamb@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add area-Diagnostics-cdac to docs/area-owners.md and .github/policies/resourceManagement.yml with the same owners as area-Diagnostics-coreclr (@steveisok, @tommcdon, @dotnet/dotnet-diag). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…otnet#125884) ## Split Dictionary.Remove inner loops for value-type key optimization `FindValue`, `TryInsert`, and `CollectionsMarshalHelper.GetValueRefOrAddDefault` all split their inner loops into separate value-type and comparer paths: ```csharp if (typeof(TKey).IsValueType && comparer == null) { while (...) { ... EqualityComparer<TKey>.Default.Equals(...) ... } } else { while (...) { ... comparer.Equals(...) ... } } ``` This lets the JIT register-allocate each loop independently. The value-type loop has no virtual calls, so the register allocator can keep all loop variables in registers. `Remove` (both overloads) was not split during the PR dotnet#75663 refactoring that introduced this pattern. It uses a single loop with an inline ternary: ```csharp while (i >= 0) { if (entry.hashCode == hashCode && (typeof(TKey).IsValueType && comparer == null ? EqualityComparer<TKey>.Default.Equals(entry.key, key) : comparer!.Equals(entry.key, key))) { ... } } ``` The JIT keeps both code paths alive in a single loop, so the register allocator must plan for the comparer virtual call even on the value-type path. This causes unnecessary stack spills and reloads every iteration. This PR applies the same split to both `Remove(TKey key)` and `Remove(TKey key, out TValue value)`. ### Codegen impact (x64, `Dictionary<int, int>.Remove`) **Before** (single merged loop): 3 spills + 3 reloads per iteration ```asm ; LOOP ENTRY - spills happen every iteration mov [rsp+0x30], r10d ; SPILL entries.Length mov [rsp+0x34], eax ; SPILL i mov [rsp+0x28], r9 ; SPILL &entries[i] ... ; LOOP ADVANCE - reloads mov eax, [rsp+0x34] ; RELOAD i mov eax, [r9+0x04] ; i = entry.next cmp [rsp+0x30], edi ; RELOAD entries.Length ``` **After** (value-type loop): 0 spills, 0 reloads ```asm ; LOOP BODY - all registers cmp eax, ebp ; bounds check (register) lea rcx, [r13+rcx+0x10] ; &entries[i] (register) cmp [rcx], r14d ; hashCode (register) cmp [rcx+0x08], esi ; key compare - direct, no virtual call ; LOOP ADVANCE mov eax, [rcx+0x04] ; i = entry.next (register) cmp ebp, edi ; entries.Length (register) ``` arm64 shows the same pattern: baseline has 1 spill + 2 reloads per iteration (fewer due to more callee-saved registers); modified value-type loop has 0. ### Benchmark results Both builds from identical commit (b38cc2a), only `Dictionary.cs` differs. BDN v0.16.0-nightly, `--coreRun` A/B, MannWhitney statistical test with 3ns threshold. Two independent runs: Run 1 (15 iterations), Run 2 (30 iterations, high priority process). Run 2 shown below; Run 1 results were consistent. Machine: Intel Core i9-14900K 3.20GHz, 24 cores (hybrid P/E), 64GB RAM, Windows 11. #### All Remove scenarios (Run 2, 30 iterations) | Type | Method | Size | Baseline (ns) | Modified (ns) | Ratio | MannWhitney | |------|--------|------|---------------|---------------|-------|-------------| | Guid, Int32 | Remove_Hit | 16 | 101.0 | 84.1 | 0.83 | **Faster** <-- | | Guid, Int32 | Remove_Hit | 512 | 3,081.4 | 2,864.2 | 0.93 | **Faster** <-- | | Guid, Int32 | Remove_Hit | 4096 | 30,195.3 | 27,925.7 | 0.92 | **Faster** <-- | | Guid, Int32 | Remove_Miss | 16 | 30.4 | 30.2 | 0.99 | Same | | Guid, Int32 | Remove_Miss | 512 | 1,080.9 | 926.8 | 0.89 | Same | | Guid, Int32 | Remove_Miss | 4096 | 8,750.4 | 8,850.4 | 1.01 | Slower | | Guid, Int32 | TryRemove_Hit | 16 | 99.6 | 87.1 | 0.87 | **Faster** <-- | | Guid, Int32 | TryRemove_Hit | 512 | 3,657.3 | 2,975.0 | 0.82 | **Faster** <-- | | Guid, Int32 | TryRemove_Hit | 4096 | 29,808.7 | 28,089.5 | 0.94 | **Faster** <-- | | Int32, Int32 | Remove_Hit | 16 | 67.5 | 63.3 | 0.94 | **Faster** <-- | | Int32, Int32 | Remove_Hit | 512 | 2,266.9 | 2,124.9 | 0.94 | **Faster** <-- | | Int32, Int32 | Remove_Hit | 4096 | 20,806.0 | 20,775.1 | 1.00 | Same <-- | | Int32, Int32 | Remove_Miss | 16 | 24.7 | 23.5 | 0.95 | Same | | Int32, Int32 | Remove_Miss | 512 | 716.1 | 668.5 | 0.93 | **Faster** | | Int32, Int32 | Remove_Miss | 4096 | 6,414.8 | 6,647.6 | 1.04 | Slower | | Int32, Int32 | TryRemove_Hit | 16 | 67.8 | 63.2 | 0.93 | **Faster** <-- | | Int32, Int32 | TryRemove_Hit | 512 | 2,316.2 | 2,092.7 | 0.90 | **Faster** <-- | | Int32, Int32 | TryRemove_Hit | 4096 | 21,362.0 | 20,311.9 | 0.95 | **Faster** <-- | | String, String | Remove_Hit | 16 | 185.0 | 185.6 | 1.00 | Same | | String, String | Remove_Hit | 512 | 6,986.0 | 6,827.3 | 0.98 | **Faster** | | String, String | Remove_Hit | 4096 | 109,184.1 | 113,931.7 | 1.04 | Slower | | String, String | Remove_Miss | 16 | 81.3 | 81.5 | 1.00 | Same | | String, String | Remove_Miss | 512 | 2,482.6 | 2,491.2 | 1.00 | Same | | String, String | Remove_Miss | 4096 | 49,162.6 | 48,452.3 | 0.99 | **Faster** | | String, String | TryRemove_Hit | 16 | 185.8 | 185.8 | 1.00 | Same | | String, String | TryRemove_Hit | 512 | 7,012.7 | 6,837.4 | 0.98 | **Faster** | | String, String | TryRemove_Hit | 4096 | 113,241.7 | 111,874.3 | 0.99 | **Faster** | #### Analysis - **Value-type Hit (Guid, Int32): 6-18% faster** — consistent improvement across all sizes and both Remove overloads, statistically significant in both runs. These are marked with `<--` - **String (ref type): neutral** — expected, as ref types always store a non-null comparer (since dotnet#75663) and always take the comparer path regardless of the loop split. - **Miss scenarios: neutral** — expected, as the optimization eliminates register spills in the equality comparison, which is rarely reached on a miss (most iterations fail the hashCode check and skip immediately). Small fluctuations in both directions are within machine noise. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…kups (dotnet#126175) ## Description Crossgen2-comparison outerloop tests intermittently fail with `KeyError` (e.g. `KeyError: 'System.DirectoryServices'`) when an assembly is present in the base output but missing from the diff output. **Root cause:** When generating XML results for assemblies in `omitted_from_diff_dir` (assemblies in base but not in diff), line 922 looked them up in `diff_results_by_name` — the dictionary that by definition does not contain them: ```python # Before (bug): looks up in the wrong dictionary for assembly_name in omitted_from_diff_dir: base_result = diff_results_by_name[assembly_name] # After (fix): looks up in the correct dictionary for assembly_name in omitted_from_diff_dir: base_result = base_results_by_name[assembly_name] ``` The adjacent `omitted_from_base_dir` loop correctly uses `diff_results_by_name`; this was a copy-paste error in the opposite direction. <!-- START COPILOT CODING AGENT TIPS --> --- 📍 Connect Copilot coding agent with [Jira](https://gh.io/cca-jira-docs), [Azure Boards](https://gh.io/cca-azure-boards-docs) or [Linear](https://gh.io/cca-linear-docs) to delegate work to Copilot in one click without leaving your project management tool. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: jtschuster <36744439+jtschuster@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
At the end of the opt phases, if we've modified the flowgraph, look for unreachable blocks and remove them. In particular we want to try and find unreachable try entries earlier, so the try regions themselves can get cleaned up. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…otnet#126000) ## Description ARM64 `JitDisasm` output for indirect calls (`blr`/`br_tail`) was missing method name annotations. On x64, indirect calls render as `call [addr] ; MethodName`, but on ARM64 the equivalent `blr x2` had no function name comment. The method handle is already stored in `idMemCookie` for all call types. The `IF_BR_1B` display case in `emitDispInsHelp` simply never read it. **Before:** ```asm blr x2 ``` **After:** ```asm blr x2 // code for IndirectCallAnnotation:VirtualCallee():int ``` ### Changes - **`src/coreclr/jit/emitarm64.cpp`**: In the `IF_BR_1B` case of `emitDispInsHelp`, call `emitDispCommentForHandle` with `GTF_ICON_FTN_ADDR` to display the method name — matching the format used by `movz`/`movk` constant annotations in checked builds and the convention used by other architectures. <!-- START COPILOT ORIGINAL PROMPT --> <details> <summary>Original prompt</summary> > > ---- > > *This section details on the original issue you should resolve* > > <issue_title>JIT: Release `JitDisasm` does not include function names for calls on arm64</issue_title> > <issue_description>E.g. I see > ```asm > ; Assembly listing for method vb_playground.M1:Main() (FullOpts) > ; Emitting BLENDED_CODE for arm64 on Apple > ; FullOpts code > ; optimized code > ; fp based frame > ; partially interruptible > ; No PGO data > ; 0 inlinees with PGO data; 2 single block inlinees; 0 inlinees without PGO data > > G_M000_IG01: ;; offset=0x0000 > stp fp, lr, [sp, #-0x10]! > mov fp, sp > > G_M000_IG02: ;; offset=0x0008 > movz x0, #0x2570 > movk x0, #0x99F LSL dotnet#16 > movk x0, dotnet#1 LSL dotnet#32 > bl CORINFO_HELP_NEWSFAST > mov w1, dotnet#3 > movz x2, #0x7C00 > movk x2, dotnet#590 LSL dotnet#16 > movk x2, dotnet#1 LSL dotnet#32 > ldr x2, [x2] > blr x2 > > G_M000_IG03: ;; offset=0x0030 > ldp fp, lr, [sp], #0x10 > ret lr > > ; Total bytes of code 56 > ``` > when using `JitDisasm` on preview3. I would expect a comment with the function name for `x2` (IIRC that was the primary reason we did work to store debug info in this case) > > cc @dotnet/jit-contrib </issue_description> > > <agent_instructions>Please improve JitDisasm for arm64 for Release. Most (if not all) managed-to-managed calls are indirect, but on x64 we render them as `call [Foo]` while on arm64 we do what is described in the issue with no function name. Please write a small FILECHECK test for arm64 specifically.</agent_instructions> > > ## Comments on the Issue (you are @copilot in this section) > > <comments> > </comments> > </details> <!-- START COPILOT CODING AGENT SUFFIX --> - Fixes dotnet#125951 <!-- START COPILOT CODING AGENT TIPS --> --- 💬 Send tasks to Copilot coding agent from [Slack](https://gh.io/cca-slack-docs) and [Teams](https://gh.io/cca-teams-docs) to turn conversations into code. Copilot posts an update in your thread when it's finished. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: EgorBo <523221+EgorBo@users.noreply.github.com> Co-authored-by: Egor Bogatov <egorbo@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Try to fold `SequenceEqual` with the help of VN.
Codegen for
```cs
Test.CompareStr();
class Test
{
public static string Str1 => "abcdef";
public static string Str2 => "bcdefg";
[MethodImpl(MethodImplOptions.NoInlining)]
public static bool CompareStr()
{
return Str1.Equals(Str2);
}
}
```
Before:
```asm
G_M64625_IG01: ;; offset=0x0000
;; size=0 bbWeight=1 PerfScore 0.00
G_M64625_IG02: ;; offset=0x0000
mov rax, 0x2DA4A1AEFF0 ; 'abcdef'
add rax, 12
mov rcx, 0x2DA4A1AF024
mov rdx, qword ptr [rax]
mov rax, qword ptr [rax+0x04]
mov r8, qword ptr [rcx]
xor rdx, r8
xor rax, qword ptr [rcx+0x04]
or rax, rdx
sete al
movzx rax, al
;; size=50 bbWeight=1 PerfScore 11.50
G_M64625_IG03: ;; offset=0x0032
ret
;; size=1 bbWeight=1 PerfScore 1.00
```
After:
```asm
G_M64625_IG01: ;; offset=0x0000
;; size=0 bbWeight=1 PerfScore 0.00
G_M64625_IG02: ;; offset=0x0000
xor eax, eax
;; size=2 bbWeight=1 PerfScore 0.25
G_M64625_IG03: ;; offset=0x0002
ret
;; size=1 bbWeight=1 PerfScore 1.00
```
---------
Co-authored-by: Egor Bogatov <egorbo@gmail.com>
Fixes dotnet#126196 > [!NOTE] > This PR was created with assistance from GitHub Copilot. ## Problem The cDAC Helix dump test commands use `exit` (Unix: `exit $_rc`, Windows: `exit /b 1`) which terminates the entire Helix work item shell prematurely, preventing: - Post-commands from running - Test results from being reported to the xUnit reporter - Proper Helix work item completion ## Fix Build command lines in `_HelixCommandLines` ItemGroup, write to `HelixCommand.txt` via `WriteLinesToFile`, then read back as the `HelixWorkItem` Command via `$([System.IO.File]::ReadAllText())`. This follows the pattern established by `helixpublishwitharcade.proj` (lines 363-412, 733-734, 799-801). ### Exit code handling **Windows**: Uses `%ComSpec% /C exit %test_exit_code%` instead of `exit /b` — spawns a sub-shell to set ERRORLEVEL without killing the parent shell. Uses `%25` encoding for `%VAR%` references to avoid conflict with MSBuild `%(metadata)` syntax. **Unix**: Uses `set_return` pattern (defining a function that uses `return` instead of `exit`) to propagate exit code without terminating the shell: ```bash set_return() { return $1; } set_return $test_exit_code ``` Both patterns are taken directly from `helixpublishwitharcade.proj`. --------- Co-authored-by: Max Charlamb <maxcharlamb@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Validate that the length passed to `Buffer.MemoryCopy`, `Buffer.BlockCopy`, and `Unsafe.Copy*` doesn't silently overflow. <!-- START COPILOT CODING AGENT SUFFIX --> Created from Copilot CLI via the copilot delegate command. <!-- START COPILOT CODING AGENT TIPS --> --- 🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. [Learn more about Advanced Security.](https://gh.io/cca-advanced-security) --------- Co-authored-by: EgorBo <egorbo@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Corrects comma placement logic in ArrayValue.ToString() to properly separate elements and nested values.
…superlinear behavior with polymorphic recursion (dotnet#126534) `GetVersionResilientTypeHashCode` recomputes hash codes recursively without memoization. With polymorphic recursion, each level doubles the number of unique generic instantiations, causing exponential blowup — `n=31` took ~15s before this fix. On 64-bit release builds, `MethodTableAuxiliaryData` already had 4 bytes of implicit alignment padding between the union and `m_pLoaderModule`. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: davidwrighton <10779849+davidwrighton@users.noreply.github.com> Co-authored-by: David Wrighton <davidwr@microsoft.com> Co-authored-by: Jan Kotas <jkotas@microsoft.com> Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
…aders (dotnet#126709) PR dotnet#126388 bumped the shared webcil version from 0 to 1, we need to update the rest of webcil ecosystem ## Changes - **Webcil.cs**: Add `TableBase` field to managed `WebcilHeader` struct (32 bytes for V1) - **WebcilConverter.cs**: Parameterize version (default V0), write V0 (28-byte) or V1 (32-byte) header based on `webcilVersion` parameter. Add version validation and WriteStructure bounds checks. - **WebcilReader.cs**: Read V0 header first (28 bytes), then conditionally read extra 4-byte TableBase for V1. Version-aware `SectionDirectoryOffset`. - **webcil-loader.c** (Mono): Accept both V0 and V1 headers with proper bounds checking, use correct header size for section offset calculation. The WebcilConverter defaults to V0, which is what non-R2R builds use. The R2R/crossgen2 path (WasmObjectWriter + WebcilEncoder) produces V1 independently. Both CoreCLR and Mono decoders now handle V0 and V1. ## Testing - CoreCLR browser-wasm library tests: 474/476 passed (2 skipped, 0 failed) - Mono browser-wasm library tests: 469/476 passed (5 pre-existing DateTime failures, 0 webcil-related) Supersedes dotnet#126698 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Description `PEDecoder` uses `const_cast<PEDecoder *>(this)->` in 8 places across `const` methods to lazily cache validation flags and header pointers. This is the textbook use case for `mutable`. Mark the four cache fields as `mutable` and remove all `const_cast` usages: - **`src/coreclr/inc/pedecoder.h`** — Add `mutable` to `m_flags`, `m_pNTHeaders`, `m_pCorHeader`, `m_pReadyToRunHeader` - **`src/coreclr/utilcode/pedecoder.cpp`** — Remove 7 `const_cast` sites (`HasNTHeaders`, `CheckNTHeaders`, `CheckCorHeader`, `CheckILOnly` ×2, `FindReadyToRunHeader` ×2) - **`src/coreclr/inc/pedecoder.inl`** — Remove 1 `const_cast` site (`GetCorHeader`) No layout or behavioral change — `mutable` only affects const-correctness at compile time. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: adamperlin <10533886+adamperlin@users.noreply.github.com>
Fixes dotnet#124509 Brings support for RandomTraceId flag known from W3C context propagation level 2. More details in linked issue. --------- Co-authored-by: Tarek Mahmoud Sayed <tarekms@microsoft.com> Co-authored-by: Tarek Mahmoud Sayed <10833894+tarekgh@users.noreply.github.com>
…artial class declarations (dotnet#124698) Fixes dotnet#77409 ## Problem When the SYSLIB1045 `BatchFixer` applies multiple code fixes concurrently to separate partial class declarations, each fix independently checks existing members via the semantic model. Since all fixers see the **original** compilation (before any fix is applied), they all conclude `MyRegex` is available and generate duplicate property names, causing CS0756. ## Fix In `UpgradeToGeneratedRegexCodeFixer.CreateGeneratedRegexProperty`, after finding the first available name via the semantic model, the new `CountPrecedingRegexCallSites` helper scans all `DeclaringSyntaxReferences` of the containing type for Regex constructor/static-method call sites that would also generate new property names. These are sorted deterministically (by file path, then span position). The current node's index in that sorted list determines how many names to skip, giving each concurrent fixer a unique name. ## Test Added `CodeFixerGeneratesUniqueNamesAcrossPartialClassDeclarations` — two `new Regex(...)` calls in separate partial class declarations of the same type, verifying `MyRegex` and `MyRegex1` are generated. Verified the test fails without the fix (both declarations produce `MyRegex`). --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ain (dotnet#125853) I noticed that my agents kept checking out the main branch and doing a build. Especially if/when main had diverged from whatever the working branch was based on, this was a lot of unnecessary work/time. We just want a 'before new changes' build, not main specifically. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: elinor-fung <47805090+elinor-fung@users.noreply.github.com> Co-authored-by: Elinor Fung <elfung@microsoft.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…LR and NativeAOT) (dotnet#126680) `runtime-async=on` was only set for `Release` builds of CoreCLR's `System.Private.CoreLib`, and was gated behind a riscv64/loongarch64 exclusion in NativeAOT's `System.Private.CoreLib`. This enables it unconditionally in both projects. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: agocke <515774+agocke@users.noreply.github.com> Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com> Co-authored-by: MichalStrehovsky <13110571+MichalStrehovsky@users.noreply.github.com>
…erlying max size of context (dotnet#125819) IXClrDataStackWalk methods don't take into account that the Windows context size is the one of the Windows SDK CONTEXT on AMD64. It does not include inline XSTATE. If some customer passes in XSTATE, ContextSizeForFlags always returns 0x4D0 regardless of context flag. So CheckContextSizeForBuffer always returns true. Then CopyMemory(&m_context, context, ...) will overrun for any XSTATE input and overwrite other fields.
Contributes to prio3 from dotnet#123864 --------- Co-authored-by: Jan Kotas <jkotas@microsoft.com>
…et#126715) PR dotnet#126550 disabled `FEATURE_CORPROFILER` for tvOS (`AND NOT CLR_CMAKE_TARGET_TVOS` uncommented) but left iOS, MacCatalyst, and Android commented out (profiler still enabled). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ThreadPoolWorkQueue uses FrameworkEventSource during ThreadPool initialization. If that is the first time the FrameworkEventSource is being used, creating it on-demand will acquire the EventListener lock which can deadlock. We avoid that deadlock by creating the EventSource earlier on the startup path where nothing is waiting and holding the lock. We've seen several of these EventSource lock deadlocks over the years and it would be nice to make a more comprehensive fix, but unforetunately the only broad fix I know of would be make EventSource callbacks asynchronous relative to the underlying ETW callbacks. That isn't out of the question but it does carry a broader set of compatibility risks so we've generally favored these lower risk spot fixes instead. Fixes dotnet#126591
…stripping (dotnet#126631) ## Summary Fixes test failures on iOS, tvOS, and MacCatalyst caused by IL stripping removing method bodies. ## Changes - **`ProcessTests.Start_Disposed_ThrowsObjectDisposedException`**: Added `[SkipOnPlatform]` for iOS, tvOS, and MacCatalyst. - **`CustomMethodInfoTests.GetMethodImplementationFlags_ReturnsIL`** and **`CustomConstructorInfoTests.GetMethodImplementationFlags_ReturnsIL`**: Changed from `[Fact]` to `[ConditionalFact(...IsMethodBodySupported)]` since AOT IL stripping changes implementation flags from `IL` to `NoInlining`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…otnet#126638) Add Copilot agentic artifacts for Microsoft.Extensions.* and System.IO.Compression libraries — a review agent, a writing skill, and folder-scoped instruction files for various extensions areas. Includes a follow-up review pass, explicitly invokable as a separate agent. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ssHandlesTests (dotnet#126629) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com> Co-authored-by: Adam Sitnik <adam.sitnik@gmail.com>
…6157) > [!NOTE] > This PR description was AI/Copilot-generated. ## Summary Reduce Helix queue pressure by grouping ~172 individual WASM CoreCLR library test work items into ~21 batched work items (**88% reduction**), saving **56% of total machine time**. ## Verified CI Results All numbers from the latest CI run (build 1361877), compared against an unbatched baseline from the same queue: | Metric | Unbatched | Batched | Change | |--------|-----------|---------|--------| | Work items | 172 | 21 | **-88%** | | Tests run | 259,527 | 259,527 | **identical** | | Tests passed | 256,963 | 256,963 | **identical** | | Tests failed | 0 | 0 | | | Total machine time | 437m (7.3h) | 195m (3.2h) | **-56%** | | Longest | 30m13s | well within timeout (160m for 8-suite batch) |batch | | Queue slots consumed | 172 | 21 | **-151** | ### Per-Batch Breakdown (latest run) | Batch | Suites | Duration | Batch | Suites | Duration | |-------|--------|----------|-------|--------|----------| | Batch-4 | 8 | 30m13s | Batch-14 | 9 | 7m16s | | Batch-6 | 8 | 19m07s | Batch-1 | 8 | 7m12s | | Batch-5 | 8 | 18m23s | Batch-12 | 9 | 7m06s | | Batch-8 | 9 | 17m16s | Batch-9 | 9 | 6m57s | | Batch--1 | 1 | 13m29s | Batch-15 | 9 | 5m59s | | Batch-0 | 7 | 10m37s | Batch-3 | 8 | 5m42s | | Batch-10 | 9 | 10m16s | Batch-2 | 8 | 5m40s | | Batch-19 | 9 | 5m33s | Batch-17 | 9 | 4m30s | | Batch-13 | 9 | 5m03s | Batch-7 | 8 | 3m38s | | Batch-11 | 9 | 4m42s | Batch-18 | 9 | 3m19s | | | | | Batch-16 | 9 | 2m48s | ### Where the Savings Come From The 56% machine time reduction (~245 minutes saved) comes from eliminating **Helix per-work-item overhead**: | Overhead source | Per item | Items eliminated | Total saved | |---|---|---|---| | Helix scheduling + payload download | ~45s | 151 | ~113m | | Artifact upload + reporting | ~30s | 151 | ~75m | | Queue slot acquisition | ~15s | 151 | ~38m | | **Total** | **~1.5m** | **151** | **~227m** | Chrome and xharness are **not** re-downloaded per they are in the Helix correlation payload (shared per machine). Chrome restarts per suite within batches (~1s each), so reusing Chrome would only save ~3 minutes total (not worth xharness changes).item ## Changes - **`eng/testing/WasmBatchRunner.sh`** (new): Batch runner script that extracts and runs multiple test suites sequentially within a single Helix work item, with per-suite result isolation via separate `HELIX_WORKITEM_UPLOAD_ROOT` directories. Includes error handling for failed extractions (preserving actual unzip exit code) and restores `HELIX_WORKITEM_UPLOAD_ROOT` after the batch loop for Helix post-commands. - **`src/libraries/sendtohelix-browser.targets`** (modified): - `WasmBatchLibraryTests` property (defaults `true` for CoreCLR+Chrome, `false` otherwise) - `_AddBatchedWorkItemsForLibraryTests` target: stages batch directories, zips them (PayloadArchive), and constructs HelixCommand via `Replace` with a build-time validation guard - Sample apps excluded from batching, kept as individual work items - Original target gated on `WasmBatchLibraryTests != true` - **`src/tasks/HelixTestTasks/`** (new project): - `GroupWorkItems` compiled MSBuild task: greedy bin-packing by file size, large suites (>50MB) stay solo - `ComputeBatchTimeout` compiled MSBuild task: 20 min/suite, 30 min minimum (accounts for WASM startup overhead + heaviest suites like Cryptography at ~17m) - Non-shipping task assembly, referenced via `UsingTask` from `sendtohelix-browser.targets` ## Design The batch runner is a thin loop wrapper around each suite's existing `RunTests.sh`. It does **not** duplicate any test setup all xharness/Chrome/WASM configuration stays inside each suite's generated runner script. The coupling surface is minimal:logic - `HelixCommand` prefix (env vars, dev-certs) is preserved via `.Replace()` on the `./RunTests.sh` suffix - A build-time `<Error>` guard validates the replacement succeeded - Each suite's `HELIX_WORKITEM_UPLOAD_ROOT` is isolated to a per-suite subdirectory - Extracted suite directories are cleaned up between runs to free disk space - Stale batch staging directories are cleaned via `<RemoveDir>` before each build ## Opt-out Disable with `/p:WasmBatchLibraryTests=false` to fall back to individual work items. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: David Wrighton <davidwr@microsoft.com>
Defer calling genEmitStartBlock until after we've set up the funclet prolog IG. Pass the BasicBlock back from emitter to codegen for the funclet epilog generation since epilog codegen is position dependent. Also, add an on-by default R2R fail-over at the end of codegen for methods with funclets, since the Wasm for method with funclets can't be validated. We can remove this once the JIT host extracts the funclets as separate functions. You can override this via `JitWasmFunclets=1`. Also includes a fix for dotnet#125756 (comment), which was causing node references to be collected more than once; this would corrupt the node's internal registers.
…scs (dotnet#126728) Regressed by dotnet#125900 from dotnet#124354 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…shmentRateLimiters` incorrectly (dotnet#126158) 1. Fixed `ChainedRateLimiter.IdleDuration`. In the current PR if it detects any of chained child rateLimiters having `null` IdleDuration, it will return null now. `IdleDuration` equalling `now` means rateLimiter is in use or is not ready to be idle. 2. `DefaultPartitionedRateLimiter` does not replenish the children rateLimiters of `ChainedRateLimiter`. Now there are 2 implementations of chained limiter depending on the passed limiters to `RateLimiter.CreateChained(...)` Related dotnet#68776 Fixes dotnet#125560
Copy the hotreload-utils source code from dotnet/hotreload-utils into src/tools/hotreload-delta-gen/ and replace the NuGet PackageReference with direct in-repo MSBuild targets, eliminating the cross-repo dependency. Changes: - Add 5 projects + shared code from hotreload-utils under src/tools/hotreload-delta-gen/ - Create eng/testing/hotreload-delta-gen.targets replacing the NuGet package's .targets file - Update 3 consumers (ApplyUpdate tests, WASM HotReload test, Mono sample) to import local targets instead of NuGet package - Add BuildTool and Tasks to src/libraries/pretest.proj for build ordering - Remove hotreload-utils dependency from eng/Version.Details.xml and eng/Version.Details.props - Suppress analyzers on imported code (RunAnalyzers=false) matching the illink pattern --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Alexander Köplinger <alex.koeplinger@outlook.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
… item Update (dotnet#125630) ## Description Follow-up to PR dotnet#124801 review feedback: the "intersection via include/remove, then remove+re-include" pattern in `_PrepareTrimConfiguration` was complex, mutated item ordering, and used two throwaway item groups. Replace with a direct MSBuild `Update`. ### Change **Before** — compute intersection manually to set metadata, then remove+re-add items: ```xml <__SingleWarnIntermediateAssembly Include="@(ResolvedFileToPublish)" /> <__SingleWarnIntermediateAssembly Remove="@(IntermediateAssembly)" /> <_SingleWarnIntermediateAssembly Include="@(ResolvedFileToPublish)" /> <_SingleWarnIntermediateAssembly Remove="@(__SingleWarnIntermediateAssembly)" /> <_SingleWarnIntermediateAssembly> <TrimmerSingleWarn Condition="...">false</TrimmerSingleWarn> </_SingleWarnIntermediateAssembly> <ResolvedFileToPublish Remove="@(_SingleWarnIntermediateAssembly)" /> <ResolvedFileToPublish Include="@(_SingleWarnIntermediateAssembly)" /> ``` **After** — update matching items directly, preserving order: ```xml <ResolvedFileToPublish Update="@(IntermediateAssembly)"> <TrimmerSingleWarn Condition=" '%(ResolvedFileToPublish.TrimmerSingleWarn)' == '' ">false</TrimmerSingleWarn> </ResolvedFileToPublish> ``` ## Changes proposed in this pull request - [`Microsoft.NET.ILLink.targets`] Replace 13-line intersection pattern with a 3-line `Update` in `_PrepareTrimConfiguration` - [`Microsoft.NET.ILLink.targets`] Qualify `%(TrimmerSingleWarn)` as `%(ResolvedFileToPublish.TrimmerSingleWarn)` in the `Update` condition to prevent MSB4096 (unqualified metadata batching over all `ResolvedFileToPublish` items, including those without the metadata defined) ## Additional context <!-- START COPILOT CODING AGENT TIPS --> --- 🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. [Learn more about Advanced Security.](https://gh.io/cca-advanced-security) --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: sbomer <787361+sbomer@users.noreply.github.com>
This PR contains one of @davidwrighton's commits extracted from dotnet#126662 to reduce the PR size and complexity. It adds a relocation type that is essentially `R_WASM_MEMORY_ADDR_REL_LEB`, so that a relocation corresponding to a relative offset from r2r `imageBase` can be used directly as the `offset` in a load, like: ```wasm global.get $imageBase i32.load align=... offset=<reloc> ``` The functionality isn't used yet, but will be in dotnet#126662. --------- Co-authored-by: David Wrighton <davidwr@microsoft.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
) ## Summary Add typed field extension methods for the cDAC managed reader and convert all `IData<T>` classes to use them. This builds on the native data descriptor type system (dotnet#126163) to add managed-side type validation. ### Changes **`TargetFieldExtensions.cs`** — New extension methods on `Target`: - `ReadField<T>` — reads a primitive field with `Debug.Assert` type validation - `ReadPointerField` / `ReadPointerFieldOrNull` — reads pointer fields (required/optional) - `ReadNUIntField` — reads native unsigned integer fields - `ReadCodePointerField` — reads code pointer fields - `ReadDataField<T>` / `ReadDataFieldPointer<T>` — reads inline/pointer-to Data struct fields - `ReadFieldOrDefault<T>` — reads optional primitive fields with default value When the data descriptor includes type information (debug/checked builds), these methods assert that the declared field type is compatible with the C# read type. **All ~130 `IData<T>` classes converted** to use the new extension methods, replacing direct `target.Read<T>(address + (ulong)type.Fields[...].Offset)` calls. ### Bugs found and fixed The type system caught several pre-existing issues: 1. **DynamicStaticsInfo.GCStatics/NonGCStatics** — descriptor said `uint32`, native type is `TADDR` (pointer) 2. **EEClass.FieldDescList** — used `Read<ulong>` for a pointer field (broken on 32-bit) 3. **ExceptionClause.ClassToken** — read `TypeHandle` (nuint) as `uint` 4. **SimpleComCallWrapper.RefCount** — property was `ulong`, native is `LONGLONG` (signed) 5. **NativeCodeVersionNode.NativeCode** — descriptor was `T_POINTER`, actually `CodePointer` 6. **MethodTable.EEClassOrCanonMT** — descriptor was `T_NUINT`, managed reads as pointer > [!NOTE] > This content was generated with the assistance of GitHub Copilot. --------- Co-authored-by: Max Charlamb <maxcharlamb@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes dotnet#126619. This was a regression introduced in dotnet#114609 When `System.StubHelpers.GetCOMIPFromRCW()` misses the fast path on a native-created thread, the slow path can create OLE TLS and still resolve the COM interface pointer from the RCW cache. Those cache hits are borrowed and should not be released by the stub cleanup path. This change moves the `pfNeedsRelease` decision to the slow helper so it can distinguish between: - RCW cache hits after OLE TLS initialization - freshly acquired interface pointers from `ComObject::GetComIPFromRCWThrowing` Testing: - verified against the local repro from dotnet#126619 --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Original fix here missed some cases, as msbuild or helix cuts off the failure list after 300: dotnet@95f94b6 This is applying the same fix to other locations we needed this. I grabbed this codechange from @ylpoonlg (thanks in advance) Ran jitstress, arm64 is green
Remove the instrument_if_optimizing scenario from the libraries-pgo pipeline so the very slow JIT instrumentation mode no longer runs for libraries tests. The shared scenario definition remains available for coreclr tests.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AndyAyersMS
pushed a commit
that referenced
this pull request
Apr 17, 2026
AndyAyersMS
pushed a commit
that referenced
this pull request
Jun 19, 2026
This PR adds lane construction and composition APIs approved in dotnet#122557, and the corresponding JIT intrinsics. The JIT now recognizes the new vector APIs and expands them using existing SIMD nodes. The managed implementation allows decomposition through smaller vector widths when wider hardware support is unavailable. The xarch lowering uses fixed shuffle forms where profitable: - `vpbroadcast*` for sequence and alternating construction - vshufps and vpunpck* for 128-bit concat/zip/unzip patterns - full-width unpack plus `vperm2i128` for 256-bit zip/unzip - EVEX shuffle/permute forms for 512-bit operations when AVX512 is available - fixed immediate shuffles for reverse The ARM64 lowering avoids table-lookup forms for small fixed concat/reverse operations and uses direct element moves where applicable, such as `ins` and `rev64`. `CreateCauchySequence` has been removed as it doesn't make sense at all. And `SignSequence` has been moved to an extension property as we only want to expose it for signed numbers. Codegen: <details> <summary>Vector128</summary> ```asm ; Vector128.CreateGeometricSequence<int>(initial, 3) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M55419_IG01: ;; offset=0x0000 G_M55419_IG02: ;; offset=0x0000 vpbroadcastd xmm0, edx vpmulld xmm0, xmm0, xmmword ptr [reloc @rwd00] vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M55419_IG03: ;; offset=0x0016 ret RWD00 dq 0000000300000001h, 0000001B00000009h ; Total bytes of code 23 ; Vector128.CreateAlternatingSequence<int>(even, odd) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M23632_IG01: ;; offset=0x0000 G_M23632_IG02: ;; offset=0x0000 vpbroadcastd xmm0, edx vpbroadcastd xmm1, r8d vpunpckldq xmm0, xmm0, xmm1 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M23632_IG03: ;; offset=0x0017 ret ; Total bytes of code 24 ; Vector128<int>.SignSequence ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M37526_IG01: ;; offset=0x0000 G_M37526_IG02: ;; offset=0x0000 vmovddup xmm0, qword ptr [reloc @rwd00] vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M37526_IG03: ;; offset=0x000F ret RWD00 dq FFFFFFFF00000001h ; Total bytes of code 16 ; Vector128.CreateHarmonicSequence<float>(start, step) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M24925_IG01: ;; offset=0x0000 G_M24925_IG02: ;; offset=0x0000 vbroadcastss xmm0, xmm2 vmulps xmm0, xmm0, xmmword ptr [reloc @rwd00] vbroadcastss xmm1, xmm1 vaddps xmm0, xmm1, xmm0 vbroadcastss xmm1, dword ptr [reloc @RWD16] vdivps xmm0, xmm1, xmm0 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M24925_IG03: ;; offset=0x002A ret RWD00 dq 3F80000000000000h, 4040000040000000h RWD16 dd 3F800000h ; 1 ; Total bytes of code 43 ; Vector128.CreateGeometricSequence<float>(initial, 3.0f) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M52564_IG01: ;; offset=0x0000 G_M52564_IG02: ;; offset=0x0000 vbroadcastss xmm0, xmm1 vmulps xmm0, xmm0, xmmword ptr [reloc @rwd00] vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M52564_IG03: ;; offset=0x0014 ret RWD00 dq 404000003F800000h, 41D8000041100000h ; Total bytes of code 21 ; Vector128.ConcatLowerLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M47101_IG01: ;; offset=0x0000 G_M47101_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vshufps xmm0, xmm0, xmmword ptr [r8], 68 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M47101_IG03: ;; offset=0x0011 ret ; Total bytes of code 18 ; Vector128.ConcatLowerUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M48932_IG01: ;; offset=0x0000 G_M48932_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vshufps xmm0, xmm0, xmmword ptr [r8], -28 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M48932_IG03: ;; offset=0x0011 ret ; Total bytes of code 18 ; Vector128.ConcatUpperLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M24196_IG01: ;; offset=0x0000 G_M24196_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vshufps xmm0, xmm0, xmmword ptr [r8], 78 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M24196_IG03: ;; offset=0x0011 ret ; Total bytes of code 18 ; Vector128.ConcatUpperUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M45405_IG01: ;; offset=0x0000 G_M45405_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vshufps xmm0, xmm0, xmmword ptr [r8], -18 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M45405_IG03: ;; offset=0x0011 ret ; Total bytes of code 18 ; Vector128.ZipLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M10057_IG01: ;; offset=0x0000 G_M10057_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vpunpckldq xmm0, xmm0, xmmword ptr [r8] vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M10057_IG03: ;; offset=0x0010 ret ; Total bytes of code 17 ; Vector128.ZipUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M48936_IG01: ;; offset=0x0000 G_M48936_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vpunpckhdq xmm0, xmm0, xmmword ptr [r8] vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M48936_IG03: ;; offset=0x0010 ret ; Total bytes of code 17 ; Vector128.Zip<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M32727_IG01: ;; offset=0x0000 G_M32727_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vmovups xmm1, xmmword ptr [r8] vpunpckldq xmm2, xmm0, xmm1 vpunpckhdq xmm0, xmm0, xmm1 vmovups xmmword ptr [rcx], xmm2 vmovups xmmword ptr [rcx+0x10], xmm0 mov rax, rcx G_M32727_IG03: ;; offset=0x001D ret ; Total bytes of code 30 ; Vector128.UnzipEven<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M32745_IG01: ;; offset=0x0000 G_M32745_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vshufps xmm0, xmm0, xmmword ptr [r8], -120 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M32745_IG03: ;; offset=0x0011 ret ; Total bytes of code 18 ; Vector128.UnzipOdd<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M32638_IG01: ;; offset=0x0000 G_M32638_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vshufps xmm0, xmm0, xmmword ptr [r8], -35 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M32638_IG03: ;; offset=0x0011 ret ; Total bytes of code 18 ; Vector128.Unzip<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M7116_IG01: ;; offset=0x0000 G_M7116_IG02: ;; offset=0x0000 vmovups xmm0, xmmword ptr [rdx] vmovups xmm1, xmmword ptr [r8] vshufps xmm2, xmm0, xmm1, -120 vshufps xmm0, xmm0, xmm1, -35 vmovups xmmword ptr [rcx], xmm2 vmovups xmmword ptr [rcx+0x10], xmm0 mov rax, rcx G_M7116_IG03: ;; offset=0x001F ret ; Total bytes of code 32 ; Vector128.Reverse<int>(vector) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M36184_IG01: ;; offset=0x0000 G_M36184_IG02: ;; offset=0x0000 vpshufd xmm0, xmmword ptr [rdx], 27 vmovups xmmword ptr [rcx], xmm0 mov rax, rcx G_M36184_IG03: ;; offset=0x000C ret ; Total bytes of code 13 ``` </details> <details> <summary>Vector256</summary> ```asm ; Vector256.CreateGeometricSequence<int>(initial, 3) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M59643_IG01: ;; offset=0x0000 G_M59643_IG02: ;; offset=0x0000 vpbroadcastd ymm0, edx vpmulld ymm0, ymm0, ymmword ptr [reloc @rwd00] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M59643_IG03: ;; offset=0x0016 vzeroupper ret RWD00 dq 0000000300000001h, 0000001B00000009h, 000000F300000051h, 0000088B000002D9h ; Total bytes of code 26 ; Vector256.CreateAlternatingSequence<int>(even, odd) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M40720_IG01: ;; offset=0x0000 G_M40720_IG02: ;; offset=0x0000 vpbroadcastd ymm0, edx vmovups ymm1, ymmword ptr [reloc @rwd00] vpbroadcastd ymm2, r8d vpermt2d ymm2, ymm1, ymm0 vmovups ymmword ptr [rcx], ymm2 mov rax, rcx G_M40720_IG03: ;; offset=0x0021 vzeroupper ret RWD00 dq 0000000000000008h, 0000000100000009h, 000000020000000Ah, 000000030000000Bh ; Total bytes of code 37 ; Vector256<int>.SignSequence ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M7766_IG01: ;; offset=0x0000 G_M7766_IG02: ;; offset=0x0000 vbroadcastsd ymm0, qword ptr [reloc @rwd00] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M7766_IG03: ;; offset=0x0010 vzeroupper ret RWD00 dq FFFFFFFF00000001h ; Total bytes of code 20 ; Vector256.CreateHarmonicSequence<float>(start, step) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M59741_IG01: ;; offset=0x0000 G_M59741_IG02: ;; offset=0x0000 vbroadcastss ymm0, ymm2 vmulps ymm0, ymm0, ymmword ptr [reloc @rwd00] vbroadcastss ymm1, ymm1 vaddps ymm0, ymm1, ymm0 vbroadcastss ymm1, dword ptr [reloc @RWD32] vdivps ymm0, ymm1, ymm0 vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M59741_IG03: ;; offset=0x002A vzeroupper ret RWD00 dq 3F80000000000000h, 4040000040000000h, 40A0000040800000h, 40E0000040C00000h RWD32 dd 3F800000h ; 1 ; Total bytes of code 46 ; Vector256.CreateGeometricSequence<float>(initial, 3.0f) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M42068_IG01: ;; offset=0x0000 G_M42068_IG02: ;; offset=0x0000 vbroadcastss ymm0, ymm1 vmulps ymm0, ymm0, ymmword ptr [reloc @rwd00] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M42068_IG03: ;; offset=0x0014 vzeroupper ret RWD00 dq 404000003F800000h, 41D8000041100000h, 4373000042A20000h, 4508B00044364000h ; Total bytes of code 24 ; Vector256.ConcatLowerLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M46973_IG01: ;; offset=0x0000 G_M46973_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vmovdqu xmm1, xmmword ptr [r8] vinserti128 ymm0, ymm0, xmm1 vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M46973_IG03: ;; offset=0x0016 vzeroupper ret ; Total bytes of code 26 ; Vector256.ConcatLowerUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M9060_IG01: ;; offset=0x0000 G_M9060_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vmovups ymm1, ymmword ptr [r8] vextracti128 xmm1, ymm1 vinserti128 ymm0, ymm0, xmm1 vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M9060_IG03: ;; offset=0x001C vzeroupper ret ; Total bytes of code 32 ; Vector256.ConcatUpperLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M57796_IG01: ;; offset=0x0000 G_M57796_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vextracti128 xmm0, ymm0 vmovdqu xmm1, xmmword ptr [r8] vinserti128 ymm0, ymm0, xmm1 vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M57796_IG03: ;; offset=0x001C vzeroupper ret ; Total bytes of code 32 ; Vector256.ConcatUpperUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M7901_IG01: ;; offset=0x0000 G_M7901_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vextracti128 xmm0, ymm0 vmovups ymm1, ymmword ptr [r8] vinserti128 ymm0, ymm1, xmm0 vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M7901_IG03: ;; offset=0x001C vzeroupper ret ; Total bytes of code 32 ; Vector256.ZipLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M61705_IG01: ;; offset=0x0000 G_M61705_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vmovups ymm1, ymmword ptr [reloc @rwd00] vpermt2d ymm0, ymm1, ymmword ptr [r8] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M61705_IG03: ;; offset=0x0019 vzeroupper ret RWD00 dq 0000000800000000h, 0000000900000001h, 0000000A00000002h, 0000000B00000003h ; Total bytes of code 29 ; Vector256.ZipUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M38056_IG01: ;; offset=0x0000 G_M38056_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vmovups ymm1, ymmword ptr [reloc @rwd00] vpermt2d ymm0, ymm1, ymmword ptr [r8] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M38056_IG03: ;; offset=0x0019 vzeroupper ret RWD00 dq 0000000C00000004h, 0000000D00000005h, 0000000E00000006h, 0000000F00000007h ; Total bytes of code 29 ; Vector256.Zip<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M24445_IG01: ;; offset=0x0000 G_M24445_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vmovups ymm1, ymmword ptr [r8] vpunpckldq ymm2, ymm0, ymm1 vpunpckhdq ymm0, ymm0, ymm1 vperm2i128 ymm1, ymm2, ymm0, 32 vperm2i128 ymm0, ymm2, ymm0, 49 vmovups ymmword ptr [rcx], ymm1 vmovups ymmword ptr [rcx+0x20], ymm0 mov rax, rcx G_M24445_IG03: ;; offset=0x0029 vzeroupper ret ; Total bytes of code 45 ; Vector256.UnzipEven<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M4905_IG01: ;; offset=0x0000 G_M4905_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vmovups ymm1, ymmword ptr [reloc @rwd00] vpermt2d ymm0, ymm1, ymmword ptr [r8] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M4905_IG03: ;; offset=0x0019 vzeroupper ret RWD00 dq 0000000200000000h, 0000000600000004h, 0000000A00000008h, 0000000E0000000Ch ; Total bytes of code 29 ; Vector256.UnzipOdd<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M54526_IG01: ;; offset=0x0000 G_M54526_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [rdx] vmovups ymm1, ymmword ptr [reloc @rwd00] vpermt2d ymm0, ymm1, ymmword ptr [r8] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M54526_IG03: ;; offset=0x0019 vzeroupper ret RWD00 dq 0000000300000001h, 0000000700000005h, 0000000B00000009h, 0000000F0000000Dh ; Total bytes of code 29 ; Vector256.Unzip<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M52326_IG01: ;; offset=0x0000 G_M52326_IG02: ;; offset=0x0000 vpshufd ymm0, ymmword ptr [rdx], -40 vpermq ymm0, ymm0, -40 vpshufd ymm1, ymmword ptr [r8], -40 vpermq ymm1, ymm1, -40 vperm2i128 ymm2, ymm0, ymm1, 32 vperm2i128 ymm0, ymm0, ymm1, 49 vmovups ymmword ptr [rcx], ymm2 vmovups ymmword ptr [rcx+0x20], ymm0 mov rax, rcx G_M52326_IG03: ;; offset=0x002F vzeroupper ret ; Total bytes of code 51 ; Vector256.Reverse<int>(vector) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M23794_IG01: ;; offset=0x0000 G_M23794_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [reloc @rwd00] vpermd ymm0, ymm0, ymmword ptr [rdx] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M23794_IG03: ;; offset=0x0014 vzeroupper ret RWD00 dq 0000000600000007h, 0000000400000005h, 0000000200000003h, 0000000000000001h ; Total bytes of code 24 ``` </details> <details> <summary>Vector512</summary> ```asm ; Vector512.CreateGeometricSequence<int>(initial, 3) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M55451_IG01: ;; offset=0x0000 G_M55451_IG02: ;; offset=0x0000 vpbroadcastd zmm0, edx vpmulld zmm0, zmm0, zmmword ptr [reloc @rwd00] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M55451_IG03: ;; offset=0x0019 vzeroupper ret RWD00 dq 0000000300000001h, 0000001B00000009h, 000000F300000051h, 0000088B000002D9h, 00004CE3000019A1h, 0002B3FB0000E6A9h, 001853D300081BF1h, 00DAF26B0048FB79h ; Total bytes of code 29 ; Vector512.CreateAlternatingSequence<int>(even, odd) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M560_IG01: ;; offset=0x0000 G_M560_IG02: ;; offset=0x0000 vpbroadcastd zmm0, edx vmovups zmm1, zmmword ptr [reloc @rwd00] vpbroadcastd zmm2, r8d vpermt2d zmm2, zmm1, zmm0 vmovups zmmword ptr [rcx], zmm2 mov rax, rcx G_M560_IG03: ;; offset=0x0025 vzeroupper ret RWD00 dq 0000000000000010h, 0000000100000011h, 0000000200000012h, 0000000300000013h, 0000000400000014h, 0000000500000015h, 0000000600000016h, 0000000700000017h ; Total bytes of code 41 ; Vector512<int>.SignSequence ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M63542_IG01: ;; offset=0x0000 G_M63542_IG02: ;; offset=0x0000 vbroadcastsd zmm0, qword ptr [reloc @rwd00] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M63542_IG03: ;; offset=0x0013 vzeroupper ret RWD00 dq FFFFFFFF00000001h ; Total bytes of code 23 ; Vector512.CreateHarmonicSequence<float>(start, step) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M41245_IG01: ;; offset=0x0000 G_M41245_IG02: ;; offset=0x0000 vbroadcastss zmm0, zmm2 vmulps zmm0, zmm0, zmmword ptr [reloc @rwd00] vbroadcastss zmm1, zmm1 vaddps zmm0, zmm1, zmm0 vbroadcastss zmm1, dword ptr [reloc @RWD64] vdivps zmm0, zmm1, zmm0 vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M41245_IG03: ;; offset=0x0035 vzeroupper ret RWD00 dq 3F80000000000000h, 4040000040000000h, 40A0000040800000h, 40E0000040C00000h, 4110000041000000h, 4130000041200000h, 4150000041400000h, 4170000041600000h RWD64 dd 3F800000h ; 1 ; Total bytes of code 57 ; Vector512.CreateGeometricSequence<float>(initial, 3.0f) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M35892_IG01: ;; offset=0x0000 G_M35892_IG02: ;; offset=0x0000 vbroadcastss zmm0, zmm1 vmulps zmm0, zmm0, zmmword ptr [reloc @rwd00] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M35892_IG03: ;; offset=0x0019 vzeroupper ret RWD00 dq 404000003F800000h, 41D8000041100000h, 4373000042A20000h, 4508B00044364000h, 4699C60045CD0800h, 482CFEC04766A900h, 49C29E984901BF10h, 4B5AF26B4A91F6F2h ; Total bytes of code 29 ; Vector512.ConcatLowerLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M13789_IG01: ;; offset=0x0000 G_M13789_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vmovdqu ymm1, ymmword ptr [r8] vinserti32x8 zmm0, zmm0, ymm1, 1 vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M13789_IG03: ;; offset=0x001B vzeroupper ret ; Total bytes of code 31 ; Vector512.ConcatLowerUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M36420_IG01: ;; offset=0x0000 G_M36420_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vmovups zmm1, zmmword ptr [r8] vextracti32x8 ymm1, zmm1, 1 vinserti32x8 zmm0, zmm0, ymm1, 1 vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M36420_IG03: ;; offset=0x0023 vzeroupper ret ; Total bytes of code 39 ; Vector512.ConcatUpperLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M56164_IG01: ;; offset=0x0000 G_M56164_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vextracti32x8 ymm0, zmm0, 1 vmovdqu ymm1, ymmword ptr [r8] vinserti32x8 zmm0, zmm0, ymm1, 1 vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M56164_IG03: ;; offset=0x0022 vzeroupper ret ; Total bytes of code 38 ; Vector512.ConcatUpperUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M13821_IG01: ;; offset=0x0000 G_M13821_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vextracti32x8 ymm0, zmm0, 1 vmovups zmm1, zmmword ptr [r8] vinserti32x8 zmm0, zmm1, ymm0, 0 vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M13821_IG03: ;; offset=0x0023 vzeroupper ret ; Total bytes of code 39 ; Vector512.ZipLower<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M59881_IG01: ;; offset=0x0000 G_M59881_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vmovups zmm1, zmmword ptr [reloc @rwd00] vpermt2d zmm0, zmm1, zmmword ptr [r8] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M59881_IG03: ;; offset=0x001F vzeroupper ret RWD00 dq 0000001000000000h, 0000001100000001h, 0000001200000002h, 0000001300000003h, 0000001400000004h, 0000001500000005h, 0000001600000006h, 0000001700000007h ; Total bytes of code 35 ; Vector512.ZipUpper<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M10568_IG01: ;; offset=0x0000 G_M10568_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vmovups zmm1, zmmword ptr [reloc @rwd00] vpermt2d zmm0, zmm1, zmmword ptr [r8] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M10568_IG03: ;; offset=0x001F vzeroupper ret RWD00 dq 0000001800000008h, 0000001900000009h, 0000001A0000000Ah, 0000001B0000000Bh, 0000001C0000000Ch, 0000001D0000000Dh, 0000001E0000000Eh, 0000001F0000000Fh ; Total bytes of code 35 ; Vector512.Zip<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M59130_IG01: ;; offset=0x0000 G_M59130_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vmovups zmm1, zmmword ptr [r8] vpunpckhdq zmm2, zmm0, zmm1 vpunpckldq zmm0, zmm0, zmm1 vshufi32x4 zmm1, zmm0, zmm2, 68 vshufi32x4 zmm1, zmm1, zmm1, -40 vshufi32x4 zmm0, zmm0, zmm2, -18 vshufi32x4 zmm0, zmm0, zmm0, -40 vmovups zmmword ptr [rcx], zmm1 vmovups zmmword ptr [rcx+0x40], zmm0 mov rax, rcx G_M59130_IG03: ;; offset=0x0044 vzeroupper ret ; Total bytes of code 72 ; Vector512.UnzipEven<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M30057_IG01: ;; offset=0x0000 G_M30057_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vmovups zmm1, zmmword ptr [reloc @rwd00] vpermt2d zmm0, zmm1, zmmword ptr [r8] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M30057_IG03: ;; offset=0x001F vzeroupper ret RWD00 dq 0000000200000000h, 0000000600000004h, 0000000A00000008h, 0000000E0000000Ch, 0000001200000010h, 0000001600000014h, 0000001A00000018h, 0000001E0000001Ch ; Total bytes of code 35 ; Vector512.UnzipOdd<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M65310_IG01: ;; offset=0x0000 G_M65310_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vmovups zmm1, zmmword ptr [reloc @rwd00] vpermt2d zmm0, zmm1, zmmword ptr [r8] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M65310_IG03: ;; offset=0x001F vzeroupper ret RWD00 dq 0000000300000001h, 0000000700000005h, 0000000B00000009h, 0000000F0000000Dh, 0000001300000011h, 0000001700000015h, 0000001B00000019h, 0000001F0000001Dh ; Total bytes of code 35 ; Vector512.Unzip<int>(left, right) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M1377_IG01: ;; offset=0x0000 G_M1377_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [rdx] vmovups zmm1, zmmword ptr [reloc @rwd00] vmovups zmm2, zmmword ptr [r8] vmovaps zmm3, zmm2 vpermt2d zmm3, zmm1, zmm0 vmovups zmm1, zmmword ptr [reloc @RWD64] vpermt2d zmm2, zmm1, zmm0 vmovups zmmword ptr [rcx], zmm2 vmovups zmmword ptr [rcx+0x40], zmm3 mov rax, rcx G_M1377_IG03: ;; offset=0x0042 vzeroupper ret RWD00 dq 0000001300000011h, 0000001700000015h, 0000001B00000019h, 0000001F0000001Dh, 0000000300000001h, 0000000700000005h, 0000000B00000009h, 0000000F0000000Dh RWD64 dq 0000001200000010h, 0000001600000014h, 0000001A00000018h, 0000001E0000001Ch, 0000000200000000h, 0000000600000004h, 0000000A00000008h, 0000000E0000000Ch ; Total bytes of code 70 ; Vector512.Reverse<int>(vector) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M34517_IG01: ;; offset=0x0000 G_M34517_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [reloc @rwd00] vpermd zmm0, zmm0, zmmword ptr [rdx] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M34517_IG03: ;; offset=0x0019 vzeroupper ret RWD00 dq 0000000E0000000Fh, 0000000C0000000Dh, 0000000A0000000Bh, 0000000800000009h, 0000000600000007h, 0000000400000005h, 0000000200000003h, 0000000000000001h ; Total bytes of code 29 ``` </details> <details> <summary>ARM64</summary> ```asm ; Vector64.CreateGeometricSequence<int>(initial, 3) ; Emitting BLENDED_CODE for arm64 on Windows G_M6971_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M6971_IG02: ;; offset=0x0008 dup v0.2s, w0 ldr d16, [@rwd00] mul v0.2s, v0.2s, v16.2s G_M6971_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr RWD00 dq 0000000300000001h ; Total bytes of code 28 ; Vector64.CreateAlternatingSequence<int>(even, odd) ; Emitting BLENDED_CODE for arm64 on Windows G_M10160_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M10160_IG02: ;; offset=0x0008 dup v0.2s, w0 dup v16.2s, w1 zip1 v0.2s, v0.2s, v16.2s G_M10160_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 28 ; Vector64<int>.SignSequence ; Emitting BLENDED_CODE for arm64 on Windows G_M46198_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M46198_IG02: ;; offset=0x0008 ldr d0, [@rwd00] G_M46198_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr RWD00 dq FFFFFFFF00000001h ; Total bytes of code 20 ; Vector64.CreateHarmonicSequence<float>(start, step) ; Emitting BLENDED_CODE for arm64 on Windows G_M43901_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M43901_IG02: ;; offset=0x0008 ldr d16, [@rwd00] fmul v16.2s, v16.2s, v1.s[0] dup v0.2s, v0.s[0] fadd v0.2s, v16.2s, v0.2s ldr d16, [@RWD08] fdiv v0.2s, v16.2s, v0.2s G_M43901_IG03: ;; offset=0x0020 ldp fp, lr, [sp], #0x10 ret lr RWD00 dq 3F80000000000000h RWD08 dq 3F8000003F800000h ; Total bytes of code 40 ; Vector64.ConcatLowerLower<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M48061_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M48061_IG02: ;; offset=0x0008 ins v0.s[1], v1.s[0] G_M48061_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector64.ConcatLowerUpper<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M62724_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M62724_IG02: ;; offset=0x0008 ins v0.s[1], v1.s[1] G_M62724_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector64.ConcatUpperLower<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M2660_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M2660_IG02: ;; offset=0x0008 ins v0.s[0], v0.s[1] ins v0.s[1], v1.s[0] G_M2660_IG03: ;; offset=0x0010 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 24 ; Vector64.ConcatUpperUpper<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M55837_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M55837_IG02: ;; offset=0x0008 ins v0.s[0], v0.s[1] ins v0.s[1], v1.s[1] G_M55837_IG03: ;; offset=0x0010 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 24 ; Vector64.ZipLower<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M32585_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M32585_IG02: ;; offset=0x0008 zip1 v0.2s, v0.2s, v1.2s G_M32585_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector64.ZipUpper<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M43976_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M43976_IG02: ;; offset=0x0008 zip2 v0.2s, v0.2s, v1.2s G_M43976_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector64.Zip<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M14030_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M14030_IG02: ;; offset=0x0008 zip1 v16.2s, v0.2s, v1.2s zip2 v1.2s, v0.2s, v1.2s mov v0.8b, v16.8b G_M14030_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 28 ; Vector64.UnzipEven<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M26249_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M26249_IG02: ;; offset=0x0008 uzp1 v0.2s, v0.2s, v1.2s G_M26249_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector64.UnzipOdd<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M25438_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M25438_IG02: ;; offset=0x0008 uzp2 v0.2s, v0.2s, v1.2s G_M25438_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector64.Unzip<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M41813_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M41813_IG02: ;; offset=0x0008 uzp1 v16.2s, v0.2s, v1.2s uzp2 v1.2s, v0.2s, v1.2s mov v0.8b, v16.8b G_M41813_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 28 ; Vector64.Reverse<int>(vector) ; Emitting BLENDED_CODE for arm64 on Windows G_M63937_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M63937_IG02: ;; offset=0x0008 rev64 v0.2s, v0.2s G_M63937_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector128.CreateGeometricSequence<int>(initial, 3) ; Emitting BLENDED_CODE for arm64 on Windows G_M55419_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M55419_IG02: ;; offset=0x0008 dup v0.4s, w0 ldr q16, [@rwd00] mul v0.4s, v0.4s, v16.4s G_M55419_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr RWD00 dq 0000000300000001h, 0000001B00000009h ; Total bytes of code 28 ; Vector128.CreateAlternatingSequence<int>(even, odd) ; Emitting BLENDED_CODE for arm64 on Windows G_M23632_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M23632_IG02: ;; offset=0x0008 dup v0.4s, w0 dup v16.4s, w1 zip1 v0.4s, v0.4s, v16.4s G_M23632_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 28 ; Vector128<int>.SignSequence ; Emitting BLENDED_CODE for arm64 on Windows G_M37526_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M37526_IG02: ;; offset=0x0008 ldr q0, [@rwd00] G_M37526_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr RWD00 dq FFFFFFFF00000001h, FFFFFFFF00000001h ; Total bytes of code 20 ; Vector128.CreateHarmonicSequence<float>(start, step) ; Emitting BLENDED_CODE for arm64 on Windows G_M24925_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M24925_IG02: ;; offset=0x0008 ldr q16, [@rwd00] fmul v16.4s, v16.4s, v1.s[0] dup v0.4s, v0.s[0] fadd v0.4s, v16.4s, v0.4s ldr q16, [@RWD16] fdiv v0.4s, v16.4s, v0.4s G_M24925_IG03: ;; offset=0x0020 ldp fp, lr, [sp], #0x10 ret lr RWD00 dq 3F80000000000000h, 4040000040000000h RWD16 dq 3F8000003F800000h, 3F8000003F800000h ; Total bytes of code 40 ; Vector128.ConcatLowerLower<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M47101_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M47101_IG02: ;; offset=0x0008 ins v0.d[1], v1.d[0] G_M47101_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector128.ConcatLowerUpper<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M48932_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M48932_IG02: ;; offset=0x0008 ext v16.16b, v1.16b, v1.16b, #8 ins v0.d[1], v16.d[0] G_M48932_IG03: ;; offset=0x0010 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 24 ; Vector128.ConcatUpperLower<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M24196_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M24196_IG02: ;; offset=0x0008 ext v0.16b, v0.16b, v0.16b, #8 ins v0.d[1], v1.d[0] G_M24196_IG03: ;; offset=0x0010 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 24 ; Vector128.ConcatUpperUpper<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M45405_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M45405_IG02: ;; offset=0x0008 ext v0.16b, v0.16b, v0.16b, #8 ins v1.d[0], v0.d[0] mov v0.16b, v1.16b G_M45405_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 28 ; Vector128.ZipLower<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M10057_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M10057_IG02: ;; offset=0x0008 zip1 v0.4s, v0.4s, v1.4s G_M10057_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector128.ZipUpper<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M48936_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M48936_IG02: ;; offset=0x0008 zip2 v0.4s, v0.4s, v1.4s G_M48936_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector128.Zip<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M32727_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M32727_IG02: ;; offset=0x0008 zip1 v16.4s, v0.4s, v1.4s zip2 v1.4s, v0.4s, v1.4s mov v0.16b, v16.16b G_M32727_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 28 ; Vector128.UnzipEven<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M32745_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M32745_IG02: ;; offset=0x0008 uzp1 v0.4s, v0.4s, v1.4s G_M32745_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector128.UnzipOdd<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M32638_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M32638_IG02: ;; offset=0x0008 uzp2 v0.4s, v0.4s, v1.4s G_M32638_IG03: ;; offset=0x000C ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 20 ; Vector128.Unzip<int>(left, right) ; Emitting BLENDED_CODE for arm64 on Windows G_M7116_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M7116_IG02: ;; offset=0x0008 uzp1 v16.4s, v0.4s, v1.4s uzp2 v1.4s, v0.4s, v1.4s mov v0.16b, v16.16b G_M7116_IG03: ;; offset=0x0014 ldp fp, lr, [sp], #0x10 ret lr ; Total bytes of code 28 ; Vector128.Reverse<int>(vector) ; Emitting BLENDED_CODE for arm64 on Windows G_M36184_IG01: ;; offset=0x0000 stp fp, lr, [sp, #-0x10]! mov fp, sp G_M36184_IG02: ;; offset=0x0008 ldr q16, [@rwd00] tbl v0.16b, {v0.16b}, v16.16b G_M36184_IG03: ;; offset=0x0010 ldp fp, lr, [sp], #0x10 ret lr RWD00 dq 0B0A09080F0E0D0Ch, 0302010007060504h ; Total bytes of code 24 ``` </details> Codegen for constant input: <details> <summary>Vector256</summary> ```asm ; Vector256.CreateGeometricSequence<int>(2, 3) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M53517_IG01: ;; offset=0x0000 G_M53517_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [reloc @rwd00] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M53517_IG03: ;; offset=0x000F vzeroupper ret RWD00 dq 0000000600000002h, 0000003600000012h, 000001E6000000A2h, 00001116000005B2h ; Total bytes of code 19 ; Vector256.CreateAlternatingSequence<int>(5, 7) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M39644_IG01: ;; offset=0x0000 G_M39644_IG02: ;; offset=0x0000 vbroadcastsd ymm0, qword ptr [reloc @rwd00] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M39644_IG03: ;; offset=0x0010 vzeroupper ret RWD00 dq 0000000700000005h ; Total bytes of code 20 ; Vector256.CreateHarmonicSequence<float>(1.0f, 2.0f) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M64276_IG01: ;; offset=0x0000 G_M64276_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [reloc @rwd00] vmovups ymmword ptr [rcx], ymm0 mov rax, rcx G_M64276_IG03: ;; offset=0x000F vzeroupper ret RWD00 dq 3EAAAAAB3F800000h, 3E1249253E4CCCCDh, 3DBA2E8C3DE38E39h, 3D8888893D9D89D9h ; Total bytes of code 19 ``` </details> <details> <summary>Vector512</summary> ```asm ; Vector512.CreateGeometricSequence<int>(2, 3) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M22445_IG01: ;; offset=0x0000 G_M22445_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [reloc @rwd00] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M22445_IG03: ;; offset=0x0013 vzeroupper ret RWD00 dq 0000000600000002h, 0000003600000012h, 000001E6000000A2h, 00001116000005B2h, 000099C600003342h, 000567F60001CD52h, 0030A7A6001037E2h, 01B5E4D60091F6F2h ; Total bytes of code 23 ; Vector512.CreateAlternatingSequence<int>(5, 7) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M9468_IG01: ;; offset=0x0000 G_M9468_IG02: ;; offset=0x0000 vbroadcastsd zmm0, qword ptr [reloc @rwd00] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M9468_IG03: ;; offset=0x0013 vzeroupper ret RWD00 dq 0000000700000005h ; Total bytes of code 23 ; Vector512.CreateHarmonicSequence<float>(1.0f, 2.0f) ; Emitting BLENDED_CODE for x64 + VEX + EVEX on Windows G_M13716_IG01: ;; offset=0x0000 G_M13716_IG02: ;; offset=0x0000 vmovups zmm0, zmmword ptr [reloc @rwd00] vmovups zmmword ptr [rcx], zmm0 mov rax, rcx G_M13716_IG03: ;; offset=0x0013 vzeroupper ret RWD00 dq 3EAAAAAB3F800000h, 3E1249253E4CCCCDh, 3DBA2E8C3DE38E39h, 3D8888893D9D89D9h, 3D5794363D70F0F1h, 3D3216433D430C31h, 3D17B4263D23D70Ah, 3D0421083D0D3DCBh ; Total bytes of code 23 ``` </details> <details> <summary>Vector512 without AVX512 - Vector256 decomposition path</summary> ```asm ; Vector512.CreateGeometricSequence<int>(2, 3) ; Emitting BLENDED_CODE for x64 + VEX on Windows G_M22445_IG01: ;; offset=0x0000 G_M22445_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [reloc @rwd00] vmovups ymm1, ymmword ptr [reloc @RWD32] vmovups ymmword ptr [rcx], ymm1 vmovups ymmword ptr [rcx+0x20], ymm0 mov rax, rcx G_M22445_IG03: ;; offset=0x001C vzeroupper ret RWD00 dq 000099C600003342h, 000567F60001CD52h, 0030A7A6001037E2h, 01B5E4D60091F6F2h RWD32 dq 0000000600000002h, 0000003600000012h, 000001E6000000A2h, 00001116000005B2h ; Total bytes of code 32 ; Vector512.CreateAlternatingSequence<int>(5, 7) ; Emitting BLENDED_CODE for x64 + VEX on Windows G_M9468_IG01: ;; offset=0x0000 G_M9468_IG02: ;; offset=0x0000 vbroadcastsd ymm0, qword ptr [reloc @rwd00] vmovups ymmword ptr [rcx], ymm0 vmovups ymmword ptr [rcx+0x20], ymm0 mov rax, rcx G_M9468_IG03: ;; offset=0x0015 vzeroupper ret RWD00 dq 0000000700000005h ; Total bytes of code 25 ; Vector512.CreateHarmonicSequence<float>(1.0f, 2.0f) ; Emitting BLENDED_CODE for x64 + VEX on Windows G_M13716_IG01: ;; offset=0x0000 G_M13716_IG02: ;; offset=0x0000 vmovups ymm0, ymmword ptr [reloc @rwd00] vmovups ymmword ptr [rcx], ymm0 vmovups ymm0, ymmword ptr [reloc @RWD32] vmovups ymmword ptr [rcx+0x20], ymm0 mov rax, rcx G_M13716_IG03: ;; offset=0x001C vzeroupper ret RWD00 dq 3EAAAAAB3F800000h, 3E1249253E4CCCCDh, 3DBA2E8C3DE38E39h, 3D8888893D9D89D9h RWD32 dq 3D5794363D70F0F1h, 3D3216433D430C31h, 3D17B4263D23D70Ah, 3D0421083D0D3DCBh ; Total bytes of code 32 ``` </details> Closes dotnet#122557 cc: @tannergooding
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Note
This pull request description was generated with GitHub Copilot.
Summary
Fixes dotnet#120902.
This removes the
instrument_if_optimizingscenario from thelibraries-pgopipeline. That mode is still defined in the shared test infrastructure and remains available to the coreclr test jobs that use it; this change only stops scheduling it for libraries tests, where it has been causing repeated Helix timeouts.Testing
Pipeline configuration change only.