Sync dev branch with main on 6.18 kernel #148
Merged
added 10 commits
June 23, 2026 08:37
Add NixOS flake configuration and helper scripts for reproducible kernel builds.

Files added:
- flake.nix: Nix environment with pinned toolchain (GCC 13.2.0, binutils, etc.)
- flake.lock: Locked package versions for reproducibility
- Microsoft/nix-setup.sh: One-time Nix installation helper
- Microsoft/nix-clean.sh: Build artifact cleanup
- .gitignore: Add Nix-related entries

This establishes the foundation for bit-reproducible kernel builds across different machines by providing a hermetic build environment with pinned dependencies.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Add nix-build.sh that orchestrates reproducible kernel builds using the Nix environment established in the previous commit.

Features:
- Pure Nix environment with --ignore-environment flag
- Fixed build paths for reproducible absolute path embeddings
- Reproducible environment variables:
  - SOURCE_DATE_EPOCH=<timestamp of the top git commit>
  - KBUILD_BUILD_USER=builder
  - KBUILD_BUILD_HOST=nixos
  - KBUILD_BUILD_VERSION=1
- Copies source to fixed path to ensure identical embedded paths
- Invokes build-hcl-kernel.sh within the controlled environment
- Copies artifacts back to original location
- Cleanup on exit

Usage:
  ./Microsoft/nix-build.sh x64         # Build x64 kernel
  ./Microsoft/nix-build.sh arm64       # Build arm64 kernel
  ./Microsoft/nix-build.sh x64 cvm     # Build x64 cvm kernel
  ./Microsoft/nix-build.sh arm64 cvm   # Build arm64 cvm kernel

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
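The reproducibility variables listed in this commit message could be set up roughly as follows. This is a minimal sketch, not the contents of nix-build.sh: the function name `setup_repro_env` and the fallback to 0 outside a git checkout are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not taken from nix-build.sh): export the variables
# that the commit message lists for a reproducible kernel build.
setup_repro_env() {
    src_dir="${1:-.}"
    # Commit timestamp of HEAD in seconds since the epoch. The fallback to 0
    # when src_dir is not a git checkout is an assumption of this sketch.
    SOURCE_DATE_EPOCH="$(git -C "$src_dir" log -1 --format=%ct 2>/dev/null || echo 0)"
    export SOURCE_DATE_EPOCH
    export KBUILD_BUILD_USER=builder
    export KBUILD_BUILD_HOST=nixos
    export KBUILD_BUILD_VERSION=1
}
```

Calling `setup_repro_env "$PWD"` before invoking the build would give two machines the same values as long as they have the same commit checked out.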
Enhance build-hcl-kernel.sh to support reproducible builds when invoked from nix-build.sh or other reproducible environments.

Changes:
- Detect host architecture to avoid unnecessary cross-compilation
- Set CC explicitly to gcc/cross-compiler for Nix toolchain
- Add LOCALVERSION= to prevent '+' suffix in version string
- Add KCFLAGS=-fdebug-prefix-map to normalize debug paths
- Add SHA256 checksum output of vmlinux for verification
- Remove KBUILD_BUILD_ID=none (not needed)

When REPRODUCIBLE_BUILD=1:
- Uses Nix's gcc instead of system gcc for native builds
- Only uses cross-compiler when actually cross-compiling
- Ensures consistent compiler identification in kernel binary

Otherwise, let users continue using this script for dev work as before.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
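The LOCALVERSION, KCFLAGS, and checksum changes could look something like the sketch below. The helper names are hypothetical, and the commit message does not show the value passed to -fdebug-prefix-map, so mapping the source directory to `.` is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the extra make arguments described in the commit
# message, one per line. The mapping "<src>=." is an assumption of this sketch.
repro_make_args() {
    src_dir="$1"
    # LOCALVERSION set but empty keeps the '+' suffix out of the version string.
    printf '%s\n' "LOCALVERSION="
    # Rewrite the absolute source path that GCC records in debug info.
    printf '%s\n' "KCFLAGS=-fdebug-prefix-map=${src_dir}=."
}

# Hypothetical helper: print only the SHA256 digest of a built image.
image_sha256() {
    sha256sum "$1" | cut -d' ' -f1
}
```

The two arguments would be appended to the `make` command line, and `image_sha256 vmlinux` printed at the end of the build so that two runs can be compared by eye or by script.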
Add build-hcl-kernel-pipeline.sh for Azure DevOps CI integration with reproducible build support.

Features:
- Supports amd64 and arm64 architectures
- CVM config merge support via merge_cvm_config()
- Optional reproducible build mode (--reproducible flag)
- Generates kernel, headers, modules, and debug symbols
- Progress indicators for build stages [1/5] through [5/5]
- SHA256 checksum output for reproducibility verification

Key differences from build-hcl-kernel.sh:
- Standalone script that doesn't depend on nix-build.sh wrapper
- Implements complete build workflow in one script
- Uses KBUILD_OUTPUT=$BUILD_DIR/linux subdirectory structure
- Handles CVM config merging inline
- Moves artifacts from /linux subdirectory to BUILD_DIR root for pipeline
- When --reproducible: sets up Nix environment and reproducible variables

Build directory structure:
- $BUILD_DIR/linux/            # KBUILD_OUTPUT during build
- $BUILD_DIR/vmlinux           # Final artifacts at root
- $BUILD_DIR/linux-headers/
- $BUILD_DIR/debug_symbols/

Usage:
  ./build-hcl-kernel-pipeline.sh -s <source> -b <build> -c <config> -a <arch>
  ./build-hcl-kernel-pipeline.sh ... --reproducible
  ./build-hcl-kernel-pipeline.sh ... --cvm-config <config>

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Use the Nix-pinned toolchain in build_linux so x64, cvm-x64, and arm64 produce deterministic outputs for a given commit.

- Build via Microsoft/build-hcl-kernel-pipeline.sh --reproducible.
- Set SOURCE_DATE_EPOCH from the commit timestamp.
- Add Microsoft/package-ci-artifacts.sh for deterministic packaging and sha256 reporting.
- Keep build_perf and create_release behavior unchanged.
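Deterministic packaging usually means fixing entry order, timestamps, and ownership in the archive. The sketch below shows one common way to do that with GNU tar and gzip. Microsoft/package-ci-artifacts.sh is not shown in this PR text, so the function name and the choice of tar.gz are assumptions, not a description of that script.

```shell
#!/bin/sh
# Hypothetical helper (package-ci-artifacts.sh itself is not shown here):
# create a tar.gz whose bytes depend only on file names, contents, and modes.
pack_deterministic() {
    in_dir="$1"
    out_file="$2"
    # GNU tar: fixed entry order, fixed mtime, fixed numeric ownership.
    # gzip -n: leave the original name and timestamp out of the gzip header.
    tar --sort=name \
        --mtime="@${SOURCE_DATE_EPOCH:-0}" \
        --owner=0 --group=0 --numeric-owner \
        -C "$in_dir" -cf - . | gzip -n > "$out_file"
}

# Report the digest of a packaged artifact.
report_sha256() {
    sha256sum "$1"
}
```

With SOURCE_DATE_EPOCH exported from the commit timestamp, packaging the same directory contents twice should yield the same digest even if the files were written at different times. The output file should live outside the input directory.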
- Build each Linux arch variant twice in parallel (run1/run2)
- Preserve per-run logs and artifact uploads
- Add compare_repro job to diff full artifact hashes across runs
- Keep release creation tag-gated and publish artifacts from run1
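A comparison step of the kind compare_repro performs can be sketched as hashing every file under each run directory and checking that the two listings match. The helper names below are hypothetical and the actual job definition is not shown in this PR text.

```shell
#!/bin/sh
# Hypothetical helpers: hash every file under a run directory and compare runs.
hash_tree() {
    # Emit "<sha256>  ./<path>" lines, sorted by path in a locale-independent order.
    (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2)
}

compare_runs() {
    run1_hashes="$(hash_tree "$1")"
    run2_hashes="$(hash_tree "$2")"
    if [ "$run1_hashes" = "$run2_hashes" ]; then
        echo "reproducible"
    else
        echo "mismatch"
        return 1
    fi
}
```

Running `compare_runs run1 run2` returns non-zero on any difference in file set or content, which is enough to fail a CI job.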
Since v6.15 (aed877c, d3f7922), GUP no longer takes a pgmap reference for ZONE_DEVICE pages and walks huge entries through the unified folio path. With vmf_insert_pfn_{pmd,pud}() the mapping holds no folio reference, so a zap racing with pin_user_pages_fast() can briefly drop the folio refcount to 0 and trigger a WARN in try_grab_folio() with the I/O failing as -ENOMEM.

Switch the PMD/PUD fault paths to vmf_insert_folio_{pmd,pud}(), mirroring drivers/dax/device.c. Each map takes folio_get(); the matching folio_put() in zap keeps the refcount above 0.

Gate the huge inserters on pfn_valid() + ZONE_DEVICE + MEMORY_DEVICE_GENERIC via mshv_vtl_low_resolve_page(); fall back to VM_FAULT_FALLBACK when the folio order does not match PMD_ORDER/PUD_ORDER or the PFN is not yet pgmap-backed, so the core can retry at smaller order.

Add VM_DONTEXPAND to the VMA to block mremap() growth past the pgmap.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Extend the folio-aware fault path to the 4K case so GUP into /dev/mshv_vtl_low works after MSHV_ADD_VTL0_MEMORY has registered the range.

With the previous vmf_insert_mixed() path the PTE was always pte_special, vm_normal_page() returned NULL during pin_user_pages*(), follow_pfn_pte() returned -EEXIST, and io_uring O_DIRECT surfaced it as "disk io error: io error: File exists (os error 17)" on the first DMA into a freshly-registered VTL0 chunk.

The 4K path now resolves the PFN via mshv_vtl_low_resolve_page(): when backed by an mshv_vtl pgmap the PTE is installed with vmf_insert_page_mkwrite(), giving GUP a normal pinnable page; otherwise it falls back to vmf_insert_mixed() so early CPU accesses (e.g. the VTL2 guest-memory self test reading GPA 0 before any add_vtl0_mem ioctl) still succeed instead of SIGBUSing.

Such fallback PTEs would persist across registration and break later GUP. Capture the cdev's address_space on first open and, on successful MSHV_ADD_VTL0_MEMORY, invalidate the file-offset range via unmap_mapping_range() for both the encrypted (pfn) and decrypted (pfn | DECRYPTED_MASK) aliases that mshv_vtl_low_mmap() exposes. The next access re-faults into the folio path and GUP works.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
memremap_pages() makes a pgmap visible to get_dev_pagemap() before arch_add_memory() populates the vmemmap. A concurrent mshv_vtl_low_huge_fault() running while another thread is still inside MSHV_ADD_VTL0_MEMORY can resolve a pfn whose struct page sits behind an empty vmemmap PMD, oopsing on the first page_folio() deref:

  BUG: unable to handle page fault for address: ffffea000404ca08
  PGD ... PUD ... PMD 0
  RIP: 0010:mshv_vtl_low_huge_fault+0x4b/0x240
  Call Trace:
   mshv_vtl_low_fault+0xb/0x10
   __do_fault+0x32/0xa0
   __handle_mm_fault+0xc2f/0x2110

Replace get_dev_pagemap()-based resolution with a driver-owned RCU list of completed VTL0 ranges. Each range is added only after devm_memremap_pages() returns, so a hit guarantees the vmemmap is populated and the struct page is initialized. Entries are never removed (pgmaps live for the life of the module).

Fixes: 775741a ("Drivers: hv: mshv_vtl: use folio-aware inserters for huge VTL0 mappings")
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
mshv_vtl_exit() calls misc_deregister(&mshv_vtl_sint_dev) and misc_deregister(&mshv_vtl_low) twice. The first pair (added when the TDX APIC handling code was introduced) is redundant: the same deregistrations are performed again a few lines below, in the order that mirrors the registration sequence in mshv_vtl_init().

Calling misc_deregister() twice on the same struct miscdevice is not safe -- it ends up doing list_del() twice on the device's misc_list node, corrupting the global list and yielding 'list_del corruption' splats on rmmod (or any module exit path).

Drop the redundant calls so each device is deregistered exactly once, in the reverse order of registration.

Fixes: 06eb1e3 ("mshv_vtl/tdx: Handle some APIC functionality in kernel")
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
hargar19 approved these changes on Jun 24, 2026
Bridge the diff between the main and dev branches by bringing the changes from main into dev.