[RLC-10] Rebase Custom Changes to rlc-10/6.12.0-124.28.1.el10_1 #839
Merged
roxanan1996
merged 18 commits into
rlc-10/6.12.0-124.28.1.el10_1
from
jmaple_rlc-10/6.12.0-124.28.1.el10_1
Jan 30, 2026
jira LE-3207 feature tools_hv commit-author Shradha Gupta <[email protected]> commit a9c0b33 Allow the KVP daemon to log the KVP updates triggered in the VM with a new debug flag(-d). When the daemon is started with this flag, it logs updates and debug information in syslog with loglevel LOG_DEBUG. This information comes in handy for debugging issues where the key-value pairs for certain pools show mismatch/incorrect values. The distro-vendors can further consume these changes and modify the respective service files to redirect the logs to specific files as needed. Signed-off-by: Shradha Gupta <[email protected]> Reviewed-by: Naman Jain <[email protected]> Reviewed-by: Dexuan Cui <[email protected]> Link: https://lore.kernel.org/r/1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com Signed-off-by: Wei Liu <[email protected]> Message-ID: <1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com> (cherry picked from commit a9c0b33) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4478 commit-author Shiraz Saleem <[email protected]> commit baa640d Add support for mana device level statistics. Co-developed-by: Solom Tamawy <[email protected]> Signed-off-by: Solom Tamawy <[email protected]> Signed-off-by: Shiraz Saleem <[email protected]> Signed-off-by: Konstantin Taranov <[email protected]> Link: https://patch.msgid.link/[email protected] Reviewed-by: Long Li <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> (cherry picked from commit baa640d) Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4467 commit-author Shradha Gupta <[email protected]> commit 5da8a8b For supporting dynamic MSI-X vector allocation by PCI controllers, enabling the flag MSI_FLAG_PCI_MSIX_ALLOC_DYN is not enough, msix_prepare_msi_desc() to prepare the MSI descriptor is also needed. Export pci_msix_prepare_desc() to allow PCI controllers to support dynamic MSI-X vector allocation. Signed-off-by: Shradha Gupta <[email protected]> Reviewed-by: Haiyang Zhang <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Reviewed-by: Saurabh Sengar <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> (cherry picked from commit 5da8a8b) Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4467 commit-author Shradha Gupta <[email protected]> commit ad518f2 Allow dynamic MSI-X vector allocation for pci_hyperv PCI controller by adding support for the flag MSI_FLAG_PCI_MSIX_ALLOC_DYN and using pci_msix_prepare_desc() to prepare the MSI-X descriptors. Feature support added for both x86 and ARM64 Signed-off-by: Shradha Gupta <[email protected]> Reviewed-by: Haiyang Zhang <[email protected]> Reviewed-by: Saurabh Sengar <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> (cherry picked from commit ad518f2) Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4467 commit-author Yury Norov <[email protected]> commit 4607617 Commit 91bfe21 ("net: mana: add a function to spread IRQs per CPUs") added the irq_setup() function that distributes IRQs on CPUs according to a tricky heuristic. The corresponding commit message explains the heuristic. Duplicate it in the source code to make it available to readers without digging in git history. Also, add a more detailed explanation of how the heuristic is implemented. Signed-off-by: Yury Norov <[email protected]> Signed-off-by: Shradha Gupta <[email protected]> (cherry picked from commit 4607617) Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4467 commit-author Shradha Gupta <[email protected]> commit 845c62c In order to prepare the MANA driver to allocate the MSI-X IRQs dynamically, we need to enhance irq_setup() to allow skipping affinitizing IRQs to the first CPU sibling group. This would be for cases when the number of IRQs is less than or equal to the number of online CPUs. In such cases for dynamically added IRQs the first CPU sibling group would already be affinitized with HWC IRQ. Signed-off-by: Shradha Gupta <[email protected]> Reviewed-by: Haiyang Zhang <[email protected]> Reviewed-by: Yury Norov [NVIDIA] <[email protected]> (cherry picked from commit 845c62c) Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4467 commit-author Shradha Gupta <[email protected]> commit 7553911 upstream-diff There were conflicts seen when applying this patch due to following commit present in our tree before this patch. 590bcf1 ("net: mana: Add handler for hardware servicing events") Currently, the MANA driver allocates MSI-X vectors statically based on MANA_MAX_NUM_QUEUES and num_online_cpus() values and in some cases ends up allocating more vectors than it needs. This is because, by this time we do not have a HW channel and do not know how many IRQs should be allocated. To avoid this, we allocate 1 MSI-X vector during the creation of HWC and after getting the value supported by hardware, dynamically add the remaining MSI-X vectors. Signed-off-by: Shradha Gupta <[email protected]> Reviewed-by: Haiyang Zhang <[email protected]> (cherry picked from commit 7553911) Signed-off-by: Shreeya Patel <[email protected]> Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4473
commit-author Erni Sri Satya Vennela <[email protected]>
commit 75cabb4

Introduce support for net_shaper_ops in the MANA driver, enabling configuration of rate limiting on the MANA NIC.

To apply rate limiting, the driver issues a HWC command via mana_set_bw_clamp() and updates the corresponding shaper object in the net_shaper cache. If an error occurs during this process, the driver restores the previous speed by querying the current link configuration using mana_query_link_cfg().

The minimum supported bandwidth is 100 Mbps, and only values that are exact multiples of 100 Mbps are allowed. Any other values are rejected.

To remove a shaper, the driver resets the bandwidth to the maximum supported by the SKU using mana_set_bw_clamp() and clears the associated cache entry. If an error occurs during this process, the shaper details are retained.

On the hardware that does not support these APIs, the net-shaper calls to set speed would fail.

Set the speed:
    ./tools/net/ynl/pyynl/cli.py \
        --spec Documentation/netlink/specs/net_shaper.yaml \
        --do set --json '{"ifindex":'$IFINDEX', "handle":{"scope": "netdev", "id":'$ID' }, "bw-max": 200000000 }'

Get the shaper details:
    ./tools/net/ynl/pyynl/cli.py \
        --spec Documentation/netlink/specs/net_shaper.yaml \
        --do get --json '{"ifindex":'$IFINDEX', "handle":{"scope": "netdev", "id":'$ID' }}'

    > {'bw-max': 200000000,
    > 'handle': {'scope': 'netdev'},
    > 'ifindex': $IFINDEX,
    > 'metric': 'bps'}

Delete the shaper object:
    ./tools/net/ynl/pyynl/cli.py \
        --spec Documentation/netlink/specs/net_shaper.yaml \
        --do delete --json '{"ifindex":'$IFINDEX', "handle":{"scope": "netdev","id":'$ID' }}'

Signed-off-by: Erni Sri Satya Vennela <[email protected]>
Reviewed-by: Haiyang Zhang <[email protected]>
Reviewed-by: Shradha Gupta <[email protected]>
Reviewed-by: Saurabh Singh Sengar <[email protected]>
Reviewed-by: Long Li <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit 75cabb4)
Signed-off-by: Shreeya Patel <[email protected]>
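The bandwidth rule in the commit message above (minimum 100 Mbps, exact multiples of 100 Mbps only) can be modelled in a few lines. This is a userspace sketch for checking a `bw-max` value before passing it to `cli.py`, not the driver's code; `validate_bw_max` and the constant names are hypothetical, and the only facts taken from the commit message are the 100 Mbps step and the `bps` metric.

```python
# Hypothetical helper: mirrors the rule stated in the commit message,
# not the actual check inside the MANA driver.
MBPS = 1_000_000          # bits per second in one Mbps
STEP_BPS = 100 * MBPS     # 100 Mbps minimum and step, per the commit message


def validate_bw_max(bw_max_bps: int) -> int:
    """Return the rate in Mbps, or raise ValueError if it would be rejected."""
    if bw_max_bps < STEP_BPS:
        raise ValueError("bw-max is below the 100 Mbps minimum")
    if bw_max_bps % STEP_BPS != 0:
        raise ValueError("bw-max is not an exact multiple of 100 Mbps")
    return bw_max_bps // MBPS


if __name__ == "__main__":
    # The value used in the "Set the speed" example is 200 Mbps.
    print(validate_bw_max(200_000_000))
```

Under this model the `bw-max` of 200000000 from the example passes, while a value such as 150000000 would be rejected.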
jira LE-4473
commit-author Erni Sri Satya Vennela <[email protected]>
commit a6d5edf

Allow mana ethtool get_link_ksettings operation to report the maximum speed supported by the SKU in Mbps.

The driver retrieves this information by issuing a HWC command to the hardware via mana_query_link_cfg(), which retrieves the SKU's maximum supported speed.

When these APIs are invoked on older hardware that does not support them, the speed is reported as UNKNOWN.

Before:
$ethtool enP30832s1
> Settings for enP30832s1:
        Supported ports: [ ]
        Supported link modes: Not reported
        Supported pause frame use: No
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes: Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Full
        Auto-negotiation: off
        Port: Other
        PHYAD: 0
        Transceiver: internal
        Link detected: yes

After:
$ethtool enP30832s1
> Settings for enP30832s1:
        Supported ports: [ ]
        Supported link modes: Not reported
        Supported pause frame use: No
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes: Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 16000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: Other
        PHYAD: 0
        Transceiver: internal
        Link detected: yes

Signed-off-by: Erni Sri Satya Vennela <[email protected]>
Reviewed-by: Haiyang Zhang <[email protected]>
Reviewed-by: Shradha Gupta <[email protected]>
Reviewed-by: Saurabh Singh Sengar <[email protected]>
Reviewed-by: Long Li <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit a6d5edf)
Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4473 commit-author Erni Sri Satya Vennela <[email protected]> commit ca8ac48 upstream-diff There were conflicts seen when applying this patch due to the following patch being in our tree before this one. 7a3c235 ("net: mana: Handle Reset Request from MANA NIC") If any of the HWC commands are not recognized by the underlying hardware, the hardware returns the response header status of -1. Log the information using netdev_info_once to avoid multiple error logs in dmesg. Signed-off-by: Erni Sri Satya Vennela <[email protected]> Reviewed-by: Haiyang Zhang <[email protected]> Reviewed-by: Shradha Gupta <[email protected]> Reviewed-by: Saurabh Singh Sengar <[email protected]> Reviewed-by: Dipayaan Roy <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> (cherry picked from commit ca8ac48) Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4473
commit-author Erni Sri Satya Vennela <[email protected]>
commit 11cd020

Fix build errors when CONFIG_NET_SHAPER is disabled, including:

drivers/net/ethernet/microsoft/mana/mana_en.c:804:10: error: 'const struct net_device_ops' has no member named 'net_shaper_ops'
  804 | .net_shaper_ops = &mana_shaper_ops,

drivers/net/ethernet/microsoft/mana/mana_en.c:804:35: error: initialization of 'int (*)(struct net_device *, struct neigh_parms *)' from incompatible pointer type 'const struct net_shaper_ops *' [-Werror=incompatible-pointer-types]
  804 | .net_shaper_ops = &mana_shaper_ops,

Signed-off-by: Erni Sri Satya Vennela <[email protected]>
Fixes: 75cabb4 ("net: mana: Add support for net_shaper_ops")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 11cd020)
Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4527 commit-author Zhiyue Qiu <[email protected]> commit 084f35b Add packet and request port counters to mana_ib. Signed-off-by: Zhiyue Qiu <[email protected]> Signed-off-by: Konstantin Taranov <[email protected]> Link: https://patch.msgid.link/[email protected] Reviewed-by: Long Li <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> (cherry picked from commit 084f35b) Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4524 commit-author Konstantin Taranov <[email protected]> commit 44d69d3 Drain send WRs of the GSI QP on device removal. In rare servicing scenarios, the hardware may delete the state of the GSI QP, preventing it from generating CQEs for pending send WRs. Since WRs submitted to the GSI QP hold CM resources, the device cannot be removed until those WRs are completed. This patch marks all pending send WRs as failed, allowing the GSI QP to release the CM resources and enabling safe device removal. Signed-off-by: Konstantin Taranov <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> (cherry picked from commit 44d69d3) Signed-off-by: Shreeya Patel <[email protected]>
…nnel open. jira LE-4494 commit-author Dipayaan Roy <[email protected]> commit 9448ccd The hv_netvsc driver currently enables NAPI after opening the primary and subchannels. This ordering creates a race: if the Hyper-V host places data in the host -> guest ring buffer and signals the channel before napi_enable() has been called, the channel callback will run but napi_schedule_prep() will return false. As a result, the NAPI poller never gets scheduled, the data in the ring buffer is not consumed, and the receive queue may remain permanently stuck until another interrupt happens to arrive. Fix this by enabling NAPI and registering it with the RX/TX queues before vmbus channel is opened. This guarantees that any early host signal after open will correctly trigger NAPI scheduling and the ring buffer will be drained. Fixes: 76bb5db ("netvsc: fix use after free on module removal") Signed-off-by: Dipayaan Roy <[email protected]> Link: https://patch.msgid.link/20250825115627.GA32189@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net Signed-off-by: Jakub Kicinski <[email protected]> (cherry picked from commit 9448ccd) Signed-off-by: Shreeya Patel <[email protected]>
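The ordering race described in the commit message above can be illustrated with a toy model. This is only a sketch of the reasoning, not hv_netvsc code: `NapiModel`, `channel_callback` and `run` are hypothetical names, and the single assumption carried over from the commit message is that the scheduling check returns false while NAPI is not yet enabled.

```python
class NapiModel:
    """Toy stand-in for a NAPI context; only tracks whether it is enabled."""

    def __init__(self):
        self.enabled = False

    def enable(self):
        self.enabled = True

    def schedule_prep(self):
        # Modelled on the commit message: scheduling is refused
        # until NAPI has been enabled.
        return self.enabled


def channel_callback(napi, ring):
    """Host signalled the channel: drain the ring only if NAPI can be scheduled."""
    if napi.schedule_prep():
        ring.clear()


def run(enable_before_open):
    """Return how many entries are left stuck in the ring after one host signal."""
    napi = NapiModel()
    ring = []
    if enable_before_open:
        napi.enable()            # fixed ordering: enable NAPI first
    # The channel is open from here on; the host places data and signals at once.
    ring.append("pkt")
    channel_callback(napi, ring)
    if not enable_before_open:
        napi.enable()            # old ordering: too late, the signal was already lost
    return len(ring)


if __name__ == "__main__":
    print("old ordering, entries left in ring:", run(enable_before_open=False))
    print("new ordering, entries left in ring:", run(enable_before_open=True))
```

In the model, enabling after the open leaves the entry in the ring until some later signal arrives, which is the stuck-queue symptom the commit describes.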
jira LE-4497 commit-author Haiyang Zhang <[email protected]> commit c4deabb If HW Channel (HWC) is not responding, reduce the waiting time, so further steps will fail quickly. This will prevent getting stuck for a long time (30 minutes or more), for example, during unloading while HWC is not responding. Signed-off-by: Haiyang Zhang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> (cherry picked from commit c4deabb) Signed-off-by: Shreeya Patel <[email protected]>
jira LE-4521 commit-author Shiraz Saleem <[email protected]> commit 2bd7dd3 Extend modify QP to support further attributes: local_ack_timeout, UD qkey, rate_limit, qp_access_flags, flow_label, max_rd_atomic. Signed-off-by: Shiraz Saleem <[email protected]> Signed-off-by: Konstantin Taranov <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> (cherry picked from commit 2bd7dd3) Signed-off-by: Shreeya Patel <[email protected]>
…/O issuing CPU jira LE-4537 commit-author Long Li <[email protected]> commit b69ffea When selecting an outgoing channel for I/O, storvsc tries to select a channel with a returning CPU that is not the same as the issuing CPU. This worked well in the past, but it does not work well when Hyper-V exposes a large number of channels (up to the number of all CPUs). Using a different CPU for the returning channel is not efficient on Hyper-V. Change this behavior by preferring the channel with the same CPU as the current I/O issuing CPU whenever possible. Tests have shown improvements in newer Hyper-V/Azure environments, and no regression with older Hyper-V/Azure environments. Tested-by: Raheel Abdul Faizy <[email protected]> Signed-off-by: Long Li <[email protected]> Message-Id: <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> (cherry picked from commit b69ffea) Signed-off-by: Shreeya Patel <[email protected]>
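The selection preference described in the commit message above can be sketched as follows. This is an illustration of the idea only, not the storvsc implementation: `pick_channel` is a hypothetical name, and the fallback used when no channel matches the issuing CPU (a simple modulo spread) is an assumption made for the example.

```python
from typing import Sequence


def pick_channel(issuing_cpu: int, channel_cpus: Sequence[int]) -> int:
    """Return the index of the channel to use for an I/O issued on issuing_cpu.

    channel_cpus[i] is the CPU that channel i returns completions on.
    """
    # Preferred case from the commit message: a channel whose returning
    # CPU is the same as the CPU issuing the I/O.
    for index, cpu in enumerate(channel_cpus):
        if cpu == issuing_cpu:
            return index
    # Assumed fallback for this sketch only: spread by CPU number.
    return issuing_cpu % len(channel_cpus)


if __name__ == "__main__":
    # With one channel per CPU, every I/O stays on its issuing CPU.
    print(pick_channel(2, [0, 1, 2, 3]))
    # With fewer channels than CPUs, the assumed fallback is used.
    print(pick_channel(5, [0, 2]))
```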
…es to improve memory efficiency.

jira LE-4490
commit-author Dipayaan Roy <[email protected]>
commit 730ff06
upstream-diff This patch was causing build failures due to missing commit 0f92140 ("memory-provider: dmabuf devmem memory provider") To fix it, we have removed pprm.queue_idx parameter which seems redundant in this case.

This patch enhances RX buffer handling in the mana driver by allocating pages from a page pool and slicing them into MTU-sized fragments, rather than dedicating a full page per packet. This approach is especially beneficial on systems with large base page sizes like 64KB.

Key improvements:
- Proper integration of page pool for RX buffer allocations.
- MTU-sized buffer slicing to improve memory utilization.
- Reduce overall per Rx queue memory footprint.
- Automatic fallback to full-page buffers when:
  * Jumbo frames are enabled (MTU > PAGE_SIZE / 2).
  * The XDP path is active, to avoid complexities with fragment reuse.

Testing on VMs with 64KB pages shows around 200% throughput improvement. Memory efficiency is significantly improved due to reduced wastage in page allocations. Example: We are now able to fit 35 rx buffers in a single 64kb page for MTU size of 1500, instead of 1 rx buffer per page previously.

Tested:
- iperf3, iperf2, and nttcp benchmarks.
- Jumbo frames with MTU 9000.
- Native XDP programs (XDP_PASS, XDP_DROP, XDP_TX, XDP_REDIRECT) for testing the XDP path in driver.
- Memory leak detection (kmemleak).
- Driver load/unload, reboot, and stress scenarios.

Reviewed-by: Jacob Keller <[email protected]>
Reviewed-by: Saurabh Sengar <[email protected]>
Reviewed-by: Haiyang Zhang <[email protected]>
Signed-off-by: Dipayaan Roy <[email protected]>
Link: https://patch.msgid.link/20250814140410.GA22089@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit 730ff06)
Signed-off-by: Shreeya Patel <[email protected]>
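The "35 rx buffers in a single 64kb page" figure from the commit message above can be reproduced with a rough calculation. This is a sketch under stated assumptions, not the driver's sizing code: the per-buffer overhead of 334 bytes and the 64-byte alignment are illustrative values chosen so that the example matches the quoted figure, and `rx_buffers_per_page` is a hypothetical name. The two fallback conditions (MTU > PAGE_SIZE / 2, XDP active) come from the commit message.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def rx_buffers_per_page(mtu: int, page_size: int, xdp_active: bool = False,
                        overhead: int = 334, alignment: int = 64) -> int:
    """Estimate how many RX buffers fit in one page.

    overhead and alignment are assumptions for illustration only.
    """
    # Fallback to one full-page buffer, as listed in the commit message.
    if xdp_active or mtu > page_size // 2:
        return 1
    buf_size = align_up(mtu + overhead, alignment)
    return max(1, page_size // buf_size)


if __name__ == "__main__":
    print(rx_buffers_per_page(1500, 64 * 1024))   # 64KB base pages
    print(rx_buffers_per_page(1500, 4 * 1024))    # 4KB base pages
    print(rx_buffers_per_page(9000, 4 * 1024))    # jumbo frames: full-page fallback
```

With these assumed constants each buffer is 1856 bytes, so 35 fit in 65536 bytes; on 4KB pages the gain is much smaller, which matches the commit's point that large base page sizes benefit most.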
shreeya-patel98
approved these changes
Jan 28, 2026
bmastbergen
approved these changes
Jan 28, 2026
Collaborator
bmastbergen
left a comment
🥌
If this is good, can we please get it merged and run the Jenkins dist-git MR job so I can do the builds?
https://ciqinc.atlassian.net/browse/KERNEL-485
Update process (This kernel CentOS base for 6.12.0-124)
src.rpms hosted by RESF rlc-10/6.12.0-124.X.1.el10_1 branch elrelease.
Rebuild Log
Build Log
KSelfTests