Author: Rafael J. Wysocki <[email protected]> Date: Wed Jun 18 14:17:45 2025 +0200 ACPICA: Refuse to evaluate a method if arguments are missing [ Upstream commit 6fcab2791543924d438e7fa49276d0998b0a069f ] As reported in [1], a platform firmware update that increased the number of method parameters and forgot to update a least one of its callers, caused ACPICA to crash due to use-after-free. Since this a result of a clear AML issue that arguably cannot be fixed up by the interpreter (it cannot produce missing data out of thin air), address it by making ACPICA refuse to evaluate a method if the caller attempts to pass fewer arguments than expected to it. Closes: https://github.com/acpica/acpica/issues/1027 [1] Reported-by: Peter Williams <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Tested-by: Hans de Goede <[email protected]> # Dell XPS 9640 with BIOS 1.12.0 Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Takashi Iwai <[email protected]> Date: Tue Jun 10 08:43:19 2025 +0200 ALSA: sb: Don't allow changing the DMA mode during operations [ Upstream commit ed29e073ba93f2d52832804cabdd831d5d357d33 ] When a PCM stream is already running, one shouldn't change the DMA mode via kcontrol, which may screw up the hardware. Return -EBUSY instead. Link: https://bugzilla.kernel.org/show_bug.cgi?id=218185 Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Takashi Iwai <[email protected]> Date: Tue Jun 10 08:43:20 2025 +0200 ALSA: sb: Force to disable DMAs once when DMA mode is changed [ Upstream commit 4c267ae2ef349639b4d9ebf00dd28586a82fdbe6 ] When the DMA mode is changed on the (still real!) SB AWE32 after playing a stream and closing, the previous DMA setup was still silently kept, and it can confuse the hardware, resulting in the unexpected noises. As a workaround, enforce the disablement of DMA setups when the DMA setup is changed by the kcontrol. https://bugzilla.kernel.org/show_bug.cgi?id=218185 Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Raju Rangoju <[email protected]> Date: Tue Jul 1 00:56:36 2025 +0530 amd-xgbe: align CL37 AN sequence as per databook [ Upstream commit 42fd432fe6d320323215ebdf4de4d0d7e56e6792 ] Update the Clause 37 Auto-Negotiation implementation to properly align with the PCS hardware specifications: - Fix incorrect bit settings in Link Status and Link Duplex fields - Implement missing sequence steps 2 and 7 These changes ensure CL37 auto-negotiation protocol follows the exact sequence patterns as specified in the hardware databook. Fixes: 1bf40ada6290 ("amd-xgbe: Add support for clause 37 auto-negotiation") Signed-off-by: Raju Rangoju <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Raju Rangoju <[email protected]> Date: Tue Jul 1 12:20:16 2025 +0530 amd-xgbe: do not double read link status [ Upstream commit 16ceda2ef683a50cd0783006c0504e1931cd8879 ] The link status is latched low so that momentary link drops can be detected. Always double-reading the status defeats this design feature. Only double read if link was already down This prevents unnecessary duplicate readings of the link status. Fixes: 4f3b20bfbb75 ("amd-xgbe: add support for rx-adaptation") Signed-off-by: Raju Rangoju <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Christian Brauner <[email protected]> Date: Wed Jul 2 11:23:55 2025 +0200 anon_inode: rework assertions commit 1e7ab6f67824343ee3e96f100f0937c393749a8a upstream. Making anonymous inodes regular files comes with a lot of risk and regression potential as evidenced by a recent hickup in io_uring. We're better of continuing to not have them be regular files. Since we have S_ANON_INODE we can port all of our assertions easily. Link: https://lore.kernel.org/[email protected] Fixes: cfd86ef7e8e7 ("anon_inode: use a proper mode internally") Acked-by: Jens Axboe <[email protected]> Cc: [email protected] Reported-by: Jens Axboe <[email protected]> Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Justin Sanders <[email protected]> Date: Tue Jun 10 17:06:00 2025 +0000 aoe: defer rexmit timer downdev work to workqueue [ Upstream commit cffc873d68ab09a0432b8212008c5613f8a70a2c ] When aoe's rexmit_timer() notices that an aoe target fails to respond to commands for more than aoe_deadsecs, it calls aoedev_downdev() which cleans the outstanding aoe and block queues. This can involve sleeping, such as in blk_mq_freeze_queue(), which should not occur in irq context. This patch defers that aoedev_downdev() call to the aoe device's workqueue. Link: https://bugzilla.kernel.org/show_bug.cgi?id=212665 Signed-off-by: Justin Sanders <[email protected]> Link: https://lore.kernel.org/r/[email protected] Tested-By: Valentin Kleibel <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sven Peter <[email protected]> Date: Tue Jun 10 19:19:24 2025 +0000 arm64: dts: apple: Drop {address,size}-cells from SPI NOR [ Upstream commit 811a909978bf59caa25359e0aca4e30500dcff26 ] Fix the following warning by dropping #{address,size}-cells from the SPI NOR node which only has a single child node without reg property: spi1-nvram.dtsi:19.10-38.4: Warning (avoid_unnecessary_addr_size): /soc/spi@235104000/flash@0: unnecessary #address-cells/#size-cells without "ranges", "dma-ranges" or child "reg" property Fixes: 3febe9de5ca5 ("arm64: dts: apple: Add SPI NOR nvram partition to all devices") Reviewed-by: Janne Grunau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sven Peter <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sven Peter <[email protected]> Date: Wed Jun 11 15:18:53 2025 +0000 arm64: dts: apple: Move touchbar mipi {address,size}-cells from dtsi to dts [ Upstream commit 08a0d93c353bd55de8b5fb77b464d89425be0215 ] Move the {address,size}-cells property from the (disabled) touchbar screen mipi node inside the dtsi file to the model-specific dts file where it's enabled to fix the following W=1 warnings: t8103.dtsi:404.34-433.5: Warning (avoid_unnecessary_addr_size): /soc/dsi@228600000: unnecessary #address-cells/#size-cells without "ranges", "dma-ranges" or child "reg" property t8112.dtsi:419.34-448.5: Warning (avoid_unnecessary_addr_size): /soc/dsi@228600000: unnecessary #address-cells/#size-cells without "ranges", "dma-ranges" or child "reg" property Fixes: 7275e795e520 ("arm64: dts: apple: Add touchbar screen nodes") Reviewed-by: Janne Grunau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sven Peter <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Janne Grunau <[email protected]> Date: Wed Jun 11 22:30:31 2025 +0200 arm64: dts: apple: t8103: Fix PCIe BCM4377 nodename [ Upstream commit ac1daa91e9370e3b88ef7826a73d62a4d09e2717 ] Fix the following `make dtbs_check` warnings for all t8103 based devices: arch/arm64/boot/dts/apple/t8103-j274.dtb: network@0,0: $nodename:0: 'network@0,0' does not match '^wifi(@.*)?$' from schema $id: http://devicetree.org/schemas/net/wireless/brcm,bcm4329-fmac.yaml# arch/arm64/boot/dts/apple/t8103-j274.dtb: network@0,0: Unevaluated properties are not allowed ('local-mac-address' was unexpected) from schema $id: http://devicetree.org/schemas/net/wireless/brcm,bcm4329-fmac.yaml# Fixes: bf2c05b619ff ("arm64: dts: apple: t8103: Expose PCI node for the WiFi MAC address") Signed-off-by: Janne Grunau <[email protected]> Reviewed-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sven Peter <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Gabriel Santese <[email protected]> Date: Fri May 30 02:52:32 2025 +0200 ASoC: amd: yc: Add quirk for MSI Bravo 17 D7VF internal mic [ Upstream commit ba06528ad5a31923efc24324706116ccd17e12d8 ] MSI Bravo 17 (D7VF), like other laptops from the family, has broken ACPI tables and needs a quirk for internal mic to work properly. Signed-off-by: Gabriel Santese <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Raven Black <[email protected]> Date: Fri Jun 13 07:51:25 2025 -0400 ASoC: amd: yc: update quirk data for HP Victus [ Upstream commit 13b86ea92ebf0fa587fbadfb8a60ca2e9993203f ] Make the internal microphone work on HP Victus laptops. Signed-off-by: Raven Black <[email protected]> Link: https://patch.msgid.link/20250613-support-hp-victus-microphone-v1-1-bebc4c3a2041@gmail.com Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tasos Sahanidis <[email protected]> Date: Mon May 19 11:56:55 2025 +0300 ata: libata-acpi: Do not assume 40 wire cable if no devices are enabled [ Upstream commit 33877220b8641b4cde474a4229ea92c0e3637883 ] On at least an ASRock 990FX Extreme 4 with a VIA VT6330, the devices have not yet been enabled by the first time ata_acpi_cbl_80wire() is called. This means that the ata_for_each_dev loop is never entered, and a 40 wire cable is assumed. The VIA controller on this board does not report the cable in the PCI config space, thus having to fall back to ACPI even though no SATA bridge is present. The _GTM values are correctly reported by the firmware through ACPI, which has already set up faster transfer modes, but due to the above the controller is forced down to a maximum of UDMA/33. Resolve this by modifying ata_acpi_cbl_80wire() to directly return the cable type. First, an unknown cable is assumed which preserves the mode set by the firmware, and then on subsequent calls when the devices have been enabled, an 80 wire cable is correctly detected. Since the function now directly returns the cable type, it is renamed to ata_acpi_cbl_pata_type(). Signed-off-by: Tasos Sahanidis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Niklas Cassel <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Johannes Berg <[email protected]> Date: Fri Jun 6 11:01:11 2025 +0200 ata: pata_cs5536: fix build on 32-bit UML [ Upstream commit fe5b391fc56f77cf3c22a9dd4f0ce20db0e3533f ] On 32-bit ARCH=um, CONFIG_X86_32 is still defined, so it doesn't indicate building on real X86 machines. There's no MSR on UML though, so add a check for CONFIG_X86. Reported-by: Arnd Bergmann <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Niklas Cassel <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Christian Eggers <[email protected]> Date: Fri Jun 27 09:05:08 2025 +0200 Bluetooth: HCI: Set extended advertising data synchronously commit 89fb8acc38852116d38d721ad394aad7f2871670 upstream. Currently, for controllers with extended advertising, the advertising data is set in the asynchronous response handler for extended adverstising params. As most advertising settings are performed in a synchronous context, the (asynchronous) setting of the advertising data is done too late (after enabling the advertising). Move setting of adverstising data from asynchronous response handler into synchronous context to fix ordering of HCI commands. Signed-off-by: Christian Eggers <[email protected]> Fixes: a0fb3726ba55 ("Bluetooth: Use Set ext adv/scan rsp data if controller supports") Cc: [email protected] v2: https://lore.kernel.org/linux-bluetooth/[email protected]/ Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Christian Eggers <[email protected]> Date: Wed Jun 25 15:09:29 2025 +0200 Bluetooth: hci_sync: revert some mesh modifications commit 46c0d947b64ac8efcf89dd754213dab5d1bd00aa upstream. This reverts minor parts of the changes made in commit b338d91703fa ("Bluetooth: Implement support for Mesh"). It looks like these changes were only made for development purposes but shouldn't have been part of the commit. Fixes: b338d91703fa ("Bluetooth: Implement support for Mesh") Cc: [email protected] Signed-off-by: Christian Eggers <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Christian Eggers <[email protected]> Date: Wed Jun 25 15:09:31 2025 +0200 Bluetooth: MGMT: mesh_send: check instances prior disabling advertising commit f3cb5676e5c11c896ba647ee309a993e73531588 upstream. The unconditional call of hci_disable_advertising_sync() in mesh_send_done_sync() also disables other LE advertisings (non mesh related). I am not sure whether this call is required at all, but checking the adv_instances list (like done at other places) seems to solve the problem. Fixes: b338d91703fa ("Bluetooth: Implement support for Mesh") Cc: [email protected] Signed-off-by: Christian Eggers <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Christian Eggers <[email protected]> Date: Wed Jun 25 15:09:30 2025 +0200 Bluetooth: MGMT: set_mesh: update LE scan interval and window commit e5af67a870f738bb8a4594b6c60c2caf4c87a3c9 upstream. According to the message of commit b338d91703fa ("Bluetooth: Implement support for Mesh"), MGMT_OP_SET_MESH_RECEIVER should set the passive scan parameters. Currently the scan interval and window parameters are silently ignored, although user space (bluetooth-meshd) expects that they can be used [1] [1] https://git.kernel.org/pub/scm/bluetooth/bluez.git/tree/mesh/mesh-io-mgmt.c#n344 Fixes: b338d91703fa ("Bluetooth: Implement support for Mesh") Cc: [email protected] Signed-off-by: Christian Eggers <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Yang Li <[email protected]> Date: Thu Jun 19 11:01:07 2025 +0800 Bluetooth: Prevent unintended pause by checking if advertising is active [ Upstream commit 1f029b4e30a602db33dedee5ac676e9236ad193c ] When PA Create Sync is enabled, advertising resumes unexpectedly. Therefore, it's necessary to check whether advertising is currently active before attempting to pause it. < HCI Command: LE Add Device To... (0x08|0x0011) plen 7 #1345 [hci0] 48.306205 Address type: Random (0x01) Address: 4F:84:84:5F:88:17 (Resolvable) Identity type: Random (0x01) Identity: FC:5B:8C:F7:5D:FB (Static) < HCI Command: LE Set Address Re.. (0x08|0x002d) plen 1 #1347 [hci0] 48.308023 Address resolution: Enabled (0x01) ... < HCI Command: LE Set Extended A.. (0x08|0x0039) plen 6 #1349 [hci0] 48.309650 Extended advertising: Enabled (0x01) Number of sets: 1 (0x01) Entry 0 Handle: 0x01 Duration: 0 ms (0x00) Max ext adv events: 0 ... < HCI Command: LE Periodic Adve.. (0x08|0x0044) plen 14 #1355 [hci0] 48.314575 Options: 0x0000 Use advertising SID, Advertiser Address Type and address Reporting initially enabled SID: 0x02 Adv address type: Random (0x01) Adv address: 4F:84:84:5F:88:17 (Resolvable) Identity type: Random (0x01) Identity: FC:5B:8C:F7:5D:FB (Static) Skip: 0x0000 Sync timeout: 20000 msec (0x07d0) Sync CTE type: 0x0000 Fixes: ad383c2c65a5 ("Bluetooth: hci_sync: Enable advertising when LL privacy is enabled") Signed-off-by: Yang Li <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Filipe Manana <[email protected]> Date: Sat Jun 7 13:43:32 2025 +0100 btrfs: fix failure to rebuild free space tree using multiple transactions [ Upstream commit 1e6ed33cabba8f06f532f2e5851a102602823734 ] If we are rebuilding a free space tree, while modifying the free space tree we may need to allocate a new metadata block group. If we end up using multiple transactions for the rebuild, when we call btrfs_end_transaction() we enter btrfs_create_pending_block_groups() which calls add_block_group_free_space() to add items to the free space tree for the block group. Then later during the free space tree rebuild, at btrfs_rebuild_free_space_tree(), we may find such new block groups and call populate_free_space_tree() for them, which fails with -EEXIST because there are already items in the free space tree. Then we abort the transaction with -EEXIST at btrfs_rebuild_free_space_tree(). Notice that we say "may find" the new block groups because a new block group may be inserted in the block groups rbtree, which is being iterated by the rebuild process, before or after the current node where the rebuild process is currently at. Syzbot recently reported such case which produces a trace like the following: ------------[ cut here ]------------ BTRFS: Transaction aborted (error -17) WARNING: CPU: 1 PID: 7626 at fs/btrfs/free-space-tree.c:1341 btrfs_rebuild_free_space_tree+0x470/0x54c fs/btrfs/free-space-tree.c:1341 Modules linked in: CPU: 1 UID: 0 PID: 7626 Comm: syz.2.25 Not tainted 6.15.0-rc7-syzkaller-00085-gd7fa1af5b33e-dirty #0 PREEMPT Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : btrfs_rebuild_free_space_tree+0x470/0x54c fs/btrfs/free-space-tree.c:1341 lr : btrfs_rebuild_free_space_tree+0x470/0x54c fs/btrfs/free-space-tree.c:1341 sp : ffff80009c4f7740 x29: ffff80009c4f77b0 x28: ffff0000d4c3f400 x27: 0000000000000000 x26: dfff800000000000 x25: ffff70001389eee8 x24: 0000000000000003 x23: 1fffe000182b6e7b x22: 0000000000000000 x21: ffff0000c15b73d8 x20: 00000000ffffffef x19: ffff0000c15b7378 x18: 1fffe0003386f276 x17: ffff80008f31e000 x16: ffff80008adbe98c x15: 0000000000000001 x14: 1fffe0001b281550 x13: 0000000000000000 x12: 0000000000000000 x11: ffff60001b281551 x10: 0000000000000003 x9 : 1c8922000a902c00 x8 : 1c8922000a902c00 x7 : ffff800080485878 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff80008047843c x2 : 0000000000000001 x1 : ffff80008b3ebc40 x0 : 0000000000000001 Call trace: btrfs_rebuild_free_space_tree+0x470/0x54c fs/btrfs/free-space-tree.c:1341 (P) btrfs_start_pre_rw_mount+0xa78/0xe10 fs/btrfs/disk-io.c:3074 btrfs_remount_rw fs/btrfs/super.c:1319 [inline] btrfs_reconfigure+0x828/0x2418 fs/btrfs/super.c:1543 reconfigure_super+0x1d4/0x6f0 fs/super.c:1083 do_remount fs/namespace.c:3365 [inline] path_mount+0xb34/0xde0 fs/namespace.c:4200 do_mount fs/namespace.c:4221 [inline] __do_sys_mount fs/namespace.c:4432 [inline] __se_sys_mount fs/namespace.c:4409 [inline] __arm64_sys_mount+0x3e8/0x468 fs/namespace.c:4409 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151 el0_svc+0x58/0x17c arch/arm64/kernel/entry-common.c:767 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:786 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600 irq event stamp: 330 hardirqs last enabled at (329): [<ffff80008048590c>] raw_spin_rq_unlock_irq kernel/sched/sched.h:1525 [inline] hardirqs last enabled at (329): [<ffff80008048590c>] finish_lock_switch+0xb0/0x1c0 kernel/sched/core.c:5130 hardirqs last disabled at (330): [<ffff80008adb9e60>] el1_dbg+0x24/0x80 arch/arm64/kernel/entry-common.c:511 softirqs last enabled at (10): [<ffff8000801fbf10>] local_bh_enable+0x10/0x34 include/linux/bottom_half.h:32 softirqs last disabled at (8): [<ffff8000801fbedc>] local_bh_disable+0x10/0x34 include/linux/bottom_half.h:19 ---[ end trace 0000000000000000 ]--- Fix this by flagging new block groups which had their free space tree entries already added and then skip them in the rebuild process. Also, since the rebuild may be triggered when doing a remount, make sure that when we clear an existing free space tree that we clear such flag from every existing block group, otherwise we would skip those block groups during the rebuild. Reported-by: [email protected] Link: https://lore.kernel.org/linux-btrfs/[email protected]/ Fixes: 882af9f13e83 ("btrfs: handle free space tree rebuild in multiple transactions") Reviewed-by: Boris Burkov <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Filipe Manana <[email protected]> Date: Wed Jun 18 15:58:31 2025 +0100 btrfs: fix inode lookup error handling during log replay [ Upstream commit 5f61b961599acbd2bed028d3089105a1f7d224b8 ] When replaying log trees we use read_one_inode() to get an inode, which is just a wrapper around btrfs_iget_logging(), which in turn is a wrapper for btrfs_iget(). But read_one_inode() always returns NULL for any error that btrfs_iget_logging() / btrfs_iget() may return and this is a problem because: 1) In many callers of read_one_inode() we convert the NULL into -EIO, which is not accurate since btrfs_iget() may return -ENOMEM and -ENOENT for example, besides -EIO and other errors. So during log replay we may end up reporting a false -EIO, which is confusing since we may not have had any IO error at all; 2) When replaying directory deletes, at replay_dir_deletes(), we assume the NULL returned from read_one_inode() means that the inode doesn't exist and then proceed as if no error had happened. This is wrong because unless btrfs_iget() returned ERR_PTR(-ENOENT), we had an actual error and the target inode may exist in the target subvolume root - this may later result in the log replay code failing at a later stage (if we are "lucky") or succeed but leaving some inconsistency in the filesystem. So fix this by not ignoring errors from btrfs_iget_logging() and as a consequence remove the read_one_inode() wrapper and just use btrfs_iget_logging() directly. Also since btrfs_iget_logging() is supposed to be called only against subvolume roots, just like read_one_inode() which had a comment about it, add an assertion to btrfs_iget_logging() to check that the target root corresponds to a subvolume root. Fixes: 5d4f98a28c7d ("Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE)") Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Filipe Manana <[email protected]> Date: Mon Jun 23 12:11:58 2025 +0100 btrfs: fix iteration of extrefs during log replay [ Upstream commit 54a7081ed168b72a8a2d6ef4ba3a1259705a2926 ] At __inode_add_ref() when processing extrefs, if we jump into the next label we have an undefined value of victim_name.len, since we haven't initialized it before we did the goto. This results in an invalid memory access in the next iteration of the loop since victim_name.len was not initialized to the length of the name of the current extref. Fix this by initializing victim_name.len with the current extref's name length. Fixes: e43eec81c516 ("btrfs: use struct qstr instead of name and namelen pairs") Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Filipe Manana <[email protected]> Date: Wed Jun 18 16:57:07 2025 +0100 btrfs: fix missing error handling when searching for inode refs during log replay [ Upstream commit 6561a40ceced9082f50c374a22d5966cf9fc5f5c ] During log replay, at __add_inode_ref(), when we are searching for inode ref keys we totally ignore if btrfs_search_slot() returns an error. This may make a log replay succeed when there was an actual error and leave some metadata inconsistency in a subvolume tree. Fix this by checking if an error was returned from btrfs_search_slot() and if so, return it to the caller. Fixes: e02119d5a7b4 ("Btrfs: Add a write ahead tree log to optimize synchronous operations") Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Filipe Manana <[email protected]> Date: Fri Jun 20 15:54:05 2025 +0100 btrfs: propagate last_unlink_trans earlier when doing a rmdir [ Upstream commit c466e33e729a0ee017d10d919cba18f503853c60 ] In case the removed directory had a snapshot that was deleted, we are propagating its inode's last_unlink_trans to the parent directory after we removed the entry from the parent directory. This leaves a small race window where someone can log the parent directory after we removed the entry and before we updated last_unlink_trans, and as a result if we ever try to replay such a log tree, we will fail since we will attempt to remove a snapshot during log replay, which is currently not possible and results in the log replay (and mount) to fail. This is the type of failure described in commit 1ec9a1ae1e30 ("Btrfs: fix unreplayable log after snapshot delete + parent dir fsync"). So fix this by propagating the last_unlink_trans to the parent directory before we remove the entry from it. Fixes: 44f714dae50a ("Btrfs: improve performance on fsync against new inode after rename/unlink") Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Filipe Manana <[email protected]> Date: Thu Jun 19 13:13:38 2025 +0100 btrfs: record new subvolume in parent dir earlier to avoid dir logging races [ Upstream commit bf5bcf9a6fa070ec8a725b08db63fb1318f77366 ] Instead of recording that a new subvolume was created in a directory after we add the entry do the directory, record it before adding the entry. This is to avoid races where after creating the entry and before recording the new subvolume in the directory (the call to btrfs_record_new_subvolume()), another task logs the directory, so we end up with a log tree where we logged a directory that has an entry pointing to a root that was not yet committed, resulting in an invalid entry if the log is persisted and replayed later due to a power failure or crash. Also state this requirement in the function comment for btrfs_record_new_subvolume(), similar to what we do for the btrfs_record_unlink_dir() and btrfs_record_snapshot_destroy(). Fixes: 45c4102f0d82 ("btrfs: avoid transaction commit on any fsync after subvolume creation") Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Filipe Manana <[email protected]> Date: Fri Jun 20 16:37:01 2025 +0100 btrfs: use btrfs_record_snapshot_destroy() during rmdir [ Upstream commit 157501b0469969fc1ba53add5049575aadd79d80 ] We are setting the parent directory's last_unlink_trans directly which may result in a concurrent task starting to log the directory not see the update and therefore can log the directory after we removed a child directory which had a snapshot within instead of falling back to a transaction commit. Replaying such a log tree would result in a mount failure since we can't currently delete snapshots (and subvolumes) during log replay. This is the type of failure described in commit 1ec9a1ae1e30 ("Btrfs: fix unreplayable log after snapshot delete + parent dir fsync"). Fix this by using btrfs_record_snapshot_destroy() which updates the last_unlink_trans field while holding the inode's log_mutex lock. Fixes: 44f714dae50a ("Btrfs: improve performance on fsync against new inode after rename/unlink") Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Shyam Prasad N <[email protected]> Date: Mon Jun 30 23:09:34 2025 +0530 cifs: all initializations for tcon should happen in tcon_info_alloc commit 74ebd02163fde05baa23129e06dde4b8f0f2377a upstream. Today, a few work structs inside tcon are initialized inside cifs_get_tcon and not in tcon_info_alloc. As a result, if a tcon is obtained from tcon_info_alloc, but not called as a part of cifs_get_tcon, we may trip over. Cc: <[email protected]> Signed-off-by: Shyam Prasad N <[email protected]> Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Christian König <[email protected]> Date: Tue Jan 28 10:47:48 2025 +0100 dma-buf: fix timeout handling in dma_resv_wait_timeout v2 commit 2b95a7db6e0f75587bffddbb490399cbb87e4985 upstream. Even the kerneldoc says that with a zero timeout the function should not wait for anything, but still return 1 to indicate that the fences are signaled now. Unfortunately that isn't what was implemented, instead of only returning 1 we also waited for at least one jiffies. Fix that by adjusting the handling to what the function is actually documented to do. v2: improve code readability Reported-by: Marek Olšák <[email protected]> Reported-by: Lucas Stach <[email protected]> Signed-off-by: Christian König <[email protected]> Reviewed-by: Lucas Stach <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Fushuai Wang <[email protected]> Date: Thu Jun 26 21:30:03 2025 +0800 dpaa2-eth: fix xdp_rxq_info leak [ Upstream commit 2def09ead4ad5907988b655d1e1454003aaf8297 ] The driver registered xdp_rxq_info structures via xdp_rxq_info_reg() but failed to properly unregister them in error paths and during removal. Fixes: d678be1dc1ec ("dpaa2-eth: add XDP_REDIRECT support") Signed-off-by: Fushuai Wang <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Ioana Ciornei <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Dmitry Baryshkov <[email protected]> Date: Sun Jun 8 18:52:04 2025 +0300 drm/bridge: aux-hpd-bridge: fix assignment of the of_node [ Upstream commit e8537cad824065b0425fb0429e762e14a08067c2 ] Perform fix similar to the one in the commit 85e444a68126 ("drm/bridge: Fix assignment of the of_node of the parent to aux bridge"). The assignment of the of_node to the aux HPD bridge needs to mark the of_node as reused, otherwise driver core will attempt to bind resources like pinctrl, which is going to fail as corresponding pins are already marked as used by the parent device. Fix that by using the device_set_of_node_from_dev() helper instead of assigning it directly. Fixes: e560518a6c2e ("drm/bridge: implement generic DP HPD bridge") Signed-off-by: Dmitry Baryshkov <[email protected]> Reviewed-by: Neil Armstrong <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Dmitry Baryshkov <[email protected]> Date: Thu Feb 20 17:07:26 2025 +0200 drm/bridge: panel: move prepare_prev_first handling to drm_panel_bridge_add_typed [ Upstream commit eb028cd884e1b0976ff8c5944ee6650fe3ed0a6c ] The commit 5ea6b1702781 ("drm/panel: Add prepare_prev_first flag to drm_panel") and commit 0974687a19c3 ("drm/bridge: panel: Set pre_enable_prev_first from drmm_panel_bridge_add") added handling of panel's prepare_prev_first to devm_panel_bridge_add() and drmm_panel_bridge_add(). However if the driver calls drm_panel_bridge_add_typed() directly, then the flag won't be handled and thus the drm_bridge.pre_enable_prev_first will not be set. Move prepare_prev_first handling to the drm_panel_bridge_add_typed() so that there is no way to miss the flag. Fixes: 5ea6b1702781 ("drm/panel: Add prepare_prev_first flag to drm_panel") Fixes: 0974687a19c3 ("drm/bridge: panel: Set pre_enable_prev_first from drmm_panel_bridge_add") Reported-by: Svyatoslav Ryhel <[email protected]> Closes: https://lore.kernel.org/dri-devel/CAPVz0n3YZass3Bns1m0XrFxtAC0DKbEPiW6vXimQx97G243sXw@mail.gmail.com/ Signed-off-by: Dmitry Baryshkov <[email protected]> Reviewed-by: Neil Armstrong <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Marek Szyprowski <[email protected]> Date: Wed Jun 18 14:06:26 2025 +0200 drm/exynos: fimd: Guard display clock control with runtime PM calls [ Upstream commit 5d91394f236167ac624b823820faf4aa928b889e ] Commit c9b1150a68d9 ("drm/atomic-helper: Re-order bridge chain pre-enable and post-disable") changed the call sequence to the CRTC enable/disable and bridge pre_enable/post_disable methods, so those bridge methods are now called when CRTC is not yet enabled. This causes a lockup observed on Samsung Peach-Pit/Pi Chromebooks. The source of this lockup is a call to fimd_dp_clock_enable() function, when FIMD device is not yet runtime resumed. It worked before the mentioned commit only because the CRTC implemented by the FIMD driver was always enabled what guaranteed the FIMD device to be runtime resumed. This patch adds runtime PM guards to the fimd_dp_clock_enable() function to enable its proper operation also when the CRTC implemented by FIMD is not yet enabled. Fixes: 196e059a8a6a ("drm/exynos: convert clock_enable crtc callback to pipeline clock") Signed-off-by: Marek Szyprowski <[email protected]> Reviewed-by: Tomi Valkeinen <[email protected]> Signed-off-by: Inki Dae <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Junxiao Chang <[email protected]> Date: Fri Apr 25 23:11:07 2025 +0800 drm/i915/gsc: mei interrupt top half should be in irq disabled context [ Upstream commit 8cadce97bf264ed478669c6f32d5603b34608335 ] MEI GSC interrupt comes from i915. It has top half and bottom half. Top half is called from i915 interrupt handler. It should be in irq disabled context. With RT kernel, by default i915 IRQ handler is in threaded IRQ. MEI GSC top half might be in threaded IRQ context. generic_handle_irq_safe API could be called from either IRQ or process context, it disables local IRQ then calls MEI GSC interrupt top half. This change fixes A380/A770 GPU boot hang issue with RT kernel. Fixes: 1e3dc1d8622b ("drm/i915/gsc: add gsc as a mei auxiliary device") Tested-by: Furong Zhou <[email protected]> Suggested-by: Sebastian Andrzej Siewior <[email protected]> Acked-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Junxiao Chang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> (cherry picked from commit dccf655f69002d496a527ba441b4f008aa5bebbf) Signed-off-by: Joonas Lahtinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Janusz Krzysztofik <[email protected]> Date: Wed Jun 11 12:42:13 2025 +0200 drm/i915/gt: Fix timeline left held on VMA alloc error [ Upstream commit a5aa7bc1fca78c7fa127d9e33aa94a0c9066c1d6 ] The following error has been reported sporadically by CI when a test unbinds the i915 driver on a ring submission platform: <4> [239.330153] ------------[ cut here ]------------ <4> [239.330166] i915 0000:00:02.0: [drm] drm_WARN_ON(dev_priv->mm.shrink_count) <4> [239.330196] WARNING: CPU: 1 PID: 18570 at drivers/gpu/drm/i915/i915_gem.c:1309 i915_gem_cleanup_early+0x13e/0x150 [i915] ... <4> [239.330640] RIP: 0010:i915_gem_cleanup_early+0x13e/0x150 [i915] ... <4> [239.330942] Call Trace: <4> [239.330944] <TASK> <4> [239.330949] i915_driver_late_release+0x2b/0xa0 [i915] <4> [239.331202] i915_driver_release+0x86/0xa0 [i915] <4> [239.331482] devm_drm_dev_init_release+0x61/0x90 <4> [239.331494] devm_action_release+0x15/0x30 <4> [239.331504] release_nodes+0x3d/0x120 <4> [239.331517] devres_release_all+0x96/0xd0 <4> [239.331533] device_unbind_cleanup+0x12/0x80 <4> [239.331543] device_release_driver_internal+0x23a/0x280 <4> [239.331550] ? bus_find_device+0xa5/0xe0 <4> [239.331563] device_driver_detach+0x14/0x20 ... <4> [357.719679] ---[ end trace 0000000000000000 ]--- If the test also unloads the i915 module then that's followed with: <3> [357.787478] ============================================================================= <3> [357.788006] BUG i915_vma (Tainted: G U W N ): Objects remaining on __kmem_cache_shutdown() <3> [357.788031] ----------------------------------------------------------------------------- <3> [357.788204] Object 0xffff888109e7f480 @offset=29824 <3> [357.788670] Allocated in i915_vma_instance+0xee/0xc10 [i915] age=292729 cpu=4 pid=2244 <4> [357.788994] i915_vma_instance+0xee/0xc10 [i915] <4> [357.789290] init_status_page+0x7b/0x420 [i915] <4> [357.789532] intel_engines_init+0x1d8/0x980 [i915] <4> [357.789772] intel_gt_init+0x175/0x450 [i915] <4> [357.790014] i915_gem_init+0x113/0x340 [i915] <4> [357.790281] i915_driver_probe+0x847/0xed0 [i915] <4> [357.790504] i915_pci_probe+0xe6/0x220 [i915] ... Closer analysis of CI results history has revealed a dependency of the error on a few IGT tests, namely: - igt@api_intel_allocator@fork-simple-stress-signal, - igt@api_intel_allocator@two-level-inception-interruptible, - igt@gem_linear_blits@interruptible, - igt@prime_mmap_coherency@ioctl-errors, which invisibly trigger the issue, then exhibited with first driver unbind attempt. All of the above tests perform actions which are actively interrupted with signals. Further debugging has allowed to narrow that scope down to DRM_IOCTL_I915_GEM_EXECBUFFER2, and ring_context_alloc(), specific to ring submission, in particular. If successful then that function, or its execlists or GuC submission equivalent, is supposed to be called only once per GEM context engine, followed by raise of a flag that prevents the function from being called again. The function is expected to unwind its internal errors itself, so it may be safely called once more after it returns an error. In case of ring submission, the function first gets a reference to the engine's legacy timeline and then allocates a VMA. If the VMA allocation fails, e.g. when i915_vma_instance() called from inside is interrupted with a signal, then ring_context_alloc() fails, leaving the timeline held referenced. On next I915_GEM_EXECBUFFER2 IOCTL, another reference to the timeline is got, and only that last one is put on successful completion. As a consequence, the legacy timeline, with its underlying engine status page's VMA object, is still held and not released on driver unbind. Get the legacy timeline only after successful allocation of the context engine's VMA. v2: Add a note on other submission methods (Krzysztof Karas): Both execlists and GuC submission use lrc_alloc() which seems free from a similar issue. Fixes: 75d0a7f31eec ("drm/i915: Lift timeline into intel_context") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061 Cc: Chris Wilson <[email protected]> Cc: Matthew Auld <[email protected]> Cc: Krzysztof Karas <[email protected]> Reviewed-by: Sebastian Brzezinka <[email protected]> Reviewed-by: Krzysztof Niemiec <[email protected]> Signed-off-by: Janusz Krzysztofik <[email protected]> Reviewed-by: Nitin Gote <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit cc43422b3cc79eacff4c5a8ba0d224688ca9dd4f) Signed-off-by: Joonas Lahtinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Dan Carpenter <[email protected]> Date: Wed Jun 25 10:21:58 2025 -0500 drm/i915/selftests: Change mock_request() to return error pointers [ Upstream commit caa7c7a76b78ce41d347003f84975125383e6b59 ] There was an error pointer vs NULL bug in __igt_breadcrumbs_smoketest(). The __mock_request_alloc() function implements the smoketest->request_alloc() function pointer. It was supposed to return error pointers, but it propogates the NULL return from mock_request() so in the event of a failure, it would lead to a NULL pointer dereference. To fix this, change the mock_request() function to return error pointers and update all the callers to expect that. Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]> (cherry picked from commit 778fa8ad5f0f23397d045c7ebca048ce8def1c43) Signed-off-by: Joonas Lahtinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Rob Clark <[email protected]> Date: Wed May 14 09:33:32 2025 -0700 drm/msm: Fix a fence leak in submit error path [ Upstream commit 5d319f75ccf7f0927425a7545aa1a22b3eedc189 ] In error paths, we could unref the submit without calling drm_sched_entity_push_job(), so msm_job_free() will never get called. Since drm_sched_job_cleanup() will NULL out the s_fence, we can use that to detect this case. Signed-off-by: Rob Clark <[email protected]> Patchwork: https://patchwork.freedesktop.org/patch/653584/ Signed-off-by: Rob Clark <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Rob Clark <[email protected]> Date: Wed May 14 09:33:33 2025 -0700 drm/msm: Fix another leak in the submit error path [ Upstream commit f681c2aa8676a890eacc84044717ab0fd26e058f ] put_unused_fd() doesn't free the installed file, if we've already done fd_install(). So we need to also free the sync_file. Signed-off-by: Rob Clark <[email protected]> Patchwork: https://patchwork.freedesktop.org/patch/653583/ Signed-off-by: Rob Clark <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Maíra Canal <[email protected]> Date: Sat Jun 28 19:42:42 2025 -0300 drm/v3d: Disable interrupts before resetting the GPU commit 226862f50a7a88e4e4de9abbf36c64d19acd6fd0 upstream. Currently, an interrupt can be triggered during a GPU reset, which can lead to GPU hangs and NULL pointer dereference in an interrupt context as shown in the following trace: [ 314.035040] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0 [ 314.043822] Mem abort info: [ 314.046606] ESR = 0x0000000096000005 [ 314.050347] EC = 0x25: DABT (current EL), IL = 32 bits [ 314.055651] SET = 0, FnV = 0 [ 314.058695] EA = 0, S1PTW = 0 [ 314.061826] FSC = 0x05: level 1 translation fault [ 314.066694] Data abort info: [ 314.069564] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 314.075039] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 314.080080] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 314.085382] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000102728000 [ 314.091814] [00000000000000c0] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 [ 314.100511] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP [ 314.106770] Modules linked in: v3d i2c_brcmstb vc4 snd_soc_hdmi_codec gpu_sched drm_shmem_helper drm_display_helper cec drm_dma_helper drm_kms_helper drm drm_panel_orientation_quirks snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd backlight [ 314.129654] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.25+rpt-rpi-v8 #1 Debian 1:6.12.25-1+rpt1 [ 314.139388] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT) [ 314.145211] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 314.152165] pc : v3d_irq+0xec/0x2e0 [v3d] [ 314.156187] lr : v3d_irq+0xe0/0x2e0 [v3d] [ 314.160198] sp : ffffffc080003ea0 [ 314.163502] x29: ffffffc080003ea0 x28: ffffffec1f184980 x27: 021202b000000000 [ 314.170633] x26: ffffffec1f17f630 x25: ffffff8101372000 x24: ffffffec1f17d9f0 [ 314.177764] x23: 000000000000002a x22: 000000000000002a x21: ffffff8103252000 [ 314.184895] x20: 0000000000000001 x19: 00000000deadbeef x18: 0000000000000000 [ 314.192026] x17: ffffff94e51d2000 x16: ffffffec1dac3cb0 x15: c306000000000000 [ 314.199156] x14: 0000000000000000 x13: b2fc982e03cc5168 x12: 0000000000000001 [ 314.206286] x11: ffffff8103f8bcc0 x10: ffffffec1f196868 x9 : ffffffec1dac3874 [ 314.213416] x8 : 0000000000000000 x7 : 0000000000042a3a x6 : ffffff810017a180 [ 314.220547] x5 : ffffffec1ebad400 x4 : ffffffec1ebad320 x3 : 00000000000bebeb [ 314.227677] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 [ 314.234807] Call trace: [ 314.237243] v3d_irq+0xec/0x2e0 [v3d] [ 314.240906] __handle_irq_event_percpu+0x58/0x218 [ 314.245609] handle_irq_event+0x54/0xb8 [ 314.249439] handle_fasteoi_irq+0xac/0x240 [ 314.253527] handle_irq_desc+0x48/0x68 [ 314.257269] generic_handle_domain_irq+0x24/0x38 [ 314.261879] gic_handle_irq+0x48/0xd8 [ 314.265533] call_on_irq_stack+0x24/0x58 [ 314.269448] do_interrupt_handler+0x88/0x98 [ 314.273624] el1_interrupt+0x34/0x68 [ 314.277193] el1h_64_irq_handler+0x18/0x28 [ 314.281281] el1h_64_irq+0x64/0x68 [ 314.284673] default_idle_call+0x3c/0x168 [ 314.288675] do_idle+0x1fc/0x230 [ 314.291895] cpu_startup_entry+0x3c/0x50 [ 314.295810] rest_init+0xe4/0xf0 [ 314.299030] start_kernel+0x5e8/0x790 [ 314.302684] __primary_switched+0x80/0x90 [ 314.306691] Code: 940029eb 360ffc13 f9442ea0 52800001 (f9406017) [ 314.312775] ---[ end trace 0000000000000000 ]--- [ 314.317384] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 314.324249] SMP: stopping secondary CPUs [ 314.328167] Kernel Offset: 0x2b9da00000 from 0xffffffc080000000 [ 314.334076] PHYS_OFFSET: 0x0 [ 314.336946] CPU features: 0x08,00002013,c0200000,0200421b [ 314.342337] Memory Limit: none [ 314.345382] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- Before resetting the GPU, it's necessary to disable all interrupts and deal with any interrupt handler still in-flight. Otherwise, the GPU might reset with jobs still running, or yet, an interrupt could be handled during the reset. Cc: [email protected] Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+") Reviewed-by: Juan A. Suarez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maíra Canal <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Marko Kiiskila <[email protected]> Date: Wed Jun 18 15:29:26 2025 -0400 drm/vmwgfx: Fix guests running with TDX/SEV [ Upstream commit 7dfede7d7edd18c0c91ca854cde8eaaf4ccf97ea ] Commit 81256a50aa0f ("x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default") changed the default behavior of memremap(MEMREMAP_WB) and started mapping memory as encrypted. The driver requires the fifo memory to be decrypted to communicate with the host but was relaying on the old default behavior of memremap(MEMREMAP_WB) and thus broke. Fix it by explicitly specifying the desired behavior and passing MEMREMAP_DEC to memremap. Fixes: 81256a50aa0f ("x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default") Signed-off-by: Marko Kiiskila <[email protected]> Signed-off-by: Zack Rusin <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Acked-by: Kirill A. Shutemov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Vinay Belgaumkar <[email protected]> Date: Thu Jun 12 00:09:02 2025 -0700 drm/xe/bmg: Update Wa_14022085890 [ Upstream commit a5c7dcdd969f2248cc91d65e5ac852859fc8dac2 ] Set GT min frequency to 1200Mhz once driver load is complete. v2: Review comments (Rodrigo) v3: Apply Wa earlier so user_req_min is not clobbered. v4: Apply to all GTs (Lucas) Cc: Matt Roper <[email protected]> Cc: Rodrigo Vivi <[email protected]> Signed-off-by: Vinay Belgaumkar <[email protected]> Reviewed-by: Stuart Summers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> (cherry picked from commit bdde16c9ac5cb56ad2ee19792222fa1853577af7) Signed-off-by: Lucas De Marchi <[email protected]> Stable-dep-of: 84c0b4a00610 ("drm/xe/bmg: Update Wa_22019338487") Signed-off-by: Sasha Levin <[email protected]>
Author: Vinay Belgaumkar <[email protected]> Date: Wed Jun 18 11:50:01 2025 -0700 drm/xe/bmg: Update Wa_22019338487 [ Upstream commit 84c0b4a00610afbde650fdb8ad6db0424f7b2cc3 ] Limit GT max frequency to 2600MHz and wait for frequency to reduce before proceeding with a transient flush. This is really only needed for the transient flush: if L2 flush is needed due to 16023588340 then there's no need to do this additional wait since we are already using the bigger hammer. v2: Use generic names, ensure user set max frequency requests wait for flush to complete (Rodrigo) v3: - User requests wait via wait_var_event_timeout (Lucas) - Close races on flush + user requests (Lucas) - Fix xe_guc_pc_remove_flush_freq_limit() being called on last gt rather than root gt (Lucas) v4: - Only apply the freq reducing part if a TDF is needed: L2 flush trumps the need for waiting a lower frequency Fixes: aaa08078e725 ("drm/xe/bmg: Apply Wa_22019338487") Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Vinay Belgaumkar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> (cherry picked from commit deea6a7d6d803d6bb874a3e6f1b312e560e6c6df) Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: John Harrison <[email protected]> Date: Thu Apr 3 11:56:11 2025 -0700 drm/xe/guc: Enable w/a 16026508708 [ Upstream commit d3e8349edf7ed9eaf076ab3d9973331ccc20e26c ] The workaround is only relevant to SRIOV but does affect all platforms. Signed-off-by: John Harrison <[email protected]> Reviewed-by: Daniele Ceraolo Spurio <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Stable-dep-of: 84c0b4a00610 ("drm/xe/bmg: Update Wa_22019338487") Signed-off-by: Sasha Levin <[email protected]>
Author: Lucas De Marchi <[email protected]> Date: Wed Jun 18 11:49:58 2025 -0700 drm/xe/guc_pc: Add _locked variant for min/max freq [ Upstream commit d8390768dcf6f5a78af56aa03797a076871b01f3 ] There are places in which the getters/setters are called one after the other causing a multiple lock()/unlock(). These are not currently a problem since they are all happening from the same thread, but there's a race possibility as calls are added outside of the early init when the max/min and stashed values need to be correlated. Add the _locked() variants to prepare for that. Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> (cherry picked from commit 1beae9aa2b88d3a02eb666e7b777eb2d7bc645f4) Signed-off-by: Lucas De Marchi <[email protected]> Stable-dep-of: 84c0b4a00610 ("drm/xe/bmg: Update Wa_22019338487") Signed-off-by: Sasha Levin <[email protected]>
Author: Harry Austen <[email protected]> Date: Fri Jun 27 13:30:35 2025 -0700 drm/xe: Allow dropping kunit dependency as built-in [ Upstream commit aa18d5769fcafe645a3ba01a9a69dde4f8dc8cc3 ] Fix Kconfig symbol dependency on KUNIT, which isn't actually required for XE to be built-in. However, if KUNIT is enabled, it must be built-in too. Fixes: 08987a8b6820 ("drm/xe: Fix build with KUNIT=m") Cc: Lucas De Marchi <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Maarten Lankhorst <[email protected]> Signed-off-by: Harry Austen <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Acked-by: Randy Dunlap <[email protected]> Tested-by: Randy Dunlap <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> (cherry picked from commit a559434880b320b83733d739733250815aecf1b0) Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jia Yao <[email protected]> Date: Thu Jun 12 22:46:20 2025 +0000 drm/xe: Fix out-of-bounds field write in MI_STORE_DATA_IMM [ Upstream commit 2d5cff2b4bc567dcaad7ab5b46c973ba534cc062 ] According to Bspec, bits 0~9 of MI_STORE_DATA_IMM must not exceed 0x3FE. The macro MI_SDI_NUM_QW(x) evaluates to 2 * x + 1, which means the condition 2 * x + 1 <= 0x3FE must be satisfied. Therefore, the maximum valid value for x is 0x1FE, not 0x1FF. v2 - Replace 0x1fe with macro MAX_PTE_PER_SDI (Auld, Matthew & Patelczyk, Maciej) v3 - Change macro MAX_PTE_PER_SDI from 0x1fe to 0x1feU (De Marchi, Lucas) Bspec: 60246 Fixes: 9c44fd5f6e8a ("drm/xe: Add migrate layer functions for SVM support") Cc: Matthew Brost <[email protected]> Cc: Brian3 Nguyen <[email protected]> Cc: Alex Zuo <[email protected]> Cc: Matthew Auld <[email protected]> Cc: Maciej Patelczyk <[email protected]> Cc: Lucas De Marchi <[email protected]> Suggested-by: Shuicheng Lin <[email protected]> Signed-off-by: Jia Yao <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Reviewed-by: Maciej Patelczyk <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> (cherry picked from commit c038bdba98c9f6a36378044a9d4385531a194d3e) Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Lucas De Marchi <[email protected]> Date: Wed Jun 18 11:50:00 2025 -0700 drm/xe: Split xe_device_td_flush() [ Upstream commit a1eec6cae95a1a0888cb8370338822ca81cd9436 ] xe_device_td_flush() has 2 possible implementations: an entire L2 flush or a transient flush, depending on WA 16023588340. Make this clear by splitting the function so it calls each of them. Reviewed-by: Matthew Auld <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> (cherry picked from commit 5e300ed8a545bdffc26b579c526b5fef7b2d5365) Signed-off-by: Lucas De Marchi <[email protected]> Stable-dep-of: 84c0b4a00610 ("drm/xe/bmg: Update Wa_22019338487") Signed-off-by: Sasha Levin <[email protected]>
Author: Krzysztof Kozlowski <[email protected]> Date: Wed Jul 2 08:15:31 2025 +0200 dt-bindings: i2c: realtek,rtl9301: Fix missing 'reg' constraint commit 5f05fc6e2218db7ecc52c60eb34b707fe69262c2 upstream. Lists should have fixed amount if items, so add missing constraint to the 'reg' property (only one address space entry). Fixes: c5eda0333076 ("dt-bindings: i2c: Add Realtek RTL I2C Controller") Cc: <[email protected]> # v6.13+ Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Krzysztof Kozlowski <[email protected]> Date: Tue Jul 1 08:36:22 2025 +0200 dt-bindings: net: sophgo,sg2044-dwmac: Drop status from the example commit f030713e5abf67d0a88864c8855f809c763af954 upstream. Examples should be complete and should not have a 'status' property, especially a disabled one because this disables the dt_binding_check of the example against the schema. Dropping 'status' property shows missing other properties - phy-mode and phy-handle. Fixes: 114508a89ddc ("dt-bindings: net: Add support for Sophgo SG2044 dwmac") Cc: <[email protected]> Signed-off-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Alexander Sverdlin <[email protected]> Reviewed-by: Chen Wang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Alok Tiwari <[email protected]> Date: Sat Jun 28 07:56:05 2025 -0700 enic: fix incorrect MTU comparison in enic_change_mtu() [ Upstream commit aaf2b2480375099c022a82023e1cd772bf1c6a5d ] The comparison in enic_change_mtu() incorrectly used the current netdev->mtu instead of the new new_mtu value when warning about an MTU exceeding the port MTU. This could suppress valid warnings or issue incorrect ones. Fix the condition and log to properly reflect the new_mtu. Fixes: ab123fe071c9 ("enic: handle mtu change for vf properly") Signed-off-by: Alok Tiwari <[email protected]> Acked-by: John Daley <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Thomas Fourier <[email protected]> Date: Wed Jun 25 16:16:24 2025 +0200 ethernet: atl1: Add missing DMA mapping error checks and count errors [ Upstream commit d72411d20905180cdc452c553be17481b24463d2 ] The `dma_map_XXX()` functions can fail and must be checked using `dma_mapping_error()`. This patch adds proper error handling for all DMA mapping calls. In `atl1_alloc_rx_buffers()`, if DMA mapping fails, the buffer is deallocated and marked accordingly. In `atl1_tx_map()`, previously mapped buffers are unmapped and the packet is dropped on failure. If `atl1_xmit_frame()` drops the packet, increment the tx_error counter. Fixes: f3cc28c79760 ("Add Attansic L1 ethernet driver.") Signed-off-by: Thomas Fourier <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sudeep Holla <[email protected]> Date: Wed May 28 09:49:41 2025 +0100 firmware: arm_ffa: Fix memory leak by freeing notifier callback node [ Upstream commit a833d31ad867103ba72a0b73f3606f4ab8601719 ] Commit e0573444edbf ("firmware: arm_ffa: Add interfaces to request notification callbacks") adds support for notifier callbacks by allocating and inserting a callback node into a hashtable during registration of notifiers. However, during unregistration, the code only removes the node from the hashtable without freeing the associated memory, resulting in a memory leak. Resolve the memory leak issue by ensuring the allocated notifier callback node is properly freed after it is removed from the hashtable entry. Fixes: e0573444edbf ("firmware: arm_ffa: Add interfaces to request notification callbacks") Message-Id: <[email protected]> Reviewed-by: Jens Wiklander <[email protected]> Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Viresh Kumar <[email protected]> Date: Tue Jun 3 16:38:53 2025 +0530 firmware: arm_ffa: Fix the missing entry in struct ffa_indirect_msg_hdr [ Upstream commit 4c46a471be12216347ba707f8eadadbf5d68e698 ] As per the spec, one 32 bit reserved entry is missing here, add it. Signed-off-by: Viresh Kumar <[email protected]> Fixes: 910cc1acc9b4 ("firmware: arm_ffa: Add support for passing UUID in FFA_MSG_SEND2") Reviewed-by: Bertrand Marquis <[email protected]> Message-Id: <28a624fbf416975de4fbe08cfbf7c2db89cb630e.1748948911.git.viresh.kumar@linaro.org> Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sudeep Holla <[email protected]> Date: Wed May 28 09:49:42 2025 +0100 firmware: arm_ffa: Move memory allocation outside the mutex locking [ Upstream commit 27e850c88df0e25474a8caeb2903e2e90b62c1dc ] The notifier callback node allocation is currently done while holding the notify_lock mutex. While this is safe even if memory allocation may sleep, we need to move the allocation outside the locked region in preparation to move from using muxtes to rwlocks. Move the memory allocation to avoid potential sleeping in atomic context once the locks are moved from mutex to rwlocks. Fixes: e0573444edbf ("firmware: arm_ffa: Add interfaces to request notification callbacks") Message-Id: <[email protected]> Reviewed-by: Jens Wiklander <[email protected]> Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Sudeep Holla <[email protected]> Date: Wed May 28 09:49:43 2025 +0100 firmware: arm_ffa: Replace mutex with rwlock to avoid sleep in atomic context [ Upstream commit 9ca7a421229bbdfbe2e1e628cff5cfa782720a10 ] The current use of a mutex to protect the notifier hashtable accesses can lead to issues in the atomic context. It results in the below kernel warnings: | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:258 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 9, name: kworker/0:0 | preempt_count: 1, expected: 0 | RCU nest depth: 0, expected: 0 | CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.14.0 #4 | Workqueue: ffa_pcpu_irq_notification notif_pcpu_irq_work_fn | Call trace: | show_stack+0x18/0x24 (C) | dump_stack_lvl+0x78/0x90 | dump_stack+0x18/0x24 | __might_resched+0x114/0x170 | __might_sleep+0x48/0x98 | mutex_lock+0x24/0x80 | handle_notif_callbacks+0x54/0xe0 | notif_get_and_handle+0x40/0x88 | generic_exec_single+0x80/0xc0 | smp_call_function_single+0xfc/0x1a0 | notif_pcpu_irq_work_fn+0x2c/0x38 | process_one_work+0x14c/0x2b4 | worker_thread+0x2e4/0x3e0 | kthread+0x13c/0x210 | ret_from_fork+0x10/0x20 To address this, replace the mutex with an rwlock to protect the notifier hashtable accesses. This ensures that read-side locking does not sleep and multiple readers can acquire the lock concurrently, avoiding unnecessary contention and potential deadlocks. Writer access remains exclusive, preserving correctness. This change resolves warnings from lockdep about potential sleep in atomic context. Cc: Jens Wiklander <[email protected]> Reported-by: Jérôme Forissier <[email protected]> Closes: https://github.com/OP-TEE/optee_os/issues/7394 Fixes: e0573444edbf ("firmware: arm_ffa: Add interfaces to request notification callbacks") Message-Id: <[email protected]> Reviewed-by: Jens Wiklander <[email protected]> Tested-by: Jens Wiklander <[email protected]> Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tudor Ambarus <[email protected]> Date: Fri Jun 6 10:45:37 2025 +0000 firmware: exynos-acpm: fix timeouts on xfers handling [ Upstream commit 8d2c2fa2209e83d0eb10f7330d8a0bbdc1df32ff ] The mailbox framework has a single inflight request at a time. If a request is sent while another is still active, it will be queued to the mailbox core ring buffer. ACPM protocol did not serialize the calls to the mailbox subsystem so we could start the timeout ticks in parallel for multiple requests, while just one was being inflight. Consider a hypothetical case where the xfer timeout is 100ms and an ACPM transaction takes 90ms: | 0ms: Message #0 is queued in mailbox layer and sent out, then sits | at acpm_dequeue_by_polling() with a timeout of 100ms | 1ms: Message #1 is queued in mailbox layer but not sent out yet. | Since send_message() doesn't block, it also sits at | acpm_dequeue_by_polling() with a timeout of 100ms | ... | 90ms: Message #0 is completed, txdone is called and message #1 is sent | 101ms: Message #1 times out since the count started at 1ms. Even though | it has only been inflight for 11ms. Fix the problem by moving mbox_send_message() and mbox_client_txdone() immediately after the message has been written to the TX queue and while still keeping the ACPM TX queue lock. We thus tie together the TX write with the doorbell ring and mark the TX as done after the doorbell has been rung. This guarantees that the doorbell has been rang before starting the timeout ticks. We should also see some performance improvement as we no longer wait to receive a response before ringing the doorbell for the next request, so the ACPM firmware shall be able to drain faster the TX queue. Another benefit is that requests are no longer able to ring the doorbell one for the other, so it eases debugging. Finally, the mailbox software queue will always contain a single doorbell request due to the serialization done at the ACPM TX queue level. Protocols like ACPM, that handle their own hardware queues need a passthrough mailbox API, where they are able to just ring the doorbell or flip a bit directly into the mailbox controller. The mailbox software queue mechanism, the locking done into the mailbox core is not really needed, so hopefully this lays the foundation for a passthrough mailbox API. Reported-by: André Draszik <[email protected]> Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver") Signed-off-by: Tudor Ambarus <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Tigran Mkrtchyan <[email protected]> Date: Thu May 15 21:04:15 2025 +0200 flexfiles/pNFS: update stats on NFS4ERR_DELAY for v4.1 DSes [ Upstream commit e3e3775392f3f0f3e3044f8c162bf47858e01759 ] On NFS4ERR_DELAY nfs slient updates its stats, but misses for flexfiles v4.1 DSes. Signed-off-by: Tigran Mkrtchyan <[email protected]> Signed-off-by: Anna Schumaker <[email protected]> Stable-dep-of: 38074de35b01 ("NFSv4/flexfiles: Fix handling of NFS level errors in I/O") Signed-off-by: Sasha Levin <[email protected]>
Author: Shivank Garg <[email protected]> Date: Fri Jun 20 07:03:30 2025 +0000 fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass [ Upstream commit cbe4134ea4bc493239786220bd69cb8a13493190 ] Export anon_inode_make_secure_inode() to allow KVM guest_memfd to create anonymous inodes with proper security context. This replaces the current pattern of calling alloc_anon_inode() followed by inode_init_security_anon() for creating security context manually. This change also fixes a security regression in secretmem where the S_PRIVATE flag was not cleared after alloc_anon_inode(), causing LSM/SELinux checks to be bypassed for secretmem file descriptors. As guest_memfd currently resides in the KVM module, we need to export this symbol for use outside the core kernel. In the future, guest_memfd might be moved to core-mm, at which point the symbols no longer would have to be exported. When/if that happens is still unclear. Fixes: 2bfe15c52612 ("mm: create security context for memfd_secret inodes") Suggested-by: David Hildenbrand <[email protected]> Suggested-by: Mike Rapoport <[email protected]> Signed-off-by: Shivank Garg <[email protected]> Link: https://lore.kernel.org/[email protected] Acked-by: "Mike Rapoport (Microsoft)" <[email protected]> Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Gyeyoung Baek <[email protected]> Date: Thu Jun 12 21:48:27 2025 +0900 genirq/irq_sim: Initialize work context pointers properly [ Upstream commit 8a2277a3c9e4cc5398f80821afe7ecbe9bdf2819 ] Initialize `ops` member's pointers properly by using kzalloc() instead of kmalloc() when allocating the simulation work context. Otherwise the pointers contain random content leading to invalid dereferencing. Signed-off-by: Gyeyoung Baek <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Qasim Ijaz <[email protected]> Date: Fri Jun 27 12:01:21 2025 +0100 HID: appletb-kbd: fix memory corruption of input_handler_list commit c80f2b047d5cc42fbd2dff9d1942d4ba7545100f upstream. In appletb_kbd_probe an input handler is initialised and then registered with input core through input_register_handler(). When this happens input core will add the input handler (specifically its node) to the global input_handler_list. The input_handler_list is central to the functionality of input core and is traversed in various places in input core. An example of this is when a new input device is plugged in and gets registered with input core. The input_handler in probe is allocated as device managed memory. If a probe failure occurs after input_register_handler() the input_handler memory is freed, yet it will remain in the input_handler_list. This effectively means the input_handler_list contains a dangling pointer to data belonging to a freed input handler. This causes an issue when any other input device is plugged in - in my case I had an old PixArt HP USB optical mouse and I decided to plug it in after a failure occurred after input_register_handler(). This lead to the registration of this input device via input_register_device which involves traversing over every handler in the corrupted input_handler_list and calling input_attach_handler(), giving each handler a chance to bind to newly registered device. The core of this bug is a UAF which causes memory corruption of input_handler_list and to fix it we must ensure the input handler is unregistered from input core, this is done through input_unregister_handler(). [ 63.191597] ================================================================== [ 63.192094] BUG: KASAN: slab-use-after-free in input_attach_handler.isra.0+0x1a9/0x1e0 [ 63.192094] Read of size 8 at addr ffff888105ea7c80 by task kworker/0:2/54 [ 63.192094] [ 63.192094] CPU: 0 UID: 0 PID: 54 Comm: kworker/0:2 Not tainted 6.16.0-rc2-00321-g2aa6621d [ 63.192094] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.164 [ 63.192094] Workqueue: usb_hub_wq hub_event [ 63.192094] Call Trace: [ 63.192094] <TASK> [ 63.192094] dump_stack_lvl+0x53/0x70 [ 63.192094] print_report+0xce/0x670 [ 63.192094] kasan_report+0xce/0x100 [ 63.192094] input_attach_handler.isra.0+0x1a9/0x1e0 [ 63.192094] input_register_device+0x76c/0xd00 [ 63.192094] hidinput_connect+0x686d/0xad60 [ 63.192094] hid_connect+0xf20/0x1b10 [ 63.192094] hid_hw_start+0x83/0x100 [ 63.192094] hid_device_probe+0x2d1/0x680 [ 63.192094] really_probe+0x1c3/0x690 [ 63.192094] __driver_probe_device+0x247/0x300 [ 63.192094] driver_probe_device+0x49/0x210 [ 63.192094] __device_attach_driver+0x160/0x320 [ 63.192094] bus_for_each_drv+0x10f/0x190 [ 63.192094] __device_attach+0x18e/0x370 [ 63.192094] bus_probe_device+0x123/0x170 [ 63.192094] device_add+0xd4d/0x1460 [ 63.192094] hid_add_device+0x30b/0x910 [ 63.192094] usbhid_probe+0x920/0xe00 [ 63.192094] usb_probe_interface+0x363/0x9a0 [ 63.192094] really_probe+0x1c3/0x690 [ 63.192094] __driver_probe_device+0x247/0x300 [ 63.192094] driver_probe_device+0x49/0x210 [ 63.192094] __device_attach_driver+0x160/0x320 [ 63.192094] bus_for_each_drv+0x10f/0x190 [ 63.192094] __device_attach+0x18e/0x370 [ 63.192094] bus_probe_device+0x123/0x170 [ 63.192094] device_add+0xd4d/0x1460 [ 63.192094] usb_set_configuration+0xd14/0x1880 [ 63.192094] usb_generic_driver_probe+0x78/0xb0 [ 63.192094] usb_probe_device+0xaa/0x2e0 [ 63.192094] really_probe+0x1c3/0x690 [ 63.192094] __driver_probe_device+0x247/0x300 [ 63.192094] driver_probe_device+0x49/0x210 [ 63.192094] __device_attach_driver+0x160/0x320 [ 63.192094] bus_for_each_drv+0x10f/0x190 [ 63.192094] __device_attach+0x18e/0x370 [ 63.192094] bus_probe_device+0x123/0x170 [ 63.192094] device_add+0xd4d/0x1460 [ 63.192094] usb_new_device+0x7b4/0x1000 [ 63.192094] hub_event+0x234d/0x3fa0 [ 63.192094] process_one_work+0x5bf/0xfe0 [ 63.192094] worker_thread+0x777/0x13a0 [ 63.192094] </TASK> [ 63.192094] [ 63.192094] Allocated by task 54: [ 63.192094] kasan_save_stack+0x33/0x60 [ 63.192094] kasan_save_track+0x14/0x30 [ 63.192094] __kasan_kmalloc+0x8f/0xa0 [ 63.192094] __kmalloc_node_track_caller_noprof+0x195/0x420 [ 63.192094] devm_kmalloc+0x74/0x1e0 [ 63.192094] appletb_kbd_probe+0x39/0x440 [ 63.192094] hid_device_probe+0x2d1/0x680 [ 63.192094] really_probe+0x1c3/0x690 [ 63.192094] __driver_probe_device+0x247/0x300 [ 63.192094] driver_probe_device+0x49/0x210 [ 63.192094] __device_attach_driver+0x160/0x320 [...] [ 63.192094] [ 63.192094] Freed by task 54: [ 63.192094] kasan_save_stack+0x33/0x60 [ 63.192094] kasan_save_track+0x14/0x30 [ 63.192094] kasan_save_free_info+0x3b/0x60 [ 63.192094] __kasan_slab_free+0x37/0x50 [ 63.192094] kfree+0xcf/0x360 [ 63.192094] devres_release_group+0x1f8/0x3c0 [ 63.192094] hid_device_probe+0x315/0x680 [ 63.192094] really_probe+0x1c3/0x690 [ 63.192094] __driver_probe_device+0x247/0x300 [ 63.192094] driver_probe_device+0x49/0x210 [ 63.192094] __device_attach_driver+0x160/0x320 [...] Fixes: 7d62ba8deacf ("HID: hid-appletb-kbd: add support for fn toggle between media and function mode") Cc: [email protected] Reviewed-by: Aditya Garg <[email protected]> Signed-off-by: Qasim Ijaz <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Qasim Ijaz <[email protected]> Date: Tue Jun 24 13:52:56 2025 +0100 HID: appletb-kbd: fix slab use-after-free bug in appletb_kbd_probe commit 38224c472a038fa9ccd4085511dd9f3d6119dbf9 upstream. In probe appletb_kbd_probe() a "struct appletb_kbd *kbd" is allocated via devm_kzalloc() to store touch bar keyboard related data. Later on if backlight_device_get_by_name() finds a backlight device with name "appletb_backlight" a timer (kbd->inactivity_timer) is setup with appletb_inactivity_timer() and the timer is armed to run after appletb_tb_dim_timeout (60) seconds. A use-after-free is triggered when failure occurs after the timer is armed. This ultimately means probe failure occurs and as a result the "struct appletb_kbd *kbd" which is device managed memory is freed. After 60 seconds the timer will have expired and __run_timers will attempt to access the timer (kbd->inactivity_timer) however the kdb structure has been freed causing a use-after free. [ 71.636938] ================================================================== [ 71.637915] BUG: KASAN: slab-use-after-free in __run_timers+0x7ad/0x890 [ 71.637915] Write of size 8 at addr ffff8881178c5958 by task swapper/1/0 [ 71.637915] [ 71.637915] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.16.0-rc2-00318-g739a6c93cc75-dirty #12 PREEMPT(voluntary) [ 71.637915] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 71.637915] Call Trace: [ 71.637915] <IRQ> [ 71.637915] dump_stack_lvl+0x53/0x70 [ 71.637915] print_report+0xce/0x670 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] kasan_report+0xce/0x100 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] __run_timers+0x7ad/0x890 [ 71.637915] ? __pfx___run_timers+0x10/0x10 [ 71.637915] ? update_process_times+0xfc/0x190 [ 71.637915] ? __pfx_update_process_times+0x10/0x10 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 71.637915] run_timer_softirq+0x141/0x240 [ 71.637915] ? __pfx_run_timer_softirq+0x10/0x10 [ 71.637915] ? __pfx___hrtimer_run_queues+0x10/0x10 [ 71.637915] ? kvm_clock_get_cycles+0x18/0x30 [ 71.637915] ? ktime_get+0x60/0x140 [ 71.637915] handle_softirqs+0x1b8/0x5c0 [ 71.637915] ? __pfx_handle_softirqs+0x10/0x10 [ 71.637915] irq_exit_rcu+0xaf/0xe0 [ 71.637915] sysvec_apic_timer_interrupt+0x6c/0x80 [ 71.637915] </IRQ> [ 71.637915] [ 71.637915] Allocated by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] __kasan_kmalloc+0x8f/0xa0 [ 71.637915] __kmalloc_node_track_caller_noprof+0x195/0x420 [ 71.637915] devm_kmalloc+0x74/0x1e0 [ 71.637915] appletb_kbd_probe+0x37/0x3c0 [ 71.637915] hid_device_probe+0x2d1/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] [ 71.637915] [ 71.637915] Freed by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] kasan_save_free_info+0x3b/0x60 [ 71.637915] __kasan_slab_free+0x37/0x50 [ 71.637915] kfree+0xcf/0x360 [ 71.637915] devres_release_group+0x1f8/0x3c0 [ 71.637915] hid_device_probe+0x315/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] The root cause of the issue is that the timer is not disarmed on failure paths leading to it remaining active and accessing freed memory. To fix this call timer_delete_sync() to deactivate the timer. Another small issue is that timer_delete_sync is called unconditionally in appletb_kbd_remove(), fix this by checking for a valid kbd->backlight_dev before calling timer_delete_sync. Fixes: 93a0fc489481 ("HID: hid-appletb-kbd: add support for automatic brightness control while using the touchbar") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Reviewed-by: Aditya Garg <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Michael J. Ruhl <[email protected]> Date: Fri Jun 27 10:35:11 2025 -0400 i2c/designware: Fix an initialization issue commit 3d30048958e0d43425f6d4e76565e6249fa71050 upstream. The i2c_dw_xfer_init() function requires msgs and msg_write_idx from the dev context to be initialized. amd_i2c_dw_xfer_quirk() inits msgs and msgs_num, but not msg_write_idx. This could allow an out of bounds access (of msgs). Initialize msg_write_idx before calling i2c_dw_xfer_init(). Reviewed-by: Andy Shevchenko <[email protected]> Fixes: 17631e8ca2d3 ("i2c: designware: Add driver support for AMD NAVI GPU") Cc: <[email protected]> # v5.13+ Signed-off-by: Michael J. Ruhl <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Or Har-Toov <[email protected]> Date: Mon Jun 16 11:14:09 2025 +0300 IB/mlx5: Fix potential deadlock in MR deregistration [ Upstream commit 2ed25aa7f7711f508b6120e336f05cd9d49943c0 ] The issue arises when kzalloc() is invoked while holding umem_mutex or any other lock acquired under umem_mutex. This is problematic because kzalloc() can trigger fs_reclaim_aqcuire(), which may, in turn, invoke mmu_notifier_invalidate_range_start(). This function can lead to mlx5_ib_invalidate_range(), which attempts to acquire umem_mutex again, resulting in a deadlock. The problematic flow: CPU0 | CPU1 ---------------------------------------|------------------------------------------------ mlx5_ib_dereg_mr() | → revoke_mr() | → mutex_lock(&umem_odp->umem_mutex) | | mlx5_mkey_cache_init() | → mutex_lock(&dev->cache.rb_lock) | → mlx5r_cache_create_ent_locked() | → kzalloc(GFP_KERNEL) | → fs_reclaim() | → mmu_notifier_invalidate_range_start() | → mlx5_ib_invalidate_range() | → mutex_lock(&umem_odp->umem_mutex) → cache_ent_find_and_store() | → mutex_lock(&dev->cache.rb_lock) | Additionally, when kzalloc() is called from within cache_ent_find_and_store(), we encounter the same deadlock due to re-acquisition of umem_mutex. Solve by releasing umem_mutex in dereg_mr() after umr_revoke_mr() and before acquiring rb_lock. This ensures that we don't hold umem_mutex while performing memory allocations that could trigger the reclaim path. This change prevents the deadlock by ensuring proper lock ordering and avoiding holding locks during memory allocation operations that could trigger the reclaim path. The following lockdep warning demonstrates the deadlock: python3/20557 is trying to acquire lock: ffff888387542128 (&umem_odp->umem_mutex){+.+.}-{4:4}, at: mlx5_ib_invalidate_range+0x5b/0x550 [mlx5_ib] but task is already holding lock: ffffffff82f6b840 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: unmap_vmas+0x7b/0x1a0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}: fs_reclaim_acquire+0x60/0xd0 mem_cgroup_css_alloc+0x6f/0x9b0 cgroup_init_subsys+0xa4/0x240 cgroup_init+0x1c8/0x510 start_kernel+0x747/0x760 x86_64_start_reservations+0x25/0x30 x86_64_start_kernel+0x73/0x80 common_startup_64+0x129/0x138 -> #2 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire+0x91/0xd0 __kmalloc_cache_noprof+0x4d/0x4c0 mlx5r_cache_create_ent_locked+0x75/0x620 [mlx5_ib] mlx5_mkey_cache_init+0x186/0x360 [mlx5_ib] mlx5_ib_stage_post_ib_reg_umr_init+0x3c/0x60 [mlx5_ib] __mlx5_ib_add+0x4b/0x190 [mlx5_ib] mlx5r_probe+0xd9/0x320 [mlx5_ib] auxiliary_bus_probe+0x42/0x70 really_probe+0xdb/0x360 __driver_probe_device+0x8f/0x130 driver_probe_device+0x1f/0xb0 __driver_attach+0xd4/0x1f0 bus_for_each_dev+0x79/0xd0 bus_add_driver+0xf0/0x200 driver_register+0x6e/0xc0 __auxiliary_driver_register+0x6a/0xc0 do_one_initcall+0x5e/0x390 do_init_module+0x88/0x240 init_module_from_file+0x85/0xc0 idempotent_init_module+0x104/0x300 __x64_sys_finit_module+0x68/0xc0 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #1 (&dev->cache.rb_lock){+.+.}-{4:4}: __mutex_lock+0x98/0xf10 __mlx5_ib_dereg_mr+0x6f2/0x890 [mlx5_ib] mlx5_ib_dereg_mr+0x21/0x110 [mlx5_ib] ib_dereg_mr_user+0x85/0x1f0 [ib_core] uverbs_free_mr+0x19/0x30 [ib_uverbs] destroy_hw_idr_uobject+0x21/0x80 [ib_uverbs] uverbs_destroy_uobject+0x60/0x3d0 [ib_uverbs] uobj_destroy+0x57/0xa0 [ib_uverbs] ib_uverbs_cmd_verbs+0x4d5/0x1210 [ib_uverbs] ib_uverbs_ioctl+0x129/0x230 [ib_uverbs] __x64_sys_ioctl+0x596/0xaa0 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #0 (&umem_odp->umem_mutex){+.+.}-{4:4}: __lock_acquire+0x1826/0x2f00 lock_acquire+0xd3/0x2e0 __mutex_lock+0x98/0xf10 mlx5_ib_invalidate_range+0x5b/0x550 [mlx5_ib] __mmu_notifier_invalidate_range_start+0x18e/0x1f0 unmap_vmas+0x182/0x1a0 exit_mmap+0xf3/0x4a0 mmput+0x3a/0x100 do_exit+0x2b9/0xa90 do_group_exit+0x32/0xa0 get_signal+0xc32/0xcb0 arch_do_signal_or_restart+0x29/0x1d0 syscall_exit_to_user_mode+0x105/0x1d0 do_syscall_64+0x79/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Chain exists of: &dev->cache.rb_lock --> mmu_notifier_invalidate_range_start --> &umem_odp->umem_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&umem_odp->umem_mutex); lock(mmu_notifier_invalidate_range_start); lock(&umem_odp->umem_mutex); lock(&dev->cache.rb_lock); *** DEADLOCK *** Fixes: abb604a1a9c8 ("RDMA/mlx5: Fix a race for an ODP MR which leads to CQE with error") Signed-off-by: Or Har-Toov <[email protected]> Reviewed-by: Michael Guralnik <[email protected]> Link: https://patch.msgid.link/3c8f225a8a9fade647d19b014df1172544643e4a.1750061612.git.leon@kernel.org Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Ahmed Zaki <[email protected]> Date: Fri May 23 14:55:37 2025 -0600 idpf: convert control queue mutex to a spinlock [ Upstream commit b2beb5bb2cd90d7939e470ed4da468683f41baa3 ] With VIRTCHNL2_CAP_MACFILTER enabled, the following warning is generated on module load: [ 324.701677] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578 [ 324.701684] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1582, name: NetworkManager [ 324.701689] preempt_count: 201, expected: 0 [ 324.701693] RCU nest depth: 0, expected: 0 [ 324.701697] 2 locks held by NetworkManager/1582: [ 324.701702] #0: ffffffff9f7be770 (rtnl_mutex){....}-{3:3}, at: rtnl_newlink+0x791/0x21e0 [ 324.701730] #1: ff1100216c380368 (_xmit_ETHER){....}-{2:2}, at: __dev_open+0x3f0/0x870 [ 324.701749] Preemption disabled at: [ 324.701752] [<ffffffff9cd23b9d>] __dev_open+0x3dd/0x870 [ 324.701765] CPU: 30 UID: 0 PID: 1582 Comm: NetworkManager Not tainted 6.15.0-rc5+ #2 PREEMPT(voluntary) [ 324.701771] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 [ 324.701774] Call Trace: [ 324.701777] <TASK> [ 324.701779] dump_stack_lvl+0x5d/0x80 [ 324.701788] ? __dev_open+0x3dd/0x870 [ 324.701793] __might_resched.cold+0x1ef/0x23d <..> [ 324.701818] __mutex_lock+0x113/0x1b80 <..> [ 324.701917] idpf_ctlq_clean_sq+0xad/0x4b0 [idpf] [ 324.701935] ? kasan_save_track+0x14/0x30 [ 324.701941] idpf_mb_clean+0x143/0x380 [idpf] <..> [ 324.701991] idpf_send_mb_msg+0x111/0x720 [idpf] [ 324.702009] idpf_vc_xn_exec+0x4cc/0x990 [idpf] [ 324.702021] ? rcu_is_watching+0x12/0xc0 [ 324.702035] idpf_add_del_mac_filters+0x3ed/0xb50 [idpf] <..> [ 324.702122] __hw_addr_sync_dev+0x1cf/0x300 [ 324.702126] ? find_held_lock+0x32/0x90 [ 324.702134] idpf_set_rx_mode+0x317/0x390 [idpf] [ 324.702152] __dev_open+0x3f8/0x870 [ 324.702159] ? __pfx___dev_open+0x10/0x10 [ 324.702174] __dev_change_flags+0x443/0x650 <..> [ 324.702208] netif_change_flags+0x80/0x160 [ 324.702218] do_setlink.isra.0+0x16a0/0x3960 <..> [ 324.702349] rtnl_newlink+0x12fd/0x21e0 The sequence is as follows: rtnl_newlink()-> __dev_change_flags()-> __dev_open()-> dev_set_rx_mode() - > # disables BH and grabs "dev->addr_list_lock" idpf_set_rx_mode() -> # proceed only if VIRTCHNL2_CAP_MACFILTER is ON __dev_uc_sync() -> idpf_add_mac_filter -> idpf_add_del_mac_filters -> idpf_send_mb_msg() -> idpf_mb_clean() -> idpf_ctlq_clean_sq() # mutex_lock(cq_lock) Fix by converting cq_lock to a spinlock. All operations under the new lock are safe except freeing the DMA memory, which may use vunmap(). Fix by requesting a contiguous physical memory for the DMA mapping. Fixes: a251eee62133 ("idpf: add SRIOV support and other ndo_ops") Reviewed-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Ahmed Zaki <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Samuel Salin <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Michal Swiatkowski <[email protected]> Date: Thu May 22 10:52:06 2025 +0200 idpf: return 0 size for RSS key if not supported [ Upstream commit f77bf1ebf8ff6301ccdbc346f7b52db928f9cbf8 ] Returning -EOPNOTSUPP from function returning u32 is leading to cast and invalid size value as a result. -EOPNOTSUPP as a size probably will lead to allocation fail. Command: ethtool -x eth0 It is visible on all devices that don't have RSS caps set. [ 136.615917] Call Trace: [ 136.615921] <TASK> [ 136.615927] ? __warn+0x89/0x130 [ 136.615942] ? __alloc_frozen_pages_noprof+0x322/0x330 [ 136.615953] ? report_bug+0x164/0x190 [ 136.615968] ? handle_bug+0x58/0x90 [ 136.615979] ? exc_invalid_op+0x17/0x70 [ 136.615987] ? asm_exc_invalid_op+0x1a/0x20 [ 136.616001] ? rss_prepare_get.constprop.0+0xb9/0x170 [ 136.616016] ? __alloc_frozen_pages_noprof+0x322/0x330 [ 136.616028] __alloc_pages_noprof+0xe/0x20 [ 136.616038] ___kmalloc_large_node+0x80/0x110 [ 136.616072] __kmalloc_large_node_noprof+0x1d/0xa0 [ 136.616081] __kmalloc_noprof+0x32c/0x4c0 [ 136.616098] ? rss_prepare_get.constprop.0+0xb9/0x170 [ 136.616105] rss_prepare_get.constprop.0+0xb9/0x170 [ 136.616114] ethnl_default_doit+0x107/0x3d0 [ 136.616131] genl_family_rcv_msg_doit+0x100/0x160 [ 136.616147] genl_rcv_msg+0x1b8/0x2c0 [ 136.616156] ? __pfx_ethnl_default_doit+0x10/0x10 [ 136.616168] ? __pfx_genl_rcv_msg+0x10/0x10 [ 136.616176] netlink_rcv_skb+0x58/0x110 [ 136.616186] genl_rcv+0x28/0x40 [ 136.616195] netlink_unicast+0x19b/0x290 [ 136.616206] netlink_sendmsg+0x222/0x490 [ 136.616215] __sys_sendto+0x1fd/0x210 [ 136.616233] __x64_sys_sendto+0x24/0x30 [ 136.616242] do_syscall_64+0x82/0x160 [ 136.616252] ? __sys_recvmsg+0x83/0xe0 [ 136.616265] ? syscall_exit_to_user_mode+0x10/0x210 [ 136.616275] ? do_syscall_64+0x8e/0x160 [ 136.616282] ? __count_memcg_events+0xa1/0x130 [ 136.616295] ? count_memcg_events.constprop.0+0x1a/0x30 [ 136.616306] ? handle_mm_fault+0xae/0x2d0 [ 136.616319] ? do_user_addr_fault+0x379/0x670 [ 136.616328] ? clear_bhb_loop+0x45/0xa0 [ 136.616340] ? clear_bhb_loop+0x45/0xa0 [ 136.616349] ? clear_bhb_loop+0x45/0xa0 [ 136.616359] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 136.616369] RIP: 0033:0x7fd30ba7b047 [ 136.616376] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 80 3d bd d5 0c 00 00 41 89 ca 74 10 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 71 c3 55 48 83 ec 30 44 89 4c 24 2c 4c 89 44 [ 136.616381] RSP: 002b:00007ffde1796d68 EFLAGS: 00000202 ORIG_RAX: 000000000000002c [ 136.616388] RAX: ffffffffffffffda RBX: 000055d7bd89f2a0 RCX: 00007fd30ba7b047 [ 136.616392] RDX: 0000000000000028 RSI: 000055d7bd89f3b0 RDI: 0000000000000003 [ 136.616396] RBP: 00007ffde1796e10 R08: 00007fd30bb4e200 R09: 000000000000000c [ 136.616399] R10: 0000000000000000 R11: 0000000000000202 R12: 000055d7bd89f340 [ 136.616403] R13: 000055d7bd89f3b0 R14: 000055d78943f200 R15: 0000000000000000 Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks") Reviewed-by: Ahmed Zaki <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Samuel Salin <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Vitaly Lifshits <[email protected]> Date: Wed Jun 11 15:52:54 2025 +0300 igc: disable L1.2 PCI-E link substate to avoid performance issue [ Upstream commit 0325143b59c6c6d79987afc57d2456e7a20d13b7 ] I226 devices advertise support for the PCI-E link L1.2 substate. However, due to a hardware limitation, the exit latency from this low-power state is longer than the packet buffer can tolerate under high traffic conditions. This can lead to packet loss and degraded performance. To mitigate this, disable the L1.2 substate. The increased power draw between L1.1 and L1.2 is insignificant. Fixes: 43546211738e ("igc: Add new device ID's") Link: https://lore.kernel.org/intel-wired-lan/[email protected] Signed-off-by: Vitaly Lifshits <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Tested-by: Mor Bar-Gabay <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Yunshui Jiang <[email protected]> Date: Thu Jul 3 21:56:02 2025 -0700 Input: cs40l50-vibra - fix potential NULL dereference in cs40l50_upload_owt() commit 4cf65845fdd09d711fc7546d60c9abe010956922 upstream. The cs40l50_upload_owt() function allocates memory via kmalloc() without checking for allocation failure, which could lead to a NULL pointer dereference. Return -ENOMEM in case allocation fails. Signed-off-by: Yunshui Jiang <[email protected]> Fixes: c38fe1bb5d21 ("Input: cs40l50 - Add support for the CS40L50 haptic driver") Link: https://lore.kernel.org/r/[email protected] Cc: [email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jeff LaBundy <[email protected]> Date: Sun Jun 29 18:33:30 2025 -0700 Input: iqs7222 - explicitly define number of external channels commit 63f4970a1219b5256e8ea952096c86dab666d312 upstream. The number of external channels is assumed to be a multiple of 10, but this is not the case for IQS7222D. As a result, some CRx pins are wrongly prevented from being assigned to some channels. Address this problem by explicitly defining the number of external channels for cases in which the number of external channels is not equal to the total number of available channels. Fixes: dd24e202ac72 ("Input: iqs7222 - add support for Azoteq IQS7222D") Signed-off-by: Jeff LaBundy <[email protected]> Link: https://lore.kernel.org/r/aGHVf6HkyFZrzTPy@nixie71 Cc: [email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Nilton Perim Neto <[email protected]> Date: Fri Jun 27 16:29:40 2025 -0700 Input: xpad - support Acer NGR 200 Controller commit 22c69d786ef8fb789c61ca75492a272774221324 upstream. Add the NGR 200 Xbox 360 to the list of recognized controllers. Signed-off-by: Nilton Perim Neto <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: [email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Simon Xue <[email protected]> Date: Mon Jun 23 10:00:18 2025 +0800 iommu/rockchip: prevent iommus dead loop when two masters share one IOMMU commit 62e062a29ad5133f67c20b333ba0a952a99161ae upstream. When two masters share an IOMMU, calling ops->of_xlate during the second master's driver init may overwrite iommu->domain set by the first. This causes the check if (iommu->domain == domain) in rk_iommu_attach_device() to fail, resulting in the same iommu->node being added twice to &rk_domain->iommus, which can lead to an infinite loop in subsequent &rk_domain->iommus operations. Cc: <[email protected]> Fixes: 25c2325575cc ("iommu/rockchip: Add missing set_platform_dma_ops callback") Signed-off-by: Simon Xue <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Lu Baolu <[email protected]> Date: Sat Jun 28 18:03:51 2025 +0800 iommu/vt-d: Assign devtlb cache tag on ATS enablement commit 25b1b75bbaf96331750fb01302825069657b2ff8 upstream. Commit <4f1492efb495> ("iommu/vt-d: Revert ATS timing change to fix boot failure") placed the enabling of ATS in the probe_finalize callback. This occurs after the default domain attachment, which is when the ATS cache tag is assigned. Consequently, the device TLB cache tag is missed when the domain is attached, leading to the device TLB not being invalidated in the iommu_unmap paths. Fix this by assigning the CACHE_TAG_DEVTLB cache tag when ATS is enabled. Fixes: 4f1492efb495 ("iommu/vt-d: Revert ATS timing change to fix boot failure") Cc: [email protected] Suggested-by: Kevin Tian <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Tested-by: Shuicheng Lin <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Nicolin Chen <[email protected]> Date: Tue Jun 24 11:00:47 2025 -0700 iommufd/selftest: Add asserts testing global mfd commit a9bf67ee170514b17541038c60bb94cb2cf5732f upstream. The mfd and mfd_buffer will be used in the tests directly without an extra check. Test them in setup_sizes() to ensure they are safe to use. Fixes: 0bcceb1f51c7 ("iommufd: Selftest coverage for IOMMU_IOAS_MAP_FILE") Link: https://patch.msgid.link/r/94bdc11d2b6d5db337b1361c5e5fce0ed494bb40.1750787928.git.nicolinc@nvidia.com Cc: [email protected] Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Nicolin Chen <[email protected]> Date: Tue Jun 24 11:00:46 2025 -0700 iommufd/selftest: Add missing close(mfd) in memfd_mmap() commit 4b75e3babb85238912f50bbe0647bae08242a9b6 upstream. Do not forget to close mfd in the error paths, since none of the callers would close it when ASSERT_NE(MAP_FAILED, buf) fails. Fixes: 0bcceb1f51c7 ("iommufd: Selftest coverage for IOMMU_IOAS_MAP_FILE") Link: https://patch.msgid.link/r/a363a69dbf453d4bc1bde276f3b16778620488e1.1750787928.git.nicolinc@nvidia.com Cc: [email protected] Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Nicolin Chen <[email protected]> Date: Tue Jun 24 11:00:45 2025 -0700 iommufd/selftest: Fix iommufd_dirty_tracking with large hugepage sizes commit 818625570558cd91082c9bafd6f2b59b73241a69 upstream. The hugepage test cases of iommufd_dirty_tracking have the 64MB and 128MB coverages. Both of them are smaller than the default hugepage size 512MB, when CONFIG_PAGE_SIZE_64KB=y. However, these test cases have a variant of using huge pages, which would mmap(MAP_HUGETLB) using these smaller sizes than the system hugepag size. This results in the kernel aligning up the smaller size to 512MB. If a memory was located between the upper 64/128MB size boundary and the hugepage 512MB boundary, it would get wiped out: https://lore.kernel.org/all/[email protected]/ Given that this aligning up behavior is well documented, we have no choice but to allocate a hugepage aligned size to avoid this unintended wipe out. Instead of relying on the kernel's internal force alignment, pass the same size to posix_memalign() and map(). Also, fix the FIXTURE_TEARDOWN() misusing munmap() to free the memory from posix_memalign(), as munmap() doesn't destroy the allocator meta data. So, call free() instead. Fixes: a9af47e382a4 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP") Link: https://patch.msgid.link/r/1ea8609ae6d523fdd4d8efb179ddee79c8582cb6.1750787928.git.nicolinc@nvidia.com Cc: [email protected] Suggested-by: Jason Gunthorpe <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Borislav Petkov (AMD) <[email protected]> Date: Wed Sep 11 11:00:50 2024 +0200 KVM: SVM: Advertise TSA CPUID bits to guests Commit 31272abd5974b38ba312e9cf2ec2f09f9dd7dcba upstream. Synthesize the TSA CPUID feature bits for guests. Set TSA_{SQ,L1}_NO on unaffected machines. Signed-off-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Borislav Petkov <[email protected]> Date: Mon Mar 24 17:06:17 2025 +0100 KVM: x86: Sort CPUID_8000_0021_EAX leaf bits properly Commit 49c140d5af127ef4faf19f06a89a0714edf0316f upstream. WRMSR_XX_BASE_NS is bit 1 so put it there, add some new bits as comments only. Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected] [sean: skip the FSRS/FSRC placeholders to avoid confusion] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Dan Carpenter <[email protected]> Date: Mon Jun 30 14:36:40 2025 -0500 lib: test_objagg: Set error message in check_expect_hints_stats() [ Upstream commit e6ed134a4ef592fe1fd0cafac9683813b3c8f3e8 ] Smatch complains that the error message isn't set in the caller: lib/test_objagg.c:923 test_hints_case2() error: uninitialized symbol 'errmsg'. This static checker warning only showed up after a recent refactoring but the bug dates back to when the code was originally added. This likely doesn't affect anything in real life. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/r/[email protected]/ Fixes: 0a020d416d0a ("lib: introduce initial implementation of object aggregation manager") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Greg Kroah-Hartman <[email protected]> Date: Thu Jul 10 16:08:55 2025 +0200 Linux 6.15.6 Link: https://lore.kernel.org/r/[email protected] Tested-by: Ronald Warsow <[email protected]> Tested-by: Justin M. Forbes <[email protected]> Tested-by: Salvatore Bonaccorso <[email protected]> Tested-by: Takeshi Ogasawara <[email protected]> Tested-by: Mark Brown <[email protected]> Tested-by: Ron Economos <[email protected]> Tested-by: Luna Jernberg <[email protected]> Tested-by: Jon Hunter <[email protected]> Tested-by: Miguel Ojeda <[email protected]> Tested-by: Christian Heusel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Oliver Neukum <[email protected]> Date: Thu Jun 5 14:28:45 2025 +0200 Logitech C-270 even more broken commit cee4392a57e14a799fbdee193bc4c0de65b29521 upstream. Some varieties of this device don't work with RESET_RESUME alone. Signed-off-by: Oliver Neukum <[email protected]> Cc: stable <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jeongjun Park <[email protected]> Date: Fri May 9 01:56:20 2025 +0900 mm/vmalloc: fix data race in show_numa_info() commit 5c5f0468d172ddec2e333d738d2a1f85402cf0bc upstream. The following data-race was found in show_numa_info(): ================================================================== BUG: KCSAN: data-race in vmalloc_info_show / vmalloc_info_show read to 0xffff88800971fe30 of 4 bytes by task 8289 on cpu 0: show_numa_info mm/vmalloc.c:4936 [inline] vmalloc_info_show+0x5a8/0x7e0 mm/vmalloc.c:5016 seq_read_iter+0x373/0xb40 fs/seq_file.c:230 proc_reg_read_iter+0x11e/0x170 fs/proc/inode.c:299 .... write to 0xffff88800971fe30 of 4 bytes by task 8287 on cpu 1: show_numa_info mm/vmalloc.c:4934 [inline] vmalloc_info_show+0x38f/0x7e0 mm/vmalloc.c:5016 seq_read_iter+0x373/0xb40 fs/seq_file.c:230 proc_reg_read_iter+0x11e/0x170 fs/proc/inode.c:299 .... value changed: 0x0000008f -> 0x00000000 ================================================================== According to this report,there is a read/write data-race because m->private is accessible to multiple CPUs. To fix this, instead of allocating the heap in proc_vmalloc_init() and passing the heap address to m->private, vmalloc_info_show() should allocate the heap. Link: https://lkml.kernel.org/r/[email protected] Fixes: 8e1d743f2c26 ("mm: vmalloc: support multiple nodes in vmallocinfo") Signed-off-by: Jeongjun Park <[email protected]> Suggested-by: Eric Dumazet <[email protected]> Suggested-by: Andrew Morton <[email protected]> Reviewed-by: "Uladzislau Rezki (Sony)" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Victor Shih <[email protected]> Date: Fri Jun 6 19:01:19 2025 +0800 mmc: core: Adjust some error messages for SD UHS-II cards commit 14633da0f416fdbb6844d1b295cdc828b666e273 upstream. Adjust some error messages to debug mode to avoid causing misunderstanding it is an error. Signed-off-by: Victor Shih <[email protected]> Acked-by: Adrian Hunter <[email protected]> Fixes: 9a9f7e13952b ("mmc: core: Support UHS-II card control and access") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Avri Altman <[email protected]> Date: Mon May 26 14:44:45 2025 +0300 mmc: core: sd: Apply BROKEN_SD_DISCARD quirk earlier commit 009c3a4bc41e855fd76f92727f9fbae4e5917d7f upstream. Move the BROKEN_SD_DISCARD quirk for certain SanDisk SD cards from the `mmc_blk_fixups[]` to `mmc_sd_fixups[]`. This ensures the quirk is applied earlier in the device initialization process, aligning with the reasoning in [1]. Applying the quirk sooner prevents the kernel from incorrectly enabling discard support on affected cards during initial setup. [1] https://lore.kernel.org/all/[email protected] Fixes: 07d2872bf4c8 ("mmc: core: Add SD card quirk for broken discard") Signed-off-by: Avri Altman <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Victor Shih <[email protected]> Date: Fri Jun 6 19:01:21 2025 +0800 mmc: sdhci-uhs2: Adjust some error messages and register dump for SD UHS-II card commit 49b14db035135341f6cf4de7af6ac2cbc8ad29b6 upstream. Adjust some error messages to debug mode and register dump to dynamic debug mode to avoid causing misunderstanding it is an error. Signed-off-by: Victor Shih <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Victor Shih <[email protected]> Date: Fri Jun 6 19:01:20 2025 +0800 mmc: sdhci: Add a helper function for dump register in dynamic debug mode commit 2881ba9af073faa8ee7408a8d1e0575e50eb3f6c upstream. Add a helper function for dump register in dynamic debug mode. Signed-off-by: Victor Shih <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Peter Zijlstra <[email protected]> Date: Fri May 2 16:12:09 2025 +0200 module: Provide EXPORT_SYMBOL_GPL_FOR_MODULES() helper [ Upstream commit 707f853d7fa3ce323a6875487890c213e34d81a0 ] Helper macro to more easily limit the export of a symbol to a given list of modules. Eg: EXPORT_SYMBOL_GPL_FOR_MODULES(preempt_notifier_inc, "kvm"); will limit the use of said function to kvm.ko, any other module trying to use this symbol will refure to load (and get modpost build failures). Requested-by: Masahiro Yamada <[email protected]> Requested-by: Christoph Hellwig <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Reviewed-by: Petr Pavlu <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]> Stable-dep-of: cbe4134ea4bc ("fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass") Signed-off-by: Sasha Levin <[email protected]>
Author: Pablo Martin-Gomez <[email protected]> Date: Wed Jun 18 13:35:16 2025 +0200 mtd: spinand: fix memory leak of ECC engine conf [ Upstream commit 6463cbe08b0cbf9bba8763306764f5fd643023e1 ] Memory allocated for the ECC engine conf is not released during spinand cleanup. Below kmemleak trace is seen for this memory leak: unreferenced object 0xffffff80064f00e0 (size 8): comm "swapper/0", pid 1, jiffies 4294937458 hex dump (first 8 bytes): 00 00 00 00 00 00 00 00 ........ backtrace (crc 0): kmemleak_alloc+0x30/0x40 __kmalloc_cache_noprof+0x208/0x3c0 spinand_ondie_ecc_init_ctx+0x114/0x200 nand_ecc_init_ctx+0x70/0xa8 nanddev_ecc_engine_init+0xec/0x27c spinand_probe+0xa2c/0x1620 spi_mem_probe+0x130/0x21c spi_probe+0xf0/0x170 really_probe+0x17c/0x6e8 __driver_probe_device+0x17c/0x21c driver_probe_device+0x58/0x180 __device_attach_driver+0x15c/0x1f8 bus_for_each_drv+0xec/0x150 __device_attach+0x188/0x24c device_initial_probe+0x10/0x20 bus_probe_device+0x11c/0x160 Fix the leak by calling nanddev_ecc_engine_cleanup() inside spinand_cleanup(). Signed-off-by: Pablo Martin-Gomez <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Masami Hiramatsu (Google) <[email protected]> Date: Thu Jun 5 10:07:38 2025 +0900 mtk-sd: Fix a pagefault in dma_unmap_sg() for not prepared data commit 539d80575b810c7a5987c7ac8915e3bc99c03695 upstream. When swiotlb buffer is full, the dma_map_sg() returns 0 to msdc_prepare_data(), but it does not check it and sets the MSDC_PREPARE_FLAG. swiotlb_tbl_map_single() /* prints "swiotlb buffer is full" */ <-swiotlb_map() <-dma_direct_map_page() <-dma_direct_map_sg() <-__dma_map_sg_attrs() <-dma_map_sg_attrs() <-dma_map_sg() /* returns 0 (pages mapped) */ <-msdc_prepare_data() Then, the msdc_unprepare_data() checks MSDC_PREPARE_FLAG and calls dma_unmap_sg() with unmapped pages. It causes a page fault. To fix this problem, Do not set MSDC_PREPARE_FLAG if dma_map_sg() fails because this is not prepared. Fixes: 208489032bdd ("mmc: mediatek: Add Mediatek MMC driver") Signed-off-by: Masami Hiramatsu (Google) <[email protected]> Tested-by: Sergey Senozhatsky <[email protected]> Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/174908565814.4056588.769599127120955383.stgit@mhiramat.tok.corp.google.com Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Masami Hiramatsu (Google) <[email protected]> Date: Thu Jun 12 20:26:10 2025 +0900 mtk-sd: Prevent memory corruption from DMA map failure commit f5de469990f19569627ea0dd56536ff5a13beaa3 upstream. If msdc_prepare_data() fails to map the DMA region, the request is not prepared for data receiving, but msdc_start_data() proceeds the DMA with previous setting. Since this will lead a memory corruption, we have to stop the request operation soon after the msdc_prepare_data() fails to prepare it. Signed-off-by: Masami Hiramatsu (Google) <[email protected]> Fixes: 208489032bdd ("mmc: mediatek: Add Mediatek MMC driver") Cc: [email protected] Link: https://lore.kernel.org/r/174972756982.3337526.6755001617701603082.stgit@mhiramat.tok.corp.google.com Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Sergey Senozhatsky <[email protected]> Date: Wed Jun 25 14:20:37 2025 +0900 mtk-sd: reset host->mrq on prepare_data() error commit ec54c0a20709ed6e56f40a8d59eee725c31a916b upstream. Do not leave host with dangling ->mrq pointer if we hit the msdc_prepare_data() error out path. Signed-off-by: Sergey Senozhatsky <[email protected]> Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Fixes: f5de469990f1 ("mtk-sd: Prevent memory corruption from DMA map failure") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Lion Ackermann <[email protected]> Date: Mon Jun 30 15:27:30 2025 +0200 net/sched: Always pass notifications when child class becomes empty [ Upstream commit 103406b38c600fec1fe375a77b27d87e314aea09 ] Certain classful qdiscs may invoke their classes' dequeue handler on an enqueue operation. This may unexpectedly empty the child qdisc and thus make an in-flight class passive via qlen_notify(). Most qdiscs do not expect such behaviour at this point in time and may re-activate the class eventually anyways which will lead to a use-after-free. The referenced fix commit attempted to fix this behavior for the HFSC case by moving the backlog accounting around, though this turned out to be incomplete since the parent's parent may run into the issue too. The following reproducer demonstrates this use-after-free: tc qdisc add dev lo root handle 1: drr tc filter add dev lo parent 1: basic classid 1:1 tc class add dev lo parent 1: classid 1:1 drr tc qdisc add dev lo parent 1:1 handle 2: hfsc def 1 tc class add dev lo parent 2: classid 2:1 hfsc rt m1 8 d 1 m2 0 tc qdisc add dev lo parent 2:1 handle 3: netem tc qdisc add dev lo parent 3:1 handle 4: blackhole echo 1 | socat -u STDIN UDP4-DATAGRAM:127.0.0.1:8888 tc class delete dev lo classid 1:1 echo 1 | socat -u STDIN UDP4-DATAGRAM:127.0.0.1:8888 Since backlog accounting issues leading to a use-after-frees on stale class pointers is a recurring pattern at this point, this patch takes a different approach. Instead of trying to fix the accounting, the patch ensures that qdisc_tree_reduce_backlog always calls qlen_notify when the child qdisc is empty. This solves the problem because deletion of qdiscs always involves a call to qdisc_reset() and / or qdisc_purge_queue() which ultimately resets its qlen to 0 thus causing the following qdisc_tree_reduce_backlog() to report to the parent. Note that this may call qlen_notify on passive classes multiple times. This is not a problem after the recent patch series that made all the classful qdiscs qlen_notify() handlers idempotent. Fixes: 3f981138109f ("sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue()") Signed-off-by: Lion Ackermann <[email protected]> Reviewed-by: Jamal Hadi Salim <[email protected]> Acked-by: Cong Wang <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Antoine Tenart <[email protected]> Date: Tue Jul 1 09:49:34 2025 +0200 net: ipv4: fix stat increase when udp early demux drops the packet [ Upstream commit c2a2ff6b4db55647575260bf2227b0e09d46addb ] udp_v4_early_demux now returns drop reasons as it either returns 0 or ip_mc_validate_source, which returns itself a drop reason. However its use was not converted in ip_rcv_finish_core and the drop reason is ignored, leading to potentially skipping increasing LINUX_MIB_IPRPFILTER if the drop reason is SKB_DROP_REASON_IP_RPFILTER. This is a fix and we're not converting udp_v4_early_demux to explicitly return a drop reason to ease backports; this can be done as a follow-up. Fixes: d46f827016d8 ("net: ip: make ip_mc_validate_source() return drop reason") Cc: Menglong Dong <[email protected]> Reported-by: Sabrina Dubroca <[email protected]> Signed-off-by: Antoine Tenart <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jiawen Wu <[email protected]> Date: Tue Jul 1 15:06:25 2025 +0800 net: libwx: fix the incorrect display of the queue number commit 5186ff7e1d0e26aaef998ba18b31c79c28d1441f upstream. When setting "ethtool -L eth0 combined 1", the number of RX/TX queue is changed to be 1. RSS is disabled at this moment, and the indices of FDIR have not be changed in wx_set_rss_queues(). So the combined count still shows the previous value. This issue was introduced when supporting FDIR. Fix it for those devices that support FDIR. Fixes: 34744a7749b3 ("net: txgbe: add FDIR info to ethtool ops") Cc: [email protected] Signed-off-by: Jiawen Wu <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jiawen Wu <[email protected]> Date: Tue Jul 1 14:30:28 2025 +0800 net: txgbe: request MISC IRQ in ndo_open commit cc9f7f65cd2f31150b10e6956f1f0882e1bbae49 upstream. Move the creating of irq_domain for MISC IRQ from .probe to .ndo_open, and free it in .ndo_stop, to maintain consistency with the queue IRQs. This it for subsequent adjustments to the IRQ vectors. Fixes: aefd013624a1 ("net: txgbe: use irq_domain for interrupt controller") Cc: [email protected] Signed-off-by: Jiawen Wu <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Oleksij Rempel <[email protected]> Date: Fri Jun 27 07:13:46 2025 +0200 net: usb: lan78xx: fix WARN in __netif_napi_del_locked on disconnect [ Upstream commit 6c7ffc9af7186ed79403a3ffee9a1e5199fc7450 ] Remove redundant netif_napi_del() call from disconnect path. A WARN may be triggered in __netif_napi_del_locked() during USB device disconnect: WARNING: CPU: 0 PID: 11 at net/core/dev.c:7417 __netif_napi_del_locked+0x2b4/0x350 This happens because netif_napi_del() is called in the disconnect path while NAPI is still enabled. However, it is not necessary to call netif_napi_del() explicitly, since unregister_netdev() will handle NAPI teardown automatically and safely. Removing the redundant call avoids triggering the warning. Full trace: lan78xx 1-1:1.0 enu1: Failed to read register index 0x000000c4. ret = -ENODEV lan78xx 1-1:1.0 enu1: Failed to set MAC down with error -ENODEV lan78xx 1-1:1.0 enu1: Link is Down lan78xx 1-1:1.0 enu1: Failed to read register index 0x00000120. ret = -ENODEV ------------[ cut here ]------------ WARNING: CPU: 0 PID: 11 at net/core/dev.c:7417 __netif_napi_del_locked+0x2b4/0x350 Modules linked in: flexcan can_dev fuse CPU: 0 UID: 0 PID: 11 Comm: kworker/0:1 Not tainted 6.16.0-rc2-00624-ge926949dab03 #9 PREEMPT Hardware name: SKOV IMX8MP CPU revC - bd500 (DT) Workqueue: usb_hub_wq hub_event pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __netif_napi_del_locked+0x2b4/0x350 lr : __netif_napi_del_locked+0x7c/0x350 sp : ffffffc085b673c0 x29: ffffffc085b673c0 x28: ffffff800b7f2000 x27: ffffff800b7f20d8 x26: ffffff80110bcf58 x25: ffffff80110bd978 x24: 1ffffff0022179eb x23: ffffff80110bc000 x22: ffffff800b7f5000 x21: ffffff80110bc000 x20: ffffff80110bcf38 x19: ffffff80110bcf28 x18: dfffffc000000000 x17: ffffffc081578940 x16: ffffffc08284cee0 x15: 0000000000000028 x14: 0000000000000006 x13: 0000000000040000 x12: ffffffb0022179e8 x11: 1ffffff0022179e7 x10: ffffffb0022179e7 x9 : dfffffc000000000 x8 : 0000004ffdde8619 x7 : ffffff80110bcf3f x6 : 0000000000000001 x5 : ffffff80110bcf38 x4 : ffffff80110bcf38 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 1ffffff0022179e7 x0 : 0000000000000000 Call trace: __netif_napi_del_locked+0x2b4/0x350 (P) lan78xx_disconnect+0xf4/0x360 usb_unbind_interface+0x158/0x718 device_remove+0x100/0x150 device_release_driver_internal+0x308/0x478 device_release_driver+0x1c/0x30 bus_remove_device+0x1a8/0x368 device_del+0x2e0/0x7b0 usb_disable_device+0x244/0x540 usb_disconnect+0x220/0x758 hub_event+0x105c/0x35e0 process_one_work+0x760/0x17b0 worker_thread+0x768/0xce8 kthread+0x3bc/0x690 ret_from_fork+0x10/0x20 irq event stamp: 211604 hardirqs last enabled at (211603): [<ffffffc0828cc9ec>] _raw_spin_unlock_irqrestore+0x84/0x98 hardirqs last disabled at (211604): [<ffffffc0828a9a84>] el1_dbg+0x24/0x80 softirqs last enabled at (211296): [<ffffffc080095f10>] handle_softirqs+0x820/0xbc8 softirqs last disabled at (210993): [<ffffffc080010288>] __do_softirq+0x18/0x20 ---[ end trace 0000000000000000 ]--- lan78xx 1-1:1.0 enu1: failed to kill vid 0081/0 Fixes: ec4c7e12396b ("lan78xx: Introduce NAPI polling support") Suggested-by: Jakub Kicinski <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: David Howells <[email protected]> Date: Tue Jul 1 17:38:37 2025 +0100 netfs: Fix double put of request [ Upstream commit 9df7b5ebead649b00bf9a53a798e4bf83a1318fd ] If a netfs request finishes during the pause loop, it will have the ref that belongs to the IN_PROGRESS flag removed at that point - however, if it then goes to the final wait loop, that will *also* put the ref because it sees that the IN_PROGRESS flag is clear and incorrectly assumes that this happened when it called the collector. In fact, since IN_PROGRESS is clear, we shouldn't call the collector again since it's done all the cleanup, such as calling ->ki_complete(). Fix this by making netfs_collect_in_app() just return, indicating that we're done if IN_PROGRESS is removed. Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used") Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/[email protected] Tested-by: Steve French <[email protected]> Reviewed-by: Paulo Alcantara <[email protected]> cc: Steve French <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: David Howells <[email protected]> Date: Tue Jul 1 17:38:36 2025 +0100 netfs: Fix hang due to missing case in final DIO read result collection [ Upstream commit da8cf4bd458722d090a788c6e581eeb72695c62f ] When doing a DIO read, if the subrequests we issue fail and cause the request PAUSE flag to be set to put a pause on subrequest generation, we may complete collection of the subrequests (possibly discarding them) prior to the ALL_QUEUED flags being set. In such a case, netfs_read_collection() doesn't see ALL_QUEUED being set after netfs_collect_read_results() returns and will just return to the app (the collector can be seen unpausing the generator in the trace log). The subrequest generator can then set ALL_QUEUED and the app thread reaches netfs_wait_for_request(). This causes netfs_collect_in_app() to be called to see if we're done yet, but there's missing case here. netfs_collect_in_app() will see that a thread is active and set inactive to false, but won't see any subrequests in the read stream, and so won't set need_collect to true. The function will then just return 0, indicating that the caller should just sleep until further activity (which won't be forthcoming) occurs. Fix this by making netfs_collect_in_app() check to see if an active thread is complete - i.e. that ALL_QUEUED is set and the subrequests list is empty - and to skip the sleep return path. The collector will then be called which will clear the request IN_PROGRESS flag, allowing the app to progress. Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used") Reported-by: Steve French <[email protected]> Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/[email protected] Tested-by: Steve French <[email protected]> Reviewed-by: Paulo Alcantara <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: David Howells <[email protected]> Date: Tue Jul 1 17:38:45 2025 +0100 netfs: Fix i_size updating [ Upstream commit 2e0658940d90a3dc130bb3b7f75bae9f4100e01f ] Fix the updating of i_size, particularly in regard to the completion of DIO writes and especially async DIO writes by using a lock. The bug is triggered occasionally by the generic/207 xfstest as it chucks a bunch of AIO DIO writes at the filesystem and then checks that fstat() returns a reasonable st_size as each completes. The problem is that netfs is trying to do "if new_size > inode->i_size, update inode->i_size" sort of thing but without a lock around it. This can be seen with cifs, but shouldn't be seen with kafs because kafs serialises modification ops on the client whereas cifs sends the requests to the server as they're generated and lets the server order them. Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/[email protected] Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]> cc: Steve French <[email protected]> cc: Paulo Alcantara <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: David Howells <[email protected]> Date: Tue Jul 1 17:38:39 2025 +0100 netfs: Fix looping in wait functions [ Upstream commit 09623e3a14c1c5465124350cd227457c2b0fb017 ] netfs_wait_for_request() and netfs_wait_for_pause() can loop forever if netfs_collect_in_app() returns 2, indicating that it wants to repeat because the ALL_QUEUED flag isn't yet set and there are no subreqs left that haven't been collected. The problem is that, unless collection is offloaded (OFFLOAD_COLLECTION), we have to return to the application thread to continue and eventually set ALL_QUEUED after pausing to deal with a retry - but we never get there. Fix this by inserting checks for the IN_PROGRESS and PAUSE flags as appropriate before cycling round - and add cond_resched() for good measure. Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used") Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/[email protected] Tested-by: Steve French <[email protected]> Reviewed-by: Paulo Alcantara <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: David Howells <[email protected]> Date: Tue Jul 1 17:38:40 2025 +0100 netfs: Fix ref leak on inserted extra subreq in write retry [ Upstream commit 97d8e8e52cb8ab3d7675880a92626d9a4332f7a6 ] The write-retry algorithm will insert extra subrequests into the list if it can't get sufficient capacity to split the range that needs to be retried into the sequence of subrequests it currently has (for instance, if the cifs credit pool has fewer credits available than it did when the range was originally divided). However, the allocator furnishes each new subreq with 2 refs and then another is added for resubmission, causing one to be leaked. Fix this by replacing the ref-getting line with a neutral trace line. Fixes: 288ace2f57c9 ("netfs: New writeback implementation") Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/[email protected] Tested-by: Steve French <[email protected]> Reviewed-by: Paulo Alcantara <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kuniyuki Iwashima <[email protected]> Date: Thu Jun 12 14:52:50 2025 -0700 nfs: Clean up /proc/net/rpc/nfs when nfs_fs_proc_net_init() fails. [ Upstream commit e8d6f3ab59468e230f3253efe5cb63efa35289f7 ] syzbot reported a warning below [1] following a fault injection in nfs_fs_proc_net_init(). [0] When nfs_fs_proc_net_init() fails, /proc/net/rpc/nfs is not removed. Later, rpc_proc_exit() tries to remove /proc/net/rpc, and the warning is logged as the directory is not empty. Let's handle the error of nfs_fs_proc_net_init() properly. [0]: FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 0 CPU: 1 UID: 0 PID: 6120 Comm: syz.2.27 Not tainted 6.16.0-rc1-syzkaller-00010-g2c4a1f3fe03e #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) should_fail_ex (lib/fault-inject.c:73 lib/fault-inject.c:174) should_failslab (mm/failslab.c:46) kmem_cache_alloc_noprof (mm/slub.c:4178 mm/slub.c:4204) __proc_create (fs/proc/generic.c:427) proc_create_reg (fs/proc/generic.c:554) proc_create_net_data (fs/proc/proc_net.c:120) nfs_fs_proc_net_init (fs/nfs/client.c:1409) nfs_net_init (fs/nfs/inode.c:2600) ops_init (net/core/net_namespace.c:138) setup_net (net/core/net_namespace.c:443) copy_net_ns (net/core/net_namespace.c:576) create_new_namespaces (kernel/nsproxy.c:110) unshare_nsproxy_namespaces (kernel/nsproxy.c:218 (discriminator 4)) ksys_unshare (kernel/fork.c:3123) __x64_sys_unshare (kernel/fork.c:3190) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) </TASK> [1]: remove_proc_entry: removing non-empty directory 'net/rpc', leaking at least 'nfs' WARNING: CPU: 1 PID: 6120 at fs/proc/generic.c:727 remove_proc_entry+0x45e/0x530 fs/proc/generic.c:727 Modules linked in: CPU: 1 UID: 0 PID: 6120 Comm: syz.2.27 Not tainted 6.16.0-rc1-syzkaller-00010-g2c4a1f3fe03e #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 RIP: 0010:remove_proc_entry+0x45e/0x530 fs/proc/generic.c:727 Code: 3c 02 00 0f 85 85 00 00 00 48 8b 93 d8 00 00 00 4d 89 f0 4c 89 e9 48 c7 c6 40 ba a2 8b 48 c7 c7 60 b9 a2 8b e8 33 81 1d ff 90 <0f> 0b 90 90 e9 5f fe ff ff e8 04 69 5e ff 90 48 b8 00 00 00 00 00 RSP: 0018:ffffc90003637b08 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88805f534140 RCX: ffffffff817a92c8 RDX: ffff88807da99e00 RSI: ffffffff817a92d5 RDI: 0000000000000001 RBP: ffff888033431ac0 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888033431a00 R13: ffff888033431ae4 R14: ffff888033184724 R15: dffffc0000000000 FS: 0000555580328500(0000) GS:ffff888124a62000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f71733743e0 CR3: 000000007f618000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> sunrpc_exit_net+0x46/0x90 net/sunrpc/sunrpc_syms.c:76 ops_exit_list net/core/net_namespace.c:200 [inline] ops_undo_list+0x2eb/0xab0 net/core/net_namespace.c:253 setup_net+0x2e1/0x510 net/core/net_namespace.c:457 copy_net_ns+0x2a6/0x5f0 net/core/net_namespace.c:574 create_new_namespaces+0x3ea/0xa90 kernel/nsproxy.c:110 unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:218 ksys_unshare+0x45b/0xa40 kernel/fork.c:3121 __do_sys_unshare kernel/fork.c:3192 [inline] __se_sys_unshare kernel/fork.c:3190 [inline] __x64_sys_unshare+0x31/0x40 kernel/fork.c:3190 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0x490 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fa1a6b8e929 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fff3a090368 EFLAGS: 00000246 ORIG_RAX: 0000000000000110 RAX: ffffffffffffffda RBX: 00007fa1a6db5fa0 RCX: 00007fa1a6b8e929 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000040000080 RBP: 00007fa1a6c10b39 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fa1a6db5fa0 R14: 00007fa1a6db5fa0 R15: 0000000000000001 </TASK> Fixes: d47151b79e32 ("nfs: expose /proc/net/sunrpc/nfs in net namespaces") Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=a4cc4ac22daa4a71b87c Tested-by: [email protected] Signed-off-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Trond Myklebust <[email protected]> Date: Thu Jun 19 15:16:11 2025 -0400 NFSv4/flexfiles: Fix handling of NFS level errors in I/O [ Upstream commit 38074de35b015df5623f524d6f2b49a0cd395c40 ] Allow the flexfiles error handling to recognise NFS level errors (as opposed to RPC level errors) and handle them separately. The main motivator is the NFSERR_PERM errors that get returned if the NFS client connects to the data server through a port number that is lower than 1024. In that case, the client should disconnect and retry a READ on a different data server, or it should retry a WRITE after reconnecting. Reviewed-by: Tigran Mkrtchyan <[email protected]> Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Benjamin Coddington <[email protected]> Date: Thu Jun 19 11:02:21 2025 -0400 NFSv4/pNFS: Fix a race to wake on NFS_LAYOUT_DRAIN [ Upstream commit c01776287414ca43412d1319d2877cbad65444ac ] We found a few different systems hung up in writeback waiting on the same page lock, and one task waiting on the NFS_LAYOUT_DRAIN bit in pnfs_update_layout(), however the pnfs_layout_hdr's plh_outstanding count was zero. It seems most likely that this is another race between the waiter and waker similar to commit ed0172af5d6f ("SUNRPC: Fix a race to wake a sync task"). Fix it up by applying the advised barrier. Fixes: 880265c77ac4 ("pNFS: Avoid a live lock condition in pnfs_update_layout()") Signed-off-by: Benjamin Coddington <[email protected]> Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Thomas Fourier <[email protected]> Date: Mon Jun 30 10:36:43 2025 +0200 nui: Fix dma_mapping_error() check [ Upstream commit 561aa0e22b70a5e7246b73d62a824b3aef3fc375 ] dma_map_XXX() functions return values DMA_MAPPING_ERROR as error values which is often ~0. The error value should be tested with dma_mapping_error(). This patch creates a new function in niu_ops to test if the mapping failed. The test is fixed in niu_rbr_add_page(), added in niu_start_xmit() and the successfully mapped pages are unmaped upon error. Fixes: ec2deec1f352 ("niu: Fix to check for dma mapping errors.") Signed-off-by: Thomas Fourier <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Geliang Tang <[email protected]> Date: Mon Jun 30 15:48:32 2025 +0800 nvme-multipath: fix suspicious RCU usage warning [ Upstream commit d6811074203b13f715ce2480ac64c5b1c773f2a5 ] When I run the NVME over TCP test in virtme-ng, I get the following "suspicious RCU usage" warning in nvme_mpath_add_sysfs_link(): ''' [ 5.024557][ T44] nvmet: Created nvm controller 1 for subsystem nqn.2025-06.org.nvmexpress.mptcp for NQN nqn.2014-08.org.nvmexpress:uuid:f7f6b5e0-ff97-4894-98ac-c85309e0bc77. [ 5.027401][ T183] nvme nvme0: creating 2 I/O queues. [ 5.029017][ T183] nvme nvme0: mapped 2/0/0 default/read/poll queues. [ 5.032587][ T183] nvme nvme0: new ctrl: NQN "nqn.2025-06.org.nvmexpress.mptcp", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:f7f6b5e0-ff97-4894-98ac-c85309e0bc77 [ 5.042214][ T25] [ 5.042440][ T25] ============================= [ 5.042579][ T25] WARNING: suspicious RCU usage [ 5.042705][ T25] 6.16.0-rc3+ #23 Not tainted [ 5.042812][ T25] ----------------------------- [ 5.042934][ T25] drivers/nvme/host/multipath.c:1203 RCU-list traversed in non-reader section!! [ 5.043111][ T25] [ 5.043111][ T25] other info that might help us debug this: [ 5.043111][ T25] [ 5.043341][ T25] [ 5.043341][ T25] rcu_scheduler_active = 2, debug_locks = 1 [ 5.043502][ T25] 3 locks held by kworker/u9:0/25: [ 5.043615][ T25] #0: ffff888008730948 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x7ed/0x1350 [ 5.043830][ T25] #1: ffffc900001afd40 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0xcf3/0x1350 [ 5.044084][ T25] #2: ffff888013ee0020 (&head->srcu){.+.+}-{0:0}, at: nvme_mpath_add_sysfs_link.part.0+0xb4/0x3a0 [ 5.044300][ T25] [ 5.044300][ T25] stack backtrace: [ 5.044439][ T25] CPU: 0 UID: 0 PID: 25 Comm: kworker/u9:0 Not tainted 6.16.0-rc3+ #23 PREEMPT(full) [ 5.044441][ T25] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 5.044442][ T25] Workqueue: async async_run_entry_fn [ 5.044445][ T25] Call Trace: [ 5.044446][ T25] <TASK> [ 5.044449][ T25] dump_stack_lvl+0x6f/0xb0 [ 5.044453][ T25] lockdep_rcu_suspicious.cold+0x4f/0xb1 [ 5.044457][ T25] nvme_mpath_add_sysfs_link.part.0+0x2fb/0x3a0 [ 5.044459][ T25] ? queue_work_on+0x90/0xf0 [ 5.044461][ T25] ? lockdep_hardirqs_on+0x78/0x110 [ 5.044466][ T25] nvme_mpath_set_live+0x1e9/0x4f0 [ 5.044470][ T25] nvme_mpath_add_disk+0x240/0x2f0 [ 5.044472][ T25] ? __pfx_nvme_mpath_add_disk+0x10/0x10 [ 5.044475][ T25] ? add_disk_fwnode+0x361/0x580 [ 5.044480][ T25] nvme_alloc_ns+0x81c/0x17c0 [ 5.044483][ T25] ? kasan_quarantine_put+0x104/0x240 [ 5.044487][ T25] ? __pfx_nvme_alloc_ns+0x10/0x10 [ 5.044495][ T25] ? __pfx_nvme_find_get_ns+0x10/0x10 [ 5.044496][ T25] ? rcu_read_lock_any_held+0x45/0xa0 [ 5.044498][ T25] ? validate_chain+0x232/0x4f0 [ 5.044503][ T25] nvme_scan_ns+0x4c8/0x810 [ 5.044506][ T25] ? __pfx_nvme_scan_ns+0x10/0x10 [ 5.044508][ T25] ? find_held_lock+0x2b/0x80 [ 5.044512][ T25] ? ktime_get+0x16d/0x220 [ 5.044517][ T25] ? kvm_clock_get_cycles+0x18/0x30 [ 5.044520][ T25] ? __pfx_nvme_scan_ns_async+0x10/0x10 [ 5.044522][ T25] async_run_entry_fn+0x97/0x560 [ 5.044523][ T25] ? rcu_is_watching+0x12/0xc0 [ 5.044526][ T25] process_one_work+0xd3c/0x1350 [ 5.044532][ T25] ? __pfx_process_one_work+0x10/0x10 [ 5.044536][ T25] ? assign_work+0x16c/0x240 [ 5.044539][ T25] worker_thread+0x4da/0xd50 [ 5.044545][ T25] ? __pfx_worker_thread+0x10/0x10 [ 5.044546][ T25] kthread+0x356/0x5c0 [ 5.044548][ T25] ? __pfx_kthread+0x10/0x10 [ 5.044549][ T25] ? ret_from_fork+0x1b/0x2e0 [ 5.044552][ T25] ? __lock_release.isra.0+0x5d/0x180 [ 5.044553][ T25] ? ret_from_fork+0x1b/0x2e0 [ 5.044555][ T25] ? rcu_is_watching+0x12/0xc0 [ 5.044557][ T25] ? __pfx_kthread+0x10/0x10 [ 5.044559][ T25] ret_from_fork+0x218/0x2e0 [ 5.044561][ T25] ? __pfx_kthread+0x10/0x10 [ 5.044562][ T25] ret_from_fork_asm+0x1a/0x30 [ 5.044570][ T25] </TASK> ''' This patch uses sleepable RCU version of helper list_for_each_entry_srcu() instead of list_for_each_entry_rcu() to fix it. Fixes: 4dbd2b2ebe4c ("nvme-multipath: Add visibility for round-robin io-policy") Signed-off-by: Geliang Tang <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Nilay Shroff <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Eugen Hristev <[email protected]> Date: Fri Jun 13 16:21:01 2025 -0300 nvme-pci: refresh visible attrs after being checked [ Upstream commit 14005c96d6649b27fa52d0cd0f492eb3b5586c07 ] The sysfs attributes are registered early, but the driver does not know whether they are needed or not at that moment. For the CMB attributes, commit e917a849c3fc ("nvme-pci: refresh visible attrs for cmb attributes") solved this problem by calling nvme_update_attrs after mapping the CMB. However the issue persists for the HMB attributes. To solve the problem, moved the call to nvme_update_attrs after nvme_setup_host_mem, which sets up the HMB. Fixes: e917a849c3fc ("nvme-pci: refresh visible attrs for cmb attributes") Fixes: 86adbf0cdb9e ("nvme: simplify transport specific device attribute handling") Signed-off-by: Eugen Hristev <[email protected]> Signed-off-by: André Almeida <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Alok Tiwari <[email protected]> Date: Sat Jun 28 11:12:32 2025 -0700 nvme: Fix incorrect cdw15 value in passthru error logging [ Upstream commit 2e96d2d8c2a7a6c2cef45593c028d9c5ef180316 ] Fix an error in nvme_log_err_passthru() where cdw14 was incorrectly printed twice instead of cdw15. This fix ensures accurate logging of the full passthrough command payload. Fixes: 9f079dda1433 ("nvme: allow passthru cmd error logging") Signed-off-by: Alok Tiwari <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Dmitry Bogdanov <[email protected]> Date: Wed Jun 25 14:45:33 2025 +0300 nvmet: fix memory leak of bio integrity [ Upstream commit 190f4c2c863af7cc5bb354b70e0805f06419c038 ] If nvmet receives commands with metadata there is a continuous memory leak of kmalloc-128 slab or more precisely bio->bi_integrity. Since commit bf4c89fc8797 ("block: don't call bio_uninit from bio_endio") each user of bio_init has to use bio_uninit as well. Otherwise the bio integrity is not getting free. Nvmet uses bio_init for inline bios. Uninit the inline bio to complete deallocation of integrity in bio. Fixes: bf4c89fc8797 ("block: don't call bio_uninit from bio_endio") Signed-off-by: Dmitry Bogdanov <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Jens Wiklander <[email protected]> Date: Mon Jun 2 14:04:35 2025 +0200 optee: ffa: fix sleep in atomic context commit 312d02adb959ea199372f375ada06e0186f651e4 upstream. The OP-TEE driver registers the function notif_callback() for FF-A notifications. However, this function is called in an atomic context leading to errors like this when processing asynchronous notifications: | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:258 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 9, name: kworker/0:0 | preempt_count: 1, expected: 0 | RCU nest depth: 0, expected: 0 | CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.14.0-00019-g657536ebe0aa #13 | Hardware name: linux,dummy-virt (DT) | Workqueue: ffa_pcpu_irq_notification notif_pcpu_irq_work_fn | Call trace: | show_stack+0x18/0x24 (C) | dump_stack_lvl+0x78/0x90 | dump_stack+0x18/0x24 | __might_resched+0x114/0x170 | __might_sleep+0x48/0x98 | mutex_lock+0x24/0x80 | optee_get_msg_arg+0x7c/0x21c | simple_call_with_arg+0x50/0xc0 | optee_do_bottom_half+0x14/0x20 | notif_callback+0x3c/0x48 | handle_notif_callbacks+0x9c/0xe0 | notif_get_and_handle+0x40/0x88 | generic_exec_single+0x80/0xc0 | smp_call_function_single+0xfc/0x1a0 | notif_pcpu_irq_work_fn+0x2c/0x38 | process_one_work+0x14c/0x2b4 | worker_thread+0x2e4/0x3e0 | kthread+0x13c/0x210 | ret_from_fork+0x10/0x20 Fix this by adding work queue to process the notification in a non-atomic context. Fixes: d0476a59de06 ("optee: ffa_abi: add asynchronous notifications") Cc: [email protected] Reviewed-by: Sumit Garg <[email protected]> Tested-by: Sudeep Holla <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Wiklander <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Alok Tiwari <[email protected]> Date: Wed Jun 18 23:05:00 2025 -0700 platform/mellanox: mlxbf-pmc: Fix duplicate event ID for CACHE_DATA1 [ Upstream commit 173bbec6693f3f3f00dac144f3aa0cd62fb60d33 ] same ID (103) was assigned to both GDC_BANK0_G_RSE_PIPE_CACHE_DATA0 and GDC_BANK0_G_RSE_PIPE_CACHE_DATA1. This could lead to incorrect event mapping. Updated the ID to 104 to ensure uniqueness. Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3") Signed-off-by: Alok Tiwari <[email protected]> Reviewed-by: David Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: David Thompson <[email protected]> Date: Fri Jun 13 21:46:08 2025 +0000 platform/mellanox: mlxbf-tmfifo: fix vring_desc.len assignment [ Upstream commit 109f4d29dade8ae5b4ac6325af9d1bc24b4230f8 ] Fix warnings reported by sparse, related to incorrect type: drivers/platform/mellanox/mlxbf-tmfifo.c:284:38: warning: incorrect type in assignment (different base types) drivers/platform/mellanox/mlxbf-tmfifo.c:284:38: expected restricted __virtio32 [usertype] len drivers/platform/mellanox/mlxbf-tmfifo.c:284:38: got unsigned long Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Fixes: 78034cbece79 ("platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors") Signed-off-by: David Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Alok Tiwari <[email protected]> Date: Mon Jun 30 03:58:08 2025 -0700 platform/mellanox: mlxreg-lc: Fix logic error in power state check [ Upstream commit 644bec18e705ca41d444053407419a21832fcb2f ] Fixes a logic issue in mlxreg_lc_completion_notify() where the intention was to check if MLXREG_LC_POWERED flag is not set before powering on the device. The original code used "state & ~MLXREG_LC_POWERED" to check for the absence of the POWERED bit. However this condition evaluates to true even when other bits are set, leading to potentially incorrect behavior. Corrected the logic to explicitly check for the absence of MLXREG_LC_POWERED using !(state & MLXREG_LC_POWERED). Fixes: 62f9529b8d5c ("platform/mellanox: mlxreg-lc: Add initial support for Nvidia line card devices") Suggested-by: Vadim Pasternak <[email protected]> Signed-off-by: Alok Tiwari <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Alok Tiwari <[email protected]> Date: Sun Jun 22 00:29:12 2025 -0700 platform/mellanox: nvsw-sn2201: Fix bus number in adapter error message [ Upstream commit d07143b507c51c04c091081627c5a130e9d3c517 ] change error log to use correct bus number from main_mux_devs instead of cpld_devs. Fixes: 662f24826f95 ("platform/mellanox: Add support for new SN2201 system") Signed-off-by: Alok Tiwari <[email protected]> Reviewed-by: Vadim Pasternak <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mario Limonciello <[email protected]> Date: Wed Jun 11 15:33:40 2025 -0500 platform/x86/amd/pmc: Add PCSpecialist Lafite Pro V 14M to 8042 quirks list [ Upstream commit 9ba75ccad85708c5a484637dccc1fc59295b0a83 ] Every other s2idle cycle fails to reach hardware sleep when keyboard wakeup is enabled. This appears to be an EC bug, but the vendor refuses to fix it. It was confirmed that turning off i8042 wakeup avoids ths issue (albeit keyboard wakeup is disabled). Take the lesser of two evils and add it to the i8042 quirk list. Reported-by: Raoul <[email protected]> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220116 Tested-by: Raoul <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kurt Borja <[email protected]> Date: Wed Jun 25 22:17:37 2025 -0300 platform/x86: dell-wmi-sysman: Fix class device unregistration [ Upstream commit 314e5ad4782d08858b3abc325c0487bd2abc23a1 ] Devices under the firmware_attributes_class do not have unique a dev_t. Therefore, device_unregister() should be used instead of device_destroy(), since the latter may match any device with a given dev_t. Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Signed-off-by: Kurt Borja <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kurt Borja <[email protected]> Date: Mon Jun 30 00:43:12 2025 -0300 platform/x86: dell-wmi-sysman: Fix WMI data block retrieval in sysfs callbacks [ Upstream commit eb617dd25ca176f3fee24f873f0fd60010773d67 ] After retrieving WMI data blocks in sysfs callbacks, check for the validity of them before dereferencing their content. Reported-by: Jan Graczyk <[email protected]> Closes: https://lore.kernel.org/r/CAHk-=wgMiSKXf7SvQrfEnxVtmT=QVQPjJdNjfm3aXS7wc=rzTw@mail.gmail.com/ Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Suggested-by: Linus Torvalds <[email protected]> Reviewed-by: Armin Wolf <[email protected]> Signed-off-by: Kurt Borja <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kurt Borja <[email protected]> Date: Wed Jun 25 22:17:35 2025 -0300 platform/x86: hp-bioscfg: Fix class device unregistration [ Upstream commit 11cba4793b95df3bc192149a6eb044f69aa0b99e ] Devices under the firmware_attributes_class do not have unique a dev_t. Therefore, device_unregister() should be used instead of device_destroy(), since the latter may match any device with a given dev_t. Fixes: a34fc329b189 ("platform/x86: hp-bioscfg: bioscfg") Signed-off-by: Kurt Borja <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kurt Borja <[email protected]> Date: Mon Jun 30 14:31:19 2025 -0300 platform/x86: think-lmi: Create ksets consecutively commit 8dab34ca77293b409c3223636dde915a22656748 upstream. Avoid entering tlmi_release_attr() in error paths if both ksets are not yet created. This is accomplished by initializing them side by side. Reviewed-by: Mark Pearson <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Cc: [email protected] Signed-off-by: Kurt Borja <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Kurt Borja <[email protected]> Date: Wed Jun 25 22:17:36 2025 -0300 platform/x86: think-lmi: Fix class device unregistration [ Upstream commit 5ff1fbb3059730700b4823f43999fc1315984632 ] Devices under the firmware_attributes_class do not have unique a dev_t. Therefore, device_unregister() should be used instead of device_destroy(), since the latter may match any device with a given dev_t. Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Signed-off-by: Kurt Borja <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Kurt Borja <[email protected]> Date: Mon Jun 30 14:31:20 2025 -0300 platform/x86: think-lmi: Fix kobject cleanup commit 9110056fe10b0519529bdbbac37311a5037ea0c2 upstream. In tlmi_analyze(), allocated structs with an embedded kobject are freed in error paths after the they were already initialized. Fix this by first by avoiding the initialization of kobjects in tlmi_analyze() and then by correctly cleaning them up in tlmi_release_attr() using their kset's kobject list. Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Fixes: 30e78435d3bf ("platform/x86: think-lmi: Split kobject_init() and kobject_add() calls") Cc: [email protected] Reviewed-by: Mark Pearson <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Kurt Borja <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Kurt Borja <[email protected]> Date: Mon Jun 30 14:31:21 2025 -0300 platform/x86: think-lmi: Fix sysfs group cleanup commit 4f30f946f27b7f044cf8f3f1f353dee1dcd3517a upstream. Many error paths in tlmi_sysfs_init() lead to sysfs groups being removed when they were not even created. Fix this by letting the kobject core manage these groups through their kobj_type's defult_groups. Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Cc: [email protected] Reviewed-by: Mark Pearson <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Kurt Borja <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Armin Wolf <[email protected]> Date: Fri Jun 20 00:14:39 2025 +0200 platform/x86: wmi: Fix WMI event enablement [ Upstream commit cf0b812500e64a7d5e2957abed38c3a97917b34f ] It turns out that the Windows WMI-ACPI driver always enables/disables WMI events regardless of whether they are marked as expensive or not. This finding is further reinforced when reading the documentation of the WMI_FUNCTION_CONTROL_CALLBACK callback used by Windows drivers for enabling/disabling WMI devices: The DpWmiFunctionControl routine enables or disables notification of events, and enables or disables data collection for data blocks that the driver registered as expensive to collect. Follow this behavior to fix the WMI event used for reporting hotkey events on the Dell Latitude 5400 and likely many more devices. Reported-by: Dmytro Bagrii <[email protected]> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220246 Tested-by: Dmytro Bagrii <[email protected]> Fixes: 656f0961d126 ("platform/x86: wmi: Rework WCxx/WExx ACPI method handling") Signed-off-by: Armin Wolf <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Zhang Rui <[email protected]> Date: Thu Jun 19 15:13:40 2025 +0800 powercap: intel_rapl: Do not change CLAMPING bit if ENABLE bit cannot be changed commit 964209202ebe1569c858337441e87ef0f9d71416 upstream. PL1 cannot be disabled on some platforms. The ENABLE bit is still set after software clears it. This behavior leads to a scenario where, upon user request to disable the Power Limit through the powercap sysfs, the ENABLE bit remains set while the CLAMPING bit is inadvertently cleared. According to the Intel Software Developer's Manual, the CLAMPING bit, "When set, allows the processor to go below the OS requested P states in order to maintain the power below specified Platform Power Limit value." Thus this means the system may operate at higher power levels than intended on such platforms. Enhance the code to check ENABLE bit after writing to it, and stop further processing if ENABLE bit cannot be changed. Reported-by: Srinivas Pandruvada <[email protected]> Fixes: 2d281d8196e3 ("PowerCap: Introduce Intel RAPL power capping driver") Cc: All applicable <[email protected]> Signed-off-by: Zhang Rui <[email protected]> Link: https://patch.msgid.link/[email protected] [ rjw: Use str_enabled_disabled() instead of open-coded equivalent ] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Madhavan Srinivasan <[email protected]> Date: Sat May 17 19:52:37 2025 +0530 powerpc: Fix struct termio related ioctl macros [ Upstream commit ab107276607af90b13a5994997e19b7b9731e251 ] Since termio interface is now obsolete, include/uapi/asm/ioctls.h has some constant macros referring to "struct termio", this caused build failure at userspace. In file included from /usr/include/asm/ioctl.h:12, from /usr/include/asm/ioctls.h:5, from tst-ioctls.c:3: tst-ioctls.c: In function 'get_TCGETA': tst-ioctls.c:12:10: error: invalid application of 'sizeof' to incomplete type 'struct termio' 12 | return TCGETA; | ^~~~~~ Even though termios.h provides "struct termio", trying to juggle definitions around to make it compile could introduce regressions. So better to open code it. Reported-by: Tulio Magno <[email protected]> Suggested-by: Nicholas Piggin <[email protected]> Tested-by: Justin M. Forbes <[email protected]> Reviewed-by: Michael Ellerman <[email protected]> Closes: https://lore.kernel.org/linuxppc-dev/[email protected]/ Signed-off-by: Madhavan Srinivasan <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
Author: Uladzislau Rezki (Sony) <[email protected]> Date: Tue Jun 10 19:34:48 2025 +0200 rcu: Return early if callback is not specified [ Upstream commit 33b6a1f155d627f5bd80c7485c598ce45428f74f ] Currently the call_rcu() API does not check whether a callback pointer is NULL. If NULL is passed, rcu_core() will try to invoke it, resulting in NULL pointer dereference and a kernel crash. To prevent this and improve debuggability, this patch adds a check for NULL and emits a kernel stack trace to help identify a faulty caller. Signed-off-by: Uladzislau Rezki (Sony) <[email protected]> Reviewed-by: Joel Fernandes <[email protected]> Signed-off-by: Joel Fernandes <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Patrisious Haddad <[email protected]> Date: Mon Jun 16 12:14:53 2025 +0300 RDMA/mlx5: Fix CC counters query for MPV [ Upstream commit acd245b1e33fc4b9d0f2e3372021d632f7ee0652 ] In case, CC counters are querying for the second port use the correct core device for the query instead of always using the master core device. Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE") Signed-off-by: Patrisious Haddad <[email protected]> Reviewed-by: Michael Guralnik <[email protected]> Link: https://patch.msgid.link/9cace74dcf106116118bebfa9146d40d4166c6b0.1750064969.git.leon@kernel.org Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Patrisious Haddad <[email protected]> Date: Mon Jun 16 12:14:52 2025 +0300 RDMA/mlx5: Fix HW counters query for non-representor devices [ Upstream commit 3cc1dbfddf88dc5ecce0a75185061403b1f7352d ] To get the device HW counters, a non-representor switchdev device should use the mlx5_ib_query_q_counters() function and query all of the available counters. While a representor device in switchdev mode should use the mlx5_ib_query_q_counters_vport() function and query only the Q_Counters without the PPCNT counters and congestion control counters, since they aren't relevant for a representor device. Currently a non-representor switchdev device skips querying the PPCNT counters and congestion control counters, leaving them unupdated. Fix that by properly querying those counters for non-representor devices. Fixes: d22467a71ebe ("RDMA/mlx5: Expand switchdev Q-counters to expose representor statistics") Signed-off-by: Patrisious Haddad <[email protected]> Reviewed-by: Maher Sanalla <[email protected]> Link: https://patch.msgid.link/56bf8af4ca8c58e3fb9f7e47b1dca2009eeeed81.1750064969.git.leon@kernel.org Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Or Har-Toov <[email protected]> Date: Mon Jun 16 11:17:01 2025 +0300 RDMA/mlx5: Fix unsafe xarray access in implicit ODP handling [ Upstream commit 2c6b640ea08bff1a192bf87fa45246ff1e40767c ] __xa_store() and __xa_erase() were used without holding the proper lock, which led to a lockdep warning due to unsafe RCU usage. This patch replaces them with xa_store() and xa_erase(), which perform the necessary locking internally. ============================= WARNING: suspicious RCPU usage 6.14.0-rc7_for_upstream_debug_2025_03_18_15_01 #1 Not tainted ----------------------------- ./include/linux/xarray.h:1211 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by kworker/u136:0/219: at: process_one_work+0xbe4/0x15f0 process_one_work+0x75c/0x15f0 pagefault_mr+0x9a5/0x1390 [mlx5_ib] stack backtrace: CPU: 14 UID: 0 PID: 219 Comm: kworker/u136:0 Not tainted 6.14.0-rc7_for_upstream_debug_2025_03_18_15_01 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib] Call Trace: dump_stack_lvl+0xa8/0xc0 lockdep_rcu_suspicious+0x1e6/0x260 xas_create+0xb8a/0xee0 xas_store+0x73/0x14c0 __xa_store+0x13c/0x220 ? xa_store_range+0x390/0x390 ? spin_bug+0x1d0/0x1d0 pagefault_mr+0xcb5/0x1390 [mlx5_ib] ? _raw_spin_unlock+0x1f/0x30 mlx5_ib_eqe_pf_action+0x3be/0x2620 [mlx5_ib] ? lockdep_hardirqs_on_prepare+0x400/0x400 ? mlx5_ib_invalidate_range+0xcb0/0xcb0 [mlx5_ib] process_one_work+0x7db/0x15f0 ? pwq_dec_nr_in_flight+0xda0/0xda0 ? assign_work+0x168/0x240 worker_thread+0x57d/0xcd0 ? rescuer_thread+0xc40/0xc40 kthread+0x3b3/0x800 ? kthread_is_per_cpu+0xb0/0xb0 ? lock_downgrade+0x680/0x680 ? do_raw_spin_lock+0x12d/0x270 ? spin_bug+0x1d0/0x1d0 ? finish_task_switch.isra.0+0x284/0x9e0 ? lockdep_hardirqs_on_prepare+0x284/0x400 ? kthread_is_per_cpu+0xb0/0xb0 ret_from_fork+0x2d/0x70 ? kthread_is_per_cpu+0xb0/0xb0 ret_from_fork_asm+0x11/0x20 Fixes: d3d930411ce3 ("RDMA/mlx5: Fix implicit ODP use after free") Link: https://patch.msgid.link/r/a85ddd16f45c8cb2bc0a188c2b0fcedfce975eb8.1750061791.git.leon@kernel.org Signed-off-by: Or Har-Toov <[email protected]> Reviewed-by: Patrisious Haddad <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Patrisious Haddad <[email protected]> Date: Mon Jun 16 12:14:54 2025 +0300 RDMA/mlx5: Fix vport loopback for MPV device [ Upstream commit a9a9e68954f29b1e197663f76289db4879fd51bb ] Always enable vport loopback for both MPV devices on driver start. Previously in some cases related to MPV RoCE, packets weren't correctly executing loopback check at vport in FW, since it was disabled. Due to complexity of identifying such cases for MPV always enable vport loopback for both GVMIs when binding the slave to the master port. Fixes: 0042f9e458a5 ("RDMA/mlx5: Enable vport loopback when user context or QP mandate") Signed-off-by: Patrisious Haddad <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Link: https://patch.msgid.link/d4298f5ebb2197459e9e7221c51ecd6a34699847.1750064969.git.leon@kernel.org Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mark Zhang <[email protected]> Date: Tue Jun 17 11:13:55 2025 +0300 RDMA/mlx5: Initialize obj_event->obj_sub_list before xa_insert [ Upstream commit 8edab8a72d67742f87e9dc2e2b0cdfddda5dc29a ] The obj_event may be loaded immediately after inserted, then if the list_head is not initialized then we may get a poisonous pointer. This fixes the crash below: mlx5_core 0000:03:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 enhanced) mlx5_core.sf mlx5_core.sf.4: firmware version: 32.38.3056 mlx5_core 0000:03:00.0 en3f0pf0sf2002: renamed from eth0 mlx5_core.sf mlx5_core.sf.4: Rate limit: 127 rates are supported, range: 0Mbps to 195312Mbps IPv6: ADDRCONF(NETDEV_CHANGE): en3f0pf0sf2002: link becomes ready Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060 Mem abort info: ESR = 0x96000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000006 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=00000007760fb000 [0000000000000060] pgd=000000076f6d7003, p4d=000000076f6d7003, pud=0000000777841003, pmd=0000000000000000 Internal error: Oops: 96000006 [#1] SMP Modules linked in: ipmb_host(OE) act_mirred(E) cls_flower(E) sch_ingress(E) mptcp_diag(E) udp_diag(E) raw_diag(E) unix_diag(E) tcp_diag(E) inet_diag(E) binfmt_misc(E) bonding(OE) rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) isofs(E) cdrom(E) mst_pciconf(OE) ib_umad(OE) mlx5_ib(OE) ipmb_dev_int(OE) mlx5_core(OE) kpatch_15237886(OEK) mlxdevm(OE) auxiliary(OE) ib_uverbs(OE) ib_core(OE) psample(E) mlxfw(OE) tls(E) sunrpc(E) vfat(E) fat(E) crct10dif_ce(E) ghash_ce(E) sha1_ce(E) sbsa_gwdt(E) virtio_console(E) ext4(E) mbcache(E) jbd2(E) xfs(E) libcrc32c(E) mmc_block(E) virtio_net(E) net_failover(E) failover(E) sha2_ce(E) sha256_arm64(E) nvme(OE) nvme_core(OE) gpio_mlxbf3(OE) mlx_compat(OE) mlxbf_pmc(OE) i2c_mlxbf(OE) sdhci_of_dwcmshc(OE) pinctrl_mlxbf3(OE) mlxbf_pka(OE) gpio_generic(E) i2c_core(E) mmc_core(E) mlxbf_gige(OE) vitesse(E) pwr_mlxbf(OE) mlxbf_tmfifo(OE) micrel(E) mlxbf_bootctl(OE) virtio_ring(E) virtio(E) ipmi_devintf(E) ipmi_msghandler(E) [last unloaded: mst_pci] CPU: 11 PID: 20913 Comm: rte-worker-11 Kdump: loaded Tainted: G OE K 5.10.134-13.1.an8.aarch64 #1 Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.2.2.12968 Oct 26 2023 pstate: a0400089 (NzCv daIf +PAN -UAO -TCO BTYPE=--) pc : dispatch_event_fd+0x68/0x300 [mlx5_ib] lr : devx_event_notifier+0xcc/0x228 [mlx5_ib] sp : ffff80001005bcf0 x29: ffff80001005bcf0 x28: 0000000000000001 x27: ffff244e0740a1d8 x26: ffff244e0740a1d0 x25: ffffda56beff5ae0 x24: ffffda56bf911618 x23: ffff244e0596a480 x22: ffff244e0596a480 x21: ffff244d8312ad90 x20: ffff244e0596a480 x19: fffffffffffffff0 x18: 0000000000000000 x17: 0000000000000000 x16: ffffda56be66d620 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000040 x10: ffffda56bfcafb50 x9 : ffffda5655c25f2c x8 : 0000000000000010 x7 : 0000000000000000 x6 : ffff24545a2e24b8 x5 : 0000000000000003 x4 : ffff80001005bd28 x3 : 0000000000000000 x2 : 0000000000000000 x1 : ffff244e0596a480 x0 : ffff244d8312ad90 Call trace: dispatch_event_fd+0x68/0x300 [mlx5_ib] devx_event_notifier+0xcc/0x228 [mlx5_ib] atomic_notifier_call_chain+0x58/0x80 mlx5_eq_async_int+0x148/0x2b0 [mlx5_core] atomic_notifier_call_chain+0x58/0x80 irq_int_handler+0x20/0x30 [mlx5_core] __handle_irq_event_percpu+0x60/0x220 handle_irq_event_percpu+0x3c/0x90 handle_irq_event+0x58/0x158 handle_fasteoi_irq+0xfc/0x188 generic_handle_irq+0x34/0x48 ... Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX") Link: https://patch.msgid.link/r/3ce7f20e0d1a03dc7de6e57494ec4b8eaf1f05c2.1750147949.git.leon@kernel.org Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Arnd Bergmann <[email protected]> Date: Tue Jun 10 11:28:42 2025 +0200 RDMA/mlx5: reduce stack usage in mlx5_ib_ufile_hw_cleanup [ Upstream commit b26852daaa83f535109253d114426d1fa674155d ] This function has an array of eight mlx5_async_cmd structures, which often fits on the stack, but depending on the configuration can end up blowing the stack frame warning limit: drivers/infiniband/hw/mlx5/devx.c:2670:6: error: stack frame size (1392) exceeds limit (1280) in 'mlx5_ib_ufile_hw_cleanup' [-Werror,-Wframe-larger-than] Change this to a dynamic allocation instead. While a kmalloc() can theoretically fail, a GFP_KERNEL allocation under a page will block until memory has been freed up, so in the worst case, this only adds extra time in an already constrained environment. Fixes: 7c891a4dbcc1 ("RDMA/mlx5: Add implementation for ufile_hw_cleanup device operation") Signed-off-by: Arnd Bergmann <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Heiko Stuebner <[email protected]> Date: Fri Jun 6 21:04:18 2025 +0200 regulator: fan53555: add enable_time support and soft-start times [ Upstream commit 8acfb165a492251a08a22a4fa6497a131e8c2609 ] The datasheets for all the fan53555 variants (and clones using the same interface) define so called soft start times, from enabling the regulator until at least some percentage of the output (i.e. 92% for the rk860x types) are available. The regulator framework supports this with the enable_time property but currently the fan53555 driver does not define enable_times for any variant. I ran into a problem with this while testing the new driver for the Rockchip NPUs (rocket), which does runtime-pm including disabling and enabling a rk8602 as needed. When reenabling the regulator while running a load, fatal hangs could be observed while enabling the associated power-domain, which the regulator supplies. Experimentally setting the regulator to always-on, made the issue disappear, leading to the missing delay to let power stabilize. And as expected, setting the enable-time to a non-zero value according to the datasheet also resolved the regulator-issue. The datasheets in nearly all cases only specify "typical" values, except for the fan53555 type 08. There both a typical and maximum value are listed - 40uS apart. For all typical values I've added 100uS to be on the safe side. Individual details for the relevant regulators below: - fan53526: The datasheet for all variants lists a typical value of 150uS, so make that 250uS with safety margin. - fan53555: types 08 and 18 (unsupported) are given a typical enable time of 135uS but also a maximum of 175uS so use that value. All the other types only have a typical time in the datasheet of 300uS, so give a bit margin by setting it to 400uS. - rk8600 + rk8602: Datasheet reports a typical value of 260us, so use 360uS to be safe. - syr82x + syr83x: All datasheets report typical soft-start values of 300uS for these regulators, so use 400uS. - tcs452x: Datasheet sadly does not report a soft-start time, so I've not set an enable-time Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Manivannan Sadhasivam <[email protected]> Date: Thu Jul 3 16:05:49 2025 +0530 regulator: gpio: Fix the out-of-bounds access to drvdata::gpiods commit c9764fd88bc744592b0604ccb6b6fc1a5f76b4e3 upstream. drvdata::gpiods is supposed to hold an array of 'gpio_desc' pointers. But the memory is allocated for only one pointer. This will lead to out-of-bounds access later in the code if 'config::ngpios' is > 1. So fix the code to allocate enough memory to hold 'config::ngpios' of GPIO descriptors. While at it, also move the check for memory allocation failure to be below the allocation to make it more readable. Cc: [email protected] # 5.0 Fixes: d6cd33ad7102 ("regulator: gpio: Convert to use descriptors") Signed-off-by: Manivannan Sadhasivam <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Ulf Hansson <[email protected]> Date: Tue Jun 24 13:09:32 2025 +0200 Revert "mmc: sdhci: Disable SD card clock before changing parameters" commit dcc3bcfc5b50c625b475dcc25d167b6b947a6637 upstream. It has turned out the trying to strictly conform to the SDHCI specification is causing problems. Let's revert and start over. This reverts commit fb3bbc46c94f261b6156ee863c1b06c84cf157dc. Cc: Erick Shepherd <[email protected]> Cc: [email protected] Fixes: fb3bbc46c94f ("mmc: sdhci: Disable SD card clock before changing parameters") Suggested-by: Adrian Hunter <[email protected]> Reported-by: Jonathan Liu <[email protected]> Reported-by: Salvatore Bonaccorso <[email protected]> Closes: https://bugs.debian.org/1108065 Acked-by: Adrian Hunter <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Roy Luo <[email protected]> Date: Thu May 22 19:09:12 2025 +0000 Revert "usb: xhci: Implement xhci_handshake_check_state() helper" commit 7aed15379db9c6ec67999cdaf5c443b7be06ea73 upstream. This reverts commit 6ccb83d6c4972ebe6ae49de5eba051de3638362c. Commit 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") was introduced to workaround watchdog timeout issues on some platforms, allowing xhci_reset() to bail out early without waiting for the reset to complete. Skipping the xhci handshake during a reset is a dangerous move. The xhci specification explicitly states that certain registers cannot be accessed during reset in section 5.4.1 USB Command Register (USBCMD), Host Controller Reset (HCRST) field: "This bit is cleared to '0' by the Host Controller when the reset process is complete. Software cannot terminate the reset process early by writinga '0' to this bit and shall not write any xHC Operational or Runtime registers until while HCRST is '1'." This behavior causes a regression on SNPS DWC3 USB controller with dual-role capability. When the DWC3 controller exits host mode and removes xhci while a reset is still in progress, and then tries to configure its hardware for device mode, the ongoing reset leads to register access issues; specifically, all register reads returns 0. These issues extend beyond the xhci register space (which is expected during a reset) and affect the entire DWC3 IP block, causing the DWC3 device mode to malfunction. Cc: stable <[email protected]> Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") Signed-off-by: Roy Luo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Vivian Wang <[email protected]> Date: Tue Jun 24 16:04:46 2025 +0800 riscv: cpu_ops_sbi: Use static array for boot_data commit 2b29be967ae456fc09c320d91d52278cf721be1e upstream. Since commit 6b9f29b81b15 ("riscv: Enable pcpu page first chunk allocator"), if NUMA is enabled, the page percpu allocator may be used on very sparse configurations, or when requested on boot with percpu_alloc=page. In that case, percpu data gets put in the vmalloc area. However, sbi_hsm_hart_start() needs the physical address of a sbi_hart_boot_data, and simply assumes that __pa() would work. This causes the just started hart to immediately access an invalid address and hang. Fortunately, struct sbi_hart_boot_data is not too large, so we can simply allocate an array for boot_data statically, putting it in the kernel image. This fixes NUMA=y SMP boot on Sophgo SG2042. To reproduce on QEMU: Set CONFIG_NUMA=y and CONFIG_DEBUG_VIRTUAL=y, then run with: qemu-system-riscv64 -M virt -smp 2 -nographic \ -kernel arch/riscv/boot/Image \ -append "percpu_alloc=page" Kernel output: [ 0.000000] Booting Linux on hartid 0 [ 0.000000] Linux version 6.16.0-rc1 (dram@sakuya) (riscv64-unknown-linux-gnu-gcc (GCC) 14.2.1 20250322, GNU ld (GNU Binutils) 2.44) #11 SMP Tue Jun 24 14:56:22 CST 2025 ... [ 0.000000] percpu: 28 4K pages/cpu s85784 r8192 d20712 ... [ 0.083192] smp: Bringing up secondary CPUs ... [ 0.086722] ------------[ cut here ]------------ [ 0.086849] virt_to_phys used for non-linear address: (____ptrval____) (0xff2000000001d080) [ 0.088001] WARNING: CPU: 0 PID: 1 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xae/0xe8 [ 0.088376] Modules linked in: [ 0.088656] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0-rc1 #11 NONE [ 0.088833] Hardware name: riscv-virtio,qemu (DT) [ 0.088948] epc : __virt_to_phys+0xae/0xe8 [ 0.089001] ra : __virt_to_phys+0xae/0xe8 [ 0.089037] epc : ffffffff80021eaa ra : ffffffff80021eaa sp : ff2000000004bbc0 [ 0.089057] gp : ffffffff817f49c0 tp : ff60000001d60000 t0 : 5f6f745f74726976 [ 0.089076] t1 : 0000000000000076 t2 : 705f6f745f747269 s0 : ff2000000004bbe0 [ 0.089095] s1 : ff2000000001d080 a0 : 0000000000000000 a1 : 0000000000000000 [ 0.089113] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.089131] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 [ 0.089155] s2 : ffffffff8130dc00 s3 : 0000000000000001 s4 : 0000000000000001 [ 0.089174] s5 : ffffffff8185eff8 s6 : ff2000007f1eb000 s7 : ffffffff8002a2ec [ 0.089193] s8 : 0000000000000001 s9 : 0000000000000001 s10: 0000000000000000 [ 0.089211] s11: 0000000000000000 t3 : ffffffff8180a9f7 t4 : ffffffff8180a9f7 [ 0.089960] t5 : ffffffff8180a9f8 t6 : ff2000000004b9d8 [ 0.089984] status: 0000000200000120 badaddr: ffffffff80021eaa cause: 0000000000000003 [ 0.090101] [<ffffffff80021eaa>] __virt_to_phys+0xae/0xe8 [ 0.090228] [<ffffffff8001d796>] sbi_cpu_start+0x6e/0xe8 [ 0.090247] [<ffffffff8001a5da>] __cpu_up+0x1e/0x8c [ 0.090260] [<ffffffff8002a32e>] bringup_cpu+0x42/0x258 [ 0.090277] [<ffffffff8002914c>] cpuhp_invoke_callback+0xe0/0x40c [ 0.090292] [<ffffffff800294e0>] __cpuhp_invoke_callback_range+0x68/0xfc [ 0.090320] [<ffffffff8002a96a>] _cpu_up+0x11a/0x244 [ 0.090334] [<ffffffff8002aae6>] cpu_up+0x52/0x90 [ 0.090384] [<ffffffff80c09350>] bringup_nonboot_cpus+0x78/0x118 [ 0.090411] [<ffffffff80c11060>] smp_init+0x34/0xb8 [ 0.090425] [<ffffffff80c01220>] kernel_init_freeable+0x148/0x2e4 [ 0.090442] [<ffffffff80b83802>] kernel_init+0x1e/0x14c [ 0.090455] [<ffffffff800124ca>] ret_from_fork_kernel+0xe/0xf0 [ 0.090471] [<ffffffff80b8d9c2>] ret_from_fork_kernel_asm+0x16/0x18 [ 0.090560] ---[ end trace 0000000000000000 ]--- [ 1.179875] CPU1: failed to come online [ 1.190324] smp: Brought up 1 node, 1 CPU Cc: [email protected] Reported-by: Han Gao <[email protected]> Fixes: 6b9f29b81b15 ("riscv: Enable pcpu page first chunk allocator") Reviewed-by: Alexandre Ghiti <[email protected]> Tested-by: Alexandre Ghiti <[email protected]> Signed-off-by: Vivian Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Ghiti <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Kohei Enju <[email protected]> Date: Sun Jun 29 12:06:31 2025 +0900 rose: fix dangling neighbour pointers in rose_rt_device_down() [ Upstream commit 34a500caf48c47d5171f4aa1f237da39b07c6157 ] There are two bugs in rose_rt_device_down() that can cause use-after-free: 1. The loop bound `t->count` is modified within the loop, which can cause the loop to terminate early and miss some entries. 2. When removing an entry from the neighbour array, the subsequent entries are moved up to fill the gap, but the loop index `i` is still incremented, causing the next entry to be skipped. For example, if a node has three neighbours (A, A, B) with count=3 and A is being removed, the second A is not checked. i=0: (A, A, B) -> (A, B) with count=2 ^ checked i=1: (A, B) -> (A, B) with count=2 ^ checked (B, not A!) i=2: (doesn't occur because i < count is false) This leaves the second A in the array with count=2, but the rose_neigh structure has been freed. Code that accesses these entries assumes that the first `count` entries are valid pointers, causing a use-after-free when it accesses the dangling pointer. Fix both issues by iterating over the array in reverse order with a fixed loop bound. This ensures that all entries are examined and that the removal of an entry doesn't affect subsequent iterations. Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=e04e2c007ba2c80476cb Tested-by: [email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kohei Enju <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Mateusz Jończyk <[email protected]> Date: Sat Jun 7 23:06:08 2025 +0200 rtc: cmos: use spin_lock_irqsave in cmos_interrupt commit 00a39d8652ff9088de07a6fe6e9e1893452fe0dd upstream. cmos_interrupt() can be called in a non-interrupt context, such as in an ACPI event handler (which runs in an interrupt thread). Therefore, usage of spin_lock(&rtc_lock) is insecure. Use spin_lock_irqsave() / spin_unlock_irqrestore() instead. Before a misguided commit 6950d046eb6e ("rtc: cmos: Replace spin_lock_irqsave with spin_lock in hard IRQ") the cmos_interrupt() function used spin_lock_irqsave(). That commit changed it to spin_lock() and broke locking, which was partially fixed in commit 13be2efc390a ("rtc: cmos: Disable irq around direct invocation of cmos_interrupt()") That second commit did not take account of the ACPI fixed event handler pathway, however. It introduced local_irq_disable() workarounds in cmos_check_wkalrm(), which can cause problems on PREEMPT_RT kernels and are now unnecessary. Add an explicit comment so that this change will not be reverted by mistake. Cc: [email protected] Fixes: 6950d046eb6e ("rtc: cmos: Replace spin_lock_irqsave with spin_lock in hard IRQ") Signed-off-by: Mateusz Jończyk <[email protected]> Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Tested-by: Chris Bainbridge <[email protected]> Reported-by: Chris Bainbridge <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Hugo Villeneuve <[email protected]> Date: Thu May 29 16:29:22 2025 -0400 rtc: pcf2127: add missing semicolon after statement commit 08d82d0cad51c2b1d454fe41ea1ff96ade676961 upstream. Replace comma with semicolon at the end of the statement when setting config.max_register. Fixes: fd28ceb4603f ("rtc: pcf2127: add variant-specific configuration structure") Cc: [email protected] Cc: Elena Popa <[email protected]> Signed-off-by: Hugo Villeneuve <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Elena Popa <[email protected]> Date: Fri May 30 13:40:00 2025 +0300 rtc: pcf2127: fix SPI command byte for PCF2131 commit fa78e9b606a472495ef5b6b3d8b45c37f7727f9d upstream. PCF2131 was not responding to read/write operations using SPI. PCF2131 has a different command byte definition, compared to PCF2127/29. Added the new command byte definition when PCF2131 is detected. Fixes: afc505bf9039 ("rtc: pcf2127: add support for PCF2131 RTC") Cc: [email protected] Signed-off-by: Elena Popa <[email protected]> Acked-by: Hugo Villeneuve <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Niklas Schnelle <[email protected]> Date: Wed Jun 25 11:28:29 2025 +0200 s390/pci: Do not try re-enabling load/store if device is disabled commit b97a7972b1f4f81417840b9a2ab0c19722b577d5 upstream. If a device is disabled unblocking load/store on its own is not useful as a full re-enable of the function is necessary anyway. Note that SCLP Write Event Data Action Qualifier 0 (Reset) leaves the device disabled and triggers this case unless the driver already requests a reset. Cc: [email protected] Fixes: 4cdf2f4e24ff ("s390/pci: implement minimal PCI error recovery") Reviewed-by: Farhan Ali <[email protected]> Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Niklas Schnelle <[email protected]> Date: Wed Jun 25 11:28:28 2025 +0200 s390/pci: Fix stale function handles in error handling commit 45537926dd2aaa9190ac0fac5a0fbeefcadfea95 upstream. The error event information for PCI error events contains a function handle for the respective function. This handle is generally captured at the time the error event was recorded. Due to delays in processing or cascading issues, it may happen that during firmware recovery multiple events are generated. When processing these events in order Linux may already have recovered an affected function making the event information stale. Fix this by doing an unconditional CLP List PCI function retrieving the current function handle with the zdev->state_lock held and ignoring the event if its function handle is stale. Cc: [email protected] Fixes: 4cdf2f4e24ff ("s390/pci: implement minimal PCI error recovery") Reviewed-by: Julian Ruess <[email protected]> Reviewed-by: Gerd Bayer <[email protected]> Reviewed-by: Farhan Ali <[email protected]> Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Christoph Hellwig <[email protected]> Date: Tue Jun 24 14:52:28 2025 +0200 scsi: core: Enforce unlimited max_segment_size when virt_boundary_mask is set [ Upstream commit 4937e604ca24c41cae3296d069c871c2f3f519c8 ] The virt_boundary_mask limit requires an unlimited max_segment_size for bio splitting to not corrupt data. Historically, the block layer tried to validate this, although the check was half-hearted until the addition of the atomic queue limits API. The full blown check then triggered issues with stacked devices incorrectly inheriting limits such as the virt boundary and got disabled in commit b561ea56a264 ("block: allow device to have both virt_boundary_mask and max segment size") instead of fixing the issue properly. Ensure that the SCSI mid layer doesn't set the default low max_segment_size limit for this case, and check for invalid max_segment_size values in the host template, similar to the original block layer check given that SCSI devices can't be stacked. This fixes reported data corruption on storvsc, although as far as I can tell storvsc always failed to properly set the max_segment_size limit as the SCSI APIs historically applied that when setting up the host, while storvsc only set the virt_boundary_mask when configuring the scsi_device. Fixes: 81988a0e6b03 ("storvsc: get rid of bounce buffer") Fixes: b561ea56a264 ("block: allow device to have both virt_boundary_mask and max segment size") Reported-by: Ming Lei <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: John Garry <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Thomas Fourier <[email protected]> Date: Tue Jun 17 18:11:11 2025 +0200 scsi: qla2xxx: Fix DMA mapping test in qla24xx_get_port_database() [ Upstream commit c3b214719a87735d4f67333a8ef3c0e31a34837c ] dma_map_XXX() functions return as error values DMA_MAPPING_ERROR which is often ~0. The error value should be tested with dma_mapping_error() like it was done in qla26xx_dport_diagnostics(). Fixes: 818c7f87a177 ("scsi: qla2xxx: Add changes in preparation for vendor extended FDMI/RDP") Signed-off-by: Thomas Fourier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Thomas Fourier <[email protected]> Date: Wed Jun 18 09:17:37 2025 +0200 scsi: qla4xxx: Fix missing DMA mapping error in qla4xxx_alloc_pdu() [ Upstream commit 00f452a1b084efbe8dcb60a29860527944a002a1 ] dma_map_XXX() can fail and should be tested for errors with dma_mapping_error(). Fixes: b3a271a94d00 ("[SCSI] qla4xxx: support iscsiadm session mgmt") Signed-off-by: Thomas Fourier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: jackysliu <[email protected]> Date: Thu Jun 19 12:03:02 2025 +0800 scsi: sd: Fix VPD page 0xb7 length check [ Upstream commit 8889676cd62161896f1d861ce294adc29c4f2cb5 ] sd_read_block_limits_ext() currently assumes that vpd->len excludes the size of the page header. However, vpd->len describes the size of the entire VPD page, therefore the sanity check is incorrect. In practice this is not really a problem since we don't attach VPD pages unless they actually report data trailing the header. But fix the length check regardless. This issue was identified by Wukong-Agent (formerly Tencent Woodpecker), a code security AI agent, through static code analysis. [mkp: rewrote patch description] Signed-off-by: jackysliu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: 96b171d6dba6 ("scsi: core: Query the Block Limits Extension VPD page") Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Maurizio Lombardi <[email protected]> Date: Thu Jun 12 12:15:56 2025 +0200 scsi: target: Fix NULL pointer dereference in core_scsi3_decode_spec_i_port() [ Upstream commit d8ab68bdb294b09a761e967dad374f2965e1913f ] The function core_scsi3_decode_spec_i_port(), in its error code path, unconditionally calls core_scsi3_lunacl_undepend_item() passing the dest_se_deve pointer, which may be NULL. This can lead to a NULL pointer dereference if dest_se_deve remains unset. SPC-3 PR SPEC_I_PT: Unable to locate dest_tpg Unable to handle kernel paging request at virtual address dfff800000000012 Call trace: core_scsi3_lunacl_undepend_item+0x2c/0xf0 [target_core_mod] (P) core_scsi3_decode_spec_i_port+0x120c/0x1c30 [target_core_mod] core_scsi3_emulate_pro_register+0x6b8/0xcd8 [target_core_mod] target_scsi3_emulate_pr_out+0x56c/0x840 [target_core_mod] Fix this by adding a NULL check before calling core_scsi3_lunacl_undepend_item() Signed-off-by: Maurizio Lombardi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Reviewed-by: John Meneghini <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Bart Van Assche <[email protected]> Date: Tue Jun 24 11:16:44 2025 -0700 scsi: ufs: core: Fix spelling of a sysfs attribute name [ Upstream commit 021f243627ead17eb6500170256d3d9be787dad8 ] Change "resourse" into "resource" in the name of a sysfs attribute. Fixes: d829fc8a1058 ("scsi: ufs: sysfs: unit descriptor") Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Avri Altman <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Paulo Alcantara <[email protected]> Date: Thu Jul 3 17:57:19 2025 -0300 smb: client: fix native SMB symlink traversal [ Upstream commit 3363da82e02f1bddc54faa92ea430c6532e2cd2e ] We've seen customers having shares mounted in paths like /??/C:/ or /??/UNC/foo.example.com/share in order to get their native SMB symlinks successfully followed from different mounts. After commit 12b466eb52d9 ("cifs: Fix creating and resolving absolute NT-style symlinks"), the client would then convert absolute paths from "/??/C:/" to "/mnt/c/" by default. The absolute paths would vary depending on the value of symlinkroot= mount option. Fix this by restoring old behavior of not trying to convert absolute paths by default. Only do this if symlinkroot= was _explicitly_ set. Before patch: $ mount.cifs //w22-fs0/test2 /mnt/1 -o vers=3.1.1,username=xxx,password=yyy $ ls -l /mnt/1/symlink2 lrwxr-xr-x 1 root root 15 Jun 20 14:22 /mnt/1/symlink2 -> /mnt/c/testfile $ mkdir -p /??/C:; echo foo > //??/C:/testfile $ cat /mnt/1/symlink2 cat: /mnt/1/symlink2: No such file or directory After patch: $ mount.cifs //w22-fs0/test2 /mnt/1 -o vers=3.1.1,username=xxx,password=yyy $ ls -l /mnt/1/symlink2 lrwxr-xr-x 1 root root 15 Jun 20 14:22 /mnt/1/symlink2 -> '/??/C:/testfile' $ mkdir -p /??/C:; echo foo > //??/C:/testfile $ cat /mnt/1/symlink2 foo Cc: [email protected] Reported-by: Pierguido Lambri <[email protected]> Cc: David Howells <[email protected]> Cc: Stefan Metzmacher <[email protected]> Fixes: 12b466eb52d9 ("cifs: Fix creating and resolving absolute NT-style symlinks") Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Wang Zhaolong <[email protected]> Date: Thu Jul 3 21:29:52 2025 +0800 smb: client: fix race condition in negotiate timeout by using more precise timing [ Upstream commit 266b5d02e14f3a0e07414e11f239397de0577a1d ] When the SMB server reboots and the client immediately accesses the mount point, a race condition can occur that causes operations to fail with "Host is down" error. Reproduction steps: # Mount SMB share mount -t cifs //192.168.245.109/TEST /mnt/ -o xxxx ls /mnt # Reboot server ssh [email protected] reboot ssh [email protected] /path/to/cifs_server_setup.sh ssh [email protected] systemctl stop firewalld # Immediate access fails ls /mnt ls: cannot access '/mnt': Host is down # But works if there is a delay The issue is caused by a race condition between negotiate and reconnect. The 20-second negotiate timeout mechanism can interfere with the normal recovery process when both are triggered simultaneously. ls cifsd --------------------------------------------------- cifs_getattr cifs_revalidate_dentry cifs_get_inode_info cifs_get_fattr smb2_query_path_info smb2_compound_op SMB2_open_init smb2_reconnect cifs_negotiate_protocol smb2_negotiate cifs_send_recv smb_send_rqst wait_for_response cifs_demultiplex_thread cifs_read_from_socket cifs_readv_from_socket server_unresponsive cifs_reconnect __cifs_reconnect cifs_abort_connection mid->mid_state = MID_RETRY_NEEDED cifs_wake_up_task cifs_sync_mid_result // case MID_RETRY_NEEDED rc = -EAGAIN; // In smb2_negotiate() rc = -EHOSTDOWN; The server_unresponsive() timeout triggers cifs_reconnect(), which aborts ongoing mid requests and causes the ls command to receive -EAGAIN, leading to -EHOSTDOWN. Fix this by introducing a dedicated `neg_start` field to precisely tracks when the negotiate process begins. The timeout check now uses this accurate timestamp instead of `lstrp`, ensuring that: 1. Timeout is only triggered after negotiate has actually run for 20s 2. The mechanism doesn't interfere with concurrent recovery processes 3. Uninitialized timestamps (value 0) don't trigger false timeouts Fixes: 7ccc1465465d ("smb: client: fix hang in wait_for_response() for negproto") Signed-off-by: Wang Zhaolong <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Philipp Kerling <[email protected]> Date: Sun Jun 29 19:05:05 2025 +0200 smb: client: fix readdir returning wrong type with POSIX extensions commit b8f89cb723b9e66f5dbd7199e4036fee34fb0de0 upstream. When SMB 3.1.1 POSIX Extensions are negotiated, userspace applications using readdir() or getdents() calls without stat() on each individual file (such as a simple "ls" or "find") would misidentify file types and exhibit strange behavior such as not descending into directories. The reason for this behavior is an oversight in the cifs_posix_to_fattr conversion function. Instead of extracting the entry type for cf_dtype from the properly converted cf_mode field, it tries to extract the type from the PDU. While the wire representation of the entry mode is similar in structure to POSIX stat(), the assignments of the entry types are different. Applying the S_DT macro to cf_mode instead yields the correct result. This is also what the equivalent function smb311_posix_info_to_fattr in inode.c already does for stat() etc.; which is why "ls -l" would give the correct file type but "ls" would not (as identified by the colors). Cc: [email protected] Signed-off-by: Philipp Kerling <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Paulo Alcantara <[email protected]> Date: Wed Jun 25 00:00:17 2025 -0300 smb: client: fix warning when reconnecting channel [ Upstream commit 3bbe46716092d8ef6b0df4b956f585c5cd0fc78e ] When reconnecting a channel in smb2_reconnect_server(), a dummy tcon is passed down to smb2_reconnect() with ->query_interface uninitialized, so we can't call queue_delayed_work() on it. Fix the following warning by ensuring that we're queueing the delayed worker from correct tcon. WARNING: CPU: 4 PID: 1126 at kernel/workqueue.c:2498 __queue_delayed_work+0x1d2/0x200 Modules linked in: cifs cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CPU: 4 UID: 0 PID: 1126 Comm: kworker/4:0 Not tainted 6.16.0-rc3 #5 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-4.fc42 04/01/2014 Workqueue: cifsiod smb2_reconnect_server [cifs] RIP: 0010:__queue_delayed_work+0x1d2/0x200 Code: 41 5e 41 5f e9 7f ee ff ff 90 0f 0b 90 e9 5d ff ff ff bf 02 00 00 00 e8 6c f3 07 00 89 c3 eb bd 90 0f 0b 90 e9 57 f> 0b 90 e9 65 fe ff ff 90 0f 0b 90 e9 72 fe ff ff 90 0f 0b 90 e9 RSP: 0018:ffffc900014afad8 EFLAGS: 00010003 RAX: 0000000000000000 RBX: ffff888124d99988 RCX: ffffffff81399cc1 RDX: dffffc0000000000 RSI: ffff888114326e00 RDI: ffff888124d999f0 RBP: 000000000000ea60 R08: 0000000000000001 R09: ffffed10249b3331 R10: ffff888124d9998f R11: 0000000000000004 R12: 0000000000000040 R13: ffff888114326e00 R14: ffff888124d999d8 R15: ffff888114939020 FS: 0000000000000000(0000) GS:ffff88829f7fe000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe7a2b4038 CR3: 0000000120a6f000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> queue_delayed_work_on+0xb4/0xc0 smb2_reconnect+0xb22/0xf50 [cifs] smb2_reconnect_server+0x413/0xd40 [cifs] ? __pfx_smb2_reconnect_server+0x10/0x10 [cifs] ? local_clock_noinstr+0xd/0xd0 ? local_clock+0x15/0x30 ? lock_release+0x29b/0x390 process_one_work+0x4c5/0xa10 ? __pfx_process_one_work+0x10/0x10 ? __list_add_valid_or_report+0x37/0x120 worker_thread+0x2f1/0x5a0 ? __kthread_parkme+0xde/0x100 ? __pfx_worker_thread+0x10/0x10 kthread+0x1fe/0x380 ? kthread+0x10f/0x380 ? __pfx_kthread+0x10/0x10 ? local_clock_noinstr+0xd/0xd0 ? ret_from_fork+0x1b/0x1f0 ? local_clock+0x15/0x30 ? lock_release+0x29b/0x390 ? rcu_is_watching+0x20/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x15b/0x1f0 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 1116206 hardirqs last enabled at (1116205): [<ffffffff8143af42>] __up_console_sem+0x52/0x60 hardirqs last disabled at (1116206): [<ffffffff81399f0e>] queue_delayed_work_on+0x6e/0xc0 softirqs last enabled at (1116138): [<ffffffffc04562fd>] __smb_send_rqst+0x42d/0x950 [cifs] softirqs last disabled at (1116136): [<ffffffff823d35e1>] release_sock+0x21/0xf0 Cc: [email protected] Reported-by: David Howells <[email protected]> Fixes: 42ca547b13a2 ("cifs: do not disable interface polling on failure") Reviewed-by: David Howells <[email protected]> Tested-by: David Howells <[email protected]> Reviewed-by: Shyam Prasad N <[email protected]> Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: David Howells <[email protected]> Tested-by: Steve French <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Paulo Alcantara <[email protected]> Date: Tue Jul 1 17:38:42 2025 +0100 smb: client: set missing retry flag in cifs_readv_callback() [ Upstream commit 0e60bae24ad28ab06a485698077d3c626f1e54ab ] Set NETFS_SREQ_NEED_RETRY flag to tell netfslib that the subreq needs to be retried. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/[email protected] Tested-by: Steve French <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Paulo Alcantara <[email protected]> Date: Tue Jul 1 17:38:43 2025 +0100 smb: client: set missing retry flag in cifs_writev_callback() [ Upstream commit 74ee76bea4b445c023d04806e0bcd78a912fd30b ] Set NETFS_SREQ_NEED_RETRY flag to tell netfslib that the subreq needs to be retried. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/[email protected] Tested-by: Steve French <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Paulo Alcantara <[email protected]> Date: Tue Jul 1 17:38:41 2025 +0100 smb: client: set missing retry flag in smb2_writev_callback() [ Upstream commit e67e75edeb88022c04f8e0a173e1ff6dc688f155 ] Set NETFS_SREQ_NEED_RETRY flag to tell netfslib that the subreq needs to be retried. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/[email protected] Tested-by: Steve French <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: James Clark <[email protected]> Date: Fri Jun 27 11:21:37 2025 +0100 spi: spi-fsl-dspi: Clear completion counter before initiating transfer [ Upstream commit fa60c094c19b97e103d653f528f8d9c178b6a5f5 ] In target mode, extra interrupts can be received between the end of a transfer and halting the module if the host continues sending more data. If the interrupt from this occurs after the reinit_completion() then the completion counter is left at a non-zero value. The next unrelated transfer initiated by userspace will then complete immediately without waiting for the interrupt or writing to the RX buffer. Fix it by resetting the counter before the transfer so that lingering values are cleared. This is done after clearing the FIFOs and the status register but before the transfer is initiated, so no interrupts should be received at this point resulting in other race conditions. Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion") Signed-off-by: James Clark <[email protected]> Reviewed-by: Frank Li <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Gabor Juhos <[email protected]> Date: Wed Jun 18 22:22:49 2025 +0200 spi: spi-qpic-snand: reallocate BAM transactions [ Upstream commit d85d0380292a7e618915069c3579ae23c7c80339 ] Using the mtd_nandbiterrs module for testing the driver occasionally results in weird things like below. 1. swiotlb mapping fails with the following message: [ 85.926216] qcom_snand 79b0000.spi: swiotlb buffer is full (sz: 4294967294 bytes), total 512 (slots), used 0 (slots) [ 85.932937] qcom_snand 79b0000.spi: failure in mapping desc [ 87.999314] qcom_snand 79b0000.spi: failure to write raw page [ 87.999352] mtd_nandbiterrs: error: write_oob failed (-110) Rebooting the board after this causes a panic due to a NULL pointer dereference. 2. If the swiotlb mapping does not fail, rebooting the board may result in a different panic due to a bad spinlock magic: [ 256.104459] BUG: spinlock bad magic on CPU#3, procd/2241 [ 256.104488] Unable to handle kernel paging request at virtual address ffffffff0000049b ... Investigating the issue revealed that these symptoms are results of memory corruption which is caused by out of bounds access within the driver. The driver uses a dynamically allocated structure for BAM transactions, which structure must have enough space for all possible variations of different flash operations initiated by the driver. The required space heavily depends on the actual number of 'codewords' which is calculated from the pagesize of the actual NAND chip. Although the qcom_nandc_alloc() function allocates memory for the BAM transactions during probe, but since the actual number of 'codewords' is not yet know the allocation is done for one 'codeword' only. Because of this, whenever the driver does a flash operation, and the number of the required transactions exceeds the size of the allocated arrays the driver accesses memory out of the allocated range. To avoid this, change the code to free the initially allocated BAM transactions memory, and allocate a new one once the actual number of 'codewords' required for a given NAND chip is known. Fixes: 7304d1909080 ("spi: spi-qpic: add driver for QCOM SPI NAND flash Interface") Reviewed-by: Md Sadre Alam <[email protected]> Signed-off-by: Gabor Juhos <[email protected]> Link: https://patch.msgid.link/20250618-qpic-snand-avoid-mem-corruption-v3-1-319c71296cda@gmail.com Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Heikki Krogerus <[email protected]> Date: Wed Jun 11 14:14:15 2025 +0300 usb: acpi: fix device link removal commit 3b18405763c1ebb1efc15feef5563c9cdb2cc3a7 upstream. The device link to the USB4 host interface has to be removed manually since it's no longer auto removed. Fixes: 623dae3e7084 ("usb: acpi: fix boot hang due to early incorrect 'tunneled' USB3 device links") Cc: stable <[email protected]> Signed-off-by: Heikki Krogerus <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Peter Chen <[email protected]> Date: Thu Jun 19 09:34:13 2025 +0800 usb: cdnsp: do not disable slot for disabled slot commit 7e2c421ef88e9da9c39e01496b7f5b0b354b42bc upstream. It doesn't need to do it, and the related command event returns 'Slot Not Enabled Error' status. Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <[email protected]> Suggested-by: Hongliang Yang <[email protected]> Reviewed-by: Fugang Duan <[email protected]> Signed-off-by: Peter Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Pawel Laszczak <[email protected]> Date: Fri Jun 20 08:23:12 2025 +0000 usb: cdnsp: Fix issue with CV Bad Descriptor test commit 2831a81077f5162f104ba5a97a7d886eb371c21c upstream. The SSP2 controller has extra endpoint state preserve bit (ESP) which setting causes that endpoint state will be preserved during Halt Endpoint command. It is used only for EP0. Without this bit the Command Verifier "TD 9.10 Bad Descriptor Test" failed. Setting this bit doesn't have any impact for SSP controller. Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <[email protected]> Signed-off-by: Pawel Laszczak <[email protected]> Acked-by: Peter Chen <[email protected]> Link: https://lore.kernel.org/r/PH7PR07MB95382CCD50549DABAEFD6156DD7CA@PH7PR07MB9538.namprd07.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Xu Yang <[email protected]> Date: Sat Jun 14 20:49:14 2025 +0800 usb: chipidea: udc: disconnect/reconnect from host when do suspend/resume commit 31a6afbe86e8e9deba9ab53876ec49eafc7fd901 upstream. Shawn and John reported a hang issue during system suspend as below: - USB gadget is enabled as Ethernet - There is data transfer over USB Ethernet (scp a big file between host and device) - Device is going in/out suspend (echo mem > /sys/power/state) The root cause is the USB device controller is suspended but the USB bus is still active which caused the USB host continues to transfer data with device and the device continues to queue USB requests (in this case, a delayed TCP ACK packet trigger the issue) after controller is suspended, however the USB controller clock is already gated off. Then if udc driver access registers after that point, the system will hang. The correct way to avoid such issue is to disconnect device from host when the USB bus is not at suspend state. Then the host will receive disconnect event and stop data transfer in time. To continue make USB gadget device work after system resume, this will reconnect device automatically. To make usb wakeup work if USB bus is already at suspend state, this will keep connection for it only when USB device controller has enabled wakeup capability. Reported-by: Shawn Guo <[email protected]> Reported-by: John Ernberg <[email protected]> Closes: https://lore.kernel.org/linux-usb/aEZxmlHmjeWcXiF3@dragon/ Tested-by: John Ernberg <[email protected]> # iMX8QXP Fixes: 235ffc17d014 ("usb: chipidea: udc: add suspend/resume support for device controller") Cc: stable <[email protected]> Reviewed-by: Jun Li <[email protected]> Signed-off-by: Xu Yang <[email protected]> Acked-by: Peter Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Kuen-Han Tsai <[email protected]> Date: Wed May 28 18:03:11 2025 +0800 usb: dwc3: Abort suspend on soft disconnect failure commit 630a1dec3b0eba2a695b9063f1c205d585cbfec9 upstream. When dwc3_gadget_soft_disconnect() fails, dwc3_suspend_common() keeps going with the suspend, resulting in a period where the power domain is off, but the gadget driver remains connected. Within this time frame, invoking vbus_event_work() will cause an error as it attempts to access DWC3 registers for endpoint disabling after the power domain has been completely shut down. Abort the suspend sequence when dwc3_gadget_suspend() cannot halt the controller and proceeds with a soft connect. Fixes: 9f8a67b65a49 ("usb: dwc3: gadget: fix gadget suspend/resume") Cc: stable <[email protected]> Acked-by: Thinh Nguyen <[email protected]> Signed-off-by: Kuen-Han Tsai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: SCHNEIDER Johannes <[email protected]> Date: Wed Jun 25 07:49:16 2025 +0000 usb: dwc3: gadget: Fix TRB reclaim logic for short transfers and ZLPs commit 80e08394377559ed5a2ccadd861e62d24b826911 upstream. Commit 96c7bf8f6b3e ("usb: dwc3: gadget: Cleanup SG handling") updated the TRB reclaim path to use the TRB CHN (Chain) bit to determine whether a TRB was part of a chain. However, this inadvertently changed the behavior of reclaiming the final TRB in some scatter-gather or short transfer cases. In particular, if the final TRB did not have the CHN bit set, the cleanup path could incorrectly skip clearing the HWO (Hardware Own) bit, leaving stale TRBs in the ring. This resulted in broken data transfer completions in userspace, notably for MTP over FunctionFS. Fix this by unconditionally clearing the HWO bit during TRB reclaim, regardless of the CHN bit state. This restores correct behavior especially for transfers that require ZLPs or end on non-CHN TRBs. Fixes: 61440628a4ff ("usb: dwc3: gadget: Cleanup SG handling") Signed-off-by: Johannes Schneider <[email protected]> Acked-by: Thinh Nguyen <[email protected]> Cc: stable <[email protected]> Link: https://lore.kernel.org/r/AM8PR06MB7521A29A8863C838B54987B6BC7BA@AM8PR06MB7521.eurprd06.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: RD Babiera <[email protected]> Date: Wed Jun 18 22:49:42 2025 +0000 usb: typec: altmodes/displayport: do not index invalid pin_assignments commit af4db5a35a4ef7a68046883bfd12468007db38f1 upstream. A poorly implemented DisplayPort Alt Mode port partner can indicate that its pin assignment capabilities are greater than the maximum value, DP_PIN_ASSIGN_F. In this case, calls to pin_assignment_show will cause a BRK exception due to an out of bounds array access. Prevent for loop in pin_assignment_show from accessing invalid values in pin_assignments by adding DP_PIN_ASSIGN_MAX value in typec_dp.h and using i < DP_PIN_ASSIGN_MAX as a loop condition. Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode") Cc: stable <[email protected]> Signed-off-by: RD Babiera <[email protected]> Reviewed-by: Badhri Jagan Sridharan <[email protected]> Reviewed-by: Heikki Krogerus <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Andrei Kuchynski <[email protected]> Date: Tue Jun 24 13:32:46 2025 +0000 usb: typec: displayport: Fix potential deadlock commit 099cf1fbb8afc3771f408109f62bdec66f85160e upstream. The deadlock can occur due to a recursive lock acquisition of `cros_typec_altmode_data::mutex`. The call chain is as follows: 1. cros_typec_altmode_work() acquires the mutex 2. typec_altmode_vdm() -> dp_altmode_vdm() -> 3. typec_altmode_exit() -> cros_typec_altmode_exit() 4. cros_typec_altmode_exit() attempts to acquire the mutex again To prevent this, defer the `typec_altmode_exit()` call by scheduling it rather than calling it directly from within the mutex-protected context. Cc: stable <[email protected]> Fixes: b4b38ffb38c9 ("usb: typec: displayport: Receive DP Status Update NAK request exit dp altmode") Signed-off-by: Andrei Kuchynski <[email protected]> Reviewed-by: Heikki Krogerus <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Raju Rangoju <[email protected]> Date: Fri Jun 27 17:41:19 2025 +0300 usb: xhci: quirk for data loss in ISOC transfers commit cbc889ab0122366f6cdbe3c28d477c683ebcebc2 upstream. During the High-Speed Isochronous Audio transfers, xHCI controller on certain AMD platforms experiences momentary data loss. This results in Missed Service Errors (MSE) being generated by the xHCI. The root cause of the MSE is attributed to the ISOC OUT endpoint being omitted from scheduling. This can happen when an IN endpoint with a 64ms service interval either is pre-scheduled prior to the ISOC OUT endpoint or the interval of the ISOC OUT endpoint is shorter than that of the IN endpoint. Consequently, the OUT service is neglected when an IN endpoint with a service interval exceeding 32ms is scheduled concurrently (every 64ms in this scenario). This issue is particularly seen on certain older AMD platforms. To mitigate this problem, it is recommended to adjust the service interval of the IN endpoint to not exceed 32ms (interval 8). This adjustment ensures that the OUT endpoint will not be bypassed, even if a smaller interval value is utilized. Cc: stable <[email protected]> Signed-off-by: Raju Rangoju <[email protected]> Signed-off-by: Mathias Nyman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Roy Luo <[email protected]> Date: Thu May 22 19:09:11 2025 +0000 usb: xhci: Skip xhci_reset in xhci_resume if xhci is being removed commit 3eff494f6e17abf932699483f133a708ac0355dc upstream. xhci_reset() currently returns -ENODEV if XHCI_STATE_REMOVING is set, without completing the xhci handshake, unless the reset completes exceptionally quickly. This behavior causes a regression on Synopsys DWC3 USB controllers with dual-role capabilities. Specifically, when a DWC3 controller exits host mode and removes xhci while a reset is still in progress, and then attempts to configure its hardware for device mode, the ongoing, incomplete reset leads to critical register access issues. All register reads return zero, not just within the xHCI register space (which might be expected during a reset), but across the entire DWC3 IP block. This patch addresses the issue by preventing xhci_reset() from being called in xhci_resume() and bailing out early in the reinit flow when XHCI_STATE_REMOVING is set. Cc: stable <[email protected]> Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") Suggested-by: Mathias Nyman <[email protected]> Signed-off-by: Roy Luo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Bui Quang Minh <[email protected]> Date: Mon Jun 30 21:42:10 2025 +0700 virtio-net: ensure the received length does not exceed allocated size commit 315dbdd7cdf6aa533829774caaf4d25f1fd20e73 upstream. In xdp_linearize_page, when reading the following buffers from the ring, we forget to check the received length with the true allocate size. This can lead to an out-of-bound read. This commit adds that missing check. Cc: <[email protected]> Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set") Signed-off-by: Bui Quang Minh <[email protected]> Acked-by: Jason Wang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Bui Quang Minh <[email protected]> Date: Mon Jun 30 22:13:14 2025 +0700 virtio-net: xsk: rx: fix the frame's length check commit 5177373c31318c3c6a190383bfd232e6cf565c36 upstream. When calling buf_to_xdp, the len argument is the frame data's length without virtio header's length (vi->hdr_len). We check that len with xsk_pool_get_rx_frame_size() + vi->hdr_len to ensure the provided len does not larger than the allocated chunk size. The additional vi->hdr_len is because in virtnet_add_recvbuf_xsk, we use part of XDP_PACKET_HEADROOM for virtio header and ask the vhost to start placing data from hard_start + XDP_PACKET_HEADROOM - vi->hdr_len not hard_start + XDP_PACKET_HEADROOM But the first buffer has virtio_header, so the maximum frame's length in the first buffer can only be xsk_pool_get_rx_frame_size() not xsk_pool_get_rx_frame_size() + vi->hdr_len like in the current check. This commit adds an additional argument to buf_to_xdp differentiate between the first buffer and other ones to correctly calculate the maximum frame's length. Cc: [email protected] Reviewed-by: Xuan Zhuo <[email protected]> Fixes: a4e7ba702701 ("virtio_net: xsk: rx: support recv small mode") Signed-off-by: Bui Quang Minh <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: HarshaVardhana S A <[email protected]> Date: Tue Jul 1 14:22:54 2025 +0200 vsock/vmci: Clear the vmci transport packet properly when initializing it commit 223e2288f4b8c262a864e2c03964ffac91744cd5 upstream. In vmci_transport_packet_init memset the vmci_transport_packet before populating the fields to avoid any uninitialised data being left in the structure. Cc: Bryan Tan <[email protected]> Cc: Vishnu Dasa <[email protected]> Cc: Broadcom internal kernel review list Cc: Stefano Garzarella <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Paolo Abeni <[email protected]> Cc: Simon Horman <[email protected]> Cc: [email protected] Cc: [email protected] Cc: stable <[email protected]> Signed-off-by: HarshaVardhana S A <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Acked-by: Stefano Garzarella <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Johannes Berg <[email protected]> Date: Tue Jun 17 11:45:29 2025 +0200 wifi: ath6kl: remove WARN on bad firmware input [ Upstream commit e7417421d89358da071fd2930f91e67c7128fbff ] If the firmware gives bad input, that's nothing to do with the driver's stack at this point etc., so the WARN_ON() doesn't add any value. Additionally, this is one of the top syzbot reports now. Just print a message, and as an added bonus, print the sizes too. Reported-by: [email protected] Tested-by: [email protected] Acked-by: Jeff Johnson <[email protected]> Link: https://patch.msgid.link/20250617114529.031a677a348e.I58bf1eb4ac16a82c546725ff010f3f0d2b0cca49@changeid Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Johannes Berg <[email protected]> Date: Mon Jun 16 17:18:38 2025 +0200 wifi: mac80211: drop invalid source address OCB frames [ Upstream commit d1b1a5eb27c4948e8811cf4dbb05aaf3eb10700c ] In OCB, don't accept frames from invalid source addresses (and in particular don't try to create stations for them), drop the frames instead. Reported-by: [email protected] Closes: https://lore.kernel.org/r/[email protected]/ Signed-off-by: Johannes Berg <[email protected]> Tested-by: [email protected] Link: https://patch.msgid.link/20250616171838.7433379cab5d.I47444d63c72a0bd58d2e2b67bb99e1fea37eec6f@changeid Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Borislav Petkov (AMD) <[email protected]> Date: Wed Sep 11 10:53:08 2024 +0200 x86/bugs: Add a Transient Scheduler Attacks mitigation Commit d8010d4ba43e9f790925375a7de100604a5e2dba upstream. Add the required features detection glue to bugs.c et all in order to support the TSA mitigation. Co-developed-by: Kim Phillips <[email protected]> Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Pawan Gupta <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Borislav Petkov (AMD) <[email protected]> Date: Wed Sep 11 05:13:46 2024 +0200 x86/bugs: Rename MDS machinery to something more generic Commit f9af88a3d384c8b55beb5dc5483e5da0135fadbd upstream. It will be used by other x86 mitigations. No functional changes. Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Pawan Gupta <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Borislav Petkov (AMD) <[email protected]> Date: Thu Mar 27 12:23:55 2025 +0100 x86/microcode/AMD: Add TSA microcode SHAs Commit 2329f250e04d3b8e78b36a68b9880ca7750a07ef upstream. Signed-off-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Jake Hillion <[email protected]> Date: Thu Jun 5 19:09:26 2025 +0100 x86/platform/amd: move final timeout check to after final sleep [ Upstream commit f8afb12a2d7503de6558c23cacd7acbf6e9fe678 ] __hsmp_send_message sleeps between result read attempts and has a timeout of 100ms. Under extreme load it's possible for these sleeps to take a long time, exceeding the 100ms. In this case the current code does not check the register and fails with ETIMEDOUT. Refactor the loop to ensure there is at least one read of the register after a sleep of any duration. This removes instances of ETIMEDOUT with a single caller, even with a misbehaving scheduler. Tested on AMD Bergamo machines. Suggested-by: Blaise Sanouillet <[email protected]> Reviewed-by: Suma Hegde <[email protected]> Tested-by: Suma Hegde <[email protected]> Signed-off-by: Jake Hillion <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Author: Borislav Petkov (AMD) <[email protected]> Date: Mon Apr 14 15:33:19 2025 +0200 x86/process: Move the buffer clearing before MONITOR Commit 8e786a85c0a3c0fffae6244733fb576eeabd9dec upstream. Move the VERW clearing before the MONITOR so that VERW doesn't disarm it and the machine never enters C1. Original idea by Kim Phillips <[email protected]>. Suggested-by: Andrew Cooper <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Darrick J. Wong <[email protected]> Date: Thu Jun 12 10:51:12 2025 -0700 xfs: actually use the xfs_growfs_check_rtgeom tracepoint commit db44d088a5ab030b741a3adf2e7b181a8a6dcfbe upstream. We created a new tracepoint but forgot to put it in. Fix that. Cc: [email protected] Cc: [email protected] # v6.14 Fixes: 59a57acbce282d ("xfs: check that the rtrmapbt maxlevels doesn't increase when growing fs") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Carlos Maiolino <[email protected]> Reported-by: Steven Rostedt <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Mathias Nyman <[email protected]> Date: Fri Jun 27 17:41:22 2025 +0300 xhci: dbc: Flush queued requests before stopping dbc commit efe3e3ae5a66cb38ef29c909e951b4039044bae9 upstream. Flush dbc requests when dbc is stopped and transfer rings are freed. Failure to flush them lead to leaking memory and dbc completing odd requests after resuming from suspend, leading to error messages such as: [ 95.344392] xhci_hcd 0000:00:0d.0: no matched request Cc: stable <[email protected]> Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver") Signed-off-by: Mathias Nyman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Łukasz Bartosik <[email protected]> Date: Fri Jun 27 17:41:21 2025 +0300 xhci: dbctty: disable ECHO flag by default commit 2b857d69a5e116150639a0c6c39c86cc329939ee upstream. When /dev/ttyDBC0 device is created then by default ECHO flag is set for the terminal device. However if data arrives from a peer before application using /dev/ttyDBC0 applies its set of terminal flags then the arriving data will be echoed which might not be desired behavior. Fixes: 4521f1613940 ("xhci: dbctty: split dbc tty driver registration and unregistration functions.") Cc: stable <[email protected]> Signed-off-by: Łukasz Bartosik <[email protected]> Signed-off-by: Mathias Nyman <[email protected]> Link: https://lore.kernel.org/stable/20250610111802.18742-1-ukaszb%40chromium.org Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Author: Hongyu Xie <[email protected]> Date: Fri Jun 27 17:41:20 2025 +0300 xhci: Disable stream for xHC controller with XHCI_BROKEN_STREAMS commit cd65ee81240e8bc3c3119b46db7f60c80864b90b upstream. Disable stream for platform xHC controller with broken stream. Fixes: 14aec589327a6 ("storage: accept some UAS devices if streams are unavailable") Cc: stable <[email protected]> Signed-off-by: Hongyu Xie <[email protected]> Signed-off-by: Mathias Nyman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>