Ever since I got my ThinkCentre M715q and installed Arch (2 years by now), I've gotten random freezes. I have no clue why.
I originally thought it was a kernel panic, but after setting up kdump and SysRq, I realized the whole system was just frozen, not even a panic.
Nothing much shows up in the journal besides an "irq 7: nobody cared" message.
Journal from a crash: https://0x0.st/PpN8.txt
Things tried:
I ran memtest for 24 hours and no issues.
I've stress-tested with `stress` for 24 hours (hdd, cpu, vm) and cannot get a reliable crash.
I've fully disabled swap.
I have the latest microcode.
I am using linux-lts.
I have earlyoom.
I would appreciate any help, and will try anything. One thing I did notice was that it happened twice when I was downloading large files for a long time, but once again I can't replicate it :C.
Linux 6.18.19-1-lts #1 SMP PREEMPT_DYNAMIC Thu, 19 Mar 2026 15:56:44 +0000 x86_64 GNU/Linux
# Static information about the filesystems.
# See fstab(5) for details.
# <file system> <dir> <type> <options> <dump> <pass>
# /dev/nvme0n1p2
UUID=ad6e819a-9e48-448d-a129-87bad23db1ef / btrfs rw,relatime,compress=zstd:3,ssd,discard=async,space_cache=v2,subvol=/ 0 0
# /dev/nvme0n1p1
UUID=0D0A-1287 /boot vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro 0 2
# /dev/sda1
UUID=9af0a80d-63d0-4ca3-832c-1be8619b991d /mnt/HDD btrfs rw,relatime,compress=zstd:3,discard=async,space_cache=v2 0 0
System:
Kernel: 6.18.19-1-lts arch: x86_64 bits: 64 compiler: gcc v: 15.2.1 clocksource: hpet
avail: acpi_pm parameters: BOOT_IMAGE=/vmlinuz-linux-lts
root=UUID=ad6e819a-9e48-448d-a129-87bad23db1ef rw zswap.enabled=0 rootfstype=btrfs
crashkernel=256M loglevel=3 pci=nommconf
Console: pty pts/1 Distro: Arch Linux
Machine:
Type: Mini-pc System: LENOVO product: 10VG0006US v: ThinkCentre M715q
serial: <superuser required> Chassis: type: 35 serial: <superuser required>
Mobo: LENOVO model: 3130 v: SDK0J40697 WIN 3305189998500 serial: <superuser required>
part-nu: LENOVO_MT_10VG_BU_Think_FM_ThinkCentre M715q uuid: <superuser required> Firmware: UEFI
vendor: LENOVO v: M1XKT39A date: 04/08/2019
CPU:
Info: model: AMD Ryzen 5 PRO 2400GE w/ Radeon Vega Graphics bits: 64 type: MT MCP arch: Zen
level: v3 note: check built: 2017-19 process: GF 14nm family: 0x17 (23) model-id: 0x11 (17)
stepping: 0 microcode: 0x8101007
Topology: cpus: 1x dies: 1 clusters: 1 cores: 4 threads: 8 tpc: 2 smt: enabled cache:
L1: 384 KiB desc: d-4x32 KiB; i-4x64 KiB L2: 2 MiB desc: 4x512 KiB L3: 4 MiB desc: 1x4 MiB
Speed (MHz): avg: 1499 min/max: 1600/3200 boost: enabled scaling: driver: acpi-cpufreq
governor: schedutil cores: 1: 1499 2: 1499 3: 1499 4: 1499 5: 1499 6: 1499 7: 1499 8: 1499
bogomips: 51120
Flags-basic: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Vulnerabilities:
Type: gather_data_sampling status: Not affected
Type: ghostwrite status: Not affected
Type: indirect_target_selection status: Not affected
Type: itlb_multihit status: Not affected
Type: l1tf status: Not affected
Type: mds status: Not affected
Type: meltdown status: Not affected
Type: mmio_stale_data status: Not affected
Type: old_microcode status: Not affected
Type: reg_file_data_sampling status: Not affected
Type: retbleed mitigation: untrained return thunk; SMT vulnerable
Type: spec_rstack_overflow mitigation: Safe RET
Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl
Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
Type: spectre_v2 mitigation: Retpolines; IBPB: conditional; STIBP: disabled; RSB filling;
PBRSB-eIBRS: Not affected; BHI: Not affected
Type: srbds status: Not affected
Type: tsa status: Not affected
Type: tsx_async_abort status: Not affected
Type: vmscape mitigation: IBPB before exit to userspace
Graphics:
Device-1: Advanced Micro Devices [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Mobile
Series] vendor: Lenovo driver: amdgpu v: kernel arch: GCN-5 code: Vega process: GF 14nm
built: 2017-20 pcie: gen: 3 speed: 8 GT/s lanes: 16 ports: active: DP-2 empty: DP-1,DP-3,DP-4
bus-ID: 04:00.0 chip-ID: 1002:15dd class-ID: 0300 temp: 51.0 C
Device-2: Logitech BRIO Ultra HD Webcam driver: snd-usb-audio,uvcvideo type: USB rev: 2.1
speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 2-3:2 chip-ID: 046d:085e class-ID: 0102
serial: <filter>
Device-3: Realtek RTL2838 DVB-T driver: usbfs type: USB rev: 2.0 speed: 480 Mb/s lanes: 1
mode: 2.0 bus-ID: 4-1.2:4 chip-ID: 0bda:2838 class-ID: 0000 serial: <filter>
Display: unspecified server: N/A driver: gpu: amdgpu tty: 213x54
Monitor-1: DP-2 model: Toshiba T749-fHD720 serial: <filter> built: 2011 res: 1920x1080
gamma: 1.2 size: 708x398mm (27.87x15.67") modes: max: 1920x1080 min: 640x480
API: EGL v: 1.5 hw: drv: amd radeonsi platforms: device: 0 drv: radeonsi device: 1 drv: swrast
gbm: drv: radeonsi surfaceless: drv: radeonsi inactive: wayland,x11
API: OpenGL v: 4.6 compat-v: 4.5 vendor: mesa v: 26.0.3-arch1.1 note: console (EGL sourced)
renderer: AMD Radeon Vega 11 Graphics (radeonsi raven ACO DRM 3.64 6.18.19-1-lts), llvmpipe
(LLVM 22.1.1 256 bits)
API: Vulkan v: 1.4.341 layers: 4 device: 0 type: integrated-gpu name: AMD Radeon Vega 11
Graphics (RADV RAVEN) driver: mesa radv v: 26.0.3-arch1.1 device-ID: 1002:15dd surfaces: N/A
Info: Tools: api: eglinfo, glxinfo, vulkaninfo gpu: nvidia-smi,radeontop x11: xprop
Audio:
Device-1: Advanced Micro Devices [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio
driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 04:00.1
chip-ID: 1002:15de class-ID: 0403
Device-2: Advanced Micro Devices [AMD] Audio Coprocessor driver: snd_pci_acp3x v: kernel
alternate: snd_rn_pci_acp3x, snd_pci_acp5x, snd_pci_acp6x, snd_acp_pci, snd_rpl_pci_acp6x,
snd_pci_ps, snd_sof_amd_renoir, snd_sof_amd_rembrandt, snd_sof_amd_vangogh,
snd_sof_amd_acp63, snd_sof_amd_acp70 pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 04:00.5
chip-ID: 1022:15e2 class-ID: 0480
Device-3: Advanced Micro Devices [AMD] Ryzen HD Audio vendor: Lenovo driver: snd_hda_intel
v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 04:00.6 chip-ID: 1022:15e3 class-ID: 0403
Device-4: Logitech BRIO Ultra HD Webcam driver: snd-usb-audio,uvcvideo type: USB rev: 2.1
speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 2-3:2 chip-ID: 046d:085e class-ID: 0102
serial: <filter>
API: ALSA v: k6.18.19-1-lts status: kernel-api tools: N/A
Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
Server-2: PipeWire v: 1.6.2 status: active with: 1: pipewire-pulse status: active
2: wireplumber status: active 3: pipewire-alsa type: plugin 4: pw-jack type: plugin
tools: pactl,pw-cat,pw-cli,wpctl
Network:
Device-1: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet vendor: Lenovo
driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: fc00 bus-ID: 01:00.0
chip-ID: 10ec:8168 class-ID: 0200
IF: enp1s0f0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Device-2: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter vendor: Lenovo
driver: ath10k_pci v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1 bus-ID: 02:00.0
chip-ID: 168c:003e class-ID: 0280 temp: 47.0 C
IF: wlp2s0 state: down mac: <filter>
IF-ID-1: tailscale0 state: unknown speed: -1 duplex: full mac: N/A
Info: services: NetworkManager, nginx, sshd, systemd-timesyncd, wpa_supplicant
Bluetooth:
Device-1: Qualcomm Atheros QCA61x4 Bluetooth 4.0 driver: btusb v: 0.8 type: USB rev: 2.0
speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 4-1.4:6 chip-ID: 0cf3:e300 class-ID: e001
Report: rfkill ID: hci0 rfk-id: 1 state: down bt-service: disabled rfk-block: hardware: no
software: no address: see --recommends
Drives:
Local Storage: total: 1.82 TiB used: 382.17 GiB (20.5%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Western Digital model: WD Blue SN570 1TB
size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 tech: SSD
serial: <filter> fw-rev: 234110WD temp: 43.9 C scheme: GPT
ID-2: /dev/sda maj-min: 8:0 vendor: Crucial model: CT1000BX500SSD1 size: 931.51 GiB
block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s tech: SSD serial: <filter>
fw-rev: 061 scheme: GPT
Partition:
ID-1: / raw-size: 930.51 GiB size: 930.51 GiB (100.00%) used: 138.39 GiB (14.9%) fs: btrfs
dev: /dev/nvme0n1p2 maj-min: 259:2
ID-2: /boot raw-size: 1024 MiB size: 1022 MiB (99.80%) used: 71.3 MiB (7.0%) fs: vfat
dev: /dev/nvme0n1p1 maj-min: 259:1
Swap:
Kernel: swappiness: 60 (default) cache-pressure: 100 (default) zswap: no
ID-1: swap-1 type: zram size: 4 GiB used: 0 KiB (0.0%) priority: 100 comp: zstd
avail: lzo-rle,lzo,lz4,lz4hc,deflate,842 dev: /dev/zram0
Sensors:
System Temperatures: cpu: 60.1 C mobo: N/A gpu: amdgpu temp: 60.0 C
Fan Speeds (rpm): N/A
Info:
Memory: total: 32 GiB note: est. available: 30.04 GiB used: 2.49 GiB (8.3%)
Processes: 262 Power: uptime: 11m states: freeze,mem,disk suspend: deep avail: s2idle
wakeups: 0 hibernate: platform avail: shutdown, reboot, suspend, test_resume image: 11.98 GiB
Init: systemd v: 260 default: graphical tool: systemctl
Packages: pm: pacman pkgs: 1431 libs: 250 tools: yay Compilers: clang: 22.1.1 gcc: 15.2.1
Shell: Bash v: 5.3.9 running-in: pty pts/1 inxi: 3.3.40
Last edited by qualia (2026-03-22 06:12:11)
Offline
Try replacing its driver with https://aur.archlinux.org/packages/r8168-dkms and blacklist the r8169 driver for now... rebuild initramfs and reboot...
Last edited by 5hridhyan (2026-03-22 08:30:54)
---
Offline
Mar 21 12:17:03 HAL9000 kernel: irq 7: nobody cared (try booting with the "irqpoll" option)
Mar 21 12:17:03 HAL9000 kernel: CPU: 3 UID: 0 PID: 0 Comm: swapper/3 Not tainted 6.18.19-1-lts #1 PREEMPT(voluntary) 7b82d74327f0bd97f4186090924e336ca19b1df4
Mar 21 12:17:03 HAL9000 kernel: Hardware name: LENOVO 10VG0006US/3130, BIOS M1XKT39A 04/08/2019
Mar 21 12:17:03 HAL9000 kernel: Call Trace:
Mar 21 12:17:03 HAL9000 kernel: <IRQ>
Mar 21 12:17:03 HAL9000 kernel: dump_stack_lvl+0x5d/0x80
Mar 21 12:17:03 HAL9000 kernel: __report_bad_irq+0x35/0xbc
Mar 21 12:17:03 HAL9000 kernel: note_interrupt.cold+0x28/0x66
Mar 21 12:17:03 HAL9000 kernel: handle_irq_event+0x72/0x80
Mar 21 12:17:03 HAL9000 kernel: handle_fasteoi_irq+0xda/0x1f0
Mar 21 12:17:03 HAL9000 kernel: __common_interrupt+0x41/0xa0
Mar 21 12:17:03 HAL9000 kernel: common_interrupt+0x80/0xa0
Mar 21 12:17:03 HAL9000 kernel: </IRQ>
Mar 21 12:17:03 HAL9000 kernel: <TASK>
Mar 21 12:17:03 HAL9000 kernel: asm_common_interrupt+0x26/0x40
Mar 21 12:17:03 HAL9000 kernel: RIP: 0010:cpuidle_enter_state+0xbb/0x410
Mar 21 12:17:03 HAL9000 kernel: Code: 00 00 e8 e8 2f 02 ff e8 23 f2 ff ff 48 89 c5 0f 1f 44 00 00 31 ff e8 04 ad 00 ff 45 84 ff 0f 85 33 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 7c 01 00 00 49 63 ce 48 2b 2c 24 48 6b d1 68 48 89
Mar 21 12:17:03 HAL9000 kernel: RSP: 0018:ffffd220401bfe78 EFLAGS: 00000246
Mar 21 12:17:03 HAL9000 kernel: RAX: ffff8f3dfadae000 RBX: 0000000000000001 RCX: 0000000000000000
Mar 21 12:17:03 HAL9000 kernel: RDX: 000000006bea1b6a RSI: fffffffb368a2c7a RDI: 0000000000000000
Mar 21 12:17:03 HAL9000 kernel: RBP: 000000006bea1b6a R08: 0000000000000002 R09: 000001d1a94a2000
Mar 21 12:17:03 HAL9000 kernel: R10: 00000000ffffffff R11: ffffffffffffffff R12: ffff8f3600db9400
Mar 21 12:17:03 HAL9000 kernel: R13: ffffffff837fe740 R14: 0000000000000001 R15: 0000000000000000
Mar 21 12:17:03 HAL9000 kernel: ? cpuidle_enter_state+0xac/0x410
Mar 21 12:17:03 HAL9000 kernel: cpuidle_enter+0x31/0x50
Mar 21 12:17:03 HAL9000 kernel: do_idle+0x12d/0x220
Mar 21 12:17:03 HAL9000 kernel: cpu_startup_entry+0x29/0x30
Mar 21 12:17:03 HAL9000 kernel: start_secondary+0x119/0x140
Mar 21 12:17:03 HAL9000 kernel: common_startup_64+0x13e/0x141
Mar 21 12:17:03 HAL9000 kernel: </TASK>
Mar 21 12:17:03 HAL9000 kernel: handlers:
Mar 21 12:17:03 HAL9000 kernel: [<00000000d5536624>] amd_gpio_irq_handler
Mar 21 12:17:03 HAL9000 kernel: Disabling IRQ #7
I've gotten random freezes. I have no clue why.
I originally thought it was panic, but after setting up kdumps and SysRq, I realized the whole system was just frozen, not even a panic.
sysrq worked?
Mar 21 17:38:23 HAL9000 kernel: perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
Generically https://wiki.archlinux.org/title/Ryzen# … k_freezing - the CPU might fail to wake from the deep states and that would most certainly not happen
I've stress-tested with `stress` for 24 hours (hdd, cpu, vm) and cannot get a reliable crash.
when putting it under stress
Why is there "pci=nommconf"?
Offline
I guess there is a sysrq log, but nothing ever happened on the screen nor did it ever shut down. The IRQ 7 nobody cares happens at boot, the freeze doesn't happen till hours/days/weeks later. "pci=nommconf" was left over from me trying things to fix.
Offline
https://wiki.archlinux.org/title/Keyboa … el_(SysRq)
You have to explicitly enable that first and then sysrq + REISUB (give each step a couple of seconds) will cause a controlled shutdown (and write the journal to disk)
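For reference, a minimal Python sketch of how the kernel.sysrq value relates to the REISUB sequence. The bit values are the ones listed in the kernel's SysRq documentation. The helper name `missing_for_reisub` and the key-to-bit table are only illustrative, not something from this thread.

```python
# Bit values as documented for the kernel.sysrq sysctl.
SYSRQ_BITS = {
    2: "console logging level",
    4: "keyboard control (SAK, unraw)",
    8: "debugging dumps",
    16: "sync",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of RT tasks",
}

# Which bit each REISUB key needs (illustrative mapping).
REISUB = {"R": 4, "E": 64, "I": 64, "S": 16, "U": 32, "B": 128}


def missing_for_reisub(sysrq_value: int) -> list:
    """Return the REISUB keys that the given kernel.sysrq value does not allow."""
    if sysrq_value == 1:  # exactly 1 enables every function
        return []
    return [key for key, bit in REISUB.items() if not sysrq_value & bit]


if __name__ == "__main__":
    # On a live system the value could be read from /proc/sys/kernel/sysrq.
    for value in (0, 16, 244, 1):
        print(value, missing_for_reisub(value))
```

An empty list means the whole sequence is permitted. Anything else names the keys that would be ignored even on a healthy system.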
In parallel you can just try to limit the C-states: "processor.max_cstate=1" is the most aggressive setting but maybe a good test to see whether it does something for you at all (it'll effectively prevent the CPU from stepping down and raise power draw/shorten battery time)
If (it seems) yes, boldly go for "processor.max_cstate=5" next.
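To double-check that a parameter like this actually made it onto the running kernel's command line, something along these lines could help. It is only a sketch: `parse_cmdline` is a made-up helper, it does not handle quoted values containing spaces, and the sample string is the command line from the inxi output earlier in the thread.

```python
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {parameter: value}; bare flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        # Split on the first "=" only, so "root=UUID=..." keeps its full value.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


if __name__ == "__main__":
    # On a live system: cmdline = open("/proc/cmdline").read()
    cmdline = (
        "BOOT_IMAGE=/vmlinuz-linux-lts "
        "root=UUID=ad6e819a-9e48-448d-a129-87bad23db1ef rw zswap.enabled=0 "
        "rootfstype=btrfs crashkernel=256M loglevel=3 pci=nommconf"
    )
    params = parse_cmdline(cmdline)
    print("processor.max_cstate:", params.get("processor.max_cstate", "not set"))
    print("pci:", params.get("pci", "not set"))
```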
On a formal note, please avoid bloating the thread w/ pointless full quotes of previous posts to keep everyone's mousewheels cool
Offline
Okay maybe I am confused or was confusing. SysRq IS ENABLED and I can trigger it under normal circumstances. When the freeze happens I CANNOT trigger sysrq. NOTHING happens visually when I REISUB under the freeze.
I am asking how to fix the freezing.
Offline
The kernel completely halted - have you already tried to limit the c-states?
Offline
@seth no I have not tried limiting cstates, will do that. I got rid of the `pci=nommconf` and now I'm getting these a bunch but freeze is still random with no logs:
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3: AER: Multiple Correctable error message received from 0000:02:00.0
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Transmitter ID)
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: device [168c:003e] error status/mask=00001081/00006000
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: [ 0] RxErr (First)
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: [ 7] BadDLLP
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: [12] Timeout
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3: AER: Multiple Correctable error message received from 0000:02:00.0
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Transmitter ID)
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3: device [1022:15d3] error status/mask=00001000/00006000
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3: [12] Timeout
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: device [168c:003e] error status/mask=00000041/00006000
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: [ 0] RxErr (First)
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: [ 6] BadTLP
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: AER: Error of this Agent is reported first
Offline
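As an aside on reading these AER lines: the "error status/mask" pair is a bitfield, and the bracketed numbers the kernel prints underneath are the set bit positions. A small Python sketch that decodes it, assuming the standard PCIe Correctable Error Status bit layout and the short names the kernel uses. The function name `decode_correctable` is made up for illustration.

```python
# Correctable Error Status bit positions, with the short names the kernel
# prints in its AER messages.
AER_CORRECTABLE_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}


def decode_correctable(status_mask: str):
    """Decode a 'status/mask' pair such as '00001081/00006000'.

    Returns (errors currently flagged, errors masked from reporting).
    """
    status_hex, mask_hex = status_mask.split("/")
    status, mask = int(status_hex, 16), int(mask_hex, 16)
    flagged = [n for b, n in AER_CORRECTABLE_BITS.items() if (status >> b) & 1]
    masked = [n for b, n in AER_CORRECTABLE_BITS.items() if (mask >> b) & 1]
    return flagged, masked


if __name__ == "__main__":
    for pair in ("00001081/00006000", "00001000/00006000", "00000041/00006000"):
        print(pair, decode_correctable(pair))
```

For the three status values in the log above this gives RxErr + BadDLLP + Timeout, Timeout alone, and RxErr + BadTLP, which matches the bracketed lines the kernel printed. All of them are link-level (physical / data link layer) errors that the hardware corrected.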
Just to rule that out, can you disable the wifi chip?
(though I don't think it would result in kernel halts)
Offline
With the wifi chip disabled and with processor.max_cstate=1 I still get freezes.
Offline
Do you have updated journals covering those?
It's not the most common symptom, but https://wiki.archlinux.org/title/Ryzen#System_halts
Offline
Edit: double post, forum being DDOS'd?
Last edited by seth (2026-03-26 07:46:40)
Offline
I thought it was my hard drive, so I disabled it, but then it happened again. Getting an error on NVME. Maybe it's a bad NVMe?
I just started playing around with my NAS so that's the logs at the end, not sure if it's an issue.
I tried updating the UEFI, but I gave up. fwupdmgr didn't pick up anything.
Offline
Feeling more and more like something with my NVME SSD. I've been doing a ton of transfers of large media files, including downloading and uploading to a NAS, and it's consistently failing.
Any ideas on how to verify this?
Offline
First and foremost by cutting out the NAS (ie. network) - just use a usb drive or stress the disk w/ dd, https://wiki.archlinux.org/title/Benchmarking#dd (that's for benchmarking, you might need bigger files?)
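As a rough alternative to dd, here is a minimal Python sketch of the same idea: write a large amount of random (so effectively incompressible) data to one local drive with no network involved, then fsync. The function name `write_stress` and the example target path are made up for illustration, and this is not a replacement for the dd invocations on the wiki page.

```python
import os
import time


def write_stress(path: str, total_bytes: int, block_size: int = 1 << 20) -> float:
    """Write total_bytes of random data to path, fsync, and return MiB/s."""
    block = os.urandom(block_size)  # one random block, reused for every write
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually reached the device
    elapsed = time.monotonic() - start
    return written / (1 << 20) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Hypothetical target; point it at the drive under test and raise the size.
    rate = write_stress("stress.bin", 64 * 1024 * 1024)
    print(f"{rate:.1f} MiB/s")
    os.remove("stress.bin")
```

Running it against one drive at a time (for example a file under /mnt/HDD, then one on the root filesystem) keeps the two drives separable as suspects.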
Offline
ugh this sucks and is the worst to debug. if you had to suggest would you suspect the ssd or the ethernet? or do we even know enough
Offline
Keep a terminal running "dmesg -W" visible at all times; if you're losing the root partition you'd likely get some IO errors before it actually causes stuff to freeze.
Since you're already getting bus errors from the wireless device (which I assume connects your NAS - do you have a wired connection and/or can deactivate the wifi chip and use a dongle or in doubt your phone¹ for a while?) there's a significant chance that this is coming from there (also because the nvme failing, even if it holds your root partition, "should" not break the sysrq invocation).
Edit:
¹ https://wiki.archlinux.org/title/Tethering
Last edited by seth (2026-03-28 07:34:54)
Offline
I am using a wired connection and the wifi chip is deactivated. I did that after you told me a while back.
Offline
Okay yeah, I can confirm it's the SSD. Ugh noooooooooo. Consistently fails after like 30 seconds of dding from /dev/random. Nothing shows up in the dmesg -W terminal.
Offline
Well I said that and now I can't get it to do it again... did see this once tho:
Mar 29 19:40:24 HAL9000 kernel: pcieport 0000:00:01.4: AER: Correctable error message received from 0000:03:00.0
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: device [15b7:501a] error status/mask=00000001/0000e000
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: [ 0] RxErr (First)
Mar 29 19:48:00 HAL9000 kernel: device-mapper: uevent: version 1.0.3
Mar 29 19:48:00 HAL9000 kernel: device-mapper: ioctl: 4.50.0-ioctl (2025-04-28) initialised: dm-devel@lists.linux.dev
Offline
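Since AER messages have now shown up for more than one device, it may help to tally which root port reported errors from which device over a whole journal. A minimal Python sketch, assuming the "error message received from" wording seen in the logs in this thread. `summarize_aer` is a hypothetical helper name, and the input would be kernel log text, e.g. saved from journalctl -k.

```python
import re
from collections import Counter

BDF = r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]"

# Matches e.g.
# "pcieport 0000:00:01.4: AER: Correctable error message received from 0000:03:00.0"
AER_RE = re.compile(
    rf"pcieport (?P<port>{BDF}): AER: (?:Multiple )?"
    rf"(?P<severity>.+?) error message received from (?P<source>{BDF})"
)


def summarize_aer(lines):
    """Count AER reports per (root port, source device, severity)."""
    counts = Counter()
    for line in lines:
        m = AER_RE.search(line)
        if m:
            counts[(m["port"], m["source"], m["severity"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage idea: journalctl -k | python aer_summary.py
    for (port, source, severity), n in summarize_aer(sys.stdin).most_common():
        print(f"{n:5d}  {severity:12s} port {port} <- device {source}")
```

If the counts cluster on one root port, that points at that device or slot. If they spread across several ports, something shared upstream looks more likely.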
Uh oh: I did a full reinstall on the hard drive and ran the same disk stress test and it froze. So either both drives are bad in the exact same way? or it's not the ram or the drives (as I ran memtest), which would mean it's what? the CPU??
Offline
same disk stress test
Writing where exactly?
The errors you've posted so far are all on the bus (wifi and nvme), do you have one for the latest freeze as well?
lspci -tv
and try to add "pcie_aspm=off" to the https://wiki.archlinux.org/title/Kernel_parameters
Offline
I already have pcie_aspm=off. The logs show nothing new. I stressed my ssd with my hdd unmounted and it froze. I then stressed hdd with my ssd unmounted and it froze.
same old:
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: device [15b7:501a] error status/mask=00000001/0000e000
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: [ 0] RxErr (First)
-[0000:00]-+-00.0 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root Complex
           +-00.2 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU
           +-01.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
           +-01.2-[01]--+-00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller
           |            +-00.1 Realtek Semiconductor Co., Ltd. RTL8111xP UART #1
           |            +-00.2 Realtek Semiconductor Co., Ltd. RTL8111xP UART #2
           |            +-00.3 Realtek Semiconductor Co., Ltd. RTL8111xP IPMI interface
           |            \-00.4 Realtek Semiconductor Co., Ltd. RTL811x EHCI host controller
           +-01.3-[02]----00.0 Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
           +-01.4-[03]----00.0 Sandisk Corp SanDisk Ultra 3D / WD Blue SN570 NVMe SSD (DRAM-less)
           +-08.0 Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
           +-08.1-[04]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series]
           |            +-00.1 Advanced Micro Devices, Inc. [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio Controller
           |            +-00.2 Advanced Micro Devices, Inc. [AMD] Raven/Raven2/FireFlight/Renoir/Cezanne Platform Security Processor
           |            +-00.3 Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
           |            +-00.4 Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
           |            +-00.5 Advanced Micro Devices, Inc. [AMD] Audio Coprocessor
           |            \-00.6 Advanced Micro Devices, Inc. [AMD] Ryzen HD Audio Controller
           +-08.2-[05]----00.0 Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
           +-14.0 Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
           +-14.3 Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
           +-18.0 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 0
           +-18.1 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 1
           +-18.2 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 2
           +-18.3 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 3
           +-18.4 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 4
           +-18.5 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 5
           +-18.6 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 6
           \-18.7 Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 7
I just bought another computer so I will plug ssd hdd and ram into that and see how it fares.
Last edited by qualia (2026-04-02 17:53:15)
Offline
It's gotta be something really low-level. I boot into the BIOS fine, and nothing breaks there, but when I go into the Lenovo Diagnostic Tool it freezes before I can even hit a key.
Offline
I then stressed hdd with my ssd unmounted and it froze.
Did that result in the
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)?
Wifi and ethernet are on the same bus as the nvme.
Can you disable those in the UEFI?
Offline