
#1 2026-03-22 05:33:45

qualia
Member
Registered: 2024-01-01
Posts: 26

Random Unsolvable Freezes

Ever since I got my ThinkCentre M715q and installed Arch (2 years ago now), I've gotten random freezes. I have no clue why.

I originally thought it was a kernel panic, but after setting up kdump and SysRq, I realized the whole system was just frozen, not even a panic.
Nothing much shows up in the journal besides some IRQ #7 message.

Journal from a crash: https://0x0.st/PpN8.txt

Things tried:
I ran memtest for 24 hours with no issues.
I've stress-tested with `stress` for 24 hours (hdd, cpu, vm) and cannot get a reliable crash.
I've fully disabled swap.
I have the latest microcode.
I am using linux-lts.
I have earlyoom.

I would appreciate any help, and will try anything. One thing I did notice was that it happened twice when I was downloading large files for a long time, but once again I can't replicate it :C.

 Linux 6.18.19-1-lts #1 SMP PREEMPT_DYNAMIC Thu, 19 Mar 2026 15:56:44 +0000 x86_64 GNU/Linux 
# Static information about the filesystems.
# See fstab(5) for details.

# <file system> <dir> <type> <options> <dump> <pass>
# /dev/nvme0n1p2
UUID=ad6e819a-9e48-448d-a129-87bad23db1ef / btrfs rw,relatime,compress=zstd:3,ssd,discard=async,space_cache=v2,subvol=/ 0 0

# /dev/nvme0n1p1
UUID=0D0A-1287 /boot vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro 0 2

# /dev/sda1
UUID=9af0a80d-63d0-4ca3-832c-1be8619b991d /mnt/HDD btrfs rw,relatime,compress=zstd:3,discard=async,space_cache=v2 0 0
System:
  Kernel: 6.18.19-1-lts arch: x86_64 bits: 64 compiler: gcc v: 15.2.1 clocksource: hpet
    avail: acpi_pm parameters: BOOT_IMAGE=/vmlinuz-linux-lts
    root=UUID=ad6e819a-9e48-448d-a129-87bad23db1ef rw zswap.enabled=0 rootfstype=btrfs
    crashkernel=256M loglevel=3 pci=nommconf
  Console: pty pts/1 Distro: Arch Linux
Machine:
  Type: Mini-pc System: LENOVO product: 10VG0006US v: ThinkCentre M715q
    serial: <superuser required> Chassis: type: 35 serial: <superuser required>
  Mobo: LENOVO model: 3130 v: SDK0J40697 WIN 3305189998500 serial: <superuser required>
    part-nu: LENOVO_MT_10VG_BU_Think_FM_ThinkCentre M715q uuid: <superuser required> Firmware: UEFI
    vendor: LENOVO v: M1XKT39A date: 04/08/2019
CPU:
  Info: model: AMD Ryzen 5 PRO 2400GE w/ Radeon Vega Graphics bits: 64 type: MT MCP arch: Zen
    level: v3 note: check built: 2017-19 process: GF 14nm family: 0x17 (23) model-id: 0x11 (17)
    stepping: 0 microcode: 0x8101007
  Topology: cpus: 1x dies: 1 clusters: 1 cores: 4 threads: 8 tpc: 2 smt: enabled cache:
    L1: 384 KiB desc: d-4x32 KiB; i-4x64 KiB L2: 2 MiB desc: 4x512 KiB L3: 4 MiB desc: 1x4 MiB
  Speed (MHz): avg: 1499 min/max: 1600/3200 boost: enabled scaling: driver: acpi-cpufreq
    governor: schedutil cores: 1: 1499 2: 1499 3: 1499 4: 1499 5: 1499 6: 1499 7: 1499 8: 1499
    bogomips: 51120
  Flags-basic: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: ghostwrite status: Not affected
  Type: indirect_target_selection status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: old_microcode status: Not affected
  Type: reg_file_data_sampling status: Not affected
  Type: retbleed mitigation: untrained return thunk; SMT vulnerable
  Type: spec_rstack_overflow mitigation: Safe RET
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Retpolines; IBPB: conditional; STIBP: disabled; RSB filling;
    PBRSB-eIBRS: Not affected; BHI: Not affected
  Type: srbds status: Not affected
  Type: tsa status: Not affected
  Type: tsx_async_abort status: Not affected
  Type: vmscape mitigation: IBPB before exit to userspace
Graphics:
  Device-1: Advanced Micro Devices [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Mobile
    Series] vendor: Lenovo driver: amdgpu v: kernel arch: GCN-5 code: Vega process: GF 14nm
    built: 2017-20 pcie: gen: 3 speed: 8 GT/s lanes: 16 ports: active: DP-2 empty: DP-1,DP-3,DP-4
    bus-ID: 04:00.0 chip-ID: 1002:15dd class-ID: 0300 temp: 51.0 C
  Device-2: Logitech BRIO Ultra HD Webcam driver: snd-usb-audio,uvcvideo type: USB rev: 2.1
    speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 2-3:2 chip-ID: 046d:085e class-ID: 0102
    serial: <filter>
  Device-3: Realtek RTL2838 DVB-T driver: usbfs type: USB rev: 2.0 speed: 480 Mb/s lanes: 1
    mode: 2.0 bus-ID: 4-1.2:4 chip-ID: 0bda:2838 class-ID: 0000 serial: <filter>
  Display: unspecified server: N/A driver: gpu: amdgpu tty: 213x54
  Monitor-1: DP-2 model: Toshiba T749-fHD720 serial: <filter> built: 2011 res: 1920x1080
    gamma: 1.2 size: 708x398mm (27.87x15.67") modes: max: 1920x1080 min: 640x480
  API: EGL v: 1.5 hw: drv: amd radeonsi platforms: device: 0 drv: radeonsi device: 1 drv: swrast
    gbm: drv: radeonsi surfaceless: drv: radeonsi inactive: wayland,x11
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: mesa v: 26.0.3-arch1.1 note: console (EGL sourced)
    renderer: AMD Radeon Vega 11 Graphics (radeonsi raven ACO DRM 3.64 6.18.19-1-lts), llvmpipe
    (LLVM 22.1.1 256 bits)
  API: Vulkan v: 1.4.341 layers: 4 device: 0 type: integrated-gpu name: AMD Radeon Vega 11
    Graphics (RADV RAVEN) driver: mesa radv v: 26.0.3-arch1.1 device-ID: 1002:15dd surfaces: N/A
  Info: Tools: api: eglinfo, glxinfo, vulkaninfo gpu: nvidia-smi,radeontop x11: xprop
Audio:
  Device-1: Advanced Micro Devices [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio
    driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 04:00.1
    chip-ID: 1002:15de class-ID: 0403
  Device-2: Advanced Micro Devices [AMD] Audio Coprocessor driver: snd_pci_acp3x v: kernel
    alternate: snd_rn_pci_acp3x, snd_pci_acp5x, snd_pci_acp6x, snd_acp_pci, snd_rpl_pci_acp6x,
    snd_pci_ps, snd_sof_amd_renoir, snd_sof_amd_rembrandt, snd_sof_amd_vangogh,
    snd_sof_amd_acp63, snd_sof_amd_acp70 pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 04:00.5
    chip-ID: 1022:15e2 class-ID: 0480
  Device-3: Advanced Micro Devices [AMD] Ryzen HD Audio vendor: Lenovo driver: snd_hda_intel
    v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 04:00.6 chip-ID: 1022:15e3 class-ID: 0403
  Device-4: Logitech BRIO Ultra HD Webcam driver: snd-usb-audio,uvcvideo type: USB rev: 2.1
    speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 2-3:2 chip-ID: 046d:085e class-ID: 0102
    serial: <filter>
  API: ALSA v: k6.18.19-1-lts status: kernel-api tools: N/A
  Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
  Server-2: PipeWire v: 1.6.2 status: active with: 1: pipewire-pulse status: active
    2: wireplumber status: active 3: pipewire-alsa type: plugin 4: pw-jack type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet vendor: Lenovo
    driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: fc00 bus-ID: 01:00.0
    chip-ID: 10ec:8168 class-ID: 0200
  IF: enp1s0f0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Device-2: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter vendor: Lenovo
    driver: ath10k_pci v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1 bus-ID: 02:00.0
    chip-ID: 168c:003e class-ID: 0280 temp: 47.0 C
  IF: wlp2s0 state: down mac: <filter>
  IF-ID-1: tailscale0 state: unknown speed: -1 duplex: full mac: N/A
  Info: services: NetworkManager, nginx, sshd, systemd-timesyncd, wpa_supplicant
Bluetooth:
  Device-1: Qualcomm Atheros QCA61x4 Bluetooth 4.0 driver: btusb v: 0.8 type: USB rev: 2.0
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 4-1.4:6 chip-ID: 0cf3:e300 class-ID: e001
  Report: rfkill ID: hci0 rfk-id: 1 state: down bt-service: disabled rfk-block: hardware: no
    software: no address: see --recommends
Drives:
  Local Storage: total: 1.82 TiB used: 382.17 GiB (20.5%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Western Digital model: WD Blue SN570 1TB
    size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 tech: SSD
    serial: <filter> fw-rev: 234110WD temp: 43.9 C scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Crucial model: CT1000BX500SSD1 size: 931.51 GiB
    block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s tech: SSD serial: <filter>
    fw-rev: 061 scheme: GPT
Partition:
  ID-1: / raw-size: 930.51 GiB size: 930.51 GiB (100.00%) used: 138.39 GiB (14.9%) fs: btrfs
    dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-2: /boot raw-size: 1024 MiB size: 1022 MiB (99.80%) used: 71.3 MiB (7.0%) fs: vfat
    dev: /dev/nvme0n1p1 maj-min: 259:1
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) zswap: no
  ID-1: swap-1 type: zram size: 4 GiB used: 0 KiB (0.0%) priority: 100 comp: zstd
    avail: lzo-rle,lzo,lz4,lz4hc,deflate,842 dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 60.1 C mobo: N/A gpu: amdgpu temp: 60.0 C
  Fan Speeds (rpm): N/A
Info:
  Memory: total: 32 GiB note: est. available: 30.04 GiB used: 2.49 GiB (8.3%)
  Processes: 262 Power: uptime: 11m states: freeze,mem,disk suspend: deep avail: s2idle
    wakeups: 0 hibernate: platform avail: shutdown, reboot, suspend, test_resume image: 11.98 GiB
    Init: systemd v: 260 default: graphical tool: systemctl
  Packages: pm: pacman pkgs: 1431 libs: 250 tools: yay Compilers: clang: 22.1.1 gcc: 15.2.1
    Shell: Bash v: 5.3.9 running-in: pty pts/1 inxi: 3.3.40

Last edited by qualia (2026-03-22 06:12:11)


#2 2026-03-22 08:17:35

5hridhyan
Member
From: Asia
Registered: 2025-12-25
Posts: 562

Re: Random Unsolvable Freezes

Try replacing the Ethernet driver with https://aur.archlinux.org/packages/r8168-dkms and blacklist the r8169 driver for now, then rebuild the initramfs and reboot.

Last edited by 5hridhyan (2026-03-22 08:30:54)



#3 2026-03-22 08:52:04

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

Mar 21 12:17:03 HAL9000 kernel: irq 7: nobody cared (try booting with the "irqpoll" option)
Mar 21 12:17:03 HAL9000 kernel: CPU: 3 UID: 0 PID: 0 Comm: swapper/3 Not tainted 6.18.19-1-lts #1 PREEMPT(voluntary)  7b82d74327f0bd97f4186090924e336ca19b1df4
Mar 21 12:17:03 HAL9000 kernel: Hardware name: LENOVO 10VG0006US/3130, BIOS M1XKT39A 04/08/2019
Mar 21 12:17:03 HAL9000 kernel: Call Trace:
Mar 21 12:17:03 HAL9000 kernel:  <IRQ>
Mar 21 12:17:03 HAL9000 kernel:  dump_stack_lvl+0x5d/0x80
Mar 21 12:17:03 HAL9000 kernel:  __report_bad_irq+0x35/0xbc
Mar 21 12:17:03 HAL9000 kernel:  note_interrupt.cold+0x28/0x66
Mar 21 12:17:03 HAL9000 kernel:  handle_irq_event+0x72/0x80
Mar 21 12:17:03 HAL9000 kernel:  handle_fasteoi_irq+0xda/0x1f0
Mar 21 12:17:03 HAL9000 kernel:  __common_interrupt+0x41/0xa0
Mar 21 12:17:03 HAL9000 kernel:  common_interrupt+0x80/0xa0
Mar 21 12:17:03 HAL9000 kernel:  </IRQ>
Mar 21 12:17:03 HAL9000 kernel:  <TASK>
Mar 21 12:17:03 HAL9000 kernel:  asm_common_interrupt+0x26/0x40
Mar 21 12:17:03 HAL9000 kernel: RIP: 0010:cpuidle_enter_state+0xbb/0x410
Mar 21 12:17:03 HAL9000 kernel: Code: 00 00 e8 e8 2f 02 ff e8 23 f2 ff ff 48 89 c5 0f 1f 44 00 00 31 ff e8 04 ad 00 ff 45 84 ff 0f 85 33 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 7c 01 00 00 49 63 ce 48 2b 2c 24 48 6b d1 68 48 89
Mar 21 12:17:03 HAL9000 kernel: RSP: 0018:ffffd220401bfe78 EFLAGS: 00000246
Mar 21 12:17:03 HAL9000 kernel: RAX: ffff8f3dfadae000 RBX: 0000000000000001 RCX: 0000000000000000
Mar 21 12:17:03 HAL9000 kernel: RDX: 000000006bea1b6a RSI: fffffffb368a2c7a RDI: 0000000000000000
Mar 21 12:17:03 HAL9000 kernel: RBP: 000000006bea1b6a R08: 0000000000000002 R09: 000001d1a94a2000
Mar 21 12:17:03 HAL9000 kernel: R10: 00000000ffffffff R11: ffffffffffffffff R12: ffff8f3600db9400
Mar 21 12:17:03 HAL9000 kernel: R13: ffffffff837fe740 R14: 0000000000000001 R15: 0000000000000000
Mar 21 12:17:03 HAL9000 kernel:  ? cpuidle_enter_state+0xac/0x410
Mar 21 12:17:03 HAL9000 kernel:  cpuidle_enter+0x31/0x50
Mar 21 12:17:03 HAL9000 kernel:  do_idle+0x12d/0x220
Mar 21 12:17:03 HAL9000 kernel:  cpu_startup_entry+0x29/0x30
Mar 21 12:17:03 HAL9000 kernel:  start_secondary+0x119/0x140
Mar 21 12:17:03 HAL9000 kernel:  common_startup_64+0x13e/0x141
Mar 21 12:17:03 HAL9000 kernel:  </TASK>
Mar 21 12:17:03 HAL9000 kernel: handlers:
Mar 21 12:17:03 HAL9000 kernel: [<00000000d5536624>] amd_gpio_irq_handler
Mar 21 12:17:03 HAL9000 kernel: Disabling IRQ #7

qualia wrote:
I've gotten random freezes. I have no clue why.
I originally thought it was panic, but after setting up kdumps and SysRq, I realized the whole system was just frozen, not even a panic.

sysrq worked?

Mar 21 17:38:23 HAL9000 kernel: perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79500

Generically https://wiki.archlinux.org/title/Ryzen# … k_freezing - the CPU might fail to wake from the deep states, and that would most certainly not happen

qualia wrote:
I've stress-tested with `stress` for 24 hours (hdd, cpu, vm) and cannot get a reliable crash.

when putting it under stress

Why is there "pci=nommconf"?


#4 2026-03-22 19:36:09

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

seth wrote:

sysrq worked?

Why is there "pci=nommconf"?

I guess there is a sysrq log, but nothing ever happened on the screen, nor did it ever shut down. The "irq 7: nobody cared" message happens at boot; the freeze doesn't happen till hours/days/weeks later. "pci=nommconf" was left over from earlier attempts to fix this.


#5 2026-03-22 21:19:11

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

https://wiki.archlinux.org/title/Keyboa … el_(SysRq)
You have to explicitly enable that first and then sysrq + REISUB (give each step a couple of seconds) will cause a controlled shutdown (and write the journal to disk)
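
Whether every key of REISUB is actually permitted depends on the value in /proc/sys/kernel/sysrq. A rough sketch for checking that, assuming the bitmask values from the kernel's sysrq documentation (4 keyboard control, 16 sync, 32 remount read-only, 64 process signalling, 128 reboot); the function name `sysrq_allows_reisub` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given kernel.sysrq value permits
# every key in the REISUB sequence.
# Assumed bit meanings (kernel sysrq documentation):
#   4 = keyboard control (R), 64 = signal processes (E, I),
#   16 = sync (S), 32 = remount read-only (U), 128 = reboot/poweroff (B)
sysrq_allows_reisub() {
    value=$1
    needed=$((4 + 64 + 16 + 32 + 128))   # 244
    # 1 means "all functions enabled"; otherwise the value is a bitmask.
    [ "$value" -eq 1 ] || [ $((value & needed)) -eq "$needed" ]
}
```

Something like `sysrq_allows_reisub "$(cat /proc/sys/kernel/sysrq)" && echo "REISUB allowed"` would then test the live setting. It only says the keys are enabled, not that a hard-frozen kernel will still react to them.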

In parallel you can just try to limit the c-state. "processor.max_cstate=1" is the most aggressive setting but maybe a good test to see whether it does something for you at all (it'll effectively prevent the CPU from stepping down and raise power draw/shorten battery time).
If (it seems) yes, boldly go for "processor.max_cstate=5" next.

On a formal note, please avoid bloating the thread w/ pointless full quotes of previous posts to keep everyone's mousewheels cool ;)


#6 2026-03-23 00:18:38

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

Okay, maybe I am confused or was confusing. SysRq IS ENABLED and I can trigger it under normal circumstances. When the freeze happens I CANNOT trigger sysrq. NOTHING happens visually when I REISUB under the freeze.

I am asking how to fix the freezing.


#7 2026-03-23 08:07:34

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

The kernel completely halted - have you already tried to limit the c-states?


#8 2026-03-25 04:37:47

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

@seth no, I have not tried limiting c-states yet; will do that. I got rid of the `pci=nommconf` and now I'm getting a bunch of these, but the freezes are still random with no logs:

Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3: AER: Multiple Correctable error message received from 0000:02:00.0
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Transmitter ID)
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0:   device [168c:003e] error status/mask=00001081/00006000
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0:    [ 0] RxErr                  (First)
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0:    [ 7] BadDLLP
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0:    [12] Timeout
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3: AER: Multiple Correctable error message received from 0000:02:00.0
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Transmitter ID)
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3:   device [1022:15d3] error status/mask=00001000/00006000
Mar 24 23:35:44 HAL9000 kernel: pcieport 0000:00:01.3:    [12] Timeout
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0:   device [168c:003e] error status/mask=00000041/00006000
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0:    [ 0] RxErr                  (First)
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0:    [ 6] BadTLP
Mar 24 23:35:44 HAL9000 kernel: ath10k_pci 0000:02:00.0: AER:   Error of this Agent is reported first
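
To see at a glance which device the bus errors come from, lines like the ones above can be tallied per PCI address. A minimal sketch, assuming the journal line layout shown in this post; `aer_summary` is a made-up name, not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and count the
# "PCIe Bus Error" reports per device address and severity.
aer_summary() {
    awk '
        /PCIe Bus Error: severity=/ {
            # The PCI address is the field just before "PCIe", e.g. "0000:02:00.0:"
            for (i = 2; i <= NF; i++) {
                if ($i == "PCIe" && $(i + 1) == "Bus") {
                    addr = $(i - 1)
                    sub(/:$/, "", addr)
                    sev = $(i + 3)          # e.g. "severity=Correctable,"
                    sub(/^severity=/, "", sev)
                    sub(/,$/, "", sev)
                    count[addr " " sev]++
                }
            }
        }
        END { for (k in count) print count[k], k }
    ' | sort -k2
}
```

It could be fed with `journalctl -k -b | aer_summary`, which should print one line per device in the form "count address severity".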


#9 2026-03-25 12:41:52

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

Just to rule that out, can you disable the wifi chip?
(though I don't think it would result in kernel halts)


#10 2026-03-26 05:20:52

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

With the wifi chip disabled and with processor.max_cstate=1 I still get freezes.


#11 2026-03-26 07:35:11

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

Do you have updated journals covering those?
It's not the most common symptom, but https://wiki.archlinux.org/title/Ryzen#System_halts


#12 2026-03-26 07:45:56

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

Edit: double post, forum being DDOS'd?

Last edited by seth (2026-03-26 07:46:40)


#13 2026-03-26 17:18:43

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

Hmm: https://termbin.com/f6au

I thought it was my hard drive, so I disabled it, but then it happened again. I'm getting an error on the NVMe. Maybe it's a bad NVMe?
I just started playing around with my NAS, so that's what the logs at the end are; not sure if it's an issue.

I tried updating the UEFI, but I gave up. fwupdmgr didn't pick up anything.


#14 2026-03-26 19:16:21

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

Feeling more and more like it's something with my NVMe SSD. I've been doing a ton of transfers of large media files, including downloading and uploading to a NAS, and it's consistently failing.

Any ideas on how to verify this?


#15 2026-03-26 21:19:58

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

First and foremost by cutting out the NAS (ie. network) - just use a usb drive or stress the disk w/ dd, https://wiki.archlinux.org/title/Benchmarking#dd (that's for benchmarking, you might need bigger files?)
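
A local dd loop along those lines could look roughly like this. It's only a sketch: `disk_stress`, the file name and the arguments are made up for illustration, and it assumes GNU dd (for `status=none` and `iflag=fullblock`):

```shell
#!/bin/sh
# Hypothetical helper: repeatedly write a file of random data with dd and
# read it back. Arguments: target directory, size in MiB, number of passes.
disk_stress() {
    dir=$1; size_mib=$2; passes=$3
    file="$dir/dd_stress.bin"
    pass=1
    while [ "$pass" -le "$passes" ]; do
        # Write random data; conv=fdatasync flushes it to the device before dd exits.
        dd if=/dev/urandom of="$file" bs=1M count="$size_mib" \
           iflag=fullblock conv=fdatasync status=none || return 1
        # Read it back (may be served from the page cache rather than the disk).
        dd if="$file" of=/dev/null bs=1M status=none || return 1
        echo "pass $pass of $passes done"
        pass=$((pass + 1))
    done
    rm -f "$file"
}
```

For example `disk_stress /mnt/HDD 8192 10` would write and re-read 8 GiB ten times on the second drive. Pick a directory on the disk you want to test and a size that fits its free space.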


#16 2026-03-28 02:32:56

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

Ugh, this sucks and is the worst to debug. If you had to guess, would you suspect the SSD or the Ethernet? Or do we not even know enough yet?


#17 2026-03-28 07:34:33

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

Keep a terminal running "dmesg -W" visible at all times; if you're losing the root partition, you'd likely get some IO errors before it actually causes stuff to freeze.
Since you're already getting bus errors from the wireless device (which I assume connects your NAS - do you have a wired connection and/or can you deactivate the wifi chip and use a dongle or, in doubt, your phone¹ for a while?), there's a significant chance that this is coming from there (also because the nvme failing, even if it holds your root partition, "should" not break the sysrq invocation).

Edit:
¹ https://wiki.archlinux.org/title/Tethering

Last edited by seth (2026-03-28 07:34:54)


#18 2026-03-30 00:32:54

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

I am using a wired connection and the wifi chip is deactivated. I did that after you told me to a while back.


#19 2026-03-30 00:39:07

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

Okay yeah, I can confirm it's the SSD. Ugh noooooooooo. Consistently fails after like 30 seconds of dding from /dev/random. Nothing shows up in the dmesg -W terminal.


#20 2026-03-30 00:54:00

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

Well, I said that and now I can't get it to do it again... I did see this once though:

Mar 29 19:40:24 HAL9000 kernel: pcieport 0000:00:01.4: AER: Correctable error message received from 0000:03:00.0
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0:   device [15b7:501a] error status/mask=00000001/0000e000
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0:    [ 0] RxErr                  (First)
Mar 29 19:48:00 HAL9000 kernel: device-mapper: uevent: version 1.0.3
Mar 29 19:48:00 HAL9000 kernel: device-mapper: ioctl: 4.50.0-ioctl (2025-04-28) initialised: dm-devel@lists.linux.dev


#21 2026-03-30 04:17:46

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

Uh oh: I did a full reinstall on the hard drive and ran the same disk stress test and it froze. So either both drives are bad in the exact same way, or it's not the RAM (I ran memtest) or the drives, which would mean it's what? The CPU??


#22 2026-03-30 07:03:24

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

qualia wrote:
same disk stress test

Writing where exactly?
The errors you've posted so far are all on the bus (wifi and nvme), do you have one for the latest freeze as well?

Please post the output of

lspci -tv

and try adding "pcie_aspm=off" to the kernel parameters (https://wiki.archlinux.org/title/Kernel_parameters).
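
To confirm that a parameter really reached the running kernel, the command line can be checked after the reboot. A small sketch; `cmdline_value` is a hypothetical helper name, and it assumes parameters are plain space-separated `name=value` words:

```shell
#!/bin/sh
# Hypothetical helper: print the value of a kernel parameter found in a
# command line string, or return non-zero if the parameter is absent.
cmdline_value() {
    param=$1
    cmdline=$2
    # Unquoted on purpose: split the command line into words.
    for word in $cmdline; do
        case $word in
            "$param"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}
```

Running `cmdline_value pcie_aspm "$(cat /proc/cmdline)"` should then print `off` if the parameter was applied, and print nothing (with a non-zero exit status) if it is missing.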


#23 2026-04-02 17:52:43

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

I already have pcie_aspm=off. The logs show nothing new. I stressed my ssd with my hdd unmounted and it froze. I then stressed hdd with my ssd unmounted and it froze.

same old:

Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0:   device [15b7:501a] error status/mask=00000001/0000e000
Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0:    [ 0] RxErr                  (First)
-[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root Complex
           +-00.2  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU
           +-01.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
           +-01.2-[01]--+-00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller
           |            +-00.1  Realtek Semiconductor Co., Ltd. RTL8111xP UART #1
           |            +-00.2  Realtek Semiconductor Co., Ltd. RTL8111xP UART #2
           |            +-00.3  Realtek Semiconductor Co., Ltd. RTL8111xP IPMI interface
           |            \-00.4  Realtek Semiconductor Co., Ltd. RTL811x EHCI host controller
           +-01.3-[02]----00.0  Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
           +-01.4-[03]----00.0  Sandisk Corp SanDisk Ultra 3D / WD Blue SN570 NVMe SSD (DRAM-less)
           +-08.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
           +-08.1-[04]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series]
           |            +-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio Controller
           |            +-00.2  Advanced Micro Devices, Inc. [AMD] Raven/Raven2/FireFlight/Renoir/Cezanne Platform Security Processor
           |            +-00.3  Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
           |            +-00.4  Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
           |            +-00.5  Advanced Micro Devices, Inc. [AMD] Audio Coprocessor
           |            \-00.6  Advanced Micro Devices, Inc. [AMD] Ryzen HD Audio Controller
           +-08.2-[05]----00.0  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
           +-14.0  Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
           +-14.3  Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
           +-18.0  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 0
           +-18.1  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 1
           +-18.2  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 2
           +-18.3  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 3
           +-18.4  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 4
           +-18.5  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 5
           +-18.6  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 6
           \-18.7  Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 7

I just bought another computer, so I will plug the SSD, HDD, and RAM into that and see how it fares.

Last edited by qualia (2026-04-02 17:53:15)


#24 2026-04-02 18:16:37

qualia
Member
Registered: 2024-01-01
Posts: 26

Re: Random Unsolvable Freezes

It's gotta be something really low-level. I boot into the BIOS fine, and nothing breaks there, but when I go into the Lenovo Diagnostic Tool it freezes before I can even hit a key.


#25 2026-04-02 18:40:23

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 74,663

Re: Random Unsolvable Freezes

qualia wrote:
I then stressed hdd with my ssd unmounted and it froze.

Did that result in the

Mar 29 19:40:24 HAL9000 kernel: nvme 0000:03:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)

?

Wifi and ethernet are on the same bus as the nvme.
Can you disable those in the UEFI?

