
Legion R9000P 2023

Basic hardware information: https://linux-hardware.org/?probe=95c540792e

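I bought this laptop on Pinduoduo. Once it arrived, I found that it crashed very easily. I switched between several kernel versions and every one of them had problems (the logs are recorded in the appendix). That seemed too strange, so I switched the machine to Windows and went to Lenovo after-sales service.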

The repair took 3 weeks.

Running Windows, it now crashes about once every 3 months on average.

With that, I have blacklisted both Pinduoduo and Lenovo.

Investigation

At first I suspected the crashes were related to AMD C-states, but that clearly turned out to be a detour.

/*
 * Check if the CPU can handle C2 and deeper
 */
static inline unsigned int acpi_processor_cstate_check(unsigned int max_cstate)
{
	/*
	 * Early models (<=5) of AMD Opterons are not supposed to go into
	 * C2 state.
	 *
	 * Steppings 0x0A and later are good
	 */
	if (boot_cpu_data.x86 == 0x0F &&
	    boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
	    boot_cpu_data.x86_model <= 0x05 &&
	    boot_cpu_data.x86_stepping < 0x0A)
		return 1;
	else if (boot_cpu_has(X86_BUG_AMD_APIC_C1E))
		return 1;
	else
		return max_cstate;
}
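
For anyone chasing the same C-state theory, one way to see which idle states a CPU exposes and how often each one is entered is to read the cpuidle attributes in sysfs. The following is only a minimal sketch and not something taken from the original investigation: the function name `list_cpuidle_states` is made up for illustration, and it assumes the usual Linux layout where each `cpuidle/stateN` directory contains `name`, `disable` and `usage` files.

```python
from pathlib import Path


def list_cpuidle_states(cpu_dir):
    """Return a list of (name, disabled, usage) for each cpuidle state.

    cpu_dir is assumed to look like /sys/devices/system/cpu/cpu0 on Linux.
    An empty list is returned when the cpuidle directory does not exist
    (for example inside some virtual machines).
    """
    cpuidle = Path(cpu_dir) / "cpuidle"
    if not cpuidle.is_dir():
        return []

    # Sort numerically so that state10 does not end up before state2.
    state_dirs = sorted(
        (p for p in cpuidle.glob("state[0-9]*") if p.is_dir()),
        key=lambda p: int(p.name[len("state"):]),
    )

    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        # "disable" holds 0 when the state may be used, non-zero otherwise.
        disabled = (state / "disable").read_text().strip() != "0"
        # "usage" counts how many times this state has been entered.
        usage = int((state / "usage").read_text().strip())
        states.append((name, disabled, usage))
    return states
```

Calling `list_cpuidle_states("/sys/devices/system/cpu/cpu0")` before and after a crash-prone workload shows whether the deeper states are being entered at all, which is a cheap sanity check before blaming them.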

Appendix

1

2025-01-29

[  177.539729] Oops: general protection fault, probably for non-canonical address 0x80000001c2509203: 0000 [#1] PREEMPT SMP NOPTI
[  177.539740] CPU: 20 UID: 1000 PID: 5779 Comm: qemu-system-x86 Tainted: P           O       6.12.10 #1-NixOS
[  177.539744] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
[  177.539745] Hardware name: LENOVO 82WM/INVALID, BIOS LPCN39WW 04/28/2023
[  177.539747] RIP: 0010:__apic_accept_irq+0x21/0x2a0 [kvm]
[  177.539802] Code: 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 57 41 89 cf 41 56 45 89 c6 41 55 41 54 41 89 d4 55 48 89 fd 53 89 f3 48 83 ec 08 <4c> 8b af 90 00 00 00 41 8b 75 24 0f 1f 44 00 00 81 fb 00 04 00 00
[  177.539804] RSP: 0018:ffffb07d5165fc20 EFLAGS: 00010282
[  177.539807] RAX: ffff96ff029de640 RBX: 0000000000000000 RCX: 0000000000000000
[  177.539808] RDX: 00000000000000fc RSI: 0000000000000000 RDI: 80000001c2509173
[  177.539809] RBP: 80000001c2509173 R08: 0000000000000000 R09: 0000000000000000
[  177.539810] R10: 0000000000000001 R11: 00000000000000fc R12: 00000000000000fc
[  177.539811] R13: 0000000000000015 R14: 0000000000000000 R15: 0000000000000000
[  177.539813] FS:  00007f98ee1fc6c0(0000) GS:ffff97055d800000(0000) knlGS:0000000000000000
[  177.539814] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  177.539815] CR2: 00007f2fc5898000 CR3: 000000011a21e000 CR4: 0000000000f50ef0
[  177.539816] PKRU: 55555554
[  177.539817] Call Trace:
[  177.539822]  <TASK>
[  177.539826]  ? die_addr+0x36/0x90
[  177.539831]  ? exc_general_protection+0x144/0x350
[  177.539837]  ? asm_exc_general_protection+0x26/0x30
[  177.539841]  ? __apic_accept_irq+0x21/0x2a0 [kvm]
[  177.539884]  __pv_send_ipi.part.0+0x5c/0xc0 [kvm]
[  177.539929]  kvm_pv_send_ipi+0xc4/0x130 [kvm]
[  177.539971]  __kvm_emulate_hypercall+0x2a2/0x400 [kvm]
[  177.540016]  ? trace_hardirqs_off_finish+0x32/0x90
[  177.540020]  ? svm_vcpu_enter_exit+0x8e/0xe0 [kvm_amd]
[  177.540030]  ? svm_get_segment+0x1c/0x120 [kvm_amd]
[  177.540038]  kvm_emulate_hypercall+0x17d/0x210 [kvm]
[  177.540080]  kvm_arch_vcpu_ioctl_run+0x197/0x6f0 [kvm]
[  177.540123]  kvm_vcpu_ioctl+0x233/0x980 [kvm]
[  177.540162]  ? futex_wake+0x85/0x1a0
[  177.540167]  __x64_sys_ioctl+0x99/0xe0
[  177.540170]  do_syscall_64+0xc1/0x220
[  177.540173]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  177.540176] RIP: 0033:0x7f9c41fffaef
[  177.540207] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 28 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[  177.540209] RSP: 002b:00007f98ee1fb530 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  177.540211] RAX: ffffffffffffffda RBX: 00005605bacd6950 RCX: 00007f9c41fffaef
[  177.540212] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000005a
[  177.540213] RBP: 000000000000ae80 R08: 0000000000000000 R09: 0000000000000000
[  177.540214] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  177.540214] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  177.540216]  </TASK>
[  177.540365] ---[ end trace 0000000000000000 ]---
[  177.909662] pstore: backend (efi_pstore) writing error (-28)
[  177.909664] RIP: 0010:__apic_accept_irq+0x21/0x2a0 [kvm]
[  177.909713] Code: 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 57 41 89 cf 41 56 45 89 c6 41 55 41 54 41 89 d4 55 48 89 fd 53 89 f3 48 83 ec 08 <4c> 8b af 90 00 00 00 41 8b 75 24 0f 1f 44 00 00 81 fb 00 04 00 00
[  177.909715] RSP: 0018:ffffb07d5165fc20 EFLAGS: 00010282
[  177.909718] RAX: ffff96ff029de640 RBX: 0000000000000000 RCX: 0000000000000000
[  177.909719] RDX: 00000000000000fc RSI: 0000000000000000 RDI: 80000001c2509173
[  177.909720] RBP: 80000001c2509173 R08: 0000000000000000 R09: 0000000000000000
[  177.909721] R10: 0000000000000001 R11: 00000000000000fc R12: 00000000000000fc
[  177.909722] R13: 0000000000000015 R14: 0000000000000000 R15: 0000000000000000
[  177.909723] FS:  00007f98ee1fc6c0(0000) GS:ffff97055d800000(0000) knlGS:0000000000000000
[  177.909724] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  177.909726] CR2: 00007f2fc5898000 CR3: 000000011a21e000 CR4: 0000000000f50ef0
[  177.909727] PKRU: 55555554
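
All of the general protection faults in this appendix report a non-canonical address. In the trace above, the faulting bytes `<4c> 8b af 90 00 00 00` decode to `mov 0x90(%rdi),%r13`, so the fault address should be the value in RDI plus 0x90. The sketch below checks both points. It assumes 48-bit virtual addresses (4-level paging), and the helper names `is_canonical` and `explain_gpf` are made up for illustration.

```python
def is_canonical(addr, va_bits=48):
    """Return True if addr is a canonical x86-64 virtual address.

    With va_bits implemented address bits, bits 63..(va_bits - 1) must
    all be equal (all zeros or all ones).
    """
    top = addr >> (va_bits - 1)                # bits 63..47 when va_bits == 48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits when va_bits == 48
    return top == 0 or top == all_ones


def explain_gpf(fault_addr, registers, max_offset=0x1000):
    """Return (register, offset) pairs where register + offset == fault_addr.

    Only small non-negative offsets are considered, which matches the
    common case of a struct field read through a bad base pointer.
    """
    hits = []
    for name, value in registers.items():
        offset = fault_addr - value
        if 0 <= offset < max_offset:
            hits.append((name, offset))
    return hits


if __name__ == "__main__":
    # Values copied from the first trace in this appendix.
    fault = 0x80000001C2509203
    regs = {"RAX": 0xFFFF96FF029DE640, "RDI": 0x80000001C2509173}
    print("canonical:", is_canonical(fault))
    for reg, off in explain_gpf(fault, regs):
        print(f"fault address = {reg} + {off:#x}")
```

The same relationship holds for the other two faults recorded here: RAX plus 0x98 in the `kvm_arch_dy_has_pending_interrupt` trace and RAX plus 0x8 in the `kmem_cache_alloc` trace. In each case the code itself looks innocent and the pointer it was handed is already garbage.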

A minimal BPF scheduler running inside the kernel can trigger this error:

[   45.613408] sched_ext: BPF scheduler "minimal_scheduler" enabled
[   45.613981] sched_ext: scx_bpf_dispatch() renamed to scx_bpf_dsq_insert()
[   67.420994] NOHZ tick-stop error: local softirq work is pending, handler #200!!!
[  101.638553] git (6951) used greatest stack depth: 10576 bytes left
[  148.278622] sched_ext: Soft lockup - CPU0 stuck for 19s, disabling "minimal_scheduler"
[  149.923968] rcu: INFO: rcu_preempt self-detected stall on CPU
[  149.923968] rcu:     13-...!: (21000 ticks this GP) idle=098c/1/0x4000000000000000 softirq=23663/23663 fqs=22
[  149.923968] rcu:     (t=21002 jiffies g=37697 q=4 ncpus=32)
[  149.923968] rcu: rcu_preempt kthread starved for 20895 jiffies! g37697 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=22
[  149.923968] rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[  149.923968] rcu: RCU grace-period kthread stack dump:
[  149.923968] task:rcu_preempt     state:R  running task     stack:14664 pid:17    tgid:17    ppid:2      flags:0x00004000
[  149.923968] Sched_ext: minimal_scheduler (enabled+all), task: runnable_at=-20895ms
[  149.923968] Call Trace:
[  149.923968]  <TASK>
[  149.923968]  __schedule+0x3a4/0x13a0
[  149.923968]  ? _raw_spin_lock_irqsave+0x23/0x60
[  149.923968]  ? preempt_count_sub+0x4b/0x60
[  149.923968]  ? schedule_timeout+0x87/0x100
[  149.923968]  schedule+0x41/0x1c0
[  149.923968]  schedule_timeout+0x87/0x100
[  149.923968]  ? __pfx_process_timeout+0x10/0x10
[  149.923968]  rcu_gp_fqs_loop+0x121/0x6d0
[  149.923968]  ? __pfx_rcu_gp_kthread+0x10/0x10
[  149.923968]  rcu_gp_kthread+0x1ac/0x280
[  149.923968]  kthread+0xdc/0x110
[  149.923968]  ? __pfx_kthread+0x10/0x10
[  149.923968]  ret_from_fork+0x31/0x50
[  149.923968]  ? __pfx_kthread+0x10/0x10
[  149.923968]  ret_from_fork_asm+0x1a/0x30
[  149.923968]  </TASK>
[  149.923968] rcu: Stack dump where RCU GP kthread last ran:

2

[  552.709464] Oops: general protection fault, probably for non-canonical address 0x80000001c61d71fb: 0000 [#1] PREEMPT SMP NOPTI
[  552.709477] CPU: 20 UID: 1000 PID: 7526 Comm: qemu-system-x86 Tainted: P           O       6.12.10 #1-NixOS
[  552.709481] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
[  552.709483] Hardware name: LENOVO 82WM/INVALID, BIOS LPCN39WW 04/28/2023
[  552.709485] RIP: 0010:kvm_arch_dy_has_pending_interrupt+0x12/0x40 [kvm]
[  552.709557] Code: 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 66 90 48 8b 87 08 02 00 00 <80> b8 98 00 00 00 00 74 11 e9 b0 55 fb c5 48 8b 87 08 02 00 00 48
[  552.709559] RSP: 0018:ffffa4ed1de27d88 EFLAGS: 00010202
[  552.709563] RAX: 80000001c61d7163 RBX: 000000000000000b RCX: 0000000000000000
[  552.709565] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8c08c61dccb0
[  552.709567] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000001770
[  552.709568] R10: 0000000000000bb8 R11: 0000000000000000 R12: ffffa4ed1dd19000
[  552.709569] R13: ffffa4ed1dd1a128 R14: ffff8c08c719ccb0 R15: ffff8c08c61dccb0
[  552.709571] FS:  00007f30b3fff6c0(0000) GS:ffff8c0f1d800000(0000) knlGS:0000000000000000
[  552.709573] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  552.709575] CR2: 00007f3f89b61000 CR3: 00000001c2974000 CR4: 0000000000f50ef0
[  552.709576] PKRU: 55555554
[  552.709578] Call Trace:
[  552.709582]  <TASK>
[  552.709587]  ? die_addr+0x36/0x90
[  552.709594]  ? exc_general_protection+0x144/0x350
[  552.709601]  ? asm_exc_general_protection+0x26/0x30
[  552.709607]  ? kvm_arch_dy_has_pending_interrupt+0x12/0x40 [kvm]
[  552.709661]  kvm_vcpu_on_spin+0x211/0x260 [kvm]
[  552.709716]  pause_interception+0x89/0x100 [kvm_amd]
[  552.709731]  kvm_arch_vcpu_ioctl_run+0x197/0x6f0 [kvm]
[  552.709784]  kvm_vcpu_ioctl+0x233/0x980 [kvm]
[  552.709837]  ? futex_wake+0x85/0x1a0
[  552.709845]  __x64_sys_ioctl+0x99/0xe0
[  552.709852]  do_syscall_64+0xc1/0x220
[  552.709858]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  552.709862] RIP: 0033:0x7f3482bffaef
[  552.709912] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 28 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[  552.709915] RSP: 002b:00007f30b3ffe530 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  552.709917] RAX: ffffffffffffffda RBX: 0000564908ef8770 RCX: 00007f3482bffaef
[  552.709918] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000082
[  552.709920] RBP: 000000000000ae80 R08: 0000000000000000 R09: 0000000000000000
[  552.709922] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  552.709923] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  552.709926]  </TASK>
[  552.710185] ---[ end trace 0000000000000000 ]---
[  553.121753] pstore: backend (efi_pstore) writing error (-28)
[  553.121758] RIP: 0010:kvm_arch_dy_has_pending_interrupt+0x12/0x40 [kvm]
[  553.121818] Code: 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 66 90 48 8b 87 08 02 00 00 <80> b8 98 00 00 00 00 74 11 e9 b0 55 fb c5 48 8b 87 08 02 00 00 48
[  553.121821] RSP: 0018:ffffa4ed1de27d88 EFLAGS: 00010202
[  553.121824] RAX: 80000001c61d7163 RBX: 000000000000000b RCX: 0000000000000000
[  553.121825] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8c08c61dccb0
[  553.121827] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000001770
[  553.121828] R10: 0000000000000bb8 R11: 0000000000000000 R12: ffffa4ed1dd19000
[  553.121829] R13: ffffa4ed1dd1a128 R14: ffff8c08c719ccb0 R15: ffff8c08c61dccb0
[  553.121831] FS:  00007f30b3fff6c0(0000) GS:ffff8c0f1d800000(0000) knlGS:0000000000000000
[  553.121832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  553.121834] CR2: 00007f3f89b61000 CR3: 00000001c2974000 CR4: 0000000000f50ef0
[  553.121835] PKRU: 55555554

The problem triggered in the guest at the same time:

localhost login: [  283.626048] rcu: INFO: rcu_preempt self-detected stall on CPU
[  283.626048] rcu:     18-....: (20999 ticks this GP) idle=5c14/1/0x4000000000000000 softirq=31417/31417 fqs=4795
[  283.626048] rcu:     (t=21001 jiffies g=40829 q=718 ncpus=32)
[  283.626048] CPU: 18 UID: 0 PID: 204 Comm: kswapd0 Not tainted 6.13.0 #14
[  283.626048] Hardware name: Martins3 Inc Hacking Alpine, BIOS 12 2012-3-4
[  283.626048] RIP: 0010:smp_call_function_many_cond+0x107/0x570
[  283.626048] Code: 74 56 f3 48 0f bc c0 89 c1 83 f8 3f 77 4a be 01 00 00 00 48 63 c1 48 8b 13 48 03 14 c5 a0 fb 8e 82 8b 42 08 a8 01 74 09 f3 90 <8b> 42 08 a8 01 75 f7 83 c1 01 48 63 c1 48 83 f8 3f 77 1b 48 89 f0
[  283.626048] RSP: 0018:ffffc900027f7808 EFLAGS: 00000202
[  283.626048] RAX: 0000000000000011 RBX: ffff8980f96b2000 RCX: 000000000000001b
[  283.626048] RDX: ffff8980f98f6d80 RSI: 0000000000000001 RDI: 000000000000001f
[  283.626048] RBP: 0000000000000001 R08: 000000000000001f R09: 0000000000000000
[  283.626048] R10: 000000007ffbfdff R11: 0000000000000000 R12: ffffffff8108cde0
[  283.626048] R13: ffff8980f96afd80 R14: ffffffff8108d670 R15: 0000000000000012
[  283.626048] FS:  0000000000000000(0000) GS:ffff8980f9680000(0000) knlGS:0000000000000000
[  283.626048] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  283.626048] CR2: 00007f3f36bf3000 CR3: 0000000003036000 CR4: 0000000000750ef0
[  283.626048] PKRU: 55555554
[  283.626048] Call Trace:
[  283.626048]  <IRQ>
[  283.626048]  ? rcu_dump_cpu_stacks+0x116/0x1d0
[  283.626048]  ? rcu_sched_clock_irq+0x3ab/0x1260
[  283.626048]  ? __walk_groups.isra.0+0x1f/0x70
[  283.626048]  ? tmigr_requires_handle_remote+0xcc/0xe0
[  283.626048]  ? update_process_times+0x6f/0xc0
[  283.626048]  ? tick_nohz_handler+0x8f/0x140
[  283.626048]  ? __pfx_tick_nohz_handler+0x10/0x10
[  283.626048]  ? __hrtimer_run_queues+0x85/0x2d0
[  283.626048]  ? hrtimer_interrupt+0xff/0x250
[  283.626048]  ? __sysvec_apic_timer_interrupt+0x52/0x120
[  283.626048]  ? sysvec_apic_timer_interrupt+0x6e/0x80
[  283.626048]  </IRQ>
[  283.626048]  <TASK>
[  283.626048]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  283.626048]  ? __pfx_flush_tlb_func+0x10/0x10
[  283.626048]  ? __pfx_tlb_is_not_lazy+0x10/0x10
[  283.626048]  ? smp_call_function_many_cond+0x107/0x570
[  283.626048]  ? smp_call_function_many_cond+0x291/0x570
[  283.626048]  ? __pfx_flush_tlb_func+0x10/0x10
[  283.626048]  ? __pfx_flush_tlb_func+0x10/0x10
[  283.626048]  ? __pfx_tlb_is_not_lazy+0x10/0x10
[  283.626048]  on_each_cpu_cond_mask+0x40/0x80
[  283.626048]  arch_tlbbatch_flush+0x115/0x130
[  283.626048]  try_to_unmap_flush_dirty+0x36/0x50
[  283.626048]  shrink_folio_list+0x69c/0xdb0
[  283.626048]  evict_folios+0x258/0x610
[  283.626048]  try_to_shrink_lruvec+0x1a4/0x2b0
[  283.626048]  shrink_one+0xfd/0x1e0
[  283.626048]  shrink_node+0xabd/0xc80
[  283.626048]  ? mem_cgroup_iter+0x1b9/0x210
[  283.626048]  balance_pgdat+0x4ca/0x910
[  283.626048]  ? perf_pmu_resched+0x10/0x60
[  283.626048]  ? trace_hardirqs_on+0x21/0x80
[  283.626048]  ? finish_task_switch.isra.0+0x9e/0x2e0
[  283.626048]  kswapd+0x1ec/0x390
[  283.626048]  ? __pfx_autoremove_wake_function+0x10/0x10
[  283.626048]  ? __pfx_kswapd+0x10/0x10
[  283.626048]  kthread+0xdc/0x110
[  283.626048]  ? __pfx_kthread+0x10/0x10
[  283.626048]  ret_from_fork+0x31/0x50
[  283.626048]  ? __pfx_kthread+0x10/0x10
[  283.626048]  ret_from_fork_asm+0x1a/0x30
[  283.626048]  </TASK>
[  288.349418] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [indexer18:4532]
[  288.349868] CPU#2 Utilization every 4s during lockup:
[  288.350055]  #1: 100% system,          0% softirq,     0% hardirq,     0% idle
[  288.350055]  #2: 101% system,          0% softirq,     0% hardirq,     0% idle
[  288.350055]  #3: 100% system,          0% softirq,     0% hardirq,     0% idle
[  288.350055]  #4: 101% system,          0% softirq,     0% hardirq,     0% idle
[  288.350055]  #5: 100% system,          0% softirq,     0% hardirq,     0% idle
[  288.350055] Modules linked in: xt_addrtype bridge stp llc overlay rpcsec_gss_krb5 auth_rpcgss xt_MASQUERADE xt_mark tun nf_tables iptable_nat 9p vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd configfs ext4 mbcache jbd2 usb_storage usbhid kvm_amd ccp uhci_hcd sha1_generic ehci_pci nvme ehci_hcd kvm 9pnet_virtio nvme_core usbcore crc32c_intel virtio_scsi virtio_console 9pnet virtio_balloon nvme_auth usb_common virtio_net sch_fq_codel nfsv4 nfs lockd grace sunrpc netfs configs fuse virtio_pci virtio_pci_modern_dev virtio_pci_legacy_dev
[  288.350055] CPU: 2 UID: 1000 PID: 4532 Comm: indexer18 Not tainted 6.13.0 #14

3

When the guest and the host build the kernel at the same time, the guest hangs, which obviously looks like a kernel bug.

[12290.947770] perf: interrupt took too long (2517 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[12291.035751] perf: interrupt took too long (3160 > 3146), lowering kernel.perf_event_max_sample_rate to 63000
[12291.327763] perf: interrupt took too long (3957 > 3950), lowering kernel.perf_event_max_sample_rate to 50000
[12292.203829] perf: interrupt took too long (4985 > 4946), lowering kernel.perf_event_max_sample_rate to 40000
[12293.299532] perf: interrupt took too long (6242 > 6231), lowering kernel.perf_event_max_sample_rate to 32000
[12298.565584] perf: interrupt took too long (7810 > 7802), lowering kernel.perf_event_max_sample_rate to 25000
[144115809.998853] sched: DL replenish lagged too much
[144115809.998853] ------------[ cut here ]------------
[144115809.998853] sa->load_avg || sa->util_avg || sa->runnable_avg
[144115809.998853] WARNING: CPU: 23 PID: 0 at kernel/sched/fair.c:4024 sched_balance_update_blocked_averages+0x706/0x760
[144115809.998853] Modules linked in: vhost_net vhost vhost_iotlb xt_addrtype bridge stp llc overlay rpcsec_gss_krb5 auth_rpcgss xt_MASQUERADE xt_mark tun nf_tables iptable_nat 9p vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd configfs ext4 mbcache jbd2 usbhid usb_storage kvm_amd ccp uhci_hcd sha1_generic ehci_pci kvm 9pnet_virtio ehci_hcd nvme usbcore nvme_core crc32c_intel 9pnet usb_common virtio_console virtio_balloon virtio_net nvme_auth virtio_scsi sch_fq_codel nfsv4 nfs lockd grace sunrpc netfs configs fuse virtio_pci virtio_pci_modern_dev virtio_pci_legacy_dev
[144115809.998853] CPU: 23 UID: 0 PID: 0 Comm: swapper/23 Not tainted 6.13.0 #13
[144115809.998853] Hardware name: Martins3 Inc Hacking Alpine, BIOS 12 2012-3-4
[144115809.998853] RIP: 0010:sched_balance_update_blocked_averages+0x706/0x760
[144115809.998853] Code: 48 8b 0c 24 e9 12 ff ff ff 48 c7 c2 ff ff ff ff e9 40 fe ff ff c6 05 b3 66 17 02 01 90 48 c7 c7 48 8c 76 82 e8 7b 60 fa ff 90 <0f> 0b 90 90 e9 23 fb ff ff c6 05 9a 66 17 02 01 90 48 c7 c7 f8 89
[144115809.998853] RSP: 0018:ffffc900037d8ee8 EFLAGS: 00010082
[144115809.998853] RAX: 0000000000000000 RBX: ffff898070867800 RCX: ffff89836f7dca88
[144115809.998853] RDX: 0000000000000027 RSI: 0000000000000027 RDI: 0000000000000001
[144115809.998853] RBP: 0000000000000001 R08: 00000000ffffbfff R09: 0000000000000001
[144115809.998853] R10: 00000000ffffbfff R11: ffff89876f8a0000 R12: ffff89801fc77000
[144115809.998853] R13: ffff89836f7f15c8 R14: ffff898070867948 R15: 00000000000000b8
[144115809.998853] FS:  0000000000000000(0000) GS:ffff89836f7c0000(0000) knlGS:0000000000000000
[144115809.998853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[144115809.998853] CR2: 00007fa6f9c14000 CR3: 000001000458c000 CR4: 0000000000750ef0
[144115809.998853] PKRU: 55555554
[144115809.998853] Call Trace:
[144115809.998853]  <IRQ>
[144115809.998853]  ? __warn+0x89/0x130
[144115809.998853]  ? sched_balance_update_blocked_averages+0x706/0x760
[144115809.998853]  ? report_bug+0x164/0x190
[144115809.998853]  ? handle_bug+0x54/0x90
[144115809.998853]  ? exc_invalid_op+0x17/0x70
[144115809.998853]  ? asm_exc_invalid_op+0x1a/0x20
[144115809.998853]  ? sched_balance_update_blocked_averages+0x706/0x760
[144115809.998853]  ? sched_balance_update_blocked_averages+0x705/0x760
[144115809.998853]  ? timerqueue_add+0x98/0xc0
[144115809.998853]  ? enqueue_hrtimer+0x35/0x90
[144115809.998853]  sched_balance_softirq+0x43/0x60
[144115809.998853]  handle_softirqs+0x10b/0x3b0
[144115809.998853]  __irq_exit_rcu+0xd9/0x100
[144115809.998853]  irq_exit_rcu+0xe/0x20
[144115809.998853]  sysvec_apic_timer_interrupt+0x73/0x80
[144115809.998853]  </IRQ>
[144115809.998853]  <TASK>
[144115809.998853]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[144115809.998853] RIP: 0010:default_idle+0xf/0x20
[144115809.998853] Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 23 4f 2c 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
[144115809.998853] RSP: 0018:ffffc9000228fee0 EFLAGS: 00000202
[144115809.998853] RAX: 0000000000000017 RBX: ffff8980023bb080 RCX: ffff89800b965308
[144115809.998853] RDX: 0000000000000017 RSI: ffffffff8288b04d RDI: 00000000005c8d44
[144115809.998853] RBP: 0000000000000017 R08: 00000000005c8d44 R09: 0000000000000002
[144115809.998853] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[144115809.998853] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[144115809.998853]  default_idle_call+0x3f/0x110
[144115809.998853]  do_idle+0x1cf/0x210
[144115809.998853]  cpu_startup_entry+0x29/0x30
[144115809.998853]  start_secondary+0x11e/0x140
[144115809.998853]  common_startup_64+0x13e/0x148
[144115809.998853]  </TASK>
[144115809.998853] ---[ end trace 0000000000000000 ]---
[144115809.998853] BUG: kernel NULL pointer dereference, address: 0000000000000051
[144115809.998853] #PF: supervisor read access in kernel mode
[144115809.998853] #PF: error_code(0x0000) - not-present page
[144115809.998853] PGD 1002a989067 P4D 1002a989067 PUD 10070b91067 PMD 0
[144115809.998853] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[144115809.998853] CPU: 25 UID: 0 PID: 345 Comm: kworker/25:1 Tainted: G        W          6.13.0 #13
[144115809.998853] Tainted: [W]=WARN
[144115809.998853] Hardware name: Martins3 Inc Hacking Alpine, BIOS 12 2012-3-4
[144115809.998853] Workqueue:  0x0 (events)
[144115809.998853] RIP: 0010:pick_task_fair+0x3a/0x140
[144115809.998853] Code: 00 00 41 54 49 89 fc 55 53 41 8b 8c 24 10 01 00 00 85 c9 0f 84 e5 00 00 00 4c 89 ed eb 2e 66 90 66 90 48 89 ef e8 06 78 ff ff <80> 78 51 00 48 89 c3 0f 85 a7 00 00 00 48 85 db 74 cd 48 8b ab a8
[144115809.998853] RSP: 0018:ffffc90004a73d68 EFLAGS: 00010086
[144115809.998853] RAX: 0000000000000000 RBX: ffffffff8289a710 RCX: 000000000000002a
[144115809.998853] RDX: ffd1ae2783c29000 RSI: 000000000000042a RDI: 0000000000000400
[144115809.998853] RBP: ffff89836f870bc0 R08: 0000000000000400 R09: 0000000000000002
[144115809.998853] R10: ffff89836f8612e0 R11: 0000000000000000 R12: ffff89836f870ac0
[144115809.998853] R13: ffff89836f870bc0 R14: ffff8980160c8000 R15: ffff89836f870ac0
[144115809.998853] FS:  0000000000000000(0000) GS:ffff89836f840000(0000) knlGS:0000000000000000
[144115809.998853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[144115809.998853] CR2: 0000000000000051 CR3: 00000100162dc000 CR4: 0000000000750ef0
[144115809.998853] PKRU: 55555554
[144115809.998853] Call Trace:
[144115809.998853]  <TASK>
[144115809.998853]  ? __die+0x23/0x70
[144115809.998853]  ? page_fault_oops+0x17d/0x550
[144115809.998853]  ? exc_page_fault+0x79/0x180
[144115809.998853]  ? asm_exc_page_fault+0x26/0x30
[144115809.998853]  ? pick_task_fair+0x3a/0x140
[144115809.998853]  ? pick_task_fair+0x3a/0x140
[144115809.998853]  pick_next_task_fair+0x21/0x3c0
[144115809.998853]  __pick_next_task+0x3e/0x1a0
[144115809.998853]  __schedule+0x166/0x1530
[144115809.998853]  ? queue_delayed_work_on+0x74/0x90
[144115809.998853]  ? worker_thread+0x1b1/0x3b0
[144115809.998853]  schedule+0x41/0x1c0
[144115809.998853]  worker_thread+0x1b1/0x3b0
[144115809.998853]  ? __pfx_worker_thread+0x10/0x10
[144115809.998853]  kthread+0xdc/0x110
[144115809.998853]  ? __pfx_kthread+0x10/0x10
[144115809.998853]  ret_from_fork+0x31/0x50
[144115809.998853]  ? __pfx_kthread+0x10/0x10
[144115809.998853]  ret_from_fork_asm+0x1a/0x30
[144115809.998853]  </TASK>
[144115809.998853] Modules linked in: vhost_net vhost vhost_iotlb xt_addrtype bridge stp llc overlay rpcsec_gss_krb5 auth_rpcgss xt_MASQUERADE xt_mark tun nf_tables iptable_nat 9p vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd configfs ext4 mbcache jbd2 usbhid usb_storage kvm_amd ccp uhci_hcd sha1_generic ehci_pci kvm 9pnet_virtio ehci_hcd nvme usbcore nvme_core crc32c_intel 9pnet usb_common virtio_console virtio_balloon virtio_net nvme_auth virtio_scsi sch_fq_codel nfsv4 nfs lockd grace sunrpc netfs configs fuse virtio_pci virtio_pci_modern_dev virtio_pci_legacy_dev
[144115809.998853] CR2: 0000000000000051
[144115809.998853] ---[ end trace 0000000000000000 ]---
[144115809.998853] RIP: 0010:pick_task_fair+0x3a/0x140
[144115809.998853] Code: 00 00 41 54 49 89 fc 55 53 41 8b 8c 24 10 01 00 00 85 c9 0f 84 e5 00 00 00 4c 89 ed eb 2e 66 90 66 90 48 89 ef e8 06 78 ff ff <80> 78 51 00 48 89 c3 0f 85 a7 00 00 00 48 85 db 74 cd 48 8b ab a8
[144115809.998853] RSP: 0018:ffffc90004a73d68 EFLAGS: 00010086
[144115809.998853] RAX: 0000000000000000 RBX: ffffffff8289a710 RCX: 000000000000002a
[144115809.998853] RDX: ffd1ae2783c29000 RSI: 000000000000042a RDI: 0000000000000400
[144115809.998853] RBP: ffff89836f870bc0 R08: 0000000000000400 R09: 0000000000000002
[144115809.998853] R10: ffff89836f8612e0 R11: 0000000000000000 R12: ffff89836f870ac0
[144115809.998853] R13: ffff89836f870bc0 R14: ffff8980160c8000 R15: ffff89836f870ac0
[144115809.998853] FS:  0000000000000000(0000) GS:ffff89836f840000(0000) knlGS:0000000000000000
[144115809.998853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[144115809.998853] CR2: 0000000000000051 CR3: 00000100162dc000 CR4: 0000000000750ef0
[144115809.998853] PKRU: 55555554
[144115809.998853] note: kworker/25:1[345] exited with irqs disabled
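
One detail in the guest log above is easy to miss: the timestamp jumps from about 12298 s to about 144115809 s (more than four years of uptime) between two consecutive lines, and every later line carries exactly the same value. That suggests the guest clock itself went wrong before the scheduler warnings appeared. The following is a minimal sketch for spotting such discontinuities in dmesg-style output. The function name `find_timestamp_jumps` and the one-hour default threshold are arbitrary choices for illustration, and the regular expression only handles the `[seconds.micros]` prefix, not journald-style timestamps.

```python
import re

# Matches the "[  123.456789]" prefix that dmesg puts on each line.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def find_timestamp_jumps(lines, max_gap=3600.0):
    """Return (previous, current) timestamp pairs that look discontinuous.

    A pair is reported when time goes backwards or moves forward by more
    than max_gap seconds between two consecutive timestamped lines.
    Lines without a dmesg-style timestamp are ignored.
    """
    jumps = []
    prev = None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue
        ts = float(match.group(1))
        if prev is not None and (ts < prev or ts - prev > max_gap):
            jumps.append((prev, ts))
        prev = ts
    return jumps


if __name__ == "__main__":
    sample = [
        "[12298.565584] perf: interrupt took too long (7810 > 7802)",
        "[144115809.998853] sched: DL replenish lagged too much",
        "[144115809.998853] ------------[ cut here ]------------",
    ]
    for before, after in find_timestamp_jumps(sample):
        print(f"timestamp jumped from {before} to {after}")
```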

4

After switching to the 6.6 kernel, the problem is still there:

2月 02 16:57:28 nixos kernel: general protection fault, probably for non-canonical address 0x1650e1d88812c: 0000 [#1] PREEMPT SMP NOPTI
2月 02 16:57:28 nixos kernel: CPU: 21 PID: 2197 Comm: .gnome-shell-wr Tainted: P        W  O       6.6.72 #1-NixOS
2月 02 16:57:28 nixos kernel: Hardware name: LENOVO 82WM/INVALID, BIOS LPCN39WW 04/28/2023
2月 02 16:57:28 nixos kernel: RIP: 0010:kmem_cache_alloc+0xe0/0x3b0
2月 02 16:57:28 nixos kernel: Code: 84 11 02 00 00 66 90 8b 05 05 8f 99 01 85 c0 0f 84 86 02 00 00 48 c7 44 24 10 00 00 00 00 49 8b 04 24 65 48 03 05 58 69 83 71 <48> 8b 50 08 48 83 78 10 00 4c 8b 30 0f 84 0a 02 00 00 4d 85 f6 0f
2月 02 16:57:28 nixos kernel: RSP: 0018:ffffc90002197e50 EFLAGS: 00010207
2月 02 16:57:28 nixos kernel: RAX: 0001650e1d888124 RBX: 0000000000000cc0 RCX: 0000000000000000
2月 02 16:57:28 nixos kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
2月 02 16:57:28 nixos kernel: RBP: ffffc90002197ea0 R08: 0000000000000000 R09: 0000000000000000
2月 02 16:57:28 nixos kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88810cbde400
2月 02 16:57:28 nixos kernel: R13: ffff8881010eb1c0 R14: 00007fff85a26d50 R15: 0000000000000cc0
2月 02 16:57:28 nixos kernel: FS:  00007f5fc8cd3e80(0000) GS:ffff88881d880000(0000) knlGS:0000000000000000
2月 02 16:57:28 nixos kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2月 02 16:57:28 nixos kernel: CR2: 00007f0b10e76010 CR3: 000000014a5a6000 CR4: 0000000000f50ee0
2月 02 16:57:28 nixos kernel: PKRU: 55555554
2月 02 16:57:28 nixos kernel: Call Trace:
2月 02 16:57:28 nixos kernel:  <TASK>
2月 02 16:57:28 nixos kernel:  ? die_addr+0x36/0x90
2月 02 16:57:28 nixos kernel:  ? exc_general_protection+0x143/0x3c0
2月 02 16:57:28 nixos kernel:  ? asm_exc_general_protection+0x26/0x30
2月 02 16:57:28 nixos kernel:  ? kmem_cache_alloc+0xe0/0x3b0
2月 02 16:57:28 nixos kernel:  ? nvidia_unlocked_ioctl+0x326/0x930 [nvidia]
2月 02 16:57:28 nixos kernel:  nvidia_unlocked_ioctl+0x326/0x930 [nvidia]
2月 02 16:57:28 nixos kernel:  __x64_sys_ioctl+0x9c/0xe0
2月 02 16:57:28 nixos kernel:  do_syscall_64+0x39/0x90
2月 02 16:57:28 nixos kernel:  entry_SYSCALL_64_after_hwframe+0x78/0xe2
2月 02 16:57:28 nixos kernel: RIP: 0033:0x7f5fd9d12aef
2月 02 16:57:28 nixos kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 28 48 8b 44 24 18 64 48 2b 04 25 28 00 00
2月 02 16:57:28 nixos kernel: RSP: 002b:00007fff85a26c40 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
2月 02 16:57:28 nixos kernel: RAX: ffffffffffffffda RBX: 00007fff85a26d50 RCX: 00007f5fd9d12aef
2月 02 16:57:28 nixos kernel: RDX: 00007fff85a26d50 RSI: 00000000c020462a RDI: 000000000000000d
2月 02 16:57:28 nixos kernel: RBP: 00000000c020462a R08: 00007fff85a26d50 R09: 00007fff85a26d6c
2月 02 16:57:28 nixos kernel: R10: 00000000c1d00019 R11: 0000000000000246 R12: 000000000000000d
2月 02 16:57:28 nixos kernel: R13: 00007fff85a26d6c R14: 00000000679f3378 R15: 00007fff85a26ca0
2月 02 16:57:28 nixos kernel:  </TASK>
2月 02 16:57:28 nixos kernel: ---[ end trace 0000000000000000 ]---
2月 02 16:57:28 nixos kernel: pstore: backend (efi_pstore) writing error (-5)
2月 02 16:57:28 nixos kernel: RIP: 0010:kmem_cache_alloc+0xe0/0x3b0
2月 02 16:57:28 nixos kernel: Code: 84 11 02 00 00 66 90 8b 05 05 8f 99 01 85 c0 0f 84 86 02 00 00 48 c7 44 24 10 00 00 00 00 49 8b 04 24 65 48 03 05 58 69 83 71 <48> 8b 50 08 48 83 78 10 00 4c 8b 30 0f 84 0a 02 00 00 4d 85 f6 0f
2月 02 16:57:28 nixos kernel: RSP: 0018:ffffc90002197e50 EFLAGS: 00010207
2月 02 16:57:28 nixos kernel: RAX: 0001650e1d888124 RBX: 0000000000000cc0 RCX: 0000000000000000
2月 02 16:57:28 nixos kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
2月 02 16:57:28 nixos kernel: RBP: ffffc90002197ea0 R08: 0000000000000000 R09: 0000000000000000
2月 02 16:57:28 nixos kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88810cbde400
2月 02 16:57:28 nixos kernel: R13: ffff8881010eb1c0 R14: 00007fff85a26d50 R15: 0000000000000cc0
2月 02 16:57:28 nixos kernel: FS:  00007f5fc8cd3e80(0000) GS:ffff88881d880000(0000) knlGS:0000000000000000
2月 02 16:57:28 nixos kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2月 02 16:57:28 nixos kernel: CR2: 00007f0b10e76010 CR3: 000000014a5a6000 CR4: 0000000000f50ee0
2月 02 16:57:28 nixos kernel: PKRU: 55555554

5

2月 02 19:29:21 nixos kernel: list_del corruption. next->prev should be ffff888168554a38, but was 36b399449cc4488c. (next=ffff88819b9de2b8)
2月 02 19:29:21 nixos kernel: ------------[ cut here ]------------
2月 02 19:29:21 nixos kernel: kernel BUG at lib/list_debug.c:65!
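
The list debugging check above expected `next->prev` to point back at the entry being removed, and the message prints both the expected and the observed value. Comparing the two bit by bit gives a rough hint about the kind of damage: a single differing bit would look like a bit flip, while many differing bits look more like the field being overwritten with unrelated data. This is only a heuristic sketch, and the names `differing_bits` and `classify_corruption` are made up for illustration.

```python
def differing_bits(expected, actual):
    """Count how many bit positions differ between two 64-bit values."""
    return bin(expected ^ actual).count("1")


def classify_corruption(expected, actual):
    """Give a rough label for how a corrupted value differs from the expected one."""
    count = differing_bits(expected, actual)
    if count == 0:
        return "intact"
    if count == 1:
        return "single-bit flip"
    return "multi-bit difference"


if __name__ == "__main__":
    # Values copied from the list_del corruption message above.
    expected = 0xFFFF888168554A38
    actual = 0x36B399449CC4488C
    print(differing_bits(expected, actual), classify_corruption(expected, actual))
```

For the values in this message the result is a multi-bit difference, so whatever damaged the list node replaced the whole pointer rather than toggling one bit.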

Reposting any article from this site to CSDN will be pursued as copyright infringement; any other use is fine.