Gentoo Forums
Unwanted shutdown/start of a computer with Gentoo
kgdrenefort
Apprentice


Joined: 19 Sep 2023
Posts: 186
Location: Somewhere in the 77

Posted: Sat Mar 30, 2024 3:43 pm    Post subject: Unwanted shutdown/start of a computer with Gentoo

Hi,

I am posting here, in Other Things Gentoo, about something that is likely not directly a Gentoo problem, though I am not sure at this point.

Context:

A few months ago I built a new computer for my fiancée, which runs Gentoo and Windows 11.

Problem:

For no obvious reason, the computer suddenly shuts down, then seems to start again. It does not look like a reboot in my opinion, but I could be wrong.

It looks like a shutdown followed by a start because it does not properly shut everything down as it should: when I restart the computer normally, I can see some lines indicating the normal shutdown/restart process; here I don't.

It's as if the power button was pressed or, worse, the AC power was disconnected. As explained, it is followed by a normal start of the computer.

We have not seen this on Windows 11. It is mostly random. In the first days it "hard rebooted" about three times.

Then only once or twice over the following days.

For the last few days it was OK, but yesterday evening it did it again. The only thing we remember is that I was putting my feet on the desk while sitting in my desk chair.

Fixes and things we tried:

1/ To be safe, we ran memtest on the RAM for a whole night: no errors reported after 13 or 16 passes.
2/ We noticed, though it may have been coincidence, that during the first days we had a lot of the "hard reboots" while she was using a connected Wi-Fi/Bluetooth antenna. So we removed it: no problem for a few days. Then we connected it back: no problem for a few days, until yesterday evening.
3/ To be safe, I tested the GPU by running the Superposition Benchmark tool in Extreme mode, wondering whether heavy CPU/GPU activity was the cause. Nothing.
4/ I also ran GIMPS for a few dozen minutes (a tool that makes the CPU search for prime numbers, or something like that). Nothing.
5/ Checked that the AC power connection was OK: it looks fine.
6/ Checked whether any cable was disconnected from the power strip: looks OK. My own computer is connected to the same power strip as hers, and I have no problem on my own desktop.

3 and 4 were, honestly, «why not» tests.

The only non-new pieces of hardware the computer uses or is connected to:

- SSD (Samsung EVO 870)
- SSD (Samsung EVO 860)
- HDD (Seagate ST3000DM001, which is quite old now; it was bought in 2018 and holds only some data, with very little activity on it)
- An external hard drive connected in USB 3
- Keyboard and mouse (using bluetooth USB dongle)
- A webcam in USB
- A top-screen lamp in USB (the kind that is added on the top of the screen)
- An old USB hub

I came here mostly for ideas. Have you had this kind of problem before? To me it really looks like a hardware problem, and she doesn't use Windows 11 enough to have spotted it there, so…

The Gentoo install is kept up to date (updated at least once a week); she's using the latest 23.0 profile, kernel 6.6.21-gentoo-dist, with systemd.

The following link is the dmesg (2838 lines) at the time of writing this post: https://0x0.st/XzNR.2024-16h

Below are the lines where I spotted errors, whether or not they are relevant:

Code:
[    0.281205] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.GPP0._PRW], AE_ALREADY_EXISTS (20230628/dswload2-326)
[    0.281213] ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20230628/psobject-220)
[    0.281220] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.GPP2._PRW], AE_ALREADY_EXISTS (20230628/dswload2-326)
[    0.281225] ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20230628/psobject-220)
[    0.281233] ACPI BIOS Error (bug): Failure creating named object [\_GPE._L08], AE_ALREADY_EXISTS (20230628/dswload2-326)
[    0.281238] ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20230628/psobject-220)


Code:
[    0.791086] hub 10-0:1.0: USB hub found
[    0.791090] hub 10-0:1.0: config failed, hub doesn't have any ports! (err -19)


Code:
[ 4595.894648] Bluetooth: hci0: unexpected event for opcode 0x0c52


Code:
[ 4595.908716] Bluetooth: hci0: Opcode 0x0c52 failed: -2


Code:
[ 4609.365556] Bluetooth: hci0: Execution of wmt command timed out
[ 4609.365563] Bluetooth: hci0: Failed to send wmt patch dwnld (-110)
[ 4609.365584] Bluetooth: hci0: Failed to set up firmware (-110)
[ 4609.365586] Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.


Then a lot of these:

Code:
[ 9687.749366] NVRM: API mismatch: the client has the version 550.67, but
               NVRM: this kernel module has the version 535.161.07.  Please
               NVRM: make sure that this kernel module and all NVIDIA driver
               NVRM: components have the same version.


I do not think that is related, but I will fix it today.

I would love some tips on commands to run in the background to try to catch a problem or an action right before the shutdown. But I have no more ideas; as I said, it's very hard to spot and we have not found a way to trigger the problem.
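
One thing that can at least be automated is picking error lines like the ones quoted above out of a saved dmesg dump. What follows is only a minimal sketch in Python: the function name `find_error_lines` and the keyword list are my own assumptions, chosen to match the messages in this post, and are not part of any existing tool.

```python
import re

# Assumed keywords, picked to match the lines quoted in this post;
# extend the list for other drivers.
ERROR_PATTERN = re.compile(
    r"error|fail|timed out|mismatch|died|refused", re.IGNORECASE
)
# dmesg prefixes each line with "[seconds.microseconds]".
LINE_PATTERN = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_error_lines(dmesg_text):
    """Return (timestamp, message) pairs for lines that look like errors."""
    hits = []
    for line in dmesg_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # continuation lines without a timestamp are skipped
        timestamp, message = float(match.group(1)), match.group(2)
        if ERROR_PATTERN.search(message):
            hits.append((timestamp, message))
    return hits


if __name__ == "__main__":
    sample = (
        "[    0.791086] hub 10-0:1.0: USB hub found\n"
        "[    0.791090] hub 10-0:1.0: config failed, "
        "hub doesn't have any ports! (err -19)\n"
        "[ 4609.365556] Bluetooth: hci0: Execution of wmt command timed out\n"
    )
    for timestamp, message in find_error_lines(sample):
        print(f"{timestamp:12.6f}  {message}")
```

The text passed in would be a dump saved beforehand (for example with `dmesg > dmesg.txt`). A keyword filter like this is deliberately crude: it will miss problems that are worded differently and will flag harmless lines, so it only narrows down what to read.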

If you need more info, please ask.

Regards,
GASPARD DE RENEFORT Kévin
_________________
«Gentoo does not have problems, only learning opportunities.» - NeddySeagoon
«If your Gentoo installation isn't valuable to you, feel free to continue to ignore the instructions.» - figueroa
kgdrenefort
Apprentice


Joined: 19 Sep 2023
Posts: 186
Location: Somewhere in the 77

Posted: Sat Mar 30, 2024 3:47 pm

About the NVIDIA problem: I just had a driver update, and a reboot will most likely get rid of it, so I don't think it's that important.
kgdrenefort
Apprentice


Joined: 19 Sep 2023
Posts: 186
Location: Somewhere in the 77

Posted: Thu Apr 04, 2024 4:55 pm

An important addition:

I might have been wrong. I just did a suspend and the computer stayed "awake": the fan kept spinning, and the hard-drive activity LED seemed to be blinking.

After the suspend I got a black screen with only the mouse working, AND I was able to get to a TTY, restart SDDM and log in again, but with an error: more or less "can't connect to KDE". I had to reboot; this time, only the mouse was responding.

Link to my dmesg, taken before I restarted SDDM and the computer crashed while logging in: https://bpa.st/YURA

It gets interesting from line 1139:

Code:

[ 7476.816344] perf: interrupt took too long (2543 > 2500), lowering kernel.perf_event_max_sample_rate to 78600
[14183.806793] Adding 33554428k swap on /swapfile.  Priority:-2 extents:2 across:49869548k SS
[16396.569996] Adding 33554428k swap on /swapfile.  Priority:-2 extents:2 across:49869548k SS
[28667.297230] PM: suspend entry (deep)
[28667.314245] Filesystems sync: 0.017 seconds
[28667.917663] Loading firmware: rtl_nic/rtl8168h-2.fw
[28667.918146] Freezing user space processes
[28667.921545] Freezing user space processes completed (elapsed 0.003 seconds)
[28667.921565] OOM killer disabled.
[28667.921574] Freezing remaining freezable tasks
[28667.923173] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[28667.923206] printk: Suspending console(s) (use no_console_suspend to debug)
[28667.924330] serial 00:04: disabled
[28667.924996] r8169 0000:08:00.0 enp8s0: Link is Down
[28667.945269] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[28667.945271] sd 8:0:0:0: [sdd] Synchronizing SCSI cache
[28667.945418] ata9.00: Entering standby power mode
[28667.947348] ata1.00: Entering standby power mode
[28667.948512] sd 1:0:0:0: [sdb] Synchronizing SCSI cache
[28667.948738] sd 4:0:0:0: [sdc] Synchronizing SCSI cache
[28667.948816] ata5.00: Entering standby power mode
[28667.951262] ata2.00: Entering standby power mode
[28668.355886] PM: suspend devices took 0.434 seconds
[28668.368534] ccp 0000:0a:00.2: Refused to change power state from D0 to D3hot
[28668.405666] ACPI: PM: Preparing to enter system sleep state S3
[28668.942624] ACPI: PM: Saving platform NVS memory
[28668.942846] Disabling non-boot CPUs ...
[28668.945011] smpboot: CPU 1 is now offline
[28668.945831] Wakeup pending. Abort CPU freeze
[28668.945834] Non-boot CPUs are not disabled
[28668.945837] Enabling non-boot CPUs ...
[28668.945903] smpboot: Booting Node 0 Processor 1 APIC 0x2
[28668.948690] ACPI: \_PR_.C002: Found 2 idle states
[28668.949234] CPU1 is up
[28668.951162] ACPI: PM: Waking up from system sleep state S3
[28669.002379] pcieport 0000:00:07.1: AER: Multiple Uncorrected (Non-Fatal) error message received from 0000:00:00.0
[28669.002393] ccp 0000:0a:00.2: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[28669.002396] ccp 0000:0a:00.2:   device [1022:1456] error status/mask=00100000/00000000
[28669.002398] ccp 0000:0a:00.2:    [20] UnsupReq               (First)
[28669.002401] ccp 0000:0a:00.2: AER:   TLP Header: 40001001 0000000f f72fe00c 00000000
[28669.002407] pci 0000:0a:00.0: AER: can't recover (no error_detected callback)
[28669.002408] ccp 0000:0a:00.2: AER: can't recover (no error_detected callback)
[28669.002410] xhci_hcd 0000:0a:00.3: AER: can't recover (no error_detected callback)
[28669.002433] pcieport 0000:00:07.1: AER: device recovery failed
[28669.003462] xhci_hcd 0000:01:00.0: xHC error in resume, USBSTS 0x401, Reinit
[28669.003468] usb usb1: root hub lost power or was reset
[28669.003471] usb usb2: root hub lost power or was reset
[28669.005212] serial 00:04: activated
[28669.015132] xhci_hcd 0000:0a:00.3: Refused to change power state from D3hot to D0
[28669.015142] xhci_hcd 0000:0a:00.3: Controller not ready at resume -19
[28669.015144] xhci_hcd 0000:0a:00.3: PCI post-resume error -19!
[28669.015146] xhci_hcd 0000:0a:00.3: HC died; cleaning up
[28669.015153] xhci_hcd 0000:0a:00.3: PM: dpm_run_callback(): pci_pm_resume+0x0/0xf0 returns -19
[28669.015164] xhci_hcd 0000:0a:00.3: PM: failed to resume async: error -19
[28669.192024] r8169 0000:08:00.0 enp8s0: Link is Down
[28669.319081] ata6: SATA link down (SStatus 0 SControl 330)
[28669.319085] ata11: SATA link down (SStatus 0 SControl 300)
[28669.408827] usb 1-4: reset high-speed USB device number 2 using xhci_hcd
[28669.475149] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[28669.475156] ata2.00: Entering active power mode
[28669.475334] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[28669.475335] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[28669.475346] ata9.00: Entering active power mode
[28669.475347] ata1.00: Entering active power mode
[28669.475360] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[28669.475364] ata5.00: Entering active power mode
[28669.475368] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[28669.475796] ata1.00: supports DRM functions and may not be fully accessible
[28669.477456] ata2.00: supports DRM functions and may not be fully accessible
[28669.478265] ata10.00: configured for UDMA/133
[28669.478389] ata1.00: supports DRM functions and may not be fully accessible
[28669.478664] ata5.00: supports DRM functions and may not be fully accessible
[28669.480947] ata1.00: configured for UDMA/133
[28669.481419] ata1.00: Enabling discard_zeroes_data
[28669.481677] ata5.00: supports DRM functions and may not be fully accessible
[28669.484572] ata5.00: configured for UDMA/133
[28669.484779] ata5.00: Enabling discard_zeroes_data
[28669.492032] ata2.00: supports DRM functions and may not be fully accessible
[28669.506452] ata2.00: configured for UDMA/133
[28669.812181] usb 1-10: reset full-speed USB device number 4 using xhci_hcd
[28670.289088] usb 1-7: reset high-speed USB device number 3 using xhci_hcd
[28670.579172] PM: resume devices took 1.577 seconds
[28670.604888] OOM killer enabled.
[28670.605704] Restarting tasks ...
[28670.605848] usb 3-2: USB disconnect, device number 2
[28670.614032] done.
[28670.614480] random: crng reseeded on system resumption
[28670.614998] PM: suspend exit
[28670.615054] PM: suspend entry (s2idle)
[28670.636518] Filesystems sync: 0.021 seconds
[28670.697248] ata9.00: configured for UDMA/133
[28670.945398] pcieport 0000:00:07.1: AER: Uncorrected (Non-Fatal) error message received from 0000:00:00.0
[28670.945423] ccp 0000:0a:00.2: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[28670.945431] ccp 0000:0a:00.2:   device [1022:1456] error status/mask=00100000/00000000
[28670.945440] ccp 0000:0a:00.2:    [20] UnsupReq               (First)
[28670.945447] ccp 0000:0a:00.2: AER:   TLP Header: 40001001 0000000f f72fe00c 00000000
[28670.945458] pci 0000:0a:00.0: AER: can't recover (no error_detected callback)
[28670.945463] ccp 0000:0a:00.2: AER: can't recover (no error_detected callback)
[28670.945468] xhci_hcd 0000:0a:00.3: AER: can't recover (no error_detected callback)
[28670.945507] pcieport 0000:00:07.1: AER: device recovery failed
[28671.092028] Freezing user space processes
[28671.095552] Freezing user space processes completed (elapsed 0.003 seconds)
[28671.095845] OOM killer disabled.
[28671.096126] Freezing remaining freezable tasks
[28672.323861] r8169 0000:08:00.0 enp8s0: Link is Up - 1Gbps/Full - flow control rx/tx
[28676.041849] xhci_hcd 0000:0a:00.3: xHCI host controller not responding, assume dead
[28676.042548] xhci_hcd 0000:0a:00.3: HC died; cleaning up
[28676.043269] xhci_hcd 0000:0a:00.3: Timeout while waiting for configure endpoint command
[28676.044377] usb 3-4: USB disconnect, device number 3
[28676.052050] Freezing remaining freezable tasks completed (elapsed 4.955 seconds)
[28676.052730] printk: Suspending console(s) (use no_console_suspend to debug)
[28676.055469] serial 00:04: disabled
[28676.055611] r8169 0000:08:00.0 enp8s0: Link is Down
[28676.088718] sd 4:0:0:0: [sdc] Synchronizing SCSI cache
[28676.091741] ata5.00: Entering standby power mode
[28676.091940] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[28676.094361] ata1.00: Entering standby power mode
[28676.095272] sd 8:0:0:0: [sdd] Synchronizing SCSI cache
[28676.095290] sd 1:0:0:0: [sdb] Synchronizing SCSI cache
[28676.095476] ata9.00: Entering standby power mode
[28676.098082] ata2.00: Entering standby power mode
[28676.506870] PM: suspend devices took 0.454 seconds
[28676.521464] ccp 0000:0a:00.2: Refused to change power state from D0 to D3hot
[35621.119460] serial 00:04: activated
[35621.130413] xhci_hcd 0000:0a:00.3: Refused to change power state from D3hot to D0
[35621.305672] r8169 0000:08:00.0 enp8s0: Link is Down
[35621.433155] ata11: SATA link down (SStatus 0 SControl 300)
[35621.433260] ata6: SATA link down (SStatus 0 SControl 330)
[35621.589053] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[35621.589063] ata5.00: Entering active power mode
[35621.589064] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[35621.589076] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[35621.589083] ata1.00: Entering active power mode
[35621.589537] ata1.00: supports DRM functions and may not be fully accessible
[35621.591881] ata10.00: configured for UDMA/133
[35621.592131] ata1.00: supports DRM functions and may not be fully accessible
[35621.592274] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[35621.592281] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[35621.592286] ata2.00: Entering active power mode
[35621.592290] ata9.00: Entering active power mode
[35621.592435] ata5.00: supports DRM functions and may not be fully accessible
[35621.594366] ata2.00: supports DRM functions and may not be fully accessible
[35621.594681] ata1.00: configured for UDMA/133
[35621.595125] ata1.00: Enabling discard_zeroes_data
[35621.595460] ata5.00: supports DRM functions and may not be fully accessible
[35621.598403] ata5.00: configured for UDMA/133
[35621.598824] ata5.00: Enabling discard_zeroes_data
[35621.609286] ata2.00: supports DRM functions and may not be fully accessible
[35621.623988] ata2.00: configured for UDMA/133
[35621.843482] PM: resume devices took 0.727 seconds
[35621.855874] OOM killer enabled.
[35621.856154] Restarting tasks ... done.
[35621.860532] random: crng reseeded on system resumption
[35621.864767] PM: suspend exit
[35624.410888] ata9.00: configured for UDMA/133
[35624.814843] r8169 0000:08:00.0 enp8s0: Link is Up - 1Gbps/Full - flow control rx/tx


I can see there are a lot of errors, notably about waking up.
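
To see which of these errors belong to which suspend attempt, the log can be split into cycles running from `PM: suspend entry` to `PM: suspend exit`. This is only a rough sketch in Python: the function name `suspend_cycles` and the keyword list are assumptions of mine, based on the failures visible in the excerpt above, not something provided by the kernel or by an existing tool.

```python
import re

# dmesg prefixes each line with "[seconds.microseconds]".
LINE_PATTERN = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
# Assumed keywords, taken from the failures visible in the excerpt.
TROUBLE_PATTERN = re.compile(
    r"error|failed|refused|died|not responding", re.IGNORECASE
)
# The requested sleep mode appears in parentheses, e.g. "(deep)".
MODE_PATTERN = re.compile(r"\((\w+)\)")


def suspend_cycles(dmesg_text):
    """Split a dmesg dump into suspend cycles and collect suspicious lines.

    A cycle starts at "PM: suspend entry" and ends at "PM: suspend exit".
    Each cycle is a dict with the entry time, the exit time (None if the
    log ends first), the requested mode, and the trouble lines in between.
    """
    cycles = []
    current = None
    for line in dmesg_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        if message.startswith("PM: suspend entry"):
            mode = MODE_PATTERN.search(message)
            current = {
                "entry": timestamp,
                "exit": None,
                "mode": mode.group(1) if mode else "unknown",
                "trouble": [],
            }
        elif message.startswith("PM: suspend exit") and current is not None:
            current["exit"] = timestamp
            cycles.append(current)
            current = None
        elif current is not None and TROUBLE_PATTERN.search(message):
            current["trouble"].append(message)
    if current is not None:
        cycles.append(current)  # the log ended in the middle of a cycle
    return cycles


if __name__ == "__main__":
    sample = "\n".join([
        "[28667.297230] PM: suspend entry (deep)",
        "[28668.368534] ccp 0000:0a:00.2: Refused to change power state from D0 to D3hot",
        "[28670.614998] PM: suspend exit",
    ])
    for cycle in suspend_cycles(sample):
        print(cycle["mode"], cycle["entry"], cycle["exit"], len(cycle["trouble"]))
```

Run on the excerpt above, this should report two cycles: a `deep` one that is aborted almost immediately, followed directly by an `s2idle` one. Grouping the messages this way makes it easier to see that the `ccp` and `xhci_hcd 0000:0a:00.3` complaints appear in both.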

If you have an idea, I would be glad :)!

Regards,
GASPARD DE RENEFORT Kévin
All times are GMT