this post was submitted on 14 Jul 2024

Linux


A community for everything relating to the linux operating system

[–] [email protected] 0 points 1 month ago (4 children)

I'm getting random reboots that don't seem tied to anything. It's a micro computer with an AMD Ryzen 5 5800H, new (<6 months old), with no re-used old components. It has 36GB RAM, which has passed a few runs of memtest. I have regularly seen k10temp spike to the low 90s without a reboot, and when the reboots do happen I haven't noticed temps above 60. The only thing I've been able to correlate it with at all is composing email: I'm a fairly fast typist, and markdown-oxide goes berserk and uses well over 100% CPU (~165%) while I'm typing. I made the correlation because several of the reboots happened while I was composing emails (which I subsequently lost).

There is nothing in the logs from the previous boot (boot -1). Just normal logging and then the reboot. Nothing at all suspicious, no weird errors. I struggle to use more than 50% of memory, so memory contention is not an issue. It's like a sudden power cycle.

The system is on a UPS; my next avenue of investigation is the UPS itself, but power surges in the house shouldn't be a possibility; there are a half dozen other computers in the house, some on UPS, some not, and none of those are having issues.

I saw an article a few days ago about a tool to help track down mysterious reboots like this, but can't find it now. I don't know how software could help; it is literally this: everything is working, the screens go blank, and a second or so later the BIOS POSTs.

I am suspicious of the CPU core temp readings, which I can't seem to get at. I get the GPU temp, which is never stressed (stays around 45C); k10temp_tctl, which from what I can find is an edge temp and not the core temp; and all of the NVMe temps, which stay in the 40s. The fact that I don't know whether I'm seeing what's really going on temp-wise in the CPU worries me. On the other hand, I don't think I've ever had it crash during a software update, which often includes compiling a bunch of Rust, C, Go, and other packages, and I can see those pegging multiple cores.
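
One way to see every temperature channel the kernel knows about, rather than only what a particular monitoring tool chooses to show, is to walk the hwmon sysfs tree directly. The following is a minimal Python sketch, assuming the standard hwmon layout under /sys/class/hwmon (a `name` file per chip, `tempN_input` values in millidegrees Celsius, and optional `tempN_label` files). The function name `read_hwmon_temps` is made up for illustration, and which channels appear depends on what the driver exposes for a given chip.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return a list of (chip, label, degrees_c) for every readable temp sensor."""
    readings = []
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        # Fall back to the directory name (e.g. "hwmon2") if the chip has no name file.
        chip = name_file.read_text().strip() if name_file.exists() else chip_dir.name
        for input_file in sorted(chip_dir.glob("temp*_input")):
            base = input_file.name[: -len("_input")]  # e.g. "temp1"
            label_file = chip_dir / f"{base}_label"
            label = label_file.read_text().strip() if label_file.exists() else base
            try:
                millideg = int(input_file.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors fail when read; skip them
            readings.append((chip, label, millideg / 1000.0))
    return readings
```

Calling `read_hwmon_temps()` and printing each tuple lists all channels per chip, which makes it easier to check whether k10temp reports anything besides Tctl on this machine.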

I'm at a loss. I've looked at everything I can think of, but still haven't gotten a hint about what is triggering this. I may just do a bunch of markdown editing with markdown-oxide enabled and see if I can reliably force it to happen, but that still wouldn't tell me why. I am certain it's not memory, and have mostly convinced myself it isn't temperature, unless it's something hidden I can't get a reading on.
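
If the reboot can be provoked on demand, it may help to have a record of the last few seconds before it. Normal logs can lose their final buffered lines on a sudden power cycle, so the sketch below appends one sample per interval and forces each line to disk with `fsync`. This is a hedged illustration only: `append_durable` and `log_samples` are hypothetical helper names, and the `sample` callable is whatever reading is of interest (load average, a temperature, and so on).

```python
import os
import time


def append_durable(path, line):
    """Append one line and force it to disk so it survives a sudden power loss."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line.rstrip("\n") + "\n")
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to push it to the storage device


def log_samples(path, sample, interval_s=1.0, count=None):
    """Call sample() every interval_s seconds and durably log its result.

    sample is any zero-argument callable returning a string.
    count=None keeps logging until the process is interrupted.
    """
    n = 0
    while count is None or n < count:
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        append_durable(path, f"{stamp} {sample()}")
        n += 1
        if count is None or n < count:
            time.sleep(interval_s)
```

For example, `log_samples("reboot-trace.log", lambda: open("/proc/loadavg").read().strip())` would leave a timestamped load-average trail; the last line in the file after an unexpected reboot shows the state roughly one interval before it happened.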

Help?

[–] [email protected] 0 points 1 month ago

Replace markdown-oxide with another tool for some time; try breaking the correlation to find the causation.

[–] [email protected] 0 points 1 month ago

Couldn't it be that the PSU is failing? Check it with a multimeter! I'd look at that before the UPS, or maybe both.

[–] [email protected] 0 points 1 month ago (1 children)

Shot in the dark, but my latest instability was caused by the MSI board pushing (well, allowing) too much power to the CPU. In my case it's a 13th-gen Intel, so probably not the same thing. I've updated to a beta BIOS and set Intel's defaults.

One other thing that might or might not help is https://github.com/mchehab/rasdaemon
It helped me identify a failing CPU by logging MCE events/CPU errors.

[–] [email protected] 0 points 1 month ago

That isn't the forensic tool I saw, but it looks like it could be really useful, thank you!

[–] [email protected] 0 points 1 month ago (1 children)
[–] [email protected] 0 points 1 month ago

Started to. There's a small learning curve as I only recently switched from grub to EFI, and am still figuring out how to manage stuff like this.

[–] [email protected] 0 points 1 month ago (2 children)

Is anyone here playing Elden Ring on their Linux machine?

[–] [email protected] 0 points 1 month ago

I am playing on stock Fedora with an AMD 5600G and an Intel Arc A770. It runs without issue performance-wise (no noticeable difference from my Windows dual boot), but there is a weird bug with the anti-cheat: if you don't have the DLC, you need to create some file or it will fail and you won't get any online functionality. ProtonDB has the steps for a simple workaround, though.

[–] [email protected] 0 points 1 month ago (1 children)

I haven't for months now, but it certainly ran well at release.

[–] [email protected] 0 points 1 month ago

I have an i5 8400 and a 1660 Super with 16GB RAM.

It runs fine on Windows with high settings. Do you think there would be a performance hit?