this post was submitted on 27 Oct 2024
There it is! Thank you! It's a process owned by root called kworker/0:0+kacpid. Any idea what that is?
[Edit 1] Interestingly, I can't even kill -9 it.
[Edit 2] With kworker kacpid to work with, I did a quick search and found this SO page that has some interesting information that I only partially understand, but the following worked like a charm:
It's not clear to me what an interrupt is or whether this gpe09 value is meant to be persistent across reboots, or why this only seems to be happening in the last couple months, but if I can make it go away by running the above from time to time, I guess it's alright?
An interrupt is an input that can be triggered to interrupt normal execution. It is used, e.g., by hardware devices to signal the processor that something has happened that requires timely processing, so that real-time behavior can be achieved (for variable definitions of real-time). Interrupts can also be triggered by software, and this explanation is a gross oversimplification, but that information is what is most likely relevant and interesting for your case at this point.
The commands you posted will sort the interrupts and output the one with the highest count (via head -1), thereby determining the interrupt that gets triggered the most. It will then disable that interrupt via the user-space interface to the ACPI interrupts.
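If it helps to see the "find the busiest interrupt" step spelled out, here is a rough, read-only Python sketch of the same idea. It is not the command from the SO page. It assumes the counters live in /sys/firmware/acpi/interrupts, that per-GPE files are named gpe00 through gpeFF, and that the first whitespace-separated field in each file is the count. The helper names parse_count and busiest_gpe are made up for illustration.

```python
import re
from pathlib import Path

# Assumption: per-GPE counter files are named gpe00..gpeFF.
# Aggregate files such as gpe_all or sci are deliberately skipped.
GPE_NAME = re.compile(r"^gpe[0-9A-Fa-f]{2}$")


def parse_count(text):
    """Return the leading integer of a counter file, or None if absent."""
    fields = text.split()
    if not fields or not fields[0].isdigit():
        return None
    return int(fields[0])


def busiest_gpe(directory="/sys/firmware/acpi/interrupts"):
    """Return (name, count) for the GPE with the highest count, or None."""
    root = Path(directory)
    if not root.is_dir():
        return None
    best = None
    for entry in root.iterdir():
        if not GPE_NAME.match(entry.name):
            continue
        try:
            count = parse_count(entry.read_text())
        except OSError:
            continue  # unreadable entry, ignore it
        if count is None:
            continue
        if best is None or count > best[1]:
            best = (entry.name, count)
    return best


if __name__ == "__main__":
    print(busiest_gpe())
```

This only reads the counters. Disabling a GPE is a separate step (writing to the corresponding file as root), and the sketch leaves that out on purpose.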
One of the goals of ACPI is to provide a kind of general hardware abstraction without knowing the particular details about each and every hardware device. This is facilitated by offering, among other things, general purpose events (GPEs). One of these GPEs is being triggered a lot, and the processing of that interrupt is what causes your CPU spikes.
The changes you made will not persist after a reboot.
Since this is handled by kworker, you could try and investigate further via the workqueue tools: https://github.com/torvalds/linux/tree/master/tools/workqueue
In general, Linux will detect if excessive GPEs are generated (look for the term "GPE storm" in your kernel log) and stop handling the interrupts by switching to polling. If that happens, or if the interrupts are manually disabled, the system might not react to certain events in a timely manner. What that means in each particular case depends on what the interrupts are responsible for, which is hard to tell without additional details.
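To check a saved kernel log for that term, something like the following Python sketch would do. It only does a case-insensitive search for "GPE storm" and makes no assumption about the exact wording of the kernel message around it. The function name find_gpe_storm_lines is made up for illustration.

```python
import re

# Case-insensitive match on the search term only; the rest of the
# kernel message is not assumed to have any particular format.
STORM = re.compile(r"gpe storm", re.IGNORECASE)


def find_gpe_storm_lines(lines):
    """Return the log lines that mention a GPE storm."""
    return [line.rstrip("\n") for line in lines if STORM.search(line)]
```

You could feed it the lines of a file you saved the kernel log to, for example by opening that file and passing the file object in. An empty result only means the term was not found in that log, not that the interrupts are fine.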
But it will still react? Is there a resource where I can read more?
I'll post some links, but it's a pretty busy week for me already, so give me some time.
To me it sounds like your root cause is either a driver problem or your hardware is misbehaving a little bit in a way the driver doesn’t expect, firing a lot of interrupts that shouldn’t normally happen.
If this seems to resolve your issue, I wouldn’t lose any sleep over it. I would think my hardware is a little bit weird or there’s a bug somewhere in the driver for it. You can also try different kernel versions if your distribution gives you the option, because kernels come with different versions of drivers.
You can’t kill that because it’s a kernel thread. Kernel threads are not like normal processes; they are part of the operating system, and terminating one could cause instability.
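If you want to check whether a given PID is a kernel thread before trying to signal it, one way is to look at the flags field in /proc/PID/stat and test the PF_KTHREAD bit. Below is a minimal Python sketch of that check. The value 0x00200000 for PF_KTHREAD is taken from the kernel's include/linux/sched.h, and the function names are made up for illustration.

```python
# PF_KTHREAD as defined in the kernel's include/linux/sched.h.
PF_KTHREAD = 0x00200000


def is_kernel_thread(stat_line):
    """Check the flags field of a /proc/<pid>/stat line for PF_KTHREAD."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 9 (flags) is rest[6].
    flags = int(rest[6])
    return bool(flags & PF_KTHREAD)


def is_kernel_thread_pid(pid):
    """Read /proc/<pid>/stat and report whether the task is a kernel thread."""
    with open(f"/proc/{pid}/stat") as f:
        return is_kernel_thread(f.read())
```

A quicker rule of thumb is that kernel threads show up in ps with their name in square brackets, but the flag check does not depend on how the name is displayed.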
That's a kernel worker for ACPI. It sounds like you may have a driver for something that is misbehaving.
More likely it's the device firmware, and you likely can't fix that.