Monday, November 9, 2015

Linux : kernel:BUG: soft lockup

I encountered this message in syslog:

kernel:BUG: soft lockup - CPU#2 stuck for 22s! [process name:PID]
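
The message has a regular shape, so it can be picked out of a log automatically. Here is a minimal sketch of how that might look in Python. The regular expression, the function name `parse_soft_lockup` and the field names are my own illustration, not part of any kernel tool, and the pattern assumes the exact wording shown above.

```python
import re

# Pattern for the message format shown above. The group names are
# illustrative choices, not anything the kernel defines.
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<seconds>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return the fields of a soft lockup message, or None if no match."""
    match = SOFT_LOCKUP_RE.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match["cpu"]),
        "seconds": int(match["seconds"]),
        "comm": match["comm"],  # process name; may itself contain ':'
        "pid": int(match["pid"]),
    }
```

Feeding each syslog line through `parse_soft_lockup` and keeping the non-`None` results would show which CPUs and processes are involved most often.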

According to this link: http://askubuntu.com/questions/401736/what-does-cpu0-stuck-mean

The Linux kernel has a process which monitors each CPU on the system.
The kernel also has one or more special interrupts. The interrupt handler runs a soft-lockup check that compares the current timestamp with the timestamp stored in that CPU's kernel data structure. If the current timestamp is later than the stored one by more than a defined threshold (in seconds), the kernel assumes that the monitoring process or watchdog thread has not run for an unreasonably long time.
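
The timestamp comparison described here can be sketched in a few lines of Python. This is only a user-space model of the idea, not the kernel's code: the class `CpuWatchdogState`, the functions `touch` and `check_soft_lockup`, and the 20-second threshold are all assumptions made for illustration (the real threshold is configurable).

```python
# Assumed threshold in seconds; chosen for illustration only.
SOFT_LOCKUP_THRESHOLD_S = 20


class CpuWatchdogState:
    """Hypothetical stand-in for the per-CPU timestamp the kernel keeps."""

    def __init__(self, now):
        self.touch_ts = now


def touch(state, now):
    """What the watchdog thread does whenever it gets to run."""
    state.touch_ts = now


def check_soft_lockup(state, now, threshold=SOFT_LOCKUP_THRESHOLD_S):
    """What the interrupt does: compare 'now' with the stored timestamp.

    Returns the number of seconds the CPU appears stuck, or None if the
    watchdog thread ran recently enough.
    """
    stuck_for = now - state.touch_ts
    if stuck_for > threshold:
        return stuck_for
    return None


# Example: the watchdog thread last ran at t=0 and is then starved.
state = CpuWatchdogState(now=0)
early = check_soft_lockup(state, now=10)  # within the threshold
late = check_soft_lockup(state, now=22)   # past the threshold
```

In this model `early` is `None` and `late` is `22`, which corresponds to a "stuck for 22s" message. Calling `touch(state, now=22)` would reset the stored timestamp, just as the watchdog thread does when it finally gets CPU time again.
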
Why or how can a CPU soft lockup occur? How can a CPU get locked up if the kernel is carefully scheduling CPU access? Basically, any poorly written code that loops for a very long time, or forever, can hog a CPU without handing it back to the scheduler. It can be a programming problem in your own code or in third-party software.
Locking issues in drivers are another cause, as are kernel bugs in important drivers or in the scheduler itself. The scheduler could schedule a driver routine to run, and if that routine has a problem and nothing checks on it, it could hog that CPU for a long time. By the definition above, the watchdog would catch this and issue a soft lockup alert.
