rafaelzigx n00b
Joined: 09 Apr 2015 Posts: 11
|
Posted: Wed Nov 07, 2018 11:07 am Post subject: Nvidia drivers hanging udev, resulting in one core at 100% |
|
|
Hello guys,
I'm having an issue with the proprietary nvidia drivers.
Every time I install them, udev hangs and one of the cores stays at 100% all the time.
The process consuming the processor:
/sbin/udev --daemon
I can't kill it.
I've already tried kernel versions from 4.9.x up to 4.19. That didn't solve it.
I ran the same tests with different versions of the nvidia drivers.
I also upgraded eudev (even though I wasn't sure it was related). Nothing changed.
If I use nouveau, I don't have this problem. But as soon as I install nvidia, it starts.
My Kernel log:
Code: | [ 65.832518] udevd[2723]: slow: 'lmt-udev auto' [2786]
[ 66.873927] udevd[2690]: worker [2724] /module/nvidia is taking a long time
[ 66.873931] udevd[2690]: worker [2754] /devices/pci0000:00/0000:00:01.0/0000:01:00.0 is taking a long time
[ 66.873933] udevd[2690]: worker [2723] /devices/system/machinecheck/machinecheck3 is taking a long time
[ 185.917368] udevd[2724]: timeout 'nvidia-udev.sh add'
[ 185.917378] udevd[2724]: slow: 'nvidia-udev.sh add' [2868]
[ 186.918427] udevd[2724]: timeout: killing 'nvidia-udev.sh add' [2868]
[ 186.918443] udevd[2724]: slow: 'nvidia-udev.sh add' [2868]
[ 186.918626] udevd[2724]: 'nvidia-udev.sh add' [2868] terminated by signal 9 (Killed)
[ 186.928568] udevd[2723]: timeout: killing 'lmt-udev auto' [2786]
[ 186.928577] udevd[2723]: slow: 'lmt-udev auto' [2786]
[ 186.928714] udevd[2723]: 'lmt-udev auto' [2786] terminated by signal 9 (Killed)
[ 189.931852] udevd[2690]: worker [2754] /devices/pci0000:00/0000:00:01.0/0000:01:00.0 timeout; kill it
[ 189.931861] udevd[2690]: seq 1837 '/devices/pci0000:00/0000:00:01.0/0000:01:00.0' killed
[ 246.983258] INFO: task laptop_mode:5703 blocked for more than 120 seconds.
[ 246.983260] Tainted: P OE 4.19.1-gentoo-vulkan #1
[ 246.983261] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.983262] laptop_mode D 0 5703 3755 0x00000000
[ 246.983264] Call Trace:
[ 246.983269] ? __schedule+0x250/0x800
[ 246.983271] schedule+0x28/0x80
[ 246.983272] schedule_preempt_disabled+0xa/0x10
[ 246.983274] __mutex_lock.isra.1+0x24d/0x490
[ 246.983276] ? wp_page_copy+0x318/0x640
[ 246.983279] ? control_store+0x20/0x80
[ 246.983280] control_store+0x20/0x80
[ 246.983283] kernfs_fop_write+0x105/0x180
[ 246.983286] __vfs_write+0x36/0x180
[ 246.983288] ? selinux_file_permission+0x11f/0x130
[ 246.983289] ? security_file_permission+0x2c/0xb0
[ 246.983291] vfs_write+0xb0/0x190
[ 246.983293] ksys_write+0x52/0xc0
[ 246.983295] do_syscall_64+0x5a/0x110
[ 246.983297] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 246.983299] RIP: 0033:0x7f819f211da8
[ 246.983303] Code: Bad RIP value.
[ 246.983304] RSP: 002b:00007ffd95ab9370 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 246.983305] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f819f211da8
[ 246.983306] RDX: 0000000000000003 RSI: 0000563fd80fbab0 RDI: 0000000000000001
[ 246.983307] RBP: 0000563fd80fbab0 R08: 000000000000000a R09: 0000563fd8130270
[ 246.983308] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f819f4e1760
[ 246.983309] R13: 0000000000000003 R14: 00007f819f4dc760 R15: 0000000000000003
[ 369.863272] INFO: task laptop_mode:5703 blocked for more than 120 seconds.
[ 369.863274] Tainted: P OE 4.19.1-gentoo-vulkan #1
[ 369.863274] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 369.863275] laptop_mode D 0 5703 3755 0x00000000
[ 369.863277] Call Trace:
[ 369.863281] ? __schedule+0x250/0x800
[ 369.863282] schedule+0x28/0x80
[ 369.863283] schedule_preempt_disabled+0xa/0x10
[ 369.863284] __mutex_lock.isra.1+0x24d/0x490
[ 369.863287] ? wp_page_copy+0x318/0x640
[ 369.863289] ? control_store+0x20/0x80
[ 369.863290] control_store+0x20/0x80
[ 369.863292] kernfs_fop_write+0x105/0x180
[ 369.863294] __vfs_write+0x36/0x180
[ 369.863296] ? selinux_file_permission+0x11f/0x130
[ 369.863297] ? security_file_permission+0x2c/0xb0
[ 369.863299] vfs_write+0xb0/0x190
[ 369.863300] ksys_write+0x52/0xc0
[ 369.863302] do_syscall_64+0x5a/0x110
[ 369.863303] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 369.863305] RIP: 0033:0x7f819f211da8
[ 369.863308] Code: Bad RIP value.
[ 369.863309] RSP: 002b:00007ffd95ab9370 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 369.863310] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f819f211da8
[ 369.863310] RDX: 0000000000000003 RSI: 0000563fd80fbab0 RDI: 0000000000000001
[ 369.863311] RBP: 0000563fd80fbab0 R08: 000000000000000a R09: 0000563fd8130270
[ 369.863311] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f819f4e1760
[ 369.863312] R13: 0000000000000003 R14: 00007f819f4dc760 R15: 0000000000000003
[ 492.743282] INFO: task laptop_mode:5703 blocked for more than 120 seconds.
[ 492.743283] Tainted: P OE 4.19.1-gentoo-vulkan #1
[ 492.743284] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 492.743285] laptop_mode D 0 5703 3755 0x00000000
[ 492.743286] Call Trace:
[ 492.743291] ? __schedule+0x250/0x800
[ 492.743292] schedule+0x28/0x80
[ 492.743293] schedule_preempt_disabled+0xa/0x10
[ 492.743294] __mutex_lock.isra.1+0x24d/0x490
[ 492.743297] ? wp_page_copy+0x318/0x640
[ 492.743299] ? control_store+0x20/0x80
[ 492.743300] control_store+0x20/0x80
[ 492.743302] kernfs_fop_write+0x105/0x180
[ 492.743304] __vfs_write+0x36/0x180
[ 492.743306] ? selinux_file_permission+0x11f/0x130
[ 492.743307] ? security_file_permission+0x2c/0xb0
[ 492.743309] vfs_write+0xb0/0x190
[ 492.743310] ksys_write+0x52/0xc0
[ 492.743312] do_syscall_64+0x5a/0x110
[ 492.743326] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 492.743327] RIP: 0033:0x7f819f211da8
[ 492.743330] Code: Bad RIP value.
[ 492.743331] RSP: 002b:00007ffd95ab9370 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 492.743332] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f819f211da8
[ 492.743333] RDX: 0000000000000003 RSI: 0000563fd80fbab0 RDI: 0000000000000001
[ 492.743333] RBP: 0000563fd80fbab0 R08: 000000000000000a R09: 0000563fd8130270
[ 492.743334] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f819f4e1760
[ 492.743335] R13: 0000000000000003 R14: 00007f819f4dc760 R15: 0000000000000003
[ 615.623274] INFO: task laptop_mode:5703 blocked for more than 120 seconds.
[ 615.623276] Tainted: P OE 4.19.1-gentoo-vulkan #1
[ 615.623276] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 615.623277] laptop_mode D 0 5703 3755 0x00000000
[ 615.623278] Call Trace:
[ 615.623282] ? __schedule+0x250/0x800
[ 615.623283] schedule+0x28/0x80
[ 615.623284] schedule_preempt_disabled+0xa/0x10
[ 615.623286] __mutex_lock.isra.1+0x24d/0x490
[ 615.623288] ? wp_page_copy+0x318/0x640
[ 615.623290] ? control_store+0x20/0x80
[ 615.623291] control_store+0x20/0x80
[ 615.623293] kernfs_fop_write+0x105/0x180
[ 615.623295] __vfs_write+0x36/0x180
[ 615.623297] ? selinux_file_permission+0x11f/0x130
[ 615.623298] ? security_file_permission+0x2c/0xb0
[ 615.623300] vfs_write+0xb0/0x190
[ 615.623301] ksys_write+0x52/0xc0
[ 615.623303] do_syscall_64+0x5a/0x110
[ 615.623304] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 615.623306] RIP: 0033:0x7f819f211da8
[ 615.623309] Code: Bad RIP value.
[ 615.623309] RSP: 002b:00007ffd95ab9370 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 615.623311] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f819f211da8
[ 615.623311] RDX: 0000000000000003 RSI: 0000563fd80fbab0 RDI: 0000000000000001
[ 615.623312] RBP: 0000563fd80fbab0 R08: 000000000000000a R09: 0000563fd8130270
[ 615.623312] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f819f4e1760
[ 615.623313] R13: 0000000000000003 R14: 00007f819f4dc760 R15: 0000000000000003
[ 738.503276] INFO: task laptop_mode:5703 blocked for more than 120 seconds.
[ 738.503278] Tainted: P OE 4.19.1-gentoo-vulkan #1
[ 738.503278] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
|
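For anyone comparing their own log against the one above, here is a minimal sketch that pulls out the two interesting parts: the udevd lines that report a timed-out helper program, and the distinct tasks the kernel reports as blocked. The function names udev_timeouts and hung_tasks are made up for this example, and the patterns only assume the message wording seen in the log above.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of any package).
# Both read a kernel log from stdin, e.g.:  dmesg | udev_timeouts

# Print udevd messages about timed-out helper programs, without the
# leading "[ seconds.micros]" timestamp.
udev_timeouts() {
    grep -E 'udevd\[[0-9]+\]: timeout' | sed -E 's/^\[ *[0-9.]+\] //'
}

# Print each distinct "name:pid" that the kernel reported as blocked
# ("INFO: task name:pid blocked for more than N seconds.").
hung_tasks() {
    sed -n -E 's/.*INFO: task ([^ ]+) blocked for more than.*/\1/p' | sort -u
}
```

On the log above, udev_timeouts would list the 'nvidia-udev.sh add' and 'lmt-udev auto' timeouts, and hung_tasks would collapse the repeated traces into the single entry laptop_mode:5703.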
My hardware:
Code: | ┌─[rafael][vulkan][~]
└─▪ inxi -v 2
System: Host: vulkan Kernel: 4.19.1-gentoo-vulkan x86_64 bits: 64 Desktop: Xfce 4.12.4
Distro: Gentoo Base System release 2.4.1
Machine: Device: laptop System: Dell product: XPS 15 9560 serial: N/A
Mobo: Dell model: 05FFDN v: A00 serial: N/A UEFI: Dell v: 1.12.1 date: 10/02/2018
Battery BAT0: charge: 36.4 Wh 74.2% condition: 49.0/56.0 Wh (88%)
CPU: Quad core Intel Core i7-7700HQ (-MT-MCP-) speed/max: 3510/3800 MHz
Graphics: Card-1: Intel Device 591b
Card-2: NVIDIA GP107M [GeForce GTX 1050 Mobile]
Display Server: X.Org 1.20.3 driver: modesetting Resolution: 1920x1080@59.93hz
OpenGL: renderer: Mesa DRI Intel HD Graphics 630 (Kaby Lake GT2) version: 4.5 Mesa 18.2.4
Network: Card-1: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter driver: ath10k_pci
Card-2: Qualcomm Atheros
Drives: HDD Total Size: 3024.6GB (2.5% used)
ID-1: model: Samsung_SSD_960_PRO_1TB
ID-2: model: Ultra_Slim_PL
Info: Processes: 217 Uptime: 33 min Memory: 1097.2/15930.2MB Client: Shell (bash) inxi: 2.3.56
|
Thanks in advance.
[Moderator edit: added [code] tags to preserve output layout. -Hu] |
javeree Guru
Joined: 29 Jan 2006 Posts: 453
|
Posted: Wed Nov 21, 2018 12:29 pm Post subject: |
|
|
I have had the same issue for a few days.
Bug https://bugs.gentoo.org/show_bug.cgi?id=454740 describes the cause, but its solution is only a workaround. I think what happens is a race condition in the workaround script, since the error does not occur on every boot: between the time the script checks for the existence of the nvidia module and the time it runs nvidia-smi, the module is unloaded again.
I see that sometimes, after about an hour, it suddenly succeeds and the module gets loaded. I also see that manually loading nvidia-drm can break the loop.
So for now, as a poor workaround, I have added
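Code: | # Create an OpenRC local.d start script that loads nvidia-drm at boot.
cat > /etc/local.d/nvidia-break-udevd-lock.start <<'EOF'
#!/bin/sh
modprobe nvidia-drm
EOF
chmod +x /etc/local.d/nvidia-break-udevd-lock.start
# Make sure the "local" service runs in the default runlevel.
rc-update add local default |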
The next thing I will check: at one point I saw that the 'nvidia' module was loaded, but nvidia-drm was not. So maybe the check in nvidia-udev.sh should not be lsmod | grep -iq nvidia, but rather lsmod | grep -iq nvidia_drm (lsmod lists module names with underscores, so a pattern with a hyphen would never match). |
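To make the difference between the two checks concrete, here is a small sketch of an exact-name test against lsmod-style output. It is not the code from nvidia-udev.sh; module_listed is a name invented for this example. It compares the first column exactly instead of doing a substring grep, and it maps hyphens to underscores because that is how lsmod prints module names.

```shell
#!/bin/sh
# Hypothetical helper (not taken from nvidia-udev.sh).
# module_listed NAME: succeed if NAME is listed as a module (first column)
# in lsmod-style text on stdin. Usage on a live system: lsmod | module_listed nvidia-drm
module_listed() {
    # lsmod shows "nvidia_drm", so normalize "nvidia-drm" before comparing.
    want=$(printf '%s\n' "$1" | sed 's/-/_/g')
    # Exact match on the first column; exit 0 if found, 1 otherwise.
    awk -v want="$want" '$1 == want { found = 1 } END { exit !found }'
}
```

With only nvidia and nvidia_modeset listed, module_listed nvidia succeeds while module_listed nvidia-drm fails, which is the situation described above; a plain grep -iq nvidia would succeed in both cases. Testing for the directory /sys/module/nvidia_drm should be an equivalent check that avoids parsing lsmod output.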