Gentoo Forums
nvidia driver segfault
jrevi
Tux's lil' helper

Joined: 08 Sep 2005
Posts: 129

Posted: Tue Sep 30, 2014 2:04 pm    Post subject: nvidia driver segfault

Hi all,

My nvidia driver has been segfaulting for a couple of weeks now, and I don't know where to start. The kernel log shows:

Code:
[    6.613219] BUG: unable to handle kernel NULL pointer dereference at           (null)
[    6.613224] IP: [<ffffffff814b34a4>] __down+0x3b/0x8e
[    6.613227] PGD 1ac50a067 PUD 1ac511067 PMD 0
[    6.613228] Oops: 0002 [#1] SMP
[    6.613271] Modules linked in: cpufreq_ondemand snd_hda_codec_si3054 snd_hda_codec_realtek snd_hda_codec_generic nvidia(PO+) coretemp kvm_intel kvm microcode pcspkr i2c_i801(+) drm iwlwifi(+) cfg80211 i2c_core thermal(+) rfkill r8169 agpgart mii wmi(+) snd_hda_intel(+) battery snd_hda_codec snd_hwdep video(+) snd_pcm acpi_cpufreq snd_timer snd ac(+) processor button thermal_sys ipv6 xts gf128mul aes_x86_64 cbc sha512_generic sha256_generic sha1_generic libiscsi scsi_transport_iscsi tg3 ptp pps_core libphy e1000 fuse nfs lockd sunrpc jfs multipath linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash dm_log dm_mod hid_sunplus hid_sony led_class hid_samsung hid_pl hid_petalynx hid_gyration sl811_hcd
[    6.613316]  usbhid xhci_hcd ohci_pci ohci_hcd uhci_hcd usb_storage aic94xx libsas lpfc crc_t10dif crct10dif_common qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 cciss 3w_9xxx 3w_xxxx mptsas scsi_transport_sas mptfc scsi_transport_fc scsi_tgt mptspi mptscsih mptbase atp870u dc395x qla1280 imm parport dmx3191d sym53c8xx gdth advansys initio BusLogic arcmsr aic7xxx aic79xx scsi_transport_spi sg pdc_adma sata_inic162x sata_mv ata_piix sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise pata_sl82c105 pata_cs5530 pata_cs5520 pata_via pata_jmicron pata_marvell pata_sis pata_netcell pata_sc1200 pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmcia pcmcia_core pata_ns87415 pata_ns87410
[    6.613329]  pata_serverworks pata_artop pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x pata_mpiix ahci libahci ehci_pci ehci_hcd libata usbcore usb_common
[    6.613332] CPU: 6 PID: 5487 Comm: nvidia-smi Tainted: P           O 3.14.14-gentoo #5
[    6.613334] Hardware name: CLEVO CO.                        W860CU                          /W860CU                          , BIOS CALPELLACRB.86C.0000.X
[    6.613336] task: ffff8800c27f7090 ti: ffff8800bfc36000 task.ti: ffff8800bfc36000
[    6.613340] RIP: 0010:[<ffffffff814b34a4>]  [<ffffffff814b34a4>] __down+0x3b/0x8e
[    6.613341] RSP: 0018:ffff8800bfc37b18  EFLAGS: 00010096
[    6.613343] RAX: 0000000000000000 RBX: ffffffffa15e2978 RCX: ffffffffa15e2980
[    6.613344] RDX: ffff8800bfc37b18 RSI: ffffffffa13a4004 RDI: ffffffffa15e2978
[    6.613346] RBP: ffff8800bfc37b58 R08: ffff8800c0398148 R09: ffff8801ab978e88
[    6.613347] R10: 000000000000002b R11: ffff8801b071c060 R12: 7fffffffffffffff
[    6.613348] R13: ffff8800c27f7090 R14: 00000000000000ff R15: 0000000000000000
[    6.613350] FS:  00007f84c75b2700(0000) GS:ffff8801b7d80000(0000) knlGS:0000000000000000
[    6.613351] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.613352] CR2: 0000000000000000 CR3: 00000000c26dc000 CR4: 00000000000007e0
[    6.613353] Stack:
[    6.613356]  ffffffffa15e2980 0000000000000000 ffff8800c24059c0 0000000000000202
[    6.613357]  0000000000000002 ffffffffa15e2978 0000000000000292 ffff8800c038c088
[    6.613359]  ffff8800bfc37b78 ffffffff810610e2 ffff8800c2e23d40 ffff8800c2e23d40
[    6.613360] Call Trace:
[    6.613365]  [<ffffffff810610e2>] down+0x28/0x38
[    6.613495]  [<ffffffffa11e8335>] nvidia_open+0x181/0x83b [nvidia]
[    6.613500]  [<ffffffff813b3afe>] ? kobj_lookup+0xf6/0x12f
[    6.613598]  [<ffffffffa11f075d>] nvidia_frontend_open+0x4b/0x89 [nvidia]
[    6.613603]  [<ffffffff810d6ad7>] chrdev_open+0x12a/0x155
[    6.613605]  [<ffffffff810d69ad>] ? cdev_put+0x22/0x22
[    6.613608]  [<ffffffff810d19a4>] do_dentry_open.isra.16+0x18f/0x24d
[    6.613610]  [<ffffffff810d1a7f>] finish_open+0x1d/0x28
[    6.613613]  [<ffffffff810df297>] do_last+0x910/0xb3a
[    6.613616]  [<ffffffff810dbecc>] ? inode_permission+0x40/0x42
[    6.613618]  [<ffffffff810dc241>] ? link_path_walk+0x66/0x736
[    6.613620]  [<ffffffff810df70d>] path_openat+0x24c/0x591
[    6.613624]  [<ffffffff810e8b4f>] ? setattr_copy+0x9a/0xde
[    6.613626]  [<ffffffff810dfd7d>] do_filp_open+0x35/0x85
[    6.613629]  [<ffffffff810e980f>] ? __alloc_fd+0x5b/0xe7
[    6.613632]  [<ffffffff810d29dc>] do_sys_open+0x14a/0x1d9
[    6.613634]  [<ffffffff810d2a88>] SyS_open+0x1d/0x1f
[    6.613637]  [<ffffffff814b4d62>] system_call_fastpath+0x16/0x1b
[    6.613656] Code: 49 bc ff ff ff ff ff ff ff 7f 65 4c 8b 2c 25 80 b8 00 00 53 48 89 fb 48 83 ec 28 48 8b 47 10 48 89 4d c0 48 89 57 10 48 89 45 c8 <48> 89 10 4c 89 6d d0 c6 45 d8 00 eb 05 4d 85 e4 7e 27 49 c7 45
[    6.613658] RIP  [<ffffffff814b34a4>] __down+0x3b/0x8e
[    6.613659]  RSP <ffff8800bfc37b18>
[    6.613659] CR2: 0000000000000000
[    6.613662] ---[ end trace 61a4164cc9fd2e45 ]---


Code:
# emerge --info
Portage 2.2.8-r1 (default/linux/amd64/13.0/desktop/gnome/systemd, gcc-4.7.3, glibc-2.19-r1, 3.14.14-gentoo x86_64)
=================================================================
System uname: Linux-3.14.14-gentoo-x86_64-Intel-R-_Core-TM-_CPU_X_920_@_2.00GHz-with-gentoo-2.2
KiB Mem:     6104616 total,   2953392 free
KiB Swap:    4096568 total,   4096568 free
Timestamp of tree: Mon, 29 Sep 2014 16:30:01 +0000
ld GNU ld (GNU Binutils) 2.23.2
app-shells/bash:          4.2_p48-r1
dev-java/java-config:     2.2.0
dev-lang/python:          2.7.7, 3.3.5-r1
dev-util/cmake:           2.8.12.2-r1
dev-util/pkgconfig:       0.28-r1
sys-apps/baselayout:      2.2
sys-apps/openrc:          0.12.4
sys-apps/sandbox:         2.6-r1
sys-devel/autoconf:       2.13, 2.69
sys-devel/automake:       1.11.6, 1.12.6, 1.13.4
sys-devel/binutils:       2.23.2
sys-devel/gcc:            4.7.3-r1
sys-devel/gcc-config:     1.7.3
sys-devel/libtool:        2.4.2-r1
sys-devel/make:           3.82-r4
sys-kernel/linux-headers: 3.13 (virtual/os-headers)
sys-libs/glibc:           2.19-r1
Repositories: gentoo MyOverlay Pandora
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt /usr/share/maven-bin-2.2/conf /usr/share/maven-bin-3.0/conf"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--autounmask-write"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="fr_FR.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/home/jerome/Documents/PortageOverlay /home/jerome/Documents/MAAT/SourceCode/Sources/maatG/fr.maatg_infrastructure_gentoo_ovelays_pandora"
USE="X a52 aac acl acpi alsa amd64 avahi bash-completion berkdb bluetooth branding bzip2 cairo cdda cdr cli colord cracklib crypt cups cxx dbus dri dts dvd dvdr eds emboss encode evo exif fam firefox flac fortran gdbm gif glamor gnome gnome-keyring gnome-online-accounts gnutls gpm gstreamer gtk iconv icu introspection ipv6 jpeg lcms ldap libnotify libsecret mad mmx mng modules mp3 mp4 mpeg multilib nautilus ncurses networkmanager nls nptl ogg opengl openmp pam pango pcre pdf png policykit ppds pulseaudio qt3support readline samba sdl session socialweb spell sse sse2 ssl startup-notification svg systemd tcpd tiff truetype udev udisks unicode upower usb vim-syntax vorbis wifi wxwidgets x264 xcb xinerama xml xv xvid zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" 
LINGUAS="fr" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_3" QEMU_SOFTMMU_TARGETS="i386 x86_64" QEMU_USER_TARGETS="i386 x86_64" RUBY_TARGETS="ruby19 ruby20" USERLAND="GNU" VIDEO_CARDS="nouveau nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, SYNC, USE_PYTHON


Code:
[I] x11-drivers/nvidia-drivers
     Available versions:  96.43.23^msd 173.14.39^msd 304.123^msd 331.89^msd 334.21-r3^msd 337.25^msd 340.32-r1^msd ~343.13-r1^msd ~343.22^msd ~343.22-r2^msd {+X acpi custom-cflags gtk multilib pax_kernel (+)tools uvm KERNEL="FreeBSD linux"}
     Installed versions:  340.32-r1^msd(11:19:22 26/09/2014)(X acpi multilib tools -pax_kernel -uvm KERNEL="linux -FreeBSD")
     Homepage:            http://www.nvidia.com/
     Description:         NVIDIA Accelerated Graphics Driver


Any help would be greatly appreciated.
Best,

Jerome

Markus09
Tux's lil' helper

Joined: 22 Mar 2013
Posts: 78

Posted: Wed Oct 01, 2014 6:06 pm

Is the nouveau driver present on your system, either as a module or built into the kernel (perhaps left over from a kernel switch)?
I had a similar problem a few weeks ago.
For nvidia cards, either the "nouveau" or the "nvidia" driver can be used, but not both at the same time.
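
A quick way to check this is to look at what is actually loaded or built in. The sketch below is only an illustration: the helper names `loaded_gpu_drivers` and `is_builtin` are made up for this example, and it assumes the usual `lsmod` column layout and a `modules.builtin` file under `/lib/modules/$(uname -r)/`.

```shell
# Print which of the two competing GPU drivers appear in lsmod-style
# output read from stdin (the first column is the module name).
loaded_gpu_drivers() {
    awk '$1 == "nouveau" || $1 == "nvidia" { print $1 }' | sort
}

# Succeed if the named driver is built into the kernel, judged by an
# entry ending in /<name>.ko in a modules.builtin-style file ($2).
is_builtin() {
    grep -q "/$1\.ko\$" "$2"
}

# Typical use on a live system (not run here):
#   lsmod | loaded_gpu_drivers
#   is_builtin nouveau "/lib/modules/$(uname -r)/modules.builtin" && echo "nouveau is built in"
```

If both names are printed, the two drivers are loaded side by side, which is the situation to avoid.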

jrevi
Tux's lil' helper

Joined: 08 Sep 2005
Posts: 129

Posted: Fri Oct 03, 2014 7:18 am

Hi Markus09,

Yes, the nouveau driver is compiled as a kernel module, but I already blacklist it in "/etc/modprobe.d/blacklist.conf".

I'm compiling a new kernel without nouveau to see if it works.

For information, in order to prevent the segfault I have to:
    - boot without the graphical environment
    - blacklist the NVIDIA driver itself in "/etc/modprobe.d/blacklist.conf" (blacklist nvidia, blacklist rivafb, blacklist nvidiafb), which is the weirdest part
    - after that, I can start gdm without problems
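
For reference, the blacklist entries described above would be three separate lines in the file. The sketch below shows them as comments and adds a small hypothetical helper, `list_blacklisted` (not a standard tool), that prints which modules are actively blacklisted in a modprobe.d-style directory. It assumes the files end in `.conf` and hold one `blacklist <module>` entry per line.

```shell
# Entries as described in the post, one per line in blacklist.conf:
#   blacklist nvidia
#   blacklist rivafb
#   blacklist nvidiafb

# List module names that are actively blacklisted in the given directory
# (default: /etc/modprobe.d). Commented-out lines are skipped because
# their first field is "#" or "#blacklist", not "blacklist".
list_blacklisted() {
    dir=${1:-/etc/modprobe.d}
    cat "$dir"/*.conf 2>/dev/null | awk '$1 == "blacklist" { print $2 }' | sort -u
}

# Typical use (not run here):
#   list_blacklisted
```

Note that a `blacklist` line only stops modprobe from loading a module automatically through its aliases; an explicit load by name still works, which would be consistent with the driver loading once gdm starts.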


I will post an update in a couple of minutes on whether it works without nouveau in the kernel.

Best.

jrevi
Tux's lil' helper

Joined: 08 Sep 2005
Posts: 129

Posted: Fri Oct 03, 2014 7:47 am

OK, I rebooted with a kernel built without nouveau and got the same result: if I do not blacklist the nvidia module, it segfaults.

Hu
Moderator

Joined: 06 Mar 2007
Posts: 21498

Posted: Sat Oct 04, 2014 12:30 am

Based on that callstack, the nVidia driver is buggy. You will need to seek help from nVidia, find a version that is not buggy, or permanently blacklist the nVidia driver.
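
To see this in the oops from the first post, one can list the call-trace frames that the kernel tagged with `[nvidia]`: they sit directly above the `down`/`__down` frames where the NULL pointer dereference happened. The helper below is only a sketch; `module_frames` is a made-up name, and it assumes the 3.x-era trace format in which module frames end in `[modulename]`.

```shell
# Print the function names of call-trace frames tagged with the given
# module, reading oops text from stdin.
module_frames() {
    awk -v tag="[$1]" '$NF == tag {
        f = $(NF - 1)          # e.g. "nvidia_open+0x181/0x83b"
        sub(/\+.*/, "", f)     # drop the "+offset/size" suffix
        print f
    }'
}

# Typical use (not run here):
#   dmesg | module_frames nvidia
```

Run against the log at the top of this thread, this prints `nvidia_open` and `nvidia_frontend_open`.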

jrevi
Tux's lil' helper

Joined: 08 Sep 2005
Posts: 129

Posted: Tue Oct 07, 2014 10:44 am

Hu wrote:
Based on that callstack, the nVidia driver is buggy. You will need to seek help from nVidia, find a version that is not buggy, or permanently blacklist the nVidia driver.


Hi,

I think you are right, but I cannot find a working version anymore. I would like to switch to nouveau, but I still have not been able to get X to start with it.
I will investigate a bit more.

Best,
Jerome

jrevi
Tux's lil' helper

Joined: 08 Sep 2005
Posts: 129

Posted: Tue Oct 07, 2014 10:51 am

Just FYI:

Code:
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GT215M [GeForce GTS 360M] [10de:0cb1] (rev a2) (prog-if 00 [VGA controller])
   Subsystem: CLEVO/KAPOK Computer Device [1558:8687]
   Flags: bus master, fast devsel, latency 0, IRQ 50
   Memory at cc000000 (32-bit, non-prefetchable) [size=16M]
   Memory at d0000000 (64-bit, prefetchable) [size=256M]
   Memory at ce000000 (64-bit, prefetchable) [size=32M]
   I/O ports at 2000 [size=128]
   [virtual] Expansion ROM at cd000000 [disabled] [size=512K]
   Capabilities: [60] Power Management version 3
   Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
   Capabilities: [78] Express Endpoint, MSI 00
   Capabilities: [b4] Vendor Specific Information: Len=14 <?>
   Capabilities: [100] Virtual Channel

i92guboj
Bodhisattva

Joined: 30 Nov 2004
Posts: 10315
Location: Córdoba (Spain)

Posted: Tue Oct 07, 2014 11:00 am

There's something odd going on.

First, you say you had both nvidia and nouveau. That's fine, but problems are to be expected UNLESS you blacklist one of the two.

Then, you said you compiled the kernel WITHOUT nouveau, and blacklisted nvidia. Taking into account that you have set VIDEO_CARDS="nouveau nvidia" in your make.conf, that would have left you without access to X at all, since you haven't compiled "vesa" or any other alternative driver.

So either you failed to blacklist "nvidia" OR you still had "nouveau" installed, right? You can check with lsmod.

Now, a few things you might have missed:

  • I have never tried it, but I don't think you can blacklist a driver that has been compiled statically into the kernel. If someone knows for sure, please confirm or deny this. In the meantime, I would compile nouveau as a module rather than into the kernel.
  • the active entry in "eselect opengl list" must match the driver: xorg-x11 for nouveau and nvidia for nvidia. Otherwise, you are screwed.
  • nouveau has a tendency to crash with USE="vdpau vaapi", so if you truly want to get it working (well, somehow), disable those. It's far from being the only problem with nouveau though...
  • MAKE 100% SURE you move xorg.conf or xorg.conf.d/ out of scope when using nouveau, otherwise, you are screwed.
  • when trying nouveau you should use the latest kernel from kernel.org (3.17 as of yesterday), and the xf86-video-nouveau-9999, libdrm-9999 and mesa-9999 from the x11 overlay, that is, if you want to have a minimal chance of them working at all. Bugfixes are rare, but releases are even rarer so...
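
The eselect point above can be checked mechanically. A minimal sketch, assuming `eselect opengl list` prints one implementation per line in plain text and marks the active one with a trailing `*`; the function names `expected_opengl` and `active_opengl` are invented for this example.

```shell
# Map a kernel driver name to the OpenGL implementation it needs.
expected_opengl() {
    case $1 in
        nvidia)  echo nvidia ;;
        nouveau) echo xorg-x11 ;;
        *)       return 1 ;;
    esac
}

# Pick the active entry (the line ending in "*") out of
# `eselect opengl list` output read from stdin.
active_opengl() {
    awk '$NF == "*" { print $2 }'
}

# Typical use (not run here):
#   [ "$(eselect opengl list | active_opengl)" = "$(expected_opengl nouveau)" ] \
#       || echo "OpenGL implementation does not match the driver"
```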

jrevi
Tux's lil' helper

Joined: 08 Sep 2005
Posts: 129

Posted: Tue Oct 07, 2014 11:16 am

Hi,


Quote:
First, you say you had both nvidia and nouveau. That's fine, but problems are to be expected UNLESS you blacklist one of the two.


In the past, I tested nouveau without success, so I left a configuration in place that lets me retest it from time to time.
I usually blacklist nouveau & agpgart when I use nvidia. The nvidia entry is still in that file, but the line is commented out.

Quote:
Then, you said you compiled the kernel WITHOUT nouveau, and blacklisted nvidia. Taking into account that you have set VIDEO_CARDS="nouveau nvidia" in your make.conf, that would have left you without access to X at all, since you haven't compiled "vesa" or any other alternative driver.


This is the point: if I do not blacklist nvidia, it gets loaded at boot time and segfaults. If I blacklist it, it is not loaded at boot time BUT it loads properly when I start gdm, for instance (my xorg.conf explicitly specifies the nvidia driver in this case).

This is my main issue... and I cannot solve it.
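
Two details in the oops from the first post may be relevant to this boot-time versus on-demand difference: the `Comm:` field names the process that hit the fault, and the `Modules linked in:` line shows a flag string after each tainting or in-transition module, where a trailing `+` marks a module that is still being loaded. The sketch below pulls both out of a saved log; `oops_comm` and `module_entry` are hypothetical helper names, and the parsing assumes the log format shown in this thread.

```shell
# Print the name of the process that triggered the oops
# (the field right after "Comm:"), reading the log from stdin.
oops_comm() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "Comm:") { print $(i + 1); exit } }'
}

# Print the entry for the given module from the "Modules linked in:"
# line, including its flags. Prints nothing if the module is absent
# or listed without flags.
module_entry() {
    awk -v prefix="$1(" '/Modules linked in:/ {
        for (i = 1; i <= NF; i++) if (index($i, prefix) == 1) { print $i; exit }
    }'
}

# Typical use (not run here):
#   dmesg | oops_comm
#   dmesg | module_entry nvidia
```

For the log above these give `nvidia-smi` and `nvidia(PO+)`. That might mean something opens the device before the module has finished initialising, but this is only a guess from the log, not a confirmed diagnosis.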


As far as nouveau is concerned, I thought it was in better shape by now... so I would really prefer to solve my nvidia issue.

Best,
Jerome

i92guboj
Bodhisattva

Joined: 30 Nov 2004
Posts: 10315
Location: Córdoba (Spain)

Posted: Tue Oct 07, 2014 11:30 am

Ok, now I got it. Then, it indeed seems like a bug in the binary driver. Sorry to have spammed the thread. ;)

jrevi
Tux's lil' helper

Joined: 08 Sep 2005
Posts: 129

Posted: Wed Oct 08, 2014 7:35 am

Hi all,

I tested a couple of versions and none of them works, so I think the problem is somewhere else, but I don't know where.
I even created an ebuild for the latest official stable nvidia release (340.46).

I'm stuck... :(

Best,
Jerome

i92guboj
Bodhisattva

Joined: 30 Nov 2004
Posts: 10315
Location: Córdoba (Spain)

Posted: Wed Oct 08, 2014 8:35 am

Did you, by any chance, try a newer kernel, such as 3.17? Just a shot in the dark...