Gentoo Forums
Improving battery life, disabling DGPU - bbswitch, nvidia

Shoaloak
n00b

Joined: 05 Nov 2016
Posts: 48

Posted: Wed May 03, 2017 1:38 pm    Post subject: Improving battery life, disabling DGPU - bbswitch, nvidia

Hello gentoo(wo)men,

I want to disable my dedicated Nvidia GPU (GeForce GTX 965M) when I'm not using it, to improve the battery life of my laptop.

I am currently running bbswitch together with the proprietary nvidia blob which seems to be working.
Code:
$ glxspheres64
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
Visual ID of window: 0xb2
Context is Direct
OpenGL Renderer: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2)
61.603280 frames/sec - 68.749260 Mpixels/sec
$ optirun glxspheres64
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
Visual ID of window: 0x21
Context is Direct
OpenGL Renderer: GeForce GTX 965M/PCIe/SSE2
73.597201 frames/sec - 82.134477 Mpixels/sec


However, at boot I notice this message:
Code:
$ dmesg | grep bbswitch
[   39.704016] bbswitch: version 0.8
[   39.704020] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.GFX0
[   39.704025] bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.PEG0.PEGP
[   39.704132] bbswitch: detected an Optimus _DSM function
[   39.704137] bbswitch: device 0000:01:00.0 is in use by driver 'nvidia', refusing OFF
[   39.704160] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on


The line that caught my attention is
Quote:
device 0000:01:00.0 is in use by driver 'nvidia', refusing OFF
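
The refusal names both the PCI address and the driver bound to the card, so it can be pulled out of a long log directly. A minimal sketch, assuming the message format shown above; the function name `bbswitch_blockers` is made up for this example:

```shell
# Print "<pci-address> <driver>" for every bbswitch refusal found on stdin.
bbswitch_blockers() {
    sed -n "s/.*bbswitch: device \([^ ]*\) is in use by driver '\([^']*\)', refusing OFF.*/\1 \2/p"
}

# On a live system (reading the kernel log may require root):
#   dmesg | bbswitch_blockers
```

An empty result would mean bbswitch did not log a refusal, not necessarily that the card is off.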

I think that the nvidia module is loaded before bbswitch, which is odd since I have this /etc/modprobe.d/bbswitch.conf file:
Code:
blacklist nvidia
blacklist nouveau
options bbswitch load_state=0

I also rebuilt my initramfs using buildkernel, which in turn uses genkernel.
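
One thing worth checking here: a `blacklist` line in modprobe.d only stops a module from being loaded automatically through its hardware aliases. It does not stop anything that requests the module by name, which could explain why nvidia is already bound when bbswitch loads. A small sketch for seeing which nvidia modules are actually loaded; `loaded_nvidia_modules` is a hypothetical helper name, and it only filters `lsmod`-style text:

```shell
# Print the names of loaded modules that start with "nvidia".
# Expects lsmod-style input on stdin (module name in the first column).
loaded_nvidia_modules() {
    awk '$1 ~ /^nvidia/ { print $1 }'
}

# On a live system:
#   lsmod | loaded_nvidia_modules
```

If this prints anything right after boot, something loaded the driver explicitly despite the blacklist.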

Removing the nvidia module myself is also not possible:
Code:
# rmmod nvidia
rmmod: ERROR: Module nvidia is in use by: nvidia_modeset
# rmmod nvidia_modeset
rmmod: ERROR: Module nvidia_modeset is in use by: nvidia_drm
# rmmod nvidia_drm
rmmod: ERROR: Module nvidia_drm is in use
# modprobe -r -i nvidia_drm
modprobe: FATAL: Module nvidia_drm is in use.
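
The errors above spell out the dependency chain: nvidia is held by nvidia_modeset, which is held by nvidia_drm. The last message names no module, which usually means a process (an X server, for instance) still has the device open. A sketch of the unload order, assuming that process has been stopped first; `nvidia_unload_order` is a hypothetical helper, and nvidia_uvm is listed only in case it happens to be loaded:

```shell
# Print the nvidia modules found in /proc/modules-style input on stdin,
# ordered so that dependents come before the modules they depend on.
nvidia_unload_order() {
    input=$(cat)
    for mod in nvidia_drm nvidia_modeset nvidia_uvm nvidia; do
        if printf '%s\n' "$input" | grep -q "^${mod} "; then
            printf '%s\n' "$mod"
        fi
    done
}

# On a live system, as root, after stopping whatever holds nvidia_drm:
#   for m in $(nvidia_unload_order < /proc/modules); do rmmod "$m" || break; done
```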


Does anybody have an idea how I could fix this?
Somebody using Arch mentioned something about modules.
_________________
Happy hacking.
R0b0t1
Apprentice

Joined: 05 Jun 2008
Posts: 264

Posted: Wed May 03, 2017 4:54 pm

Did you follow up on what the bug report says, where the card turns off after being used?

The fix is claimed to be in https://github.com/Bumblebee-Project/Bumblebee/pull/762, however I wasn't able to find anything explaining the startup behavior. When I used Ubuntu a number of years ago I believe I was having the same issue. This would have been right around the time Optimus came out.
Shoaloak
n00b

Joined: 05 Nov 2016
Posts: 48

Posted: Fri May 05, 2017 8:55 am

R0b0t1 wrote:
Did you follow up on what the bug report says, where the card turns off after being used?


This is the point, my card is always on. I can't shut it down.

Code:
# cat /proc/acpi/bbswitch           
0000:01:00.0 ON
# tee /proc/acpi/bbswitch <<<OFF
OFF
# cat /proc/acpi/bbswitch           
0000:01:00.0 ON
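
Note that the `OFF` printed after the `tee` command is only `tee` echoing its input, not a confirmation from bbswitch; the state has to be read back, as done above. A minimal sketch of that read-back, assuming the single-line `<address> <state>` format shown in the output; both helper names are hypothetical:

```shell
# Print the state column (ON or OFF) from a bbswitch status file.
# $1: status file, defaults to /proc/acpi/bbswitch
bbswitch_state() {
    awk '{ print $2; exit }' "${1:-/proc/acpi/bbswitch}"
}

# Succeed only if the card is reported as OFF.
gpu_is_off() {
    [ "$(bbswitch_state "$1")" = "OFF" ]
}

# On a live system, as root:
#   echo OFF > /proc/acpi/bbswitch
#   gpu_is_off || dmesg | tail
```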


R0b0t1 wrote:
The fix is claimed to be in https://github.com/Bumblebee-Project/Bumblebee/pull/762, however I wasn't able to find anything explaining the startup behavior. When I used Ubuntu a number of years ago I believe I was having the same issue. This would have been right around the time Optimus came out.

I've read the bug reports but I can't find anything that helps me solve this problem :(
_________________
Happy hacking.
Holysword
l33t

Joined: 19 Nov 2006
Posts: 946
Location: Greece

Posted: Sun Jul 16, 2017 5:19 pm

Shoaloak wrote:
R0b0t1 wrote:
Did you follow up on what the bug report says, where the card turns off after being used?


This is the point, my card is always on. I can't shut it down.

Code:
# cat /proc/acpi/bbswitch           
0000:01:00.0 ON
# tee /proc/acpi/bbswitch <<<OFF
OFF
# cat /proc/acpi/bbswitch           
0000:01:00.0 ON


R0b0t1 wrote:
The fix is claimed to be in https://github.com/Bumblebee-Project/Bumblebee/pull/762, however I wasn't able to find anything explaining the startup behavior. When I used Ubuntu a number of years ago I believe I was having the same issue. This would have been right around the time Optimus came out.

I've read the bug reports but I can't find anything that helps me solve this problem :(


Are you using bumblebee/primus also? If so, try running primusrun glxgears, and afterwards run dmesg | grep -C 10 bbswitch.

If you see a message like
Code:
pci 0000:01:00.0: Refused to change power state, currently in D0

then welcome to the club. It is a bug with no known workaround, apparently related to a change to the KMS and power management code in kernel 4.4 and later:
https://github.com/Bumblebee-Project/bbswitch/issues/140
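
To check for this message without reading the whole log, the affected PCI addresses can be extracted from dmesg output. This is only a sketch that matches the exact wording quoted above; `refusing_devices` is a made-up name:

```shell
# Print each PCI address (once) that logged
# "Refused to change power state" in the kernel log text on stdin.
refusing_devices() {
    sed -n 's/.*pci \([0-9a-f:.]*\): Refused to change power state.*/\1/p' | sort -u
}

# On a live system, after running primusrun once:
#   dmesg | refusing_devices
```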
_________________
"Nolite arbitrari quia venerim mittere pacem in terram non veni pacem mittere sed gladium" (Yeshua Ha Mashiach)
firasuke
n00b

Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

Posted: Wed Aug 16, 2017 8:25 pm    Post subject: Try the instructions in the following article

I've written an article on my website on how to configure bumblebee on Gentoo Linux, and I'm constantly updating it (I hope to add it to the Gentoo wiki once I confirm it's 100% working). You may want to check it out:

https://www.dotslashlinux.com/2017/06/04/setting-up-bumblebee-on-gentoo-linux/

A couple of users have found it helpful, give it a shot and tell me what went wrong :P

Good Luck
_________________
DOTSLASHLINUX is a GNU/Linux enthusiasts' hub, featuring configuration guides for the linux kernel and several software.
Holysword
l33t

Joined: 19 Nov 2006
Posts: 946
Location: Greece

Posted: Sat Sep 09, 2017 9:47 am    Post subject: Re: Try the instructions in the following article

firasuke wrote:
I've written an article on my website on how to configure bumblebee on Gentoo Linux, and I'm constantly updating it (I hope to add it to the Gentoo wiki once I confirm it's 100% working). You may want to check it out:

https://www.dotslashlinux.com/2017/06/04/setting-up-bumblebee-on-gentoo-linux/

A couple of users have found it helpful, give it a shot and tell me what went wrong :P

Good Luck


Does your guide tackle the issue with bbswitch failing to change the ACPI state of the card?
_________________
"Nolite arbitrari quia venerim mittere pacem in terram non veni pacem mittere sed gladium" (Yeshua Ha Mashiach)
firasuke
n00b

Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

Posted: Sat Sep 09, 2017 12:14 pm    Post subject: Re: Try the instructions in the following article

Holysword wrote:
Does your guide tackle the issue with bbswitch failing to change the ACPI state of the card?


Yes, it does. Check the (optional) USE flags section, I've mentioned similar cases and included 2 workarounds to fix this problem.

Several users have reported that they work. Give them a try and tell me how things go.
_________________
DOTSLASHLINUX is a GNU/Linux enthusiasts' hub, featuring configuration guides for the linux kernel and several software.
Holysword
l33t

Joined: 19 Nov 2006
Posts: 946
Location: Greece

Posted: Sun Sep 10, 2017 5:45 pm    Post subject: Re: Try the instructions in the following article

firasuke wrote:
Yes, it does. Check the (optional) USE flags section, I've mentioned similar cases and included 2 workarounds to fix this problem.

Several users have reported that they work. Give them a try and tell me how things go.


That doesn't help here, sadly. The modules nvidia_modeset and nvidia_uvm do not even exist, but I still fail to turn off the card.
_________________
"Nolite arbitrari quia venerim mittere pacem in terram non veni pacem mittere sed gladium" (Yeshua Ha Mashiach)
firasuke
n00b

Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

Posted: Sun Sep 10, 2017 5:51 pm    Post subject: Re: Try the instructions in the following article

Holysword wrote:
That doesn't help here, sadly. The modules nvidia_modeset and nvidia_uvm do not even exist, but I still fail to turn off the card.


Sorry to hear that. I'd suggest that you go through the guide again, this time in greater detail. Check the comments section as well, as some users have run into a situation similar to yours.

I'd recommend that you double-check the USE flags of your nvidia-drivers and re-emerge them. If possible, also remove the /lib/modules/YOURKERNELVERSION directory, as it might still contain the KMS modules (nvidia_modeset, nvidia_drm) and the UVM module (nvidia_uvm). Removing the USE flags does not necessarily remove modules that were already installed, so you have to make sure manually that every nvidia module other than nvidia.ko is gone.
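
To see which nvidia modules are still installed for a kernel before deleting anything, something like the following may help. It is a sketch: `installed_nvidia_modules` is a made-up name, and the `.ko*` pattern is an assumption meant to also catch compressed modules:

```shell
# List the file names of all nvidia kernel modules below a module tree.
# $1: module directory, defaults to the tree of the running kernel
installed_nvidia_modules() {
    find "${1:-/lib/modules/$(uname -r)}" -name 'nvidia*.ko*' -exec basename {} \; | LC_ALL=C sort
}

# On a live system:
#   installed_nvidia_modules
#   installed_nvidia_modules /lib/modules/YOURKERNELVERSION
```

The pattern is quoted so that the shell passes it to `find` unchanged instead of expanding it against the current directory.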

I'd also recommend that you switch to the live versions (-9999) of bbswitch, bumblebee and primus.

Hopefully, it'll work for you this time!

If the error is still persisting, leave a comment here and on the website as I'm much more active there (with logs if possible).

Best of luck
_________________
DOTSLASHLINUX is a GNU/Linux enthusiasts' hub, featuring configuration guides for the linux kernel and several software.
Holysword
l33t

Joined: 19 Nov 2006
Posts: 946
Location: Greece

Posted: Sun Sep 10, 2017 5:57 pm    Post subject: Re: Try the instructions in the following article

firasuke wrote:
Sorry to hear that. I'd suggest that you go through the guide again, this time in greater detail. Check the comments section as well, as some users have run into a situation similar to yours.

I'd recommend that you double-check the USE flags of your nvidia-drivers and re-emerge them. If possible, also remove the /lib/modules/YOURKERNELVERSION directory, as it might still contain the KMS modules (nvidia_modeset, nvidia_drm) and the UVM module (nvidia_uvm). Removing the USE flags does not necessarily remove modules that were already installed, so you have to make sure manually that every nvidia module other than nvidia.ko is gone.

I'd also recommend that you switch to the live versions (-9999) of bbswitch, bumblebee and primus.

Hopefully, it'll work for you this time!

If the error is still persisting, leave a comment here and on the website as I'm much more active there (with logs if possible).

Best of luck

I've done all those suggestions even before your comment.
It really doesn't work.
There is a thread on GitHub about that. Apparently the people with Lenovo laptops are the lucky ones, because some workarounds work for them.
_________________
"Nolite arbitrari quia venerim mittere pacem in terram non veni pacem mittere sed gladium" (Yeshua Ha Mashiach)
firasuke
n00b

Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

Posted: Sun Sep 10, 2017 6:54 pm    Post subject: Re: Try the instructions in the following article

Holysword wrote:
I've done all those suggestions even before your comment.
It really doesn't work.
There is a thread on GitHub about that. Apparently the people with Lenovo laptops are the lucky ones, because some workarounds work for them.


I'm using a Toshiba laptop, with a super buggy bios and it works fine for me (and it worked fine for several users, some were using Thinkpads others were using ASUS, the list goes on and on).

Your dmesg should only include "nvidia". What does lsmod show? Are you sure only nvidia is being loaded? How about nvidia_drm, this shouldn't be loading if you've followed the guide properly.

Well I really hoped for some logs but if you're sure it's an upstream problem then I do hope that it gets fixed soon.
_________________
DOTSLASHLINUX is a GNU/Linux enthusiasts' hub, featuring configuration guides for the linux kernel and several software.
Holysword
l33t

Joined: 19 Nov 2006
Posts: 946
Location: Greece

Posted: Mon Sep 11, 2017 3:44 pm    Post subject: Re: Try the instructions in the following article

firasuke wrote:
I'm using a Toshiba laptop, with a super buggy bios and it works fine for me (and it worked fine for several users, some were using Thinkpads others were using ASUS, the list goes on and on).

Your dmesg should only include "nvidia". What does lsmod show? Are you sure only nvidia is being loaded? How about nvidia_drm, this shouldn't be loading if you've followed the guide properly.

Well I really hoped for some logs but if you're sure it's an upstream problem then I do hope that it gets fixed soon.


I'm not *sure* that it is due to upstream, but so far I am convinced that it is :P

The module nvidia-drm does not exist here:
Code:
sleipnir ~ # uname -a
Linux sleipnir 4.9.16-gentoo #9 SMP Fri Sep 8 20:27:48 CEST 2017 x86_64 Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz GenuineIntel GNU/Linux
sleipnir ~ # find /lib/modules/4.9.16-gentoo/ -name '*nvidia*.ko'
/lib/modules/4.9.16-gentoo/video/nvidia.ko
sleipnir ~ #


With some combinations of kernel+nvidia-drivers, the fan starts as soon as bumblebee starts and never turns off again. With other combinations, the fan starts only after calling primusrun. Either way, it's not possible to turn off the GPU, and the kernel logs this complaint:
Code:
[  306.471528] pci 0000:01:00.0: Refused to change power state, currently in D0


There are a bunch of other errors and warnings, though, and I'm not quite sure what they mean. You can find my full dmesg here: https://pastebin.com/5BFpGw57
_________________
"Nolite arbitrari quia venerim mittere pacem in terram non veni pacem mittere sed gladium" (Yeshua Ha Mashiach)
firasuke
n00b

Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

Posted: Mon Sep 11, 2017 4:16 pm    Post subject: Re: Try the instructions in the following article

Holysword wrote:
I'm not *sure* that it is due to upstream, but so far I am convinced that it is :P

The module nvidia-drm does not exist here:
Code:
sleipnir ~ # uname -a
Linux sleipnir 4.9.16-gentoo #9 SMP Fri Sep 8 20:27:48 CEST 2017 x86_64 Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz GenuineIntel GNU/Linux
sleipnir ~ # find /lib/modules/4.9.16-gentoo/ -name '*nvidia*.ko'
/lib/modules/4.9.16-gentoo/video/nvidia.ko
sleipnir ~ #


With some combinations of kernel+nvidia-drivers, the fan starts as soon as bumblebee starts and never turns off again. With other combinations, the fan starts only after calling primusrun. Either way, it's not possible to turn off the GPU, and the kernel logs this complaint:
Code:
[  306.471528] pci 0000:01:00.0: Refused to change power state, currently in D0


There are a bunch of other errors and warnings though. I'm not quite sure what they mean, my full dmesg you can find here: https://pastebin.com/5BFpGw57


Looking through your dmesg I found something interesting:

Code:
[  301.189363] vgaarb: this pci device is not a vga device
[  301.199609] vgaarb: this pci device is not a vga device


It should look similar to this:

Code:
[    1.926048] pci 0000:00:02.0: vgaarb: setting as boot VGA device
[    1.926100] pci 0000:00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    1.926185] pci 0000:00:02.0: vgaarb: bridge control possible
[    1.926231] vgaarb: loaded
[    1.986304] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem


Can you check your kernel configuration and see whether these are set:
Code:
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=2
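
One way to check both options without opening menuconfig is to grep the kernel configuration file. A sketch, assuming the kernel sources are symlinked at /usr/src/linux as is usual on Gentoo; `vga_arb_settings` is a hypothetical helper name:

```shell
# Print the VGA arbitration settings from a kernel .config file.
# $1: path to the config, defaults to /usr/src/linux/.config
vga_arb_settings() {
    grep -E '^CONFIG_VGA_ARB(_MAX_GPUS)?=' "${1:-/usr/src/linux/.config}"
}

# On a live system:
#   vga_arb_settings
# If the running kernel was built with CONFIG_IKCONFIG_PROC, its own
# configuration can be read instead:
#   zgrep -E '^CONFIG_VGA_ARB' /proc/config.gz
```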

_________________
DOTSLASHLINUX is a GNU/Linux enthusiasts' hub, featuring configuration guides for the linux kernel and several software.
Holysword
l33t

Joined: 19 Nov 2006
Posts: 946
Location: Greece

Posted: Mon Sep 11, 2017 4:34 pm    Post subject: Re: Try the instructions in the following article

firasuke wrote:
Looking through your dmesg I found something interesting:

Code:
[  301.189363] vgaarb: this pci device is not a vga device
[  301.199609] vgaarb: this pci device is not a vga device


It should look similar to this:

Code:
[    1.926048] pci 0000:00:02.0: vgaarb: setting as boot VGA device
[    1.926100] pci 0000:00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    1.926185] pci 0000:00:02.0: vgaarb: bridge control possible
[    1.926231] vgaarb: loaded
[    1.986304] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem


Can you check your kernel configuration and see whether these are set:
Code:
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=2


It looks like this:
Code:
Symbol: VGA_ARB [=y]                                                         
Type  : boolean                                                               
Prompt: VGA Arbitration                                                       
  Location:                                                                   
    -> Device Drivers                                                         
(1)   -> Graphics support                                                     
  Defined at drivers/gpu/vga/Kconfig:1                                       
  Depends on: HAS_IOMEM [=y] && PCI [=y] && !S390                             
  Selected by: VGA_SWITCHEROO [=n] && HAS_IOMEM [=y] && X86 [=y] && ACPI [=y]
                                                                             
                                                                             
Symbol: VGA_ARB_MAX_GPUS [=16]                                               
Type  : integer                                                               
Prompt: Maximum number of GPUs                                               
  Location:                                                                   
    -> Device Drivers                                                         
      -> Graphics support                                                     
(2)     -> VGA Arbitration (VGA_ARB [=y])                                     
  Defined at drivers/gpu/vga/Kconfig:12                                       
  Depends on: HAS_IOMEM [=y] && VGA_ARB [=y]

I dunno where the 16 came from, but if you think it could be the source of the problem I have no objections to testing it.
_________________
"Nolite arbitrari quia venerim mittere pacem in terram non veni pacem mittere sed gladium" (Yeshua Ha Mashiach)
firasuke
n00b

Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

Posted: Mon Sep 11, 2017 4:57 pm    Post subject: Re: Try the instructions in the following article

Holysword wrote:
It looks like this:
Code:
Symbol: VGA_ARB [=y]                                                         
Type  : boolean                                                               
Prompt: VGA Arbitration                                                       
  Location:                                                                   
    -> Device Drivers                                                         
(1)   -> Graphics support                                                     
  Defined at drivers/gpu/vga/Kconfig:1                                       
  Depends on: HAS_IOMEM [=y] && PCI [=y] && !S390                             
  Selected by: VGA_SWITCHEROO [=n] && HAS_IOMEM [=y] && X86 [=y] && ACPI [=y]
                                                                             
                                                                             
Symbol: VGA_ARB_MAX_GPUS [=16]                                               
Type  : integer                                                               
Prompt: Maximum number of GPUs                                               
  Location:                                                                   
    -> Device Drivers                                                         
      -> Graphics support                                                     
(2)     -> VGA Arbitration (VGA_ARB [=y])                                     
  Defined at drivers/gpu/vga/Kconfig:12                                       
  Depends on: HAS_IOMEM [=y] && VGA_ARB [=y]

I dunno where the 16 came from, but if you think it could be the source of the problem I have no objections to testing it.


Setting CONFIG_VGA_ARB_MAX_GPUS to 16 is overkill; on an Optimus laptop it should be set to 2 (that won't solve your current problem, though).

Looking through the web, this looks like an older bug that was reportedly fixed in 3.10 (kernel Bugzilla 63641, Bumblebee GitHub issue #159).

I found a couple of workarounds. One of them is changing the BusID of your nvidia card in /etc/bumblebee/xorg.conf.nvidia (personally I don't think it has anything to do with the kernel bug above, but give it a try).

The other workaround that should work is patching vgaarb to allow it to detect the 3d controller (which is confusing as it was fixed according to the bugzilla link above). You can find the patch file here (vgaarb patch).

Then you apply the patch to your kernel:
Code:
patch -Np1 -i patch_file.patch
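
Since the patch was written for a much older kernel, it may be worth testing whether it still applies before changing the source tree. A sketch using GNU patch's `--dry-run` option; `apply_patch_checked` is a hypothetical wrapper, and the patch path must be absolute because the function changes directory:

```shell
# Apply a patch only if every hunk applies cleanly.
# $1: source directory, $2: absolute path to the patch file
apply_patch_checked() {
    src_dir=$1
    patch_file=$2
    # --dry-run reports what would happen without modifying any file.
    ( cd "$src_dir" && patch -Np1 --dry-run -i "$patch_file" >/dev/null ) || {
        echo "patch does not apply cleanly; nothing was changed" >&2
        return 1
    }
    ( cd "$src_dir" && patch -Np1 -i "$patch_file" )
}

# Example, run as root:
#   apply_patch_checked /usr/src/linux /root/patch_file.patch
```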


Hopefully, it should work if you applied the patch correctly. Keep me updated!
_________________
DOTSLASHLINUX is a GNU/Linux enthusiasts' hub, featuring configuration guides for the linux kernel and several software.
Holysword
l33t

Joined: 19 Nov 2006
Posts: 946
Location: Greece

Posted: Mon Sep 11, 2017 5:31 pm    Post subject: Re: Try the instructions in the following article

firasuke wrote:
Setting CONFIG_VGA_ARB_MAX_GPUS to 16 is overkill; on an Optimus laptop it should be set to 2 (that won't solve your current problem, though).

Looking through the web, this looks like an older bug that was reportedly fixed in 3.10 (kernel Bugzilla 63641, Bumblebee GitHub issue #159).

I found a couple of workarounds. One of them is changing the BusID of your nvidia card in /etc/bumblebee/xorg.conf.nvidia (personally I don't think it has anything to do with the kernel bug above, but give it a try).

The other workaround that should work is patching vgaarb to allow it to detect the 3d controller (which is confusing as it was fixed according to the bugzilla link above). You can find the patch file here (vgaarb patch).


16 is most likely the default.

This patch is from 2012, kernel-4.9 is from 2016. Is there any reason why it was not incorporated into the master version?
_________________
"Nolite arbitrari quia venerim mittere pacem in terram non veni pacem mittere sed gladium" (Yeshua Ha Mashiach)
firasuke
n00b

Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

Posted: Mon Sep 11, 2017 5:39 pm    Post subject: Re: Try the instructions in the following article

Holysword wrote:
16 is most likely the default.

This patch is from 2012, kernel-4.9 is from 2016. Is there any reason why it was not incorporated into the master version?


Yes 16 is the default value.

Ikr, that's why I was confused... According to the bugzilla link, it was fixed in 3.10 which is really weird...

It won't hurt if you give it a try. Lemme know what happens after applying it, I'm curious :D :D
_________________
DOTSLASHLINUX is a GNU/Linux enthusiasts' hub, featuring configuration guides for the linux kernel and several software.
Holysword
l33t

Joined: 19 Nov 2006
Posts: 946
Location: Greece

Posted: Mon Sep 11, 2017 6:10 pm    Post subject: Re: Try the instructions in the following article

firasuke wrote:
Yes 16 is the default value.

Ikr, that's why I was confused... According to the bugzilla link, it was fixed in 3.10 which is really weird...

It won't hurt if you give it a try. Lemme know what happens after applying it, I'm curious :D :D


It does get rid of that error. However, it still cannot change the ACPI state of the card. This is the new dmesg: https://pastebin.com/6nKSHWMT

I don't know what this new line means, though:
Code:
[  106.141685] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none

_________________
"Nolite arbitrari quia venerim mittere pacem in terram non veni pacem mittere sed gladium" (Yeshua Ha Mashiach)
firasuke
n00b

Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

Posted: Mon Sep 11, 2017 6:34 pm

OK, I brought you some workarounds. Let's see how many workarounds are needed to fix this thing :lol:

Try adding
Code:
acpi_osi=!Windows\x202013 acpi_osi=Linux nogpumanager

to your kernel command line, then regenerate your bootloader configuration so that the change takes effect.
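
After rebooting, /proc/cmdline shows what the kernel was actually booted with, so it is possible to confirm that the new parameters arrived. A minimal sketch; `cmdline_has` is a hypothetical helper that looks for one exact, space-separated parameter:

```shell
# Succeed if the exact parameter $1 appears on the kernel command line.
# $2: file to read, defaults to /proc/cmdline
cmdline_has() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# On a live system:
#   cmdline_has acpi_osi=Linux && echo "parameter is active"
```

Matching whole parameters avoids a false positive from a partial match such as `acpi_osi` alone.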

If you're using tlp or powertop or laptop-mode-tools, try disabling them for now or even uninstall them as they may interfere with bbswitch.

Another workaround is to play with your IOMMU kernel settings (although I personally see no benefit whatsoever from doing this but you can give it a try).

It looks like kernel commit 1cc0c998fdf2cb665d625fb565a0d6db5c81c639 was the root of all these problems.

Good Luck :D
_________________
DOTSLASHLINUX is a GNU/Linux enthusiasts' hub, featuring configuration guides for the linux kernel and several software.


Last edited by firasuke on Mon Sep 11, 2017 6:55 pm; edited 1 time in total
Holysword
l33t


Joined: 19 Nov 2006
Posts: 946
Location: Greece

PostPosted: Mon Sep 11, 2017 6:50 pm    Post subject: Reply with quote

firasuke wrote:
Ok brought you some workarounds, let's see how many workarounds is needed to fix this thing :lol:

Try adding
Code:
"acpi_osi=!Windows\x202013" acpi_osi=Linux nogpumanager"

to your kernel command line and notify your bootloader about these changes.

If you're using tlp or powertop or laptop-mode-tools, try disabling them for now or even uninstall them as they may interfere with bbswitch.

Another workaround is to play with your IOMMU kernel settings (although I personally see no benefit whatsoever from doing this but you can give it a try).

It looks like kernel commit 1cc0c998fdf2cb665d625fb565a0d6db5c81c639 was the root of all these problems.

Good Luck :D


Is the second part of your command line correct? Would it recognise "Linux nogpumanager" with the space and all?

None of the mentioned tools are installed here.
firasuke
n00b


Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

PostPosted: Mon Sep 11, 2017 6:56 pm    Post subject: Reply with quote

Holysword wrote:
firasuke wrote:
Ok brought you some workarounds, let's see how many workarounds is needed to fix this thing :lol:

Try adding
Code:
"acpi_osi=!Windows\x202013" acpi_osi=Linux nogpumanager"

to your kernel command line and notify your bootloader about these changes.

If you're using tlp or powertop or laptop-mode-tools, try disabling them for now or even uninstall them as they may interfere with bbswitch.

Another workaround is to play with your IOMMU kernel settings (although I personally see no benefit whatsoever from doing this but you can give it a try).

It looks like kernel commit 1cc0c998fdf2cb665d625fb565a0d6db5c81c639 was the root of all these problems.

Good Luck :D


Is the second part of your command line correct? would it recognise "Linux nogpumanager" with space and all?

None of the mentioned tools are installed here.


I removed the outer quotes since they were ambiguous (I had presumed that you would put the string inside GRUB_CMDLINE_LINUX="HERE"). Add this to your kernel command line (boot parameters), keeping the quotes shown here. You can use ' (single quotation marks) instead of " (double) if your boot params line already uses " (double), like GRUB's:

Code:
"acpi_osi=!Windows\x202013" acpi_osi=Linux nogpumanager


I noticed something else earlier: you're using systemd. The guide on my website includes steps that modify bumblebee's service script, removing a couple of lines (the ones that check whether xorg is installed).

Did you make sure that you did the equivalent thing for systemd?
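
On the quoting question above, here is a minimal sketch, assuming GRUB 2 and /etc/default/grub (the thread does not confirm which bootloader is in use). grub-mkconfig reads that file as a shell script, so wrapping the whole value in single quotes is one way to keep the inner double quotes literal. The temporary file stands in for /etc/default/grub, and grub_cmdline_from is a hypothetical helper, not part of GRUB.

```shell
#!/bin/sh
# grub_cmdline_from FILE: print the value GRUB_CMDLINE_LINUX takes when FILE is sourced.
# Hypothetical helper; the subshell body keeps the variable out of the caller's environment.
grub_cmdline_from() (
    . "$1"
    printf '%s\n' "$GRUB_CMDLINE_LINUX"
)

# Stand-in for /etc/default/grub (assumption: GRUB 2); the real file is not touched.
demo_cfg=$(mktemp)
cat > "$demo_cfg" <<'EOF'
GRUB_CMDLINE_LINUX='"acpi_osi=!Windows\x202013" acpi_osi=Linux nogpumanager'
EOF

# Outer single quotes keep the inner double quotes and the backslash as written.
result=$(grub_cmdline_from "$demo_cfg")
printf '%s\n' "$result"

rm -f "$demo_cfg"
```

On a real system you would then regenerate the config (for example grub-mkconfig -o /boot/grub/grub.cfg), reboot, and compare against cat /proc/cmdline.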


Last edited by firasuke on Mon Sep 11, 2017 7:26 pm; edited 2 times in total
Holysword
l33t


Joined: 19 Nov 2006
Posts: 946
Location: Greece

PostPosted: Mon Sep 11, 2017 7:07 pm    Post subject: Reply with quote

firasuke wrote:
Removed the outside quotes since they were ambiguous (I presumed that you should put them in GRUB_CMDLINE_LINUX="HERE"), add this to your kernel command line (boot parameters) with the quotes listed here and everything you can use ' (single quotation marks) instead of " (double) if your boot params line uses " (double) like grub's :

Code:
"acpi_osi=!Windows\x202013" acpi_osi=Linux nogpumanager


Nothing.

Code:
† sleipnir † ~ $  cat /proc/cmdline
root=/dev/sda2 rootfstype=ext4 init=/usr/lib/systemd/systemd acpi_osi=!Windows\x202013 acpi_osi=Linux nogpumanager

firasuke
n00b


Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

PostPosted: Mon Sep 11, 2017 7:14 pm    Post subject: Reply with quote

This is getting really interesting. If you don't mind, can you share some more logs?

Let's start with:

1- lspci -nnkkvvv

2- lsmod

3- your kernel's .config file (if possible)

4- rc-update (if possible)

5- /var/log/Xorg.0.log

6- /var/log/rc.log (if any/ if possible)
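
If it helps, the six items above could be gathered in one go with something like the sketch below. collect_diag is a hypothetical helper, and it assumes the kernel was built with CONFIG_IKCONFIG_PROC so that /proc/config.gz exists; anything that is missing is simply skipped.

```shell
#!/bin/sh
# collect_diag OUTDIR: save the requested diagnostics into OUTDIR.
# Hypothetical helper; missing tools or files are skipped rather than treated as errors.
collect_diag() {
    out=$1
    mkdir -p "$out" || return 1
    lspci -nnk -vvv > "$out/lspci.txt" 2>&1 || true
    lsmod > "$out/lsmod.txt" 2>&1 || true
    # Assumes CONFIG_IKCONFIG_PROC; otherwise copy .config from the kernel source tree.
    zcat /proc/config.gz > "$out/kernel-config" 2>/dev/null || true
    # OpenRC only; a systemd machine has no rc-update, so this file just records the error.
    rc-update show > "$out/rc-update.txt" 2>&1 || true
    for f in /var/log/Xorg.0.log /var/log/rc.log; do
        if [ -r "$f" ]; then
            cp "$f" "$out/" || true
        fi
    done
    return 0
}
```

Usage would be something like collect_diag "$HOME/bbswitch-diag", run as root so that lspci can read the full capability list.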

I updated my previous reply: my guide includes steps for modifying bumblebee's service script for OpenRC to get it working. Did you do the equivalent for systemd?

Can you manually turn the card off? Or unload the nvidia module?

Code:
modprobe -r nvidia && echo "OFF" >> /proc/acpi/bbswitch
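
To see which half of that command fails, it may help to check the state before and after. A rough sketch follows; both function names are hypothetical. It relies on /proc/acpi/bbswitch holding a single line such as "0000:01:00.0 ON" and on /proc/modules being the list that lsmod formats.

```shell
#!/bin/sh
# bbswitch_state [FILE]: print ON or OFF as reported by bbswitch.
# FILE defaults to /proc/acpi/bbswitch; the optional argument exists only for dry runs.
bbswitch_state() {
    awk '{ print $2; exit }' "${1:-/proc/acpi/bbswitch}"
}

# nvidia_loaded [FILE]: succeed if a module named exactly "nvidia" is in the module list.
# FILE defaults to /proc/modules; nvidia_drm, nvidia_uvm and friends do not match.
nvidia_loaded() {
    grep -q '^nvidia ' "${1:-/proc/modules}"
}
```

For example, as root: nvidia_loaded && echo "nvidia is still loaded"; bbswitch_state. If nvidia is still loaded after modprobe -r, then bbswitch will keep refusing OFF, as in the dmesg line from the first post.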


Did you try using different kernel versions? Different patchsets?

Can you try adding this to your kernel command line:

Code:
i915.enable_hd_vgaarb=1 enable_hd_vgaarb=1


Hopefully, we'll get this solved...
firasuke
n00b


Joined: 19 Sep 2016
Posts: 26
Location: Aleppo, Syria

PostPosted: Sun Sep 17, 2017 1:22 pm    Post subject: Reply with quote

OK, I've been trying to recreate the problem that you have, and I came upon one important file: /etc/modprobe.d/nvidia-rmmod.conf. With uvm and kms already disabled on my system, this file looks like this:
Code:
# Nvidia UVM support

remove nvidia modprobe -r --ignore-remove nvidia-drm nvidia-modeset nvidia-uvm nvidia


Edit that line so it no longer lists any module other than nvidia. The file should end up looking like this:

Code:
remove nvidia modprobe -r --ignore-remove nvidia
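
If you'd rather preview the change than edit by hand, something like this sketch would do it. strip_extra_nvidia_modules is a hypothetical helper; it only prints the result and leaves the file itself alone.

```shell
#!/bin/sh
# strip_extra_nvidia_modules FILE: print FILE with the "remove nvidia" line reduced
# to unloading only the nvidia module. Writes to stdout; FILE is not modified.
strip_extra_nvidia_modules() {
    sed 's/^\(remove nvidia modprobe -r --ignore-remove\) .*$/\1 nvidia/' "$1"
}
```

Run it as strip_extra_nvidia_modules /etc/modprobe.d/nvidia-rmmod.conf, check the output, and only then write it back to the file.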


Let me know if it worked for you!
Holysword
l33t


Joined: 19 Nov 2006
Posts: 946
Location: Greece

PostPosted: Mon Sep 18, 2017 7:37 pm    Post subject: Reply with quote

firasuke wrote:
OK, I've been trying to recreate the problem that you have, and I came upon one important file: /etc/modprobe.d/nvidia-rmmod.conf. With uvm and kms already disabled on my system, this file looks like this:
Code:
# Nvidia UVM support

remove nvidia modprobe -r --ignore-remove nvidia-drm nvidia-modeset nvidia-uvm nvidia


Just make sure that you remove every other module except for nvidia, so the end file should look like this:

Code:
remove nvidia modprobe -r --ignore-remove nvidia


Let me know if it worked for you!

Those modules do not exist on my machine.

I am starting a fresh new Gentoo Installation to see if it fixes the problem. It will take a while.

EDIT#1: It does not.
I'll stick to some other Distro until Gentoo gets usable again.
All times are GMT