Gentoo Forums
Unresolved symbols in nvidia-kernel

Herodot
Guru

Joined: 29 Jul 2002
Posts: 429
Location: Professor Xavier's school for gifted youngsters

Posted: Wed Aug 21, 2002 11:36 am    Post subject: Unresolved symbols in nvidia-kernel

Hi all.

Once again a new thread by me. I hope the moderators will let me know if this is a problem.

As I've mentioned in other threads, I have a problem with the nvidia-kernel. When I change parameters and recompile the kernel, I most often can't emerge the nvidia-kernel. These things are certainly no-nos: enabling mtrr, disabling smp, enabling ip tables. Good stuff - I want it!

I'm using the gentoo-kernel. There are no other sources. The /usr/src/linux symlink is correct - I never changed it.

My compile procedure:
cd /usr/src/linux
cp .config ../
make mrproper
(emerge unmerge nvidia-kernel)
cp ../.config ./
make menuconfig
(change settings)
make dep && make clean bzImage modules modules_install
cp /usr/src/linux/arch/i386/boot/bzImage /boot (yes, it's mounted)

This is fine, the kernel compiles, the system can boot with the new kernel. "emerge nvidia-kernel" gives a varying number of unresolved symbols and is therefore not working at all. How can this be corrected? What are unresolved symbols? I am not much of a programmer, but I'm guessing it's some kind of name space violation. Does my USE string have anything to do with this?

Help!

- Herodot

fidler
Apprentice

Joined: 03 Jul 2002
Posts: 162
Location: Utah

Posted: Wed Aug 21, 2002 2:31 pm

I would suggest the following:

1. Save your kernel configuration to an alternate file
2. Run the following before recompiling the kernel:

make mrproper
make distclean
make clean

This usually fixes unresolved symbols. I can't guarantee it, but it usually works. Unfortunately, it also removes your configuration so *make sure* to back up your kernel configuration before you take the steps.
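
The backup step could be sketched roughly as follows. This is only an illustration: `backup_kernel_config` is a hypothetical helper name and the backup directory is an assumption; neither is part of Portage or the kernel build system.

```shell
# Sketch: copy a kernel .config out of the source tree before "make mrproper".
# backup_kernel_config is a hypothetical helper, not an existing tool.
backup_kernel_config() {
    src_tree=$1      # kernel source tree, e.g. /usr/src/linux
    backup_dir=$2    # a directory outside the source tree
    if [ ! -f "$src_tree/.config" ]; then
        echo "no .config in $src_tree" >&2
        return 1
    fi
    mkdir -p "$backup_dir" || return 1
    # Timestamped name so earlier backups are not overwritten.
    dest="$backup_dir/config-$(date +%Y%m%d-%H%M%S)"
    cp "$src_tree/.config" "$dest" || return 1
    # Print where the copy went so it can be restored later.
    echo "$dest"
}
```

Something like `backup_kernel_config /usr/src/linux /root/kernel-configs` would print the path of the saved copy, which can be copied back to `/usr/src/linux/.config` after the clean-up targets have run.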

Herodot
Guru

Joined: 29 Jul 2002
Posts: 429
Location: Professor Xavier's school for gifted youngsters

Posted: Wed Aug 21, 2002 5:07 pm

Sadly, this had no effect.

What is the cause of these problems? Me? Nvidia? The kernel? The Impossible man? Ambush Bug?

Where can I (a pure newbie) read more about what goes on in the compilation process? Also, I don't understand how modules get compiled and how the dependencies work.

- Herodot

Guest

Posted: Wed Aug 21, 2002 6:25 pm

I am using the gentoo-sources and have an nvidia card as well. I always enable mtrr and ip_tables (I can't remember what smp is set to but I never change that so whatever the default is). I have never had any trouble with emerging the nvidia-kernel with this setup.

Looking at your compile procedure, it is very similar to mine with the exception of two lines:

cp .config ../
cp ../.config ./

I have never done this when compiling the kernel. I don't know if this causes problems, but what happens if you try without? Actually, I have only had to use make mrproper once in my failed attempt to get lm_sensors to work. As far as use variables go, I put everything but the kitchen sink in mine.

I wish I could be of more help. It's confusing to me that you are having problems and I am not since we seem to have similar setups. Perhaps it is related to something else you have enabled in the kernel. I have a very basic desktop setup and so I don't enable a whole lot of options.

Guest

Posted: Wed Aug 21, 2002 6:42 pm

Herodot wrote:
Sadly, this had no effect.

What is the cause of these problems? Me? Nvidia? The kernel? The Impossible man? Ambush Bug?

Where can I (a pure newbie) read more about what goes on in the compilation process? Also, I don't understand how modules get compiled and how the dependencies work.

- Herodot


I personally don't think that the nvidia driver (closed source) is compiled against a 2.4.19 kernel. I had it working on my computer sporadically under a 2.4.19 kernel, whereas it seemed to work all the time on a 2.4.18 kernel. Hence I downgraded my kernel.

Also you *should* have your NVdriver in your /etc/modules.autoload. If it isn't it may not resolve the symbols correctly even if you have the appropriate files compiled.

fidler
Apprentice

Joined: 03 Jul 2002
Posts: 162
Location: Utah

Posted: Wed Aug 21, 2002 6:46 pm

Oops that was me. And I don't think that it is compiled against a 2.4.19 kernel, but rather a 2.4.18 kernel....

Guest

Posted: Wed Aug 21, 2002 6:57 pm

I had this same problem and it came down to the fact that the include files in /usr/include/linux and /usr/include/asm were not the same kernel revision as in /usr/src/linux/include/linux and /usr/src/linux/include/asm. I removed the directories /usr/include/linux and /usr/include/asm. Then I symbolically linked the ones from /usr/src/linux:

Code:

rm -rf /usr/include/linux
rm -rf /usr/include/asm
ln -sf /usr/src/linux/include/linux /usr/include
ln -sf /usr/src/linux/include/asm /usr/include


Then I recompiled nvidia-kernel and alsa-drivers. It worked fine. Remember though, every time you modify and build kernel modules in /usr/src/linux, you will have to reinstall the nvidia and alsa stuff.

Qfingers
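
Before deleting anything, the mismatch described above can be checked read-only. The sketch below assumes each header tree carries a `linux/version.h` that defines `UTS_RELEASE` (true of 2.4-era trees once the kernel has been configured; other kernel versions may keep it elsewhere). The helper names `header_release` and `same_header_release` are made up for this illustration.

```shell
# Print the UTS_RELEASE string recorded in a version.h file.
# Assumes a line of the form: #define UTS_RELEASE "2.4.19"
header_release() {
    sed -n 's/^#define[[:space:]]*UTS_RELEASE[[:space:]]*"\(.*\)".*/\1/p' "$1"
}

# Succeed only when both include trees report the same, non-empty release.
same_header_release() {
    a=$(header_release "$1/linux/version.h")
    b=$(header_release "$2/linux/version.h")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, `same_header_release /usr/include /usr/src/linux/include || echo "header trees differ"` reports a difference without changing either directory.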

Herodot
Guru

Joined: 29 Jul 2002
Posts: 429
Location: Professor Xavier's school for gifted youngsters

Posted: Wed Aug 21, 2002 10:47 pm

Good advice from Qfingers and others. Thank you. It didn't work.

I'm beginning to think I'm missing something fairly simple.

Next up is trying to remove all gentoo-sources and nvidia-stuff completely, and then emerging it again. How would I do this? How do I get rid of all modules and other nasty things in the filesystem? Then I would compile a kernel with only very few options enabled, and work my way up. Unfortunately it takes a long time to compile the kernel. When should I reboot? Immediately after "cp bzImage -> /boot" or can I try the troublesome nvidia emerge first?

With smp disabled I can't even compile the kernel!


- Herodot

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Wed Aug 21, 2002 11:06 pm

Anonymous wrote:
Code:

rm -rf /usr/include/linux
rm -rf /usr/include/asm
ln -sf /usr/src/linux/include/linux /usr/include
ln -sf /usr/src/linux/include/asm /usr/include

I'm going to have to disagree here, but I'm perfectly willing to be proven wrong. :) I remember a big blowout discussion on debian-devel several years ago on this issue, and here's the way I remember it being concluded:

/usr/include/linux and /usr/include/asm should be a stable set of headers. glibc gets built against them. Other programs that then depend on glibc get built against them. If they have changed in the meantime, you can end up with internally incompatible structure definitions and the like. glibc, being built against kernel headers for version A, believes that a "struct foo" is 12 bytes; whereas some poor binary app, being compiled with kernel headers for version B, thinks it's 10 bytes. A library function gets called in libc, gets a pointer to a 10-byte structure, walks off the end, smashes the stack and blammo: all sorts of strange and mysterious crashing ensues.
_________________
For every higher wall, there is a taller ladder

fidler
Apprentice

Joined: 03 Jul 2002
Posts: 162
Location: Utah

Posted: Wed Aug 21, 2002 11:40 pm

Herodot wrote:
Good advice from Qfingers and others. Thank you. It didn't work.

I'm beginning to think I'm missing something fairly simple.

Next up is trying to remove all gentoo-sources and nvidia-stuff completely, and then emerging it again. How would I do this? How do I get rid of all modules and other nasty things in the filesystem? Then I would compile a kernel with only very few options enabled, and work my way up. Unfortunately it takes a long time to compile the kernel. When should I reboot? Immediately after "cp bzImage -> /boot" or can I try the troublesome nvidia emerge first?

With smp disabled I can't even compile the kernel!


- Herodot


As for your questions:

cd /lib/modules
rm -rf *
cd /usr/src/
rm -rf *
emerge nvidia-sources
*or*
emerge redhat-sources
*or*
emerge vanilla-sources

Try a vanilla kernel and add the patches you need manually.

Or, perhaps try a redhat kernel. Everything seems to be compiled against that one.... Perhaps the nvidia-kernel driver is compiled against it as well.

I don't know if you have to reboot before you emerge nvidia-kernel but I would suggest it...

fidler
Apprentice

Joined: 03 Jul 2002
Posts: 162
Location: Utah

Posted: Wed Aug 21, 2002 11:49 pm

rac wrote:
Anonymous wrote:
Code:

rm -rf /usr/include/linux
rm -rf /usr/include/asm
ln -sf /usr/src/linux/include/linux /usr/include
ln -sf /usr/src/linux/include/asm /usr/include

I'm going to have to disagree here, but I'm perfectly willing to be proven wrong. :) I remember a big blowout discussion on debian-devel several years ago on this issue, and here's the way I remember it being concluded:

/usr/include/linux and /usr/include/asm should be a stable set of headers. glibc gets built against them. Other programs that then depend on glibc get built against them. If they have changed in the meantime, you can end up with internally incompatible structure definitions and the like. glibc, being built against kernel headers for version A, believes that a "struct foo" is 12 bytes; whereas some poor binary app, being compiled with kernel headers for version B, thinks it's 10 bytes. A library function gets called in libc, gets a pointer to a 10-byte structure, walks off the end, smashes the stack and blammo: all sorts of strange and mysterious crashing ensues.


Just a point of order... Because of this, Linus T. suggests that the kernel source glibc was compiled against, located in /usr/src/linux, should not be a symbolic link but rather an actual directory... (I remember this from my LFS days.) Why isn't this the default for gentoo?

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Thu Aug 22, 2002 12:00 am

fidler wrote:
Just a point of order... Because of this, Linus T. suggests that the kernel source glibc was compiled against, located in /usr/src/linux, should not be a symbolic link but rather an actual directory... (I remember this from my LFS days.) Why isn't this the default for gentoo?

As I see it, the critical thing to get is "headers that are immune from changes in kernel version" in /usr/include.

Back in the day, it was SOP to symlink /usr/include/asm and /usr/include/linux to track the appropriate directories in /usr/src/linux/. It was in that climate that Linus made his plea, and his solution was to freeze /usr/src/linux altogether.

At this point, I jumped on the Debian train, so I will have to defer to you on how LFS handled it, but what Debian did was to say "no symlinks to kernel source in /usr/include; let libc keep its own copies of the kernel headers in there." At this point, /usr/src/linux can change as much as the user wants, without breaking anything. It doesn't even have to exist.

So to summarize, I think we're both remembering the same problem, and our distributions of choice at the time took different routes to solving it.

Gentoo has followed what for lack of a better term I will call the Debian track. The kernel-headers package contains the headers that glibc is built against, and they're the folks in /usr/include. gentoo-sources (or whatever sources you want) goes in /usr/src/linux, and (apart from stuff like the nvidia modules), Gentoo doesn't care much about how you manage your kernel source tree. Your development environment doesn't depend on it.
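
Whether a given system follows this layout can be checked by looking at what /usr/include/linux and /usr/include/asm actually are. A minimal sketch, where `describe_header_dir` is a hypothetical helper name and `readlink` is assumed to be available (it ships with GNU coreutils):

```shell
# Report whether a header directory is a real directory, a symlink, or absent.
describe_header_dir() {
    if [ -L "$1" ]; then
        # A symlink here means the headers track some other tree.
        echo "symlink -> $(readlink "$1")"
    elif [ -d "$1" ]; then
        echo "directory"
    else
        echo "missing"
    fi
}
```

Running `describe_header_dir /usr/include/linux` and `describe_header_dir /usr/include/asm` should print `directory` on a system where the libc headers are kept separate from the kernel source tree.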

I sincerely hope the Guest that started this tangent is still attending the party.
_________________
For every higher wall, there is a taller ladder

qfingers
n00b

Joined: 20 Aug 2002
Posts: 19

Posted: Thu Aug 22, 2002 2:19 am

You cannot make the system immune to kernel changes because you are building code that gets dynamically linked with the kernel. The symbols must resolve against the running kernel.

The problem I was having is exactly as described. Only when I emerged nvidia-kernel and alsa-driver after symbolically linking the headers of the currently running/built kernel did everything run fine.

One solution would be to have two sets of linux kernel includes: one for building kernel modules and one for everyday compilation. But this is a hack.

The best would be to relink glibc with the latest kernel sources. And each time the kernel changes, glibc gets remerged.

Qfingers

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Thu Aug 22, 2002 3:55 am

qfingers wrote:
You cannot make the system immune to kernel changes because you are building code that gets dynamically linked with the kernel. The symbols must resolve against the running kernel.

If you are talking specifically about building kernel modules, I agree. If you're including all userspace software, I don't. I think it is possible to shield userland software from kernel changes.

Quote:
The problem I was having is exactly as described. Only when I emerged nvidia-kernel and alsa-driver after symbolically linking the headers of the currently running/built kernel did everything run fine.

If this is true, all I can say is that I would consider that a bug in the Makefile or ebuild of those modules. Kernel modules should look for their headers in the source tree that matches the running kernel, I agree. They should not ask all other userland software to do so as well, in my opinion.

Quote:
One solution would be to have two sets of linux kernel includes: one for building kernel modules and one for everyday compilation. But this is a hack.

Well, it seems like a good compromise to me between having unpredictable system instability and having to recompile everything on an entire system every time the kernel changes.

Quote:
The best would be to relink glibc with the latest kernel sources. And each time the kernel changes, glibc gets remerged.

That's certainly one option. I hope that it is not considered to be "the best" for Gentoo, however, because I would consider it an undue burden to recompile my entire system just to upgrade a kernel.

If you remerged glibc and everything else on your system when you changed the system headers in /usr/include, I agree that there would be no problem.

I was worried that someone would see your post, take your advice, change the system headers so that they didn't match the ones glibc was compiled with, and then proceed to compile software linked against glibc with potential inconsistent type definitions. I can envision lots of system instability that could ensue from doing that.
_________________
For every higher wall, there is a taller ladder

Herodot
Guru

Joined: 29 Jul 2002
Posts: 429
Location: Professor Xavier's school for gifted youngsters

Posted: Thu Aug 22, 2002 11:25 am

So, there's a small hope that the fault is with the makefile/ebuild/portage?

I'm not too keen on learning how to patch a 2.4.18 kernel, and it takes away the ease of the portage system ("just emerge anything you want").

I guess I could try a vanilla 2.4.18 unpatched (if such a thing exists), to see if there's a problem there as well. Different kernels have different options and I guess they bring their own menuconfig with those options. But how do patches enter the menuconfig? How do I see which patches the current gentoo kernel has?

Other people don't have these problems on similar setups, which is somewhat strange.

- Herodot

Guest

Posted: Thu Aug 22, 2002 1:13 pm

rac wrote:
If this is true, all I can say is that I would consider that a bug in the Makefile or ebuild of those modules. Kernel modules should look for their headers in the source tree that matches the running kernel, I agree. They should not ask all other userland software to do so as well, in my opinion.

Agreed. If you're building kernel modules, it should look at /usr/src/linux/include. This would be a good change to kernel modules. I also think kernel management should be included in the ebuild scripts. If I want to make changes to the kernel and/or modules and install them, it should have the smarts to rebuild my "external" built modules when building it. Otherwise it should not remove all the files in /lib/modules/versionX.X.X/ when doing a "make modules_install".
I think it could be done with a dependency check using emerge. I'm still learning portage and am getting better. Eventually I'm sure this type of situation will come up, so it should/will be solved.

Qfingers

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Thu Aug 22, 2002 7:59 pm

Anonymous wrote:
If you're building kernel modules, it should look at /usr/src/linux/include. This would be a good change to kernel modules.

Have you checked the situation lately? When did you have the problem that required you to fiddle with stuff in /usr/include? The last time I looked at the nvidia-kernel ebuild, all I remember is discovering that it relies on the sources for the running kernel to be in /usr/src/linux, so maybe things have already improved.

Quote:
If I want to make changes to the kernel and/or modules and install them, it should have the smarts to rebuild my "external" built modules when building it. Otherwise it should not remove all the files in /lib/modules/versionX.X.X/ when doing a "make modules_install".

That's a great suggestion, but I'm not sure leaving cruft in /lib/modules is the answer. Say the user changes something critical that affects the way modules are linked - like that versioning system, and at the same time deletes a module. You end up with a stale module that, if linked, could crash the kernel. Maybe adding a modules_clean target would be a good idea. But we're talking about messing with kernel makefiles now, and that would mean that people that emerge vanilla-sources and people who download from kernel.org would end up with slightly different build procedures, and that could be a support hassle.

I envision something along the lines of the init scripts, where you have a kernel-update program like rc-update that allows you to add and delete external kernel modules from the kernel dependency. Then there could be some kernel building command (like Debian's kernel-package or some such) that builds the kernel along with the extra modules you have specified. How does that sound to you?

We should probably take this discussion out of Newbies and over into Gentoo Chat. We've hijacked Herodot's thread here.

Herodot, in order to diagnose your problem, some concrete error messages would be really helpful. Trying the vanilla-sources might help you, and is usually one of the first things I recommend when people are encountering kernel problems. You can get a list of the gentoo patches in the gentoo-sources ebuild itself, which is located in /usr/portage/sys-kernel/gentoo-sources/. You can see the actual contents of the patches in /usr/portage/distfiles/linux-gentoo-[version].patch.bz2.
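
To get a rough overview of what such a patch touches without reading all of it, the file names can be pulled out of the unified diff headers. This is a sketch only: `patched_files` is a hypothetical helper, and it assumes the patch is a plain unified diff with `+++ path` header lines.

```shell
# Read a unified diff on stdin and print each file it modifies, once.
# Relies on the "+++ <path>" header lines that unified diffs carry.
patched_files() {
    sed -n 's/^+++ \([^[:space:]]*\).*/\1/p' | sort -u
}
```

Piping the decompressed patch into it, for instance with `bzcat` on the distfile named above (with the real version filled in), lists the affected source files; their directories give a hint about which menuconfig sections the patch set adds to.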
_________________
For every higher wall, there is a taller ladder

Herodot
Guru

Joined: 29 Jul 2002
Posts: 429
Location: Professor Xavier's school for gifted youngsters

Posted: Thu Aug 22, 2002 9:42 pm

Exact error messages.. Well, I didn't write them down, so this is from memory.

disabling smp: the kernel doesn't compile at all.

enable mtrr: The kernel compiles, the system boots. Emerge nvidia-kernel fails with something like "depmod: unresolved symbols in /some/where/video/NVdriver"

enable acpi or ip-tables (and probably others): emerge nvidia-kernel fails with many unresolved symbols. They have better descriptions, but I can't remember them.

So, I guess I'll try the vanilla kernel.

Quote:

I was worried that someone would see your post, take your advice, change the system headers so that they didn't match the ones glibc was compiled with, and then proceed to compile software linked against glibc with potential inconsistent type definitions. I can envision lots of system instability that could ensue from doing that.


Did I do that? I followed the advice to rm and symlink the directories. How do I get back? (Actually, instead of rm -rf, I moved the directories to a safe place...) I really, really don't want an unstable system. I've been running this gentoo installation 24/7 for a month, with billions of emerges, edits and tries. Not a single crash, lockup, panic or anything nasty. Sweet.

So, I'll just emerge vanilla-sources, make menuconfig, make dep blahblah? I'll have to live without any patches; it looks to be a little over my level to go that far.

When compiling the kernel, is it important what's running? I usually kill gdm and such. How about loaded modules, should they be unloaded? What about my USE-string, does it matter at this point?
Does /etc/modules.autoload matter when compiling the kernel? "Guest" claims so, but at the very first kernel compile it's obviously empty...

can I simplify the make/compile directive if I don't have anything marked as <module> in menuconfig?


- Herodot

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Thu Aug 22, 2002 9:52 pm

Herodot wrote:
Did I do that? I followed the advice to rm and symlink the directories. How do I get back? (Actually, instead of rm -rf, I moved the directories to a safe place...)

If you stashed them somewhere, I would remove those symlinks and replace the contents in /usr/include with those you stashed away. Another way to do it would be to remove the symlinks and then
Code:
# emerge linux-headers
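
For the first route (putting the stashed directories back), a minimal sketch might look like this. `restore_header_dir` is a hypothetical helper, and the stash path is whatever location the directories were moved to; neither comes from the thread.

```shell
# Replace a symlinked header directory with a stashed copy.
# Refuses to touch the target unless it is a symlink.
restore_header_dir() {
    stash=$1     # where the original directory was moved to
    target=$2    # e.g. /usr/include/linux
    [ -d "$stash" ] || { echo "stash $stash not found" >&2; return 1; }
    [ -L "$target" ] || { echo "$target is not a symlink" >&2; return 1; }
    # rm without a trailing slash removes the link, not what it points to.
    rm "$target" && mv "$stash" "$target"
}
```

It would be run once per directory, for example once for /usr/include/linux and once for /usr/include/asm, each with its own stashed copy.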

Quote:
When compiling the kernel, is it important what's running? I usually kill gdm and such. How about loaded modules, should they be unloaded? What about my USE-string, does it matter at this point?
Does /etc/modules.autoload matter when compiling the kernel?

Don't worry about any of that. The only thing that will change is that it may take slightly longer for your compile to complete if your system is otherwise heavily loaded.

Quote:
can I simplify the make/compile directive if I don't have anything marked as <module> in menuconfig?

Yes. In general, you can skip the modules and modules_install steps completely, and proceed directly to installing your new kernel in /boot if "make bzImage" completes without error. If you are going to install the nvidia driver later (or alsa), however, it would probably be a good idea to include "make modules" and "make modules_install" the first time, just to make sure your /lib/modules directory is free of old garbage.
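
One way to confirm that nothing is marked as a module is to count the `=m` entries in the saved configuration. A small sketch, with `count_module_options` as a hypothetical helper name:

```shell
# Count options built as modules (CONFIG_FOO=m) in a kernel .config.
count_module_options() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c '^CONFIG_[A-Za-z0-9_]*=m$' "$1" || true
}
```

If `count_module_options /usr/src/linux/.config` prints 0, the kernel itself has no modules to build, although externally built ones such as the nvidia driver still end up under /lib/modules.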
_________________
For every higher wall, there is a taller ladder

Herodot
Guru

Joined: 29 Jul 2002
Posts: 429
Location: Professor Xavier's school for gifted youngsters

Posted: Fri Aug 23, 2002 1:09 pm

I tried the vanilla sources. I tried a simple configuration, and it compiled. The nvidia-kernel emerged successfully. But there was a problem with the NVdriver, the module couldn't be loaded. I got the feeling that it was somehow in the wrong folder or something. It was clearly in /lib/modules/2.4.19/video/ but apparently that wasn't good enough.

When I have several sources in /usr/src/ I select them with the /usr/src/linux symlink. How so for /lib/modules/ ?

I'm so glad this is a newbie forum - I can ask all the stupid newbie questions I want!

- Herodot

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Fri Aug 23, 2002 6:48 pm

Herodot wrote:
I tried the vanilla sources. I tried a simple configuration, and it compiled. The nvidia-kernel emerged successfully.

The key question here is: was the /usr/src/linux symlink pointing at the vanilla sources for the kernel you are running now when you emerged nvidia-kernel? nvidia-kernel looks in /usr/src/linux.

Quote:
It was clearly in /lib/modules/2.4.19/video/ but apparently that wasn't good enough.

What kernel are you running now? (uname -a will tell you)

Quote:
When I have several sources in /usr/src/ I select them with the /usr/src/linux symlink. How so for /lib/modules/ ?

Don't worry about it, and don't move things around in there unless you are sure you know what you are doing. Each kernel looks for its modules in /lib/modules/kernelversion, so you can install two different versions of the kernel side-by-side.

What happens when you try to modprobe NVdriver by hand?
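
Since each kernel only searches its own /lib/modules/kernelversion directory, a quick check is whether the module file exists under the release that is actually running. The sketch below is an illustration under assumptions: `find_module` is a hypothetical helper, and it looks for a file named either `NAME` or `NAME.o`, which matches how 2.4-era modules were typically installed.

```shell
# Look for a module file under <modules_root>/<release>.
# modules_root is normally /lib/modules; it is a parameter so the
# function can also be tried on a scratch directory.
find_module() {
    modules_root=$1
    release=$2
    name=$3
    # Prints nothing if the release directory or the module is absent.
    find "$modules_root/$release" \( -name "$name" -o -name "$name.o" \) -type f 2>/dev/null
}
```

`find_module /lib/modules "$(uname -r)" NVdriver` printing nothing would suggest the driver was installed for a different kernel release than the one currently booted.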
_________________
For every higher wall, there is a taller ladder

Herodot
Guru

Joined: 29 Jul 2002
Posts: 429
Location: Professor Xavier's school for gifted youngsters

Posted: Sat Aug 24, 2002 12:24 pm

What can I say... It suddenly works. I can't think of what I've done different, but...

Status:

Kernel: Vanilla 2.4.19
I can't disable smp, the kernel won't compile without. I want smp out, it might help with the time drift problem I have.
Mtrr. The nvidia-kernel emerges and the module loads fine now. It didn't raise my glxgears performance though.
Acpi. The nvidia-kernel emerges and the module loads fine now. My computer still doesn't shut down though. I'll try apm next.
IP-tables. I first tried to compile ip-tables as modules, but that didn't go over well with the nvidia-kernel. I then compiled everything into the kernel and that went better. When I start firestarter, it still gives some error messages, but I haven't looked into this yet.

So, I now have a vanilla kernel, a system running as before, minus 100 hours of sleep and a lot of new linux knowledge.

Thank you all!

- Herodot