Herodot Guru
Joined: 29 Jul 2002 Posts: 429 Location: Professor Xavier's school for gifted youngsters
|
Posted: Wed Aug 21, 2002 11:36 am Post subject: Unresolved symbols in nvidia-kernel |
|
|
Hi all.
Once again a new thread by me. I hope the moderators will let me know if this is a problem.
As I've mentioned in other threads, I have a problem with the nvidia-kernel. When I change parameters and recompile the kernel, I most often can't emerge the nvidia-kernel. These things are certainly no-nos: enabling mtrr, disabling smp, enabling ip tables. Good stuff - I want it!
I'm using the gentoo-kernel. There are no other sources. The /usr/src/linux symlink is correct - I never changed it.
My compile procedure:
cd /usr/src/linux
cp .config ../
make mrproper
(emerge unmerge nvidia-kernel)
cp ../.config ./
make menuconfig
(change settings)
make dep && make clean bzImage modules modules_install
cp /usr/src/linux/arch/i386/boot/bzImage /boot (yes, it's mounted)
This is fine, the kernel compiles, the system can boot with the new kernel. "emerge nvidia-kernel" gives a varying number of unresolved symbols and is therefore not working at all. How can this be corrected? What are unresolved symbols? I am not much of a programmer, but I'm guessing it's some kind of name space violation. Does my USE string have anything to do with this?
Help!
- Herodot |
|
|
|
fidler Apprentice
Joined: 03 Jul 2002 Posts: 162 Location: Utah
|
Posted: Wed Aug 21, 2002 2:31 pm Post subject: |
|
|
I would suggest the following:
1. Save your kernel configuration to an alternate file
2. Run the following before recompiling the kernel:
make mrproper
make distclean
make clean
This usually fixes unresolved symbols. I can't guarantee it, but it usually works. Unfortunately, it also removes your configuration so *make sure* to back up your kernel configuration before you take these steps. |
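A minimal sketch of the backup step, assuming the tree lives in /usr/src/linux and that /root/kernel-configs is an acceptable place for the copy. Both paths and the helper name backup_kernel_config are illustrative assumptions, not something Gentoo provides. The function only copies .config out of the tree; the cleaning commands stay in a comment so nothing destructive runs by accident.

```shell
#!/bin/sh
# backup_kernel_config SRC_TREE BACKUP_DIR
# Hypothetical helper: copy SRC_TREE/.config to BACKUP_DIR/config.backup
# and print the path of the copy. Fails if there is no .config to save.
backup_kernel_config() {
    src_tree=$1      # e.g. /usr/src/linux
    backup_dir=$2    # e.g. /root/kernel-configs (assumed location)
    if [ ! -f "$src_tree/.config" ]; then
        echo "no .config found in $src_tree" >&2
        return 1
    fi
    mkdir -p "$backup_dir" || return 1
    cp "$src_tree/.config" "$backup_dir/config.backup" || return 1
    echo "$backup_dir/config.backup"
}

# Typical use (left as a comment so that sourcing this file changes nothing):
#   saved=$(backup_kernel_config /usr/src/linux /root/kernel-configs) &&
#   cd /usr/src/linux && make mrproper && cp "$saved" .config && make oldconfig
```

Keeping the copy outside /usr/src means it survives even if the whole source tree is removed later.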
|
|
|
Herodot Guru
Joined: 29 Jul 2002 Posts: 429 Location: Professor Xavier's school for gifted youngsters
|
Posted: Wed Aug 21, 2002 5:07 pm Post subject: |
|
|
Sadly, this had no effect.
What is the cause of these problems? Me? Nvidia? The kernel? The Impossible man? Ambush Bug?
Where can I (a pure newbie) read more about what goes on in the compilation process? Also, I don't understand how modules get compiled and how the dependencies work.
- Herodot |
|
|
|
Guest
|
Posted: Wed Aug 21, 2002 6:25 pm Post subject: |
|
|
I am using the gentoo-sources and have an nvidia card as well. I always enable mtrr and ip_tables (I can't remember what smp is set to but I never change that so whatever the default is). I have never had any trouble with emerging the nvidia-kernel with this setup.
Looking at your compile procedure, it is very similar to mine with the exception of two lines:
cp .config ../
cp ../.config ./
I have never done this when compiling the kernel. I don't know if this causes problems, but what happens if you try without? Actually, I have only had to use make mrproper once in my failed attempt to get lm_sensors to work. As far as use variables go, I put everything but the kitchen sink in mine.
I wish I could be of more help. It's confusing to me that you are having problems and I am not since we seem to have similar setups. Perhaps it is related to something else you have enabled in the kernel. I have a very basic desktop setup and so I don't enable a whole lot of options. |
|
|
|
Guest
|
Posted: Wed Aug 21, 2002 6:42 pm Post subject: |
|
|
Herodot wrote: | Sadly, this had no effect.
What is the cause of these problems? Me? Nvidia? The kernel? The Impossible man? Ambush Bug?
Where can I (a pure newbie) read more about what goes on in the compilation process? Also, I don't understand how modules get compiled and how the dependencies work.
- Herodot |
I personally don't think that the nvidia driver (closed source) is compiled against a 2.4.19 kernel. I had it working on my computer sporadically under a 2.4.19 kernel, whereas it seemed to work all the time on a 2.4.18 kernel. Hence I downgraded my kernel.
Also you *should* have your NVdriver in your /etc/modules.autoload. If it isn't it may not resolve the symbols correctly even if you have the appropriate files compiled. |
|
|
|
fidler Apprentice
Joined: 03 Jul 2002 Posts: 162 Location: Utah
|
Posted: Wed Aug 21, 2002 6:46 pm Post subject: |
|
|
Oops that was me. And I don't think that it is compiled against a 2.4.19 kernel, but rather a 2.4.18 kernel.... |
|
|
|
Guest
|
Posted: Wed Aug 21, 2002 6:57 pm Post subject: |
|
|
I had this same problem, and it came down to the fact that the include files in /usr/include/linux and /usr/include/asm were not the same kernel revision as those in /usr/src/linux/include/linux and /usr/src/linux/include/asm. I removed the directories /usr/include/linux and /usr/include/asm. Then I symbolically linked the ones from /usr/src/linux:
Code: |
rm -rf /usr/include/linux
rm -rf /usr/include/asm
ln -sf /usr/src/linux/include/linux /usr/include
ln -sf /usr/src/linux/include/asm /usr/include
|
Then I recompiled nvidia-kernel and alsa-drivers. It worked fine. Remember though, every time you modify and build kernel modules in /usr/src/linux, you will have to reinstall the nvidia and alsa stuff.
Qfingers |
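For anyone who wants to check whether the two header sets really differ before changing anything, here is a read-only sketch. It assumes each tree contains linux/version.h with a `#define UTS_RELEASE` line, which 2.4-era trees normally have once they are configured; header_release is a made-up helper name, not an existing tool.

```shell
#!/bin/sh
# header_release DIR
# Hypothetical helper: print the UTS_RELEASE string from DIR/linux/version.h.
# Assumes the 2.4-style line:  #define UTS_RELEASE "x.y.z"
header_release() {
    version_h="$1/linux/version.h"
    if [ ! -f "$version_h" ]; then
        echo "missing $version_h" >&2
        return 1
    fi
    # Extract only the text between the double quotes on the UTS_RELEASE line.
    sed -n 's/^#define[[:space:]][[:space:]]*UTS_RELEASE[[:space:]][[:space:]]*"\([^"]*\)".*/\1/p' "$version_h"
}

# Example comparison (reads files only, modifies nothing):
#   header_release /usr/include
#   header_release /usr/src/linux/include
```

If the two calls print different versions, the headers disagree; whether to act on that is the subject of the replies that follow.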
|
|
|
Herodot Guru
Joined: 29 Jul 2002 Posts: 429 Location: Professor Xavier's school for gifted youngsters
|
Posted: Wed Aug 21, 2002 10:47 pm Post subject: |
|
|
Good advice from Qfingers and others. Thank you. It didn't work.
I'm beginning to think I'm missing something fairly simple.
Next up is trying to remove all gentoo-sources and nvidia-stuff completely, and then emerging it again. How would I do this? How do I get rid of all modules and other nasty things in the filesystem? Then I would compile a kernel with only very few options enabled, and work my way up. Unfortunately it takes a long time to compile the kernel. When should I reboot? Immediately after "cp bzImage -> /boot" or can I try the troublesome nvidia emerge first?
With smp disabled I can't even compile the kernel!
- Herodot |
|
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Wed Aug 21, 2002 11:06 pm Post subject: |
|
|
Anonymous wrote: | Code: |
rm -rf /usr/include/linux
rm -rf /usr/include/asm
ln -sf /usr/src/linux/include/linux /usr/include
ln -sf /usr/src/linux/include/asm /usr/include
|
|
I'm going to have to disagree here, but I'm perfectly willing to be proven wrong. I remember a big blowout discussion on debian-devel several years ago on this issue, and here's the way I remember it being concluded:
/usr/include/linux and /usr/include/asm should be a stable set of headers. glibc gets built against them. Other programs that then depend on glibc get built against them. If they have changed in the meantime, you can end up with internally incompatible structure definitions and the like. glibc, being built against kernel headers for version A, believes that a "struct foo" is 12 bytes; whereas some poor binary app, being compiled with kernel headers for version B, thinks it's 10 bytes. A library function gets called in libc, gets a pointer to a 10-byte structure, walks off the end, smashes the stack and blammo: all sorts of strange and mysterious crashing ensues. _________________ For every higher wall, there is a taller ladder |
|
|
|
fidler Apprentice
Joined: 03 Jul 2002 Posts: 162 Location: Utah
|
Posted: Wed Aug 21, 2002 11:40 pm Post subject: |
|
|
Herodot wrote: | Good advice from Qfingers and others. Thank you. It didn't work.
I'm beginning to think I'm missing something fairly simple.
Next up is trying to remove all gentoo-sources and nvidia-stuff completely, and then emerging it again. How would I do this? How do I get rid of all modules and other nasty things in the filesystem? Then I would compile a kernel with only very few options enabled, and work my way up. Unfortunately it takes a long time to compile the kernel. When should I reboot? Immediately after "cp bzImage -> /boot" or can I try the troublesome nvidia emerge first?
With smp disabled I can't even compile the kernel!
- Herodot |
As for your questions:
cd /lib/modules
rm -rf *
cd /usr/src/
rm -rf *
emerge nvidia-sources
*or*
emerge redhat-sources
*or*
emerge vanilla-sources
Try a vanilla kernel and add the patches you need manually.
Or, perhaps try a redhat kernel. Everything seems to be compiled against that one.... Perhaps the nvidia-kernel driver is compiled against it as well.
I don't know if you have to reboot before you emerge nvidia-kernel but I would suggest it... |
|
|
|
fidler Apprentice
Joined: 03 Jul 2002 Posts: 162 Location: Utah
|
Posted: Wed Aug 21, 2002 11:49 pm Post subject: |
|
|
rac wrote: | Anonymous wrote: | Code: |
rm -rf /usr/include/linux
rm -rf /usr/include/asm
ln -sf /usr/src/linux/include/linux /usr/include
ln -sf /usr/src/linux/include/asm /usr/include
|
|
I'm going to have to disagree here, but I'm perfectly willing to be proven wrong. I remember a big blowout discussion on debian-devel several years ago on this issue, and here's the way I remember it being concluded:
/usr/include/linux and /usr/include/asm should be a stable set of headers. glibc gets built against them. Other programs that then depend on glibc get built against them. If they have changed in the meantime, you can end up with internally incompatible structure definitions and the like. glibc, being built against kernel headers for version A, believes that a "struct foo" is 12 bytes; whereas some poor binary app, being compiled with kernel headers for version B, thinks it's 10 bytes. A library function gets called in libc, gets a pointer to a 10-byte structure, walks off the end, smashes the stack and blammo: all sorts of strange and mysterious crashing ensues. |
Just a point of order... Because of this, Linus T. suggests that /usr/src/linux, where the kernel that glibc is compiled against is located, should not be a symbolic link but rather an actual directory... (I remember this from my LFS days.) Why isn't this the default for gentoo? |
|
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Thu Aug 22, 2002 12:00 am Post subject: |
|
|
fidler wrote: | Just a point of order... Because of this, Linus T. suggests that /usr/src/linux, where the kernel that glibc is compiled against is located, should not be a symbolic link but rather an actual directory... (I remember this from my LFS days.) Why isn't this the default for gentoo? |
As I see it, the critical thing to get is "headers that are immune from changes in kernel version" in /usr/include.
Back in the day, it was SOP to symlink /usr/include/asm and /usr/include/linux to track the appropriate directories in /usr/src/linux/. It was in that climate that Linus made his plea, and his solution was to freeze /usr/src/linux altogether.
At this point, I jumped on the Debian train, so I will have to defer to you on how LFS handled it, but what Debian did was to say "no symlinks to kernel source in /usr/include; let libc keep its own copies of the kernel headers in there." At this point, /usr/src/linux can change as much as the user wants, without breaking anything. It doesn't even have to exist.
So to summarize, I think we're both remembering the same problem, and our distributions of choice at the time took different routes to solving it.
Gentoo has followed what for lack of a better term I will call the Debian track. The kernel-headers package contains the headers that glibc is built against, and they're the folks in /usr/include. gentoo-sources (or whatever sources you want) goes in /usr/src/linux, and (apart from stuff like the nvidia modules), Gentoo doesn't care much about how you manage your kernel source tree. Your development environment doesn't depend on it.
I sincerely hope the Guest that started this tangent is still attending the party. _________________ For every higher wall, there is a taller ladder |
|
|
|
qfingers n00b
Joined: 20 Aug 2002 Posts: 19
|
Posted: Thu Aug 22, 2002 2:19 am Post subject: |
|
|
You cannot make the system immune to kernel changes because you are building code that gets dynamically linked with the kernel. The symbols must resolve to the running kernel.
The problem I was having is exactly as described. Only when I emerged nvidia-kernel and alsa-driver after symbolically linking the headers of the current running/built kernel did everything run fine.
One solution would be to have two sets of linux kernel includes: one for building kernel modules and one for everyday compilation. But this is a hack.
The best would be to relink glibc with the latest kernel sources. And each time the kernel changes, glibc gets remerged.
Qfingers |
|
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Thu Aug 22, 2002 3:55 am Post subject: |
|
|
qfingers wrote: | You cannot make the system immune to kernel changes because you are building code that gets dynamically linked with the kernel. The symbols must resolve to the running kernel. |
If you are talking specifically about building kernel modules, I agree. If you're including all userspace software, I don't. I think it is possible to shield userland software from changes in the kernel.
Quote: | The problem I was having is exactly as described. Only when I emerged nvidia-kernel and alsa-driver after symbolically linking the headers of the current running/built kernel did everything run fine. |
If this is true, all I can say is that I would consider that a bug in the Makefile or ebuild of those modules. Kernel modules should look for their headers in the source tree that matches the running kernel, I agree. They should not also ask all other userland software to do so, in my opinion.
Quote: | One solution would be to have two sets of linux kernel includes: one for building kernel modules and one for everyday compilation. But this is a hack. |
Well, it seems like a good compromise to me between having unpredictable system instability and having to recompile everything on an entire system every time the kernel changes.
Quote: | The best would be to relink glibc with the latest kernel sources. And each time the kernel changes, glibc gets remerged. |
That's certainly one option. I hope that it is not considered to be "the best" for Gentoo, however, because I would consider it an undue burden to recompile my entire system just to upgrade a kernel.
If you remerged glibc and everything else on your system when you changed the system headers in /usr/include, I agree that there would be no problem.
I was worried that someone would see your post, take your advice, change the system headers so that they didn't match the ones glibc was compiled with, and then proceed to compile software linked against glibc with potential inconsistent type definitions. I can envision lots of system instability that could ensue from doing that. _________________ For every higher wall, there is a taller ladder |
|
|
|
Herodot Guru
Joined: 29 Jul 2002 Posts: 429 Location: Professor Xavier's school for gifted youngsters
|
Posted: Thu Aug 22, 2002 11:25 am Post subject: |
|
|
So, there's a small hope that the fault is with the makefile/ebuild/portage?
I'm not too keen on learning how to patch a 2.4.18 kernel, and it takes away the ease of the portage system ("just emerge anything you want").
I guess I could try a vanilla 2.4.18 unpatched (if such a thing exists), to see if there's a problem there as well. Different kernels have different options and I guess they bring their own menuconfig with those options. But how do patches enter the menuconfig? How do I see which patches the current gentoo kernel has?
Other people don't have these problems on similar setups, which is somewhat strange.
- Herodot |
|
|
|
Guest
|
Posted: Thu Aug 22, 2002 1:13 pm Post subject: |
|
|
rac wrote: | If this is true, all I can say is that I would consider that a bug in the Makefile or ebuild of those modules. Kernel modules should look for their headers in the source tree that matches the running kernel, I agree. They should not also ask all other userland software to do so, in my opinion. |
Agreed. If you're building kernel modules, it should look at /usr/src/linux/include. This would be a good change to kernel modules. I also think kernel management should be integrated into the ebuild scripts. If I want to make changes to the kernel and/or modules and install them, it should have the smarts to rebuild my "external" built modules when building it. Otherwise it should not remove all the files in /lib/modules/versionX.X.X/ when doing a "make modules_install".
I think it could be done with a dependency check using emerge. I'm still learning portage and am getting better. Eventually I'm sure this type of situation will come up, so it should/will be solved.
Qfingers |
|
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Thu Aug 22, 2002 7:59 pm Post subject: |
|
|
Anonymous wrote: | If you're building kernel modules, it should look at /usr/src/linux/include. This would be a good change to kernel modules. |
Have you checked the situation lately? When did you have the problem that required you to fiddle with stuff in /usr/include? The last time I looked at the nvidia-kernel ebuild, all I remember is discovering that it relies on the sources for the running kernel to be in /usr/src/linux, so maybe things have already improved.
Quote: | If I want to make changes to the kernel and/or modules and install them, it should have the smarts to rebuild my "external" built modules when building it. Otherwise it should not remove all the files in /lib/modules/versionX.X.X/ when doing a "make modules_install". |
That's a great suggestion, but I'm not sure leaving cruft in /lib/modules is the answer. Say the user changes something critical that affects the way modules are linked - like that versioning system, and at the same time deletes a module. You end up with a stale module that, if linked, could crash the kernel. Maybe adding a modules_clean target would be a good idea. But we're talking about messing with kernel makefiles now, and that would mean that people that emerge vanilla-sources and people who download from kernel.org would end up with slightly different build procedures, and that could be a support hassle.
I envision something along the lines of the init scripts, where you have a kernel-update program like rc-update that allows you to add and delete external kernel modules from the kernel dependency. Then there could be some kernel building command (like Debian's kernel-package or some such) that builds the kernel along with the extra modules you have specified. How does that sound to you?
We should probably take this discussion out of Newbies and over into Gentoo Chat. We've hijacked Herodot's thread here.
Herodot, in order to diagnose your problem, some concrete error messages would be really helpful. Trying the vanilla-sources might help you, and is usually one of the first things I recommend when people are encountering kernel problems. You can get a list of the gentoo patches in the gentoo-sources ebuild itself, which is located in /usr/portage/sys-kernel/gentoo-sources/. You can see the actual contents of the patches in /usr/portage/distfiles/linux-gentoo-[version].patch.bz2. _________________ For every higher wall, there is a taller ladder |
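One way to collect concrete error messages is to save the build output to a file while it scrolls past. The sketch below assumes nothing beyond a POSIX shell with tee and grep; the helper names run_logged and unresolved_lines and the log path in the example are hypothetical.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: run a command and save everything it prints
# (stdout and stderr) to LOGFILE while still showing it on screen.
# Note: the exit status is that of tee, not of the command itself.
run_logged() {
    logfile=$1
    shift
    "$@" 2>&1 | tee "$logfile"
}

# unresolved_lines LOGFILE
# Print only the lines of LOGFILE that mention unresolved symbols.
unresolved_lines() {
    grep -i 'unresolved' "$1"
}

# Example:
#   run_logged /root/nvidia-kernel.log emerge nvidia-kernel
#   unresolved_lines /root/nvidia-kernel.log
```

The saved log can then be pasted into a reply verbatim instead of being quoted from memory.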
|
|
|
Herodot Guru
Joined: 29 Jul 2002 Posts: 429 Location: Professor Xavier's school for gifted youngsters
|
Posted: Thu Aug 22, 2002 9:42 pm Post subject: |
|
|
Exact error messages.. Well, I didn't write them down, so this is from memory.
disabling smp: the kernel doesn't compile at all.
enable mtrr: The kernel compiles, the system boots. Emerge nvidia-kernel fails with something like "depmod: unresolved symbols in /some/where/video/NVdriver"
enable acpi or ip-tables (and probably others): emerge nvidia-kernel fails with many unresolved symbols. They have better descriptions, but I can't remember them.
So, I guess I'll try the vanilla kernel.
Quote: |
I was worried that someone would see your post, take your advice, change the system headers so that they didn't match the ones glibc was compiled with, and then proceed to compile software linked against glibc with potential inconsistent type definitions. I can envision lots of system instability that could ensue from doing that.
|
Did I do that? I followed the advice to rm and symlink the directories. How do I get back? (actually, instead of rm -rf, I moved the directories to a safe place...). I really, really don't want an unstable system. I've been running this gentoo installation 24/7 for a month, with billions of emerges, edits and tries. Not a single crash, lockup, panic or anything nasty. Sweet.
So, I'll just emerge vanilla-sources, make menuconfig, make dep blahblah? I'll have to live without any patches; it looks to be a little over my level to go that far.
When compiling the kernel, is it important what's running? I usually kill gdm and such. How about loaded modules, should they be unloaded? What about my USE-string, does it matter at this point?
Does /etc/modules.autoload matter when compiling the kernel? "Guest" claims so, but at the very first kernel compile it's obviously empty...
Can I simplify the make/compile directive if I don't have anything marked as <module> in menuconfig?
- Herodot |
|
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Thu Aug 22, 2002 9:52 pm Post subject: |
|
|
Herodot wrote: | Did I do that? I followed the advice to rm and symlink the directories. How do I get back? (actually, instead of rm -rf, I moved the directories to a safe place...). |
If you stashed them somewhere, I would remove those symlinks and replace the contents in /usr/include with those you stashed away. Another way to do it would be to remove the symlinks and then Code: | # emerge linux-headers |
Quote: | When compiling the kernel, is it important what's running? I usually kill gdm and such. How about loaded modules, should they be unloaded? What about my USE-string, does it matter at this point?
Does /etc/modules.autoload matter when compiling the kernel? |
Don't worry about any of that. The only thing that will change is that it may take slightly longer for your compile to complete if your system is otherwise heavily loaded.
Quote: | can I simplify the make/compile directive if I don't have anything marked as <module> in menuconfig? |
Yes. In general, you can skip the modules and modules_install steps completely, and proceed directly to installing your new kernel in /boot if "make bzImage" completes without error. If you are going to install the nvidia driver later (or alsa), however, it would probably be a good idea to include "make modules" and "make modules_install" the first time, just to make sure your /lib/modules directory is free of old garbage. _________________ For every higher wall, there is a taller ladder |
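As a rough way to tell which case applies, the sketch below checks a .config for options set to `m`, which is how modular options are recorded there. The helper name config_has_modules is hypothetical, and the check says nothing about external modules such as the nvidia driver, so the caveat above about running the module targets at least once still stands.

```shell
#!/bin/sh
# config_has_modules FILE
# Hypothetical helper: succeed (exit 0) if any option in a kernel .config
# is set to "m", i.e. selected to be built as a module.
config_has_modules() {
    grep -q '^CONFIG_[A-Za-z0-9_]*=m$' "$1"
}

# Example:
#   if config_has_modules /usr/src/linux/.config; then
#       echo "modular options present: run make modules modules_install"
#   else
#       echo "no modular options: make bzImage alone builds everything selected"
#   fi
```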
|
|
|
Herodot Guru
Joined: 29 Jul 2002 Posts: 429 Location: Professor Xavier's school for gifted youngsters
|
Posted: Fri Aug 23, 2002 1:09 pm Post subject: |
|
|
I tried the vanilla sources. I tried a simple configuration, and it compiled. The nvidia-kernel emerged successfully. But there was a problem with the NVdriver, the module couldn't be loaded. I got the feeling that it was somehow in the wrong folder or something. It was clearly in /lib/modules/2.4.19/video/ but apparently that wasn't good enough.
When I have several sources in /usr/src/ I select them with the /usr/src/linux symlink. How so for /lib/modules/ ?
I'm so glad this is a newbie forum - I can ask all the stupid newbie questions I want!
- Herodot |
|
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Fri Aug 23, 2002 6:48 pm Post subject: |
|
|
Herodot wrote: | I tried the vanilla sources. I tried a simple configuration, and it compiled. The nvidia-kernel emerged successfully. |
The key question here is: was the /usr/src/linux symlink pointing at the vanilla sources for the kernel you are running now when you emerged nvidia-kernel? nvidia-kernel looks in /usr/src/linux.
Quote: | It was clearly in /lib/modules/2.4.19/video/ but apparently that wasn't good enough. |
What kernel are you running now? (uname -a will tell you)
Quote: | When I have several sources in /usr/src/ I select them with the /usr/src/linux symlink. How so for /lib/modules/ ? |
Don't worry about it, and don't move things around in there unless you are sure you know what you are doing. Each kernel looks for its modules in /lib/modules/kernelversion, so you can install two different versions of the kernel side-by-side.
What happens when you try to modprobe NVdriver by hand? _________________ For every higher wall, there is a taller ladder |
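The checks above can be gathered into a small read-only sketch. It assumes modules live under /lib/modules/<kernel version> as described, and that the driver file is called NVdriver or NVdriver.o; find_module is a hypothetical helper name.

```shell
#!/bin/sh
# find_module NAME VERSION [ROOT]
# Hypothetical helper: list files called NAME or NAME.o under ROOT/VERSION.
# ROOT defaults to /lib/modules. Fails if that directory does not exist.
find_module() {
    name=$1
    version=$2
    root=${3:-/lib/modules}
    if [ ! -d "$root/$version" ]; then
        echo "no module directory for kernel $version under $root" >&2
        return 1
    fi
    find "$root/$version" -type f \( -name "$name" -o -name "$name.o" \)
}

# Example:
#   uname -r                             # version of the running kernel
#   ls -l /usr/src/linux                 # where the source symlink points
#   find_module NVdriver "$(uname -r)"   # is the driver installed for it?
```

If the last command prints nothing, the driver was installed for a different kernel version than the one that is running.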
|
|
|
Herodot Guru
Joined: 29 Jul 2002 Posts: 429 Location: Professor Xavier's school for gifted youngsters
|
Posted: Sat Aug 24, 2002 12:24 pm Post subject: |
|
|
What can I say... It suddenly works. I can't think of what I've done different, but...
Status:
Kernel: Vanilla 2.4.19
I can't disable smp, the kernel won't compile without. I want smp out, it might help with the time drift problem I have.
Mtrr. The nvidia-kernel emerges and the module loads fine now. It didn't raise my glxgears performance though.
Acpi. The nvidia-kernel emerges and the module loads fine now. My computer still doesn't shut down though. I'll try apm next.
IP-tables. I first tried to compile ip-tables as modules, but that didn't go over well with the nvidia-kernel. I then compiled everything into the kernel and that went better. When I start firestarter, it still gives some error messages, but I haven't looked into this yet.
So, I now have a vanilla kernel, a system running as before, minus 100 hours of sleep and a lot of new linux knowledge.
Thank you all!
- Herodot |
|