mijenix Guru


Joined: 22 Apr 2003 Posts: 393 Location: Switzerland
|
Posted: Sat Jul 01, 2006 10:02 pm Post subject: e1000 only works when a Cable is plugged in |
|
|
Hi
I use kernel 2.6.17.1 and noticed that the internal NIC of my Lenovo T60 only works when a cable is plugged in.
When no cable is plugged in, this message appears:
Quote: |
e1000: 0000:02:00.0: e1000_probe: The EEPROM Checksum Is Not Valid
e1000: probe of 0000:02:00.0 failed with error -5
|
I compiled the driver into the kernel, not as a module.
Has anyone else noticed this? |
|
sundialsvc4 Guru

Joined: 10 Nov 2005 Posts: 436
|
Posted: Sun Jul 02, 2006 12:26 am Post subject: |
|
|
That sounds quite odd. It's good that things work when the cable is plugged in, but that message makes little sense. I'd start by tracing through the source code to see where and why it is actually raised... |
|
phorn Tux's lil' helper

Joined: 01 Jul 2006 Posts: 109
|
Posted: Sun Jul 02, 2006 5:20 am Post subject: |
|
|
I'm having this problem too, with an onboard motherboard e1000, though in my case it happens all the time.
The EEPROM checksum is off by 3 (apparently very little data actually changed).
I submitted a bug report recently, but for now the kernel developer who replied didn't want to add an override option.
The check happens around line 4697 of drivers/net/e1000/e1000_hw.c in your kernel source tree.
The easiest way to avoid the error is to do:
Code: |
...
    if(checksum != (uint16_t) EEPROM_SUM) {
        DEBUGOUT("EEPROM Checksum Invalid\n");
        // return -E1000_ERR_EEPROM;
    }
    return E1000_SUCCESS;
}
|
Add "return E1000_SUCCESS" to the end of the function if it isn't already there, and comment out (with "//") the "return -E1000_ERR_EEPROM" near the end of the function.
Then recompile the kernel, copy the new kernel image into /boot/, and reboot. |
|
mijenix Guru


Joined: 22 Apr 2003 Posts: 393 Location: Switzerland
|
Posted: Sun Jul 02, 2006 12:04 pm Post subject: |
|
|
But that can't be the solution, it sounds like a dirty hack. Why does this error appear when no cable is plugged in?
Is this a normal error, or is the driver broken?
Where did you post the bug about this, Gentoo or the kernel? |
|
phorn Tux's lil' helper

Joined: 01 Jul 2006 Posts: 109
|
Posted: Mon Jul 03, 2006 12:09 am Post subject: |
|
|
Quote: | but that can't be the solution, it sounds like a dirty hack. |
I understand that it is a hack, but I have tried my computer in Windows, and my card is in fact not broken (more probably, Windows just ignores the EEPROM checksum). My e1000 also still works with the hack.
I wish there were a way to fix it, but I'm not the type to want to write to my EEPROM, as that only has the potential to permanently break my card.
In your case, it's worth checking whether this workaround lets things work properly for you. If it breaks your Ethernet connectivity all the time, revert to a backup kernel, but if you do not see any side effects, leave it.
Quote: | Why does this error appear when no cable is plugged in? |
I recommend looking through the ethtool tests after getting your card online. "ethtool -t eth0 offline" will temporarily take your card offline to run tests, including an EEPROM checksum test. (My hack does not affect any of the tests, so the EEPROM test should still report an error.)
"ethtool -e eth0" will print out the card's EEPROM. If you use the kernel patch (even temporarily) to get your card to load, you can try comparing the output of "ethtool -e" before and after plugging a cable in. This may indicate which byte(s) in the EEPROM are the problem, and may give some hints about where to fix it.
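To make that before/after comparison concrete, here is a minimal C sketch. It assumes both "ethtool -e" dumps have already been turned into byte arrays (parsing the text output is left out), and the name eeprom_diff is made up for illustration; it is not part of ethtool or the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Compare two EEPROM dumps of equal length, byte by byte.
 * Prints every offset where they differ and returns the number
 * of differing bytes. (Hypothetical helper, not from ethtool.)
 */
size_t eeprom_diff(const uint8_t *with_link, const uint8_t *no_link, size_t len)
{
    assert(with_link != NULL && no_link != NULL);

    size_t changed = 0;
    for (size_t off = 0; off < len; off++) {
        if (with_link[off] != no_link[off]) {
            printf("0x%04zx: %02x -> %02x\n", off,
                   (unsigned)with_link[off], (unsigned)no_link[off]);
            changed++;
        }
    }
    return changed;
}
```

If only a handful of offsets show up, those are the bytes worth looking at first.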
In my case, in order to still detect any further changes to the EEPROM, I changed the logic in my kernel to accept either 0xBABA (EEPROM_SUM, what the checksum should be) or 0xBABD (which is what my NIC is getting).
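A relaxed check of that kind could look roughly like the sketch below. This is not the actual change made to e1000_hw.c; KNOWN_BAD_SUM, accepted_sums and eeprom_sum_tolerated are hypothetical names, and 0xBABD is simply the value reported for this one NIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EEPROM_SUM    0xBABA  /* value the driver expects */
#define KNOWN_BAD_SUM 0xBABD  /* hypothetical name; value seen on one specific NIC */

/* Sums that are let through. Anything else still counts as an error. */
const uint16_t accepted_sums[] = { EEPROM_SUM, KNOWN_BAD_SUM };

/* Returns true only if the computed sum is one of the accepted values. */
bool eeprom_sum_tolerated(uint16_t checksum, const uint16_t *accepted, size_t n)
{
    assert(accepted != NULL);

    for (size_t i = 0; i < n; i++) {
        if (checksum == accepted[i])
            return true;
    }
    return false;
}
```

Compared with commenting out the error return entirely, an explicit list like this still rejects the EEPROM if its contents drift again.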
Quote: | Is this a normal error, or is the driver broken? |
If you search for the error "The eeprom checksum is not valid" on Google, there are a lot of hits, so it is in fact very common.
Quote: | Where did you post the bug about this, Gentoo or the kernel? |
I submitted a bug report to the kernel hoping that they will add an override option once you have determined that the EEPROM is not broken.
I understand that you probably don't like patching the kernel, but I do not know of a way to fix the EEPROM without permanently changing hardware (I'm a programmer, so I prefer software hacks if possible).
I would be happy if you could find a better solution to this issue, but I tend to just go with whatever works. |
|
techm2 n00b

Joined: 01 May 2006 Posts: 3
|
Posted: Wed Jul 05, 2006 5:33 pm Post subject: |
|
|
The issue with the T60 is interesting: for some reason the EEPROM contents read by the driver differ with link vs. no link. One thing that comes to mind is power saving. Some HW tries to save power when there is no link by shutting down the PHY.
There are 2 issues being discussed here:
1. The one from mijenix is about the driver reporting invalid EEPROM checksum when there is no link.
2. The one from phorn is about a broken EEPROM.
The EEPROM check in the e1000 driver simply compares the checksum value in the EEPROM with the one calculated by the driver (based on the data read by the driver). If they don't match - that means that either the EEPROM was modified improperly (somehow the data was changed without recalculating the checksum) OR there is a HW problem. That is why removing the check in e1000 is generally not a good idea.
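The comparison described above can be sketched in plain C. This assumes the usual e1000 layout, where the 16-bit words 0x00 through 0x3F (the last one being the checksum word) must add up to 0xBABA. The helper names are hypothetical, and the code works on an in-memory copy of the EEPROM instead of reading hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EEPROM_SUM   0xBABA  /* expected total, as named in this thread */
#define EEPROM_WORDS 64      /* assumption: words 0x00..0x3F, last is the checksum word */

/* Add up 16-bit words; the sum wraps around at 16 bits. */
uint16_t eeprom_word_sum(const uint16_t *words, size_t count)
{
    assert(words != NULL);

    uint16_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum = (uint16_t)(sum + words[i]);
    return sum;
}

/* Nonzero if the image (data words plus checksum word) sums to EEPROM_SUM. */
int eeprom_checksum_ok(const uint16_t *words, size_t count)
{
    return eeprom_word_sum(words, count) == (uint16_t)EEPROM_SUM;
}

/* Checksum word that makes the given data words come out to EEPROM_SUM. */
uint16_t eeprom_expected_checksum(const uint16_t *data, size_t data_count)
{
    return (uint16_t)(EEPROM_SUM - eeprom_word_sum(data, data_count));
}
```

Changing any data word without updating the checksum word shifts the total, which is why a sum of 0xBABD instead of 0xBABA points at a small difference in what was read.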
mijenix, have you tried the latest released e1000 driver from e1000.sf.net? I believe it's version 7.1.9. The drivers in the kernel sometimes lag behind in features and support for newer HW. If this doesn't help, try disabling the EEPROM check in e1000 and capture the EEPROM contents in both cases, with and without the cable (for example with ethtool -e, as mentioned earlier in the thread). |
|
troymc Guru

Joined: 22 Mar 2006 Posts: 553
|
Posted: Wed Jul 05, 2006 6:06 pm Post subject: |
|
|
techm2 wrote: | The issue with the T60 is interesting - for some reason the EEPROM read by the driver is different with link vs. no link. One thing that comes to mind is power saving |
The other thing that comes to mind is that they may be checksumming more than the EEPROM.
Maybe the address range being checksummed also includes some internal registers, like the link state register.
troymc |
|
wswartzendruber Veteran


Joined: 23 Mar 2004 Posts: 1261 Location: Idaho, USA
|
Posted: Mon Oct 09, 2006 3:35 am Post subject: |
|
|
Damn! I used to use Gentoo (back when I had broadband), and Ubuntu Dapper Drake's doing the same thing. The only thing we know over there is that it's a power-saving issue, and not a bad EEPROM.
Note that with the T60, hiding the network interface in the BIOS, booting Linux, shutting it down, and then re-enabling the interface in the BIOS will make it work for a short time. I got two reboots out of it, then it started failing again.
_________________
Git has obsoleted SVN.
10mm Auto has obsoleted 45 ACP. |
|
danv n00b

Joined: 02 Jan 2007 Posts: 1
|
Posted: Sat Jan 20, 2007 5:26 pm Post subject: works in suspend2 |
|
|
I get the same error on my T60 using the gentoo-sources 2.6.18-r6 kernel, but the error does not occur with a suspend2-sources kernel. Perhaps one of the patches applied to the suspend2-sources kernel fixes this problem? |
|