Gentoo Forums
Can't mount usb drive: unable to enumerate

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Wed Aug 12, 2009 10:49 pm    Post subject: Can't mount usb drive: unable to enumerate

Hi all,

I have a portable USB drive that no longer mounts, though it was working fine fairly recently. I seriously doubt it's the hardware, mainly because this problem seems to be going around a lot.

Anyway, when I type in dmesg I get the following:

Code:
usb 1-6: new high speed USB device using ehci_hcd and address 96
hub 1-0:1.0: unable to enumerate USB device on port 6


The second line is repeated btw.

I have checked through the forums and my kernel is properly configured. I hear that it has something to do with the order in which EHCI and something else are loaded up. Has there been a satisfactory solution to this yet, and, if so, what is it?

Alex


Last edited by evoweiss on Wed Jul 17, 2013 5:38 pm; edited 4 times in total

stan666
Apprentice
Joined: 25 Jun 2007
Posts: 165
Location: Germany
Posted: Sat Aug 15, 2009 5:40 pm

I had similar problems trying to mount the memory stick of my cellphone
Code:

update-usbids

helped me out
_________________
BOFH Excuse #450:
Terrorists crashed an airplane into the server room, have to remove /bin/laden. (rm -rf /bin/laden)

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Wed Sep 30, 2009 6:20 pm

stan666 wrote:
I had similar problems trying to mount the memory stick of my cellphone
Code:

update-usbids

helped me out


Hi, the problem went away and returned, though using update-usbids did not help either this time or last. It's really frustrating and I'm just wondering what the cause is and how to fix the damn thing. The USB drive is fairly new, too.

Best,

Alex

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Sun Oct 18, 2009 5:58 pm

Bump... all was well until I recently rebooted. From googling around, I think the problem is likely in udev or the kernel. Any thoughts from anybody would be appreciated.

Alex

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Mon Oct 19, 2009 7:28 am

Hi all,

Again, after waiting for a pretty good while, the problem seemed to resolve itself. I'm wondering if this has something to do with the drive going into suspend mode before being unmounted or, again, some udev configuration problem. Any help/tips would be appreciated.

Here's what finally showed up after all the enumerate problems.

Code:

usb 4-1: new full speed USB device using uhci_hcd and address 2
usb 4-1: not running at top speed; connect to a high speed hub
usb 4-1: configuration #1 chosen from 1 choice
scsi0 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
usb 4-1: reset full speed USB device using uhci_hcd and address 2
scsi 0:0:0:0: Direct-Access     WDC WD10 EAVS-00D7B0      01.0 PQ: 0 ANSI: 0
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
usb-storage: device scan complete
sd 0:0:0:0: [sda] Assuming drive cache: write through
 sda: sda1
sd 0:0:0:0: [sda] Attached SCSI disk
EXT4-fs: barriers enabled
kjournald2 starting: pid 6710, dev sda1:8, commit interval 5 seconds
EXT4 FS on sda1, internal journal on sda1:8
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs: mounted filesystem sda1 with ordered data mode


Note the bit about not running at top speed. What's that all about?

Best,

Alex

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Fri Oct 23, 2009 3:00 am

Finally, I figured it out. I apparently don't have USB 2 ports on this older machine of mine and USB 2 was enabled in the kernel. Once I took that away, it all started working fine.

Best,

Alex

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Tue Jun 11, 2013 6:08 am

This is an update on the problem. It turns out, according to my computer's manual, that I did have USB 2.0 in the first place. So I re-enabled it in the kernel (now 3.8.13) and all was working well.

Last night we had a power cut. I switched on my computer and the problem has returned with a vengeance.

Code:

[  123.100832] hub 1-0:1.0: unable to enumerate USB device on port 3
[  123.316812] hub 1-0:1.0: unable to enumerate USB device on port 3
[  123.532777] hub 1-0:1.0: unable to enumerate USB device on port 3
[  123.804513] usb 1-3: new high-speed USB device number 8 using ehci_hcd
[  123.872835] hub 1-0:1.0: unable to enumerate USB device on port 3
[  124.144530] usb 1-3: new high-speed USB device number 9 using ehci_hcd


... and on it goes.

Does anybody have a clue as to what may be going on? The fact that it was working for a while and over reboots and that it worked as a USB 1.0 device would suggest this is a configuration problem of some sort. Any help would be appreciated.

Best,

Alex

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Tue Jun 11, 2013 7:45 am

Hi all,

Just another update. Once again, disabling USB 2.0 allows the drive to work again.

This is really a frustrating problem. I feel as if there's something relatively simple that I am missing.

Best,

Alex

BitJam
Advocate
Joined: 12 Aug 2003
Posts: 2506
Location: Silver City, NM
Posted: Fri Jun 21, 2013 9:04 am

This problem is usually caused when ehci binds to the device instead of ohci. If you never need ehci and ehci-hcd is a module, not compiled-in, then you can just blacklist ehci-hcd and the problem will go away.

If you need ehci for other devices or if it is compiled-in then you need to play around in /sys. First do an "ls -F /sys/bus/pci/drivers/*_hcd/". On my system, I get:
Code:
$ ls -F /sys/bus/pci/drivers/*_hcd
/sys/bus/pci/drivers/ehci_hcd:
0000:00:12.2@  bind     new_id     uevent
0000:00:13.2@  module@  remove_id  unbind

/sys/bus/pci/drivers/ohci_hcd:
0000:00:12.0@  0000:00:13.0@  0000:00:14.5@  new_id     uevent
0000:00:12.1@  0000:00:13.1@  bind           remove_id  unbind

/sys/bus/pci/drivers/uhci_hcd:
bind  module@  new_id  remove_id  uevent  unbind

/sys/bus/pci/drivers/xhci_hcd:
bind  module@  new_id  remove_id  uevent  unbind
The devices are the symlinks (with trailing @ signs in the ls listing). The trick is to unbind the slow USB device from ehci and bind it to ohci (or, if needed, uhci). For example, if I wanted to unbind 0000:00:12.2 from ehci and rebind it to ohci, I would, as root, run:
Code:
# echo 0000:00:12.2 > /sys/bus/pci/drivers/ehci_hcd/unbind
# echo 0000:00:12.2 > /sys/bus/pci/drivers/ohci_hcd/bind

You can see which devices the numbers correspond to in the output of lspci. For example:
Code:
$ lspci | grep :12.2
00:12.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
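
To see at a glance which controller each driver currently holds, the listing step above can be wrapped in a small helper. This is only a minimal sketch and not something from the original posts: the function name list_hcd_bindings and its optional sysfs-root argument are made up for illustration. It only reads from sysfs; it does not bind or unbind anything.

```shell
# List the PCI devices currently bound to each USB host controller driver.
# The optional first argument (a hypothetical convenience for testing) is
# the sysfs root; it defaults to /sys.
list_hcd_bindings() {
    sysfs_root="${1:-/sys}"
    for drv in "$sysfs_root"/bus/pci/drivers/*_hcd; do
        [ -d "$drv" ] || continue
        # Bound devices show up as entries named like 0000:00:12.2;
        # control files such as bind/unbind contain no colon.
        for dev in "$drv"/*:*; do
            [ -e "$dev" ] || continue
            printf '%s %s\n' "$(basename "$drv")" "$(basename "$dev")"
        done
    done
}
```

Called with no argument, it prints one "driver device" pair per line, and the device part can then be looked up with lspci as shown above.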

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Fri Jun 21, 2013 9:18 am

Hi there,

It's the high speed mode I need. The USB drive runs very slowly on ohci, so ehci is preferred. Fortunately, I discovered that the problem resolved itself when upgrading to kernel 3.9.6 (I read about this somewhere). It's odd and I apologize for taking your time as I should have marked this as solved. However, your advice is helpful in case the problem returns.

Best,

Alex


evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Wed Jul 17, 2013 5:07 pm

Hi again,

The problem has returned. It seems to happen whenever my computer unexpectedly gets shut down via a power problem or a cord getting knocked out. Sigh...

I have to admit, too, that I'm not 100% clear on what I have to do to resolve it. I believe I do need ehci, mainly because the external USB drive is very slow without USB 2.0 support.

The output I get from ls -F /sys/bus/pci/drivers/*_hcd is as follows:

Code:

/sys/bus/pci/drivers/ohci_hcd:
bind  new_id  remove_id  uevent  unbind

/sys/bus/pci/drivers/uhci_hcd:
0000:00:1d.0@  0000:00:1d.2@  bind     new_id     uevent
0000:00:1d.1@  0000:00:1d.3@  module@  remove_id  unbind


An lspci of any of those devices reveals the same thing:

Code:

00:1d.3 USB controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #4 (rev 02)


I am at a loss as to where to go from here, mainly because I do not see any ehci entries.

Best,

Alex

BitJam
Advocate
Joined: 12 Aug 2003
Posts: 2506
Location: Silver City, NM
Posted: Wed Jul 17, 2013 5:57 pm

That is strange. If ehci-hcd is compiled as a module, then make sure that module gets loaded. Did the lspci output change after the power outage? This site has the lspci output of many computers. If your make/model is listed, you could compare what you get with what others get.

If power outages trigger the problem then what fixes it?

I'm now wondering if maybe it is a hardware problem. A variable lspci output would indicate that. Another possibility is that there is file system corruption, but why that would always target ehci is a mystery.

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Wed Jul 17, 2013 6:14 pm

Hi,

BitJam wrote:
That is strange. If ehci-hcd is compiled as a module, then make sure that module gets loaded.


EHCI, as well as the other USB-related stuff, is compiled into the kernel.

Code:

# USB HID support
#
CONFIG_USB_HID=y
# CONFIG_HID_PID is not set
# CONFIG_USB_HIDDEV is not set
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB_ARCH_HAS_XHCI=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set
# CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set

#
# Miscellaneous USB options
#
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_SUSPEND is not set
# CONFIG_USB_DWC3 is not set
CONFIG_USB_MON=y
# CONFIG_USB_WUSB_CBAF is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
# CONFIG_USB_XHCI_HCD is not set
CONFIG_USB_EHCI_HCD=y
# CONFIG_USB_EHCI_ROOT_HUB_TT is not set
# CONFIG_USB_EHCI_TT_NEWSCHED is not set
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1760_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
CONFIG_USB_OHCI_HCD=y
# CONFIG_USB_OHCI_HCD_SSB is not set
# CONFIG_USB_OHCI_HCD_PLATFORM is not set
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
# CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_HCD_SSB is not set
# CONFIG_USB_CHIPIDEA is not set

#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set


Quote:
Did the lspci output change after the power outage? This site has the lspci output of many computers. If your make/model is listed, you could compare what you get with what others get.


No change at all.

Quote:
If power outages trigger the problem then what fixes it?


I have yet to figure that out. Last time I think it was compiling a fresh (new) kernel. However, that may have just been a coincidence.

Quote:
I'm now wondering if maybe it is a hardware problem. A variable lspci output would indicate that. Another possibility is that there is file system corruption, but why that would always target ehci is a mystery.


I think both are unlikely but have no way to check. The thing is, there used to be no problem. Moreover, if I disable USB 2.0 support, it works, though I have a very slow usb drive.

Best,

Alex

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Wed Jul 17, 2013 10:58 pm

Hi again,

I have tracked down where the problem was but I don't know exactly what it was. Anyway, an earlier kernel of mine (3.8.13) did not have the problem. I copied over the configuration file, did a make oldconfig, recompiled the kernel, etc. and all is well again.

I guess it would be nice to know where I had gone wrong previously.

Best,

Alex

BitJam
Advocate
Joined: 12 Aug 2003
Posts: 2506
Location: Silver City, NM
Posted: Wed Jul 17, 2013 11:22 pm

You said that recompiling the kernel fixed this problem before. Are you sure that it is now solved? Are you sure a simple re-compile without changing the config doesn't fix it?

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Thu Jul 18, 2013 5:13 am

BitJam wrote:
You said that recompiling the kernel fixed this problem before. Are you sure that it is now solved? Are you sure a simple re-compile without changing the config doesn't fix it?


I'm not sure, no. Moreover, I can now boot into a kernel that wasn't working before without trouble.

I tried to recreate the problem by turning the computer off, but it did not come back. Should it do so, I'll try to just recompile the present kernel to see if that does any good.

Best,

Alex

BitJam
Advocate
Joined: 12 Aug 2003
Posts: 2506
Location: Silver City, NM
Posted: Thu Jul 18, 2013 6:07 am

This is the third really strange problem I've seen this week. It feels like reality is leaking. One of the really strange problems also involved this ehci stuff. A USB stick wasn't being recognized, but the device would reliably show up after doing an "ls /sys/bus/pci/drivers/*_hcd". Yes, just the ls command made the device show up. I hit the other one when I was trying to debug the first one.

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Thu Jul 18, 2013 1:27 pm

Hi,

BitJam wrote:
This is the third really strange problem I've seen this week. It feels like reality is leaking. One of the really strange problems also involved this ehci stuff. A USB stick wasn't being recognized, but the device would reliably show up after doing an "ls /sys/bus/pci/drivers/*_hcd". Yes, just the ls command made the device show up. I hit the other one when I was trying to debug the first one.


Strange these things. In any event, should I have any updates for you, I will let you know.

Best,

Alex

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Thu Jul 18, 2013 4:57 pm

Hi there,

I just rebooted and it happened again. I then rebooted once more and no problem.

Also, it seems to work consistently off one of my older kernels (3.8.13). Here's the output from a diff of the two kernel config files.

Code:

3c3
< # Linux/i386 3.8.13-gentoo Kernel Configuration
---
> # Linux/x86 3.9.6-gentoo Kernel Configuration
41d40
< CONFIG_HAVE_IRQ_WORK=y
48d46
< CONFIG_EXPERIMENTAL=y
111a110
> CONFIG_RCU_STALL_COMMON=y
194a194
> # CONFIG_HAVE_64BIT_ALIGNED_ACCESS is not set
195a196
> CONFIG_ARCH_USE_BUILTIN_BSWAP=y
199a201
> CONFIG_HAVE_KPROBES_ON_FTRACE=y
223d224
< CONFIG_GENERIC_SIGALTSTACK=y
224a226,227
> CONFIG_OLD_SIGSUSPEND3=y
> CONFIG_OLD_SIGACTION=y
280a284,285
> # CONFIG_X86_GOLDFISH is not set
> # CONFIG_X86_INTEL_LPSS is not set
368a374
> # CONFIG_HAVE_BOOTMEM_INFO_NODE is not set
567d572
< # CONFIG_WAN_ROUTER is not set
573a579
> # CONFIG_VSOCKETS is not set
611a618
> CONFIG_FW_LOADER_USER_HELPER=y
652a660
> # CONFIG_BLK_DEV_RSXX is not set
662a671
> # CONFIG_ATMEL_SSC is not set
681a691
> # CONFIG_VMWARE_VMCI is not set
1032a1043
> CONFIG_MOUSE_PS2_CYPRESS=y
1065a1077
> CONFIG_TTY=y
1088a1101
> CONFIG_SERIAL_8250_DEPRECATED_OPTIONS=y
1095a1109
> # CONFIG_SERIAL_8250_DW is not set
1109a1124
> # CONFIG_SERIAL_RP2 is not set
1154a1170
> CONFIG_GPIO_DEVRES=y
1164a1181
> # CONFIG_BATTERY_GOLDFISH is not set
1172,1174c1189,1193
< # CONFIG_FAIR_SHARE is not set
< CONFIG_STEP_WISE=y
< # CONFIG_USER_SPACE is not set
---
> # CONFIG_THERMAL_GOV_FAIR_SHARE is not set
> CONFIG_THERMAL_GOV_STEP_WISE=y
> # CONFIG_THERMAL_GOV_USER_SPACE is not set
> # CONFIG_THERMAL_EMULATION is not set
> # CONFIG_INTEL_POWERCLAMP is not set
1234d1252
< # CONFIG_STUB_POULSBO is not set
1401a1420
> # CONFIG_HID_STEELSERIES is not set
1432,1433c1451,1454
< # CONFIG_USB_SUSPEND is not set
< CONFIG_USB_MON=y
---
> CONFIG_USB_SUSPEND=y
> # CONFIG_USB_OTG is not set
> # CONFIG_USB_DWC3 is not set
> # CONFIG_USB_MON is not set
1441c1462,1465
< # CONFIG_USB_EHCI_HCD is not set
---
> CONFIG_USB_EHCI_HCD=y
> # CONFIG_USB_EHCI_ROOT_HUB_TT is not set
> # CONFIG_USB_EHCI_TT_NEWSCHED is not set
> CONFIG_USB_EHCI_PCI=y
1446,1451c1470,1471
< CONFIG_USB_OHCI_HCD=y
< # CONFIG_USB_OHCI_HCD_SSB is not set
< # CONFIG_USB_OHCI_HCD_PLATFORM is not set
< # CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
< # CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
< CONFIG_USB_OHCI_LITTLE_ENDIAN=y
---
> # CONFIG_USB_OHCI_HCD is not set
> # CONFIG_USB_EHCI_HCD_PLATFORM is not set
1515a1536
> # CONFIG_USB_SISUSBVGA is not set
1526a1548,1549
> # CONFIG_OMAP_USB3 is not set
> # CONFIG_OMAP_CONTROL_USB is not set
1565a1589
> # CONFIG_MAILBOX is not set
1569c1593
< # Remoteproc drivers (EXPERIMENTAL)
---
> # Remoteproc drivers
1574c1598
< # Rpmsg drivers (EXPERIMENTAL)
---
> # Rpmsg drivers
1749d1772
< # CONFIG_SPARSE_RCU_POINTER is not set
1753a1777,1781
>
> #
> # RCU Debugging
> #
> # CONFIG_SPARSE_RCU_POINTER is not set
1761a1790
> CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
1858a1888,1889
> # CONFIG_CRYPTO_CRC32 is not set
> # CONFIG_CRYPTO_CRC32_PCLMUL is not set
1925d1955
< CONFIG_PERCPU_RWSEM=y


Best,

Alex
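
Most of the config diff above is noise from the version bump, so it can help to cut it down to the host controller options. In that diff, the lines marked "<" (3.8.13) show CONFIG_USB_EHCI_HCD as not set, while the lines marked ">" (3.9.6) show CONFIG_USB_EHCI_HCD=y and CONFIG_USB_OHCI_HCD as not set. That would fit the earlier observation that the drive works with USB 2.0 disabled, though that reading is an inference and not something confirmed in this thread. The helper below is a minimal sketch; the function name usb_hcd_config_diff is made up for illustration.

```shell
# Show only the EHCI/OHCI/UHCI/XHCI host controller options that differ
# between two kernel config files. Lines starting with "<" come from the
# first file, lines starting with ">" from the second.
usb_hcd_config_diff() {
    # diff exits 1 when the files differ, which is the expected case here,
    # so its exit status is deliberately ignored.
    { diff "$1" "$2" || true; } |
        grep -E '^[<>] (# )?CONFIG_USB_[EOUX]HCI_HCD[ =]'
}
```

It would be called with the two config files as arguments, for example the .config of the working kernel first and the .config of the failing kernel second.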

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Tue Jul 23, 2013 9:53 pm

Hi,

Just another update, though I don't know what good it will do. I had to shut the computer down for a bit and then back on. The problem resumed. However, after rebooting into the 3.8.13 kernel, and then rebooting back into 3.9.6, the problem went away again.

It seems like there is something wrong with the kernel configuration or otherwise, but I cannot suss it out. Any idea on what the best way to move forward and figure this out would be?

Best,

Alex

BitJam
Advocate
Joined: 12 Aug 2003
Posts: 2506
Location: Silver City, NM
Posted: Tue Jul 23, 2013 10:23 pm

I suggest you look at the dmesg output and try to see difference in it between when it is broken and when it is working. You should probably focus your attention around lines that contain "hci_hcd".
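
One way to do that comparison is to save the dmesg output once while the drive fails and once while it works, then diff only the host controller lines. The following is a minimal sketch under those assumptions; the function names hcd_lines and compare_hcd_logs are made up for illustration, and the pattern only matches "hci_hcd", so it may need widening (for example to catch the "unable to enumerate" lines from the hub).

```shell
# Print the host controller lines from a saved dmesg log, with any leading
# "[  123.456789] " timestamp removed so that two boots can be compared.
hcd_lines() {
    sed -n '/hci_hcd/{s/^\[ *[0-9.]*] *//;p;}' "$1"
}

# Diff the host controller lines of two saved logs.
# Returns diff's exit status: 0 if identical, 1 if they differ.
compare_hcd_logs() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    hcd_lines "$1" > "$a"
    hcd_lines "$2" > "$b"
    status=0
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

The logs themselves can be captured with a plain redirect of dmesg into a file after each boot; the file names passed to compare_hcd_logs are whatever was chosen at that point.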

So the problem starts, usually after a power outage and then lasts through reboots. But if you rebuild the kernel then it goes away. Is that right?

It sort of sounds like a filesystem problem. Some file or files get corrupted when the system goes down and then they get repaired when you rebuild. The part that doesn't make sense is: why would it be the same file getting corrupted in the same place every time?

Another approach is to install tripwire (or something like it). Tripwire will keep a database of checksums of files. It was designed as an intrusion detection system so you would be alerted if a file got changed (presumably by an intruder). Have it keep track of your kernel and all of your modules. If the problem occurs and none of those have changed then you can probably rule out file system corruption. There still might be corruption but if the kernel and the modules don't change then I don't see how reinstalling the kernel would fix it.
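
If installing Tripwire feels like overkill, the same check can be roughed out with plain checksums. To be clear, this is a stand-in using sha256sum from coreutils, not Tripwire itself, and the function names record_checksums and verify_checksums are made up for illustration.

```shell
# Record SHA-256 checksums of the given files in a manifest file.
# Usage: record_checksums MANIFEST FILE...
record_checksums() {
    manifest="$1"
    shift
    sha256sum "$@" > "$manifest"
}

# Re-check the files listed in the manifest.
# Returns non-zero if any file has changed or is missing.
verify_checksums() {
    sha256sum -c "$1"
}
```

The idea would be to record the kernel image (and any modules) while everything works, then run verify_checksums after the next power cut. If every file still reports OK while the drive fails to enumerate, corruption of those files is probably not the cause.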

You could also install smartmontools (if you haven't already) and check the health of your hard drive. The output is rather cryptic, but I'm sure there are instructions or tools somewhere for deciphering it.

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Wed Jul 24, 2013 8:36 am

Hi,

BitJam wrote:
I suggest you look at the dmesg output and try to see difference in it between when it is broken and when it is working. You should probably focus your attention around lines that contain "hci_hcd".


I will definitely try that next time around.

Quote:
So the problem starts, usually after a power outage and then lasts through reboots. But if you rebuild the kernel then it goes away. Is that right?


Apparently not (see recent emails). I have one kernel, 3.8.13 that seems to have no problem at all from what I can tell. Maybe it's a fluke as I haven't systematically checked, but that's the nature of the beast (hard to predict when things will foul up). If, after booting up with the error messages, I reboot into that kernel, all seems to go fine. Then the next time I boot into 3.9.6 all is well, too.

Quote:
It sort of sounds like a filesystem problem. Some file or files get corrupted when the system goes down and then they get repaired when you rebuild. The part that doesn't make sense is: why would it be the same file getting corrupted in the same place every time?


No idea...

Quote:
Another approach is to install tripwire (or something like it). Tripwire will keep a database of checksums of files. It was designed as an intrusion detection system so you would be alerted if a file got changed (presumably by an intruder). Have it keep track of your kernel and all of your modules. If the problem occurs and none of those have changed then you can probably rule out file system corruption. There still might be corruption but if the kernel and the modules don't change then I don't see how reinstalling the kernel would fix it.


I'll try this if dmesg doesn't yield any clues.

Quote:
You could also install smartmontools (if you haven't already) and check the health of your hard drive. The output is rather cryptic, but I'm sure there are instructions or tools somewhere for deciphering it.


I'll give that a go, too, if I don't get anything with dmesg. Watch this space.

Best,

Alex

tuber
Apprentice
Joined: 12 Nov 2004
Posts: 267
Posted: Mon Jul 29, 2013 2:15 am

I get something similar
Code:
[45513.345333] usb 2-1.5.4: new full-speed USB device number 22 using ehci-pci
[45513.417159] usb 2-1.5.4: device descriptor read/64, error -32
[45513.999813] usb 2-1.5.4: device not accepting address 22, error -32
[45514.016769] hub 2-1.5:1.0: unable to enumerate USB device on port 4
and
Code:
[529997.384865] usb 2-1.5.4: new full-speed USB device number 41 using ehci-pci
[529997.404521] hub 2-1.5:1.0: unable to enumerate USB device on port 4

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Tue Sep 10, 2013 7:45 pm

Hi all,

I upgraded to 3.10.7 and the problem has returned.

Two things that may or may not help diagnose it.

First, when I restart the system it stops at:

Code:
remounting / read only


No error, it just stops doing anything and the system does not reboot until I turn off the USB drive.

Similarly, when the system is booting when it initializes uevents, it stops until I turn off the USB drive.

Finally, I have a kernel that works fine (3.8.13), though I cannot seem to find the magic .config file for that kernel that works so well.

This is extraordinarily frustrating. One would think it's something that should work pretty easily and it had for a long time.

Best,

Alex

evoweiss
Veteran
Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK
Posted: Tue Sep 10, 2013 8:19 pm

Further details on the uevents hang... it eventually resolves itself with an error:

Code:

udevadm settle - timeout of 60 seconds reached, the event queue contains:
  /sys/devices/pci0000:00/0000:00:1d.7/usb1/1-4 (1199)
  /sys/devices/pci0000:00/0000:00:1d.7/usb1/1-4/1-4:1.0 (1200)            [ !! ]


Alex

All times are GMT