Hyper_Eye Guru
Joined: 17 Aug 2003 Posts: 462 Location: Huntsville, AL.
|
Posted: Fri Apr 18, 2014 6:52 pm Post subject: Desktop hijacks the network and refuses to respond |
|
|
A couple times a month my network becomes completely unresponsive to any client requests. Machines can't talk to each other, talk to the router, or talk to the internet. When I look at my switches I see all of the LED status lights blinking rapidly. I also find that my main Gentoo desktop is unresponsive. Hitting keys on the keyboard won't wake up the monitors. I can't ssh to it. If I disconnect the ethernet cable from the Gentoo machine or hit the reset button on it the network will immediately start working properly. After the Gentoo machine reboots everything appears normal. There are no entries in /var/log/messages for the period of time that it was unresponsive.
This issue is completely unpredictable. It only happens a couple of times a month, always after a period of inactivity (typically overnight), and I've never had any monitoring running in anticipation because I have no idea how long to keep such tools running.
This morning I found the issue occurring again. I disconnected the Gentoo desktop from the network but did not reset the machine. I installed Wireshark on a laptop running Linux and connected the ethernet cable from the Gentoo machine directly to the laptop. I could see that the indicators began blinking the same way the switch indicators did, but Wireshark showed no activity.
I then connected the Gentoo desktop back to the switch. The status LED for that specific connection began blinking continuously, but the network did not immediately become inaccessible to other machines. Wireshark showed only expected traffic and nothing coming from the Gentoo desktop's MAC address. After a few minutes the network became unresponsive again and all of the status LEDs started blinking quickly in tandem with the Gentoo desktop's status LED. Wireshark then showed no activity.
The final thing I tried was connecting the Gentoo desktop directly to the router. It exhibited the same behavior. The router's web interface showed a 1000 Mbit ethernet connection on the port but no client there ("none" for the MAC address).
At this point I'm stuck. I don't know how to proceed to diagnose this problem. I originally associated the issue with leaving a Windows XP guest running in VirtualBox and I started a thread here based on that assumption (https://forums.gentoo.org/viewtopic-t-986016.html). I no longer believe that VirtualBox is related: I have not left VirtualBox running, but the issue still occurs. This is a serious issue and I really need to diagnose it. Any assistance would be greatly appreciated.
My Kernel Config
Code: | # emerge --info
Portage 2.2.10 (default/linux/amd64/13.0/desktop/kde, gcc-4.8.2, glibc-2.19, 3.13.5-gentoo x86_64)
=================================================================
System uname: Linux-3.13.5-gentoo-x86_64-Intel-R-_Core-TM-_i7-3770_CPU_@_3.40GHz-with-gentoo-2.2
KiB Mem: 16394632 total, 14946760 free
KiB Swap: 4194296 total, 4194296 free
Timestamp of tree: Tue, 08 Apr 2014 23:15:01 +0000
ld GNU ld (GNU Binutils) 2.24
app-shells/bash: 4.2_p46-r1
dev-java/java-config: 2.2.0
dev-lang/python: 2.7.6, 3.2.5-r3, 3.3.5, 3.4.0
dev-util/cmake: 2.8.12.2-r1::kde
dev-util/pkgconfig: 0.28-r1
sys-apps/baselayout: 2.2
sys-apps/openrc: 0.12.4
sys-apps/sandbox: 2.6-r1
sys-devel/autoconf: 2.13, 2.69
sys-devel/automake: 1.4_p6-r1, 1.10.3, 1.11.6, 1.12.6, 1.13.4, 1.14.1
sys-devel/binutils: 2.24-r2
sys-devel/gcc: 4.7.3-r1, 4.8.2
sys-devel/gcc-config: 1.8
sys-devel/libtool: 2.4.2
sys-devel/make: 4.0-r1
sys-kernel/linux-headers: 3.14 (virtual/os-headers)
sys-libs/glibc: 2.19
Repositories: gentoo dmwoodlx2_local kde vincent gamerlay sunrise roslin steam-overlay anders-larsson luman science
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="${CONFIG_PROTECT} /etc /etc/idea/conf /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.1/conf /usr/share/polkit-1/actions /usr/share/themes/oxygen-gtk/gtk-3.0 /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j8"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage /var/lib/layman/kde /var/lib/layman/vincent /var/lib/layman/gamerlay /var/lib/layman/sunrise /var/lib/layman/roslin /var/lib/layman/steam /var/lib/layman/anders-larsson /var/lib/layman/luman /var/lib/layman/science"
SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"
USE="X a52 aac aacs acl acpi alsa alstream amd64 berkdb bindist bluetooth bluray branding bzip2 cairo cdda cddb cdr cli cmake consolekit cracklib crypt cuda cups cxx dbus declarative dri dts dvd dvdr emboss encode exif fakevim fam fbcondecor ffmpeg firefox flac fortran ftp gdbm gif git gpm gtk hddtemp iconv imagemagick ipv6 java javascript joystick jpeg kde kipi lame lastfm lcms ldap libnotify lirc lm_sensors mad md5sum mercurial midi minizip mmx mng modules mp3 mp4 mpeg multilib ncurses network nls nptl nsplugin ogg openal opengl openmp pam pango pcre pdf perl phonon plasma png policykit portmidi ppds python qt3support qt4 readline s3 samba sdl semantic-desktop session spell sqlite sse sse2 sse3 ssl startup-notification subversion svg tcpd threads tiff timidity truetype udev udisks unicode upower usb valgrind vdpau vim-syntax vorbis wxwidgets x264 xcb xcomposite xinerama xml xpm xscreensaver xv xvid zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt 
ublox ubx" GRUB_PLATFORMS="efi-64" INPUT_DEVICES="keyboard mouse joystick" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en" LIRC_DEVICES="sb0540" NETBEANS_MODULES="*" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_3" RUBY_TARGETS="ruby19 ruby20" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON |
Code: | # lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4)
00:1c.5 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c4)
00:1c.6 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 7 (rev c4)
00:1c.7 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 8 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Z77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 670] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
04:00.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 41)
05:00.0 Multimedia audio controller: Creative Labs SB0400 Audigy2 Value
05:01.0 FireWire (IEEE 1394): VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller (rev c0)
06:00.0 Ethernet controller: Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
07:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11) |
_________________ Gentoo Gaming Videos |
|
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Sat Apr 19, 2014 1:57 am Post subject: |
|
|
When the bug occurs, did you check your card's IP address?
If for some reason a tool changes your IP to one that is already in use by another host, you can get that kind of mess.
If your IP is fine, check whether your network card is buggy or has been put into a buggy state. Try unloading and reloading the card's kernel module; that should re-initialize the device and get it working again. |
|
Hyper_Eye Guru
Joined: 17 Aug 2003 Posts: 462 Location: Huntsville, AL.
|
Posted: Sat Apr 19, 2014 4:41 am Post subject: |
|
|
When the bug occurs I can't interact with the system at all. It is completely unresponsive. |
|
Hyper_Eye Guru
Joined: 17 Aug 2003 Posts: 462 Location: Huntsville, AL.
|
Posted: Mon Apr 21, 2014 3:47 pm Post subject: |
|
|
This happened again this morning. I plugged the laptop in and started Wireshark. I could see that there was no regular activity, just tons of ARP requests as the machines tried to establish who was who. As soon as I rebooted the Gentoo machine the router sent a local master announcement and a domain/workgroup announcement, and the network started working correctly. There was still no indication of why the Gentoo desktop was frozen or why the network was broken. |
|
Hyper_Eye Guru
Joined: 17 Aug 2003 Posts: 462 Location: Huntsville, AL.
|
Posted: Mon Apr 21, 2014 4:22 pm Post subject: |
|
|
Code: | Apr 21 06:21:40 dmwoodlx kernel: [233564.845395] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec36000
Apr 21 06:21:40 dmwoodlx kernel: [233564.845528] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec71000
Apr 21 06:21:40 dmwoodlx kernel: [233564.845571] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec91000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380752] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cecf1000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380796] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced11000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380853] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced31000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826064] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced7d000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826201] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cedb1000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826223] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cedd1000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365676] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee31000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365729] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee51000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365754] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee71000
Apr 21 08:21:47 dmwoodlx kernel: [240774.876977] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceebe000
Apr 21 08:21:47 dmwoodlx kernel: [240774.877030] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceef1000
Apr 21 08:21:47 dmwoodlx kernel: [240774.877048] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef11000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360005] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef71000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360022] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef91000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360389] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cefb1000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835306] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceffe000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835333] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cf031000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835368] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cf051000
Apr 21 10:30:50 dmwoodlx syslog-ng[2705]: syslog-ng starting up; version='3.4.7' |
One thing that is consistent with this issue is the set of messages above in my kernel log. They are always the last thing in the log before the problem occurs. You can see that once they start, they repeat every 30 minutes.
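The 30-minute spacing can be checked against the kernel uptime stamps in the log excerpt rather than the wall-clock times. The following is only a sketch: the array holds the first stamp of each burst copied from the excerpt above, and the helper names `mean_gap` and `max_gap_deviation` are made up for this illustration.

```c
#include <stddef.h>

/* Kernel uptime stamps (seconds) of the first message in each burst,
 * copied from the log excerpt in the post. */
static const double burst_start[] = {
    233564.845395, 235367.380752, 237169.826064, 238972.365676,
    240774.876977, 242577.360005, 244379.835306,
};
#define N_BURSTS (sizeof burst_start / sizeof burst_start[0])

/* Mean gap in seconds between consecutive stamps; 0.0 if fewer than two.
 * (Hypothetical helper, not part of any kernel or library API.) */
double mean_gap(const double *stamps, size_t n)
{
    if (n < 2)
        return 0.0;
    /* The sum of consecutive gaps telescopes to last minus first. */
    return (stamps[n - 1] - stamps[0]) / (double)(n - 1);
}

/* Largest deviation of any single gap from the mean gap, in seconds. */
double max_gap_deviation(const double *stamps, size_t n)
{
    double mean = mean_gap(stamps, n);
    double worst = 0.0;

    for (size_t i = 1; i < n; i++) {
        double dev = (stamps[i] - stamps[i - 1]) - mean;
        if (dev < 0.0)
            dev = -dev;
        if (dev > worst)
            worst = dev;
    }
    return worst;
}
```

Calling `mean_gap(burst_start, N_BURSTS)` gives roughly 1802.5 seconds, and every individual gap stays within a fraction of a second of that. Such a tight period would be consistent with something running on a 30-minute timer, although nothing in this thread identifies what that is.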
Here is the code in the kernel that triggers this message:
lib/swiotlb.c
Code: | void *
swiotlb_alloc_coherent(struct device *hwdev, size_t size,
		       dma_addr_t *dma_handle, gfp_t flags)
{
	dma_addr_t dev_addr;
	void *ret;
	int order = get_order(size);
	u64 dma_mask = DMA_BIT_MASK(32);

	if (hwdev && hwdev->coherent_dma_mask)
		dma_mask = hwdev->coherent_dma_mask;

	ret = (void *)__get_free_pages(flags, order);
	if (ret) {
		dev_addr = swiotlb_virt_to_bus(hwdev, ret);
		if (dev_addr + size - 1 > dma_mask) {
			/*
			 * The allocated memory isn't reachable by the device.
			 */
			free_pages((unsigned long) ret, order);
			ret = NULL;
		}
	}
	if (!ret) {
		/*
		 * We are either out of memory or the device can't DMA to
		 * GFP_DMA memory; fall back on map_single(), which
		 * will grab memory from the lowest available address range.
		 */
		phys_addr_t paddr = map_single(hwdev, 0, size, DMA_FROM_DEVICE);
		if (paddr == SWIOTLB_MAP_ERROR)
			return NULL;

		ret = phys_to_virt(paddr);
		dev_addr = phys_to_dma(hwdev, paddr);

		/* Confirm address can be DMA'd by device */
		if (dev_addr + size - 1 > dma_mask) {
			printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
			       (unsigned long long)dma_mask,
			       (unsigned long long)dev_addr);

			/* DMA_TO_DEVICE to avoid memcpy in unmap_single */
			swiotlb_tbl_unmap_single(hwdev, paddr,
						 size, DMA_TO_DEVICE);
			return NULL;
		}
	}

	*dma_handle = dev_addr;
	memset(ret, 0, size);

	return ret;
}
EXPORT_SYMBOL(swiotlb_alloc_coherent); |
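To get there, either there must be no free pages or the allocated memory isn't reachable by the device; the fallback buffer from map_single() must then also land outside the device's DMA mask.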
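The comparison that guards the printk can be tried on the logged values in isolation. This is a minimal user-space sketch, not kernel code: the function names are invented for the example, and the 4096-byte size is an assumption because the log line does not print the requested size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set, mirroring the kernel's DMA_BIT_MASK(n). */
uint64_t dma_bit_mask(unsigned n)
{
    return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* True when a buffer of `size` bytes at bus address `dev_addr` extends past
 * the mask; same comparison as in swiotlb_alloc_coherent(). */
bool exceeds_dma_mask(uint64_t dev_addr, uint64_t size, uint64_t dma_mask)
{
    return dev_addr + size - 1 > dma_mask;
}

/* Number of address bits implied by a contiguous low-bit mask. */
unsigned mask_bits(uint64_t dma_mask)
{
    unsigned bits = 0;

    while (dma_mask & 1) {
        bits++;
        dma_mask >>= 1;
    }
    return bits;
}
```

With the logged mask, `mask_bits(0x7fffffff)` is 31, so the device in question can only address memory below 2 GiB. `exceeds_dma_mask(0xcec36000, 4096, dma_bit_mask(31))` is true because the logged address lies above that limit, whereas the same address would pass against a 32-bit mask. The log does not say which device owns the 31-bit mask.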
If anyone knows what could cause this or how I might elicit more information from my system I would appreciate it. |
|
Hyper_Eye Guru
Joined: 17 Aug 2003 Posts: 462 Location: Huntsville, AL.
|
Posted: Fri Apr 25, 2014 7:50 pm Post subject: |
|
|
I'm still hoping for a solution to this problem. Should I post this issue somewhere else? Is there a place where someone may have a better idea of what is causing the DMA allocation errors? Thanks. |
|
toneus n00b
Joined: 27 Feb 2008 Posts: 35
|
Posted: Tue May 06, 2014 5:04 pm Post subject: |
|
|
I am experiencing many if not all of the same symptoms. In my search for an answer, I tested the memory on my server with memtest86. My RAM is failing Test #2 gloriously! Test #2 is a parallel memory address test.
Quote: | Test 1 [Address test, own address, Sequential]
Each address is written with its own address and then is checked for consistency. In theory previous tests should have caught any memory addressing problems. This test should catch any addressing errors that somehow were not previously detected. This test is done sequentially with each available CPU.
Test 2 [Address test, own address, Parallel]
Same as test 1 but the testing is done in parallel using all CPUs and using overlapping addresses. |
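For readers unfamiliar with the "own address" technique described in the quote, here is a rough sequential sketch of the idea. It is not memtest86's code: the function names are invented, and it runs in user space on virtual addresses behind the CPU caches, so it only illustrates the pattern and cannot replace a bare-metal memory tester.

```c
#include <stddef.h>
#include <stdint.h>

/* Write each word's own address into it (the "own address" pattern). */
void fill_own_address(volatile uintptr_t *buf, size_t words)
{
    for (size_t i = 0; i < words; i++)
        buf[i] = (uintptr_t)&buf[i];
}

/* Count words whose content no longer matches their address. */
size_t count_mismatches(const volatile uintptr_t *buf, size_t words)
{
    size_t bad = 0;

    for (size_t i = 0; i < words; i++)
        if (buf[i] != (uintptr_t)&buf[i])
            bad++;
    return bad;
}
```

Filling a buffer with `fill_own_address` and then calling `count_mismatches` should return 0 on healthy memory. The parallel variant in the quote runs the same pattern from all CPUs at once over overlapping addresses, which this sequential sketch does not attempt.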
I have run memtest86 on each individual RAM module, in different memory slots. Each one continues to fail Test #2, so I don't think it's a module or motherboard issue.
This has now happened across my last two kernel upgrades, and I'm currently using Linux 3.12.13-gentoo.
Did we miss an important setting or a RAM-related change in the latest kernels?
This is one of the most difficult problems I've ever encountered on Linux. There is literally no smoking gun, no log message, and no way to access the system once it is in this state.
Any help would be greatly appreciated!
Toneus |
|
Hyper_Eye Guru
Joined: 17 Aug 2003 Posts: 462 Location: Huntsville, AL.
|
Posted: Wed May 07, 2014 5:44 am Post subject: |
|
|
Are you running memtest in Gentoo? Booting to a memtest CD I don't get any memory errors. |
|
toneus n00b
Joined: 27 Feb 2008 Posts: 35
|
Posted: Wed May 07, 2014 5:04 pm Post subject: |
|
|
I am running memtest at boot, basically the same as booting it from a CD. |
|
chithanh Developer
Joined: 05 Aug 2006 Posts: 2158 Location: Berlin, Germany
|
Posted: Thu May 08, 2014 12:22 pm Post subject: |
|
|
Maybe it is a problem with your network card. It looks like the ethernet port is left in a state that causes problems for your router. You could try one or more of the following:
- Install a different network card in the computer
- Force ethernet connection to 100 Mbps with ethtool
- Connect another ethernet switch in between the router and the computer
|