Gentoo Forums
Desktop hijacks the network and refuses to respond
Hyper_Eye
Guru
Joined: 17 Aug 2003
Posts: 446
Location: Huntsville, AL.

Posted: Fri Apr 18, 2014 6:52 pm    Post subject: Desktop hijacks the network and refuses to respond

A couple times a month my network becomes completely unresponsive to any client requests. Machines can't talk to each other, talk to the router, or talk to the internet. When I look at my switches I see all of the LED status lights blinking rapidly. I also find that my main Gentoo desktop is unresponsive. Hitting keys on the keyboard won't wake up the monitors. I can't ssh to it. If I disconnect the ethernet cable from the Gentoo machine or hit the reset button on it the network will immediately start working properly. After the Gentoo machine reboots everything appears normal. There are no entries in /var/log/messages for the period of time that it was unresponsive.

This issue is completely unpredictable. It only happens a couple of times a month, always after a period of inactivity (typically overnight), and I've never had any monitoring running in anticipation because I have no idea how long to keep such tools running.

This morning I found the issue occurring again. I disconnected the Gentoo desktop from the network but did not reset the machine. I installed Wireshark on a laptop running Linux and connected the ethernet cable from the Gentoo machine directly to the laptop. The link indicators began blinking the same way the switch indicators had, but Wireshark showed no activity.

I then connected the Gentoo desktop back to the switch. The status LED for that specific connection began blinking continuously, but the network did not immediately become inaccessible to other machines. Wireshark showed only expected traffic and nothing coming from the Gentoo desktop's MAC address. After a few minutes the network became unresponsive again, and all of the status LEDs started blinking quickly in tandem with the Gentoo desktop's status LED. Wireshark then showed no activity.

The final thing I tried was connecting the Gentoo desktop directly to the router. It exhibited the same behavior. The router's web interface showed a 1000 Mbit ethernet connection on the port but no client there ("none" for the MAC address).
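
Because it is unclear how long a capture would need to run, one option is a size-limited ring-buffer capture left running on the laptop or another machine on the same switch, so the most recent traffic before a freeze is always on disk. A minimal sketch, assuming tcpdump is installed; the interface name eth0, the output path, and the file sizes are placeholders, not values from this thread:

```sh
#!/bin/sh
# Sketch only: IFACE and OUT are assumed placeholders.
IFACE="${IFACE:-eth0}"
OUT="${OUT:-/var/tmp/netring.pcap}"

# Print the command by default; set RUN=1 to actually execute it (needs root).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

start_ring_capture() {
    # -n      do not resolve addresses to names
    # -C 100  start a new file after roughly 100 million bytes
    # -W 10   keep at most 10 files, then overwrite the oldest
    run tcpdump -n -i "$IFACE" -C 100 -W 10 -w "$OUT"
}
```

Calling start_ring_capture only prints the command; with RUN=1 exported it starts the capture, which then never uses more than about 1 GB of disk no matter how long it runs.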

At this point I'm stuck. I don't know how to proceed to diagnose this problem. I originally associated the issue with leaving a Windows XP guest running in VirtualBox and wrote a thread here based on that assumption (https://forums.gentoo.org/viewtopic-t-986016.html). I no longer believe VirtualBox is related: I have not left VirtualBox running, but the issue is still occurring. This is a serious issue and I really need to diagnose it. Any assistance would be greatly appreciated.

My Kernel Config

Code:
# emerge --info
Portage 2.2.10 (default/linux/amd64/13.0/desktop/kde, gcc-4.8.2, glibc-2.19, 3.13.5-gentoo x86_64)
=================================================================
System uname: Linux-3.13.5-gentoo-x86_64-Intel-R-_Core-TM-_i7-3770_CPU_@_3.40GHz-with-gentoo-2.2
KiB Mem:    16394632 total,  14946760 free
KiB Swap:    4194296 total,   4194296 free
Timestamp of tree: Tue, 08 Apr 2014 23:15:01 +0000
ld GNU ld (GNU Binutils) 2.24
app-shells/bash:          4.2_p46-r1
dev-java/java-config:     2.2.0
dev-lang/python:          2.7.6, 3.2.5-r3, 3.3.5, 3.4.0
dev-util/cmake:           2.8.12.2-r1::kde
dev-util/pkgconfig:       0.28-r1
sys-apps/baselayout:      2.2
sys-apps/openrc:          0.12.4
sys-apps/sandbox:         2.6-r1
sys-devel/autoconf:       2.13, 2.69
sys-devel/automake:       1.4_p6-r1, 1.10.3, 1.11.6, 1.12.6, 1.13.4, 1.14.1
sys-devel/binutils:       2.24-r2
sys-devel/gcc:            4.7.3-r1, 4.8.2
sys-devel/gcc-config:     1.8
sys-devel/libtool:        2.4.2
sys-devel/make:           4.0-r1
sys-kernel/linux-headers: 3.14 (virtual/os-headers)
sys-libs/glibc:           2.19
Repositories: gentoo dmwoodlx2_local kde vincent gamerlay sunrise roslin steam-overlay anders-larsson luman science
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="${CONFIG_PROTECT} /etc /etc/idea/conf /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.1/conf /usr/share/polkit-1/actions /usr/share/themes/oxygen-gtk/gtk-3.0 /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j8"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage /var/lib/layman/kde /var/lib/layman/vincent /var/lib/layman/gamerlay /var/lib/layman/sunrise /var/lib/layman/roslin /var/lib/layman/steam /var/lib/layman/anders-larsson /var/lib/layman/luman /var/lib/layman/science"
SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"
USE="X a52 aac aacs acl acpi alsa alstream amd64 berkdb bindist bluetooth bluray branding bzip2 cairo cdda cddb cdr cli cmake consolekit cracklib crypt cuda cups cxx dbus declarative dri dts dvd dvdr emboss encode exif fakevim fam fbcondecor ffmpeg firefox flac fortran ftp gdbm gif git gpm gtk hddtemp iconv imagemagick ipv6 java javascript joystick jpeg kde kipi lame lastfm lcms ldap libnotify lirc lm_sensors mad md5sum mercurial midi minizip mmx mng modules mp3 mp4 mpeg multilib ncurses network nls nptl nsplugin ogg openal opengl openmp pam pango pcre pdf perl phonon plasma png policykit portmidi ppds python qt3support qt4 readline s3 samba sdl semantic-desktop session spell sqlite sse sse2 sse3 ssl startup-notification subversion svg tcpd threads tiff timidity truetype udev udisks unicode upower usb valgrind vdpau vim-syntax vorbis wxwidgets x264 xcb xcomposite xinerama xml xpm xscreensaver xv xvid zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64" INPUT_DEVICES="keyboard mouse joystick" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en" LIRC_DEVICES="sb0540" NETBEANS_MODULES="*" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_3" RUBY_TARGETS="ruby19 ruby20" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON


Code:
# lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4)
00:1c.5 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c4)
00:1c.6 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 7 (rev c4)
00:1c.7 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 8 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Z77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 670] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
04:00.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 41)
05:00.0 Multimedia audio controller: Creative Labs SB0400 Audigy2 Value
05:01.0 FireWire (IEEE 1394): VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller (rev c0)
06:00.0 Ethernet controller: Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
07:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)

_________________
Gentoo Gaming Videos
krinn
Advocate
Joined: 02 May 2003
Posts: 4741

Posted: Sat Apr 19, 2014 1:57 am

When the bug occurs, did you check your card's IP address?
If for some reason a tool changes your IP to one that is already in use by another host, you can get that kind of mess.

And if your IP is fine, check whether your network card is buggy or has been put into a buggy state. Try unloading and reloading the card's module; that should re-initialize the device and get it working again.
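
Something along these lines would do it. This is only a sketch: the interface name eth0 is a guess, and the module name assumes the Intel 82579V from the lspci output is the port in use (it is driven by e1000e; the Atheros AR8161 would use alx instead).

```sh
#!/bin/sh
# Sketch only: IFACE and MODULE are assumptions, not confirmed in this thread.
IFACE="${IFACE:-eth0}"
MODULE="${MODULE:-e1000e}"   # e1000e for the Intel 82579V; alx for the AR8161

# Print the commands by default; set RUN=1 to actually execute them (needs root).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

reload_nic() {
    run ip addr show dev "$IFACE"   # which address does the card currently hold?
    run ethtool -i "$IFACE"         # confirm which driver is bound to the interface
    run modprobe -r "$MODULE"       # unload the driver
    run modprobe "$MODULE"          # reload it, which re-initializes the device
}
```

Calling reload_nic as written only prints the four commands, so they can be reviewed first; export RUN=1 to run them for real. This only works if the driver is built as a module rather than into the kernel.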
Hyper_Eye
Guru
Joined: 17 Aug 2003
Posts: 446
Location: Huntsville, AL.

Posted: Sat Apr 19, 2014 4:41 am

When the bug occurs I can't interact with the system at all. It is completely unresponsive.
Hyper_Eye
Guru
Joined: 17 Aug 2003
Posts: 446
Location: Huntsville, AL.

Posted: Mon Apr 21, 2014 3:47 pm

This happened again this morning. I plugged the laptop in and started Wireshark. There was no regular activity, just tons of ARP requests as the machines tried to establish who was who. As soon as I rebooted the Gentoo machine, the router sent a local master announcement and a domain/workgroup announcement, and the network started working correctly again. There was still no indication of why the Gentoo desktop was frozen or why the network was broken.
Hyper_Eye
Guru
Joined: 17 Aug 2003
Posts: 446
Location: Huntsville, AL.

Posted: Mon Apr 21, 2014 4:22 pm

Code:
Apr 21 06:21:40 dmwoodlx kernel: [233564.845395] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec36000
Apr 21 06:21:40 dmwoodlx kernel: [233564.845528] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec71000
Apr 21 06:21:40 dmwoodlx kernel: [233564.845571] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec91000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380752] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cecf1000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380796] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced11000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380853] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced31000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826064] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced7d000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826201] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cedb1000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826223] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cedd1000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365676] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee31000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365729] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee51000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365754] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee71000
Apr 21 08:21:47 dmwoodlx kernel: [240774.876977] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceebe000
Apr 21 08:21:47 dmwoodlx kernel: [240774.877030] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceef1000
Apr 21 08:21:47 dmwoodlx kernel: [240774.877048] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef11000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360005] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef71000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360022] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef91000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360389] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cefb1000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835306] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceffe000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835333] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cf031000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835368] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cf051000
Apr 21 10:30:50 dmwoodlx syslog-ng[2705]: syslog-ng starting up; version='3.4.7'


One thing that is consistent with this issue is the set of messages above in my kernel log. They are always the last thing in the log before the problem occurs. You can see that once they start, they recur every 30 minutes.
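
To check that spacing without reading the timestamps by eye, a small awk filter can print the gap in seconds between bursts. This is a rough sketch that assumes the syslog layout shown above (time of day in the third field) and does not handle a burst pair that straddles midnight:

```sh
#!/bin/sh
# Read syslog lines on stdin and print the gap, in seconds, between
# consecutive bursts of "hwdev DMA mask" messages.
burst_gaps() {
    awk '/hwdev DMA mask/ {
        split($3, t, ":")                       # $3 is HH:MM:SS
        now = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        if (seen && now != prev) print now - prev
        prev = now
        seen = 1
    }'
}
```

Usage would be something like `burst_gaps < /var/log/messages` (adjust the path to wherever your kernel messages are logged). For the excerpt above it prints gaps of 1801 to 1802 seconds, which also matches the roughly 1802.5 second steps in the kernel's own timestamps.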

Here is the code in the kernel that triggers this message:

lib/swiotlb.c
Code:
void *
swiotlb_alloc_coherent(struct device *hwdev, size_t size,
                       dma_addr_t *dma_handle, gfp_t flags)
{
        dma_addr_t dev_addr;
        void *ret;
        int order = get_order(size);
        u64 dma_mask = DMA_BIT_MASK(32);

        if (hwdev && hwdev->coherent_dma_mask)
                dma_mask = hwdev->coherent_dma_mask;

        ret = (void *)__get_free_pages(flags, order);
        if (ret) {
                dev_addr = swiotlb_virt_to_bus(hwdev, ret);
                if (dev_addr + size - 1 > dma_mask) {
                        /*
                         * The allocated memory isn't reachable by the device.
                         */
                        free_pages((unsigned long) ret, order);
                        ret = NULL;
                }
        }
        if (!ret) {
                /*
                 * We are either out of memory or the device can't DMA to
                 * GFP_DMA memory; fall back on map_single(), which
                 * will grab memory from the lowest available address range.
                 */
                phys_addr_t paddr = map_single(hwdev, 0, size, DMA_FROM_DEVICE);
                if (paddr == SWIOTLB_MAP_ERROR)
                        return NULL;

                ret = phys_to_virt(paddr);
                dev_addr = phys_to_dma(hwdev, paddr);

                /* Confirm address can be DMA'd by device */
                if (dev_addr + size - 1 > dma_mask) {
                        printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
                               (unsigned long long)dma_mask,
                               (unsigned long long)dev_addr);

                        /* DMA_TO_DEVICE to avoid memcpy in unmap_single */
                        swiotlb_tbl_unmap_single(hwdev, paddr,
                                                 size, DMA_TO_DEVICE);
                        return NULL;
                }
        }

        *dma_handle = dev_addr;
        memset(ret, 0, size);

        return ret;
}
EXPORT_SYMBOL(swiotlb_alloc_coherent);


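To reach that printk, the initial __get_free_pages() allocation must either fail or land above the device's DMA mask, and the fallback buffer from map_single() must then also end above the mask. The logged mask, 0x7fffffff, is only 31 bits wide, and every logged dev_addr (0xcec36000 and up) lies above it.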
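
The comparison from that function can be replayed with the logged values in plain shell arithmetic. This is just a sanity-check sketch; the 4096-byte size is an assumption, since the log does not show how large the failed allocation was:

```sh
#!/bin/sh
# Mirror of the kernel check `dev_addr + size - 1 > dma_mask`.
# Arguments: dev_addr size dma_mask (hex or decimal).
dma_reachable() {
    if [ $(( $1 + $2 - 1 > $3 )) -eq 1 ]; then
        echo unreachable   # buffer ends above the highest address the device can use
    else
        echo reachable
    fi
}

# First logged address, an assumed one-page size, and the logged mask.
dma_reachable 0xcec36000 4096 0x7fffffff
```

With a mask of 0x7fffffff the device can only address the first 2 GiB, while 0xcec36000 sits a little above 3.2 GiB, so the check fails regardless of the size assumed.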

If anyone knows what could cause this or how I might elicit more information from my system I would appreciate it.
Hyper_Eye
Guru
Joined: 17 Aug 2003
Posts: 446
Location: Huntsville, AL.

Posted: Fri Apr 25, 2014 7:50 pm

I'm still hoping for a solution to this problem. Should I post this issue somewhere else? Is there a place where someone may have a better idea of what is causing the DMA allocation errors? Thanks.
toneus
n00b
Joined: 27 Feb 2008
Posts: 35

Posted: Tue May 06, 2014 5:04 pm

I am experiencing many, if not all, of the same symptoms. In my search for an answer, I tested the memory on my server with memtest86. My RAM is failing Test #2 gloriously! Test #2 is the parallel address test.

Quote:
Test 1 [Address test, own address, Sequential]

Each address is written with its own address and then is checked for consistency. In theory previous tests should have caught any memory addressing problems. This test should catch any addressing errors that somehow were not previously detected. This test is done sequentially with each available CPU.

Test 2 [Address test, own address, Parallel]

Same as test 1 but the testing is done in parallel using all CPUs and using overlapping addresses.


I have run memtest86 on each individual RAM module, in different memory slots. Each continues to fail Test #2, so I don't think it's a module or motherboard issue.

This has now happened for my last two kernel upgrades, and I'm currently using Linux 3.12.13-gentoo.

Did we miss an important setting, or RAM related change in the latest kernels?

This is one of the most difficult problems I've ever encountered on Linux. There is literally no smoking gun, no log message, and no way to access the system once it is in this state.

Any help would be greatly appreciated!

Toneus
Hyper_Eye
Guru
Joined: 17 Aug 2003
Posts: 446
Location: Huntsville, AL.

Posted: Wed May 07, 2014 5:44 am

Are you running memtest in Gentoo? Booting to a memtest CD I don't get any memory errors.
toneus
n00b
Joined: 27 Feb 2008
Posts: 35

Posted: Wed May 07, 2014 5:04 pm

I am running memtest at boot, basically the same as booting it from a CD.
chithanh
Developer
Joined: 05 Aug 2006
Posts: 1792
Location: Berlin, Germany

Posted: Thu May 08, 2014 12:22 pm

Maybe it is a problem with your network card. It looks like the ethernet port is left in a state that causes problems for your router. You could try one or more of the following:
  • Install a different network card in the computer
  • Force ethernet connection to 100 Mbps with ethtool
  • Connect another ethernet switch in between the router and the computer
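
For the ethtool option, forcing the link down to 100 Mbps could look roughly like this. It is a sketch only; eth0 is an assumed interface name, and the setting does not survive a reboot or a driver reload:

```sh
#!/bin/sh
# Sketch only: IFACE is an assumed interface name.
IFACE="${IFACE:-eth0}"

# Print the commands by default; set RUN=1 to actually execute them (needs root).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

force_100mbit() {
    # Disable autonegotiation and pin the link to 100 Mbps full duplex.
    run ethtool -s "$IFACE" speed 100 duplex full autoneg off
    # Show the resulting link settings to confirm the change took effect.
    run ethtool "$IFACE"
}
```

Calling force_100mbit prints the two commands; export RUN=1 to apply them. If the hangs stop at 100 Mbps, that would point toward the gigabit link state rather than the rest of the system.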