garythompson n00b

Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
Posted: Thu Jun 03, 2010 8:57 am Post subject: Corrupt Network Comms from Busy Atom
Hi All,
I have an Atom box running Hardened Gentoo which seems to corrupt network traffic when the CPU is busy. Background:
- I'm using the Atom as network storage (CIFS), running a combination of LVM, RAID5, RAID10 and dm-crypt LUKS
- The CPU happily handles the network load (worst case 10% CPU load)
- The system holds VirtualBox VDI files, accessed as a Samba mount from my main desktop
Problems started:
- The Atom box was compiling a new kernel (removing hardened due to a problem with VirtualBox, PaX and a non-hwvirt system)
- I started VirtualBox on the main system; reading disk files from the Atom box reported corruption (no significant log messages)
- I SSHed in and, when I ran dmesg, the session dropped out with:
Corrupted MAC on input.
Disconnecting: Packet corrupt
- After compilation completed, I ran a bzip operation and could repeat the same issues.
- nothing in dmesg
I'm now concerned about the reliability of CIFS. If something as simple as the CPU becoming loaded can cause data corruption on the network, I might need to reconsider what I'm doing here.
Has anyone else had similar problems? I've read elsewhere about people having problems copying large files over the network and finding that the copy (made over Samba) has a different MD5 hash from the source, but nothing that can help me diagnose this problem.
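One way to pin down the kind of mismatch described above is to hash the file on the server and again through the Samba mount, then locate where the two copies first diverge. The following is a minimal Python sketch under assumptions: `file_digest` and `first_mismatch` are hypothetical helper names, not part of Samba or any tool mentioned in this thread.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large disk images need not fit in RAM."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_mismatch(path_a, path_b, chunk_size=1 << 20):
    """Return the offset of the first differing chunk, or None if the files match.

    The offset is the start of the chunk that differs, not the exact byte.
    A length difference also counts as a mismatch.
    """
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                return offset
            if not chunk_a:  # both files ended at the same point
                return None
            offset += len(chunk_a)
```

Running `file_digest` several times on the same file while the CPU is loaded, and comparing the results, would also show whether the corruption is repeatable or random.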
I tried renice on the smbd processes on the Atom box but it didn't fix the problem. I suspect an issue with the networking side of things.
_________________
Life is for Living
 |
garythompson n00b

Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
Posted: Thu Jun 03, 2010 12:47 pm
This appears to have sorted itself out. I need to read up more on PaX, PIE and so forth, because once I disabled PaX in the hardened kernel my system became very unstable. (I disabled it in an attempt to get VirtualBox to work; see http://www.virtualbox.org/ticket/941.)
I went back to the hardened kernel (I had a spare one) and everything is stable again. Do I really need to rebuild my entire system in order to stop using PaX? I would still like to use GRSEC at least.
 |
Hu Administrator

Joined: 06 Mar 2007 Posts: 23523
Posted: Thu Jun 03, 2010 10:04 pm
Aside from the configuration change between PaX/non-PaX, do the working and nonworking kernels differ? Are they both the same version and patch level of the kernel? Are there any other configuration differences?
 |
garythompson n00b

Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
Posted: Sat Jun 05, 2010 10:47 pm
Hi,
Exact same kernel (2.6.28-r3), using the same config. I only disabled PaX in menuconfig and recompiled.
 |
garythompson n00b

Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
Posted: Mon Jun 07, 2010 8:41 am
Well, I changed to a non-hardened profile and compiled a new kernel (gentoo-sources, adopting the same driver settings) with no PaX, GRSEC, etc.
The system is unstable: network corruption (as noted above with SSH) when the CPU is busy. I have a feeling I'm going to regret cheaping out and going with the Atom.
Note that I emerged the toolchain first, then ran emerge -e world, and then did a make clean on the kernel before rebuilding it.
EDIT:
I now suspect my main computer. I'm getting "Failed on RMD160 verification" on portage downloads, my web traffic is sporadically garbled (a refresh fixes it) and other symptoms are developing. I'm stuck, though, because this system has worked fine for a while (since I rebuilt it with a new mobo/CPU/RAM a couple of months ago).
It seems to be only network traffic that's playing up. It could be my network hardware, but I'm not sure how I would diagnose that until I can make the problem repeatable.
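To make a fault like this repeatable, one option is to push known random data through a TCP connection and check that it comes back unchanged while the CPU is busy. The sketch below is an illustration in Python, not a tool from this thread: `echo_once`, `recv_exact` and `count_corrupt_blocks` are hypothetical names, and it assumes the echo side can be run on the other machine. TCP's own checksum already catches most on-the-wire damage, so mismatches seen here would more likely point at a host (RAM, CPU, NIC offload) than at the cable or switch.

```python
import os
import socket


def echo_once(server_sock):
    """Accept one connection on a listening socket and send every byte straight back."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:  # peer closed the connection
                break
            conn.sendall(data)


def recv_exact(sock, size):
    """Read exactly `size` bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < size:
        part = sock.recv(size - len(buf))
        if not part:
            raise ConnectionError("connection closed before the full block arrived")
        buf.extend(part)
    return bytes(buf)


def count_corrupt_blocks(host, port, blocks=100, block_size=16 * 1024):
    """Send random blocks to an echo service and count those that come back changed.

    Blocks are kept small because each one is fully sent before it is read back;
    a very large block could stall both ends once the socket buffers fill.
    """
    bad = 0
    with socket.create_connection((host, port)) as sock:
        for _ in range(blocks):
            payload = os.urandom(block_size)
            sock.sendall(payload)
            if recv_exact(sock, block_size) != payload:
                bad += 1
    return bad
```

Running `echo_once` on one box and `count_corrupt_blocks` on the other, first idle and then during a kernel compile or bzip run, would show whether the load is really what triggers the corruption. Swapping the roles of the two machines might help show which side is at fault.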
ANOTHER EDIT:
Symptoms on my main computer have been:
- SSH dropping out during network or system load, with the message noted above
- Garbled web browsing (sporadic)
- Thunderbird failing to download mail with error ssl_error_bad_mac_read
- NTP failing to download (emerge) with "Failed on RMD160 verification"
- VirtualBox failing to start (loading saved state from a network device)
Almost all of the above symptoms went away with a system reboot (although I have had them previously). I guess the problem is no longer just a network issue. I've posted about this particular issue elsewhere (https://forums.gentoo.org/viewtopic-p-6309988.html#6309988).
Last edited by garythompson on Thu Jun 17, 2010 7:33 am; edited 1 time in total
 |
garythompson n00b

Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
Posted: Thu Jun 17, 2010 7:32 am
Summary:
- Everything was working fine: virtual machines were being loaded by a Gentoo-based client from a hardened Samba server running Gentoo.
- I downgraded security and disabled PaX in the kernel (in order to run a VM on the same server); large files moved over the network then appeared to be corrupted.
- Moving to a vanilla kernel didn't seem to help.
- No changes at all were made to the OS of the Samba client (the Gentoo-based system), but it appears this may have ended up being a client issue. See the post linked to previously.