Need to maximize filesystem read performance at any cost

8 posts • Page 1 of 1
Post by no_hope » Sat Apr 21, 2007 12:16 am

I need to maximize filesystem sequential read performance and I am willing to do unreasonable things as long as I don't have to spend money.

Context: I have a bunch of data files (1-2GB each) that I need to read sequentially and process as I read them. The files need to be processed in multiple passes in different orders, so basically I am sequentially reading a bunch of big files over and over.

The files are stored on a two-disk (3 Gb/s SATA) software RAID-0 array (which, strangely, doesn't seem to help much). I also have only 1 GB of memory, a significant part of which is used by the data-processing code.

Could somebody suggest ways to optimize my RAID and filesystem configuration that would help with sequential reads? I am willing to trade off pretty much everything for read performance.

So far, XFS with default options seems to perform best overall. I tried configuring my RAID with chunk sizes from 4 kB to 512 kB, and that didn't seem to make much difference either.

Could somebody help?
Post by BitJam » Sat Apr 21, 2007 2:49 am

Your post was curiously free of numbers. My first suggestion is to measure your disk throughput: first with hdparm -t, and later with a monitoring program (I use gkrellm, but there are many others).

With my disk, hdparm -t /dev/sda gives a speed of 76 MB/sec. The fastest sustained transfer I've seen with my monitor program is about 50 MB/sec, but I usually see much lower numbers. A few simple measurements should give you an idea of how much improvement is possible. They will also tell you if you are already close to the theoretical limit.

IMO the key thing will be to make sure your files are each written contiguously to the disk, so there is no fragmentation. This will minimize disk seeks and maximize your throughput. It is much, much more important than any gains you will get from RAID-0. A long time ago I built a data acquisition system using RAID-0 disks, and I wrote my own file system to ensure that every file was written to the disk contiguously.
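One way to encourage contiguous allocation without writing your own filesystem is to preallocate each file's full size before filling it, so the allocator can reserve the space in one go. A minimal sketch, assuming a filesystem and util-linux with fallocate support (XFS and ext4 qualify); the file name and size here are made up:

```shell
# Sketch: reserve the file's full size up front so the allocator can
# place it in as few contiguous extents as possible, then fill it.
fallocate -l 64M bigfile                                  # preallocate 64 MiB
dd if=/dev/zero of=bigfile bs=1M count=64 conv=notrunc 2>/dev/null
stat -c '%s bytes' bigfile                                # 67108864 bytes
rm -f bigfile
```

The conv=notrunc matters: without it, dd truncates the file and throws the preallocation away.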
Post by no_hope » Sat Apr 21, 2007 4:30 pm

I used dd to compare the performance of different configurations, since its access pattern closely matches my workload's:
dd if=/dev/zero of=z count=2024000
dd if=z of=/dev/null
with additional operations in between to keep caching from affecting the results. Every test was conducted on a freshly made filesystem, so there should be little fragmentation.
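For reference, one common form those "additional operations in between" take is to sync and drop the page cache before the read pass, so the second dd measures the disk rather than RAM. A sketch, assuming root and a kernel with /proc/sys/vm/drop_caches (2.6.16 or later):

```shell
# Sketch of a cache-proof sequential read benchmark.
dd if=/dev/zero of=z bs=1M count=1024 2>/dev/null   # write ~1 GB of test data
sync                                # flush dirty pages to disk
echo 3 > /proc/sys/vm/drop_caches   # evict cached data (run as root)
dd if=z of=/dev/null bs=1M          # this read now has to hit the disk
rm -f z
```

Without the drop_caches step, any part of the file still in the page cache is served from RAM and the reported throughput can be wildly optimistic.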

The weird thing is that chunk size and FS tweaks did not have a great effect on performance. That is, by tweaking things, I would get either the same or worse performance than the default. I thought it was possible to optimize a filesystem for large files at the expense of small ones, but I wasn't able to do that.

On average, XFS and JFS did seem to somewhat outperform ReiserFS and ext3 at reading (the only thing I care about): about 110 MB/s vs. 100 MB/s.
Post by BitJam » Sat Apr 21, 2007 5:35 pm

It looks like you are probably already getting close to the theoretical limit. The only other thing I could suggest is to look into extent-based filesystems. XFS and JFS both use extents, which would explain why you are getting better performance from them.

Are you able to run hdparm -t on your data disks? If so, how do those numbers compare with the tests you've already done?

My guess is that you are now as optimized as you can get for reading.
Post by Bad Penguin » Sun Apr 22, 2007 4:44 am

no_hope wrote: I need to maximize filesystem sequential read performance and I am willing to do unreasonable things as long as I don't have to spend money.

Context: I have a bunch of data files (1-2GB each) that I need to read sequentially and process as I read them. The files need to be processed in multiple passes in different orders, so basically I am sequentially reading a bunch of big files over and over.
If you are I/O bound but have plenty of memory, have you considered copying a file to some sort of ramdisk filesystem and processing it?
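A minimal sketch of that approach, using the tmpfs that most Linux systems mount at /dev/shm (the staged copy lives in RAM, so each file must fit in memory alongside the processing code; the small stand-in file here is just for illustration):

```shell
# Sketch: stage a data file on RAM-backed tmpfs, run the repeated
# passes against the in-RAM copy, then delete it to free the memory.
dd if=/dev/zero of=datafile bs=1M count=8 2>/dev/null   # stand-in data file
cp datafile /dev/shm/datafile
cmp datafile /dev/shm/datafile && echo "staged in RAM"
# ... run all processing passes over /dev/shm/datafile here ...
rm -f /dev/shm/datafile datafile
```

With 1-2 GB files and 1 GB of RAM this only pays off if the files can be split or the machine gets more memory; reads from tmpfs cost RAM bandwidth, not disk seeks.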
Post by Akkara » Sun Apr 22, 2007 9:52 am

Are all your passes using all the data they are reading each time through?

If the passes read different parts of the data and skip the rest, leaving it for other passes, it might be faster for pass 1 to read the file once, pre-split the data fields, and write out N temp files, each containing only the data needed by its corresponding pass. If your application fits this pattern, you read the big file once, then write and read N smaller files. So depending on how time(read big) + N*time(write temp) + N*time(read temp) compares with N*time(read big), it might be a win.
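Plugging made-up numbers into that comparison shows how it can go either way. Every figure below (file size, pass count, throughputs, temp-file fraction) is an assumption for illustration only:

```shell
# Back-of-envelope: 2000 MB file, N=4 passes, reads at 100 MB/s,
# writes at 80 MB/s, each temp file holding 1/N of the data.
big=2000; n=4; rd=100; wr=80
temp=$(( big / n ))                                    # 500 MB per temp file
split=$(( big / rd + n * temp / wr + n * temp / rd ))  # read big once + write/read temps
naive=$(( n * big / rd ))                              # re-read the big file N times
echo "pre-split: ${split}s  naive: ${naive}s"          # pre-split: 65s  naive: 80s
```

With these numbers pre-splitting wins, but shrink the skipped fraction (temp files closer to the full size) and the extra writes quickly eat the advantage.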
Post by Bornio » Sun Jun 17, 2007 10:28 pm

I am curious.
Say you need to run grep over 10 GB of text. It will take quite a while to go through all of it, and you will probably be limited only by the speed of your HD, because the CPU is much faster and will always be waiting for more data.

What happens if you compress the data, just enough to keep the CPU at top performance without "overflowing" it?
If our 10 GB were compressed to, say, 5 GB (and if it's text, it can be compressed to 5 GB if not much less than that), our grep operation would take half the time.
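A sketch of the idea using gzip: keep the corpus compressed on disk and decompress in a pipe, so the disk delivers only the compressed bytes while the spare CPU does the decompression. The tiny corpus here is just a stand-in:

```shell
# Sketch: grep a compressed file through a decompression pipe. The disk
# reads only the compressed bytes; gzip and grep use the idle CPU.
printf 'needle in a haystack\nplain hay\n' > corpus.txt
gzip -f corpus.txt                       # leaves corpus.txt.gz
gzip -dc corpus.txt.gz | grep needle     # prints "needle in a haystack"
rm -f corpus.txt.gz
```

Whether it halves the wall-clock time depends on the decompression rate keeping up with the disk's compressed-byte rate; if gzip becomes the bottleneck, a lighter compressor helps.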

Am I right?
Post by Sujao » Fri Jun 22, 2007 1:51 pm

Bornio wrote: I am curious.
Am I right?
Nice idea, I'd say yes.