How do I find out why my file move takes so long (on btrfs)?

Letharion · Veteran Joined: 13 Jun 2005 Posts: 1344 Location: Sweden

I started moving a file of about 2 GiB from my SSD to my BTRFS raid-1. This wouldn't normally be a big deal, so I didn't expect it to take very long.

Now, 5 hours later, it still hasn't finished, and I have no idea why.

top shows very high values for "wait", hovering around 70 for long periods, though it goes up and down.
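The "wait" figure in top is the iowait share of CPU time, which on Linux comes from the aggregate cpu line in /proc/stat. A minimal Python sketch for sampling it directly, assuming a Linux system; the function names are hypothetical and only illustrate the arithmetic:

```python
def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line from /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    # Field order: user nice system idle iowait irq softirq steal.
    # Guest time is already counted in user/nice, so only eight fields are summed.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


def sample(path="/proc/stat"):
    """Read one sample; the first line of /proc/stat is the aggregate line."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

Taking one `sample()`, sleeping a few seconds, taking another and passing both to `iowait_percent` should give a number comparable to what top reports over the same interval.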

java, which is doing the move, takes up about 6 - 7% CPU according to top.

Connecting to the long-running process with strace shows:

vaxbrat · l33t Joined: 05 Oct 2005 Posts: 731 Location: DC Burbs

When a copy stalls like that it can often mean that the i/o is being blocked due to problems with the hardware. Look at your messages file to see if any btrfs events happened:
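A minimal sketch of that kind of check in Python, assuming the log lives at /var/log/messages (the path depends on the syslog setup) and that the relevant kernel lines contain the string "btrfs"; the helper names are hypothetical:

```python
def btrfs_lines(lines):
    """Yield log lines that mention btrfs, case-insensitively."""
    for line in lines:
        if "btrfs" in line.lower():
            yield line.rstrip("\n")


def scan_log(path="/var/log/messages"):
    """Return all btrfs-related lines from the log at `path` (assumed location)."""
    # errors="replace" keeps stray non-UTF-8 bytes from aborting the scan.
    with open(path, errors="replace") as log:
        return list(btrfs_lines(log))
```

Reading the log usually needs root, and on systems without a messages file the same filter can be applied to the kernel log from another source.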

Letharion · Veteran Joined: 13 Jun 2005 Posts: 1344 Location: Sweden

Thanks vaxbrat!

Based on your help, I've tried to continue diagnosing, but unfortunately I'm not sure I'm much wiser, see below.

vaxbrat · l33t Joined: 05 Oct 2005 Posts: 731 Location: DC Burbs

The low stats for btrfs-transact suggest to me that your btrfs raid is just fine but isn't getting much of anything thrown at it. That 5-10 second blip on the drive light is probably a journal flush and commit.

So it sounds like either your SSD is having trouble reading, or something else (a file lock?) may be blocking your copy. If you are doing this in a GUI, maybe a pop-up really became a pop-under and it's blocking while asking you some stupid question about whether you want to replace something, etc.

For your smartctl output, you really want to look at the raw values of the SMART attributes to spot things like ECC errors, reallocated sectors, and offline uncorrectable sectors. For example, here's a dump from a brand new WD Red 4 TB drive, which reports as clean as a whistle:
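To pull just those raw values out of a long attribute table, a small Python sketch can help. It assumes the usual ten-column layout printed by `smartctl -A` (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE); the attribute names in `WATCHED` are common ones but vary by vendor, and the function name is hypothetical:

```python
# Attribute names are an assumption: they are common in smartctl output,
# but a given drive may report different ones.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Offline_Uncorrectable",
    "Hardware_ECC_Recovered",
}


def raw_values(smartctl_output, watched=WATCHED):
    """Map watched attribute names to the text of their RAW_VALUE column."""
    found = {}
    for line in smartctl_output.splitlines():
        cols = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns;
        # the raw value may itself contain spaces, so the tail is rejoined.
        if len(cols) >= 10 and cols[0].isdigit() and cols[1] in watched:
            found[cols[1]] = " ".join(cols[9:])
    return found
```

Feed it the text printed by `smartctl -A` for the device in question; any watched attribute with a nonzero raw value deserves a closer look.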

Letharion · Veteran Joined: 13 Jun 2005 Posts: 1344 Location: Sweden

I see, now I understand better what I'm looking at.

The oldest drive by far, which is one of the drives in the raid, looks like this:

vaxbrat · l33t Joined: 05 Oct 2005 Posts: 731 Location: DC Burbs

On your first drive:

vaxbrat · l33t Joined: 05 Oct 2005 Posts: 731 Location: DC Burbs

Also notice:

davidm · Guru Joined: 26 Apr 2009 Posts: 557 Location: US

All versions from 3.15.0 through 3.17-rc2 (at least) have a nasty bug causing deadlock-type conditions when compression is enabled. From what I understand, 3.14.x also had some variation of this but is still more stable. Patches exist and will probably land in a future kernel release. See the mailing list for details. Just thought I'd mention it in case this is what you are running into and it's not hardware-related at all.
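A quick way to see whether a kernel falls in that range is to compare its release string, sketched here in Python. The range is taken literally from the description above (3.15.0 through 3.17-rc2); anything later is reported as outside it only because nothing is stated about it, not because it is known to be fixed. The function name is hypothetical:

```python
import re


def in_reported_range(release):
    """True if a kernel release string falls in 3.15.0 through 3.17-rc2."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?", release)
    if not m:
        return False
    major, minor = int(m.group(1)), int(m.group(2))
    rc = int(m.group(4)) if m.group(4) else None
    if major != 3:
        return False
    if minor == 15:
        # 3.15 release candidates predate 3.15.0, so they are excluded.
        return rc is None
    if minor == 16:
        return True
    # For 3.17 only rc1 and rc2 are covered by the report.
    return minor == 17 and rc is not None and rc <= 2
```

Calling `in_reported_range(platform.release())` after `import platform` checks the running kernel. The check says nothing about whether compression is actually enabled on the filesystem, which is the other half of the condition.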