Gentoo Forums
Question about the safety of data

axl
Apprentice


Joined: 11 Oct 2002
Posts: 246
Location: Romania

Posted: Mon Mar 20, 2017 6:12 pm

NeddySeagoon wrote:
axl,

It's a Scottish word.
It does not mean that you are childish, it means that you are young compared to the person using the phrase "just a bairn".
I'm 63.


many more to come old timer. i'm not a native english speaker. uhm, sir. i meant sir.

uhm, seems to me there's some drama. i didn't know there would be drama.
depontius
Advocate


Joined: 05 May 2004
Posts: 3234

Posted: Mon Mar 20, 2017 6:44 pm

This is all interesting reading, especially the last stuff about ext4 defaults.

I've seen the most important word a few times in the thread, but not that often considering the topic - backups.

A few years back, my venerable and ancient 2x40GB RAID1 was finally getting overstuffed, so I bought a pair of 2TB drives for the new RAID1. Somehow I had gotten the impression that btrfs had actually managed to mature, at least enough for a home setup - so I used it. At the same time, I didn't quite trust either these gigantic new drives or btrfs, so I finally put in place a real backup plan. I have 2TB portable drives. At the time I had two, now I have three. Every night I ping-pong between the two, rsync'ing my RAID to the least-recently written portable. I also implemented offsite backup. One drive is sitting in my cabinet at work. (That's why I'm up to three drives, two to ping-pong and one at work.)
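For anyone wanting to copy the ping-pong idea, picking the "least-recently written" portable could be sketched roughly like this in Python. This is only an illustration, not the actual script: the stamp file name `.last_backup` and the function names `pick_oldest_target` and `mark_written` are made up, and it assumes each portable drive is already mounted at its own path.

```python
import os
from pathlib import Path

STAMP_NAME = ".last_backup"  # hypothetical marker file, one per portable drive


def pick_oldest_target(mount_points, stamp_name=STAMP_NAME):
    """Return the mount point whose stamp file is oldest (or missing)."""
    def last_written(path):
        try:
            return os.path.getmtime(os.path.join(path, stamp_name))
        except FileNotFoundError:
            return 0.0  # never written to: treat it as the oldest

    return min(mount_points, key=last_written)


def mark_written(mount_point, stamp_name=STAMP_NAME):
    """Touch the stamp file after a successful backup run."""
    Path(mount_point, stamp_name).touch()
```

The nightly job would call `pick_oldest_target` on the plugged-in drives, run rsync against the result, and call `mark_written` only if rsync succeeded, so a failed run does not rotate the drives.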

Anyway, I've had a UPS, but didn't during that timeframe, and had a power fault. My ext4 partitions came back OK, my XFS (MythTV) partition came back. My btrfs partition utterly died. Not just corrupted - GONE. Running btrfs-scan, or some command like that, couldn't even tell that btrfs had ever existed on the drives. What's worse, I didn't catch it immediately upon powering back up - I don't remember why - I think it was evening and I didn't want to power up the client machines since it was bedtime - I just got the servers back up. Anyway, that night's backup backed up nothing onto the portable drive at home. Wiped.

Saved by offsite backup.

After a bit of fussing I reformatted as ext4, restored from the drive at work, and have never looked back.

As for remedial actions... My backup script now makes sure that the source drive is properly mounted, not just the destination drive. It will never back up an empty mount point again. I'm up to three portable drives, so that except for the work-day Mondays and Fridays, when I'm actually transporting drives, there are always two drives plugged in and one at work. As mentioned, I'm back to ext4. Oh, and UPS.
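The source-mount check is the part most worth stealing. A minimal sketch in Python, under some loud assumptions: the function names are invented, the real script's rsync options are not given in this thread, and `-a --delete` is only a guess at what would produce the "Wiped" result described above.

```python
import os
import subprocess


def source_is_ready(source):
    """True only if source is a real mount point that contains something."""
    if not os.path.ismount(source):
        return False
    with os.scandir(source) as entries:
        return any(True for _ in entries)


def run_backup(source, destination):
    """Mirror source onto destination, refusing if either side looks wrong."""
    if not source_is_ready(source):
        raise RuntimeError(f"refusing to back up: {source} is not mounted or is empty")
    if not os.path.ismount(destination):
        raise RuntimeError(f"refusing to back up: {destination} is not mounted")
    # Assumed rsync options. --delete is what makes an empty source dangerous:
    # it would remove everything on the destination.
    cmd = ["rsync", "-a", "--delete", source.rstrip("/") + "/", destination]
    subprocess.run(cmd, check=True)
```

Checking both "is a mount point" and "is not empty" covers the bare mount point case as well as a filesystem that mounts but comes back blank.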

In retrospect, I feel some guilt about wiping the btrfs. Had I the extra space sitting around, I really should have somehow gotten that data to the btrfs developers.

EDIT - to add one other thing, I run the recommended weekly mdadm sync.
_________________
.sigs waste space and bandwidth
axl
Apprentice


Joined: 11 Oct 2002
Posts: 246
Location: Romania

Posted: Mon Mar 20, 2017 7:16 pm

there's a strange mix of things.

delalloc. Tso. systemd. i know the drama behind systemd. but never heard of this Tso or delalloc thing. could anyone please elaborate ?


meanwhile, i kinda hate btrfs.

someone once told me that my posts are like blog posts. and this is exactly what I aimed for when i started the thread. talking about my stuff. catching news. maybe some drama. so ... i just want to understand what people are talking about. seems things escalated but i don't understand the fight.


anyway. going back to btrfs. blog style. I mentioned yesterday i was on my way back to raid/xfs. first you got to get rid of btrfs.

so first step was to convert btrfs from raid1 back to single. which took 10+ hours.

I say +10 because i know these drives do a complete run of each other in 8 hours. i mentioned that before. but again. i caught what they were doing, while they were doing it.

so this is the convert from raid1 to single. side by side drive A and drive B.

https://www.youtube.com/watch?v=X_jpViD6wKE

also http://dale.ro/~axl/sda1.png and http://dale.ro/~axl/sdb1.png

then came the step to remove drive b from array. which is weird. btrfs is weird. talk about a device that is a mount point.

anyway.

the end result of removing drive b from mount point took another few hours.

https://www.youtube.com/watch?v=DqG7Quv8o0Q

http://dale.ro/~axl/sda2.png

http://dale.ro/~axl/sdb2.png

there needs to be like a legend. what am i looking at. again.


we are talking about 2 WD reds of 4TB that were put in a raid1 btrfs configuration. first animation shows the btrfs being reconfigured as a single. second shows drive B being pulled out of the btrfs. i took the long way around. i mentioned that yesterday.

just to be safe.


tomorrow. btrfs 2 xfs. moving data around. and maybe, just maybe, how raid syncs data. meanwhile... what's all the drama about?
Hu
Moderator


Joined: 06 Mar 2007
Posts: 11184

Posted: Tue Mar 21, 2017 1:44 am

steveL wrote:
Well, Tso's article is pretty authoritative in that it comes from the upstream author of ext2/3/4, so the problem is definitely real.
Agreed. I defer to his authority in this area. However, his linked article is fairly old, and I hoped that the situation has improved somewhat since he wrote that post. My complaint was about the linked LWN comment which, while it might be perfectly valid, was so free of useful data that we cannot determine whether he was in an environment that had any hope of benefiting from any recent fixes to the relevant code. If he's running an enterprise distribution with multi-level change control boards, his "current" environment might be years out of date. On the other hand, he might be running Gentoo unstable and using newer code than some of us run. I doubt the latter, but he gave us nothing to use to guess where he is in history.
steveL wrote:
It's fine to provide options that are only safe when you use an UPS. It is not fine to pretend to users that they are getting the same data=ordered treatment as ext3, while doing nothing of the sort in the default setup.
Did anyone authoritative ever claim that ext4 with delalloc was as safe as ext3 with data=ordered? Users may have assumed it, but if no one ever asserted it to be safe, I can see why the kernel developers were a bit snippy that users were suddenly surprised. The kernel developers seem to believe that it was only an accident that users ever got away with this, and then only because the users got lucky. I grant that it would be nice if it was more clearly documented that, in the name of improved performance, some technically incorrect but frequently functional scenarios were no longer functional.

While I have used systems that lack battery backup, I have no sympathy for people who intentionally perform unclean halts, then complain that the unclean halt was unclean. Unclean halts should only occur when your kernel crashes, your hardware crashes, or events outside your control interrupt power. ACPI soft-off has existed for many years, and is a far better choice for halting a machine.
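For readers who have not seen Tso's article: the "technically incorrect but frequently functional" case is, as far as I understand it, an application that replaces a file by writing a new copy and renaming it over the old one without ever calling fsync. The careful version of that pattern looks roughly like the Python sketch below. The function name `replace_file_durably` is made up for illustration, and this is a sketch of the general technique on a POSIX system, not code taken from the article.

```python
import os
import tempfile


def replace_file_durably(path, data: bytes):
    """Replace path with data so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) as the
    # target, otherwise the rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the data blocks to disk first
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory so the rename itself survives a power cut
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Dropping the `os.fsync` on the temp file is the shortcut at issue: with delayed allocation the rename can reach the disk before the data does, and an unclean halt in that window can leave a zero-length file.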
Goverp
Guru


Joined: 07 Mar 2007
Posts: 504

Posted: Tue Mar 21, 2017 10:16 am

Tso's delalloc article is interesting. IIUC, one way of approaching the problem is to consider a (decent) filesystem as composed of two parts: (a) a journal of all update transactions; and (b) a cache of the results of applying all the journalled transactions (the actual files). The problem arises because the cache (file system) has been updated ahead of a syncpoint in the journal, which of course is madness. Many mad things are done in the name of performance.

Which raises a few questions: a) would configuring data journalling in ext4 mitigate the described problem; b) can logging file systems such as f2fs and logfs handle such situations better; and c) is there an excessive performance hit ;-)
_________________
Greybeard
All times are GMT