mbar Veteran
Joined: 19 Jan 2005 Posts: 1990 Location: Poland
Posted: Thu Sep 14, 2017 9:39 am Post subject: LVM cache recurring corruption -- how to reinitialize device |
Hello all!
Since yesterday I have had a strange problem with my LVM RAID 5 (HDD) + LVM cache (SSD) setup. Some details:
- this is a relatively new minimal setup (fresh install on 25 August 2017) on a J3455-ITX (4-core Celeron): cryptsetup plus LVM's built-in RAID 5, 1 x SSD as /dev/sda (rootfs/system is not encrypted), 5 x encrypted 1.5 TB Samsung HDDs: /dev/sdb1 -> /dev/mapper/crypt1 and so on
- on top of the crypt[1-5] devices sits LVM RAID 5 (no MD RAID layer)
- added a 100 GB SSD cache on the encrypted /dev/sda4 partition -- reference: https://rwmj.wordpress.com/2014/05/22/using-lvms-new-cache-feature/
All was working OK for about 3 weeks. Two days ago I had to replace one HDD as it began to fail.
So:
- I uncached the LVM (went OK) -- reference: https://rwmj.wordpress.com/2014/05/23/removing-the-cache-from-an-lv/
- removed the failing drive
- added a new encrypted 1.5 TB drive
- resynced the LVM RAID 5
- fsck -- all OK
- LVM and filesystem status -- healthy
- then I reattached the SSD cache; all seemed to be working OK, the cache was up and running
- first reboot: LVM missing, cache device corrupted
I had done the cache removal/attach a few times as a test before the drive was replaced, and it went without any errors then.
Since the cache corruption I have had to recover manually to uncache the LVM and get access to the data: I had to edit the vgcfgbackup output by hand and do a vgcfgrestore.
This is similar: https://www.redhat.com/archives/linux-lvm/2016-December/msg00015.html -- not a single tool could uncache an LVM with a corrupt cache.
Anyway, after the manual recovery the data was intact, so I tried again.
I did pvremove on the SSD, then pvcreate, vgextend and so on. None of those commands displayed any error message, so I was sure the SSD cache was properly reinitialized.
The LVM stayed cached until the next reboot (today), when it went missing again. It seems to be unusable now for reasons unknown to me.
Is there any way to check/wipe the SSD cache partition (apart from overwriting it with /dev/zero)?
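For reference, one way to clear stale on-disk signatures short of a full zero pass is wipefs from util-linux. A sketch only, demonstrated on a scratch file -- the real target would be the SSD cache partition (e.g. /dev/sda4 from the setup above), so triple-check the path before running anything like this:

```shell
# Sketch: erase known signatures (LVM2_member, crypto_LUKS, ...) that
# probing tools detect, instead of zeroing all 100 GB. The scratch file
# below stands in for the real partition.
DEV=$(mktemp)                                        # stand-in for /dev/sda4
dd if=/dev/zero of="$DEV" bs=1M count=8 status=none  # fake 8 MiB "device"
wipefs --all "$DEV"                                  # wipe all signatures found
# metadata also lives at the start of the device; belt and braces:
dd if=/dev/zero of="$DEV" bs=1M count=4 conv=notrunc status=none
rm -f "$DEV"
```

On a real SSD partition, blkdiscard (also util-linux) would additionally TRIM the whole range.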
mbar Veteran
Posted: Thu Sep 14, 2017 5:01 pm Post subject: |
I don't understand this:
Code: | root@carbon:~# lvcreate -n cache0meta -L 120M vg0 /dev/mapper/luks_cache
Logical volume "cache0meta" created.
root@carbon:~# lvcreate -n cache0 -l 25568 vg0 /dev/mapper/luks_cache
Logical volume "cache0" created.
root@carbon:~# ls /dev/mapper/
control luks_lvm1 luks_lvm3 luks_lvm5 vg0-cache0meta vg0-lvol0_rimage_0 vg0-lvol0_rimage_2 vg0-lvol0_rimage_4 vg0-lvol0_rmeta_1 vg0-lvol0_rmeta_3
luks_cache luks_lvm2 luks_lvm4 vg0-cache0 vg0-lvol0 vg0-lvol0_rimage_1 vg0-lvol0_rimage_3 vg0-lvol0_rmeta_0 vg0-lvol0_rmeta_2 vg0-lvol0_rmeta_4
root@carbon:~# cache_check
No input file provided.
Usage: cache_check [options] {device|file}
Options:
{-q|--quiet}
{-h|--help}
{-V|--version}
{--clear-needs-check-flag}
{--super-block-only}
{--skip-mappings}
{--skip-hints}
{--skip-discards}
root@carbon:~# cache_check /dev/mapper/vg0-cache0
examining superblock
superblock is corrupt
bad checksum in superblock
root@carbon:~# cache_check /dev/mapper/vg0-cache0meta
examining superblock
superblock is corrupt
bad checksum in superblock
root@carbon:~# lvconvert --type cache-pool --poolmetadata vg0/cache0meta vg0/cache0
Using 128,00 KiB chunk size instead of default 64,00 KiB, so cache pool has less then 1000000 chunks.
WARNING: Converting logical volume vg0/cache0 and vg0/cache0meta to cache pool's data and metadata volumes with metadata wiping.
THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert vg0/cache0 and vg0/cache0meta? [y/n]: y
Converted vg0/cache0_cdata to cache pool.
root@carbon:~# ls /dev/mapper/
control luks_lvm1 luks_lvm3 luks_lvm5 vg0-lvol0_rimage_0 vg0-lvol0_rimage_2 vg0-lvol0_rimage_4 vg0-lvol0_rmeta_1 vg0-lvol0_rmeta_3
luks_cache luks_lvm2 luks_lvm4 vg0-lvol0 vg0-lvol0_rimage_1 vg0-lvol0_rimage_3 vg0-lvol0_rmeta_0 vg0-lvol0_rmeta_2 vg0-lvol0_rmeta_4
root@carbon:~# cache_check /dev/mapper/vg0-cache0meta
/dev/mapper/vg0-cache0meta: No such file or directory
root@carbon:~# cache_check /dev/mapper/luks_
luks_cache luks_lvm1 luks_lvm2 luks_lvm3 luks_lvm4 luks_lvm5
root@carbon:~# cache_check /dev/mapper/luks_cache
examining superblock
superblock is corrupt
bad checksum in superblock
|
This is on a newly wiped/trimmed SSD partition, with a NEW LUKS key, luksFormat, pvcreate, etc.
If I add the cache to my LVM RAID 5, it gets b0rked on reboot.
MageSlayer Apprentice
Joined: 26 Jul 2007 Posts: 252 Location: Ukraine
Posted: Fri Sep 15, 2017 9:15 am Post subject: |
Are you sure your SSD is OK?
Roman_Gruber Advocate
Joined: 03 Oct 2006 Posts: 3846 Location: Austro Bavaria
Posted: Fri Sep 15, 2017 11:59 am Post subject: |
Quote: | superblock is corrupt
bad checksum in superblock |
Did you check your cables, connections, power supply? Redo the wiring?
Of course, make sure the drive has the latest firmware. Is the drive healthy?
Quote: | /dev/mapper/vg0-cache0meta: No such file or directory |
Looks like it does not exist or is not visible to the operating system.
Sometimes I had to initialize it and make it visible to the OS with vgscan and vgchange -ay (or whatever the commands are, please check the manpages!). Sometimes only a reboot did the trick on some sysrescue-cd discs.
--
I never had a broken SSD; I sold my 5-year-old daily-used Plextor SSD recently. Out of habit I usually sell and replace HDDs every second or third year on average.
mbar Veteran
Posted: Fri Sep 15, 2017 4:51 pm Post subject: |
SSD seems to be healthy.
/dev/sda2 is a 16 GB system partition that has no trouble reading, writing, or updating.
SMART info is clean, dmesg too: no errors, not even CRC errors.
But I'll convert sda4 to plain ext4 and run some tests with files.
vg0-cache0meta is hidden by LVM after it is added to the pool as a cache for the HDDs, hence you can't check it explicitly.
mbar Veteran
Posted: Sat Sep 16, 2017 6:44 am Post subject: |
The SSD is OK, I just did a long SMART test plus a 25 GB copy and md5 checksum test on a BTRFS partition (of course I rebooted the machine in between):
Code: | smartctl -a /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.12.0-1-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG MZNTD128HAGM-00000
Serial Number: S15YNYAD625624
LU WWN Device Id: 5 002538 50003cf55
Firmware Version: DXT2300Q
User Capacity: 128,035,676,160 bytes [128 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Sep 16 08:29:39 2017 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
(...)
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
9 Power_On_Hours 0x0032 099 099 000 Old_age Always - 4090
12 Power_Cycle_Count 0x0032 096 096 000 Old_age Always - 3664
177 Wear_Leveling_Count 0x0013 095 095 000 Pre-fail Always - 56
179 Used_Rsvd_Blk_Cnt_Tot 0x0013 100 100 010 Pre-fail Always - 0
181 Program_Fail_Cnt_Total 0x0032 100 100 010 Old_age Always - 0
182 Erase_Fail_Count_Total 0x0032 100 100 010 Old_age Always - 0
183 Runtime_Bad_Block 0x0013 100 100 010 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0032 069 048 000 Old_age Always - 31
195 Hardware_ECC_Recovered 0x001a 200 200 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 100 100 000 Old_age Always - 0
235 Unknown_Attribute 0x0012 099 099 000 Old_age Always - 252
241 Total_LBAs_Written 0x0032 099 099 000 Old_age Always - 10202610775
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 4089 -
root@carbon:~# free && sync && echo 3 > /proc/sys/vm/drop_caches && free
total used free shared buff/cache available
Mem: 8021580 232696 137676 5104 7651208 7497636
Swap: 7812092 4608 7807484
total used free shared buff/cache available
Mem: 8021580 232176 7703052 5104 86352 7599356
Swap: 7812092 4608 7807484
[dest dir on SSD, btrfs, after reboot] md5sum *
c7caf4e97cadf52a2489a176284ed8f4 1.mkv
1095527685e2aba668bee2c2958229af 2.mkv
19e2153cce10b4317e2add27747c4356 3.mkv
62b1cac0498a245a69056506e4d6356c 4.mkv
26f9230e3da7158a87c60d526ed7eb26 5.mkv
79e4d0db765a93ac5dee1b2ed1b53e39 6.mkv
a8c438783ee10fe75fcb3ba2cd636238 7.mkv
81e141a09074e7a16756fb472458df9e 8.mkv
706a2ee617186be6796b20d425eb836d 9.mkv
a13e62ddad20cc0e796f7e9a46c09a83 10.mkv
[source dir on HDD] md5sum *
c7caf4e97cadf52a2489a176284ed8f4 1.mkv
1095527685e2aba668bee2c2958229af 2.mkv
19e2153cce10b4317e2add27747c4356 3.mkv
62b1cac0498a245a69056506e4d6356c 4.mkv
26f9230e3da7158a87c60d526ed7eb26 5.mkv
79e4d0db765a93ac5dee1b2ed1b53e39 6.mkv
a8c438783ee10fe75fcb3ba2cd636238 7.mkv
81e141a09074e7a16756fb472458df9e 8.mkv
706a2ee617186be6796b20d425eb836d 9.mkv
a13e62ddad20cc0e796f7e9a46c09a83 10.mkv
|
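The manual eyeball comparison above can also be automated with md5sum's check mode -- record the checksums at the source and verify at the destination. A sketch on scratch files (the temp directories stand in for the HDD source and SSD destination):

```shell
# Record checksums where the files originate, then verify the copies;
# any mismatch is reported per file and md5sum exits non-zero.
SRC=$(mktemp -d); DST=$(mktemp -d); SUMS=$(mktemp)   # stand-in directories
printf 'payload' > "$SRC/1.mkv"
cp "$SRC/1.mkv" "$DST/1.mkv"
( cd "$SRC" && md5sum *.mkv ) > "$SUMS"              # checksums at source
( cd "$DST" && md5sum -c "$SUMS" )                   # prints "1.mkv: OK"
rm -rf "$SRC" "$DST" "$SUMS"
```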
mbar Veteran
Posted: Sat Sep 16, 2017 8:41 am Post subject: |
Seems I'm onto something:
Code: | root@carbon:~# dd if=/dev/zero of=/dev/mapper/luks_cache status=progress
433003520 bytes (433 MB, 413 MiB), 9 s, 48.1 MB/s ^C^C^C
root@carbon:~#
root@carbon:~# dd if=/dev/zero of=/dev/mapper/luks_cache bs=8M status=progress
3447717888 bytes (3.4 GB, 3.2 GiB), 3.00394 s, 1.1 GB/s
dd: error writing '/dev/mapper/luks_cache': No space left on device
489+0 records in
488+0 records out
4093915136 bytes (4.1 GB, 3.8 GiB), 3.80779 s, 1.1 GB/s
|
In short, I tried to wipe the encrypted block device (on top of the 100 GB /dev/sda4 partition) and the write failed after just over 4 GB with a "no space left on device" message.
dmesg has this at the end:
Code: | Sep 16 10:27:27 carbon systemd[1]: Stopped target Encrypted Volumes.
Sep 16 10:27:27 carbon systemd[1]: Stopping Cryptography Setup for luks_cache...
Sep 16 10:27:27 carbon systemd[1]: Stopped Cryptography Setup for luks_cache.
Sep 16 10:27:37 carbon kernel: CMCI storm detected: switching to poll mode
|
The encrypted block device "luks_cache" was simply kicked out of the system, and I think it is connected to the "CMCI storm" (the first time I have seen this).
https://forums.gentoo.org/viewtopic-p-8115134.html <-- similar hardware here (Intel Celeron J3355 CPU).
Slower writes to plain BTRFS ran at approx. 100 MB/s (copying from HDD) and the system handled 25 GB with no problem.
4 GB of high-speed writes (~1 GB/s) seems to overwhelm it. Where to look next -- software or hardware?
Is there any kernel switch that could help here (I'm still on the 4.12 series on this machine)?
mbar Veteran
Posted: Sat Sep 16, 2017 1:19 pm Post subject: |
Writing to the raw (unencrypted) sda4 device seems OK, no storm here:
Code: | 107487428608 bytes (107 GB, 100 GiB), 872.027 s, 123 MB/s
dd: error writing '/dev/sda4': No space left on device
25630+0 records in
25629+0 records out
107497914368 bytes (107 GB, 100 GiB), 887.286 s, 121 MB/s
|
mbar Veteran
Posted: Sat Sep 16, 2017 2:23 pm Post subject: |
OK, the question is:
why is a raw write to the encrypted device so much faster (I suspect some kind of buffering in the device-mapper layer?) than a raw write to the unencrypted device?
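One plausible answer (an assumption, not verified on this machine): without O_DIRECT or a flush, dd reports how fast pages enter the kernel's page cache, not how fast the device absorbs them, so the first few GB of a burst land in RAM at memory speed regardless of the dm layer. A sketch on a scratch file showing the difference a forced flush makes to the reported rate:

```shell
F=$(mktemp)                                          # stand-in for the device
# buffered write: dd returns as soon as the data sits in the page cache,
# so the reported rate can far exceed what the backing store sustains
dd if=/dev/zero of="$F" bs=1M count=64 status=none
# conv=fsync: dd flushes to stable storage before reporting, so the rate
# reflects the real device (on a raw block device, oflag=direct is similar)
dd if=/dev/zero of="$F" bs=1M count=64 conv=fsync status=none
rm -f "$F"
```

With ~8 GB of RAM, several GB of buffered writes at ~1 GB/s before the device has to catch up would match the dd numbers reported above.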
mbar Veteran
Posted: Sat Sep 16, 2017 4:50 pm Post subject: |
Small success here: just like in the referenced thread, upgrading the kernel to 4.13.x seems to have fixed the "superfast writes" and the device being kicked out of device-mapper:
Code: | dd if=/dev/zero of=/dev/mapper/luks_cache bs=4M status=progress
9441378304 bytes (9.4 GB, 8.8 GiB), 67.0111 s, 141 MB/s
...
|
Write speed looks normal and dmesg reports no storm.
mbar Veteran
Posted: Sun Sep 17, 2017 6:45 am Post subject: |
I reinitialized the LVM cache and I'm at a loss here:
Code: | root@carbon:~# cache_check /dev/mapper/luks_cache
examining superblock
superblock is corrupt
bad checksum in superblock |
EDIT:
I did a quick test:
Code: | root@carbon:~# lvconvert --type cache --cachepool vg0/cache0 vg0/lvol0
Do you want wipe existing metadata of cache pool vg0/cache0? [y/n]: y
WARNING: Data redundancy is lost with writeback caching of raid logical volume!
Logical volume vg0/lvol0 is now cached.
...
root@carbon:~# lvremove vg0/cache0
Do you really want to remove and DISCARD logical volume vg0/cache0? [y/n]: y
Flushing 0 blocks for cache vg0/lvol0.
Logical volume "cache0" successfully removed
|
No rebooting with the cache enabled yet.
mbar Veteran
Posted: Sun Sep 17, 2017 5:02 pm Post subject: |
This is the last episode in this series (I hope) -- or "how I learned to stop worrying and love the cache".
After extensive testing on the unencrypted device (I even moved the partition 20 gigabytes to another location and also tried a smaller size), wiped with zeroes, I came to the conclusion that the cache_check "superblock corruption" status is probably a bug. I even tried with a downgraded 0.6.1 version.
I disabled cache_check in lvm.conf and my LVM RAID 5 with BTRFS has already survived 3 reboots, with a btrfsck after each reboot (uncached and cached -- no errors).
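For reference, the lvm.conf knob is roughly this (a sketch only -- section and key names as documented in lvm.conf(5), so double-check against your LVM version):

```
# /etc/lvm/lvm.conf -- disable the external cache-metadata check that was
# producing the (probably spurious) "superblock is corrupt" result.
# An empty executable string skips the check entirely.
global {
    cache_check_executable = ""
    # alternatively, keep the check but pass relaxed options:
    # cache_check_options = [ "-q", "--clear-needs-check-flag" ]
}
```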
Code: | root@carbon:~# ./lvmcache-statistics.sh
-------------------------------------------------------------------------
LVM [2.02.173(2)] cache report of found device /dev/vg0/lvol0
-------------------------------------------------------------------------
- Cache Usage: 4.6% - Metadata Usage: 23.7%
- Read Hit Rate: 27.1% - Write Hit Rate: 65.9%
- Demotions/Promotions/Dirty: 0/5412/0
- Feature arguments in use: metadata2 writeback
- Core arguments in use : migration_threshold 2048 smq 0
- Cache Policy: stochastic multiqueue (smq)
- Cache Metadata Mode: rw
- MetaData Operation Health: ok
root@carbon:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lvol0 vg0 Cwi-aoC--- <5,46t [cache0] [lvol0_corig] 4,68 23,74 0,00 |
Now we wait.