- I configured the smartd daemon to run a short self-test automatically once a day in the early morning:
Code:
DEVICESCAN -a -I 194 -R 5! -W 2,45,50 -s S/../.././03
# Full list at end of file. This is interpreted as:
#   -a               monitor all default attributes
#   -I 194           ignore changes in the normalized value of attribute 194 (temperature, Celsius)
#   -R 5!            report any change in the raw value of attribute 5 as 'critical'
#   -W 2,45,50       warn of temperature (Celsius):
#                      2  - changes of 2 degrees or more
#                      45 - informational warning: at or above 45
#                      50 - critical warning: at or above 50
#   -s S/../.././03  perform a short test every day between 03:00 and 04:00

- I use a custom script to produce a report on every drive:
Code:
#!/bin/bash
# Script provides a meaningful report on the current status of all attached HDD and SSD
# devices. It is assumed smartmontools is installed and the 'smartd' daemon is running.
# It is also assumed /path/to/smartd.conf is configured appropriately.

# Clear results of previous run.
# This should probably live in /var/log and be saved/rotated.
# If so, then run date and time initialized in log name?
RPTNAME="smartdisk.rpt"
date > "$RPTNAME"    # '>' truncates any report left over from the previous run
echo " "

# Select and process HDD and SSD drives. It is not meaningful to process other types
# of block devices. Using 'lsblk' does this perfectly.
#   -dn  these options suppress lsblk header and partition details
# TODO Not yet sure what's needed here. Connect a USB card reader and see.
#      (i.e. check how other block devices are reported and then filter
#      if needed)
#
# We want the report to show:
#   make, model and serial number info
#   select status info
#   all errors
#   temperature
#   'Unknown'  - so we can pick up the HE status for helium drives.
#                Normal value should be '100'
#   daily self test results
#   '## Short' - lists prior self test results. SMART enabled drives retain
#                the last 21 entries. We will report only the last 5 entries.
#                (' +' allows for the column padding in smartctl output.)
#   'result'   - Health check overall result. This is meaningless without
#                first running at least one short/long test.
for f in $(lsblk -dn | awk '{print $1}')
do
    echo " " >> "$RPTNAME"
    echo "Processing disk /dev/$f ..." >> "$RPTNAME"
    smartctl -a "/dev/$f" | grep -E 'Model|Serial|Version|atabase|Unknown|ogged|eallocated|Spin_Retry|Read_Error|Celsius| [12345] +Short' >> "$RPTNAME"
    smartctl -H "/dev/$f" | grep 'result' >> "$RPTNAME"
done

# close the report
echo " " >> "$RPTNAME"
echo "report complete" >> "$RPTNAME"
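
For the TODO about other block devices, one possible approach is to ask `lsblk` for the TYPE and TRAN columns as well and keep only whole disks that are not attached over USB. This is only a sketch: `filter_disks` is a hypothetical helper name, skipping USB is an assumption about what should be excluded, and it has not been checked against an actual card reader.

```shell
#!/bin/bash
# Sketch only: filter_disks is a hypothetical helper, not part of the script above.
# It reads "NAME TYPE TRAN" rows, as printed by `lsblk -dn -o NAME,TYPE,TRAN`,
# and prints the names of whole disks that are not attached over USB.
# Rows with an empty TRAN column (no transport reported) are kept.
filter_disks() {
    awk '$2 == "disk" && $3 != "usb" { print $1 }'
}

# Intended use in the report loop (assumes the installed lsblk has the TRAN column):
#   for f in $(lsblk -dn -o NAME,TYPE,TRAN | filter_disks); do ... done
```

Filtering on TYPE drops optical drives (`rom`) and loop devices; the TRAN check is the part most likely to need adjusting for a given system.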
Code:
Processing disk /dev/sdb ...
Device Model: MD1TBLSSHD
Serial Number: MD302334551
Firmware Version: 03.01A02
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
194 Temperature_Celsius 0x0022 104 094 000 Old_age Always - 43
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
# 1 Short offline Completed without error 00% 9142 -
# 2 Short offline Completed without error 00% 9118 -
# 3 Short offline Completed without error 00% 9094 -
# 4 Short offline Completed without error 00% 9070 -
# 5 Short offline Completed without error 00% 9046 -
SMART overall-health self-assessment test result: PASSED

Code:
Processing disk /dev/sdaj ...
Device Model: WL6000GSA6457
Serial Number: WOL240332217
Firmware Version: A3.00F.0
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 3.0 Gb/s (current: 3.0 Gb/s)
1 Raw_Read_Error_Rate 0x002f 200 191 051 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0033 194 194 140 Pre-fail Always - 195
10 Spin_Retry_Count 0x0033 100 253 051 Pre-fail Always - 0
194 Temperature_Celsius 0x0022 117 096 000 Old_age Always - 35
196 Reallocated_Event_Count 0x0032 196 196 000 Old_age Always - 4
SMART Error Log Version: 1
No Errors Logged
# 1 Short offline Completed: read failure 60% 23226 987648
# 2 Short offline Completed: read failure 60% 23202 987648
# 3 Short offline Completed: read failure 60% 23179 987648
# 4 Short offline Completed: read failure 60% 23155 987648
# 5 Short offline Completed: read failure 60% 23131 987648
SMART overall-health self-assessment test result: PASSED

Is it actually possible to extract specific attribute data from Seagate drives anymore? The default smartctl output for Seagate drives now looks like this:
Code:
# smartctl -a /dev/sdaz
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.19.16-gentoo] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: ST6000NM0034
Revision: E005
Compliance: SPC-4
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Formatted with type 2 protection
8 bytes of protection information per logical block
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c500835ec477
Serial number: Z4D1RXYQ0000R535WXLM
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Fri Jan 25 17:40:09 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature: 37 C
Drive Trip Temperature: 60 C
Manufactured in week 14 of year 2015
Specified cycle count over device lifetime: 10000
Accumulated start-stop cycles: 32
Specified load-unload count over device lifetime: 300000
Accumulated load-unload cycles: 1457
Elements in grown defect list: 0
Vendor (Seagate Cache) information
Blocks sent to initiator = 2524347320
Blocks received from initiator = 2261760576
Blocks read from cache and sent to initiator = 2208709482
Number of read and write commands whose size <= segment size = 2850162024
Number of read and write commands whose size > segment size = 8848917
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 27745.60
number of minutes until next internal SMART test = 36
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   500918108        0         0  500918108          0    1770151.992           0
write:          0        0         0          0          0    1805802.472           0
verify:        64        0         0         64          0          0.000           0
Non-medium error count: 338
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed - 27731 - [- - -]
# 2 Background short Completed - 27707 - [- - -]
# 3 Background short Completed - 27691 - [- - -]
Long (extended) Self-test duration: 40836 seconds [680.6 minutes]

I can tune my script so that I get most of the equivalent information. The resulting report for a Seagate drive looks like this:
Code:
Processing disk /dev/sdbk ...
Vendor: SEAGATE
Product: ST6000NM0034
Revision: E005
Serial number: Z4D1P08Y0000R536MR9N
Transport protocol: SAS (SPL-3)
Temperature Warning: Enabled
Current Drive Temperature: 37 C
Drive Trip Temperature: 60 C
read: 963713629 0 0 963713629 0 1746531.708 0
write: 0 0 0 0 0 1801339.923 0
verify: 10336 0 0 10336 0 0.021 0
# 1 Background short Completed - 27755 - [- - -]
# 2 Background short Completed - 27731 - [- - -]
# 3 Background short Completed - 27716 - [- - -]

Code:
smartctl -a "/dev/$f" | grep -E 'Vendor:|Product|odel|Serial|Version|Revision|Transport|atabase|Unknown|ogged|eallocated|Spin_Retry|Read_Error|Celsius|Temperature|read:|write:|verify:| [12345] +Short| [12345] +Background' >> "$RPTNAME"
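
Since the SAS drives print a different set of lines than the ATA drives, another option is to pick the `grep` pattern per drive instead of merging everything into one expression. The sketch below assumes SAS output can be recognised by its `Transport protocol:` line, as in the listing above. `pick_pattern`, `ATA_PATTERN` and `SAS_PATTERN` are hypothetical names, and the two patterns are based on the ones in this post.

```shell
#!/bin/bash
# Sketch only: pick_pattern is a hypothetical helper, not a smartctl feature.
# It reads `smartctl -i` output on stdin and prints a grep -E pattern:
# the SAS pattern if a "Transport protocol: SAS" line is present,
# otherwise the ATA pattern. ' +' allows for column padding in the self-test log.
ATA_PATTERN='Model|Serial|Version|atabase|Unknown|ogged|eallocated|Spin_Retry|Read_Error|Celsius| [12345] +Short'
SAS_PATTERN='Vendor:|Product|Serial|Revision|Transport|Temperature|read:|write:|verify:| [12345] +Background'

pick_pattern() {
    if grep -q 'Transport protocol: *SAS'; then
        printf '%s\n' "$SAS_PATTERN"
    else
        printf '%s\n' "$ATA_PATTERN"
    fi
}

# Intended use inside the report loop:
#   pattern=$(smartctl -i "/dev/$f" | pick_pattern)
#   smartctl -a "/dev/$f" | grep -E "$pattern" >> "$RPTNAME"
```

Keeping the two patterns separate makes it easier to see which lines are expected from which kind of drive, at the cost of one extra `smartctl -i` call per device.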
