| Author |
Message |
Robert S Guru


Joined: 15 Aug 2004 Posts: 412 Location: Canberra Australia
|
Posted: Mon Apr 09, 2012 9:04 pm Post subject: Impending HD failure? |
|
|
I'm starting to get these:
| Code: | Apr 10 00:36:57 myserver smartd[21225]: Device: /dev/sda [SAT], 3 Currently unreadable (pending) sectors
Apr 10 00:36:57 myserver smartd[21225]: Device: /dev/sda [SAT], 3 Offline uncorrectable sectors
Apr 10 01:06:57 myserver smartd[21225]: Device: /dev/sda [SAT], 3 Currently unreadable (pending) sectors
Apr 10 01:06:57 myserver smartd[21225]: Device: /dev/sda [SAT], 3 Offline uncorrectable sectors
|
What is the best way of testing my HD? I'm currently doing a backup of the entire disk with a view to restoring it on another HD. I'll do a reboot with an automatic fsck when I've finished. Any other suggestions? |
|
|
 |
Thistled Guru


Joined: 06 Jan 2011 Posts: 433 Location: Scotland
|
Posted: Mon Apr 09, 2012 10:57 pm Post subject: |
|
|
I have been seeing these errors on 2 of my disks since installing Gentoo back in 2008.
I thought it might have something to do with dual booting with Windoze, as on one occasion I restarted my PC from Windoze straight into Gentoo and there was a lock on the NTFS disk I was trying to mount. The solution was to shut down Windoze, then boot into Gentoo, and I would subsequently get access to the disk.
I tried defragmenting in Windoze to see if that would resolve it, but no joy.
The number of affected sectors always seems to stay the same, with no increase over the years.
Just as long as you have made a backup of your important stuff, then I would not worry too much about this.
It sure as hell scared the ***t out of me when I first saw this info, but it has not escalated since the first warning, so I am not too worried. _________________ Whatever you do, do it properly! |
|
|
 |
Jaglover Advocate


Joined: 29 May 2005 Posts: 3979 Location: Saint Amant, Acadiana
|
Posted: Mon Apr 09, 2012 11:11 pm Post subject: |
|
|
You should run something like this
| Code: | | smartctl --all /dev/sda | grep -e "Reallocated_Sector_Ct" -e "Current_Pending_Sector" -e "Offline_Uncorrectable" -e "UDMA_CRC_Error_Count" -e "Hardware_ECC_Recovered" |
to see if the drive is going bad. In my experience, once the error count gets out of hand the drive is going to die soon. _________________ Please learn how to denote units correctly! |
|
|
 |
srs5694 Guru

Joined: 08 Mar 2004 Posts: 310 Location: Woonsocket, RI
|
Posted: Mon Apr 09, 2012 11:11 pm Post subject: |
|
|
I strongly advise both of you to run a full SMART diagnostic on the disk. This can be done with tools like smartctl (text-mode), GSmartControl (GUI), or Palimpsest Disk Utility (SMART options are buried in a menu somewhere). IIRC, smartctl and Palimpsest are available in portage, but for some reason GSmartControl isn't. You might also be able to run a SMART test using a utility provided by the disk manufacturer, but that's likely to be written for Windows. This might be OK if you dual-boot, but on a Linux-only system, this could be problematic.
Unfortunately, SMART diagnostic results can be difficult to interpret. Some manufacturers put weird values in some fields that make things look worse than they are. Some fields are strangely named, and utilities often provide poor descriptions of what they mean. As a general rule, the GUI tools make the results easier to interpret than do the text-mode tools.
If the SMART tool gives you anything but "passed" for its overall assessment, you should probably replace the disk ASAP. Likewise if individual tests look troubling and you get confirmation from an expert that this reflects a real problem. The whole point of SMART is to detect disks that are just starting to flake out, so that you can replace the hardware before it fails entirely. It's possible to go for days, weeks, or even months with a disk that SMART says is problematic, but such disks are much more likely to go south very quickly than is a disk that gets a clean bill of health from a SMART test.
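As a rough illustration of the "overall assessment" check with the text-mode tool, here is a minimal sketch. The function name `smart_overall_passed` is made up for this example, and it assumes the ATA-style wording that `smartctl -H` prints; other device types may word that line differently.

```shell
#!/bin/sh
# Succeeds (exit status 0) only if the text on stdin contains the ATA-style
# "PASSED" overall-health line. The function name is hypothetical.
smart_overall_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# On a live system (as root) the input would come from the drive itself:
#   smartctl -H /dev/sda | smart_overall_passed
# A long self-test is started with `smartctl -t long /dev/sda`, and its
# result shows up later in `smartctl -l selftest /dev/sda`.
# Here a sample line stands in for real output so the sketch runs anywhere.
if printf '%s\n' 'SMART overall-health self-assessment test result: PASSED' \
        | smart_overall_passed; then
    echo "overall assessment: passed"
else
    echo "overall assessment: not passed, consider replacing the disk"
fi
```

A "passed" here only covers the drive's own overall verdict; the individual attributes still deserve a look.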
Edit: I posted just seconds after Jaglover. By "both of you" in my first paragraph, I'm referring to the first two posters. |
|
|
 |
BillWho Veteran


Joined: 03 Mar 2012 Posts: 1576 Location: US
|
Posted: Mon Apr 09, 2012 11:15 pm Post subject: |
|
|
Robert S,
I've had similar errors on a disk for close to three years now. I have Gentoo installed as a test-and-break system, so there's nothing important on it.
I saved the output of /usr/sbin/smartctl --log=error /dev/sdb and it still reports the exact same info today.
That disk could live another several years with no problems or it could crash and burn tomorrow.
If you have any critical data on it then for sure back it up - don't take any chances.
Good luck  |
|
|
 |
Thistled Guru


Joined: 06 Jan 2011 Posts: 433 Location: Scotland
|
Posted: Mon Apr 09, 2012 11:26 pm Post subject: |
|
|
In my case all the disks which are reporting errors
| Code: | Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sda [SAT], 1 Offline uncorrectable sectors
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sda [SAT], 1 Currently unreadable (pending) sectors
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sdb [SAT], 1 Offline uncorrectable sectors
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sdc [SAT], 5 Currently unreadable (pending) sectors |
are disks which were initially installed / utilised by Windoze (i.e. they are NTFS).
These disks are not mounted at boot time. I mount these disks via Nautilus, and they are used / shared between Windoze and Gentoo for documents, pictures, music, etc.
In my case, I think the unreadable sector errors are because they are not mounted.
I think asking Palimpsest or other programs to repair them will bork my NTFS disks. _________________ Whatever you do, do it properly! |
|
|
 |
Mad Merlin Veteran

Joined: 09 May 2005 Posts: 1134
|
Posted: Tue Apr 10, 2012 4:58 am Post subject: |
|
|
Those errors are exactly what they sound like: a sector on the hard drive is unreadable. That sector might be part of your swap file (probably won't matter) or it could be part of your /boot/grub/grub.conf (not so good). Reads to that sector will fail. The next write made to that sector will cause the hard drive to transparently remap that sector to a spare sector and everything will be normal again.
Now, hard drives have a relatively small number of spare sectors (think dozens), and eventually the drive will run out. What happens after that is left as an exercise to the reader. Ideally, you will replace the drive before you are able to find out.
This might sound bad, but bad sectors are a fact of life, just like dead pixels on your monitor, and hard drives deal with them just fine in small quantities. In general, if you see a small number of offline uncorrectable sectors and that number is not rising over time, the drive is probably fine. If you see a number that's steadily (or quickly) rising over time, toss the drive; it's going to eat your data.
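The "is the number rising?" rule of thumb can be written down as a tiny comparison. This is only a sketch: `sector_trend` is a hypothetical helper, and the two arguments are assumed to be raw counts (for example of Offline_Uncorrectable) noted down at two different times.

```shell
#!/bin/sh
# Compare an earlier raw sector count with the current one.
# Usage: sector_trend PREVIOUS CURRENT   (hypothetical helper)
sector_trend() {
    if [ "$2" -gt "$1" ]; then
        echo "rising"
    else
        echo "not rising"
    fi
}

sector_trend 3 3    # prints "not rising"
sector_trend 3 17   # prints "rising"
```

How often to sample and how much growth counts as "quickly" is a judgment call that the sketch does not try to settle.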
Of course, I would point out that I've seen plenty of drives die completely out of the blue (SMART had no complaints right up until the drive's block device disappeared). Consequently, it's always a good time to test your backups. _________________ Game! - Where the stick is mightier than the sword! |
|
|
 |
Robert S Guru


Joined: 15 Aug 2004 Posts: 412 Location: Canberra Australia
|
Posted: Tue Apr 10, 2012 8:15 am Post subject: |
|
|
Here's the output.
| Code: | myserver robert # smartctl --all /dev/sda | grep -e "Reallocated_Sector_Ct" -e "Current_Pending_Sector" -e "Offline_Uncorrectable" -e "UDMA_CRC_Error_Count" -e "Hardware_ECC_Recovered"
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
195 Hardware_ECC_Recovered 0x001a 036 024 000 Old_age Always - 77071283
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 3
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
myserver robert # /usr/sbin/smartctl --log=error /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.2.12-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
|
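For anyone who wants just the sector counts out of an attribute table like the one above, here is a minimal sketch. The helper name `smart_raw_values` is made up, and it assumes the ten-column attribute layout shown in this thread, with the raw value in the tenth column.

```shell
#!/bin/sh
# Print "NAME RAW_VALUE" for the three sector-related attributes.
# Reads smartctl attribute lines on stdin; the helper name is hypothetical.
smart_raw_values() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ {
        print $2, $10
    }'
}

# Sample lines copied from the output above. On a live system one would run
#   smartctl --attributes /dev/sda | smart_raw_values
smart_raw_values <<'EOF'
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
195 Hardware_ECC_Recovered 0x001a 036 024 000 Old_age Always - 77071283
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 3
EOF
# prints:
#   Reallocated_Sector_Ct 0
#   Current_Pending_Sector 3
#   Offline_Uncorrectable 3
```

Saving this output with a date makes it easy to see later whether the counts have moved.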
My problem is that I'm going overseas for a few weeks soon and I can't afford to have this bomb. It might be easier to bite the bullet and get another HD. |
|
|
 |
Thistled Guru


Joined: 06 Jan 2011 Posts: 433 Location: Scotland
|
Posted: Wed Apr 11, 2012 12:34 pm Post subject: |
|
|
| Code: | | 195 Hardware_ECC_Recovered 0x001a 036 024 000 Old_age Always - 77071283 |
That particular line does give a little cause for concern.
Like you say, back up all important stuff on /dev/sda; it would probably be a good idea to replace said disk.
I spent 4 hours last night going through all my Windoze partitions, defragmenting and running disk scans. Windoze reported no problems with the disk / partitions, but as soon as I come back into Gentoo, smartd still throws out warnings.
My situation is more akin to BillWho's, as my error counts are in the 1 - 5 range and have been since the installation of a brand new disk, so I am not worrying too much. _________________ Whatever you do, do it properly! |
|
|
 |
BillWho Veteran


Joined: 03 Mar 2012 Posts: 1576 Location: US
|
Posted: Wed Apr 11, 2012 1:12 pm Post subject: |
|
|
Thistled,
I don't believe that you can attribute the errors to winblows. I have a winblows installation on my disk and no errors are reported with smartctl.
| Code: | Device Boot Start End Blocks Id System
/dev/sda1 63 20482874 10241406 27 Hidden NTFS WinRE
/dev/sda2 * 20484096 336990191 158253048 7 HPFS/NTFS/exFAT
|
| Code: | root@gentoo-gateway bill # /usr/sbin/smartctl --log=error /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.3.0-rc7] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
|
This is the originally installed HD, which had a Vista installation along with a recovery partition and was later upgraded to Win7. |
|
|
 |
srs5694 Guru

Joined: 08 Mar 2004 Posts: 310 Location: Woonsocket, RI
|
Posted: Wed Apr 11, 2012 2:33 pm Post subject: |
|
|
I agree with Mad Merlin: Back up your data and either replace the drive ASAP or be prepared to lose it suddenly.
One more point: SMART tools work with the disk hardware itself to detect problems. As such, SMART works at a much lower level than filesystem drivers. SMART can detect errors in parts of the disk that are unused -- unused parts of a filesystem or even gaps between partitions. Thus, you can spend all day running fsck in Linux or defragmenting files in Windows and there's no guarantee that you'll touch the affected sectors. Likewise if the bad sectors are in the middle of a big file that happens not to be adjusted by a defragment operation.
The best way to ensure that you do something with a sector that's going bad is to do a raw write operation to the whole disk, as in:
| Code: |
dd if=/dev/zero of=/dev/sdb
|
This is, however, a destructive operation -- it zeroes out the entire disk! If your disk holds important data, you obviously don't want to do this. If you replace the disk, though, and you want to discover how bad it is and perhaps salvage some life from the disk in a non-critical capacity, you could do this and see what happens to the SMART test results. If the "pending sectors" count drops to 0, then it could be there were just a handful of bad sectors and the disk will be good for a while longer. If the values skyrocket, OTOH, then you'll know the disk was in bad shape and you replaced it just in time. (The latter happened to me recently, FWIW. Fortunately, the disk was still under warranty, so now I've got a replacement drive waiting to be used.) |
|
|
 |
Thistled Guru


Joined: 06 Jan 2011 Posts: 433 Location: Scotland
|
Posted: Thu Apr 12, 2012 12:33 am Post subject: |
|
|
I am a little confused by all of this. This 1st disk is my Winblows disk, and is barely used by Linux, but I can mount it if I want to install any apps using Wine.
| Code: | Device Boot Start End Blocks Id System
/dev/sda1 * 63 41945714 20972826 7 HPFS/NTFS/exFAT
/dev/sda2 41945715 265168889 111611587+ 7 HPFS/NTFS/exFAT
/dev/sda3 265168890 488392064 111611587+ 7 HPFS/NTFS/exFAT
|
and smartctl reports the following:
| Code: | /usr/sbin/smartctl --log=error /dev/sda
smartctl 5.42 2011-10-20 r3458 [i686-linux-3.3.1-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 10 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 10 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 9b 4b 4c e2 Error: UNC at LBA = 0x024c4b9b = 38554523
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d8 01 9b 4b 4c e0 08 00:28:29.100 READ VERIFY SECTOR(S) EXT
42 d8 02 9d 4b 4c e0 08 00:28:29.100 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:29.100 READ DMA EXT
42 d8 02 9b 4b 4c e0 08 00:28:24.700 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:24.700 READ DMA EXT
Error 9 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 02 9b 4b 4c e2 Error: UNC at LBA = 0x024c4b9b = 38554523
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d8 02 9b 4b 4c e0 08 00:28:24.700 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:24.700 READ DMA EXT
25 d8 01 00 00 00 e0 08 00:28:24.700 READ DMA EXT
42 d8 04 9b 4b 4c e0 08 00:28:20.100 READ VERIFY SECTOR(S) EXT
42 d8 04 97 4b 4c e0 08 00:28:20.100 READ VERIFY SECTOR(S) EXT
Error 8 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 04 9b 4b 4c e2 Error: UNC at LBA = 0x024c4b9b = 38554523
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d8 04 9b 4b 4c e0 08 00:28:20.100 READ VERIFY SECTOR(S) EXT
42 d8 04 97 4b 4c e0 08 00:28:20.100 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:20.000 READ DMA EXT
42 d8 08 97 4b 4c e0 08 00:28:15.700 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:15.700 READ DMA EXT
Error 7 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 04 9b 4b 4c e2 Error: UNC at LBA = 0x024c4b9b = 38554523
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d8 08 97 4b 4c e0 08 00:28:15.700 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:15.700 READ DMA EXT
42 d8 08 8f 4b 4c e0 08 00:28:15.700 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:15.600 READ DMA EXT
42 d8 10 8f 4b 4c e0 08 00:28:11.200 READ VERIFY SECTOR(S) EXT
Error 6 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 04 9b 4b 4c e2 Error: UNC at LBA = 0x024c4b9b = 38554523
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d8 10 8f 4b 4c e0 08 00:28:11.200 READ VERIFY SECTOR(S) EXT
42 d8 10 7f 4b 4c e0 08 00:28:11.200 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:11.200 READ DMA EXT
42 d8 20 7f 4b 4c e0 08 00:28:06.700 READ VERIFY SECTOR(S) EXT
25 d8 01 00 00 00 e0 08 00:28:06.700 READ DMA EXT
|
For my "main" Linux disk. i.e. Boot Swap and Root:
| Code: | Device Boot Start End Blocks Id System
/dev/sdb1 * 63 417689 208813+ 83 Linux
/dev/sdb2 417690 4401809 1992060 82 Linux swap / Solaris
/dev/sdb3 4401810 312576704 154087447+ 83 Linux |
smartctl reports:
| Code: | /usr/sbin/smartctl --log=error /dev/sdb
smartctl 5.42 2011-10-20 r3458 [i686-linux-3.3.1-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 166 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 166 occurred at disk power-on lifetime: 20434 hours (851 days + 10 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ed ed ab ea Error: UNC at LBA = 0x0aabeded = 179039725
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 ea ed ab ea 00 03:00:24.681 READ DMA
27 00 00 00 00 00 e0 00 03:00:24.681 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:00:24.623 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 03:00:24.622 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 03:00:21.710 READ NATIVE MAX ADDRESS EXT
Error 165 occurred at disk power-on lifetime: 20434 hours (851 days + 10 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ed ed ab ea Error: UNC at LBA = 0x0aabeded = 179039725
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 ea ed ab ea 00 03:00:18.565 READ DMA
27 00 00 00 00 00 e0 00 03:00:15.546 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:00:15.546 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 03:00:15.546 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 03:00:21.710 READ NATIVE MAX ADDRESS EXT
Error 164 occurred at disk power-on lifetime: 20434 hours (851 days + 10 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ed ed ab ea Error: UNC at LBA = 0x0aabeded = 179039725
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 ea ed ab ea 00 03:00:18.565 READ DMA
27 00 00 00 00 00 e0 00 03:00:15.546 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 03:00:15.546 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 03:00:15.546 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 03:00:15.546 READ NATIVE MAX ADDRESS EXT
Error 163 occurred at disk power-on lifetime: 20434 hours (851 days + 10 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ed ed ab ea Error: UNC at LBA = 0x0aabeded = 179039725
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 ea ed ab ea 00 03:00:15.545 READ DMA
ca 00 10 ba 30 9d ea 00 03:00:15.546 WRITE DMA
ca 00 08 82 4b 9e ea 00 03:00:15.546 WRITE DMA
ca 00 08 d2 4b 9e ea 00 03:00:15.546 WRITE DMA
ca 00 08 7a 44 9b ea 00 03:00:15.546 WRITE DMA
Error 162 occurred at disk power-on lifetime: 17827 hours (742 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 bd 82 31 ed Error: UNC at LBA = 0x0d3182bd = 221348541
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 ba 82 31 ed 00 02:17:54.884 READ DMA
27 00 00 00 00 00 e0 00 02:17:54.828 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 02 02:17:54.825 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 02 02:17:51.922 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 02:17:51.854 READ NATIVE MAX ADDRESS EXT
|
and finally, the disk which is a combination of ntfs and ext3, which is my Linux /home partition (sdc2):
| Code: | Device Boot Start End Blocks Id System
/dev/sdc1 63 244187999 122093968+ 7 HPFS/NTFS/exFAT
/dev/sdc2 244188000 349044254 52428127+ 83 Linux
/dev/sdc3 349044255 418718159 34836952+ 7 HPFS/NTFS/exFAT
/dev/sdc4 418718160 488392064 34836952+ 7 HPFS/NTFS/exFAT
|
smartctl reports:
| Code: | /usr/sbin/smartctl --log=error /dev/sdc
smartctl 5.42 2011-10-20 r3458 [i686-linux-3.3.1-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
|
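One thing the listings above allow is working out which partition a failing LBA sits in. The following is a sketch only: `partition_for_lba` is a hypothetical helper, and it assumes the partition tables are in sector units that match the LBAs smartctl reports.

```shell
#!/bin/sh
# Print the partition whose start..end sector range contains the given LBA.
# Reads partition-table lines (device, optional boot "*", start, end, ...)
# on stdin. Usage: partition_for_lba LBA   (hypothetical helper)
partition_for_lba() {
    awk -v lba="$1" '
        $1 ~ /^\/dev\// {
            i = ($2 == "*") ? 3 : 2      # skip the boot-flag column if present
            if (lba + 0 >= $i + 0 && lba + 0 <= $(i + 1) + 0) {
                print $1
                found = 1
            }
        }
        END { if (!found) print "none" }
    '
}

# LBA from the sda error log against the sda partition table:
partition_for_lba 38554523 <<'EOF'
/dev/sda1 * 63 41945714 20972826 7 HPFS/NTFS/exFAT
/dev/sda2 41945715 265168889 111611587+ 7 HPFS/NTFS/exFAT
/dev/sda3 265168890 488392064 111611587+ 7 HPFS/NTFS/exFAT
EOF
# prints "/dev/sda1"

# Most recent LBA from the sdb error log against the sdb partition table:
partition_for_lba 179039725 <<'EOF'
/dev/sdb1 * 63 417689 208813+ 83 Linux
/dev/sdb2 417690 4401809 1992060 82 Linux swap / Solaris
/dev/sdb3 4401810 312576704 154087447+ 83 Linux
EOF
# prints "/dev/sdb3"
```

Under those assumptions, the sda errors fall inside the first NTFS partition and the latest sdb error inside the Linux root partition.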
So what is with all of the Errors which
| Code: | | occurred at disk power-on lifetime |
? _________________ Whatever you do, do it properly! |
|
|
 |
Jaglover Advocate


Joined: 29 May 2005 Posts: 3979 Location: Saint Amant, Acadiana
|
Posted: Thu Apr 12, 2012 12:49 am Post subject: |
|
|
Alright, have you run a self-test on this drive? Did it finish? If the drive is bad, the test usually will not complete.
Below is a sample of a healthy drive.
| Code: | SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 4742 -
|
_________________ Please learn how to denote units correctly! |
|
|
 |
Thistled Guru


Joined: 06 Jan 2011 Posts: 433 Location: Scotland
|
Posted: Thu Apr 12, 2012 1:00 am Post subject: |
|
|
Well palimpsest reports I have 12 bad sectors on both sdb and sdc
and
| Code: | pig ~ # smartctl --attributes --log=selftest --quietmode=errorsonly /dev/sda
pig ~ # smartctl --attributes --log=selftest --quietmode=errorsonly /dev/sdb
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 33311 221348541
# 2 Extended offline Completed: read failure 90% 33311 221348541
# 3 Short offline Completed: read failure 80% 33310 221348541
# 4 Short offline Completed: read failure 80% 24093 221348541
# 5 Short offline Completed: read failure 80% 23407 221348541
# 6 Short offline Completed: read failure 80% 22024 221348541
# 7 Short offline Completed: read failure 80% 22024 221348541
# 8 Short offline Completed: read failure 80% 22024 221348541
# 9 Short offline Completed: read failure 80% 20657 221348541
#10 Short offline Completed: read failure 80% 19723 221348541
#11 Short offline Completed: read failure 80% 19070 221348541
#12 Short offline Completed: read failure 80% 18120 221348541
pig ~ # smartctl --attributes --log=selftest --quietmode=errorsonly /dev/sdc
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 39657 6163165
# 2 Short offline Completed: read failure 90% 31057 6163165
# 3 Extended offline Completed: read failure 90% 31057 6163165
# 4 Short offline Completed: read failure 90% 30648 6163165
# 5 Short offline Completed: read failure 90% 30648 6163165
# 6 Short offline Completed: read failure 90% 28018 6163165
# 7 Short offline Completed: read failure 90% 12615 250373360
# 8 Extended offline Completed: read failure 90% 12615 250373360
# 9 Short offline Completed: read failure 90% 12615 250373360
#10 Extended offline Completed: read failure 90% 12612 250373360
#11 Short offline Completed: read failure 90% 12612 250373360
#12 Short offline Completed: read failure 90% 12612 250373360
#13 Short offline Completed: read failure 90% 11543 250373360
#14 Extended offline Completed: read failure 90% 10086 250373360
#15 Short offline Completed: read failure 90% 10079 250373360
#16 Short offline Completed: read failure 90% 10079 250373360
|
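Long self-test logs like the ones above can be boiled down to the distinct failing LBAs. This is a sketch under the assumption that failed entries contain the text "read failure" and end with the LBA, as they do here; `selftest_failed_lbas` is a made-up helper name.

```shell
#!/bin/sh
# Print each distinct LBA_of_first_error from failed self-test entries,
# sorted numerically. Reads self-test log lines on stdin.
# The helper name is hypothetical.
selftest_failed_lbas() {
    awk '/^# *[0-9]+ / && /read failure/ { print $NF }' | sort -un
}

# A few entries copied from the sdc log above. On a live system:
#   smartctl --log=selftest /dev/sdc | selftest_failed_lbas
selftest_failed_lbas <<'EOF'
# 1 Short offline Completed: read failure 90% 39657 6163165
# 6 Short offline Completed: read failure 90% 28018 6163165
# 7 Short offline Completed: read failure 90% 12615 250373360
#16 Short offline Completed: read failure 90% 10079 250373360
EOF
# prints:
#   6163165
#   250373360
```

A healthy log, where every entry reads "Completed without error", produces no output at all.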
_________________ Whatever you do, do it properly! |
|
|
 |
Hu Watchman

Joined: 06 Mar 2007 Posts: 7616
|
Posted: Thu Apr 12, 2012 2:20 am Post subject: |
|
|
| Thistled wrote: | So what is with all of the Errors which "occurred at disk power-on lifetime"? |
The drive failed to complete a command that was sent to it by the OS. This is a bad sign. The "disk power-on lifetime" bit is so you can determine whether the error was reported yesterday or last year. The drive tells you how many power-on hours it has accumulated, so you can work out from that how recently an error occurred. |
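The power-on-hours arithmetic is simple enough to sketch. Both helper names below are made up for illustration, and the figures are taken from the sdb logs earlier in the thread (20434 hours for the latest logged error, 33311 hours for the latest self-test entry).

```shell
#!/bin/sh
# Convert power-on hours to the "D days + H hours" form smartctl prints.
# Both helper names are hypothetical.
hours_to_days() {
    echo "$(( $1 / 24 )) days + $(( $1 % 24 )) hours"
}

# Power-on hours elapsed between a logged error and a later reading.
# Usage: hours_since_error CURRENT_HOURS ERROR_HOURS
hours_since_error() {
    echo $(( $1 - $2 ))
}

hours_to_days 6151              # prints "256 days + 7 hours", as in the sda log
hours_since_error 33311 20434   # prints "12877"
hours_to_days 12877             # prints "536 days + 13 hours"
```

Note that this measures powered-on time, not calendar time, so a machine that is often switched off will have gone through more calendar days than the figure suggests.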
|
|
 |
Thistled Guru


Joined: 06 Jan 2011 Posts: 433 Location: Scotland
|
Posted: Wed Apr 18, 2012 10:19 pm Post subject: |
|
|
But it has been like this since the day I bought the disk.
Palimpsest has always reported the current pending sector error.
Like I said in earlier posts, all my important stuff is backed up on my server, so I am kind of taking the same approach as BillWho.
| Quote: | Robert S,
I've had similar errors on a disk for close to three years now. I have gentoo installed as test and break system so there's nothing important on it.
I saved the output of /usr/sbin/smartctl --log=error /dev/sdb and it still reports the exact same info today.
That disk could live another several years with no problems or it could crash and burn tomorrow.
If you have any critical data on it then for sure back it up - don't take any chances.
Good luck |
I would not be surprised to discover this is because I am overclocking from 2.77 GHz to 3.16 GHz, as I am fully aware overclocking can put stress on gear. _________________ Whatever you do, do it properly! |
|
|
 |