Gentoo Forums
trouble with kernel RAID.
duby2291
Guru

Joined: 17 Oct 2004
Posts: 583

Posted: Wed Jan 16, 2013 10:46 pm    Post subject: trouble with kernel RAID.

I'm not certain what is going on here. I am having trouble getting a RAID0 array working: everything seems OK until it locks up and switches to read-only.

Code:
TheBeast gentoo # mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed Jan 16 13:02:25 2013
     Raid Level : raid0
     Array Size : 228062208 (217.50 GiB 233.54 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Wed Jan 16 13:02:25 2013
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3


All I can get from smartctl is garbage... it is the same for sda, sdb and sdc.

Code:
TheBeast gentoo # smartctl -i -H -c /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.6.11-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               /0:0:0:0
Product:             
User Capacity:        600,332,565,813,390,450 bytes [600 PB]
Logical block size:   774843950 bytes
scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
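
The capacity and block size above can be checked for what they really contain. This is only a rough sketch, assuming smartctl derived the capacity as (last LBA + 1) × block size from an 8-byte READ CAPACITY(10)-style reply; the function name `decode_read_capacity` is made up for illustration.

```python
def decode_read_capacity(capacity_bytes, block_size):
    """Rebuild the 8 raw reply bytes: last LBA, then block size, big-endian.

    Assumption: capacity_bytes == (last_lba + 1) * block_size.
    """
    blocks, remainder = divmod(capacity_bytes, block_size)
    if remainder:
        raise ValueError("capacity is not a whole number of blocks")
    last_lba = blocks - 1
    return last_lba.to_bytes(4, "big") + block_size.to_bytes(4, "big")


# Values copied from the smartctl output above.
raw = decode_read_capacity(600_332_565_813_390_450, 774_843_950)
print(raw.hex(" "))  # hex dump of the reconstructed reply
print(raw)           # the same bytes shown as ASCII where printable
```

Under that assumption the eight bytes are all printable ASCII and read as a fragment of a path, much like the odd Vendor string. That would suggest the reply buffer never held data from the drive at all, which fits a controller or link problem better than a failing disk.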


Running fsck gets me this...

Code:
TheBeast mnt # fsck /dev/md0
fsck from util-linux 2.21.2
e2fsck 1.42 (29-Nov-2011)
fsck.ext2: Attempt to read block from filesystem resulted in short read while trying to open /dev/md0
Could this be a zero-length partition?


The strangest thing, though, is that it works fine if I reboot. But sooner or later it will lock up and do this. I don't have any explanation at all. What do you guys think? The drives all check out as good, and they seem to work indefinitely as long as they are not in the array. I've already used dd to write zeros to all three drives, then rebuilt the array, and I still have the same problem.

EDIT: Oh yeah, and after this happens dmesg fills up with a bunch of weird crap...

Code:
[21597.059294] Result: hostbyte=0x04 driverbyte=0x00
[21597.059295] sd 0:0:0:0: [sda] CDB:
[21597.059295] cdb[0]=0x28: 28 00 00 40 fd c9 00 00 08 00


This just repeats over and over; some of the entries say sdb and some say sdc.
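
For anyone reading those dmesg lines, here is a minimal sketch of how they could be decoded. It assumes the standard READ(10) command layout and the host byte names from the Linux SCSI headers; `decode_read10` and the `HOST_BYTES` table are illustrative names, not part of any tool.

```python
# Host byte codes as named in the Linux SCSI headers (subset only).
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_read10(cdb_hex):
    """Decode a 10-byte READ(10) command block given as spaced hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) command block")
    return {
        "opcode": "READ(10)",
        "lba": int.from_bytes(cdb[2:6], "big"),     # bytes 2-5: start block
        "blocks": int.from_bytes(cdb[7:9], "big"),  # bytes 7-8: block count
    }


# Values copied from the dmesg excerpt above.
print(HOST_BYTES[0x04])
print(decode_read10("28 00 00 40 fd c9 00 00 08 00"))
```

Read this way, each entry is an ordinary 8-sector read that was rejected at the host level (hostbyte is non-zero while driverbyte is 0), rather than an error reported by the disk itself.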

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54096
Location: 56N 3W

Posted: Thu Jan 17, 2013 7:30 pm

duby2291,

Tell us about the PSU (make and part number) and how the drives are wired to the PSU.
Does your PSU have one or two +12 V supplies?

The SMART data may also be useful if you capture it before this happens, while you can still get useful info from the drives.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9645
Location: almost Mile High in the USA

Posted: Thu Jan 17, 2013 10:58 pm

A bit off topic here, but incidentally this was somewhat of a shock to me.

I took apart a failed Antec Truepower Trio 650 that a friend gave me, where "trio" implies that it has three 12V rails. When I traced the tracks on the bottom of the board, there were indeed three groups of yellow 12V wires... but they all eventually connected together at one common point after the main transformer/rectifier... The only thing that was truly "trio" was that there were indeed three monitor points on the "rails"... but they were all fed by the same transformer output...

Not exactly what I was expecting. But oh well.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

duby2291
Guru

Joined: 17 Oct 2004
Posts: 583

Posted: Thu Jan 17, 2013 11:08 pm

Thanks for the replies, guys. This is my power supply:

http://www.newegg.com/Product/Product.aspx?Item=N82E16817153136

Also, this is something new that only started after the first crash... This message didn't exist until after the first crash happened.

Code:
[    7.214035] md: Autodetecting RAID arrays.
[    7.237521] md: invalid raid superblock magic on sdc3
[    7.237524] md: sdc3 does not have a valid v0.90 superblock, not importing!
[    7.266265] md: invalid raid superblock magic on sda3
[    7.266269] md: sda3 does not have a valid v0.90 superblock, not importing!
[    7.294346] md: invalid raid superblock magic on sdb3
[    7.294349] md: sdb3 does not have a valid v0.90 superblock, not importing!
[    7.294356] md: Scanned 3 and added 0 devices.


OK, and here is the smartctl output for all three drives in the array:

Code:

TheBeast duby229 # smartctl -i -H -c /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.6.11-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST380815AS
Serial Number:    9QZ8PG5B
Firmware Version: 4.AAB
User Capacity:    80,026,361,856 bytes [80.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Jan 17 18:06:02 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
               was completed without error.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (  430) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   1) minutes.
Extended self-test routine
recommended polling time:     (  27) minutes.

TheBeast duby229 # smartctl -i -H -c /dev/sdb
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.6.11-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST380815AS
Serial Number:    9QZ8MPZ7
Firmware Version: 4.AAB
User Capacity:    80,026,361,856 bytes [80.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Jan 17 18:06:35 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
               was completed without error.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (  430) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   1) minutes.
Extended self-test routine
recommended polling time:     (  27) minutes.

TheBeast duby229 # smartctl -i -H -c /dev/sdc
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.6.11-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.9
Device Model:     ST3808110AS
Serial Number:    5LR2PYFD
Firmware Version: 3.AAE
User Capacity:    80,026,361,856 bytes [80.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Jan 17 18:06:53 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
               was completed without error.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (  430) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   1) minutes.
Extended self-test routine
recommended polling time:     (  27) minutes.

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9645
Location: almost Mile High in the USA

Posted: Thu Jan 17, 2013 11:46 pm

Recently, in my "hack" RAIDs (4-disk SATA and PATA RAID5s), a disk "failed" when either the power or the data connector loosened due to age or corrosion. It's worth checking. After cleaning the connectors and reattaching them, the disk came back up fine. But in your RAID0 case, data is at risk... You may need to force-reattach the disk to the RAID or recreate the RAID.

I have one 2-disk RAID1 in an SCA hotswap carrier. This has been much more reliable...

duby2291
Guru

Joined: 17 Oct 2004
Posts: 583

Posted: Fri Jan 18, 2013 12:59 am

I don't have any data on it. It will be my Gentoo / partition if I can manage to get it working properly.

I decided to replace the SATA data cables on these drives to make sure that wasn't the problem, and since my PSU is modular I also plugged a different SATA power cable into the three drives. The problem persists, so it can't be the SATA data cables or the SATA power cables: I replaced both after I saw your post, and nothing changed.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54096
Location: 56N 3W

Posted: Sat Jan 19, 2013 10:58 am

duby2291,

Please post the output of
Code:
mdadm -E /dev/sd[abc]3


I should have asked for smartctl -a too, so we get to see the drives' internal error logs.

duby2291
Guru

Joined: 17 Oct 2004
Posts: 583

Posted: Sat Jan 19, 2013 7:04 pm

I've just rebooted, so this output was taken before the crash has happened...

Code:
TheBeast duby229 # mdadm -E /dev/sd[abc]3
/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9ed72583:59d43431:c5ce8183:d306fc00
           Name : TheBeast:0  (local to host TheBeast)
  Creation Time : Wed Jan 16 13:02:25 2013
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 152042215 (72.50 GiB 77.85 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : fe229a52:be8ba0cb:dd385d09:c7f8f346

    Update Time : Wed Jan 16 13:02:25 2013
       Checksum : b265fd1a - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing)
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9ed72583:59d43431:c5ce8183:d306fc00
           Name : TheBeast:0  (local to host TheBeast)
  Creation Time : Wed Jan 16 13:02:25 2013
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 152042215 (72.50 GiB 77.85 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 1862ea15:3378842a:f94f3d5e:1944c852

    Update Time : Wed Jan 16 13:02:25 2013
       Checksum : 354e8b18 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA ('A' == active, '.' == missing)
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9ed72583:59d43431:c5ce8183:d306fc00
           Name : TheBeast:0  (local to host TheBeast)
  Creation Time : Wed Jan 16 13:02:25 2013
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 152042215 (72.50 GiB 77.85 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : ea596e0b:5a37b8b7:a4859408:e2479f5c

    Update Time : Wed Jan 16 13:02:25 2013
       Checksum : 6c347b86 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA ('A' == active, '.' == missing)
TheBeast duby229 #
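
As a cross-check on these numbers, the sketch below recomputes the array size from the per-member "Avail Dev Size". It assumes md RAID0 rounds each member down to a whole number of chunks and that sizes are counted in 512-byte sectors; `raid0_array_kib` is a made-up helper name.

```python
SECTOR_BYTES = 512


def raid0_array_kib(member_sectors, chunk_kib):
    """Usable RAID0 size in KiB, rounding each member down to whole chunks."""
    chunk_sectors = chunk_kib * 1024 // SECTOR_BYTES
    usable = [s // chunk_sectors * chunk_sectors for s in member_sectors]
    return sum(usable) * SECTOR_BYTES // 1024


# Three members of 152042215 sectors each, 512K chunks (from mdadm -E above).
size_kib = raid0_array_kib([152042215] * 3, chunk_kib=512)
print(size_kib, f"{size_kib / 1024**2:.2f} GiB")
```

The result matches the "Array Size : 228062208" that mdadm -D reported in the first post, so the superblocks and the assembled array agree with each other while the array is still healthy.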


Code:
TheBeast duby229 # smartctl -a /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.6.11-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST380815AS
Serial Number:    9QZ8PG5B
Firmware Version: 4.AAB
User Capacity:    80,026,361,856 bytes [80.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Jan 19 14:05:08 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
               was completed without error.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (  430) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   1) minutes.
Extended self-test routine
recommended polling time:     (  27) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   253   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   098   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       924
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       96367497
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       20605
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       861
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   059   048   045    Old_age   Always       -       41 (Min/Max 40/42)
194 Temperature_Celsius     0x0022   041   052   000    Old_age   Always       -       41 (0 21 0 0 0)
195 Hardware_ECC_Recovered  0x001a   080   065   000    Old_age   Always       -       58184742
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       3
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     20420         -
# 2  Short offline       Completed without error       00%     11183         -
# 3  Short offline       Interrupted (host reset)      80%      8015         -
# 4  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

TheBeast duby229 # smartctl -a /dev/sdb
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.6.11-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST380815AS
Serial Number:    9QZ8MPZ7
Firmware Version: 4.AAB
User Capacity:    80,026,361,856 bytes [80.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Jan 19 14:05:10 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
               was completed without error.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (  430) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   1) minutes.
Extended self-test routine
recommended polling time:     (  27) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   253   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   098   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       907
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   080   060   030    Pre-fail  Always       -       101245796
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       20597
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       847
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   048   045    Old_age   Always       -       39 (Min/Max 38/41)
194 Temperature_Celsius     0x0022   039   052   000    Old_age   Always       -       39 (0 20 0 0 0)
195 Hardware_ECC_Recovered  0x001a   075   062   000    Old_age   Always       -       147418486
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       2
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     20412         -
# 2  Short offline       Completed without error       00%     11182         -
# 3  Short offline       Interrupted (host reset)      80%      8013         -
# 4  Short offline       Completed without error       00%      2552         -
# 5  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

TheBeast duby229 # smartctl -a /dev/sdc
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.6.11-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.9
Device Model:     ST3808110AS
Serial Number:    5LR2PYFD
Firmware Version: 3.AAE
User Capacity:    80,026,361,856 bytes [80.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Jan 19 14:05:13 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
               was completed without error.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (  430) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   1) minutes.
Extended self-test routine
recommended polling time:     (  27) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   107   079   006    Pre-fail  Always       -       209686992
  3 Spin_Up_Time            0x0003   099   099   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       607
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       184955433
  9 Power_On_Hours          0x0032   068   068   000    Old_age   Always       -       28466
 10 Spin_Retry_Count        0x0013   100   098   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       873
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   098   098   000    Old_age   Always       -       2
190 Airflow_Temperature_Cel 0x0022   065   048   045    Old_age   Always       -       35 (Min/Max 34/36)
194 Temperature_Celsius     0x0022   035   052   000    Old_age   Always       -       35 (0 17 0 0 0)
195 Hardware_ECC_Recovered  0x001a   050   045   000    Old_age   Always       -       102318969
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   196   000    Old_age   Always       -       12
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     28393         -
# 2  Short offline       Interrupted (host reset)      80%     28393         -
# 3  Short offline       Completed without error       00%     22053         -
# 4  Extended offline    Completed without error       00%      4130         -
# 5  Short offline       Completed without error       00%      4129         -
# 6  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54096
Location: 56N 3W

Posted: Sat Jan 19, 2013 7:28 pm

duby2291,

Code:
# mdadm -E /dev/sd[abc]3

shows

Code:
Version : 1.2

That means you made your RAID set with version 1.2 RAID superblocks.

Your dmesg:
[    7.214035] md: Autodetecting RAID arrays.
[    7.237521] md: invalid raid superblock magic on sdc3
[    7.237524] md: sdc3 does not have a valid v0.90 superblock, not importing!
[    7.266265] md: invalid raid superblock magic on sda3
[    7.266269] md: sda3 does not have a valid v0.90 superblock, not importing!
[    7.294346] md: invalid raid superblock magic on sdb3
[    7.294349] md: sdb3 does not have a valid v0.90 superblock, not importing!
[    7.294356] md: Scanned 3 and added 0 devices.

is quite correct. Unfortunately, kernel RAID auto-assembly works only with version 0.90 RAID superblocks.
You have two choices: remake the RAID set with version 0.90 superblocks, or use mdadm in an initrd to assemble your RAID before you mount root.
Remaking the RAID will destroy the data it contains.
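
To make that check repeatable, here is a small sketch that reads `mdadm -E` text output and reports which members the in-kernel autodetection could pick up. It assumes the output layout shown earlier in this thread; `superblock_versions` and `kernel_autodetect_ok` are illustrative names, and the sample text is trimmed from the real output.

```python
import re


def superblock_versions(mdadm_examine_output):
    """Map each device in `mdadm -E` text output to its metadata version."""
    versions = {}
    device = None
    for line in mdadm_examine_output.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            device = header.group(1)
            continue
        field = re.match(r"^\s*Version\s*:\s*(\S+)", line)
        if field and device:
            versions[device] = field.group(1)
    return versions


def kernel_autodetect_ok(version):
    # Per the explanation above, in-kernel autodetection only handles 0.90.
    return version.startswith("0.90")


sample = """/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
"""

for dev, ver in superblock_versions(sample).items():
    verdict = "autodetect" if kernel_autodetect_ok(ver) else "needs mdadm in an initrd"
    print(dev, ver, verdict)
```

With version 1.2 on every member, each device falls into the second case, which is why the "not importing!" messages appear at boot even though the array itself is fine.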

Where is GRUB?
GRUB must be on a non-RAID volume or on RAID1 with version 0.90 superblocks, because GRUB simply ignores RAID at boot time.

Apart from looking a bit old
Code:
   9 Power_On_Hours          0x0032   068   068   000    Old_age   Always       -       28466
your drives look healthy.
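
The same reading of the attribute table can be sketched in code. This assumes the ten-column layout smartctl printed above and the usual rule that an attribute has failed once its normalized VALUE drops to or below THRESH; `parse_smart_attributes` is a hypothetical helper, and the sample rows are copied from the sdc output.

```python
def parse_smart_attributes(text):
    """Parse rows of the smartctl attribute table into dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip headers and anything that is not an attribute row
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "raw": parts[9].strip(),
        })
    return rows


sample = """\
  9 Power_On_Hours          0x0032   068   068   000    Old_age   Always       -       28466
190 Airflow_Temperature_Cel 0x0022   065   048   045    Old_age   Always       -       35 (Min/Max 34/36)
199 UDMA_CRC_Error_Count    0x003e   200   196   000    Old_age   Always       -       12
"""

for row in parse_smart_attributes(sample):
    # Attributes with a threshold of 0 are informational and cannot fail.
    failing = row["thresh"] > 0 and row["value"] <= row["thresh"]
    print(row["name"], row["raw"], "FAILING" if failing else "ok")

hours = int(parse_smart_attributes(sample)[0]["raw"])
print(f"{hours / (24 * 365):.2f} years powered on")
```

None of the sampled attributes is at or below its threshold. The non-zero UDMA_CRC_Error_Count is the one worth keeping an eye on, since it counts transfer errors on the link and so tends to point at cabling or the controller rather than the platters.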

None of these things explains the dmesg errors you get in the chroot.

Those drives are SATA II, that's 3 Gbit/s. Do you have a SATA II motherboard?
They should fall back to SATA I, but it doesn't always work well. Please put your entire dmesg on a pastebin site.

duby2291
Guru

Joined: 17 Oct 2004
Posts: 583

Posted: Sat Jan 19, 2013 9:38 pm

GRUB will be on sda1... sdb1 and sdc1 exist but won't be used; they are only there to keep the partition table identical across all three drives. sd[abc]2 will be swap space, all with the same priority.

OK, so if I use genkernel to create an initramfs with RAID support, will that be sufficient? Or should I just recreate the array with the older superblock version?

EDIT: This is my motherboard. It doesn't say specifically, but I believe the ATI SB710 southbridge does support SATA II.
http://www.newegg.com/Product/Product.aspx?Item=N82E16813128397


Last edited by duby2291 on Sat Jan 19, 2013 9:51 pm; edited 2 times in total

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54096
Location: 56N 3W

Posted: Sat Jan 19, 2013 9:42 pm

duby2291,

I've never used genkernel, but I believe it makes an initrd containing mdadm if you ask it to.

We have not got to the bottom of your RAID issue yet. Your entire dmesg may well be useful, as well as your lspci output.

duby2291
Guru

Joined: 17 Oct 2004
Posts: 583

Posted: Tue Jan 22, 2013 5:29 am

Just to update you guys: I think the problem has to be with the hardware somehow. First I wrote zeros to all three drives, then I created an array with the BIOS and installed Windows on it, and I had exactly the same problem. I still don't think it is the drives. I think it must be the SATA controller on the motherboard.

mbar
Veteran

Joined: 19 Jan 2005
Posts: 1990
Location: Poland

Posted: Tue Jan 22, 2013 8:17 am

I had the same "600 petabytes" problem some time ago, and it was the SATA controller that had failed. The funny thing was that it kind of pretended to work for some time (minutes/hours) just after a hard reset.

EDIT: see here, https://forums.gentoo.org/viewtopic-t-896382-highlight-.html

All times are GMT