Gentoo Forums
[SOLVED] Identical RAID Cards, Two RAID 10 Arrays, Different?
Crimjob
Tux's lil' helper


Joined: 04 Dec 2006
Posts: 109

Posted: Tue Oct 16, 2018 6:05 am    Post subject: [SOLVED] Identical RAID Cards, Two RAID 10 Arrays, Different?

Hello all!

I am quite stumped here. I have been using this exact hardware on an old install of Windows Server just fine with 22 drives spread over 2 different RAID cards in various RAID 5 configurations. I finally decided to bite the bullet after a bad experience with RAID 5 in the past, and I really didn't feel like paying for Microsoft's latest and greatest, so I decided to go with Gentoo on this box along with re-creating all the RAID arrays using RAID 10 instead of RAID 5.

Things were off to a good start with the first controller: I was able to mirror my two SSDs and put another eight 2TB drives in RAID 10. No problems. I go to fire up the second controller and its attached drives, and I have some failures left over from my old RAID 5 configurations. No big deal. I note the serials, shut down, temporarily disconnect the second controller and all its drives, and order replacements for the failed drives. The replacements show up, so naturally I go to build the largest array I can. No dice. Okay, maybe I was being greedy; it is an older RAID controller, so let's clone the previous configuration. I remove all but 8 drives from the second controller (so the two controllers should each have a RAID 10 array of eight 2TB drives). Same result: one controller produces the desired results, while the other controller's results are quite strange.

I have tried resetting the controller to factory defaults, cloning the configuration of the first card, and double checking configurations, and I have lost count of the number of reboots I've done. I cannot seem to figure out what is going on here!

For whatever reason, the RAID 10 array added on the second controller shows up within Gentoo as what look like the controller's RAID subgroups. I.e., RAID 10 would create four pairs of mirrored drives and stripe across them as four subgroups; I am seeing the mirrors but not the main stripe. Honestly, I'm not sure that's the correct term, as I've had a hell of a time finding any further information or troubleshooting steps online.

I have installed the CLI tools for my RAID controller, and was even able to create the RAID 10 array live from the CLI with clean drives: same issue. I have 2x 3ware 9550SXU RAID controllers, 2x Samsung SSDs, and 20x WD Red 2TB. I have tried various other configurations in terms of number of drives (6, 8, 10, 12), but they all ultimately produce the same result (3, 4, 5, or 6 mirrors shown, respectively, instead of the stripe).

My recent work.

Creating the RAID array from CLI
Code:
# tw_cli maint createunit c1 rraid10 p0:1:2:3:4:5:6:7
Creating new unit on controller /c1 ... Done. The new unit is /c1/u0.
Setting default Storsave policy to [balance] for the new unit ... Done.
Setting default Command Queuing policy for unit /c1/u0 to [on] ... Done.
Setting write cache = ON for the new unit ... Done.


Controller 1 inventory (this controller has the issue)
Code:
# tw_cli /c1 show all
/c1 Driver Version = 2.26.02.014
/c1 Model = 9550SXU-12
/c1 Available Memory = 224MB
/c1 Firmware Version = FE9X 3.08.00.029
/c1 Bios Version = BE9X 3.10.00.003
/c1 Boot Loader Version = BL9X 3.02.00.001
/c1 Serial Number = L320705A7500251
/c1 PCB Version = Rev 032
/c1 PCHIP Version = 1.60
/c1 ACHIP Version = 1.90
/c1 Number of Ports = 12
/c1 Number of Drives = 12
/c1 Number of Units = 1
/c1 Total Optimal Units = 1
/c1 Not Optimal Units = 0
/c1 JBOD Export Policy = off
/c1 Disk Spinup Policy = 1
/c1 Spinup Stagger Time Policy (sec) = 1
/c1 Auto-Carving Policy = on
/c1 Auto-Carving Size = 2048 GB
/c1 Auto-Rebuild Policy = on
/c1 Rebuild Rate = 1
/c1 Verify Rate = 1
/c1 Controller Bus Type = PCIX
/c1 Controller Bus Width = 64 bits
/c1 Controller Bus Speed = 100 Mhz

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   OK             -       -       256K    7450.54   ON     OFF

Port   Status           Unit   Size        Blocks   
-----------------------------------------------------
p0     OK               u0     1.82 TB     3907029168
p1     OK               u0     1.82 TB     3907029168
p2     OK               u0     1.82 TB     3907029168
p3     OK               u0     1.82 TB     3907029168
p4     OK               u0     1.82 TB     3907029168
p5     OK               u0     1.82 TB     3907029168
p6     OK               u0     1.82 TB     3907029168
p7     OK               u0     1.82 TB     3907029168
p8     OK               -      1.82 TB     3907029168
p9     OK               -      1.82 TB     3907029168
p10    OK               -      1.82 TB     3907029168
p11    OK               -      1.82 TB     3907029168


Controller 0 inventory (this controller works fine)
Code:
# tw_cli /c0 show all
/c0 Driver Version = 2.26.02.014
/c0 Model = 9550SXU-12
/c0 Available Memory = 224MB
/c0 Firmware Version = FE9X 3.08.00.029
/c0 Bios Version = BE9X 3.10.00.003
/c0 Boot Loader Version = BL9X 3.02.00.001
/c0 Serial Number = L320704A7450278
/c0 PCB Version = Rev 032
/c0 PCHIP Version = 1.60
/c0 ACHIP Version = 1.90
/c0 Number of Ports = 12
/c0 Number of Drives = 10
/c0 Number of Units = 2
/c0 Total Optimal Units = 2
/c0 Not Optimal Units = 0
/c0 JBOD Export Policy = off
/c0 Disk Spinup Policy = 4
/c0 Spinup Stagger Time Policy (sec) = 1
/c0 Auto-Carving Policy = off
/c0 Auto-Carving Size = 2048 GB
/c0 Auto-Rebuild Policy = on
/c0 Rebuild Rate = 1
/c0 Verify Rate = 1
/c0 Controller Bus Type = PCIX
/c0 Controller Bus Width = 64 bits
/c0 Controller Bus Speed = 100 Mhz

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK             -       -       -       167.628   OFF    OFF
u1    RAID-10   OK             -       -       256K    7450.54   ON     OFF

Port   Status           Unit   Size        Blocks   
-----------------------------------------------------
p0     OK               u0     167.68 GB   351651888
p1     OK               u0     167.68 GB   351651888
p2     OK               u1     1.82 TB     3907029168
p3     OK               u1     1.82 TB     3907029168
p4     OK               u1     1.82 TB     3907029168
p5     OK               u1     1.82 TB     3907029168
p6     OK               u1     1.82 TB     3907029168
p7     OK               u1     1.82 TB     3907029168
p8     OK               u1     1.82 TB     3907029168
p9     OK               u1     1.82 TB     3907029168
p10    NOT-PRESENT      -      -           -         
p11    NOT-PRESENT      -      -           -         


It appears the firmware, BIOS, boot code, etc. all match between the cards, and the block size matches among the drives. The RAID controller reports the correct size for both RAID 10 arrays (7450.54 GB).

So I try to re-scan SCSI, and it picks up four drives for controller 1, but only one for controller 0. I don't understand how this can be. The only difference I can see is that controller 0's RAID arrays show up as "SCSI IDs", whereas controller 1's RAID arrays show up as "SCSI LUNs" (controller 0 shows ID 00 LUN 00, ID 01 LUN 00, whereas controller 1 shows ID 00 LUN 00, ID 00 LUN 01, etc.). I'll admit I don't completely understand what that means, and I'm not sure if it's relevant.

Code:
# rescan-scsi-bus
Host adapter 0 ((null)) found.
Host adapter 1 ((null)) found.
Host adapter 2 (ahci) found.
Host adapter 3 (ahci) found.
Host adapter 4 (ahci) found.
Host adapter 5 (ahci) found.
Host adapter 6 (ahci) found.
Host adapter 7 (ahci) found.
Host adapter 8 (ata_piix) found.
Host adapter 9 (ata_piix) found.
Scanning SCSI subsystem for new devices
Scanning host 0 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
 Scanning for device 0 0 0 0 ...
NEW: Host: scsi0 Channel: 00 Id: 00 Lun: 00
      Vendor: AMCC     Model: 9550SXU-12 DISK  Rev: 3.08
      Type:   Direct-Access                    ANSI SCSI revision: 05
 Scanning for device 0 0 0 1 ...
NEW: Host: scsi0 Channel: 00 Id: 00 Lun: 01
      Vendor: AMCC     Model: 9550SXU-12 DISK  Rev: 3.08
      Type:   Direct-Access                    ANSI SCSI revision: 05
 Scanning for device 0 0 0 2 ...
NEW: Host: scsi0 Channel: 00 Id: 00 Lun: 02
      Vendor: AMCC     Model: 9550SXU-12 DISK  Rev: 3.08
      Type:   Direct-Access                    ANSI SCSI revision: 05
 Scanning for device 0 0 0 3 ...
NEW: Host: scsi0 Channel: 00 Id: 00 Lun: 03
      Vendor: AMCC     Model: 9550SXU-12 DISK  Rev: 3.08
      Type:   Direct-Access                    ANSI SCSI revision: 05
Scanning host 1 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
 Scanning for device 1 0 0 0 ...
OLD: Host: scsi1 Channel: 00 Id: 00 Lun: 00
      Vendor: AMCC     Model: 9550SXU-12 DISK  Rev: 3.08
      Type:   Direct-Access                    ANSI SCSI revision: 05
 Scanning for device 1 0 1 0 ...
OLD: Host: scsi1 Channel: 00 Id: 01 Lun: 00
      Vendor: AMCC     Model: 9550SXU-12 DISK  Rev: 3.08
      Type:   Direct-Access                    ANSI SCSI revision: 05
Scanning host 2 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
Scanning host 3 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
Scanning host 4 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
Scanning host 5 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
Scanning host 6 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
Scanning host 7 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
Scanning host 8 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
 Scanning for device 8 0 0 0 ...
OLD: Host: scsi8 Channel: 00 Id: 00 Lun: 00
      Vendor: TEAC     Model: DVD-ROM DV28EV   Rev: D.AE
      Type:   CD-ROM                           ANSI SCSI revision: 05
Scanning host 9 for  SCSI target IDs  0 1 2 3 4 5 6 7, all LUNs
4 new device(s) found.
0 device(s) removed.
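
To make the ID-versus-LUN difference easier to see, here is a minimal Python sketch that groups the "Host: ... Id: ... Lun: ..." lines from the scan above by target. The sample text is copied from that output; the function name `luns_per_target` is just an illustrative choice. One target carrying several LUNs would be consistent with a single unit being presented to the OS as several volumes, though that reading is an assumption and not something the scan itself proves.

```python
import re
from collections import defaultdict

# Address lines copied from the rescan output above.
SCAN = """\
NEW: Host: scsi0 Channel: 00 Id: 00 Lun: 00
NEW: Host: scsi0 Channel: 00 Id: 00 Lun: 01
NEW: Host: scsi0 Channel: 00 Id: 00 Lun: 02
NEW: Host: scsi0 Channel: 00 Id: 00 Lun: 03
OLD: Host: scsi1 Channel: 00 Id: 00 Lun: 00
OLD: Host: scsi1 Channel: 00 Id: 01 Lun: 00
"""

PATTERN = re.compile(r"Host: (scsi\d+) Channel: (\d+) Id: (\d+) Lun: (\d+)")

def luns_per_target(text):
    """Map (host, channel, target id) to the list of LUNs seen under it."""
    targets = defaultdict(list)
    for host, channel, target, lun in PATTERN.findall(text):
        targets[(host, int(channel), int(target))].append(int(lun))
    return dict(targets)

for (host, channel, target), luns in sorted(luns_per_target(SCAN).items()):
    print(f"{host} channel {channel} id {target}: LUNs {luns}")
```

On this input, scsi0 has one target with four LUNs, while scsi1 has two targets with one LUN each.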


The fdisk output is even stranger, showing three 2 TiB drives and one 1.3 TiB drive (although the total roughly matches up when you consider the actual size of 1.82 TB per disk).

Code:
# fdisk -l
Disk /dev/sdb: 7.3 TiB, 7999955402752 bytes, 15624912896 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 50C73AA3-59A7-472D-89A6-37237009B0DC

Device     Start         End     Sectors  Size Type
/dev/sdb1   2048 15624910847 15624908800  7.3T Linux filesystem


Disk /dev/sda: 167.6 GiB, 179989118976 bytes, 351541248 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 24DB6F13-D89B-403F-AEF5-E0E24AD5C787

Device       Start       End   Sectors  Size Type
/dev/sda1     2048      6143      4096    2M BIOS boot
/dev/sda2     6144    268287    262144  128M EFI System
/dev/sda3   268288   1316863   1048576  512M Linux filesystem
/dev/sda4  1316864 351539199 350222336  167G Linux filesystem


Disk /dev/sdc: 2 TiB, 2199023255040 bytes, 4294967295 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdd: 2 TiB, 2199023255040 bytes, 4294967295 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sde: 2 TiB, 2199023255040 bytes, 4294967295 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdf: 1.3 TiB, 1402885637632 bytes, 2740011011 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Has anyone seen anything like this before? Did I screw something up? Did I hit a bug? I'm not sure what else to try, and I'm running out of hair 8O
_________________
"Who are you to judge the life I live? I know I'm not perfect and I don't live to be, but before you start pointing fingers... make sure your hands are clean." ~Bob Marley


Last edited by Crimjob on Tue Oct 16, 2018 10:11 pm; edited 1 time in total
pjp
Administrator


Joined: 16 Apr 2002
Posts: 17455

Posted: Tue Oct 16, 2018 8:30 pm    Post subject:

I don't have experience with these controllers, but here are some thoughts.

I'd try to set Controller 1's policies for Disk Spinup and Auto-Carving to match controller 0, then retry.

Some differences:

Quote:
Controller 1 inventory (this controller has the issue)

/c1 Number of Ports = 12
/c1 Number of Drives = 12
/c1 Number of Units = 1

/c1 Disk Spinup Policy = 1

/c1 Auto-Carving Policy = on

Controller 0 inventory (this controller works fine)

/c0 Number of Ports = 12
/c0 Number of Drives = 10
/c0 Number of Units = 2

/c0 Disk Spinup Policy = 4

/c0 Auto-Carving Policy = off


If that doesn't work, then I'd try matching the physical disk layout to be exactly the same. For testing, 0 SSD on either controller, or 1 SSD per controller.

Or maybe putting an SSD on a motherboard controller only (if available). This keeps the exact physical disk arrangement on both controllers 0 and 1. Or maybe even using a USB drive for the OS just for testing.
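
Comparing two long "show all" dumps by eye is error-prone, so here is a minimal Python sketch of doing it mechanically. It assumes the output has been saved as text and only looks at the "/cX key = value" lines; the helper names `parse_settings` and `diff_settings` are hypothetical, and the sample strings are a shortened excerpt of the settings quoted above.

```python
def parse_settings(text, prefix):
    """Collect 'key = value' pairs from lines that start with the controller prefix."""
    settings = {}
    for line in text.splitlines():
        if line.startswith(prefix) and " = " in line:
            key, value = line[len(prefix):].split(" = ", 1)
            settings[key.strip()] = value.strip()
    return settings

def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for every setting that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Excerpts of the two inventories quoted in this thread.
C0 = """\
/c0 Number of Drives = 10
/c0 Disk Spinup Policy = 4
/c0 Auto-Carving Policy = off
/c0 Auto-Carving Size = 2048 GB
"""
C1 = """\
/c1 Number of Drives = 12
/c1 Disk Spinup Policy = 1
/c1 Auto-Carving Policy = on
/c1 Auto-Carving Size = 2048 GB
"""

differences = diff_settings(parse_settings(C0, "/c0 "), parse_settings(C1, "/c1 "))
for key, (v0, v1) in differences.items():
    print(f"{key}: c0={v0} c1={v1}")
```

Settings that match on both cards, such as Auto-Carving Size here, drop out of the result, leaving only the lines worth a closer look.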
_________________
Slowly I turned. Step by step.
Crimjob
Tux's lil' helper


Joined: 04 Dec 2006
Posts: 109

Posted: Tue Oct 16, 2018 10:09 pm    Post subject:

Thank you, pjp!

You nailed it!

I swear I printed those "show all" outputs off several times and compared them line by line. I must have gotten them mixed up somewhere. I did not reset controller 0 (as it was working fine), but I did reset controller 1, and while I noticed the Auto-Carving Size remained at 2048 GB (2 TB) for both cards, apparently Auto-Carving itself is enabled by default after a reset. With my Windows setup, I only ever had arrays of 2TB on there, so I never would have noticed.

I turned off Auto Carving and deleted the RAID 10 array / unit from Controller 1, recreated it and BOOM I have a fully / properly sized RAID 10 array (I was even able to do the 12 drive RAID 10 as I initially planned but thought wouldn't work).
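
For anyone who lands here with the same symptom, the numbers in the earlier fdisk output line up with carving. Below is a minimal Python sketch of the arithmetic. The sector counts are taken from the fdisk output in the first post (the per-volume size from /dev/sdc, the full array size from /dev/sdb on the working controller); the assumption that each mirror pair contributes a quarter of the 8-drive array, and the `carve` helper itself, are illustrative and not taken from any 3ware documentation.

```python
# Size of each carved volume in 512-byte sectors, as fdisk reported for /dev/sdc.
CARVE_SECTORS = 4294967295
# Full 8-drive RAID 10 array in sectors, as fdisk reported for /dev/sdb.
ARRAY_SECTORS = 15624912896
# Assumption: every mirror pair contributes an equal share of the array.
MIRROR_SECTORS = ARRAY_SECTORS // 4

def carve(total_sectors, carve_sectors=CARVE_SECTORS):
    """Split a unit into carve-sized volumes plus one remainder volume."""
    full, rest = divmod(total_sectors, carve_sectors)
    return [carve_sectors] * full + ([rest] if rest else [])

for drives in (6, 8, 10, 12):
    volumes = carve((drives // 2) * MIRROR_SECTORS)
    print(f"{drives} drives: {len(volumes)} volumes, last one {volumes[-1]} sectors")
```

For the 8-drive case this gives three volumes of 4294967295 sectors and a remainder of 2740011011 sectors, which is exactly the size fdisk showed for /dev/sdf. With roughly 1.82 TB per mirror and a 2 TiB carve size, the volume count also happens to equal the number of mirror pairs for 6, 8, 10, and 12 drives, which would explain why the carved volumes looked like the RAID 10 subgroups.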

Thank you so much for your assistance! It always seems to be some seemingly minor detail I glaze over at 3AM that catches me :)
_________________
"Who are you to judge the life I live? I know I'm not perfect and I don't live to be, but before you start pointing fingers... make sure your hands are clean." ~Bob Marley
pjp
Administrator


Joined: 16 Apr 2002
Posts: 17455

Posted: Tue Oct 16, 2018 10:12 pm    Post subject:

You're welcome. I'm glad that was actually it.

I've been in similar situations. Often it just takes a second set of eyes that haven't been staring at the same thing for so long.
_________________
Slowly I turned. Step by step.