Gentoo Forums
SOLVED: RAID5: compute_blocknr: map not correct
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 5032
Location: almost Mile High in the USA

Posted: Fri Jul 10, 2015 3:40 pm    Post subject: SOLVED: RAID5: compute_blocknr: map not correct

Problem turned out to be HARDWARE and not software.
---------------------------------------------------------------------------------

Wondering if anyone has been getting these on 2TB disks lately (MDRAID)?

I figure that it may be a hardware issue but just wanted to make sure.

I'm trying to run two 2TB disks. I tried these two configurations:
1. Two disk RAID5 with one missing disk (Degraded mode)
2. A one-disk "degenerate" RAID5 (degraded mode) to which I then added the second disk (so it becomes a degenerate RAID5, or actually more like a RAID1)

Both ways I tried a 1.2 superblock.

Both of these configurations are getting the error

Code:
compute_blocknr: map not correct


in dmesg, and the machine hangs on disk I/O to the RAID array. I guess I have to use these disks as JBOD for now until I root-cause this; perhaps this is ultimately a hardware issue too... ugh. Memtest86 passes on this machine through at least 1 pass. I suspect the SATA controllers may have issues.

I'll need to incorporate the third disk but not until I get a backup onto the degraded or degenerate array... The third disk is currently the backup disk and I don't want to sacrifice its contents just yet.
_________________
Intel Core i7 2700K@ 4.1GHz/HD3000 graphics/8GB DDR3/180GB SSD
What am I supposed watching?


Last edited by eccerr0r on Sat Dec 05, 2015 9:13 pm; edited 1 time in total
Keruskerfuerst
Advocate


Joined: 01 Feb 2006
Posts: 2097

Posted: Sun Jul 12, 2015 7:10 am

1. Try to use smartmontools
2. Try to use the two disks separately: format them with ext4 and do a write check (e.g. with dd)
frostschutz
Advocate


Joined: 22 Feb 2005
Posts: 2829
Location: Germany

Posted: Sun Jul 12, 2015 10:16 am

Which kernel version?

Code:

drivers/md/raid5.c
        chunk_number = stripe * data_disks + i;
        r_sector = chunk_number * sectors_per_chunk + chunk_offset;

        check = raid5_compute_sector(conf, r_sector,
                                     previous, &dummy1, &sh2);
        if (check != sh->sector || dummy1 != dd_idx || sh2.pd_idx != sh->pd_idx
                || sh2.qd_idx != sh->qd_idx) {
                printk(KERN_ERR "md/raid:%s: compute_blocknr: map not correct\n",
                       mdname(conf->mddev));
                return 0;
        }
        return r_sector;


I'd be worried if I got that message from regular RAID usage (degraded or not). Can you show the output of mdadm --examine /dev/sd[xyz]6 for the RAID members? Are you trying to access beyond the end of the device?
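
For readers following along: the check in the excerpt above is a round trip. compute_blocknr turns a (device sector, data-disk index) pair back into an array sector, then feeds that through raid5_compute_sector and complains if it does not land where it started. Below is a minimal user-space sketch of that idea for the left-symmetric layout only, written in Python rather than kernel C. The function names mirror the kernel's, but the signatures, the ValueError, and the simplifications (no RAID6 Q disk, no reshape or "previous" geometry) are assumptions made for illustration, not the kernel's implementation.

```python
def compute_sector(r_sector, raid_disks, sectors_per_chunk):
    """Map an array sector to (device sector, data disk index, parity disk index).

    Simplified model of the left-symmetric RAID5 layout.
    """
    data_disks = raid_disks - 1
    chunk_number, chunk_offset = divmod(r_sector, sectors_per_chunk)
    stripe, dd_idx = divmod(chunk_number, data_disks)
    # Parity rotates backwards from the last disk, one step per stripe.
    pd_idx = data_disks - (stripe % raid_disks)
    # Data chunks start on the disk right after the parity disk.
    dd_idx = (pd_idx + 1 + dd_idx) % raid_disks
    return stripe * sectors_per_chunk + chunk_offset, dd_idx, pd_idx


def compute_blocknr(dev_sector, dd_idx, pd_idx, raid_disks, sectors_per_chunk):
    """Inverse mapping, followed by the same kind of round-trip check."""
    data_disks = raid_disks - 1
    stripe, chunk_offset = divmod(dev_sector, sectors_per_chunk)
    i = dd_idx
    if i < pd_idx:
        i += raid_disks
    i -= pd_idx + 1
    r_sector = (stripe * data_disks + i) * sectors_per_chunk + chunk_offset

    # Map forward again; all three values must match what we were given.
    check = compute_sector(r_sector, raid_disks, sectors_per_chunk)
    if check != (dev_sector, dd_idx, pd_idx):
        raise ValueError("compute_blocknr: map not correct")
    return r_sector
```

With 3 raid devices and a 512K chunk (1024 sectors), as in this thread, every array sector should survive the round trip. Handing the inverse a parity index that does not belong to that stripe, for example compute_blocknr(0, 0, 1, 3, 1024), trips the check. In this model, then, the message means the inputs to the calculation were inconsistent; it does not by itself say whether software or hardware made them so.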
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 5032
Location: almost Mile High in the USA

Posted: Sun Jul 12, 2015 3:42 pm

This was seen on 3.17.8-gentoo-r1. Yes indeed this is a dangerous error.

Code:
/dev/sdh2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : feed:face:dead:beef (faked)
           Name : seagate750G:0  (local to host seagate750G)
  Creation Time : Thu Jul  9 22:03:00 2015
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 3906267136 (1862.65 GiB 2000.01 GB)
     Array Size : 3906267136 (3725.31 GiB 4000.02 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : active
    Device UUID : dead:beef:feed:cafe (faked)

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Jul  9 23:41:12 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 1f46a62b - correct
         Events : 3751

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AA. ('A' == active, '.' == missing, 'R' == replacing)


Again I cannot discount hardware problems (as in SATA, motherboard) but the hard drives themselves report no errors (they're fairly young - less than 200 hours. No reallocated sectors, no pending sectors, looks clean.)

I guess I have to test them individually but need a good test based on block numbers or some other method... Something like memtest86 but for hard drives...
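
One way to get a test "based on block numbers" is to stamp every block with its own number and read it all back. The following is only a rough sketch under stated assumptions: the function names, the 4096-byte block size, and the 64-bit little-endian stamp are all made up for illustration, and write_pattern DESTROYS whatever is on the target, so it is only suitable for an empty disk or a scratch file.

```python
import os
import struct


def make_block(block_no, block_size=4096):
    """Return one block filled with its own 64-bit block number, repeated."""
    if block_size % 8:
        raise ValueError("block_size must be a multiple of 8")
    return struct.pack("<Q", block_no) * (block_size // 8)


def write_pattern(path, n_blocks, block_size=4096):
    """DESTRUCTIVE: overwrite the first n_blocks blocks of path with stamps."""
    with open(path, "r+b") as f:
        for block_no in range(n_blocks):
            f.write(make_block(block_no, block_size))
        f.flush()
        os.fsync(f.fileno())  # push the data out of the page cache to the device


def verify_pattern(path, n_blocks, block_size=4096):
    """Return the list of block numbers whose contents do not match their stamp."""
    bad = []
    with open(path, "rb") as f:
        for block_no in range(n_blocks):
            if f.read(block_size) != make_block(block_no, block_size):
                bad.append(block_no)
    return bad
```

Because each block carries its own address, this catches data that landed in the wrong place as well as flipped bits, which is the kind of fault a flaky controller might produce. One caveat: a read-back right after writing may be served from the page cache rather than the disk, so the test region should be larger than RAM, or the caches dropped in between.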
frostschutz
Advocate


Joined: 22 Feb 2005
Posts: 2829
Location: Germany

Posted: Sun Jul 12, 2015 4:30 pm

Is it a 32bit system without large block device support?

I doubt it's a hardware issue; this is a software calculation going wrong somehow.
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 5032
Location: almost Mile High in the USA

Posted: Sun Jul 12, 2015 5:19 pm

That's a good point; yes, this is a 32-bit kernel and userland system. I was worried that a hardware issue was causing incorrect values to be read, but yes, this may very well be an overflow issue somehow, which would make it a real kernel bug...

I think this machine can run 64-bit code, it might be worth trying to get a 64-bit userland to test... As this machine has only 1GB of RAM I tried the 32 bit kernel to save on pointer memory. (I don't have any extra DDR2 lying around... all of the big DDR2 DIMMs are in my server (running a 64-bit kernel) that I need the memory in. Actually these 2T disks will be put in the server once I validate them!)

Large block support is enabled as far as I can tell:
Code:
CONFIG_BLOCK=y
CONFIG_LBDAF=y
CONFIG_BLK_DEV_BSG=y
# CONFIG_BLK_DEV_BSGLIB is not set
# CONFIG_BLK_DEV_INTEGRITY is not set
# CONFIG_BLK_DEV_THROTTLING is not set
# CONFIG_BLK_CMDLINE_PARSER is not set


EDIT

After adding a few more debugging printks, it looks like dummy1 != dd_idx is the failing check in that code.
Hmm. Need to study this some more. The actual weirdness is that when I dump a whole bunch of stuff to the array, the array hangs and makes no more forward progress: it is livelocked on writing. Probably due to barriers of some sort, I can no longer sync(1) and have to do an unclean reboot of the machine.

I just noticed that this appears to happen after writing 1TB or so to the array. Unsure if this has anything to do with the issue or just so happens to be the point where something else got triggered.
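
A note on the arithmetic behind the overflow suspicion: with 512-byte sectors, 1 TiB is exactly 2^31 sectors, which is where a signed 32-bit sector counter would wrap, and 2 TiB is 2^32 sectors, where an unsigned one would. The sketch below just works those numbers through in Python, using the sizes from the mdadm --examine output earlier in the thread. The helper names are made up for illustration, and it does not show that any such truncation actually happens in the md code, only why "about 1TB" is a suspicious-looking figure on a 32-bit kernel.

```python
SECTOR_SIZE = 512  # bytes per sector


def as_u32(x):
    """Value of x after being stored in an unsigned 32-bit variable."""
    return x & 0xFFFFFFFF


def as_s32(x):
    """Value of x after being stored in a signed 32-bit variable."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x


SIGNED_WRAP_BYTES = (1 << 31) * SECTOR_SIZE    # 1 TiB
UNSIGNED_WRAP_BYTES = (1 << 32) * SECTOR_SIZE  # 2 TiB

member_sectors = 3906267136            # "Avail Dev Size" from mdadm --examine
data_disks = 3 - 1                     # "Raid Devices : 3", one chunk per stripe is parity
array_sectors = data_disks * member_sectors

print("member fits in u32:", as_u32(member_sectors) == member_sectors)
print("member as s32:     ", as_s32(member_sectors))
print("array fits in u32: ", as_u32(array_sectors) == array_sectors)
```

Each 2TB member still fits in an unsigned 32-bit sector count but not in a signed one, and the array as a whole fits in neither. With CONFIG_LBDAF=y the kernel's sector_t is 64 bits wide even on a 32-bit build, so a problem of this kind would have to come from some intermediate value being held in a narrower type.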
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 5032
Location: almost Mile High in the USA

Posted: Fri Dec 04, 2015 9:30 am

I just temporarily moved these disks to another (true) 32-bit machine. So far so good, but it's not quite done yet: this machine does not have Gbit Ethernet, so I can't copy nearly as fast. (Perhaps I should have also tried sticking this SiL3114 PCI board in the other machine too; alas, I think that board is kind of broken anyway, since it was a hardware swapout already.)

Done now, looks like in this case we have a hardware problem. Oh well.
All times are GMT