Gentoo Forums
Gentoo KVM guest loses disk access
mrray
n00b


Joined: 02 Feb 2011
Posts: 4

Posted: Wed May 01, 2013 6:26 pm    Post subject: Gentoo KVM guest loses disk access

Hi all! (Long time, no see)

Anyways, I have a Gentoo guest running on a KVM host with a cloud provider.
The guest resides on an SSD array, which gives massive performance on a server that runs semi-I/O-intensive stuff like amavisd-new and some other services.

The problem, however, is that the guest keeps losing disk access at random intervals. It can be after 4 weeks, but it has happened after as little as 8 hours of uptime.
A reboot solves the immediate problem, but it means I have to be available to do just that, and I like (and need) my beauty sleep.

The provider's support department has been very forthcoming on this issue and has made some configuration changes to the virtual hardware, but also suggested I head on over here and ask if anyone has seen the same problem with Gentoo KVM guests.

Here is an enclosed kernel log:
Code:
Apr 20 03:07:21 [kernel] [367916.497177] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 20 03:07:21 [kernel] [367916.497184] ata1.00: failed command: WRITE DMA
Apr 20 03:07:21 [kernel] [367916.497188] ata1.00: cmd ca/00:08:5b:f3:e9/00:00:00:00:00/e2 tag 0 dma 4096 out
Apr 20 03:07:21 [kernel] [367916.497188] res 40/00:01:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Apr 20 03:07:21 [kernel] [367916.497190] ata1.00: status: { DRDY }
Apr 20 03:07:21 [kernel] [367916.502915] ata1: soft resetting link
Apr 20 03:07:21 [kernel] [367916.654706] ata1.01: NODEV after polling detection
Apr 20 03:07:21 [kernel] [367916.655703] ata1.00: configured for MWDMA2
Apr 20 03:07:21 [kernel] [367916.655711] ata1.00: device reported invalid CHS sector 0
Apr 20 03:07:21 [kernel] [367916.655728] sd 0:0:0:0: [sda]
Apr 20 03:07:21 [kernel] [367916.655730] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 20 03:07:21 [kernel] [367916.655731] sd 0:0:0:0: [sda]
Apr 20 03:07:21 [kernel] [367916.655733] Sense Key : Aborted Command [current] [descriptor]
Apr 20 03:07:21 [kernel] [367916.655735] Descriptor sense data with sense descriptors (in hex):
Apr 20 03:07:21 [kernel] [367916.655736] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 20 03:07:21 [kernel] [367916.655740] 00 00 00 00
Apr 20 03:07:21 [kernel] [367916.655743] sd 0:0:0:0: [sda]
Apr 20 03:07:21 [kernel] [367916.655744] Add. Sense: No additional sense information
Apr 20 03:07:21 [kernel] [367916.655746] sd 0:0:0:0: [sda] CDB:
Apr 20 03:07:21 [kernel] [367916.655746] Write(10): 2a 00 02 e9 f3 5b 00 00 08 00
Apr 20 03:07:21 [kernel] [367916.655751] end_request: I/O error, dev sda, sector 48886619
Apr 20 03:07:21 [kernel] [367916.655754] Buffer I/O error on device sda3, logical block 103
Apr 20 03:07:21 [kernel] [367916.655755] lost page write due to I/O error on sda3
Apr 20 03:07:21 [kernel] [367916.655772] ata1: EH complete
Apr 20 03:07:21 [kernel] [367916.829822] REISERFS abort (device sda3): Journal write error in flush_commit_list
Apr 20 03:10:08 [kernel] [367988.070159] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 20 03:10:08 [kernel] [367988.070165] ata1.00: failed command: WRITE DMA
Apr 20 03:10:08 [kernel] [367988.070169] ata1.00: cmd ca/00:08:04:fb:00/00:00:00:00:00/e1 tag 0 dma 4096 out
Apr 20 03:10:08 [kernel] [367988.070169] res 40/00:01:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Apr 20 03:10:08 [kernel] [367988.070171] ata1.00: status: { DRDY }
Apr 20 03:10:08 [kernel] [367988.070288] ata1: soft resetting link
Apr 20 03:10:08 [kernel] [367988.221587] ata1.01: NODEV after polling detection
Apr 20 03:10:08 [kernel] [367988.222453] ata1.00: configured for MWDMA2
Apr 20 03:10:08 [kernel] [367988.222458] ata1.00: device reported invalid CHS sector 0
Apr 20 03:10:08 [kernel] [367988.222474] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [367988.222476] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 20 03:10:08 [kernel] [367988.222478] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [367988.222480] Sense Key : Aborted Command [current] [descriptor]
Apr 20 03:10:08 [kernel] [367988.222483] Descriptor sense data with sense descriptors (in hex):
Apr 20 03:10:08 [kernel] [367988.222484] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 20 03:10:08 [kernel] [367988.222490] 00 00 00 00
Apr 20 03:10:08 [kernel] [367988.222493] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [367988.222495] Add. Sense: No additional sense information
Apr 20 03:10:08 [kernel] [367988.222497] sd 0:0:0:0: [sda] CDB:
Apr 20 03:10:08 [kernel] [367988.222498] Write(10): 2a 00 01 00 fb 04 00 00 08 00
Apr 20 03:10:08 [kernel] [367988.222504] end_request: I/O error, dev sda, sector 16841476
Apr 20 03:10:08 [kernel] [367988.222507] Buffer I/O error on device sda2, logical block 2097152
Apr 20 03:10:08 [kernel] [367988.222508] lost page write due to I/O error on sda2
Apr 20 03:10:08 [kernel] [367988.222527] ata1: EH complete
Apr 20 03:10:08 [kernel] [368034.360717] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 20 03:10:08 [kernel] [368034.360723] ata1.00: failed command: WRITE DMA
Apr 20 03:10:08 [kernel] [368034.360727] ata1.00: cmd ca/00:08:14:fd:10/00:00:00:00:00/e2 tag 0 dma 4096 out
Apr 20 03:10:08 [kernel] [368034.360727] res 40/00:01:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Apr 20 03:10:08 [kernel] [368034.360729] ata1.00: status: { DRDY }
Apr 20 03:10:08 [kernel] [368034.360848] ata1: soft resetting link
Apr 20 03:10:08 [kernel] [368034.512510] ata1.01: NODEV after polling detection
Apr 20 03:10:08 [kernel] [368034.513452] ata1.00: configured for MWDMA2
Apr 20 03:10:08 [kernel] [368034.513456] ata1.00: device reported invalid CHS sector 0
Apr 20 03:10:08 [kernel] [368034.513471] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [368034.513473] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 20 03:10:08 [kernel] [368034.513474] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [368034.513476] Sense Key : Aborted Command [current] [descriptor]
Apr 20 03:10:08 [kernel] [368034.513478] Descriptor sense data with sense descriptors (in hex):
Apr 20 03:10:08 [kernel] [368034.513479] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 20 03:10:08 [kernel] [368034.513483] 00 00 00 00
Apr 20 03:10:08 [kernel] [368034.513485] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [368034.513486] Add. Sense: No additional sense information
Apr 20 03:10:08 [kernel] [368034.513488] sd 0:0:0:0: [sda] CDB:
Apr 20 03:10:08 [kernel] [368034.513489] Write(10): 2a 00 02 10 fd 14 00 00 08 00
Apr 20 03:10:08 [kernel] [368034.513494] end_request: I/O error, dev sda, sector 34667796
Apr 20 03:10:08 [kernel] [368034.513502] Buffer I/O error on device sda2, logical block 4325442
Apr 20 03:10:08 [kernel] [368034.513503] lost page write due to I/O error on sda2
Apr 20 03:10:08 [kernel] [368034.513520] ata1: EH complete
Apr 20 03:10:08 [kernel] [368084.435738] ata1.00: limiting speed to MWDMA1:PIO2
Apr 20 03:10:08 [kernel] [368084.435744] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 20 03:10:08 [kernel] [368084.435748] ata1.00: failed command: WRITE DMA
Apr 20 03:10:08 [kernel] [368084.435753] ata1.00: cmd ca/00:08:b4:10:39/00:00:00:00:00/e2 tag 0 dma 4096 out
Apr 20 03:10:08 [kernel] [368084.435753] res 40/00:01:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Apr 20 03:10:08 [kernel] [368084.435755] ata1.00: status: { DRDY }
Apr 20 03:10:08 [kernel] [368084.435876] ata1: soft resetting link
Apr 20 03:10:08 [kernel] [368084.587551] ata1.01: NODEV after polling detection
Apr 20 03:10:08 [kernel] [368084.588471] ata1.00: configured for MWDMA1
Apr 20 03:10:08 [kernel] [368084.588476] ata1.00: device reported invalid CHS sector 0
Apr 20 03:10:08 [kernel] [368084.588493] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [368084.588494] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 20 03:10:08 [kernel] [368084.588496] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [368084.588497] Sense Key : Aborted Command [current] [descriptor]
Apr 20 03:10:08 [kernel] [368084.588501] Descriptor sense data with sense descriptors (in hex):
Apr 20 03:10:08 [kernel] [368084.588502] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 20 03:10:08 [kernel] [368084.588508] 00 00 00 00
Apr 20 03:10:08 [kernel] [368084.588511] sd 0:0:0:0: [sda]
Apr 20 03:10:08 [kernel] [368084.588512] Add. Sense: No additional sense information
Apr 20 03:10:08 [kernel] [368084.588514] sd 0:0:0:0: [sda] CDB:
Apr 20 03:10:08 [kernel] [368084.588515] Write(10): 2a 00 02 39 10 b4 00 00 08 00
Apr 20 03:10:08 [kernel] [368084.588524] Buffer I/O error on device sda2, logical block 4653750
Apr 20 03:10:08 [kernel] [368084.588526] lost page write due to I/O error on sda2
Apr 20 03:10:08 [kernel] [368084.588548] ata1: EH complete
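
For anyone cross-checking the log above, here is a minimal Python sketch that decodes the LBA and sector count from a libata `cmd` line. The function name `lba28_from_taskfile` is made up for illustration, and the field layout (command/feature:nsect:lbal:lbam:lbah/.../device) is my reading of the dump format. It agrees with the sector numbers the kernel itself reports in the `end_request` lines, but treat it as an assumption rather than a reference decoder.

```python
import re

# Assumed layout of the libata taskfile dump:
#   cmd CMD/FEATURE:NSECT:LBAL:LBAM:LBAH/<five "hob" bytes>/DEVICE
TASKFILE = re.compile(
    r"cmd (?P<cmd>[0-9a-f]{2})/(?P<feature>[0-9a-f]{2}):(?P<nsect>[0-9a-f]{2}):"
    r"(?P<lbal>[0-9a-f]{2}):(?P<lbam>[0-9a-f]{2}):(?P<lbah>[0-9a-f]{2})/"
    r"(?:[0-9a-f]{2}:){4}[0-9a-f]{2}/(?P<device>[0-9a-f]{2})"
)


def lba28_from_taskfile(line):
    """Return (lba, sector_count) for a 28-bit ATA command line, or None."""
    match = TASKFILE.search(line)
    if match is None:
        return None
    field = {name: int(value, 16) for name, value in match.groupdict().items()}
    # In LBA28 addressing the top four address bits live in the low nibble
    # of the device register; the remaining 24 bits are LBAH:LBAM:LBAL.
    lba = (
        ((field["device"] & 0x0F) << 24)
        | (field["lbah"] << 16)
        | (field["lbam"] << 8)
        | field["lbal"]
    )
    return lba, field["nsect"]


first_failure = "ata1.00: cmd ca/00:08:5b:f3:e9/00:00:00:00:00/e2 tag 0 dma 4096 out"
print(lba28_from_taskfile(first_failure))  # (48886619, 8)
```

The decoded LBA 48886619 matches the `end_request: I/O error, dev sda, sector 48886619` line, and 8 sectors of 512 bytes is the 4096-byte write shown as `dma 4096 out`. The failing sectors are spread across sda2 and sda3 rather than clustered in one spot.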


I originally thought it was a hardware issue, but the provider seems to think differently, and I just have to go with it since I don't have access to the hardware...
Any ideas?
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54244
Location: 56N 3W

Posted: Wed May 01, 2013 9:09 pm    Post subject:

mrray,

It looks like a HDD or HDD data cable issue.
The SMART error log would be useful.

I guess the storage your provider presents to your KVM guest as sda is spread over a lot of physical devices.
I'm surprised to see your storage appear as /dev/sda too. That suggests you are working through the emulated hardware that KVM provides.
The virtio driver is faster but your provider may not want to use that. Your block devices would be /dev/vda ... then.

Bugs in KVM cannot be ruled out. Search the kernel bugtracker.

Will your KVM provider provide storage access via virtio?
That may help narrow down the problem.
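
As a quick way to see which path a guest disk is on, here is a minimal Python sketch that classifies block device names the way the paragraph above describes: `sd*` for disks reached through the SCSI/libata layer (which includes KVM's emulated IDE/SATA controller) and `vd*` for virtio-blk. The function names are hypothetical, and the name prefixes are only a convention, so take the result as a hint rather than proof.

```python
import os


def classify_block_device(name):
    """Guess the storage path from a kernel block device name such as 'sda'."""
    if name.startswith("vd"):
        return "virtio-blk"
    if name.startswith("sd"):
        return "scsi/libata"
    return "other"


def list_block_devices(sys_block="/sys/block"):
    """Return the block device names the kernel exposes, or [] if unavailable."""
    try:
        return sorted(os.listdir(sys_block))
    except OSError:
        return []


for device in list_block_devices():
    print(device, classify_block_device(device))
```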
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
mrray
n00b


Joined: 02 Feb 2011
Posts: 4

Posted: Thu May 02, 2013 9:00 am    Post subject:

NeddySeagoon wrote:
mrray,

It looks like a HDD or HDD data cable issue.
The SMART error log would be useful.


I installed smartmontools just now, so I will see what I can cough up and keep you posted.

NeddySeagoon wrote:
I'm surprised to see your storage appear as /dev/sda too. That suggests you are working through the emulated hardware that KVM provides.
The virtio driver is faster but your provider may not want to use that. Your block devices would be /dev/vda ... then.

Bugs in KVM cannot be ruled out. Search the kernel bugtracker.

Will your KVM provider provide storage access via virtio?
That may help narrow down the problem.


I have to do a bit of guesswork here as I have very limited experience with the KVM hypervisor, but this VM was converted from its original Xen to KVM when I migrated it onto SSD storage.

I am quite sure the provider can give me access via virtio, but wouldn't that require a lot of work on my part? Gentoo and virtio do not seem to be good friends, at least that is what I gathered from a quick Google search.
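
On the "how much work" question, much of it usually comes down to whether the guest kernel has the virtio drivers enabled. Here is a minimal Python sketch that checks a kernel `.config` for `CONFIG_VIRTIO_PCI` and `CONFIG_VIRTIO_BLK`, which are the mainline option names for the virtio PCI transport and the virtio block driver. The function name `missing_virtio_options` and the sample config text are made up for illustration, and the option list is only an assumed minimum for disk access.

```python
def missing_virtio_options(
    config_text, required=("CONFIG_VIRTIO_PCI", "CONFIG_VIRTIO_BLK")
):
    """Return the required options that are not set to 'y' or 'm'."""
    enabled = set()
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(key)
    return [option for option in required if option not in enabled]


# Hypothetical excerpt; in practice read the text from the kernel source
# tree's .config file.
sample_config = """\
CONFIG_VIRTIO_PCI=y
# CONFIG_VIRTIO_BLK is not set
CONFIG_ATA_PIIX=y
"""
print(missing_virtio_options(sample_config))  # ['CONFIG_VIRTIO_BLK']
```

If the drivers are built as modules rather than built in, the root filesystem would presumably need them in an initramfs. Since the disk would show up as /dev/vda instead of /dev/sda, any `root=` boot parameter or /etc/fstab entry that names /dev/sda would also need updating.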
All times are GMT