Gentoo Forums
qemu kvm slowdown with hdd activity and host tasks.
vexatious
Tux's lil' helper


Joined: 24 Aug 2010
Posts: 77

Posted: Thu May 07, 2015 9:54 pm    Post subject: qemu kvm slowdown with hdd activity and host tasks.

Host kernel is 4.0.1-rc2 with a config based on Fedora 21's kernel-3.19.2-2.01, with extra options turned on for KVM and GPU passthrough with vfio-pci (pci-stub and KVM-AMD). Guest is "Windows 7 Home Premium x64".

Hardware: AMD FX 6300, 16GB RAM and an R9 290X.

Performance doesn't seem right, considering others claim 95% of native performance: https://www.youtube.com/watch?v=37D2bRsthfI . I seem to be getting considerable CPU overhead and about 40-50% of native performance in "Metal Gear Solid Ground Zeroes".

I'm using the following commands:
Code:

# Huge pages = total guest memory in megabytes divided by two (2 MB pages), optionally plus about 30 MB of headroom. Here: 5120 / 2 = 2560.
HP="2560"
echo $HP > /proc/sys/vm/nr_hugepages

# Set host CPU governor
GOV="ondemand"
for CPU in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do
echo "$GOV" > "$CPU"
done

#echo -n 5 > /sys/devices/system/cpu/cpufreq/$GOV/sampling_down_factor
echo 20 > /sys/devices/system/cpu/cpufreq/$GOV/up_threshold

# Set I/O scheduler for target device (usually hard drive)
SCHED="noop"
HD="sdb"
echo ${SCHED} > /sys/block/${HD}/queue/scheduler

# Bind physical hardware (will place script content here later)
sh vfio-bind 0000:01:00.0 0000:01:00.1 0000:00:14.2

qemu-system-x86_64 -realtime mlock=on \
-mem-path /mnt/hugepages -mem-prealloc -no-hpet \
-machine accel=kvm,kernel_irqchip=on,mem-merge=off \
-balloon virtio -watchdog-action none \
-enable-kvm -m 5G -cpu kvm64,hv_vapic,hv_time,hv_relaxed,hv_spinlocks=0x1000 \
-smp sockets=1,cores=6,threads=1 -rtc base=localtime,clock=host \
-net user -net nic,model=virtio -parallel none -serial none \
-device usb-host,hostbus=5,hostaddr=2 -usbdevice mouse \
-drive file=/mnt/gentoo/win_7.img,index=0,format=raw,aio=threads,id=drive0,cache=unsafe,if=virtio,copy-on-read=on \
-cdrom /home/user/Downloads/virtio-win-0.1-100.iso \
-device vfio-pci,host=01:00.0,multifunction=on,x-vga=off \
-device vfio-pci,host=01:00.1 \
-device vfio-pci,host=00:14.2
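
The vfio-bind helper called in the script above was never posted in this thread. The following is only a minimal sketch of what such a helper commonly does, assuming the vfio-pci module is already loaded. The function name and the SYSFS_ROOT override (there so the logic can be dry-run against a fake directory tree) are assumptions for illustration, not the author's script:

```shell
# Sketch of a vfio-bind style helper (hypothetical; not the script used in this thread).
# Assumes vfio-pci is already loaded (modprobe vfio-pci) and that it runs as root.
# SYSFS_ROOT is an illustrative override for dry runs; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

vfio_bind() {
    local dev vendor device devdir
    for dev in "$@"; do
        devdir="$SYSFS_ROOT/bus/pci/devices/$dev"
        vendor=$(cat "$devdir/vendor") || return 1
        device=$(cat "$devdir/device") || return 1
        # Detach the device from whatever host driver currently owns it.
        if [ -e "$devdir/driver" ]; then
            echo "$dev" > "$devdir/driver/unbind"
        fi
        # Ask vfio-pci to claim devices with this vendor:device ID.
        echo "$vendor $device" > "$SYSFS_ROOT/bus/pci/drivers/vfio-pci/new_id"
    done
}

# Example (as root): vfio_bind 0000:01:00.0 0000:01:00.1 0000:00:14.2
```

Note that new_id matches by vendor:device ID, so any other device with the same ID would be claimed by vfio-pci as well.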


"Metal Gear Solid Ground Zeroes" gets about 10-40 frames per second in Qemu KVM. Natively it gets about 45-60+ FPS. Other games don't suffer as badly, but are still nowhere near the 95% others claim.

I think I have a kernel configuration option set wrong, or I used bad CFLAGS somewhere.

Help would be greatly appreciated!
_________________
Gentoo
Slackware


Last edited by vexatious on Wed Jul 22, 2015 9:32 am; edited 6 times in total
frostschutz
Advocate


Joined: 22 Feb 2005
Posts: 2926
Location: Germany

Posted: Sat May 09, 2015 2:02 pm

How about this? http://www.linux-kvm.org/page/Tuning_KVM

i.e. -cpu host, a block device instead of file as disk, and I'd probably also stop with the ballooning.

Start with fixed resources and see how far that gets you. That's how I got the best performance out of KVM (but I'm not using GPU passthrough).
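
As a rough illustration of the "block device instead of file" suggestion, the sketch below prints the -drive/-device arguments for a host disk or partition handed to the guest. The helper name and the /dev/sdb2 path are hypothetical, and cache=none with aio=native is one common choice for host block devices, not something prescribed in this thread:

```shell
blockdev_drive_args() {
    # $1: host block device (hypothetical example: /dev/sdb2)
    # $2: drive id used to connect -drive and -device (default: hd0)
    local dev="$1" id="${2:-hd0}"
    printf '%s\n' "-drive file=$dev,format=raw,if=none,id=$id,cache=none,aio=native -device virtio-blk-pci,drive=$id"
}

# Example: qemu-system-x86_64 ... $(blockdev_drive_args /dev/sdb2)
```

An existing raw image can usually be copied onto a large-enough device with dd (for example dd if=win_7.img of=/dev/sdXN bs=4M). Double-check the target first, since this overwrites it.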
vexatious
Tux's lil' helper


Joined: 24 Aug 2010
Posts: 77

Posted: Wed May 13, 2015 11:06 am

Thanks frostschutz

Haven't had time to try a block device. I've been too lazy, and exporting a raw IMG to a block device is kind of hard and time consuming, especially since I have almost no room left on either of my HDDs. Tried your other suggestions but saw no immediate gains.

Also changed the disk image to use virtio-scsi. According to the wiki it should provide the best performance with the least overhead.
WIKI
Quote:
Virtio-scsi is designed to be the next gen replacement for the current virtio-blk driver. It provides direct access to SCSI commands and bypasses the QEMU emulator to talk directly to the target SCSI device loaded in the host’s kernel.

vhost-scsi KVM Kernel on Host
Is able to bypass the second level AIO and O_Direct overhead by using LIO, this helps performance.
No changes to guest virtio-scsi LLD
Currently does not support Live Migration



I'm using the following command now:
Quote:
qemu-system-x86_64 -nodefaults \
-monitor none -vga std -realtime mlock=on \
-machine accel=kvm,kernel_irqchip=on,mem-merge=off \
-balloon none -watchdog-action none -mem-path /mnt/hugepages -mem-prealloc \
-enable-kvm -m 5G -cpu host,hv_vapic,hv_time,hv_relaxed,hv_spinlocks=0x1fff \
-smp sockets=1,cores=6 -rtc base=localtime,clock=host \
-net user -net nic,model=virtio -parallel none -serial none \
-device usb-host,hostbus=5,hostaddr=2 -usbdevice mouse \
-device virtio-scsi-pci,id=scsihd0,num_queues=6 \
-drive file=/mnt/gentoo/win_7.img,id=hd0,format=raw,aio=native,cache=none,cache.direct=on,if=none \
-device scsi-hd,bus=scsihd0.0,drive=hd0 \
-device vfio-pci,host=01:00.0,multifunction=on,x-vga=off,x-req=on \
-device vfio-pci,host=01:00.1 \
-device vfio-pci,host=00:14.2


In "Metal Gear Solid Ground Zeroes" I noticed a huge difference if I turn down the 'shadow' quality in the graphics options. Turning down all the other options makes the game perform a little better but mostly the same, but if I change shadow quality to anything under 'extra high' ('high' or lower), the game performs much better, similar to native Windows. Why does the 'shadow' quality affect KVM performance so much? Shouldn't it load the AMD 290X GPU instead? CPU usage stays around 60% whether shadow quality is 'extra high' or lower, which I find odd. Why doesn't CPU usage climb with 'extra high' shadow quality?

"Evil Within" runs closer to native in comparison. It hovers around 40-60 FPS unless there's a large environment. The game also exhibits constant HDD swapping, and I'm using an IMG on a 5400 RPM HDD. I think it's safe to say virtio-scsi-pci doesn't cause a CPU bottleneck (CPU usage stays at 2-10% while Windows is still booting). Neither does x-data-plane or virtio (at least nothing noticeable).

Is "Metal Gear Solid Ground Zeroes" just showing a rare KVM bottleneck?

On the bright side, I ran PCMark and got faster performance on first two video transcoding tests, on kvm.

I'll keep trying I guess to see what else might improve the mysterious "Metal Gear Solid Ground Zeroes" performance issues. I think "Evil Within" could perform better as well (data loading is blazing fast with unsafe cache, despite using IMG on 5400 RPM hdd!).

Any further help would be stupendous!

edit

Here are my PCMARK7 results:
Quote:
PCMARK7 NATIVE
2015-04-30T05:46:05.1856775-07:00 [Workload Result] Video playback 23.95 fps
2015-04-30T05:46:47.1986792-07:00 [Workload Result] Video transcoding - downscaling 2056.95 kB/s


PCMARK7 KVM (audacious running in host (resampling=best), chromium and few tabs open, steam in background on kvm guest)
2015-05-13T05:58:47.7936000-07:00 [Workload Result] Video playback 24.53 fps
2015-05-13T05:58:47.8116000-07:00 [Workload Result] Video transcoding - downscaling 2648.96 kB/s


Oddly, qemu uses about 10% cpu while idle. I'm not sure if it's polling something, or there's a timing conflict.
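
One way to narrow down that idle CPU use is to look at qemu's individual threads on the host (vCPU threads, I/O threads, main loop). A small sketch that reads /proc directly; the function name is made up for illustration. The utime/stime values are cumulative clock ticks, so two samples taken a few seconds apart are needed to spot the busy thread:

```shell
list_threads() {
    # $1: PID of the qemu process. Prints "TID NAME UTIME STIME" per thread.
    local pid="$1" t stat rest
    for t in /proc/"$pid"/task/*; do
        [ -r "$t/stat" ] || continue
        stat=$(cat "$t/stat")
        # Strip "pid (comm) " so a thread name with spaces cannot shift fields.
        rest=${stat##*\) }
        # After stripping, utime and stime are the 12th and 13th fields.
        set -- $rest
        echo "${t##*/} $(cat "$t/comm") ${12} ${13}"
    done
}

# Example: list_threads "$(pidof qemu-system-x86_64)"
```

Tools such as top -H -p PID show the same per-thread breakdown interactively.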

Thanks for any help!

edit

Found something odd and, IMO, very unfortunate. If I use the isolcpus kernel boot parameter, qemu can't seem to use the unused (isolated) cores, with or without taskset. I get the impression qemu only runs virtual CPUs as high-level software threads, instead of passing real cores through with full virtualization at the kernel level. With taskset, I had to use the 'chrt -r 1' command like so:
Code:
chrt -r 1 taskset -c 0-4 qemu blahblahqemuoptions

This caused random system lock-ups and freezes during qemu initialization, and it only worked twice by pure luck (couldn't test GPU passthrough). It seems very unfortunate to have such a high-level virtual CPU subsystem based on software threads, if that is the case.
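
For context: under KVM each virtual CPU is an ordinary host thread of the qemu process, and as far as I understand, cores listed in isolcpus are excluded from the scheduler's load balancing, so a process-wide taskset over several isolated cores does not spread the vCPU threads across them. Pinning each vCPU thread to its own core is the usual workaround. The sketch below turns the output of the monitor command "info cpus" into taskset commands. It assumes lines containing "CPU #N:" and "thread_id=TID" (check the output of your QEMU version), requires a monitor to be enabled (e.g. -monitor stdio), and the function name is hypothetical:

```shell
pin_cmds_from_info_cpus() {
    # stdin: text printed by the monitor command "info cpus".
    # $1: first host core to use (default 0); vCPU N goes to core first+N.
    # Prints one "taskset -pc CORE TID" command per vCPU instead of running it.
    local first="${1:-0}"
    sed -n 's/.*CPU #\([0-9][0-9]*\):.*thread_id=\([0-9][0-9]*\).*/\1 \2/p' |
    while read -r vcpu tid; do
        echo "taskset -pc $((first + vcpu)) $tid"
    done
}
```

Printing the commands rather than executing them leaves room to review the mapping first; piping the output to sh (as root) would apply it.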

Oddly, I ended up dropping the virtual CPU cores from 6 to 4. Games run basically the same, with no noticeable loss. I'm going to use this for now, since 6 cores makes no difference except lower CPU % usage than on the four cores. Haven't been using nested page tables all along either (kvm_amd npt=0).

Sad state so far on my end with qemu and KVM, but hopefully someone more intelligent and skilled can help me out with this CPU performance problem.

Thank you.
vexatious
Tux's lil' helper


Joined: 24 Aug 2010
Posts: 77

Posted: Sat May 23, 2015 11:32 pm

After more careful analysis, I'm starting to think there's a problem with passthrough or qemu, and not CPU overhead.

Only "Metal Gear Solid Ground Zeroes" seems to show this. If I run GPU benchmarks or other games, there don't seem to be speed problems. Only with the "Metal Gear" game.

I've run many games and benchmarks that suggest CPU speed isn't really the issue (although it is still somewhat suspect). The GPU seems to perform great in almost every game and benchmark I've tried. The only exception is MGSGZ.

I'm thinking this is either a driver problem or a passthrough issue, or both. Maybe a combination of CPU virtualization and passthrough affecting the AMD drivers with this particular game?

It still seems odd to me that the host shows 10% CPU use with the Windows guest idle. Also the fact that I can reduce CPU cores to four and speed stays basically the same in MGSGZ and pretty much everything else in the Windows guest.

My only clue is a message in /var/log/syslog showing the following:
Code:
ACPI Warning: SystemIO range 0x0000000000000B00-0x0000000000000B07 conflicts with OpRegion 0x0000000000000B00-0x0000000000000B0F (\SOR1) (20150410/utaddress-254)


Catalyst 14.12 and latest 15.3 betas seem the same performance wise, and issue is the same.
vexatious
Tux's lil' helper


Joined: 24 Aug 2010
Posts: 77

Posted: Sun Jun 07, 2015 7:11 am

Went back to qemu-2.0.0, using x-data-plane. Qemu 2.1.0 and later seem to have broken the x-vga=on option, and it is sometimes more confusing to use (I have to disable the primary std GPU in Windows or the vfio GPU remains unusable). The BIOS doesn't show on the vfio GPU, and it's impossible to grab mouse and keyboard with -vga none. I've reverted to Qemu-2.0.0 since it doesn't suffer from those issues, and performance seems about the same while being more straightforward.

Here are some benchmark results with my 290X under Qemu-2.0.0, using a raw img on a 5400RPM HDD.

Star Swarm Performance Demo:
Quote:
MANTLE
Test Duration: 360 Seconds
Total Frames: 21673

Average FPS: 60.20
Average Unit Count: 4421
Maximum Unit Count: 5761
Average Batches/MS: 1190.89
Maximum Batches/MS: 4533.11
Average Batch Count: 21016
Maximum Batch Count: 164239

Quote:
D3D
Test Duration: 360 Seconds
Total Frames: 9239

Average FPS: 25.66
Average Unit Count: 4010
Maximum Unit Count: 5476
Average Batches/MS: 409.83
Maximum Batches/MS: 738.75
Average Batch Count: 17538
Maximum Batch Count: 133219



Benchmark results with same hardware, but Qemu-2.3.0 used.

Final Fantasy XIV: Heavensward Benchmark
Quote:
x-data-plane cache=none aio=threads

FINAL FANTASY XIV: Heavensward Benchmark
Tested on: 5/15/2015 7:33:15 PM
Score: 8904
Average Frame Rate: 74.981
Performance: Extremely High
-Easily capable of running the game on the highest settings.
Loading Times by Scene
Scene #1 10.920 sec
Scene #2 38.935 sec
Scene #3 14.204 sec
Scene #4 16.610 sec
Scene #5 15.449 sec
Scene #6 9.724 sec
Total Loading Time 105.843 sec

Quote:
x-data-plane cache=unsafe aio=threads

FINAL FANTASY XIV: Heavensward Benchmark
Tested on: 5/15/2015 7:14:22 PM
Score: 8826
Average Frame Rate: 73.289
Performance: Extremely High
-Easily capable of running the game on the highest settings.
Loading Times by Scene
Scene #1 9.428 sec
Scene #2 29.420 sec
Scene #3 11.545 sec
Scene #4 12.253 sec
Scene #5 7.454 sec
Scene #6 4.839 sec
Total Loading Time 74.939 sec

Quote:
x-data-plane aio=native cache=unsafe cache.direct=on

FINAL FANTASY XIV: Heavensward Benchmark
Tested on: 5/16/2015 2:15:20 AM
Score: 8695
Average Frame Rate: 71.376
Performance: Extremely High
-Easily capable of running the game on the highest settings.
Loading Times by Scene
Scene #1 11.281 sec
Scene #2 38.984 sec
Scene #3 14.218 sec
Scene #4 16.818 sec
Scene #5 15.378 sec
Scene #6 6.823 sec
Total Loading Time 103.503 sec

Quote:
if=virtio cache=unsafe aio=native

FINAL FANTASY XIV: Heavensward Benchmark
Tested on: 5/15/2015 6:00:45 PM
Score: 9415
Average Frame Rate: 78.424
Performance: Extremely High
-Easily capable of running the game on the highest settings.
Loading Times by Scene
Scene #1 10.408 sec
Scene #2 39.687 sec
Scene #3 14.652 sec
Scene #4 16.990 sec
Scene #5 14.920 sec
Scene #6 10.704 sec
Total Loading Time 107.363 sec

Quote:
if=virtio cache=none aio=native

FINAL FANTASY XIV: Heavensward Benchmark
Tested on: 5/15/2015 6:28:49 PM
Score: 9233
Average Frame Rate: 76.114
Performance: Extremely High
-Easily capable of running the game on the highest settings.
Loading Times by Scene
Scene #1 29.868 sec
Scene #2 40.226 sec
Scene #3 14.444 sec
Scene #4 16.408 sec
Scene #5 15.078 sec
Scene #6 7.084 sec
Total Loading Time 123.109 sec


Found the FFXIV benchmark quite interesting and useful. It seems aio=native (which needs cache.direct=on) bottlenecks the unsafe cache. To benefit from unsafe cache I have to use aio=threads, which is also true with virtio-scsi multiqueue, and with Qemu-2.0.0 without virtio-scsi multiqueue. Total loading time decreases to about 33.0 sec if I run the FFXIV benchmark a second time with unsafe cache and aio=threads. Unsafe cache seems quite useful for reducing HDD wear and tear and for improving performance significantly when the same data is fetched repeatedly (very useful for games like "Evil Within", or programs that need repeated data access). X-data-plane seems to give slightly better loading performance compared to if=virtio.
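
The cache/aio combinations tested above can be sanity-checked before a run. In QEMU, cache=none and cache=directsync are the modes that open the image with O_DIRECT (cache.direct=on), which aio=native relies on; writeback, writethrough and unsafe go through the host page cache. A small sketch, with a hypothetical helper name, that flags combinations where aio=native would not have O_DIRECT. It deliberately ignores an explicit cache.direct=on override like the one used in one of the runs above:

```shell
drive_opts() {
    # $1: cache mode, $2: aio mode. Prints the option fragment, or an error on stderr.
    local cache="$1" aio="$2" direct
    case "$cache" in
        none|directsync) direct=on ;;
        writeback|writethrough|unsafe) direct=off ;;
        *) echo "unknown cache mode: $cache" >&2; return 2 ;;
    esac
    if [ "$aio" = native ] && [ "$direct" = off ]; then
        echo "aio=native needs cache.direct=on; use cache=none/directsync or aio=threads" >&2
        return 1
    fi
    echo "cache=$cache,aio=$aio"
}

# Example: -drive file=...,format=raw,if=none,id=hd0,$(drive_opts unsafe threads)
```

As far as I know, newer QEMU releases reject aio=native without cache.direct=on outright, so a check like this mainly matters for older versions such as the ones used in this thread.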

Star Swarm benchmarks seem right, compared to others' results I've found.

Not sure how to benchmark MGSGZ, but I still haven't resolved the performance problem when "Shadow quality" is maxed. I still think it's a vfio issue, since the GPU seems underworked and its fan doesn't spin up, whereas the fan spins up immediately with other titles. I had actually upgraded to Qemu-2.3.0 before starting this thread, thinking it would resolve the issue, but it doesn't, so I might as well revert to 2.0.0 to avoid the other problems.

Here's a quote with a patch that's supposed to revert the -vga none behavior to allow mouse-keyboard grab in Qemu-2.1.0 (and probably newer). Someone mentioned the behavior was changed on purpose since it seemed strange, which I find odd with a primary vfio GPU.

"Benedikt Morbach": http://lists.gnu.org/archive/html/qemu-devel/2014-08/msg01976.html
Quote:
after the recent ui rework it didn't show a gfx window if no emulated
graphics card was attached. This prevented input grab from working for
the vfio gpu passthrough use case.

Hack around this by always creating at least one gfx tab.
---

I'm not quite sure how much of a hack this is, but it at least matches
the old behaviour of creating at least one gfx console and works for me.
So if anyone wants to give this a shot until someone comes up with a
proper fix, here it is.


ui/gtk.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/ui/gtk.c b/ui/gtk.c
index 2345d7e..4251fd3 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -1639,9 +1639,11 @@ static GSList *gd_vc_gfx_init(GtkDisplayState *s, VirtualConsole *vc,
     Error *local_err = NULL;
     Object *obj;

-    obj = object_property_get_link(OBJECT(con), "device", &local_err);
-    if (obj) {
-        vc->label = g_strdup_printf("%s", object_get_typename(obj));
+    if (con) {
+        obj = object_property_get_link(OBJECT(con), "device", &local_err);
+        if (obj) {
+            vc->label = g_strdup_printf("%s", object_get_typename(obj));
+        }
     } else {
         vc->label = g_strdup_printf("VGA");
     }
@@ -1742,7 +1744,7 @@ static GtkWidget *gd_create_menu_view(GtkDisplayState *s, GtkAccelGroup *accel_g
     /* gfx */
     for (vc = 0;; vc++) {
         con = qemu_console_lookup_by_index(vc);
-        if (!con || !qemu_console_is_graphic(con)) {
+        if (vc > 0 && (!con || !qemu_console_is_graphic(con))) {
             break;
         }
         group = gd_vc_gfx_init(s, &s->vc[vc], con,
--
2.0.4


"Gerd Hoffmann": http://lists.gnu.org/archive/html/qemu-devel/2014-08/msg04227.html
Quote:
On Do, 2014-08-07 at 00:22 +0200, Benedikt Morbach wrote:
> I think one of those gtk patches broke mouse/keyboard grab for my
> Windows 8 vfio/vga-passthrough setup in 2.1.0 and I was instructed on
> IRC to report that here.
>
> With 2.0.0 I got a black qemu window with "This VM has no graphic
> display device", which I could click on to get a mouse grab.

Yes, this is (intentionally) gone in 2.1.
No vga -> no graphic display.

> If I press Ctrl-Alt-g or use the corresponding menu entry, the gtk
> window grabs the input devices and the titlebar changes to "press
> Ctrl-Alt-g to release grab", but none of the input reaches the vm.

It shouldn't allow the grab in the first place.

But, yes, any input (grab being active or not) is only routed to the
guest in case a graphic display tab is the active one. You don't want
the guest see the stuff you are typing into the qemu monitor.

> If I drop the "-vga none" I can get a mouse-grab, but the passed-through
> gpu won't work, so this is no option for me.

Worth trying: '-vga none -device secondary-vga'.


How does your setup look like? Two gfx cards, one for the host, one for
the guest? Then have qemu running on the host display, let qemu grab
the input and feed mouse (kbd too?) input to the guest that way?

It's a bit strange, but I think we don't have a better way to do it
(without additional physical devices). In case you have a spare usb
mouse you can simply plug it in and use usb-host to assign it to the
guest, which is probably more comfortable than grabbing/ungrabbing when
switching between host and guest.


cheers,
Gerd


edit

Seems the patch isn't needed (I couldn't apply it to the latest git anyway). Using '-vga none -device secondary-vga' works with mouse-kbd grab in qemu-2.3.0. I still get an unknown VGA controller in Windows, but I just disable it and everything otherwise works the same, including BIOS POST on the vfio GPU (the vfio GPU didn't POST without the -device secondary-vga option). I'll try to post any new discoveries I make towards fixing my issues.
vexatious
Tux's lil' helper


Joined: 24 Aug 2010
Posts: 77

Posted: Tue Jun 09, 2015 9:34 pm

More careful analysis shows very bad DPC latency.

DPC Latency Checker for Windows 7 shows I'm getting around 500us at idle, and random spikes up to 2000000+us with HDD access! High DPC latency appears directly related to the slowdowns in games like MGSGZ (FPS drops when DPC latency spikes). Even though HDD access seems quite fine, the resulting DPC latency seems to cause a lot of overhead for games and CPU-critical applications.

Natively I'm getting about 50us, even during gameplay and HDD access! That is a huge difference, and I find qemu with virtio-scsi-pci and vfio with a raw IMG much worse than I thought, which is unfortunate.

Looking at the FF Heavensward benchmarks posted previously, x-data-plane shows worse FPS despite faster loading times and supposedly bypassing qemu, according to comments in openSUSE articles.

Going to try physical hdd...

edit

A physical HDD gives over double the transfer speed in the FF Heavensward benchmark, without cache. Unsafe cache produced about the same speed, but I didn't test a second run with unsafe cache. I lost the benchmark results, so take this with a grain of salt. Unfortunately, DPC latency was just as bad.

Otherwise, I've found that setting the CPU governor to "performance" lowers DPC latency considerably, closer to native. Still not as good, but a lot better.

With the performance governor on the Linux host, Windows guest DPC latency at idle drops from around 500us to 35-120us. That seems good, but I still get random spikes into the red zone.

DPC latency is better with the performance governor, but it is still an issue. If I minimize a window and hover the mouse over its button in the taskbar, the pop-up causes large spikes into the red zone (10000+us). Browsing the web in Internet Explorer causes huge DPC latency spikes too. I'm guessing it is related to Linux HDD I/O handling or the CPU governor, or a combination of both. Latency is large with HDD activity, whether on a physical HDD or a raw img in qemu.
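
Switching governors around a guest session can be wrapped in a small function. This is only a sketch: the function name and the SYSFS_CPU override (there so it can be tried against a fake directory) are assumptions, and on a real host it needs root:

```shell
# Defaults to the real sysfs location; override only for dry runs.
SYSFS_CPU="${SYSFS_CPU:-/sys/devices/system/cpu}"

set_governor() {
    # $1: governor name, e.g. performance or ondemand.
    local gov="$1" f
    for f in "$SYSFS_CPU"/cpu*/cpufreq/scaling_governor; do
        # The cpu* glob also matches directories without this file; skip them.
        [ -e "$f" ] || continue
        echo "$gov" > "$f" || return 1
    done
}

# Example: set_governor performance; qemu-system-x86_64 ...; set_governor ondemand
```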

edit

Tried ionice but no luck. Tried all four classes (none, idle, best-effort, realtime) with highest priority and saw no change. Whatever I/O activity is responsible for HDD access seems to cause huge slowdowns in Qemu. Compiling on the host in the foreground with make -j7 also causes major slowdowns in qemu, almost to the point of freezing. Other programs on the host like vlc, ffmpeg, audio/video encoding, or games don't slow down when compiling with make -j7, and even with the CPUs 100% taxed they still seem to run smoothly. Changing Qemu's priority to -20 doesn't appear to change the problem. HDD I/O access definitely slows down Qemu no matter which niceness or ionice options are used.

Qemu Wiki:
Quote:
Since QEMU issues read() requests in userspace, Linux normally uses the page cache. The Linux page cache is not coherent across multiple nodes so the only way to safely access storage coherently is to bypass the Linux page cache via cache=none or the QEMU iscsi block driver (using iscsi:// URIs).


It appears the only way to avoid userspace requests is cache=none or the iscsi block driver. If I use cache=none I still get the DPC latency spikes and slowdown. I get the best performance with unsafe cache. Kernel-space threads provide the best performance according to most articles, but qemu doesn't appear to work with this. My next best option is probably vfio with an old RAID card, crossing my fingers that qemu's SeaBIOS handles it, which I absolutely don't want. I don't understand why HDD access causes such huge slowdowns, since I can't seem to take advantage of qcow or img... I'd certainly prefer slower data transfers over bad latency and CPU idle time or slowdown.

Going to try iscsi to hopefully fix this issue.
vexatious
Tux's lil' helper


Joined: 24 Aug 2010
Posts: 77

Posted: Fri Jul 17, 2015 6:06 am

After some time I've made a few discoveries.

Not sure how, but I no longer have the performance issue with MGSGZ. I re-installed "Windows 7 x64 HP" and used the "PC" (I440FX) chipset, disabled unnecessary services and installed the Catalyst 15.7 drivers. Not sure which made the difference, but MGSGZ performance seems identical to native, at least compared to native with the Catalyst 15.3 beta. I'm also using qcow2 now. I can use the ondemand governor with MGSGZ without performance loss. I'm guessing re-installing Windows fixed a lot, since I had switched chipsets several times (i440fx to q35) and made registry edits, but it could be Catalyst 15.7.

I've found qemu doesn't actually use realtime priority unless you specify a realtime scheduler with chrt or similar. Either the realtime option in qemu doesn't work, it's bugged, or it's intended to behave this way. I think it's bugged. When qemu actually does use realtime scheduling, it fixes the problem with qemu slowing down a lot when the host is multitasking. With chrt, I can run make -j7 while playing MGSGZ without noticeable slowdown, and performance stays smooth while multitasking on the host :).
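
For what it's worth, the documented effect of -realtime mlock=on is to lock qemu and guest memory; it does not change the scheduling class, which would explain why chrt is still needed. Whether a running qemu really has a realtime policy can be checked from /proc. A sketch, assuming the field layout described in proc(5) (rt_priority is field 40 and policy is field 41 of /proc/PID/stat); the function name is hypothetical:

```shell
sched_info() {
    # $1: PID or TID. Prints "POLICY RT_PRIORITY".
    # Policy numbers: 0 = SCHED_OTHER, 1 = SCHED_FIFO, 2 = SCHED_RR.
    local stat rest
    stat=$(cat "/proc/$1/stat") || return 1
    # Drop "pid (comm) " so names with spaces cannot shift fields; state is now field 1.
    rest=${stat##*\) }
    set -- $rest
    # Original fields 41 (policy) and 40 (rt_priority) are now the 39th and 38th.
    echo "${39} ${38}"
}

# Example: sched_info "$(pidof qemu-system-x86_64)"
```

chrt -p PID reports the same information. Since chrt's -R flag (reset-on-fork) is documented to stop children from inheriting the realtime policy, it may be worth checking the vCPU threads individually by passing their thread IDs.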

Using this now:
Code:

#!/bin/bash
# How many cpu cores to use.
CORES="6"

# HPAGES equals total guest memory in megabytes divided by $PAGESIZE (2MB or 1024MB); here 4096 / 1024 = 4. Some folks recommend 30MB of extra free page space.
M="4096"
HPAGES="4"
echo $HPAGES > /proc/sys/vm/nr_hugepages

# Set host CPU governor
GOV="ondemand"
for CPUS in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do
echo ${GOV} > $CPUS
done
#echo -n 5 > /sys/devices/system/cpu/cpufreq/$GOV/sampling_down_factor
echo 30 > /sys/devices/system/cpu/cpufreq/$GOV/up_threshold

chrt -R -f 1 \
qemu-system-x86_64 -nodefconfig -nodefaults -no-user-config \
-no-hpet -balloon none -realtime mlock=on \
-smp cores=$CORES,threads=1 -vga none -device secondary-vga \
-serial none -parallel none \
-m "$M"M -mem-path /mnt/hugepages -mem-prealloc \
-cpu host,hv_vapic,hv_time,hv_relaxed,hv_spinlocks=0x1fff \
-machine pc,accel=kvm,kernel_irqchip=on,mem-merge=off \
-usb -device usb-host,hostbus=5,hostaddr=2 \
-device usb-ehci,id=ehci -device usb-host,hostbus=1,hostaddr=3,bus=ehci.0 \
-net user -net nic,model=virtio -rtc base=localtime,clock=host,driftfix=none \
-device virtio-scsi-pci,id=scsi,num_queues=$CORES \
-drive file=/mnt/qcow2/win_7.qcow2,format=qcow2,cache=none,if=none,id=drive0 \
-device scsi-hd,bus=scsi.0,drive=drive0 \
-device vfio-pci,host=01:00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=01:00.1,multifunction=on \
-device vfio-pci,host=00:14.2,multifunction=on

# Set host CPU governor
for CPUS in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do
echo "conservative" > $CPUS
done
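
The huge page count in the script above is set by hand. The arithmetic from its comment can be sketched as a small helper; the function name and the optional headroom argument are illustrative assumptions:

```shell
hugepages_needed() {
    # $1: guest memory in MB, $2: huge page size in MB (2 or 1024),
    # $3: optional extra headroom in MB. Rounds up to whole pages.
    local mem_mb="$1" page_mb="$2" extra_mb="${3:-0}"
    echo $(( (mem_mb + extra_mb + page_mb - 1) / page_mb ))
}

# Example (as root): hugepages_needed 4096 1024 > /proc/sys/vm/nr_hugepages
```

The page size actually in use on the host can be read with grep Hugepagesize /proc/meminfo.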


Unfortunately DPC latency hasn't improved, and HDD access can still use a large amount of CPU. Natively I get around 15-30us DPC latency during gameplay, but in qemu around 35-500us with spikes as high as 2,000,000us. Here's a picture with a short example. I used "Evil Within" since this game does HDD swapping extensively, being built on id Tech 5 (the same engine used in "Rage"):
http://s17.postimg.org/jiv78benj/dpc.png

Hopefully in future qemu will have a complete virtio chipset virtual platform with drivers, and better dpc latency.
timofonic
n00b


Joined: 13 Sep 2005
Posts: 16

Posted: Mon Dec 07, 2015 1:42 am

Any news about this?

I don't need it especially for games, but for high end CAD, electronics and probably some data crunching software.

I want to use a laptop for that, with an Intel Core i7 5700HQ, an Nvidia GTX 960M and an HM68 chipset.

Can I use the Intel GPU on the host while using the GeForce in a Windows VM, and output the graphics to the same screen?

Thanks in advance!

Kind regards.
All times are GMT