Gentoo Forums
[Solved] FFMPEG Screencast slow, but Arch chroot not??
veridiam
n00b

Joined: 15 Feb 2015
Posts: 7

Posted: Sun Feb 15, 2015 10:02 pm    Post subject: [Solved] FFMPEG Screencast slow, but Arch chroot not??

Hey everyone, I'm looking for some help on this issue which is boggling my mind.

I'm running the following ffmpeg command to capture my display:
Code:
ffmpeg -f x11grab -r 60 -s 1920x1080 -i $DISPLAY -vcodec libx264 -preset ultrafast -threads 0 output.mkv


I ran that command all the time when I had Arch Linux installed and never had any issues. I moved to Gentoo this week (love it so far), but now I can't get it to record any faster than about 19-20 frames per second.

After a lot of re-emerging and messing with USE flags, I decided to chroot into my Arch Linux USB key and run its version of ffmpeg and libx264. Guess what? Full 60 fps! So now I know it's not a kernel or system load problem. I also sent the output to /dev/null to check whether the drive was the bottleneck, but performance didn't change. I also temporarily replaced both the ffmpeg binary in /usr/bin and libx264.so.142 in /usr/lib64 with the binaries from the Arch install, with no effect on performance.

Here's the difference in output:
https://www.diffchecker.com/vg0u7uuz
Left side is my Gentoo install, right side is the Arch chroot. There aren't many differences besides the compile flags and the frame rate.

Any idea what could be going on? I'm truly stumped.

Edit:
I should probably post my make.conf info
Code:
CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
ACCEPT_KEYWORDS="~amd64"
CPU_FLAGS_X86="aes avx mmx sse sse2 sse3 sse4_1 sse4_2 ssse3"
USE="X crypt dbus device-mapper firmware-loader gtk gtk2 -qt4 -qt5 -kde \
     gnome-keyring multilib networkmanager pulseaudio ssl udev \
     udisks unicode bash-completion threads -libav ffmpeg python avahi \
     openssl sdl xscreensaver vdpau opencl ${CPU_FLAGS_X86}"
VIDEO_CARDS="nvidia"
MAKEOPTS="-j5"


Solution:

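TL;DR: Enable mmxext in CPU_FLAGS_X86. ffmpeg's configure makes sse depend on mmxext, so without it sse and everything that builds on it gets disabled.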


Last edited by veridiam on Mon Feb 16, 2015 3:46 am; edited 3 times in total

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Sun Feb 15, 2015 10:22 pm

ffmpeg might have run faster, for example, if you'd compiled it with the sse3 support your CPU probably has.

Correct your USE flags instead of blindly copying and pasting from /proc/cpuinfo and try again.

veridiam
n00b

Joined: 15 Feb 2015
Posts: 7

Posted: Sun Feb 15, 2015 10:35 pm

Ant P. wrote:
ffmpeg might have run faster, for example, if you'd compiled it with the sse3 support your CPU probably has.

Correct your USE flags instead of blindly copying and pasting from /proc/cpuinfo and try again.


Thanks for the advice (I wasn't aware the CPU could support features not listed in /proc/cpuinfo). I updated the USE flags above. I just finished recompiling @world with SSE3 support, but it's still recording at 19 fps.

The odd thing is that if the arch binary had been compiled with any extra cpu features, it would have performed better when I copied the binary over to the Gentoo install.
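
A side note on flag names: /proc/cpuinfo and CPU_FLAGS_X86 do not use identical names. The kernel lists SSE3 as pni, and a CPU that reports sse also implements the MMX extensions that Gentoo calls mmxext, even when mmxext itself is not listed. Below is a minimal sketch of that translation, not the official mapping. The helper names add_once and cpuinfo_to_use are made up for illustration, and only a handful of flags are covered. Current Gentoo trees package a dedicated tool for this (app-portage/cpuid2cpuflags), which is the safer choice.

```shell
# Hypothetical helpers, for illustration only.

# add_once NAME: append NAME to $out unless it is already there.
add_once() {
    case " $out " in
        *" $1 "*) ;;
        *) out="$out $1" ;;
    esac
}

# cpuinfo_to_use "FLAGS": translate a few /proc/cpuinfo flag names
# into CPU_FLAGS_X86 names. Unknown names are silently skipped.
cpuinfo_to_use() {
    out=""
    for f in $1; do
        case "$f" in
            pni) add_once sse3 ;;                   # kernel name for SSE3
            sse) add_once sse; add_once mmxext ;;   # SSE includes the MMX extensions
            mmx|mmxext|sse2|ssse3|sse4_1|sse4_2|avx|avx2|aes) add_once "$f" ;;
        esac
    done
    echo $out
}

# Possible usage on a Linux box:
#   cpuinfo_to_use "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
```

Fed the flags fpu mmx sse sse2 pni, the sketch yields mmx sse mmxext sse2 sse3, which already contains the flag that turned out to matter in this thread.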

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Sun Feb 15, 2015 10:42 pm

It also looks like you enabled USE=xcb. That affects x11grab's internals and it might run better without.

Edit: ...and PIC. That's a security feature, which will kill performance.

veridiam
n00b

Joined: 15 Feb 2015
Posts: 7

Posted: Sun Feb 15, 2015 11:18 pm

Ant P. wrote:
It also looks like you enabled USE=xcb. That affects x11grab's internals and it might run better without.

Edit: ...and PIC. That's a security feature, which will kill performance.


Thanks for the help Ant P. I really appreciate it. I'll disable both of those.

I did notice, though, after rooting around in the Arch libraries, that the Arch version of libavcodec.so.56.13.100 is compiled with vaapi (libva). I'm almost certain that not using vaapi is the big performance killer.

Will report back after a recompile

Edit:

Nope, but I'm getting about 23-24 fps now instead of 19.

Here's emerge --info ffmpeg:
Code:
media-video/ffmpeg-2.5.4::gentoo was built with the following:
USE="X aac alsa bluray bzip2 cpudetection encode fontconfig gsm hardcoded-tables iconv libass mp3 network opengl openssl pulseaudio rtmp samba schroedinger sdl speex theora threads truetype v4l vaapi vdpau vorbis vpx x264 x265 xvid zlib -aacplus (-altivec) -amr -amrenc (-armv5te) (-armv6) (-armv6t2) (-armvfp) -bindist -bs2b -cdio -celt -debug -doc -examples -faac -fdk -flite -frei0r -fribidi -gme -gnutls -iec61883 -ieee1394 -jack -jpeg2k -ladspa -libcaca -libsoxr -libv4l -lzma (-mips32r2) (-mipsdspr1) (-mipsdspr2) (-mipsfpu) -modplug (-neon) -openal -opus -oss -pic -quvi -ssh -static-libs -test -twolame -wavpack -webp -xcb -zvbi" ABI_X86="64 -32 -x32" CPU_FLAGS_X86="avx mmx sse sse2 sse3 sse4_1 sse4_2 ssse3 -3dnow -3dnowext -avx2 -fma3 -fma4 -mmxext -xop" FFTOOLS="aviocat cws2fws ffescape ffeval ffhash fourcc2pixfmt graph2dot ismindex pktdumper qt-faststart trasher"

veridiam
n00b

Joined: 15 Feb 2015
Posts: 7

Posted: Mon Feb 16, 2015 12:58 am

Just substituted libav for ffmpeg, and still I can't seem to record any faster.

Here's what my CPU usage looks like:
https://i.imgur.com/G1iuTVK.png

Most of the cores are hovering around 20%

khayyam
Watchman

Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Mon Feb 16, 2015 1:31 am

veridiam ...

as you seem to be encoding H.264 you might also look at the useflags on media-libs/x264 (ie, threads, opencl, -pic, cpu_flags_x86_sse, etc). Not sure about opencl ... as I've not used it.

Also, an ldd of Arch's ffmpeg will give some clue as to what it's linked against, or not.

best ... khay

veridiam
n00b

Joined: 15 Feb 2015
Posts: 7

Posted: Mon Feb 16, 2015 1:41 am

khayyam wrote:
veridiam ...

as you seem to be encoding H.264 you might also look at the useflags on media-libs/x264 (ie, threads, opencl, -pic, cpu_flags_x86_sse, etc). Not sure about opencl ... as I've not used it.

Also, an ldd of Arch's ffmpeg will give some clue as to what it's linked against, or not.

best ... khay


Thanks for the response khay, much appreciated.

There aren't many USE flags for x264, but I have experimented with them and found no improvement.

Here's the info:
Code:
media-libs/x264-0.0.20140308::gentoo was built with the following:
USE="interlaced opencl threads -10bit -pic -static-libs" ABI_X86="64 -32 -x32" CPU_FLAGS_X86="sse"
CFLAGS="-march=native -O2 -pipe -msse -mfpmath=sse"
CXXFLAGS="-march=native -O2 -pipe -msse -mfpmath=sse"


And here's the ldd of ffmpeg on Arch:

Edit: Smarter way of doing this
https://www.diffchecker.com/fyf55zsx
Left is gentoo, right is arch chroot

I also just noticed this in the portage output:

* QA Notice: Package triggers severe warnings which indicate that it
* may exhibit random runtime failures.
* /var/tmp/portage/media-video/ffmpeg-2.5.4/work/ffmpeg-2.5.4/libavcodec/atrac3plus.c:1784:52: warning: array subscript is below array bounds [-Warray-bounds]
* /var/tmp/portage/media-video/ffmpeg-2.5.4/work/ffmpeg-2.5.4/libavcodec/libx264.c:658:32: warning: the address of ‘val’ will always evaluate as ‘true’ [-Waddress]

* Please do not file a Gentoo bug and instead report the above QA
* issues directly to the upstream developers of this software.
* Homepage: http://ffmpeg.org/

veridiam
n00b

Joined: 15 Feb 2015
Posts: 7

Posted: Mon Feb 16, 2015 3:16 am

So I tried this:

(Arch Linux mounted at /mnt)
Code:

LD_LIBRARY_PATH=/mnt/usr/lib64 ffmpeg -f x11grab -r 60 -s 1920x1080 -i $DISPLAY+1920,0 -vcodec libx264 -preset ultrafast -threads 0 output.mkv

And this got me the full 60 fps. It's definitely some library file.

Anyone know a way of pinpointing the bottleneck lib? I've tried copying over libavcodec.so, libavformat.so, libavdevice.so, libavutil.so, libavfilter.so, libavresample.so, but still no effect.
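
One way to narrow this down, sketched under assumptions: instead of pointing LD_LIBRARY_PATH at the whole Arch tree, stage a scratch directory that holds only one library (or one family of libraries) from it and rerun the capture against that directory. stage_lib is a hypothetical helper, the glob patterns in the usage comment are placeholders, and SRC_DIR is assumed to be an absolute path. A library taken on its own may fail to load if it needs matching versions of its siblings, so a failed run is not conclusive.

```shell
# stage_lib SRC_DIR PATTERN
# Create a scratch directory containing symlinks to only those files in
# SRC_DIR whose names match the glob PATTERN, then print the directory path.
# Hypothetical helper for illustration; SRC_DIR should be an absolute path.
stage_lib() {
    src_dir="$1"
    pattern="$2"
    stage=$(mktemp -d) || return 1
    for lib in "$src_dir"/$pattern; do
        [ -e "$lib" ] || continue      # pattern matched nothing
        ln -s "$lib" "$stage/"
    done
    echo "$stage"
}

# Possible usage (placeholder patterns; writes no file, runs for 10 seconds):
#   for p in 'libx264.so*' 'libswscale.so*'; do
#       dir=$(stage_lib /mnt/usr/lib64 "$p")
#       echo "== testing $p"
#       LD_LIBRARY_PATH="$dir" ffmpeg -f x11grab -r 60 -s 1920x1080 -i "$DISPLAY" \
#           -t 10 -vcodec libx264 -preset ultrafast -threads 0 -f null -
#       rm -r "$dir"
#   done
```

The pattern whose run reaches 60 fps points at the library that differs. Comparing the configure lines of the two builds is a useful cross-check.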

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Mon Feb 16, 2015 3:17 am

Quote:
Code:
media-video/ffmpeg-2.5.4::gentoo was built with the following:
USE="X aac alsa bluray bzip2 cpudetection encode fontconfig gsm hardcoded-tables iconv libass mp3 network opengl openssl pulseaudio rtmp samba schroedinger sdl speex theora threads truetype v4l vaapi vdpau vorbis vpx x264 x265 xvid zlib -aacplus (-altivec) -amr -amrenc (-armv5te) (-armv6) (-armv6t2) (-armvfp) -bindist -bs2b -cdio -celt -debug -doc -examples -faac -fdk -flite -frei0r -fribidi -gme -gnutls -iec61883 -ieee1394 -jack -jpeg2k -ladspa -libcaca -libsoxr -libv4l -lzma (-mips32r2) (-mipsdspr1) (-mipsdspr2) (-mipsfpu) -modplug (-neon) -openal -opus -oss -pic -quvi -ssh -static-libs -test -twolame -wavpack -webp -xcb -zvbi" ABI_X86="64 -32 -x32" CPU_FLAGS_X86="avx mmx sse sse2 sse3 sse4_1 sse4_2 ssse3 -3dnow -3dnowext -avx2 -fma3 -fma4 -mmxext -xop" FFTOOLS="aviocat cws2fws ffescape ffeval ffhash fourcc2pixfmt graph2dot ismindex pktdumper qt-faststart trasher"

Drop USE=cpudetection; it enables runtime CPU detection, which isn't needed on Gentoo when you build for your own CPU.

In ffmpeg-2.5.3/configure are these lines:
Code:
mmxext_deps="mmx"
sse_deps="mmxext"
sse2_deps="sse"
sse3_deps="sse2"
ssse3_deps="sse3"
sse4_deps="ssse3"
sse42_deps="sse4"


You don't have mmxext enabled, so you don't have anything after it enabled.
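
To make the effect of that chain concrete, here is a small sketch that walks it for a requested flag set and prints only the flags whose entire dependency chain is also requested. It mirrors just the seven lines quoted above, using ffmpeg's configure names (sse4 and sse42, not the Gentoo spellings). avx and later sets are left out because their dependencies are not part of the excerpt. The function names are invented for illustration and this is not how configure itself is implemented.

```shell
# dep_of FLAG: print the direct dependency of FLAG, taken from the
# configure excerpt quoted above (empty output if FLAG has none).
dep_of() {
    case "$1" in
        mmxext) echo mmx ;;
        sse)    echo mmxext ;;
        sse2)   echo sse ;;
        sse3)   echo sse2 ;;
        ssse3)  echo sse3 ;;
        sse4)   echo ssse3 ;;
        sse42)  echo sse4 ;;
        *)      echo "" ;;
    esac
}

# is_requested FLAG LIST: succeed if FLAG is one of the words in LIST.
is_requested() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# effective_flags LIST: print the flags from LIST that survive, i.e. those
# for which every link of the dependency chain is also in LIST.
effective_flags() {
    requested="$1"
    out=""
    for flag in $requested; do
        cur="$flag"
        ok=1
        while [ -n "$cur" ]; do
            if ! is_requested "$cur" "$requested"; then
                ok=0
                break
            fi
            cur=$(dep_of "$cur")
        done
        if [ "$ok" -eq 1 ]; then
            out="$out $flag"
        fi
    done
    echo $out
}

effective_flags "mmx sse sse2 sse3 ssse3 sse4 sse42"          # prints: mmx
effective_flags "mmx mmxext sse sse2 sse3 ssse3 sse4 sse42"   # prints the full list
```

With mmxext missing, only mmx survives, which matches the slow build described in this thread.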

Quote:
Code:
media-libs/x264-0.0.20140308::gentoo was built with the following:
USE="interlaced opencl threads -10bit -pic -static-libs" ABI_X86="64 -32 -x32" CPU_FLAGS_X86="sse"
CFLAGS="-march=native -O2 -pipe -msse -mfpmath=sse"
CXXFLAGS="-march=native -O2 -pipe -msse -mfpmath=sse"


Code:
            -interlaced       enable interlaced encoding support, this can decrease encoding speed by up to 2%

Turn that off unless you're encoding h264 for analog SDTVs.

Remove the -msse and -mfpmath=sse from CFLAGS, those are defaults on amd64.

You apparently have USE=graphite on GCC (since that ldd output shows it), so make use of that:
Code:
CFLAGS="-march=native -O2 -pipe -floop-interchange -floop-strip-mine -floop-block -ftree-vectorize"

veridiam
n00b

Joined: 15 Feb 2015
Posts: 7

Posted: Mon Feb 16, 2015 3:45 am

Ant P. wrote:

In ffmpeg-2.5.3/configure are these lines:
Code:
mmxext_deps="mmx"
sse_deps="mmxext"
sse2_deps="sse"
sse3_deps="sse2"
ssse3_deps="sse3"
sse4_deps="ssse3"
sse42_deps="sse4"


You don't have mmxext enabled, so you don't have anything after it enabled.


You genius, this was it. I didn't realize dependencies were strung like that.

I'll look into everything else for additional optimization, but that one change shot performance straight up.
I'll mark the thread solved; hopefully some future googler will find useful info here.

Thanks!

haarp
Guru

Joined: 31 Oct 2007
Posts: 535

Posted: Tue Feb 17, 2015 4:06 pm

This is weird. If a user explicitly enables a certain instruction set via USE, why does an ebuild silently drop that decision just because another instruction set is not enabled? At the very least, a useful message should be displayed saying that this has happened.

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Tue Feb 17, 2015 4:37 pm

The ebuild doesn't drop it, ffmpeg does, and it *does* display useful messages (which, of course, everyone is conditioned to ignore).
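
For anyone who missed those messages at build time, the configure line that ffmpeg -version prints can be checked afterwards. The sketch below lists the SIMD sets a build switched off explicitly. It rests on an assumption: the build passed --disable-... options for them, so sets that configure dropped on its own through the dependency chain will not show up here. disabled_simd is a hypothetical helper name.

```shell
# disabled_simd "CONFIGURE LINE": print, one per line, the x86 SIMD sets
# that appear as explicit --disable-... options in the given configure line.
# Hypothetical helper; it only inspects the text it is given.
disabled_simd() {
    for opt in $1; do
        case "$opt" in
            --disable-mmx|--disable-mmxext|--disable-sse*|--disable-ssse3|\
            --disable-avx*|--disable-fma*|--disable-xop|--disable-amd3dnow*)
                echo "${opt#--disable-}" ;;
        esac
    done
}

# Possible usage:
#   disabled_simd "$(ffmpeg -version 2>/dev/null | grep '^configuration:')"
```

If mmxext shows up in that list while sse and friends were requested, the build is in the state described in this thread.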