Fixing slow ninja 1.9.0 when invoked incorrectly from make

Post by mmogilvi » Thu Mar 26, 2020 4:56 am

Summary:

If builds that invoke ninja from GNU make suddenly take much longer after you upgraded to the new version of ninja (1.9.0), the basic cause is a missing '+' prefix on the makefile recipe line that invokes ninja.

-----

Details:

It took a bit of research to figure out what was happening, so I thought I would document it where it might help others:

Gentoo's >=dev-util/ninja-1.8.2-r1 appears to be a fork with a new feature: it can hook into GNU make's recursive jobserver protocol as a client of the jobserver: https://www.gnu.org/software/make/manua ... #Job-Slots

Overall this seems like a good idea. Modern large applications often tie together lots of third-party libraries, and it is unrealistic for them all to use the same build system. Also, web searches turn up various reports of ninja's default behavior running out of RAM on machines with many CPU cores but too little memory to run many very large C++ compilation units at once. This integration makes it much easier to work around such RAM limits in a recursive build environment: just invoke make with a smaller -j N option, and you don't even have to know whether ninja is used under the hood.

However, there is a big gotcha that anyone invoking ninja from a GNU makefile needs to be aware of: when using the jobserver-aware version of ninja, the recipe that invokes ninja needs to start with a plus sign ('+'), as described in the above link. Otherwise the build still succeeds but slows down badly, as described below.

I encountered this when a nightly cron job that rebuilds and tests a large proprietary application suddenly took a lot longer to finish: Normally it finishes by 8:00 or 9:00 in the morning (worst case), but after updating ninja it was taking until the evening (nearly 24 hours).

After some investigation, I determined the culprit was the step where the cron job builds Qt's webengine component (currently Qt 5.9.9, although I think other versions would have the same problem):
  • Qt's generated Makefile (qtwebengine/src/core/Makefile.gn_run) would invoke ninja without the plus sign.
  • Make would close the jobserver protocol pipes normally inherited by the child process, since it did not identify ninja as something that could use them (no plus). But it would still pass the magic jobserver argument in the MAKEFLAGS environment variable.
  • Since those descriptor numbers are closed and therefore free, ninja would re-use them for its own files (.../.ninja_log and .../.ninja_deps, according to /proc/PID/fd).
  • But ninja would ALSO try to use those file descriptors for accessing job tokens. Ninja does not appear to detect and handle the second error condition (closed file descriptors) listed at the above link.
  • Ninja would work, but it would only compile one file at a time. Since this ninja invocation runs nearly 19000 commands, using only one core slows it down by hours.
  • Also, ninja itself appears to busy-poll, using an entire CPU core while doing essentially nothing. I'm guessing it is doing a select() (or equivalent) on a plain file (instead of a pipe), which always returns immediately. I also suspect that the descriptor's file offset is always at the end of the file, so it never reads any "false" job tokens, nor corrupts the files that it is trying to interpret as a jobserver. (This speculation is based on external behavior and looking through /proc/PID, not examining the code.)
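The guess in the last bullet can be tried out in isolation. The following is a minimal Python sketch, not ninja's code: it assumes a POSIX system (Linux here), and the function names are made up for illustration. It shows that select() reports a plain file as readable right away even when the offset is at end-of-file, that the following read returns nothing, and that an empty pipe (what a real jobserver with no free tokens looks like) makes select() wait instead.

```python
import os
import select
import tempfile


def poll_regular_file(timeout=5.0):
    """select() on a plain file whose offset is at EOF, then try to read one byte.

    Returns (ready, data). On POSIX systems a regular file is always reported
    readable, so select() returns at once instead of waiting for `timeout`.
    """
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"stand-in for a log file\n")
        f.flush()  # the descriptor's offset is now at the end of the file
        fd = f.fileno()
        readable, _, _ = select.select([fd], [], [], timeout)
        # Reading at EOF yields b"": no "token" is consumed, nothing is corrupted.
        data = os.read(fd, 1) if readable else None
        return bool(readable), data


def poll_empty_pipe(timeout=0.1):
    """select() on an empty pipe: nothing to read, so it waits and times out."""
    r, w = os.pipe()
    try:
        readable, _, _ = select.select([r], [], [], timeout)
        return bool(readable)
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print("plain file:", poll_regular_file())
    print("empty pipe:", poll_empty_pipe())
```

A loop that keeps calling select() and read() on such a file never blocks and never gets a token, which would match both symptoms: one core pegged, and only one job running.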
Possible Fixes
  • Quick and dirty: Revert to ninja-1.8.2 (no jobserver) temporarily until you can set up better fixes in your environment.
  • Ideally, patch the makefile to use the plus sign. However, I'm not sure how to do that in Qt webengine, where the makefile is actually generated from a Qt .pro file. How would one add the plus sign without breaking other non-Makefile generation options?
  • For Qt, it works to set the NINJAFLAGS environment variable with an appropriate -j option. The generated makefile passes that variable as additional command line options into ninja. This is how the dev-qt/qtwebengine ebuilds avoid this problem. Maybe other project build scripts have similar environment variable hooks?
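For the "patch the makefile" option, one stopgap is to post-process an already generated makefile rather than touching the generator. The Python sketch below is only a heuristic under stated assumptions: the function name add_plus_to_ninja_recipes is hypothetical, the sample recipe lines are invented (they are not copied from Qt's Makefile.gn_run), and a recipe that calls ninja through a make variable such as $(NINJA) would not be detected.

```python
import re


def add_plus_to_ninja_recipes(makefile_text, tool="ninja"):
    """Prefix '+' to recipe lines that appear to invoke `tool` directly.

    Heuristic: a recipe line starts with a tab, optionally followed by the
    GNU make prefix characters '@', '-' and '+'. Lines that already carry
    a '+' are left alone, so the function is safe to run twice.
    """
    # Match "ninja" as a whole command word, bare or at the end of a path.
    tool_re = re.compile(r"(?:^|[\s/])" + re.escape(tool) + r"(?:\s|$)")
    patched = []
    for line in makefile_text.splitlines(keepends=True):
        if line.startswith("\t"):
            body = line[1:]
            prefix_len = len(body) - len(body.lstrip("@+-"))
            prefix, command = body[:prefix_len], body[prefix_len:]
            if "+" not in prefix and tool_re.search(command):
                line = "\t+" + body
        patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    # Invented example input, for illustration only.
    sample = "all:\n\tninja -C out\n\t@/usr/bin/ninja -v\n\techo done\n"
    print(add_plus_to_ninja_recipes(sample), end="")
```

This does not answer the question of how to make the Qt generator emit the plus sign itself; it only rewrites the output, so it would have to be re-applied whenever the makefile is regenerated.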
Possible Upstream Improvements in Various Projects?

Perhaps someone would be interested in asking upstream projects to make this easier?
  • The jobserver support in the ninja fork probably ought to verify that the indicated file descriptors are actually usable pipes early when the process starts up, and recover from "not pipes" in a more graceful or easily debugged fashion. (Print an error message and either give up completely, or fall back on "default" behavior without the jobserver.)
  • The Qt webengine generated makefile ought to invoke ninja with the plus sign.
  • And similar in any other project where a makefile is invoking ninja.
  • I'm not sure, but maybe GNU make could be a bit more graceful as well: Perhaps use something distinct in MAKEFLAGS ("--jobserver-auth-CLOSED"?) if it is closing the file descriptors because it doesn't think the child is a recursive make, to avoid misinterpretation? Perhaps consider changing it to default to leaving the file descriptors open for all children unless asked otherwise, even though this risks breaking things if some child scripts/executables somewhere are doing something strange with explicitly numbered file descriptors?
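The startup check suggested in the first bullet could look roughly like the Python sketch below. It is not code from ninja or its fork, and the function names are hypothetical. It assumes the descriptor pair is advertised in MAKEFLAGS as --jobserver-auth=R,W (or --jobserver-fds=R,W in older make releases) and that the last occurrence is the relevant one. Newer make releases can also advertise a named pipe instead of a descriptor pair; this sketch does not handle that form and reports it as "none".

```python
import os
import re
import stat

# Descriptor-pair form of the jobserver option, old and new spelling.
_AUTH = re.compile(r"--jobserver-(?:auth|fds)=(\d+),(\d+)")


def jobserver_fds(makeflags):
    """Return (read_fd, write_fd) from a MAKEFLAGS string, or None if absent."""
    matches = _AUTH.findall(makeflags or "")
    if not matches:
        return None
    read_fd, write_fd = matches[-1]  # assume the last occurrence wins
    return int(read_fd), int(write_fd)


def is_pipe(fd):
    """True only if fd is open and refers to a pipe/FIFO."""
    try:
        return stat.S_ISFIFO(os.fstat(fd).st_mode)
    except OSError:  # e.g. EBADF: make closed the descriptor
        return False


def jobserver_status(makeflags):
    """Classify the inherited jobserver as 'none', 'ok' or 'unusable'."""
    fds = jobserver_fds(makeflags)
    if fds is None:
        return "none"
    return "ok" if all(is_pipe(fd) for fd in fds) else "unusable"


if __name__ == "__main__":
    status = jobserver_status(os.environ.get("MAKEFLAGS", ""))
    if status == "unusable":
        print("MAKEFLAGS names a jobserver, but its descriptors are not pipes; "
              "is the recipe missing a '+' prefix?")
    else:
        print("jobserver:", status)
```

Run as a recipe command with and without the '+' prefix, a check like this would make the misconfiguration visible right away instead of after a day-long build. It has to run before the process opens any files of its own, because a re-used descriptor number that happens to be a pipe would pass the test.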