netsplit n00b

Joined: 10 Jun 2024 Posts: 20
|
Posted: Fri Jul 11, 2025 5:40 pm Post subject: Speeding up compilation on Raspberry Pi 5 bare metal |
|
|
Obviously it's never gonna be blazing but so far I've noticed:
By default the kernel's CPU frequency governor is in power saving mode. It can be set to ondemand with echo "ondemand" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor. You can also change the default in your kernel config.
MAKEOPTS -lN (the load-average limit) seems to be overly aggressive compared to amd64. Setting it forced many gcc builds to a single thread for me. Unsetting it seems to have fixed that. However, now it seems I might need to investigate water cooling because it throttles from the heat lol |
|
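On kernels where the cores do not all share one cpufreq policy, the same write has to be repeated for each core. A minimal sketch, assuming the usual cpufreq sysfs layout; `set_governor` and its optional second argument are made-up names for illustration, and writing to the real path needs root.

```shell
#!/bin/sh
# set_governor GOVERNOR [CPU_ROOT]
# Writes GOVERNOR into every cpuN/cpufreq/scaling_governor below CPU_ROOT
# and prints how many files were updated.
# CPU_ROOT defaults to the real sysfs path; the override (an assumption of
# this sketch) only exists so it can be tried on a scratch directory.
set_governor() {
    governor=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    count=0
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue              # glob matched nothing
        printf '%s\n' "$governor" > "$f" || return 1
        count=$((count + 1))
    done
    echo "$count"
}

# Usage (as root): set_governor ondemand
```

The setting does not survive a reboot, so changing the default governor in the kernel config is still the permanent fix.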
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55432 Location: 56N 3W
|
Posted: Fri Jul 11, 2025 6:29 pm Post subject: |
|
|
netsplit,
For cooling you need the official Pi5 fan assisted cooler. That keeps the Pi out of thermal throttling, unless you put the Pi and cooler into a case that restricts the airflow.
With MAKEOPTS="-j4" and --jobs=1 that will let you build Chromium in only 32h.
You could also try cross distcc and ccache. Both have their drawbacks as they don't work for everything.
-- edit --
Try
Code: |
# vcgencmd get_throttled && vcgencmd measure_temp && vcgencmd measure_clock arm
throttled=0x0
temp=65.9'C
frequency(0)=2400030464
|
to see what's going on and
Code: |
# genlop -c
Currently merging 1 out of 1
* www-client/chromium-139.0.7258.31
current merge time: 2 hours, 20 minutes and 28 seconds.
ETA: 1 day, 4 hours, 31 minutes and 22 seconds.
|
That estimate is about 2 hours short.
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
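The throttled value is a bitmask, so anything other than 0x0 needs decoding. A rough sketch of a decoder, assuming the bit meanings given in the Raspberry Pi documentation for vcgencmd get_throttled; `decode_throttled` is a made-up helper name, not part of vcgencmd.

```shell
#!/bin/sh
# decode_throttled VALUE
# VALUE is the vcgencmd output, e.g. "throttled=0x50005" (or just "0x50005").
# Prints one line per flag that is set, or "no flags set".
decode_throttled() {
    hex=${1#throttled=}          # strip the "throttled=" prefix if present
    flags=$(($hex))              # hex constant to integer
    if [ "$flags" -eq 0 ]; then
        echo "no flags set"
        return 0
    fi
    # Each line below is: bit number, then its meaning.
    while read -r bit text; do
        if [ $(( (flags >> bit) & 1 )) -eq 1 ]; then
            echo "$text"
        fi
    done <<'EOF'
0 under-voltage detected
1 ARM frequency capped
2 currently throttled
3 soft temperature limit active
16 under-voltage has occurred
17 ARM frequency capping has occurred
18 throttling has occurred
19 soft temperature limit has occurred
EOF
}

# Usage on the Pi: decode_throttled "$(vcgencmd get_throttled)"
```

The low bits describe the state right now, the bits from 16 upwards record that the condition has happened since boot, which is useful after a long unattended build.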
|
netsplit n00b

Joined: 10 Jun 2024 Posts: 20
|
Posted: Sat Jul 12, 2025 12:02 am Post subject: |
|
|
NeddySeagoon wrote: | netsplit,
For cooling you need the official Pi5 fan assisted cooler. That keeps the Pi out of thermal throttling, unless you put the Pi and cooler into a case that restricts the airflow.
With MAKEOPTS="-j4" and --jobs=1 that will let you build Chromium in only 32h.
|
When updates finish I'll try building Chromium. A 32 hour build sounds like an experience.
I tried getting distcc working but ran into problems. I also tried setting up an arm64 VM and discovered QEMU has some issues with threading. The Pi seems to build fast enough. One thing that seems to make a difference is the storage medium. It builds a lot faster off an NVMe drive than a USB stick.
Quote: |
Try
Code: |
# vcgencmd get_throttled && vcgencmd measure_temp && vcgencmd measure_clock arm
throttled=0x0
temp=65.9'C
frequency(0)=2400030464
|
to see what's going on and
|
Code: |
raspberrypi ~ # vcgencmd get_throttled && vcgencmd measure_temp && vcgencmd measure_clock arm
throttled=0x0
temp=67.5'C
frequency(0)=2400037120
|
Seems like it's doing okay. I don't have the raspberry pi heatsink fan but I have a 3rd party one. At some point it's going to be installed in my car and I live in a sunny place so it'll probably need better cooling, at least when the car starts. |
|
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55432 Location: 56N 3W
|
Posted: Sat Jul 12, 2025 9:59 am Post subject: |
|
|
netsplit,
Chromium on Pi5 needs both -mcpu and -march unset to avoid build failures.
The bug has been reported to Gentoo but not yet upstream.
It needs more work first.
NVMe is a lot faster than USB. Especially if your USB is bulk mode only. |
|
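One way to unset the CPU tuning for a single package is a per-package environment file. This is only a sketch: it assumes the global CFLAGS are otherwise just `-O2 -pipe`, and the file name `no-cpu-tuning.conf` is made up for illustration.

```shell
# /etc/portage/env/no-cpu-tuning.conf  (hypothetical file name)
# Same flags as make.conf, minus any -mcpu=... and -march=... options.
CFLAGS="-O2 -pipe"
CXXFLAGS="${CFLAGS}"

# Then list the file against the package in /etc/portage/package.env:
# www-client/chromium no-cpu-tuning.conf
```

Every other package keeps the tuned flags from make.conf, so only Chromium is built generically.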
netsplit n00b

Joined: 10 Jun 2024 Posts: 20
|
Posted: Sat Jul 12, 2025 3:04 pm Post subject: |
|
|
NeddySeagoon wrote: | netsplit,
Chromium on Pi5 needs both -mcpu and -march unset to avoid build failures.
The bug has been reported to Gentoo but not yet upstream.
It needs more work first.
NVMe is a lot faster than USB. Especially if your USB is bulk mode only. |
Thanks for the tips!
I think the storage medium was the biggest surprise, but it makes sense. Even on an NVMe drive (that I've verified is running at PCIe 3.0) I'm seeing CPU usage ranging from 25% to 75% per build thread. Assuming those percentages are accurate, the CPU isn't the current bottleneck. When I do the Chromium build I'll try setting the build environment to use a RAM drive. |
|
netsplit n00b

Joined: 10 Jun 2024 Posts: 20
|
Posted: Wed Jul 16, 2025 3:40 am Post subject: |
|
|
Just a follow-up: tmpfs was actually worse, with a 38.25 hour build. Oddly, the build threads had higher CPU use. It seems the RAM file system had more CPU overhead than I would have guessed. I also suspect there was some swap usage, which would have negated the point of tmpfs.
Here's the setup:
pi 5, 16gb ram
/etc/fstab
Code: |
#size=10G would fail for not enough disk space
tmpfs /var/tmp/tmpfs tmpfs size=102401M,uid=portage,gid=portage,mode=775 0 0
|
/etc/portage/env/tmpfs.conf
Code: |
PORTAGE_TMPDIR="/var/tmp/tmpfs"
|
/etc/portage/package.env
Code: |
www-client/chromium tmpfs.conf
|
Code: |
[ebuild R ] www-client/chromium-138.0.7204.92:0/stable::gentoo USE="cups hangouts official proprietary-codecs pulseaudio qt6 rar screencast system-harfbuzz system-png system-zstd wayland -X -bindist -bundled-toolchain -custom-cflags -debug -ffmpeg-chromium -gtk4 (-headless) -kerberos -pax-kernel (-pgo) (-selinux) (-system-icu) -test -vaapi (-widevine)" L10N="-af -am -ar -bg -bn -ca -cs -da -de -el -en-GB -es -es-419 -et -fa -fi -fil -fr -gu -he -hi -hr -hu -id -it -ja -kn -ko -lt -lv -ml -mr -ms -nb -nl -pl -pt-BR -pt-PT -ro -ru -sk -sl -sr -sv -sw -ta -te -th -tr -uk -ur -vi -zh-CN -zh-TW" LLVM_SLOT="20 -19" 0 KiB
|
Results:
Code: |
raspberrypi ~ # genlop -t chromium
* www-client/chromium
Tue Jul 15 07:07:49 2025 >>> www-client/chromium-138.0.7204.92
merge time: 1 day, 14 hours, 14 minutes and 10 seconds.
|
So tmpfs added an extra 6 hours lol. |
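The suspicion about swap can be checked rather than guessed at. A small sketch that reads the SwapTotal and SwapFree fields of /proc/meminfo; `swap_used_kib` is a made-up helper name, and the optional file argument is only there so it can be tried on sample data.

```shell
#!/bin/sh
# swap_used_kib [MEMINFO_FILE]
# Prints the amount of swap currently in use, in KiB.
# MEMINFO_FILE defaults to /proc/meminfo.
swap_used_kib() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END           { print total - free }' "${1:-/proc/meminfo}"
}

# Log swap use once a minute while the build runs (Ctrl-C to stop):
# while sleep 60; do echo "$(date +%T) $(swap_used_kib) KiB"; done
```

If the number climbs steadily during the merge, the tmpfs contents or the compilers are being pushed out to disk.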
|
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55432 Location: 56N 3W
|
Posted: Wed Jul 16, 2025 9:31 am Post subject: |
|
|
netsplit,
That result does not surprise me. If you have the RAM to build in RAM, the kernel cache will do it anyway.
Building in tmpfs saves writes that will never be read. That may be a good thing for SSDs.
Reads/writes are all DMA, so time savings from not setting up DMA will be too small to measure.
If you have swap, the content of tmpfs can be moved to swap under pressure of RAM.
When you don't have swap, there is no home on disk for dynamically allocated RAM, so the kernel has to 'swap' in other ways.
It can drop clean pages, then reload them later. This includes code that it will execute real soon now.
It can write 'dirty' pages out, so that they are clean, then drop/reload them.
All my Pis have either 8G of swap, for RAM <= 8G or 16G on the 16G ones.
It may not be used but it makes it easy to spot when things are being pushed a bit hard.
Then it's time to reduce MAKEOPTS on a per package basis. |
|
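Reducing MAKEOPTS for one package can be done with a per-package environment file. A sketch only: the file name `makeopts-j2.conf` and the value `-j2` are illustrative assumptions, not a recommendation from the thread.

```shell
# /etc/portage/env/makeopts-j2.conf  (hypothetical file name)
# Fewer parallel compile jobs for packages that push the Pi into swap.
MAKEOPTS="-j2"

# Then list the file against the package in /etc/portage/package.env:
# www-client/chromium makeopts-j2.conf
```

The global MAKEOPTS in make.conf stays as it is, so small packages still build with all four cores.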
Hu Administrator

Joined: 06 Mar 2007 Posts: 23652
|
Posted: Wed Jul 16, 2025 1:19 pm Post subject: |
|
|
netsplit wrote: | Here's the setup:
pi 5, 16gb ram
/etc/fstab
Code: |
#size=10G would fail for not enough disk space
tmpfs /var/tmp/tmpfs tmpfs size=102401M,uid=portage,gid=portage,mode=775 0 0
|
|
Normalizing that size:
Code: |
$ numfmt --to=iec --from=iec 102401M
101G
|
You told the kernel to allow up to ~101G in the tmpfs, but you only had 16G of real RAM to use, with that split between the tmpfs pages and ordinary usage. As you and Neddy noted, swapping is bad for performance, and the configuration you set here makes it easy to overload the tmpfs to the point that swapping is needed.
Typical guidance for C++ heavy programs is to plan for 2GiB RAM per compiler process, so even if your 10G had been sufficient disk space and you had only emerge running, you could not count on using more than ~3 concurrent compiler processes. I'm assuming you bumped to 101G without measuring exactly how much you needed, but even if bumping to 20G had been sufficient, that would have left you with at most negative 2 compilers running, if you wanted to avoid swapping. You need at least positive 1 compilers running to make forward progress, and ideally you want enough compiler processes to saturate every CPU core.
Chromium is well known to be huge and slow to build. I think the Pi 5 is just not powerful enough to build Chromium in tmpfs in a reasonable time, and arguably not powerful enough to build Chromium in reasonable time at all. |
|
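The compiler budget above is simple integer arithmetic: RAM left after the tmpfs fills, divided by the per-compiler allowance. A sketch, with `max_compilers` as a made-up helper name; like the figures above it ignores whatever the rest of the system needs.

```shell
#!/bin/sh
# max_compilers RAM_GIB TMPFS_GIB [GIB_PER_COMPILER]
# Whole number of compiler processes that fit in RAM once the tmpfs is full.
# GIB_PER_COMPILER defaults to the 2 GiB rule of thumb.
# A result of zero or below means the build cannot avoid swapping.
max_compilers() {
    ram=$1
    tmpfs=$2
    per=${3:-2}
    echo $(( (ram - tmpfs) / per ))
}

max_compilers 16 10   # prints 3
max_compilers 16 20   # prints -2
```

On a four-core Pi 5 the result would have to reach 4 before every core could be kept busy without swapping.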
netsplit n00b

Joined: 10 Jun 2024 Posts: 20
|
Posted: Wed Jul 16, 2025 2:43 pm Post subject: |
|
|
NeddySeagoon wrote: | netsplit,
That result does not surprise me. If you have the RAM to build in RAM, the kernel cache will do it anyway.
Building in tmpfs saves writes that will never be read. That may be a good thing for SSDs.
Reads/writes are all DMA, so time savings from not setting up DMA will be too small to measure.
If you have swap, the content of tmpfs can be moved to swap under pressure of RAM.
When you don't have swap, there is no home on disk for dynamically allocated RAM, so the kernel has to 'swap' in other ways.
It can drop clean pages, then reload them later. This includes code that it will execute real soon now.
It can write 'dirty' pages out, so that they are clean, then drop/reload them.
All my Pis have either 8G of swap, for RAM <= 8G or 16G on the 16G ones.
It may not be used but it makes it easy to spot when things are being pushed a bit hard.
Then it's time to reduce MAKEOPTS on a per package basis. |
The swap usage was indeed not surprising once I discovered just how much space Chromium wants to build. Still tried it. Swap had 32 GB available so I'm certain it had enough at least. It shouldn't ever need more than 48 GB (32+16) of total memory. Was mostly just testing things to learn. The actual paging is quite interesting. Thank you for shedding more light on it. I used to believe setting swappiness to 0 was ideal because it'd prevent swap usage. It might have been better on old systems where using the page file would cause freezes, but in hindsight 0 probably wasn't ideal.
Hu wrote: | netsplit wrote: | Here's the setup:
pi 5, 16gb ram
/etc/fstab
Code: |
#size=10G would fail for not enough disk space
tmpfs /var/tmp/tmpfs tmpfs size=102401M,uid=portage,gid=portage,mode=775 0 0
|
|
Normalizing that size:
Code: |
$ numfmt --to=iec --from=iec 102401M
101G
|
You told the kernel to allow up to ~101G in the tmpfs, but you only had 16G of real RAM to use, with that split between the tmpfs pages and ordinary usage. As you and Neddy noted, swapping is bad for performance, and the configuration you set here makes it easy to overload the tmpfs to the point that swapping is needed.
Typical guidance for C++ heavy programs is to plan for 2GiB RAM per compiler process, so even if your 10G had been sufficient disk space and you had only emerge running, you could not count on using more than ~3 concurrent compiler processes. I'm assuming you bumped to 101G without measuring exactly how much you needed, but even if bumping to 20G had been sufficient, that would have left you with at most negative 2 compilers running, if you wanted to avoid swapping. You need at least positive 1 compilers running to make forward progress, and ideally you want enough compiler processes to saturate every CPU core.
Chromium is well known to be huge and slow to build. I think the Pi 5 is just not powerful enough to build Chromium in tmpfs in a reasonable time, and arguably not powerful enough to build Chromium in reasonable time at all. |
I meant to do 10 GB + 1 MB because 10G was erroring out with a message that it needs 10 gigs of disk space, so I assumed it was a greater-than check (in hindsight, perhaps emerge checks for gibibytes and tmpfs works in gigabytes). Anyway, silly math goof aside, lucky for me tmpfs didn't try to take all that and just took the RAM needed for actual virtual storage. I was attempting to use RAM storage to saturate the CPU cores, because it seemed a single-lane PCIe 3.0 bus wasn't enough. The problems you noted prevented that. Thank you for catching my goof with tmpfs.
Anyway, I agree with your conclusion. If I ever need Chromium on a Raspberry Pi 5 I'll use a binary package. |
|
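For reference, the intended "10G plus 1M" size works out as below. This sketch assumes the tmpfs size suffixes are binary multiples (M = MiB, G = GiB), as the tmpfs(5) man page describes; `gib_plus_mib` is a made-up helper name.

```shell
#!/bin/sh
# gib_plus_mib GIB MIB
# Size in MiB of "GIB gibibytes plus MIB mebibytes",
# suitable for a tmpfs size=...M mount option.
gib_plus_mib() {
    echo $(( $1 * 1024 + $2 ))
}

gib_plus_mib 10 1            # prints 10241
echo $(( 102401 / 1024 ))    # prints 100, the whole GiB in 102401M
```

So size=10241M is the value that was meant, and 102401M is just over 100 GiB. numfmt reports it as 101G because it rounds away from zero by default.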