The title would have been «Understanding the relation between separate ASM files and optimisations».
I'm asking this in the context of cross-compiling for the AVR platform but it's about a behaviour I noticed more generally.
In projects that combine C, C++ and assembly files (.S), I noticed that no optimisations are applied to the latter. With inline assembly (asm volatile), by contrast, GCC may reorder instructions, change registers, inline functions or even eliminate useless instructions or code paths; none of that seems to happen with external assembly files. Assuming my observations are correct (and I am not so sure about that, hence the question), is there a way to apply optimisations to external assembly source files?
As an illustration, here are the relevant parts of my compilation flags:
CFLAGS = -Os -fshort-enums -Wno-error=narrowing -ffreestanding -mrelax -ffunction-sections -fdata-sections -flto -pipe
ASFLAGS = -x assembler-with-cpp $(CFLAGS) # <-- copy CPU, F_CPU and -g if present
And here is the assembly source file in question:
#include <avr/io.h>
#if defined(SPMCSR) && (defined(RSIG) || defined(SIGRD))
#if !defined(SIGRD)
#define SIGRD RSIG
#endif
#define __BOOT_SIGROW_READ (_BV(SPMEN) | _BV(SIGRD))
#define __SPMCSR _SFR_IO_ADDR(SPMCSR)
#define __SIGADDR 0
.section .text
;
; Arguments: r25:r24 = pointer to the data structure
; (See https://www.avrfreaks.net/forum/gcc-calling-conventions)
;
; Problems:
; 1) implies a `call`,
; 2) not inlined, ever,
; 3) not optimised in general!...
;
.global _read_signature
_read_signature:
movw r26, r24 ; Transfer the input pointer to the X register pair (destination)
ldi r30, lo8(__SIGADDR)
ldi r31, hi8(__SIGADDR)
ldi r24, __BOOT_SIGROW_READ
out __SPMCSR, r24 ; Prepare loading from program memory
lpm r0, Z+
st X+, r0
adiw r30, 0x1 ; Advance the source pointer one more byte, as signature bytes
; are located at even addresses.
out __SPMCSR, r24 ; Let's repeat this two more times (3 bytes to read/store)
lpm r0, Z+
st X+, r0
adiw r30, 0x1
out __SPMCSR, r24
lpm r0, Z
st X, r0
ret
#endif
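For comparison, this is roughly the kind of code I would expect the compiler to be able to inline and optimise. It is only a sketch, untested on hardware, and it assumes that avr-libc's boot_signature_byte_get() from <avr/boot.h> is available for the target. The names read_signature_inline and READ_SIG_BYTE, as well as the host-only placeholder table, are made up for illustration.

```c
#include <stdint.h>

#if defined(__AVR__)
#include <avr/boot.h>   /* avr-libc: boot_signature_byte_get(), assumed available for the target */
#define READ_SIG_BYTE(addr) boot_signature_byte_get(addr)
#else
/* Host-only stand-in so the sketch can be compiled off-target.
   These values are placeholders, not read from any device. */
static const uint8_t fake_signature_row[5] = { 0x1E, 0x00, 0x95, 0x00, 0x0F };
#define READ_SIG_BYTE(addr) (fake_signature_row[(addr)])
#endif

/* Hypothetical inline counterpart of _read_signature: copies the three
   signature bytes (located at even addresses 0, 2 and 4) into dst. */
static inline void read_signature_inline(uint8_t *dst)
{
    dst[0] = READ_SIG_BYTE(0);
    dst[1] = READ_SIG_BYTE(2);
    dst[2] = READ_SIG_BYTE(4);
}
```

Since this is a static inline function visible to the compiler, no call would be needed, which is exactly what the external .S version above does not seem to allow.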
Thanks in advance for your insight.

