It is as if:
struct s f(int x)
were compiled as:
void f(struct s *result, int x)
As a demonstration of the default way in which structures are passed and
returned, consider the following code:
typedef struct two_ch_struct
{ char ch1;
  char ch2;
} two_ch;
two_ch max( two_ch a, two_ch b )
{ return (a.ch1>b.ch1) ? a : b;
}
This code is available in the examples directory as
two_ch.c. It can be compiled to produce Assembly Language source
by using the following command:
armcc -S two_ch.c -li -apcs 3/32bit
Where -li and -apcs 3/32bit can be omitted if armcc has been
configured appropriately already.
Here is the code which armcc produces:
max
        MOV ip,sp
        STMDB sp!,{a1-a3,fp,ip,lr,pc}
        SUB fp,ip,#4
        LDRB a3,[fp,#-&14]
        LDRB a2,[fp,#-&10]
        CMP a3,a2
        SUBLE a2,fp,#&10
        SUBGT a2,fp,#&14
        LDR a2,[a2,#0]
        STR a2,[a1,#0]
        LDMDB fp,{fp,sp,pc}
The STMDB instruction saves the arguments onto the stack, together with
the frame pointer, stack pointer, link register and current pc value (this
sequence of values is the stack backtrace data structure).
a2 and a3 are then used as temporary registers to hold the required parts of the structures passed, and a1 is used as a pointer to the area of memory in which the resulting struct is placed, all as expected.
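The hidden result pointer can also be written out by hand in C. The following is a minimal sketch, not compiler output: max_by_value and max_via_pointer are hypothetical names, and the second function only approximates what the generated code does with a1.

```c
/* Same layout as the two_ch structure in the example above. */
typedef struct two_ch_struct {
    char ch1;
    char ch2;
} two_ch;

/* Source-level form: the structure is returned by value. */
two_ch max_by_value(two_ch a, two_ch b)
{
    return (a.ch1 > b.ch1) ? a : b;
}

/* Hand-lowered form (hypothetical name): the caller supplies the address
   of the result, playing the role of the pointer passed in a1. */
void max_via_pointer(two_ch *result, two_ch a, two_ch b)
{
    /* Select one argument (SUBGT / SUBLE) and copy it to the
       caller's result area (LDR / STR). */
    *result = (a.ch1 > b.ch1) ? a : b;
}
```

Both functions leave the same structure in the caller's hands; the second simply makes the extra pointer argument visible in the source.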
For a basic explanation of register naming and usage under the APCS, see Register usage under the ARM procedure call standard. Detailed information can be found in C language calling conventions.
struct
{ unsigned a:8, b:8, c:8, d:8;
}
union polymorphic_ptr
{ struct A *a;
  struct B *b;
  int *i;
}
Whereas the structure used in the previous example is not integer-like:
struct { char ch1, ch2; }
Integer-like structs are returned by returning the struct's contents in a1
rather than a pointer to the struct's contents. Thus a1 is not needed to
pass a pointer to a result struct in memory, and is instead used to
pass the first argument.
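The difference between these layouts can be inspected from C. The sketch below assumes the APCS definition of integer-like (no larger than one word, with every addressable sub-field at byte offset 0). The typedef names and the two helper functions are hypothetical, introduced only for illustration.

```c
#include <stddef.h>

struct A;   /* incomplete types are enough for pointer members */
struct B;

typedef struct { unsigned a:8, b:8, c:8, d:8; } packed_bytes;
typedef union  { struct A *a; struct B *b; int *i; } polymorphic_ptr;
typedef struct { char ch1, ch2; } two_chars;

/* Every member of a union starts at offset 0, so the union passes
   the "all addressable sub-fields at offset 0" part of the rule. */
size_t polymorphic_ptr_b_offset(void)
{
    return offsetof(polymorphic_ptr, b);
}

/* ch2 is separately addressable and does not start at offset 0,
   which is why this structure is not integer-like. */
size_t two_chars_ch2_offset(void)
{
    return offsetof(two_chars, ch2);
}
```

Bitfields such as those in packed_bytes are not addressable (offsetof cannot be applied to them), so they do not count against the offset rule.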
For example, consider the following code:
typedef struct half_words_struct
{ unsigned field1:16;
  unsigned field2:16;
} half_words;
half_words max( half_words a, half_words b )
{ half_words x;
  x= (a.field1>b.field1) ? a : b;
  return x;
}
We would expect arguments a and b to be passed in registers a1 and a2, and
since half_words_struct is integer-like we expect the result structure to
be passed back directly in a1 (rather than a1 being used to return a
pointer to the result half_words_struct).
The above code is available in the examples directory as half_str.c. It can be compiled to produce Assembly Language source by using the following command:
armcc -S half_str.c -li -apcs 3/32bit
Where -li and -apcs 3/32bit can be omitted if armcc has been
configured appropriately already.
Here is the code which armcc produces:
max
        MOV a3,a1,LSL #16
        MOV a3,a3,LSR #16
        MOV a4,a2,LSL #16
        MOV a4,a4,LSR #16
        CMP a3,a4
        MOVLE a1,a2
        MOV pc,lr
Clearly the contents of the half_words structure are returned
directly in a1, as expected.
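The generated code treats each structure as a single 32-bit word. A rough C equivalent of the instruction sequence is sketched below; low_half and max_packed are hypothetical names, and the sketch assumes that field1 occupies bits 0-15 of the word, as the shifts in the listing suggest.

```c
#include <stdint.h>

/* Extract the low 16 bits the way the listing does:
   a shift left by 16 followed by a logical shift right by 16. */
static uint32_t low_half(uint32_t word)
{
    return (uint32_t)(word << 16) >> 16;
}

/* Word-level view of max(): a and b are the packed structures as they
   arrive in a1 and a2; the return value is what is left in a1. */
uint32_t max_packed(uint32_t a, uint32_t b)
{
    if (low_half(a) <= low_half(b))   /* CMP a3,a4 then MOVLE a1,a2 */
        a = b;
    return a;                         /* result stays in a1 */
}
```

For instance, max_packed(0x00020005, 0x00010007) compares 5 with 7 and returns the whole second word, 0x00010007.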
As we have seen, this will result in a pointer to the structure being passed in a1, which will then be dereferenced to store the values returned.
For some applications in which such a function is time critical, the overhead involved in "wrapping" and then "unwrapping" this structure can be significant. However, there is a way to tell the compiler that a structure should be returned in the argument registers a1 - a4. Clearly this is only useful for returning structures which are no larger than 4 words.
The way to tell the compiler to return a structure in the argument registers is to use the keyword "__value_in_regs".
Multiplication - Returning a 64-bit result
To illustrate how to use __value_in_regs, let us consider writing a
function which multiplies two 32-bit integers together and returns the
64-bit result.
The way this function must work is to split the two 32-bit numbers (a, b) into high and low 16-bit parts (a_hi, a_lo, b_hi, b_lo). The four multiplications a_lo * b_lo, a_hi * b_lo, a_lo * b_hi and a_hi * b_hi must be performed, and the results added together, taking care to deal with carry correctly.
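The arithmetic can be sketched in portable C to make the carry handling explicit. This is an illustration of the algorithm only: the names u64_pair and mul32x32 are hypothetical, and the two carries are detected by unsigned comparison because C has no direct access to the carry flag.

```c
#include <stdint.h>

typedef struct {
    uint32_t lo;   /* bits 0-31 of the product  */
    uint32_t hi;   /* bits 32-63 of the product */
} u64_pair;        /* hypothetical name */

u64_pair mul32x32(uint32_t a, uint32_t b)
{
    uint32_t a_hi = a >> 16, a_lo = a & 0xFFFFu;
    uint32_t b_hi = b >> 16, b_lo = b & 0xFFFFu;

    /* Four 16x16 -> 32-bit partial products; none can overflow. */
    uint32_t m_lo   = a_lo * b_lo;
    uint32_t m_mid1 = a_hi * b_lo;
    uint32_t m_mid2 = a_lo * b_hi;
    uint32_t m_hi   = a_hi * b_hi;

    /* The middle sum can wrap; a wrap is worth 2^48 in the product,
       i.e. bit 16 of the high word. */
    uint32_t m_mid = m_mid1 + m_mid2;
    if (m_mid < m_mid1)
        m_hi += 0x10000u;

    u64_pair r;
    r.lo = m_lo + (m_mid << 16);
    /* (r.lo < m_lo) is 1 exactly when the low-word addition wrapped. */
    r.hi = m_hi + (m_mid >> 16) + (r.lo < m_lo);
    return r;
}
```

As a check, mul32x32(0x12345678, 0x10000001) yields lo = 0x92345678 and hi = 0x01234567.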
Since the problem involves dealing with carry correctly, coding this function in C will not produce optimal code (see 64 Bit integer addition for more details). Therefore we will want to code the function in ARM Assembly Language. The following code performs the algorithm just described:
; On entry a1 and a2 contain the 32-bit integers to be multiplied (a, b)
; On exit a1 and a2 contain the result (a1 bits 0-31, a2 bits 32-63)
mul64
        MOV ip, a1, LSR #16 ; ip = a_hi
        MOV a4, a2, LSR #16 ; a4 = b_hi
        BIC a1, a1, ip, LSL #16 ; a1 = a_lo
        BIC a2, a2, a4, LSL #16 ; a2 = b_lo
        MUL a3, a1, a2 ; a3 = a_lo * b_lo (m_lo)
        MUL a2, ip, a2 ; a2 = a_hi * b_lo (m_mid1)
        MUL a1, a4, a1 ; a1 = a_lo * b_hi (m_mid2)
        MUL a4, ip, a4 ; a4 = a_hi * b_hi (m_hi)
        ADDS ip, a2, a1 ; ip = m_mid1 + m_mid2 (m_mid)
        ADDCS a4, a4, #&10000 ; a4 = m_hi + carry (m_hi')
        ADDS a1, a3, ip, LSL #16 ; a1 = m_lo + (m_mid<<16)
        ADC a2, a4, ip, LSR #16 ; a2 = m_hi' + (m_mid>>16) + carry
        MOV pc, lr
Clearly this code is fine for use with Assembly Language modules, but in
order to use it from C we need to be able to tell the compiler that this
routine returns its 64-bit result in registers. This can be done by
making the following declarations in a header file:
typedef struct int64_struct
{ unsigned int lo;
  unsigned int hi;
} int64;
__value_in_regs extern int64 mul64(unsigned a, unsigned b);
The Assembly Language code above, and the declarations above together with
a test program are all in the examples directory, as the files
mul64.s, mul64.h, int64.h and multest.c. To
compile, assemble and link these to produce an executable image suitable
for armsd, first set your current directory to examples, and
then execute the following commands:
armasm mul64.s -o mul64.o -li
armcc -c multest.c -li -apcs 3/32bit
armlink mul64.o multest.o
Where somewhere is the directory in which the semi-hosted C
libraries reside (e.g. the lib directory of the ARM Software Tools
Release). Note also that -li and -apcs 3/32bit can be
omitted if armcc and armasm (and armsd below) have
been configured appropriately.
multest can then be run under armsd as follows:
> armsd -li multest
A.R.M. Source-level Debugger, version 4.10 (A.R.M.) [Aug 26 1992]
ARMulator V1.20, 512 Kb RAM, MMU present, Demon 1.01, FPE, Little
endian.
Object program file multest
armsd: go
Enter two unsigned 32-bit numbers in hex eg.(100 FF43D)
12345678 10000001
Least significant word of result is 92345678
Most significant word of result is 1234567
Program terminated normally at PC = 0x00008418
0x00008418: 0xef000011 .... : > swi 0x11
armsd: quit
Quitting
>
To convince yourself that __value_in_regs is being used, try removing it
from mul64.h, then recompile multest.c, relink multest
and rerun armsd. This time the answers returned will be incorrect,
as the result is no longer expected to be returned in registers, but
instead in a block of memory (i.e. the code now has a bug).