The outputs are created from:
The details of how a stub is initialised at load time or run time (so that a call to a stub entry point becomes a call to the corresponding library function) are system-specific. The linker provides a general mechanism for attaching code and data to both the library and the stub to support this. In particular:
Alternatively, in support of more protected systems, the patching code can simply be a call to a system service which locates the matching library and patches the entry vector.
The patching of shared library entry vectors by the loader at load time is not directly supported. However, it would be a relatively simple extension to AIF to support this. In general, it is considered more efficient to patch on demand in systems with multiple shared libraries.
The user-specified parameter block mechanism allows fine control over, and diagnosis of, the compatibility of a stub with a version of its shared library. This supports a variety of approaches to foreverness, without mandating foreverness where it would be inappropriate. This issue is discussed in Versions, compatibility and foreverness.
The shared library addressing architecture
The central issue for shared objects is that of addressing their clients'
static data.
On ARM processors, it is difficult or inefficient to avoid the use of address constants when addressing static data, particularly the static data of separately compiled or assembled objects. (An address constant is a pointer whose value is bound at link time; in effect, it is an execution-time constant.)
Typically, in non-reentrant code, these address constants are embedded in each separately compiled or assembled code segment, and are, in turn, addressed relative to the program counter. In this organisation, all threadings of the code address the same, link-time bound static data.
In a reentrant object, these address constants (or adcons) are collected into a separate area (in AOF terminology called a based area) which is addressed via the sb register. When reentrant objects are linked together, the linker merges these adcon areas into a single, contiguous adcon vector, and relocates the references into the adcon vector appropriately (usually by adjusting the offset of an LDR ..., [sb, offset] instruction). The output of this process is termed a link unit.
In this organisation, it is possible for different threadings of the code to address different static data, and for the binding of the code to its data to be delayed until execution time (an excellent idea if the code has to be committed to ROM, even if reentrancy is not required).
When control passes into a new link unit, a new value of sb has to be established; when control returns, the old value must be restored. A call between two separately linked program fragments is called an inter link unit call, or inter-LU call. The inter-LU calls are precisely the calls between a shared library's stub and the library's matching entry points.
Because an LDR instruction has a limited (4KB) offset, the linker packs adcons into the low-address part of the based-sb area. It is a current restriction that there can be no more than 1K adcons in a client application (but this number seems adequate to support quite large programs using several megabytes of sharable code).
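The 1K figure follows directly from the numbers in the text: a 4KB window reachable from sb, divided into word-sized address constants. The sketch below only restates that arithmetic; the names max_adcons and adcon_offset are hypothetical and are not part of the linker.

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from the text; the names are illustrative only. */
enum {
    LDR_OFFSET_RANGE = 4096, /* bytes reachable by LDR ..., [sb, #offset] */
    ADCON_SIZE       = 4     /* one address constant is one 32-bit word   */
};

/* Number of adcons that fit in the window addressable from sb. */
static uint32_t max_adcons(void)
{
    return LDR_OFFSET_RANGE / ADCON_SIZE;
}

/* Byte offset from sb of the adcon with the given index. */
static uint32_t adcon_offset(uint32_t index)
{
    assert(index < max_adcons());
    return index * ADCON_SIZE;
}
```

The last usable adcon therefore sits at offset 4092 from sb, which is still within the range of a single LDR.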
The linker places the data for the inter-LU entry veneers immediately after the adcon vector (still in the based-sb area). If the stub is reentrant (to support linking into other shared libraries), then the inter-LU entry data consists of:
A reentrant function called via a function pointer, or from a non-reentrant caller, must have its sb value loaded pc-relative, as there is no sb value relative to which to load it. In turn, this forces the entry veneer to be part of the client's private data (or there could be no reentrancy).
When you ask for a read-only copy of a data area to be included in a shared library, the linker checks it is a simple, initialised data area. The following cannot be included in a shared library:
Names containing $$ are reserved to the implementors of the ARM software development tools, so these linker-invented area names cannot clash with any area name you choose yourself.
FunctionName
        ADD     ip, sb, #V1         ; calculate offset of veneer data from sb
        ADD     ip, ip, #V2
        LDMIA   ip, {ip, pc}        ; load new-sb and pc values
This allows up to 32K entry veneers to be addressed (V1 and V2 are
jointly relocated by the linker and support a word offset in the range
0-65K). The corresponding inter-LU data is:
        DCD     new-sb              ; sb value for called link unit
        DCD     entry-point         ; address of the library entry point
Both of these values are created when the stub is patched, as introduced
above and described in detail below.
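One way to picture how two ADD immediates can jointly cover a word offset in the range 0-65K is to give each of them one byte of the offset. The sketch below assumes that split (V1 carries the high byte, V2 the low byte, both scaled to bytes); the text does not say how the linker actually divides the offset, and the names VeneerOffset and split_veneer_offset are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split of a veneer-data word offset between the two ADDs.
   Each field is an 8-bit value shifted left by an even amount, so each
   can be encoded as an ARM data-processing immediate. */
typedef struct {
    uint32_t v1; /* immediate for ADD ip, sb, #V1 */
    uint32_t v2; /* immediate for ADD ip, ip, #V2 */
} VeneerOffset;

static VeneerOffset split_veneer_offset(uint32_t word_offset)
{
    VeneerOffset r;

    assert(word_offset <= 0xFFFFu);            /* 16-bit word offset only */
    r.v1 = ((word_offset >> 8) & 0xFFu) << 10; /* high byte, times 4 */
    r.v2 = (word_offset & 0xFFu) << 2;         /* low byte, times 4  */
    return r;
}
```

Since each veneer's data occupies two words (new-sb and entry-point), 64K words of offset correspond to the 32K veneers quoted above.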
The inter-LU code for an indirect or non-reentrant inter-LU call is:
FunctionName
        ADD     ip, pc, #0          ; ip = pc+8
        LDMIA   ip, {ip, pc}        ; load new-sb and pc values
        DCD     new-sb              ; sb value for called link unit
        DCD     entry-point         ; address of the library entry point
Again, the data values are created when the stub is patched.
Base ; sb points here
End
        STMFD   sp!, {r0-r6,r14}    ; save work registers and lr
        LDR     r0, End-4           ; load address of End
        B       |__rt_dynlink|      ; do the dynamic linking...
        DCD     Params - Base       ; offset to sb-value
Params
Note the assumption that a stack has been created before any
attempt is made to access the shared library. Note also that the word
preceding End is initialised to the address of End.
__rt_dynlink, referred to above, can be implemented as described in this
section.
On entry to __rt_dynlink, a copy is saved of the pointer to the
code/parameter block at the end of the inter-LU data area, and a bound
is calculated on the stub size (the entries are in index order).
|__rt_dynlink|
        MOV     r6, r0
        LDR     r5, [r6, #-8]       ; max-entry-index
        ADD     r5, r5, #1          ; # max entries in stub
        MOV     r4, ip              ; resume index
Then it is necessary to locate the matching library, which the following
fragment does in a simple system-specific fashion. Note that in a library
which contains no read-only static data image, r0+16 identifies the user
parameter block (at the end of the inter-LU data area); if the library
contains an initialising image for its static data then r0+24 identifies
the user parameter block.
Here, the library location function is shown as a SWI which takes as its argument in r0 a pointer to the user parameter block and returns the address of the matching External Function Table in r0:
        ADD     r0, r6, #24         ; stub parameter block address
        SWI     Get_EFT_Address     ; are you there?
        BVS     Botched             ; system-dependent
R0 now points to the EFT, which begins with the number of entries in it. A
simple sanity check is that if there are fewer entries in the library than
in the stub, it has probably been patched incorrectly.
        LDR     ip, [r0]            ; #entries in lib
        CMPS    ip, r5              ; >= #max entries in stub?
        BLT     Botched             ; no, botched it...
If the shared library contains data to be copied into the stub then check
the length to copy:
        LDR     ip, [r6, #16]       ; stub data length
        BIC     ip, ip, #3          ; word aligned, I insist...
        ADD     r3, r6, #4
        LDR     r3, [r3, r5, LSL #2] ; library data length
        CMPS    r3, ip
        BNE     Botched             ; library and stub lengths differ
Checking the stub data length and library data length match is a naive,
but low-cost, way to check the library and the stub are compatible. Now
copy the static data from the library to the stub:
        LDR     r3, [r6, #20]       ; stub data destination
        SUB     r2, r0, ip          ; library data precedes the EFT
01      SUBS    ip, ip, #4          ; word by word copy loop
        LDRGE   r1, [r2], #4
        STRGE   r1, [r3], #4
        BGE     %B01
Then initialise the entry vectors. First, the sb value is computed for the
callee:
        LDR     ip, [r6, #12]       ; length of inter-LU data area
        ADD     r3, r6, #24         ; end of data area...
        SUB     r3, r3, ip          ; start of data area = sb value
If there is no static data in the library then #24 above becomes #16.
Then the following loop works backwards through the EFT indices, and backwards through the inter-LU data area, picking out the indices of the EFT entries which need to be patched with an sb, entry-point pair. Ip still holds the index of the entry which caused arrival at this point, which is the index of the entry to be retried after patching the stub. The corresponding retry address is remembered in r14, which was saved by the code fragment at the end of the inter-LU data area before it branched to __rt_dynlink. A small complication is that the step back through a non-reentrant stub may be either 8 bytes or 16 bytes. However, there can be no confusion between an index (a small integer) and an ADD instruction, which has some top bits set.
        LDR     r2, [r6, #-8]!      ; index of stub entry
00      SUB     ip, r5, #1          ; index of the lib entry
        CMPS    ip, r2              ; is this lib entry in the stub?
        SUBGT   r5, r5, #1          ; no, skip it
        BGT     %B00
        CMPS    r2, r4              ; found the retry index?
        MOVEQ   lr, r6              ; yes: remember retry address
        LDR     ip, [r0, r5, LSL #2] ; entry point offset
        ADD     ip, ip, r0          ; entry point address
        STMIA   r6, {r3, ip}        ; save {sb, pc}
        LDR     r2, [r6, #-8]!      ; load index and decrement r6...
        TST     r2, #&ff000000      ; ... or if loaded an instr?
        LDRNE   r2, [r6, #-8]!      ; ...load index and decrement r6
        SUBS    r5, r5, #1          ; #EFT entries left...
        BGT     %B00
Finally, when the vector has been patched, the failed call can be
retried:
        MOV     ip, lr              ; retry address
        LDMFD   sp!, {r0-r6, lr}    ; restore saved regs
        LDMIA   ip, {ip, pc}        ; and retry the call
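The overall effect of __rt_dynlink (patch every entry in one pass on the first call, then retry the call that triggered the patching) can be modelled in C. This is only a simplified model under stated assumptions: a function pointer stands in for each {sb, pc} pair, the library is located by a fixed table rather than a system service, and every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*EntryFn)(int);

#define STUB_ENTRIES 2

/* Two illustrative library functions. */
static int lib_double(int x) { return 2 * x; }
static int lib_negate(int x) { return -x; }

/* Stands in for the library's External Function Table. */
static const EntryFn eft[STUB_ENTRIES] = { lib_double, lib_negate };

static int unpatched0(int x);
static int unpatched1(int x);

/* The stub's entry vector: every slot starts out unpatched. */
static EntryFn entry_vector[STUB_ENTRIES] = { unpatched0, unpatched1 };
static int patch_count = 0;

/* Patch the whole vector in one pass, then retry the failed call. */
static int dynlink_and_retry(size_t index, int arg)
{
    size_t i;

    for (i = 0; i < STUB_ENTRIES; i++)
        entry_vector[i] = eft[i];
    patch_count++;
    return entry_vector[index](arg);
}

/* An unpatched slot remembers its own index and asks for patching. */
static int unpatched0(int x) { return dynlink_and_retry(0, x); }
static int unpatched1(int x) { return dynlink_and_retry(1, x); }

/* What a client calls: always goes through the entry vector. */
static int call_entry(size_t index, int arg)
{
    assert(index < STUB_ENTRIES);
    return entry_vector[index](arg);
}
```

In this model the first call through any slot pays for patching the whole vector; later calls through any slot go straight to the library function.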
Versions, compatibility and foreverness
The mechanisms described so far are very general and, of themselves, give
no guarantee that a stub and a library will be compatible, unless the stub
and the library were the complementary components produced by a single
link operation.
Often, in systems using shared libraries, stubs are bound into applications which must continue to run when a new release of the library is installed. This requirement is especially compelling when applications are constructed by third party vendors or end users.
The general requirements for compatibility are as follows:
Because the addresses of library entry points are not bound into a stub until run-time, the only foreverness guarantees a library must give are:
For libraries which export only code, and which make no use of static data, compatibility is straightforward to manage. Use of static data is more hazardous, and the direct export of it is positively lethal.
If a static data symbol is exported from a shared library, what is actually exported is a symbol in the library's stub. This symbol is bound when the stub is linked into an application and, from that instant onwards, cannot be unbound. Thus the direct export of a data symbol fixes the offset and length of the corresponding datum in the shared library's data area, forever (i.e. until the next incompatible release).
The linker does not fault the direct export of data symbols because the ARM shared library mechanism may be in use not to build a shared library, but to structure applications for ROM. In this case a prohibition could be irksome. Those specifying or building genuine shared libraries need to be aware of this issue, and should generally not make use of directly exported data symbols. If data must be exported directly then:
In practice, it is rare for the direct export of static data to be genuinely necessary. Often a function can be written to take a pointer to its static data as an argument, or a function can be used to return the address of the relevant static data (thus delaying the binding of the offset and size of the datum until run-time, and avoiding the foreverness trap). It is only if references to a datum are frequent and ubiquitous that direct export is unavoidable. For example, a shared library implementation of an ANSI C library might export directly errno, stdin, stdout and stderr, (and even errno could be replaced by (*__errno()), with few implications).
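The accessor-function alternative can be sketched as follows. The datum stays private to the library and only a function returning its address is exported, so no offset or size is bound into the stub. The names lib_errno_addr, lib_errno and lib_fail are hypothetical stand-ins chosen for this sketch.

```c
#include <assert.h>

/* Library side: the datum is not exported. */
static int lib_errno_value;

/* Exported function: the client binds to this entry point, not to the
   offset and size of the datum. */
int *lib_errno_addr(void)
{
    return &lib_errno_value;
}

/* Client side: a macro in the library's header hides the call, in the
   same way that errno could be replaced by (*__errno()). */
#define lib_errno (*lib_errno_addr())

/* A library routine recording a failure code; 0 means "no error". */
void lib_fail(int code)
{
    assert(code != 0);
    lib_errno_value = code;
}
```

A later release of the library can then move or resize its static data freely, as long as the accessor keeps its entry in the External Function Table.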
Describing a shared library to the linker
A shared library description consists of a sequence of lines. On all
lines, leading white space (blank, tab, VT, CR) is ignored.
If the first significant character of a line is a semicolon (';') then the line is ignored. Lines beginning with ';' can be used to embed comments in a shared library description. A comment can also follow a \ which continues a parameter block description.
If the first significant character of a line is > then the line gives the name and parameter block for the library. Such lines can be continued over several contiguous physical lines by ending all but the last line with '\'. For example:
    > testlib \         ; the name of the library image file
      "testlib" \       ; the text name of the library -> parameter block
      101 \             ; the library version number
      0x80000001
The first word following the > is the name of the file to hold the
shared library binary image; the argument to the linker's -Output option
is used to name the stub. Following tokens are parameter block entries,
each of which is either a quoted string literal or a 32-bit integer. In
the parameter block, each entry begins on a 4-byte boundary.
Within a quoted string literal, the characters '"' and '\' must be preceded by '\' (the same convention as in the C programming language). Characters of a string are packed into the parameter block in ascending address order, followed by a terminating NUL and NUL padding to the next 4-byte boundary.
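The packing rule for string entries can be sketched in C. This is an illustration of the rule as stated, not the linker's code; packed_size and pack_string are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes occupied by a string entry: the characters, one terminating
   NUL, then NUL padding up to the next 4-byte boundary. */
static size_t packed_size(const char *s)
{
    assert(s != NULL);
    return (strlen(s) + 1 + 3) & ~(size_t)3;
}

/* Pack s at block (assumed 4-byte aligned) in ascending address order.
   Returns the number of bytes written, or 0 if capacity is too small. */
static size_t pack_string(unsigned char *block, size_t capacity, const char *s)
{
    size_t len = strlen(s);
    size_t total = packed_size(s);

    if (total > capacity)
        return 0;
    memcpy(block, s, len);                /* the characters            */
    memset(block + len, 0, total - len);  /* terminator and padding    */
    return total;
}
```

Under this rule the entry "testlib" occupies 8 bytes (7 characters plus the terminating NUL), so the entry that follows it starts on a 4-byte boundary without further padding.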
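An integer is written in any way acceptable to the ANSI C function strtoul() with a base of 0. That is, as an optional '-' followed by one of: a sequence of decimal digits not beginning with 0 (decimal); a 0 optionally followed by octal digits (octal); or 0x or 0X followed by hexadecimal digits (hexadecimal).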
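The behaviour of strtoul() with a base of 0 can be checked directly. The helper below is a hypothetical sketch of how a tool might read one integer entry; it assumes the whole token must be consumed and that only the low 32 bits of the result are kept.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse one integer parameter-block entry. Returns the value (0 on
   failure); if ok is not NULL it is set to 1 on success, 0 on failure. */
static uint32_t parse_entry(const char *token, int *ok)
{
    char *end;
    unsigned long value;

    assert(token != NULL);
    errno = 0;
    value = strtoul(token, &end, 0);   /* base 0: the prefix selects the radix */
    if (end == token || *end != '\0' || errno == ERANGE) {
        if (ok) *ok = 0;
        return 0;
    }
    if (ok) *ok = 1;
    return (uint32_t)value;            /* keep the low 32 bits */
}
```

With this helper, "101" is read as decimal, "0x80000001" as hexadecimal and "017" as octal 15, while "-1" yields the all-ones 32-bit word.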
A line beginning with a '+' describes input data areas to be included, read-only, in the shared library and copied at run time to place holders in the library's clients. The general format of such lines is a list of object(area) pairs instructing the linker to include area area from object object:
    + object ( area ) object ( area ) ...
If object is omitted then any object in the input list will match.
For example:
    + (C$$data)
instructs the linker to include all areas called C$$data, whatever
objects they are from.
If area is omitted too, then all suitable input data areas will be included in the library. This is the most common usage. For example:
    + ()
Finally, a '+' on its own excludes all input data areas from the
shared library but instructs the linker to write zero length and address
or offset words immediately preceding the stub and library parameter
blocks, for uniformity of dynamic linking.
All remaining non-comment lines are taken to be the names of library entry points which are to be exported, directly or via function pointers. Each such line has one of the following three forms:
    entry-name
    entry-name()
    entry-name(object-name)
The first form names a directly exported global symbol: a direct entry
point to the library, or the name of an exported datum (deprecated).
The second form names a global code symbol which is exported indirectly via a function pointer. Such a symbol may also be exported directly.
The third form names a non-exported function which, nonetheless, is exported from the library by being passed as a function argument, or by having its address taken by a function pointer. To clarify this, suppose the library contains:
    void f1(...) {...}
    void f2(...) {...}
    static void f3(...) {...}     /* from object module o3.o */
    static void (*fp2)(...) = f2;
    void (*pf3)(...) = f3;
...and that f1 is to be exported directly. Then a suitable
description is:
    f1
    f2()
    f3(o3)
    pf3      /* deprecated direct export of a datum */
Note that f2 and f3 have to be listed even though they are
not directly exported, so that function variable veneers can be created
for them.
f3 must be qualified by its object module name, as there could be several non-exported functions with the same name (each in a differently named object module). Note that the module name, not the name of the file containing the object module, is used to qualify the function name.
If f2 were to be exported directly then the following list of entry points would be appropriate:
    f1
    f2
    f2()
    f3(o3)
    pf3
Unless all address-taken functions are included in the export list, the
linker will complain and refuse to make a shared library.
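The line kinds that make up a shared library description can be told apart by the first significant character alone. The sketch below illustrates that rule; it is not the linker's parser, the names LineKind and classify_line are hypothetical, and treating an empty line as a separate kind is an assumption, since the text does not say how blank lines are handled.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    LINE_BLANK,        /* assumption: nothing after leading white space */
    LINE_COMMENT,      /* begins with ';'                                */
    LINE_LIBRARY,      /* begins with '>': name and parameter block      */
    LINE_DATA_AREAS,   /* begins with '+': data areas to include         */
    LINE_ENTRY_POINT   /* anything else: an exported entry point         */
} LineKind;

static LineKind classify_line(const char *line)
{
    assert(line != NULL);

    /* Leading white space (blank, tab, VT, CR) is ignored. */
    while (*line == ' ' || *line == '\t' || *line == '\v' || *line == '\r')
        line++;

    switch (*line) {
    case '\0':
    case '\n': return LINE_BLANK;
    case ';':  return LINE_COMMENT;
    case '>':  return LINE_LIBRARY;
    case '+':  return LINE_DATA_AREAS;
    default:   return LINE_ENTRY_POINT;
    }
}
```

A fuller reader would also have to join '\'-continued lines and then distinguish the three entry-point forms by the presence and contents of the trailing parentheses.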
--------------------------------------------------------
Symbol           | Definition
--------------------------------------------------------
EFT$$Offset      | Offset of the External Function Table
                 | from the beginning of the shared
                 | library;
--------------------------------------------------------
EFT$$Params      | Offset of the shared library's
                 | parameter block from its beginning;
--------------------------------------------------------
$$0$$Base        | The (relocatable) address of the
                 | zero-initialised place holder in the
                 | stub;
--------------------------------------------------------
SHL$$data$$Base  | Offset of the start of the read-only
                 | copy of the data from the beginning of
                 | the shared library;
--------------------------------------------------------
SHL$$data$$Size  | Size of the shared library's data
                 | section, which is also the size of the
                 | place holder in the stub.
--------------------------------------------------------
EFT$$Offset and EFT$$Params are exported to the stub and may be referred to in following link steps; the others exist only while the shared library is being constructed.