The most effort-consuming porting task is moving software between two entirely different hardware environments, running different operating systems with different compilers. Because many users of the ARM C compiler will face just this situation, this section deals with the issues that the user should be aware of when porting software to or from the ARM C system environment.
If code is to be used on a variety of different systems, there are certain issues that should be borne in mind to make porting an easy and relatively error-free process. It is essential to identify practices which may make software system-specific, and to avoid them. In the remainder of this section, we document the general portability issues for C programs.
A common non-portable assumption is embedded in the use of hexadecimal constant values. For example:

int i = 0xffff; /* -1 if sizeof(int) == 2;
                   65535 if sizeof(int) == 4... */

In non-ANSI dialects of C there are pitfalls with argument passing. Consider, for example:

int f(x)
long int x;
{...}

and the (careless) invocation of f():

f(1); /* f(1L) was intended/required */

If sizeof(int) == sizeof(long int), all will be well; otherwise there may be catastrophe.

A dual problem afflicts the format string of the printf() family, even in ANSI C. For example:

long int l1, l2, l3;
...
printf("L1 = %d, L2 = %d, L3 = %d\n", l1, l2, l3);
/* "...%ld...%ld...%ld..." is intended/required */

Again, if sizeof(int) != sizeof(long) we have dangerous nonsense.
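Both pitfalls above have straightforward ANSI-era remedies, sketched below. The function body given to f and the helper print_longs are illustrative, not part of the original example; the point is that a prototype makes the compiler convert f(1) to f(1L), and that %ld is the correct specifier for long arguments whatever sizeof(int) may be.

```c
#include <stdio.h>

/* With an ANSI prototype in scope, each argument is converted to the
   declared parameter type: f(1) behaves as f(1L) on any target.
   The body here is a trivial illustrative stand-in. */
int f(long int x)
{
    return x == 1L;
}

/* %ld must be used for long arguments; %d is wrong wherever
   sizeof(int) != sizeof(long). */
void print_longs(long l1, long l2, long l3)
{
    printf("L1 = %ld, L2 = %ld, L3 = %ld\n", l1, l2, l3);
}
```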
Another common assumption is about the signedness of characters, especially if chars are expected to be 7-bit quantities rather than 8-bit ones. For example, consider:

static char tr_tab[256] = {...};
...
int i, ch;
...
i = fgetc(f);   /* should be i = (unsigned char) fgetc(f) */
ch = tr_tab[i]; /* WRONG if chars are signed... */

Note that declaring i to be unsigned int doesn't help (it merely causes ch = tr_tab[i] to index a very long way off the other end of the array!).

In non-ANSI dialects of C there is no way to explicitly declare a signed char, so plain chars tend to be signed by default (as with the ARM C compiler in -pcc mode). In ANSI C, a char may be plain, signed or unsigned, so a plain char tends to be whatever is most natural for the target (unsigned char on the ARM).
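A portable version of the table lookup casts the index to unsigned char, so the code is correct whether plain char is signed or not. The identity mapping and the names tr_init and translate below are illustrative:

```c
/* Translation table made explicitly unsigned; filled here with an
   illustrative identity mapping. */
static unsigned char tr_tab[256];

void tr_init(void)
{
    int i;
    for (i = 0; i < 256; i++)
        tr_tab[i] = (unsigned char)i;
}

int translate(char c)
{
    /* A plain char holding, say, 0xE9 may be negative on a
       signed-char target; the cast maps it back into 0..255. */
    return tr_tab[(unsigned char)c];
}
```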
The following code will only work on a machine with 'little-endian' byte order:

unsigned a;
char *p = (char *)&a;
unsigned w = AN_ARBITRARY_VALUE;

while (w != 0) /* put w in a */
{ *p++ = w;    /* or, maybe, w byte-reversed... */
  w >>= 8;
}

The best solution to this class of problems is either to write code which does not rely on byte order, or to have separate code to deal appropriately with the different byte orders.
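The first approach can be sketched as follows: pick an explicit external byte order (little-endian here) and move bytes with shifts and masks, so the code behaves identically on any host. put32 and get32 are illustrative names, not standard functions:

```c
/* Store a 32-bit value in an explicitly little-endian external order,
   independent of the host's own byte order. */
void put32(unsigned char *p, unsigned long w)
{
    p[0] = (unsigned char)(w & 0xFF);
    p[1] = (unsigned char)((w >> 8) & 0xFF);
    p[2] = (unsigned char)((w >> 16) & 0xFF);
    p[3] = (unsigned char)((w >> 24) & 0xFF);
}

/* Reassemble the value; again independent of host byte order. */
unsigned long get32(const unsigned char *p)
{
    return (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}
```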
The values of holes created by alignment restrictions are undefined, and you should not make assumptions about these values. Strictly, two structures with identical members, each having identical values, will only be found to be equal if field-by-field comparison is used; a byte-by-byte, or word-by-word, comparison need not indicate equality.
In practice, this can be a real problem for both auto structs and structs allocated dynamically using malloc. If byte-by-byte comparability of such structures is required, they must be zeroed using memset() before assigning field values.
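A minimal sketch of that technique, using an illustrative struct type PAIR and initialiser pair_init:

```c
#include <string.h>

/* Zero the whole struct before assigning fields, so any padding bytes
   hold a known value and memcmp() gives a meaningful answer. */
typedef struct { char c; long l; } PAIR;

void pair_init(PAIR *p, char c, long l)
{
    memset(p, 0, sizeof *p); /* clears any padding after c */
    p->c = c;
    p->l = l;
}
```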
Padding may also have implications for the space required by a large array of structs. For example:

#define ARRSIZE 10000

typedef struct
{ int i;
  short s;
} ELEM;

ELEM arr[ARRSIZE];

may require 40KB, 60KB or 80KB depending on the size and alignment of ints and shorts (assume a short occupies 2 bytes, 2-byte aligned; then consider a 2-byte int, a 4-byte int 2-byte aligned, and a 4-byte int 4-byte aligned).
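Rather than assuming a particular layout, portable code can measure it on the current target with sizeof. The helper functions below are illustrative; ELEM matches the struct in the example above:

```c
#define ARRSIZE 10000

typedef struct
{ int i;
  short s;
} ELEM;

/* Bytes actually occupied by the whole array, padding included. */
unsigned long array_bytes(void)
{
    return (unsigned long)sizeof(ELEM) * ARRSIZE;
}

/* Padding the compiler inserted into one ELEM, if any. */
unsigned long elem_padding(void)
{
    return (unsigned long)(sizeof(ELEM) - sizeof(int) - sizeof(short));
}
```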
Taking the difference of two pointers by subtraction is also full of potential portability errors, especially when the difference is large. ANSI C defines a type ptrdiff_t, which is capable of reliably storing the result of subtracting two pointer values of the same type; a typical use of this mechanism is with pointers into the same array.
Although the difference between any two pointers of similar type may be meaningful in a flat address space, only the difference between two pointers into the same object need be meaningful in a segmented address space.
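A minimal sketch of the safe case, using an illustrative function span: the difference is well defined because both pointers address the same object, and ptrdiff_t (rather than int) holds the result, so it cannot overflow on targets where an object may span more bytes than an int can count.

```c
#include <stddef.h>

/* Well-defined pointer difference: both pointers must address the
   same object (here, the same array). */
ptrdiff_t span(const char *p1, const char *p2)
{
    return p2 - p1;
}
```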
There are also problems of evaluation order with address arithmetic. Consider:

long int base, offset;
char *p1, *p2;
...
offset = base + (p2 - p1); /* intended effect */

Now suppose this latter expression were:

offset = (base + p2) - p1;

In a flat address space without holes the expressions are equivalent. In a segmented address space, (p2 - p1) may well be a valid offset within a segment, whereas (base + p2) may be an invalid address. If, in the second case, the validity is checked before subtracting p1, then the expression will fault. This latter class of problem will be familiar to MS-DOS programmers, but alien to those whose main experience is of Unix.
A separate evaluation-order problem afflicts function arguments, whose order of evaluation is unspecified. Consider:

i = 3;
f(i, i++);

it is unclear whether the call is f(3, 3) or f(4, 3).
Of course, it is in general unwise for argument expressions to have side effects, for many reasons.
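Where a side effect is genuinely needed, the ambiguity can be removed by sequencing it in a separate statement. In the sketch below, f2 is an illustrative two-argument function standing in for f:

```c
/* Illustrative callee: encodes both arguments so a test can see the
   order in which they arrived. */
int f2(int a, int b)
{
    return a * 10 + b;
}

int call_unambiguously(void)
{
    int i = 3;
    int old = i++;     /* ';' is a sequence point: i is now 4 */
    return f2(i, old); /* unambiguously f2(4, 3) */
}
```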
File names and file-name processing are common sources of non-portability which are often surprisingly painful to deal with. Again, the best approach is to localise all such processing.
Binary data files are inherently non-portable. Often the only solution to this problem may be the use of some portable external representation.
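One simple portable external representation is plain text: values written with fprintf() and read back with fscanf() do not depend on the producer's word size, byte order or struct padding. save_long and load_long below are illustrative names, a sketch rather than a complete file format:

```c
#include <stdio.h>

/* Write one long as a decimal text line; returns nonzero on success. */
int save_long(FILE *f, long v)
{
    return fprintf(f, "%ld\n", v) > 0;
}

/* Read one long back; returns nonzero on success. */
int load_long(FILE *f, long *v)
{
    return fscanf(f, "%ld", v) == 1;
}
```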