If you have a component (application, module) written in C then it’s relatively easy to add assembler routines to the code and make use of them. But if you have an assembler component then adding C routines to it is a more complicated prospect. There isn’t any de facto method for adding C to an assembler component – each situation tends to be a little different, so the suitability of each approach will vary as well. This page aims to be a mostly informal guide to the problems that you’ll face and some of the ways to work around them.
Note that this guide is mainly concerned with using code compiled by the ROOL DDE/Norcroft – other compilers are likely to have their own requirements.
(WORK IN PROGRESS)
First, a summary of the key problems you’ll face:
Now details about each problem and how to solve it.
There’s no easy solution to this one. If you only need a small amount of functionality, consider writing your own implementations of the code (or ‘borrow’ the CLib implementations if the license conditions of the component you’re working on allow it). If you need to use large amounts of CLib functionality – or if the component isn’t very large (e.g. a single-file assembler module) – consider just rewriting everything in C.
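As a rough illustration of the "write your own implementations" option, here is a minimal sketch of two small routines that stand in for their CLib equivalents. The names asm_memcpy and asm_strlen are hypothetical, chosen only to avoid clashing with the real library symbols; <stddef.h> is used purely for the size_t type and brings in no library code.

```c
#include <stddef.h>

/* Hypothetical local replacement for memcpy: byte-by-byte copy.
   Like memcpy, it does not support overlapping regions. */
void *asm_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--) {
        *d++ = *s++;
    }
    return dst;
}

/* Hypothetical local replacement for strlen: counts the characters
   before the terminating NUL. */
size_t asm_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0') {
        p++;
    }
    return (size_t)(p - s);
}
```

Simple loops like these are unlikely to be as fast as the library versions, so this is only really suitable where a handful of small helpers is all that's needed.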
If you need to use large amounts of CLib functionality, and rewriting the component in C isn’t an option, you might be able to write your own stubs to allow for linking against CLib. However that’s beyond the scope of this guide.
In a linked program (whether that program is an application or a module), any references to code or data are typically represented in the form of absolute addresses. For applications this is fine, because all applications are loaded and executed from a base address of &8000. But modules are by definition relocatable, and so some address relocation is required. There are two address relocation schemes employed by compiled C code:
The linker will automatically generate a special __RelocCode function which will perform pointer fixup on any pointers contained within the image. For a standard C module this is called by the module header/stubs produced by CMHG, as part of the module’s initialisation entry point. For assembler modules you’ll need to call __RelocCode manually.
(TODO warn about register clobbering)
Note that because ROM modules are statically linked to a fixed address, ordinarily no __RelocCode function is generated. However there’s no need for you to switch in/out calling of the function based on your build config – if the linker spots a reference to __RelocCode then it will generate a dummy function that just does nothing.
Technically this issue of pointer fixup applies to both C and assembler modules – if you write an assembler module which contains absolute addresses within its image then you can and should use __RelocCode to fix up those pointers at initialisation time.
The compiler and linker organise writable data so that it’s stored in one block of memory, similar to how assembler modules manage their writable workspace. In an assembler module, R12 is typically used as a pointer to the base of this workspace, and indexed addressing is used to access individual variables. For C, things are a bit different:
Norcroft uses the following strategy to solve these problems:
Standard modules actually use two relocation offsets: the library relocation offset (used for relocating CLib’s internal workspace) and the module/client relocation offset (used for relocating your program’s workspace). These are at SL-540 and SL-536 respectively (and SL itself is nominally set at 560 bytes above the actual end of the stack). But when calling C from assembler, there are three important things to realise:
In practical terms, this means that you can use the following approach when calling C from assembler:
By storing the relocation offset in your workspace, you also have a handy way of getting at your module workspace if the C code calls back into the assembler (SUB R12,SL,#:INDEX:CRelocOffset). You could even take this one step further and store the relocation offset as the first entry in your workspace, and use R10 as your workspace pointer instead of R12.
Note that even if your program marks all static-storage variables (i.e. ‘static’ variables and any variables defined outside of a function) as ‘const’ you’ll probably find that the compiler has stored some of them in either the read-write or zero-init areas of the image and is attempting to use runtime relocation offsets to access them. This is because the -zM switch changes the way the compiler assigns variables to areas (non-module code will have all const data placed in the read-only data region). The exact reason why this change occurs is unknown (there’s some related discussion here).
Related to the above, if you’re not using the standard CLib stubs then your module won’t have any workspace allocated for its non-const C data. However it should be quite straightforward to do this yourself:
There are actually two problems here – the first is that you don’t have a pointer to the assembler workspace, the second is that you don’t know the layout of the workspace.
For a module, if you’re using the “point SL at assembler workspace” trick mentioned above then you could write a small C function which uses inline assembler to convert SL back to a workspace pointer. Or you could have a small assembler function which does the same.
Another alternative (mainly used in HAL code) is to use the __global_reg storage specifier to permanently bind a register to a workspace pointer. But this will reduce the number of registers which the compiler is able to use for general operations.
Lastly, you could ensure that any C functions which require assembler workspace access are explicitly given pointers to the workspace (or to bits of the workspace which they need) – although this could involve refactoring lots of code, it may make things easier when dealing with the second problem of not knowing the layout of the assembler workspace.
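A minimal sketch of the explicit-pointer approach is shown below. The rx_stats_t struct and rx_record function are hypothetical; the idea is that the assembler caller passes the address of just the part of its workspace that the C code needs as the first argument (a1/R0), so the C side never has to know the full workspace layout.

```c
#include <stdint.h>

/* Hypothetical sub-block of the assembler workspace. Only these members
   are visible to C; the rest of the workspace stays private to the
   assembler code. */
typedef struct {
    uint32_t rx_count;
    uint32_t rx_errors;
} rx_stats_t;

/* Intended to be called from assembler with a1 pointing at the stats
   block inside the workspace. Returns the updated packet count. */
uint32_t rx_record(rx_stats_t *stats, int ok)
{
    stats->rx_count++;
    if (!ok) {
        stats->rx_errors++;
    }
    return stats->rx_count;
}
```

Because the function only touches memory it is handed, it needs no static data of its own, which also sidesteps the relocation issues described earlier.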
The RISC OS build system has a Hdr2H tool which is able to convert definitions found in assembler ‘header’ files to C equivalents. However it’s mainly geared towards numeric constants, rather than struct layouts – after all, there is no formal method for declaring a struct with typed members using objasm.
So although you might be able to use Hdr2H to get some details of your assembler workspace into C, it’s far from a complete solution. Existing “C in assembler” components where the C needs access to the assembler workspace seem to fall back to just having two copies of the workspace definition, one in C and one in assembler, and requiring developers to manually keep the definitions in sync. Rearranging the workspace so that all the members which C requires are at the start can help with this – the C workspace struct can then avoid defining the assembler-only members.
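When two copies of the definition have to be kept in sync by hand, one way to catch mistakes early is to have the C side check its struct offsets at compile time. The sketch below assumes the assembler offsets are available to C as numeric constants (for example, exported via Hdr2H); the WS_* names, their values, and the struct members are all hypothetical. Fixed-width types are used so the layout doesn't depend on the compiler's choice of int size, assuming <stdint.h> is available.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offsets standing in for values taken from the assembler
   workspace definition. */
#define WS_Flags      0
#define WS_TickCount  4
#define WS_IRQMask    8

/* C view of the start of the workspace only. The assembler-only members
   that follow are deliberately left out. */
typedef struct {
    uint32_t flags;
    uint32_t tick_count;
    uint32_t irq_mask;
} workspace_head_t;

/* Compile-time check: the array size becomes -1 (a compile error) if the
   C offset of 'member' disagrees with the assembler constant. */
#define WS_CHECK(member, offset) \
    typedef char ws_check_##member[ \
        (offsetof(workspace_head_t, member) == (offset)) ? 1 : -1]

WS_CHECK(flags, WS_Flags);
WS_CHECK(tick_count, WS_TickCount);
WS_CHECK(irq_mask, WS_IRQMask);
```

This doesn't remove the duplication, but it turns a silent layout mismatch into a build failure.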
A more reliable solution to this problem might be a H2Hdr tool which is able to parse (simple) C structure definitions and convert them to equivalent assembler definitions. Or, since objasm’s macros are quite powerful, it might be possible to implement a macro-based solution (place all the structure and constant definitions in their own file, using a special syntax; use different macros in C and assembler to interpret that file and produce the appropriate definitions).
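To give a feel for what the C half of such a macro-based scheme might look like, here is a sketch using the "X-macro" idiom: a single field list is expanded twice, once into a struct and once into a set of offset constants. Everything here is hypothetical (the WORKSPACE_FIELDS list, the field names, the WS_OFF_ prefix), the list is shown inline rather than in its own file, and the matching objasm macros are not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared field list. Each entry is X(type, name). */
#define WORKSPACE_FIELDS(X) \
    X(uint32_t, flags)      \
    X(uint32_t, tick_count) \
    X(uint32_t, irq_mask)

/* Expansion 1: turn each entry into a struct member. */
#define AS_MEMBER(type, name) type name;
typedef struct {
    WORKSPACE_FIELDS(AS_MEMBER)
} shared_ws_t;

/* Expansion 2: turn each entry into an offset constant, finishing with
   the total size of the struct. */
#define AS_OFFSET(type, name) WS_OFF_##name = offsetof(shared_ws_t, name),
enum {
    WORKSPACE_FIELDS(AS_OFFSET)
    WS_SIZE = sizeof(shared_ws_t)
};
```

The attraction of this arrangement is that adding a field means editing one list rather than two definitions; whether the same list can be made digestible by objasm macros is left open here.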
Another solution – albeit a very disruptive one for existing projects – would be to avoid using a fixed workspace definition altogether. All your workspace becomes exported C variables, and the assembler code imports those variables and applies the necessary relocation offsets. Objasm macros would ease the implementation of such a system, but it’s probably a bit too nasty to be worth implementing.
Compared to some of the other problems listed here, this one’s pretty straightforward to solve. But if you find that lots of places need to call lots of (different) bits of C then it might become a bit of a hassle.
Before calling a C function:
After calling a C function:
If lots of places need to call into the same C function, it’s probably worth creating a small assembler wrapper function which performs the above actions.
Using C in the HAL or kernel is in some ways more straightforward than using it from modules. The HAL and kernel aren’t relocatable, so a lot of the problems related to code/data relocation go away.
Don’t specify -zM to the compiler. Instead, specify an APCS variant that lacks software stack checking (e.g. -APCS 3/32bit/nofp/noswst). To provide the C code with access to the HAL workspace, declare the HAL workspace as a struct, and in a common header use __global_reg(6) halworkspace_t *sb; to declare that v6 (i.e. SB in the HAL calling standards) is a pointer to the workspace struct.
Because the HAL calling standards have been based on APCS, this means that you can implement HAL functions directly in C – no assembler wrappers are required. It also means assembler HAL code can call into C HAL code (and vice-versa) without any special interfacing required. HAL C code can also call OS entry points without any special interfacing. However, you do need to make sure that any piece of code (whether that’s C or assembler) which uses HAL workspace is never called via C code which was compiled without the __global_reg definition. Otherwise the intermediate function may have modified SB from its original value.
Because the HAL image isn’t relocatable there’s no need to call __RelocCode on startup.
Because we aren’t specifying -zM, const data should be correctly placed in the const data section. However, read-write or zero-init data will still be problematic; the initial read-write data will still be output as part of the image, but there won’t be any runtime code for relocating references to it, and the ROM image will be read-only in memory so there’s no way the HAL can write to it. So you will need to make sure that all non-const data required by the C code is placed in the HAL workspace.
Currently there isn’t any C code in the kernel, so there’s nothing to use as a reference. However, it’s possible to theorise how C code could be introduced.
Similar to the HAL, the code should be compiled with -APCS 3/32bit/nofp/noswst (and no -zM). Unlike the HAL or modules where the writable workspace can be at any location, the kernel generally uses workspace locations that are fixed at build time. This avoids the need for using __global_reg to reserve a register for pointing to the workspace. Instead, #defines could be used to cast (hardcoded) addresses to pointers to the various workspace structs.
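The #define approach might look something like the sketch below. The kernel_ws_t layout, the KernelWS macro name and the address 0xFFFF0000 are all placeholders invented for illustration; real kernel workspace addresses and layouts would have to come from the kernel's own definitions.

```c
#include <stdint.h>

/* Hypothetical workspace layout. */
typedef struct {
    uint32_t flags;
    uint32_t tick_count;
} kernel_ws_t;

/* Placeholder address, fixed at build time. Not a real kernel address. */
#define KERNEL_WS_ADDRESS 0xFFFF0000u

/* Cast the hardcoded address to a pointer to the workspace struct. */
#define KernelWS ((kernel_ws_t *)(uintptr_t)KERNEL_WS_ADDRESS)

/* Example accessor: no workspace register is needed, because the
   compiler can materialise the address as a constant. */
uint32_t kernel_read_ticks(void)
{
    return KernelWS->tick_count;
}
```

If the workspace is also updated from interrupt handlers, the relevant members would probably need to be declared volatile as well.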
Calling the HAL from C code would require a small piece of glue logic to set up SB correctly – the inline assembler syntax should be adequate for this.
Because the C compiler cannot generate read-only position-independent code, extreme care will be needed if using C prior to MMU activation – none of the code/data will be at the location that the binary expects it to be at. Generally, anywhere where the compiler obtains the address of some code/data will suffer from problems – e.g. getting a pointer to some const data or getting a pointer to a function.
Although HAL functions can be implemented in C, implementing HAL devices in C (whether in the HAL or a module) is a bit trickier.
HAL functions are called with SB set to the HAL workspace pointer, but HAL devices aren’t. For assembler HAL devices this is usually tackled by storing a pointer to the workspace in the HAL device’s structure (which will always be passed in to the device calls in R0/A1). For C HAL devices, if __global_reg(6) halworkspace_t *sb; is used to bring in SB, you might be tempted to simply assign ‘sb’ to the workspace pointer at the start of the device function. This will correctly set up sb (i.e. R9) to point to the workspace, but the compiler won’t restore the old value of R9 on exit. So care will be needed to save and restore the value of the register/variable for each of your device entry points.
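One way the save/restore could be written directly in C is sketched below. The struct layouts, the workspace member of the device struct, and the function names are hypothetical. The sketch also assumes the __CC_NORCROFT predefined macro can be used to detect the Norcroft compiler; on other compilers an ordinary global variable stands in for the register so that the logic can be tried out on a host machine. Whether a plain C save/restore like this is sufficient in every case is an assumption, so treat it as a starting point rather than a recipe.

```c
/* Hypothetical HAL workspace with a single counter. */
typedef struct {
    unsigned int device_calls;
} halworkspace_t;

#ifdef __CC_NORCROFT
__global_reg(6) halworkspace_t *sb;   /* v6 (R9) bound to the workspace */
#else
halworkspace_t *sb;                   /* host stand-in for the register */
#endif

/* Hypothetical device struct; a real one would start with the standard
   HAL device header. */
typedef struct {
    halworkspace_t *workspace;
} my_device_t;

/* Stands in for HAL code that relies on sb being valid. */
static unsigned int count_call(void)
{
    return ++sb->device_calls;
}

/* Device entry point: save the caller's R9, install our workspace
   pointer, do the work, then put the caller's value back. */
unsigned int my_device_poll(my_device_t *dev)
{
    halworkspace_t *caller_sb = sb;
    unsigned int result;

    sb = dev->workspace;
    result = count_call();
    sb = caller_sb;
    return result;
}
```

Every exit path from the entry point has to restore the saved value, so functions with early returns are probably best restructured to have a single exit.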
For device entry points which only use a1-a4 for receiving arguments it should be possible to write a small objasm macro that generates a stub assembler function to wrap the C function. Entry points which require stacked arguments will require more work – the arguments may have to be copied to below the R9 & R14 values the stub pushes onto the stack.
Of course if you’re not going to call any HAL code which requires sb, you could always forgo defining and setting the workspace pointer.
If -zM is in use then you’ll almost certainly need to use assembler stub wrappers for all the device entry points.
For “C in assembler” modules these would load the SL value from the device structure and set FP to 0 as described in previous sections.
For regular C modules, it may be tempting to use a CMHG veneer to handle setting up the relocation offsets on the stack (with an assembler stub to first get the module workspace pointer into R12) – but this will have the downside that (in C) all your device entry points will be accepting their arguments via _kernel_swi_regs rather than via standard function arguments. It will also prevent you from having any functions which use stacked arguments – since the CMHG veneers don’t expose the original stack pointer. So instead, you’re likely to want to use custom assembler veneers to set up the C relocation offsets. Similar to the case of C HAL devices in the HAL, it should be straightforward to use objasm macros to generate the required code for you.