Reentrancy in C modules
Richard Walker (2090) 416 posts |
Hi all, How do you code a reentrant module SWI handler in C? Without implementing anything specific for reentrancy, my SWIs generate Aborts and FPA exceptions if reentered. Is there a guideline that should be followed, or specific practices/C functions etc. that should be avoided? I have been studying EtherUSB but didn’t spot anything. For convenience, my code is here https://github.com/riscos-richard/USBJoystick The code talks to USB devices with DeviceFS and Buffer Manager, using UpCallV as a cue to read the buffers. It implements some Joystick_ SWIs, and a few others with the help of UKSWIV. It also uses ByteV to emulate the Acorn I/O podule interface (OSByte 128 and ADVAL in BASIC). Here is the SWI handler Here is an example SWI, Joystick_Read Here is my UpCallV handler I can provide a zip archive if that would help. Cheers, |
Jeffrey Lee (213) 6046 posts |
Floating point use from modules is a bit icky because the veneers that CMHG generates don’t make any effort to set the environment up correctly. Essentially you need to save & restore the FPSR & F0-F3 around any code that uses floating point (preferably outside the functions that use FP, to ensure the compiler doesn’t reorder things). https://www.riscosopen.org/viewer/view/castle/RiscOS/Sources/Lib/AsmUtils/s/modulefp?rev=1.2 Note that the modulefp code linked to above says that “It cannot be called from interrupt routines” – I assume this is because FPEmulator isn’t re-entrant itself (i.e. an IRQ may have happened while FPEmulator was in the middle of emulating an instruction). However, I believe that creating your own FPEmulator context will allow you to work around that, since those docs explicitly mention using FP from interrupt handlers. So instead of using the modulefp code to swap contexts you’ll be using FPEmulator. (But if your module itself is re-entrant then you run into the problem that your module may need multiple FP contexts to handle the re-entrancy, or swap the FPEmulator context and use modulefp to allow for nesting within that context.) There’s also the note in the above doc that “floating point instructions will enable interrupts if they are handled by the emulator”, so that will place some limits on when & where you’re allowed to use FP. It looks like you’re only using floats in a handful of places, so replacing them with fixed-point math may be the easier solution to the FP problems. Other than that, there’s the usual thing of making sure the functions/SWIs your code is calling are suitably re-entrant – which may be a bit tricky since we don’t have up-to-date documentation for that kind of thing. And of course if there are data structures in your code which need protection then you’ll have to work out the best way of dealing with that. 
The old-fashioned way would be to enforce atomicity by disabling interrupts and avoiding functions that will cause re-entrance, the newer way would be to use proper synchronisation primitives like those offered by SyncLib (although SyncLib itself is also a bit primitive, so mistakes may hard lock your machine) |
Jon Abbott (1421) 2599 posts |
Reading between the lines, avoid the FPA and external functions? I suspect in this case it’s FPEmulator that’s causing the issues, as it will almost certainly be reentered. If FPEmulator is also enabling IRQ, that’s potentially going to be an issue for games that have previously disabled it for whatever reason. The sources of the reentrancy are twofold: games that query the joystick under IRQ, and ADFFS, which queries the joystick state to set the Econet registers every VSync (for RTFM). I mitigated ADFFS by simply avoiding the state check if Joystick_Read was already active, but I guess we can’t control when FPA instructions are used in games, so we need to avoid all use of the FPA. |
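The mitigation mentioned in the post above (skipping the call when Joystick_Read is already active) can be sketched as a simple busy flag. This is a minimal illustration in portable C, not code from USBJoystick or ADFFS: `read_active`, `guarded_read` and the demo helpers are hypothetical names, and it assumes strictly nested re-entry, i.e. an interrupt handler that runs to completion before the interrupted code resumes. It does nothing about floating point state.

```c
#include <stdbool.h>

/* Hypothetical flag: true while a read is in progress. */
static volatile bool read_active = false;

typedef int (*read_fn)(void);

/* Runs do_read() unless a read is already in progress.
 * Returns 0 and stores the result on success, -1 if re-entered. */
int guarded_read(read_fn do_read, int *state_out)
{
    if (read_active) {
        return -1;              /* already busy: back off, do not touch state */
    }
    read_active = true;
    *state_out = do_read();
    read_active = false;
    return 0;
}

/* Demo helpers (assumptions for illustration only). */
static int plain_read(void)
{
    return 42;
}

static int nested_status = 0;

/* Simulates an interrupt arriving mid-read and calling back in. */
static int read_that_is_interrupted(void)
{
    int ignored = 0;
    nested_status = guarded_read(plain_read, &ignored);
    return 7;
}
```

The nested call is refused while the outer call completes normally. A caller that gets -1 has to decide what to do instead, for example reuse the last known state.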
Rick Murray (539) 13405 posts |
Which means if you’re writing a module in C, avoid using floats. There’s no fixed point library, the compiler generates FPA instructions. It’s useful to know a few tricks to get around the need for floats, often for speed reasons. For instance when working out a percentage, say, if you multiply by 100 first and then do the division, it’ll lose some accuracy (not important if you’re only looking for a number between 0 and 100) but does mean the entire calculation can be performed using integers. Stuff like that… |
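The percentage trick above can be sketched as follows. The function names are made up for illustration; the 64-bit intermediate is an extra precaution (an assumption, not something the post asks for) so that `part * 100` cannot overflow.

```c
#include <stdint.h>

/* Multiply first, then divide: integer-only, truncates towards zero.
 * Returns 0 if total is 0 to avoid dividing by zero. */
unsigned int percent_of(unsigned int part, unsigned int total)
{
    if (total == 0) {
        return 0;
    }
    return (unsigned int)(((uint64_t)part * 100u) / total);
}

/* Divide first, for comparison: the fraction is lost before scaling. */
unsigned int percent_divide_first(unsigned int part, unsigned int total)
{
    if (total == 0) {
        return 0;
    }
    return (part / total) * 100u;
}
```

With `part = 1` and `total = 3`, multiplying first gives 33, while dividing first gives 0 because `1 / 3` truncates to zero before the multiply.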
Richard Walker (2090) 416 posts |
Chaps, Thanks for the comments. I was getting to the same point – remove my floats. Certainly from Jeffrey’s suggestions, that sounds like the simplest place to start. The main use is a scaling algorithm so that your joystick axis values are translated to the fixed ranges used by the legacy APIs. For example, an axis may report between -512 and +511 (zero for centre) and I have to mould that to 0 to 65535 or -127 to 127. I have an int-only version, but it isn’t as accurate, which is why I used floats in the first place. I will knock up an int build. I’ll be back… |
Rick Murray (539) 13405 posts |
Um… Correct me if I’m wrong, but isn’t converting +/-512 to +/-127 a simple shift operation? As for the other range, add 512 to give a number between 0 and 1023 and then use a shift to scale it up to 0-64K. No floats necessary here! |
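A minimal sketch of the shift-only version suggested above, assuming the axis really does report -512 to +511. The function names are hypothetical. The offset is added before shifting so that only non-negative values are shifted, since right-shifting a negative int is implementation-defined in C.

```c
/* Maps -512..511 onto -128..127 (one step wider than -127..127). */
int axis_to_signed_8(int axis)
{
    return ((axis + 512) >> 2) - 128;   /* 0..1023 -> 0..255 -> -128..127 */
}

/* Maps -512..511 onto 0..65472 (1023 << 6), so the top of the
 * 0..65535 range is never quite reached. */
int axis_to_unsigned_16(int axis)
{
    return (axis + 512) << 6;           /* 0..1023 -> 0..65472 */
}
```

The small shortfall at the top of the 16-bit range is the price of avoiding a multiply and divide.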
Rick Murray (539) 13405 posts |
Jon – I notice you quote Interdictor and its sequel as using floating point. I began wondering how many games actually wanted to use the FPA, and how many simply contained FP ops as that is how the C compiler handled floating point numbers? |
John Williams (567) 768 posts |
Rick – for one who claims to be “disnumerate” (or whatever!) you’ve got a very clear grasp of binary shifting – way out of my ball park! I admire your expertise and your general interpretation of events! And I hope Mom’s doing OK! |
Rick Murray (539) 13405 posts |
Funny. Binary shifting is easy. Proper maths is hard. ;-) Essentially you are doubling or halving a number by moving its bits to the left (multiply) or the right (divide). Let’s pick the number 44, my age. In binary that looks like %00101100 (a 32, an 8, and a 4). If we shift two places to the left, the number becomes %10110000 as the bits have all moved two positions leftwards. To fill in the gap, two zero bits are added to the rightmost place (the ones I have indicated in bold). It is all based around powers of two, so one can use a binary shift to multiply 13 by 8 (2×2×2 so shift 3 places to the left) but you can’t multiply 13 by 7… but, wait, you can cheat! Wanna do a blindingly quick multiply by 7? Simply shift three places to the left (multiplies by eight) and then subtract the original number from the result. Division… Is a little bit harder. ;-) |
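The shifts described above look like this in C. The helper names are made up for illustration, and unsigned values are used so that the shifts are fully defined.

```c
/* Shifting left by n places multiplies by 2 to the power n. */
unsigned int shift_left(unsigned int value, unsigned int places)
{
    return value << places;
}

/* Multiply by 8: shift three places to the left (2 x 2 x 2). */
unsigned int times_8(unsigned int n)
{
    return n << 3;
}

/* Multiply by 7: multiply by 8, then subtract the original number. */
unsigned int times_7(unsigned int n)
{
    return (n << 3) - n;
}
```

For the example in the post, `shift_left(44, 2)` turns %00101100 into %10110000, which is 176, and `times_7(13)` gives 91.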
Steve Pampling (1551) 7932 posts |
Back in the 70’s the then neighbour’s son joined the navy barely able to add two numbers. They decided he’d make a good radar tech and decided to teach him more maths. Family, friends and neighbours all decided the navy were insane. They taught him binary, which he did on his fingers initially and then in his head. Way faster than pushing buttons on a calculator. Seems base 10 wasn’t his mind set. |
Jon Abbott (1421) 2599 posts |
Consider yourself corrected, the range comes from the HID descriptor, so it could be anything.
We’re talking the early days here, so I don’t think there was much choice, Norcroft didn’t have internal float routines as far as I know. The games generally come with FPEmulator, so they must have known it was being used. |
Richard Walker (2090) 416 posts |
Rick, yes, as Jon has pointed out, the values can be anything (signed int between 0x0000 and 0xFFFF I think). I just used an off-hand example. I tried out an integer version earlier, and it scales ok-ish if it is scaling up, but utterly fails to scale down. I think I need to detect the need to scale down and divide instead of multiply – or something! |
Jon Abbott (1421) 2599 posts |
We could use Berkeley SoftFloat or code up multiply and divide internally? |
Steffen Huber (91) 1945 posts |
I heard good things about libfixmath, which is MIT-licensed, and said to be very portable. |
Jeffrey Lee (213) 6046 posts |
You’ve got the following code:

    axis->acorn_slope_8 = (float)ACORN_AXIS_VALUES_8 / (float)range;
    axis->acorn_slope_16 = (float)ACORN_AXIS_VALUES_16 / (float)range;
    axis->adc_slope = (float)ADC_AXIS_VALUES / (float)range;

    x = (int) ((float)ACORN_AXIS_MIN_VALUE_8 + ax->acorn_slope_8 * (float)x);
    y = (int) ((float)ACORN_AXIS_MIN_VALUE_8 + ay->acorn_slope_8 * (float)y);

    x = (int) ((float)ACORN_AXIS_MIN_VALUE_16 + ax->acorn_slope_16 * (float)x);
    y = (int) ((float)ACORN_AXIS_MIN_VALUE_16 + ay->acorn_slope_16 * (float)y);

    value = (int) ((float)ADC_AXIS_MIN_VALUE + a->adc_slope * (float)value);

I.e. in all cases you’re just calculating min_value + (axis_values * x) / range.

Unless you’re planning on adding a lot more code, plugging in an extra library to deal with those calculations would be a bit of a waste of time. The naive approach would be to just perform the computation exactly as listed above (i.e. postponing the divide until after the multiply). Potentially you’d need to promote the intermediate result to a 64bit value to avoid overflow for large values.

Or, you could still precompute a scale factor, but instead of using floats use a 64bit fixed point int (32 whole bits, 32 fraction bits, 64bit to avoid issues with large values):

    int64_t acorn_slope_8 = (((int64_t)ACORN_AXIS_VALUES_8) << 32) / range;
    x = (int) (ACORN_AXIS_MIN_VALUE_8 + ((acorn_slope_8 * x) >> 32));

Or with rounding:

    int64_t acorn_slope_8 = ((((int64_t)ACORN_AXIS_VALUES_8) << 32) + (range >> 1)) / range;
    x = (int) (ACORN_AXIS_MIN_VALUE_8 + ((acorn_slope_8 * x + 0x80000000ll) >> 32));
|
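Both approaches from the post above can be put into a self-contained sketch. This is an illustration under stated assumptions rather than the USBJoystick code: the function names and parameters are hypothetical, `x` is assumed to be already offset so that it runs from 0 to `range`, and `range` is assumed to be positive.

```c
#include <stdint.h>

/* Naive approach: multiply first in 64 bits, then divide (truncates). */
int scale_naive(int32_t out_min, int32_t out_values, int32_t range, int32_t x)
{
    if (range <= 0) {
        return out_min;                     /* guard against a bad range */
    }
    return (int)(out_min + ((int64_t)out_values * x) / range);
}

/* Precompute out_values / range as a 32.32 fixed-point slope,
 * rounded to nearest. Returns 0 for a non-positive range. */
int64_t make_slope(int32_t out_values, int32_t range)
{
    if (range <= 0) {
        return 0;
    }
    return ((((int64_t)out_values) << 32) + (range >> 1)) / range;
}

/* Apply the slope; adding 0x80000000 (one half in 32.32) before the
 * shift rounds the result to nearest. Assumes x >= 0. */
int scale_fixed(int32_t out_min, int64_t slope, int32_t x)
{
    return (int)(out_min + ((slope * x + 0x80000000ll) >> 32));
}
```

The same code handles scaling up and scaling down, because the slope simply becomes a fraction below one when `range` is larger than `out_values`. For example, a 0 to 1023 input scaled onto 0 to 65535 maps 512 to 32800 with rounding (32799 with the truncating version), and a 0 to 65535 input scaled onto -127 to 127 maps 32768 to 0.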
Rick Murray (539) 13405 posts |
Ah, the bit that was neglected to be mentioned. ;-) I would still be inclined to add in some code to spot specific cases (such as the ones quoted above) and use addition/bit-shift to do the maths, reverting to the way Jeffrey has suggested for anything else. For example: Input -256 to 255 → Output -1024 to 1023 can be performed with a simple <<2 (highest range will actually be -1024 to 1020, but a bias of +3 could be added to +ve if it makes any real difference in practice). For example: Input 0 to 400, 200 is centre. Is weird. Will need “proper maths”. :-) Aim to optimise when possible… |
Richard Walker (2090) 416 posts |
Ah, I have spotted these extra responses a bit too late! I have just coded up, and posted to JASPP, a version without floats. I think it is close to what Jeffrey is suggesting. I realised my uber-precision wasn’t necessary, and I could achieve reducing the range by a divide (rather than multiply by <1). We shall see! The comments are all appreciated. Although I am a developer by day, it is in a completely different universe, and there are so many new things to be aware of! Regarding optimisations, I am generally of the view that the rule is ‘do not do it’, but if you are an expert then, ‘do not do it yet’! :) |
Rick Murray (539) 13405 posts |
I prefer to think of it as “don’t do something complicated if something simple will do”. ;-) |
Richard Walker (2090) 416 posts |
Ah yes, that angle too. I try to tell myself that the computer can execute pretty much anything quickly enough, but the developer cannot. So I try to ‘optimise’ for the developer reading and understanding it (they are slow and expensive – microprocessors are fast and cheap!). Granted, once the concept works, and you can measurably demonstrate an execution performance issue, then by all means, code up something which runs faster (but is no doubt harder to read). That’s where comments come in! By the way, I would totally accept that my principles may not have been adhered to with USBJoystick: it’s been a bit of a learning curve for me. :) |