FP support
Stuart Swales (8827) 1257 posts |
It might help if we can persuade our compiler-maintaining friends to have the compiler predefine a suitable macro if -apcs /softfp is specified, much as it already does for __APCS_32, __APCS_NOFP etc. |
Stuart Swales (8827) 1257 posts |
If you think that this is a worthwhile thing, I will press on with it. It may be that implementing more of the calls which still go to the SharedCLibrary (which always executes FPA) with VFP alternatives might further help RiscOSM. It can only improve overall! |
Sprow (202) 1129 posts |
Try |
Matthew Phillips (473) 690 posts |
It certainly gives a worthwhile speed improvement at very little cost in terms of altering source code. |
Chris Evans (457) 1614 posts |
Therein lies the problem. Which group is the majority? I’m sure that most RISC OS users who regularly use it are doing so on an Iyonix or earlier, though many of them do not update their software. Which group is larger I don’t know! |
Stuart Swales (8827) 1257 posts |
Thanks – that one wasn’t anywhere near where
If they’re not going to update their software, they won’t ever get to see any improvements! |
Matthew Phillips (473) 690 posts |
Just looking at the practicalities of issuing software in two forms, with/without the softfp solution. Obviously we could just offer two downloads, but if we wanted to offer a single download that automatically loaded the correct !RunImage, what is the easiest way from a !Run file of working out whether VFP is supported? RMEnsure VFPSupport and note whether it’s available? Or can a machine with no VFP still have a VFPSupport module? I see in apcs_softpcs.c that you try VFPSupport_CheckContext, so if it is necessary to go a bit further than RMEnsure to be able to be sure, perhaps a bit of BASIC or a Utility that checks that SWI and sets a system variable for the !Run file to pick up would be sensible. |
Stuart Swales (8827) 1257 posts |
The VFPSupport module will refuse to initialise unless suitable hardware is present, so I’d use that. If in the future someone was crazy enough (I hope not, it would be a hell of a lot of work for very little benefit) to implement a VFPSupport module to support emulation of VFP instructions on non-VFP hardware systems, it might then be sensible to do further checks. [Sorry not to respond earlier but the forum seems a bit wonky at present in indicating new content for me.] |
Stuart Swales (8827) 1257 posts |
Trying to drag another thread (https://www.riscosopen.org/forum/forums/1/topics/16885?page=1#posts-128438) on topic, I shall repost these points here: We need to prod some vendors (are you listening?) into updating their VFPSupport modules: to at least 0.16 (28 Jun 2021) [which adds elementary function acceleration using VFP] and ideally to 0.17 (03 Nov 2021) [pow bug fix with NaNs]. It’s not just RiscOSM that benefits from newer VFPSupport, BASIC VI VFP does too:
OK, so I got round to building the current VFPSupport 0.17 module from the ROOL source for RPi and ARMX6: https://www.croftnuisk.co.uk/coltsoft-downloads/other/vfpsupport-0_17.zip Though, as before, I would recommend using an updated nightly ROM if possible.
It would be feasible for someone to hook into the same set of function calls emitted by the Norcroft C compiler with -apcs /softfp (the clue is in the PCS name!) that my VFP library does, and provide a software floating point implementation of the basic ops such as _dadd in a different library. That’d probably help code under emulation as it’s just then plain old ARM code that the dynamic recompiler can have a go at just as with any other code, without the overhead of taking undefined instruction exceptions as used by existing FPA code with FPEmulator. It would also help on non-VFP systems that don’t have FPA hardware (that’s most of them – does anyone at all on here have a system with hardware FPA that’s still in daily use?). |
Matthew Phillips (473) 690 posts |
I’ve run into a problem when using variables of type int64_t and converting to double:
int64_t i;
for (i = 3; i>=-3; i--)
    printf("%lld = %f", i, (double) i);
This is giving me: This is not what I want! Does your apcs_softpcs code support int64_t? |
Stuart Swales (8827) 1257 posts |
Ooh. The Norcroft C compiler doesn’t generate code to go via any of the (a) int64_t isn’t unsigned, so I’d have expected the compiler to generate a call to (b) This could be handled by a suitable casting function instead – I’d best get on and concoct one! [Edit: I had written “Unfortunately It would help if that bit of the SharedCLibrary source code (in s.longlong) had some comments…] [Edit2: It might be a while till a compiler fix is available; are the places where you cast int64_t to double easy to find? I thought an assembler veneer might be necessary but a couple of lines of C ought to do the trick.] |
Stuart Swales (8827) 1257 posts |
Here’s some code that shows a zero-overhead workaround for this bug:
Compile with To use in your own code, plop the |
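For readers following along without the attachment, here is a minimal portable sketch of the general idea of converting a signed 64-bit integer to double without a direct 64-bit cast. It is not the workaround posted above and not the library's own routine: the name i64_to_double_sketch is made up for illustration, and whether this form sidesteps the compiler bug under -apcs /softfp has not been checked.

```c
#include <stdint.h>

/* Hypothetical helper (not from apcs_softpcs): convert a signed 64-bit
 * integer to double using only 32-bit-to-double conversions. */
double i64_to_double_sketch(int64_t v)
{
    const int negative = (v < 0);
    /* Take the magnitude in unsigned arithmetic; this is well defined
     * even for INT64_MIN, where the magnitude is 2^63. */
    const uint64_t mag = negative ? (uint64_t)0 - (uint64_t)v : (uint64_t)v;
    const uint32_t hi = (uint32_t)(mag >> 32);
    const uint32_t lo = (uint32_t)(mag & 0xFFFFFFFFu);
    /* hi * 2^32 is exact in a double, so only the final add can round. */
    const double d = (double)hi * 4294967296.0 + (double)lo;
    return negative ? -d : d;
}
```

Called in the loop from the earlier post, i64_to_double_sketch(i) should agree with (double) i on a compiler that converts correctly.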
Matthew Phillips (473) 690 posts |
Took a bit of reading through code identifying all the uses of int64_t types. Most of them are ID numbers, which only have addition and subtraction performed on them, and timestamps, which are actually all positive but stored in int64_t values because there is delta compression applied. Turns out there was only one offending use in the whole code, so I just amended as follows:
#if defined(__SOFTFP__)
extern double _ll_sto_d(int64_t i64);
double latDeg = _ll_sto_d(lat * Granularity + LatOffset) / 1e9;
#else
double latDeg = ( (double)(lat * Granularity + LatOffset) ) / 1e9;
#endif
Thank you very much for your help. I have identified and reported two or three compiler bugs over the years, and they always give you a sinking feeling. I don’t think I’d have been able to make head or tail of this one as I don’t know anything about all these internal functions. I will report the problem to ROOL. |
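If there had been more than one offending cast, the same conditional could be hidden behind a macro so each call site stays a one-liner. This is only a sketch: I64_TO_DOUBLE and lat_degrees are hypothetical names, and the extra __CC_NORCROFT test is an assumption added so that other compilers which happen to define __SOFTFP__ do not go looking for _ll_sto_d.

```c
#include <stdint.h>

#if defined(__SOFTFP__) && defined(__CC_NORCROFT)
/* Declared exactly as in the post above; assumed to be supplied at link
 * time when building with -apcs /softfp. */
extern double _ll_sto_d(int64_t i64);
#define I64_TO_DOUBLE(x) _ll_sto_d((int64_t)(x))
#else
/* Everywhere else an ordinary cast is used. */
#define I64_TO_DOUBLE(x) ((double)(int64_t)(x))
#endif

/* Hypothetical stand-in for the latitude calculation quoted above. */
double lat_degrees(int64_t lat, int64_t granularity, int64_t lat_offset)
{
    return I64_TO_DOUBLE(lat * granularity + lat_offset) / 1e9;
}
```

Once the compiler fix is in place, only the macro definition would need to change.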
Matthew Phillips (473) 690 posts |
I’ve encountered a problem when compiling the Parson library with the /softfp option. I’ve been using this library very successfully with applications that don’t do much floating point, so I’ve never tried to compile with -apcs /softfp before. I get the following error: “c.parson”, line 2017: Fatal internal error: FP op 66 unknown Internal inconsistency: either resource shortage or compiler fault. If you cannot alter your program to avoid this failure, please contact your supplier The line in question looks like this:
Is the fabs function not supported by your apcs_softpcs library, perhaps? It’s included in the header file, defined as __apcs_softpcs__d_abs(x) so I would have expected it to work. |
Stuart Swales (8827) 1257 posts |
I do recall some funny with In the meantime, does it help to hack:
to:
as the actual f.p. operator it uses for subtraction depends on the order it’s chosen to evaluate – sometimes it does f.p. equivalent of RSB for convenience. Having not seen the Norcroft source for 34 years I don’t know the ins-and-outs of the actual error reported. Mind you, I managed to get a Fatal internal error: FP Internal inconsistency yesterday on something unrelated, and that was NOT using -apcs /softfp, just normal FPA!… [Edit: isn’t ‘the Parson library’ linked above an earlier unmaintained fork of https://github.com/kgabis/parson ? I can compile the latter with my current development apcs_softpcs using -apcs /softfp, so that’s a start!] |
Rick Murray (539) 13440 posts |
That’s the compiler’s way of pleading for a little respect, begging for VFP support so it can do the job properly. ;) |
Matthew Phillips (473) 690 posts |
Quite right: looks like I’ve not kept up to date. I downloaded the maintained version, but I still get the same error from Norcroft C compiler 5.89 [18 Feb 2022]. These were my flags:
Same problem if I use I’ll have a go at hacking the code a bit. Could always avoid fabs altogether by looking for the value being greater than -0.000001 and smaller than 0.000001 or whatever. [Edit: your suggestion did not help, but changing to check for |
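The range check mentioned above can be sketched like this. The helper name is_near_zero and the choice to pass the tolerance as a parameter are assumptions for illustration, not what the Parson source does.

```c
/* Hypothetical helper: non-zero when x lies strictly inside (-eps, +eps).
 * Uses two comparisons instead of fabs(); a NaN input yields 0. */
int is_near_zero(double x, double eps)
{
    return (x > -eps) && (x < eps);
}
```

A call such as is_near_zero(value, 0.000001) then stands in for comparing fabs(value) against 0.000001.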
Matthew Phillips (473) 690 posts |
Spoke too soon. Although the library compiles as a library, once I try to link to the application I get errors about the following symbols not being defined:
Don’t know why as your softfloat header seems to define them. (I have to say, I don’t understand how your code works at all!) |
Matthew Phillips (473) 690 posts |
Well, I’ve got it all working. There wasn’t really a problem with fabs() at all. I had failed to include apcs_softpcs.h at the right point. Stupid mistake! I managed to work round the problem with ffltu, fmul and ffixu by reasoning that there must be a cast from a float to an unsigned int happening (_ffixu) and then located this line:
changing this to
avoided the problem. |
Stuart Swales (8827) 1257 posts |
Glad you got it sorted. The newer library does have more comprehensive support for float as well as double, I think there were a couple of oddballs missing that I found when I extended the test coverage. |
Stuart Swales (8827) 1257 posts |
Returning to this, I foolishly tried to get PipeDream using the C99 complex type together with /softfp (previously bodged to use my own implementation of complex numbers under C89). Is there any documentation on the |
Rick Murray (539) 13440 posts |
Do you mean this? https://gitlab.riscosopen.org/RiscOS/Sources/Lib/RISC_OSLib/-/blob/master/s/cxsupport |
Stuart Swales (8827) 1257 posts |
That sort of thing, Rick, but as they are compiler-internal functions they aren’t necessarily APCS-compliant, so I need to know which ARM registers the arguments are passed in when in /softfp mode. Having a quick scan of the C compiler binary shows that there are more complex support functions potentially emitted by it that aren’t covered by the SCL (like float and double support, where SCL supports none of the /softfp required functions). e.g. |
Rick Murray (539) 13440 posts |
So those functions are output by the compiler itself, rather than being library functions? Could you write test code that uses them, then just look to see what is actually output? Hmm, I wonder if it would work to give the compiler the -S option to get it to write assembler as text, rather than a binary file for the linker…? |
Stuart Swales (8827) 1257 posts |
Indeed they are – many normally done as built-in functions that created inline code (compiler doesn’t need to call a library function to add/subtract two complex numbers using FPA as it’s just (LDFD, LDFD, ADFD/SUFD)(x2), but it does need a helper function to multiply them – _cxd_mul). It’s easy enough to create code that produces uses of many of these functions, but there’s no way to ensure that you’ve got full coverage without a specification (they aren’t all in the same place, there are at least two blocks of these functions, maybe more) as I found when adding double support – I got most of them first time round but missed a couple. As an aside, here’s a good one for the ‘compilers are better than humans’:
; r0 and r1 point to two complex numbers (pairs of IEEE doubles)
; r2 points to where I want the result
MOV r7,r2
; it's already made room on the stack as the
; second complex parameter needs passing on the stack
LDM r1,{r1-r3,r12}
MOV lr,sp
STM lr,{r1-r3,r12} ; oh, what's matter with STM sp,{r1-r3,r12}
; first complex parameter is passed in registers
; as (r0,r1) real part and (r2,r3) imaginary part
MOV r3,r0
LDMIB r3,{r1-r3}
LDR r0,[r0] ; er, again, LDM r0,{r0-r3} replaces all three???
BL _cxd_add
STM r7,{r0-r3} ; store result (yay) |
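To make the arithmetic behind those helpers concrete, here is a C89-style sketch of complex add and multiply on (real, imaginary) pairs of doubles. The type cx_double and the function names are hypothetical; this is the textbook formula only and says nothing about how the real _cxd_add and _cxd_mul pass their arguments or treat infinities and NaNs.

```c
/* Hypothetical C89-style complex type: a (real, imaginary) pair. */
typedef struct { double re, im; } cx_double;

/* Addition is component-wise, which is why a compiler can inline it. */
cx_double cx_add(cx_double a, cx_double b)
{
    cx_double r;
    r.re = a.re + b.re;
    r.im = a.im + b.im;
    return r;
}

/* (a.re + i a.im)(b.re + i b.im), expanded with i*i = -1.
 * Four multiplies and two adds, hence the out-of-line helper. */
cx_double cx_mul(cx_double a, cx_double b)
{
    cx_double r;
    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}
```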
Cameron Cawley (3514) 144 posts |
Do you have a brief summary of what the current status of the apcs_softpcs library is, and what is still left to do? |
Stuart Swales (8827) 1257 posts |
The apcs_softpcs library has coverage of double (and float) operators, compiler support functions and standard library functions (to C99). No complex double support yet (none planned, I think it would only be me who’d use it). Note that use of /softfp with Norcroft C is not officially supported on RISC OS, but ROOL have been very helpful in fixing compiler bugs in this area. The current stable library sources can be downloaded from: http://croftnuisk.co.uk/exports/apcs_softpcs_20231218.zip Here’s a snippet from the top-level ReadMe to save downloading:
What is it?
apcs_softpcs provides support for the Norcroft C compiler for RISC OS when used with the softpcs procedure call standard, avoiding use of FPA floating point instructions in the compiled code, with ARM registers used to return floating point values from functions (and also to pass in floating point parameters, as already happens with APCS-32). double values are returned (and double parameters are passed) in ARM registers in FPA word order.
apcs_softpcs provides functions to support all the basic floating point operators that may be emitted by the compiler (e.g. VFP instructions are used to implement all of those basic operators for both double and single precision. Note that no separate long double support is needed for these as it is equivalent to double on ARM Norcroft C. However, if VFP is not present on the system on which an application created with apcs_softpcs is run, apcs_softpcs will automatically fall back to using FPA instructions to implement these operators.
apcs_softpcs also provides functions which wrap those standard C library functions which return double (or float) values in floating point register VFP instructions (and even just ARM instructions in some cases) are used to implement many of these functions without using the C library fallback, see the ReadMeMore file.
If a new enough VFPSupport module (0.15 on) is present on the system on which an application created with apcs_softpcs is run, apcs_softpcs will use the elementary function tables provided by the VFPSupport module to implement the transcendental functions (cos, exp, and friends) for both double and single precision, otherwise falling back to using the C library code as before. VFPSupport module 0.17 or later is recommended.
Performance?
I obtained a 5-6 times speed improvement when inverting large matrices in PipeDream using VFP arithmetic provided by apcs_softpcs support compared to a standard FPA build using FPEmulator on the same system (ARMX6). If VFP is not available on the system on which an application created with apcs_softpcs is run, there is a small performance penalty for falling back to using FPA instructions for compiler support functions, especially for systems with FPA floating point hardware (that’s very few of them!). This also applies to standard C library math functions. |
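As a rough illustration of what "FPA word order" means for a double carried in a pair of 32-bit registers, the sketch below splits a double into two words with the most significant word (sign and exponent) first and joins them again. That reading of FPA word order, the names fpa_words, double_to_fpa_words and fpa_words_to_double, and the assumption that the host stores an IEEE 754 double with the same byte order as a uint64_t are all mine, not taken from the ReadMe.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical pair of 32-bit words in the order they would occupy two
 * consecutive registers. */
typedef struct { uint32_t first, second; } fpa_words;

/* Split a double so that the word holding sign and exponent comes first. */
fpa_words double_to_fpa_words(double d)
{
    uint64_t bits;
    fpa_words w;
    memcpy(&bits, &d, sizeof bits);            /* view the bit pattern */
    w.first = (uint32_t)(bits >> 32);          /* sign, exponent, top of mantissa */
    w.second = (uint32_t)(bits & 0xFFFFFFFFu); /* low 32 bits of mantissa */
    return w;
}

/* Inverse of double_to_fpa_words. */
double fpa_words_to_double(fpa_words w)
{
    const uint64_t bits = ((uint64_t)w.first << 32) | w.second;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Code written for one word order and linked against code expecting the other would silently swap the two halves, so it is worth knowing which order a given interface uses.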
Colin Ferris (399) 1755 posts |
Did someone mention FPE used in Draw! A use for Stuart’s Lib :-) |
Stuart Swales (8827) 1257 posts |
The correct thing to do is to eliminate the (repeated, can be sensible for transform setup) FP use there for the win. |
Matthew Phillips (473) 690 posts |
RiscOSM is routinely compiled with the apcs_softpcs library and the application !Run file will automatically select the appropriate !RunImage for your machine. You can tell if you are running the VFP version by checking the Info box to see if the version field includes VFP. Stuart was very helpful getting this working, and ROOL sorted out a compiler bug we encountered. |