RISC OS Open: Forum: FP support

Oct 9, 2021 5:04pm

#if defined(thing) is much more flexible – you can do #if defined(thing) || ANOTHER_THING rather than contorted nested #ifdef.
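To make that concrete, here is a minimal sketch. The macro names HAVE_FAST_MATH, FORCE_FAST_PATH and USE_FAST_PATH are invented for the illustration and are not taken from any real header.

```c
/* Hypothetical feature macros, for illustration only. */
#define FORCE_FAST_PATH 1   /* a value macro: tested by value, not by existence */
/* HAVE_FAST_MATH is deliberately left undefined here. */

/* One combined test using #if defined(...) || ... */
#if defined(HAVE_FAST_MATH) || FORCE_FAST_PATH
#define USE_FAST_PATH 1
#else
#define USE_FAST_PATH 0
#endif

/* The #ifdef-only equivalent needs nesting and a duplicated branch:
 *
 *   #ifdef HAVE_FAST_MATH
 *   #define USE_FAST_PATH 1
 *   #else
 *   #if FORCE_FAST_PATH
 *   #define USE_FAST_PATH 1
 *   #else
 *   #define USE_FAST_PATH 0
 *   #endif
 *   #endif
 */

/* Report which branch the preprocessor selected. */
int fast_path_enabled(void)
{
    return USE_FAST_PATH;
}
```

With FORCE_FAST_PATH set to 1 as above, fast_path_enabled() returns 1 even though HAVE_FAST_MATH is undefined.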

It might help if we can persuade our compiler-maintaining friends to have the compiler predefine a suitable macro if -apcs /softfp is specified, much as it already does for __APCS_32, __APCS_NOFP etc.

Oct 9, 2021 9:10pm

Stuart Swales (8827) 1257 posts

I was wanting to put similar conditions in my application so as to be able to compile soft float and FPE versions, and I was wondering whether I am missing something.

If you think that this is a worthwhile thing, I will press on with it. It may be that implementing more of the calls which still go to the SharedCLibrary (which always executes FPA) with VFP alternatives might further help RiscOSM. It can only improve overall!

Oct 10, 2021 6:54am

Sprow (202) 1129 posts

to have the compiler predefine a suitable macro if -apcs /softfp is specified

Try __SOFTFP__. Same caveats as previously given about using switches found by reading ARM’s compiler manual and experimenting.

Oct 10, 2021 1:44pm

Matthew Phillips (473) 690 posts

If you think that this is a worthwhile thing, I will press on with it.

It certainly gives a worthwhile speed improvement at very little cost in terms of altering source code.

Oct 10, 2021 4:02pm

Chris Evans (457) 1614 posts

Aim to please the majority.

Therein lies the problem. Which group is the majority? I’m sure that most RISC OS users who regularly use it are doing so on an Iyonix or earlier, though many of them do not update their software. Which group is larger I don’t know!

Oct 10, 2021 6:10pm

Stuart Swales (8827) 1257 posts

Try __SOFTFP__

Thanks – that one wasn’t anywhere near where __APCS_32 etc is defined so I missed it in the cc executable.

I’m sure that most RISC OS users who regularly use it are doing so on an Iyonix or earlier, though many of them do not update their software. Which group is larger I don’t know!

If they’re not going to update their software, they won’t ever get to see any improvements!

Oct 10, 2021 9:17pm

Matthew Phillips (473) 690 posts

Just looking at the practicalities of issuing software in two forms, with/without the softfp solution. Obviously we could just offer two downloads, but if we wanted to offer a single download that automatically loaded the correct !RunImage, what is the easiest way from a !Run file of working out whether VFP is supported?

RMEnsure VFPSupport and note whether it’s available?

Or can a machine with no VFP still have a VFPSupport module?

I see in apcs_softpcs.c that you try VFPSupport_CheckContext, so if it is necessary to go a bit further than RMEnsure to be able to be sure, perhaps a bit of BASIC or a Utility that checks that SWI and sets a system variable for the !Run file to pick up would be sensible.

Oct 12, 2021 10:59am

Stuart Swales (8827) 1257 posts

The VFPSupport module will refuse to initialise unless suitable hardware is present, so I’d use that.

If in the future someone were crazy enough (I hope not, it would be a hell of a lot of work for very little benefit) to implement a VFPSupport module to support emulation of VFP instructions on non-VFP hardware systems, it might then be sensible to do further checks.

[Sorry not to respond earlier but the forum seems a bit wonky at present in indicating new content for me.]

Nov 11, 2021 7:14pm

Stuart Swales (8827) 1257 posts

Trying to drag another thread (https://www.riscosopen.org/forum/forums/1/topics/16885?page=1#posts-128438) on topic, I shall repost these points here:

We need to prod some vendors (are you listening?) into updating their VFPSupport modules:

to at least 0.16 (28 Jun 2021) [which adds elementary function acceleration using VFP:
https://gitlab.riscosopen.org/RiscOS/Sources/HWSupport/VFPSupport/-/commit/0dfe15e56d19d40d82a00d696601c889ac7350c4]

and ideally to 0.17 (03 Nov 2021) [pow bug fix with NaNs:
https://gitlab.riscosopen.org/RiscOS/Sources/HWSupport/VFPSupport/-/commit/c6391c0d0fed212e5be951b1c76c6b0b7989d7cc].

It’s not just RiscOSM that benefits from newer VFPSupport, BASIC VI VFP does too:
https://www.riscosopen.org/news/articles/2021/07/10/going-round-in-circles-quickly

From where can we obtain the newer VFPSupport module? Currently I also have version 0.13

OK, so I got round to building the current VFPSupport 0.17 module from the ROOL source for RPi and ARMX6:

https://www.croftnuisk.co.uk/coltsoft-downloads/other/vfpsupport-0_17.zip

Though, as before, I would recommend using an updated nightly ROM if possible.

Would it be possible to do a version for Emulators RiscPC etc instead of using the FPEm module?

It would be feasible for someone to hook into the same set of function calls emitted by the Norcroft C compiler with -apcs /softfp (the clue is in the PCS name!) that my VFP library does, and provide a software floating point implementation of the basic ops such as _dadd in a different library. That’d probably help code under emulation as it’s just then plain old ARM code that the dynamic recompiler can have a go at just as with any other code, without the overhead of taking undefined instruction exceptions as used by existing FPA code with FPEmulator. It would also help on non-VFP systems that don’t have FPA hardware (that’s most of them – does anyone at all on here have a system with hardware FPA that’s still in daily use?).
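As a rough illustration of what "plain old ARM code" floating point looks like, here is a minimal sketch of two operations done purely with integer instructions on the IEEE 754 bit pattern. The names sketch_dneg and sketch_dabs are hypothetical and are not claimed to be helpers the compiler actually emits. The functions take and return double only so that the sketch is portable; a real /softfp helper would receive the value in ARM registers.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of an IEEE 754 double as a 64-bit integer. */
uint64_t sketch_bits_from_double(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Reinterpret a 64-bit integer as an IEEE 754 double. */
double sketch_double_from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

#define SKETCH_SIGN_BIT (UINT64_C(1) << 63)

/* Negation: flip the sign bit, no floating point instruction needed. */
double sketch_dneg(double x)
{
    return sketch_double_from_bits(sketch_bits_from_double(x) ^ SKETCH_SIGN_BIT);
}

/* Absolute value: clear the sign bit. */
double sketch_dabs(double x)
{
    return sketch_double_from_bits(sketch_bits_from_double(x) & ~SKETCH_SIGN_BIT);
}
```

Addition and multiplication need much more work (alignment, rounding, special values), but they follow the same principle: integer operations on the sign, exponent and mantissa fields, with no undefined instruction traps involved.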

Nov 23, 2021 9:56pm

Matthew Phillips (473) 690 posts

I’ve run into a problem when using variables of type int64_t and converting to double:

int64_t i;
for (i = 3; i>=-3; i--)
    printf("%lld = %f\n", i, (double) i);

This is giving me:
3 = 3.000000
2 = 2.000000
1 = 1.000000
0 = 0.000000
-1 = 18446744073709552000.000000
-2 = 18446744073709552000.000000
-3 = 18446744073709552000.000000

This is not what I want! Does your apcs_softpcs code support int64_t?

Nov 23, 2021 10:18pm

Stuart Swales (8827) 1257 posts

Does your apcs_softpcs code support int64_t?

Ooh. The Norcroft C compiler doesn’t generate code to go via any of the -apcs /softfp interfaces that the apcs_softpcs library provides in this case, but for me (in my test code, at least) issues a call to _ll_uto_d(uint64_t) in the SharedCLibrary, which I doubt that we can override.

(a) int64_t isn’t unsigned, so I’d have expected the compiler to generate a call to _ll_sto_d(int64_t).

(b) This could be handled by a suitable casting function instead – I’d best get on and concoct one!

[Edit: I had written “Unfortunately _ll_uto_d returns its result in the FPA F0 register, not {r0,r1}, so is unusable from a -apcs /softfp caller context.” but this isn’t the case – it DOES return a result in {r0,r1}, but the compiler has generated a call to the wrong function.

It would help if that bit of the SharedCLibrary source code (in s.longlong) had some comments…]

[Edit2: It might be a while till a compiler fix is available; are the places where you cast int64_t to double easy to find? I thought an assembler veneer might be necessary but a couple of lines of C ought to do the trick.]
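The reported output is consistent with the value being converted as if it were unsigned. A minimal sketch in standard C that reproduces the effect on any compiler; the function names are made up for the illustration:

```c
#include <stdint.h>

/* What the miscompiled code effectively computes: the 64-bit pattern
 * is treated as an unsigned quantity before conversion. */
double convert_as_unsigned(int64_t i)
{
    return (double) (uint64_t) i;
}

/* What the source code asked for: a signed conversion. */
double convert_as_signed(int64_t i)
{
    return (double) i;
}
```

For -1 the unsigned interpretation is 2^64 - 1, which rounds to 2^64 (about 1.8e19) as a double. Small negative values such as -2 and -3 round to the same double, which matches the three identical lines in the output above. Non-negative values are unaffected, which is why 3 down to 0 looked correct.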

Nov 23, 2021 11:14pm

Stuart Swales (8827) 1257 posts

Here’s some code that shows a zero-overhead workaround for this bug:

/* i64tod-bug.c */

#include <stdint.h>

#if defined(__SOFTFP__)
extern double _ll_sto_d(int64_t i64);
static inline double double_from_int64_t(int64_t i64)
{   double (* convert)(int64_t) = _ll_sto_d;
    return(convert(i64));
}
#else
#define double_from_int64_t(i64) ((double) (i64))
#endif

double
right_fn(int64_t i64)
{
    return double_from_int64_t(i64);
}

double
wrong_fn(int64_t i64)
{
    return (double) i64;
}

#include <stdlib.h>
#include <stdio.h>

int main(int argc, char * argv[])
{
    const char * number = (argc > 1) ? argv[1] : "42";
    int64_t i64 = strtoll(number, NULL, 0);
    double d;

    printf("%s -> int64 %lld\n", number, i64);

    d = right_fn(i64);
    printf("%s -> double %+g Yay?\n", number, d);

    d = wrong_fn(i64);
    printf("%s -> double %+g Boo?\n", number, d);

    return(EXIT_SUCCESS);
}

/* end of i64tod-bug.c */

Compile with -apcs /softfp and link with the library and Stubs as usual.

To use in your own code, plop the double_from_int64_t... lines in a header and call double_from_int64_t() rather than casting. Remember that double can’t store int64_t to full precision anyhoo…
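On the precision point: a double has a 53-bit significand, so not every int64_t survives the trip. A minimal sketch of a check, with a hypothetical function name. It uses a plain cast for portability, so on an affected Norcroft /softfp build you would substitute double_from_int64_t() for the first cast.

```c
#include <stdint.h>

/* Returns 1 if v converts to double and back without change, else 0. */
int round_trips_via_double(int64_t v)
{
    double d = (double) v;

    /* Values close to INT64_MAX round up to 2^63, which is out of
     * range for int64_t, so reject them before converting back. */
    if (d >= 9223372036854775808.0)
        return 0;

    return (int64_t) d == v;
}
```

For example, 9007199254740992 (2^53) round-trips, but 9007199254740993 (2^53 + 1) does not, because it rounds to the nearest representable double.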

Nov 24, 2021 11:49pm

Matthew Phillips (473) 690 posts

are the places where you cast int64_t to double easy to find

Took a bit of reading through code identifying all the uses of int64_t types. Most of them are ID numbers, which only have addition and subtraction performed on them, and timestamps, which are actually all positive but stored in int64_t values because there is delta compression applied.

Turns out there was only one offending use in the whole code, so I just amended as follows:

#if defined(__SOFTFP__)
    extern double _ll_sto_d(int64_t i64);
    double latDeg = _ll_sto_d(lat * Granularity + LatOffset) / 1e9;
#else
    double latDeg = ( (double)(lat * Granularity + LatOffset) ) / 1e9;
#endif

Thank you very much for your help. I have identified and reported two or three compiler bugs over the years, and they always give you a sinking feeling. I don’t think I’d have been able to make head or tail of this one as I don’t know anything about all these internal functions.

I will report the problem to ROOL.

May 30, 2022 11:04pm

Matthew Phillips (473) 690 posts

I’ve encountered a problem when compiling the Parson library with the /softfp option. I’ve been using this library very successfully with applications that don’t do much floating point, so I’ve never tried to compile with -apcs /softfp before.

I get the following error:

“c.parson”, line 2017: Fatal internal error: FP op 66 unknown

Internal inconsistency: either resource shortage or compiler fault. If you cannot alter your program to avoid this failure, please contact your supplier

The line in question looks like this:

return fabs(json_value_get_number(a) - json_value_get_number(b)) < 0.000001;

Is the fabs function not supported by your apcs_softpcs library, perhaps? It’s included in the header file, defined as __apcs_softpcs__d_abs(x) so I would have expected it to work.

May 31, 2022 3:25pm

Stuart Swales (8827) 1257 posts

I do recall some funny with fabs(x) with Norcroft choosing to treat it as a compiler intrinsic anyway and still generating an FPA instruction (operating on undefined FP register contents and also returning result in an unused FP register). I’d done more work on the apcs_softpcs library since the copy you had to address various issues that were giving a couple of similar barfs for me, although I’d stopped short of getting it to a release state (but had done oozles more auto-tests). Will push on with that (now the thunderstorms have passed), as there are punters wanting to try out a Fireworkz built with apcs_softpcs library support.

In the meantime, does it help to hack:

return fabs(json_value_get_number(a) - json_value_get_number(b)) < 0.000001;

to:

double tmp_a = json_value_get_number(a);
double tmp_b = json_value_get_number(b);
double tmp_delta = tmp_a - tmp_b; /* or even tmp_b - tmp_a given fabs... */
return fabs(tmp_delta) < 0.000001;

as the actual f.p. operator it uses for subtraction depends on the order it’s chosen to evaluate – sometimes it does f.p. equivalent of RSB for convenience. Having not seen the Norcroft source for 34 years I don’t know the ins-and-outs of the actual error reported.

Mind you, I managed to get a Fatal internal error: FP Internal inconsistency yesterday on something unrelated, and that was NOT using -apcs /softfp, just normal FPA!…

[Edit: isn’t ‘the Parson library’ linked above an earlier unmaintained fork of https://github.com/kgabis/parson ? I can compile the latter with my current development apcs_softpcs using -apcs /softfp, so that’s a start!]

May 31, 2022 3:35pm

Rick Murray (539) 13440 posts

I managed to get a Fatal internal error: FP Internal inconsistency

That’s the compiler’s way of pleading for a little respect, begging for VFP support so it can do the job properly. ;)

May 31, 2022 7:48pm

Matthew Phillips (473) 690 posts

Isn’t ‘the Parson library’ linked above an earlier unmaintained fork of https://github.com/kgabis/parson ? I can compile the latter with my current development apcs_softpcs using -apcs /softfp, so that’s a start!

Quite right: looks like I’ve not kept up to date. I downloaded the maintained version, but I still get the same error from Norcroft C compiler 5.89 [18 Feb 2022]. These were my flags:

CCflags = -c -depend !Depend -IC: -throwback -c99 -APCS 3/softfp

Same problem if I use -APCS /softfp but I think APCS 3 is the default anyway, so that’s to be expected.

I’ll have a go at hacking the code a bit. Could always avoid fabs altogether by looking for the value being greater than -0.000001 and smaller than 0.000001 or whatever.

[Edit: your suggestion did not help, but changing to check for tmp_delta > -0.000001 && tmp_delta < 0.000001 has got it to compile.]
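For reference, the fabs-free comparison can be wrapped up as a small function. This is only a sketch; the name nearly_equal and the tolerance parameter are illustrative and are not part of Parson.

```c
/* Returns 1 if a and b differ by less than tolerance, without calling fabs().
 * If either value is a NaN both comparisons fail and the result is 0,
 * the same answer the fabs() form would give. */
int nearly_equal(double a, double b, double tolerance)
{
    double delta = a - b;
    return delta > -tolerance && delta < tolerance;
}
```

The line in question would then read: return nearly_equal(json_value_get_number(a), json_value_get_number(b), 0.000001);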

May 31, 2022 8:18pm

Matthew Phillips (473) 690 posts

Spoke too soon. Although the library compiles as a library, once I try to link to the application I get errors about the following symbols not being defined:

_ffltu
_fmul
_ffixu

Don’t know why as your softfloat header seems to define them. (I have to say, I don’t understand how your code works at all!)

May 31, 2022 8:50pm

Matthew Phillips (473) 690 posts

Well, I’ve got it all working.

There wasn’t really a problem with fabs() at all. I had failed to include apcs_softpcs.h at the right point. Stupid mistake!

I managed to work round the problem with ffltu, fmul and ffixu by reasoning that there must be a cast from a float to an unsigned int happening (_ffixu) and then located this line:

object->item_capacity = (unsigned int)(capacity * 0.7f);

changing this to

object->item_capacity = (unsigned int)(capacity * 0.7);

avoided the problem.
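The reason the suffix matters: 0.7f is a float constant, so the whole expression is evaluated in single precision, whereas 0.7 is a double constant and pulls the expression up to double. A minimal sketch of the two forms follows. The function names and the unsigned int parameter type are assumptions for the illustration, and the mapping to the missing helpers (_ffltu for unsigned to float, _fmul for the multiply, _ffixu for float to unsigned) is inferred from the symbol names rather than from any documentation.

```c
/* Single precision: the integer is converted to float, multiplied as
 * float, then converted back to unsigned. */
unsigned int scale_single(unsigned int capacity)
{
    return (unsigned int) (capacity * 0.7f);
}

/* Double precision: the same steps, but using the double helpers. */
unsigned int scale_double(unsigned int capacity)
{
    return (unsigned int) (capacity * 0.7);
}
```

The two forms can differ in the last unit for some inputs because 0.7 is not exactly representable in either format, which is worth bearing in mind when swapping one for the other.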

May 31, 2022 11:38pm

Stuart Swales (8827) 1257 posts

Glad you got it sorted.

The newer library does have more comprehensive support for float as well as double; I think there were a couple of oddballs missing that I found when I extended the test coverage.

Jul 23, 2022 3:08pm

Stuart Swales (8827) 1257 posts

Returning to this, I foolishly tried to get PipeDream using the C99 complex type together with /softfp (previously bodged to use my own implementation of complex numbers under C89).

Is there any documentation on the _cxd_* functions that ARM Norcroft C emits to support C99 complex (e.g. ones like _cxd_add I can guess) so that I can implement them in my library?

Jul 23, 2022 3:20pm

Rick Murray (539) 13440 posts

Do you mean this? https://gitlab.riscosopen.org/RiscOS/Sources/Lib/RISC_OSLib/-/blob/master/s/cxsupport

Jul 23, 2022 3:25pm

Stuart Swales (8827) 1257 posts

That sort of thing, Rick, but as they are compiler-internal functions they aren’t necessarily APCS-compliant, so I need to know which ARM registers the arguments are passed in when in /softfp mode. Having a quick scan of the C compiler binary shows that there are more complex support functions potentially emitted by it that aren’t covered by the SCL (like float and double support, where SCL supports none of the /softfp required functions). e.g. _cxd_mulr, probably multiplies a complex by a double but does _cxd_muli multiply a complex by a ( double × I ) or by an integer?

Jul 23, 2022 4:06pm

Rick Murray (539) 13440 posts

So those functions are output by the compiler itself, rather than being library functions?

Could you write test code that uses them, then just look to see what is actually output?

Hmm, I wonder if it would work to give the compiler the -S option to get it to write assembler as text, rather than a binary file for the linker…?

Jul 23, 2022 4:29pm

Stuart Swales (8827) 1257 posts

So those functions are output by the compiler itself, rather than being library functions?

Indeed they are – many normally done as built-in functions that create inline code (compiler doesn’t need to call a library function to add/subtract two complex numbers using FPA as it’s just (LDFD, LDFD, ADFD/SUFD)(x2), but it does need a helper function to multiply them – _cxd_mul).

It’s easy enough to create code that produces uses of many of these functions, but there’s no way to ensure that you’ve got full coverage without a specification (they aren’t all in the same place, there are at least two blocks of these functions, maybe more) as I found when adding double support – I got most of them first time round but missed a couple.
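To show why addition can be inlined but multiplication wants a helper, here is a sketch of the textbook arithmetic on a pair of doubles. The type and function names are hypothetical. This is not the actual _cxd_add or _cxd_mul, whose register conventions are exactly what is undocumented, and a real C99 implementation may also have to deal with infinities and NaNs, which this ignores.

```c
/* A complex number as a (real, imaginary) pair of doubles. */
typedef struct {
    double re;
    double im;
} sketch_complex;

sketch_complex sketch_cx_make(double re, double im)
{
    sketch_complex z;
    z.re = re;
    z.im = im;
    return z;
}

/* Addition: two independent double additions. */
sketch_complex sketch_cx_add(sketch_complex a, sketch_complex b)
{
    return sketch_cx_make(a.re + b.re, a.im + b.im);
}

/* Multiplication: (a + bi)(c + di) = (ac - bd) + (ad + bc)i,
 * i.e. four multiplies plus an add and a subtract. */
sketch_complex sketch_cx_mul(sketch_complex a, sketch_complex b)
{
    return sketch_cx_make(a.re * b.re - a.im * b.im,
                          a.re * b.im + a.im * b.re);
}
```

For example, (1 + 2i)(3 + 4i) gives -5 + 10i.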

As an aside, here’s a good one for the ‘compilers are better than humans’:

    ; r0 and r1 point to two complex numbers (pairs of IEEE doubles)
    ; r2 points to where I want the result
    MOV r7,r2

    ; it's already made room on the stack as the
    ; second complex parameter needs passing on the stack
    LDM r1,{r1-r3,r12}
    MOV lr,sp
    STM lr,{r1-r3,r12} ; oh, what's matter with STM sp,{r1-r3,r12}

    ; first complex parameter is passed in registers
    ; as (r0,r1) real part and (r2,r3) imaginary part
    MOV r3,r0
    LDMIB r3,{r1-r3}
    LDR r0,[r0] ; er, again, LDM r0,{r0-r3} replaces all three???

    BL _cxd_add

    STM r7,{r0-r3} ; store result (yay)

Dec 18, 2023 5:57pm

Cameron Cawley (3514) 144 posts

Do you have a brief summary of what the current status of the apcs_softpcs library is, and what is still left to do?

Dec 18, 2023 7:40pm

Stuart Swales (8827) 1257 posts

Do you have a brief summary of what the current status of the apcs_softpcs library is, and what is still left to do?

The apcs_softpcs library has coverage of double (and float) operators, compiler support functions and standard library functions (to C99). No complex double support yet (none planned, I think it would only be me who’d use it).

Note that use of /softfp with Norcroft C is not officially supported on RISC OS, but ROOL have been very helpful in fixing compiler bugs in this area.

The current stable library sources can be downloaded from:

http://croftnuisk.co.uk/exports/apcs_softpcs_20231218.zip

Here’s a snippet from the top-level ReadMe to save downloading:

What is it?

apcs_softpcs provides support for the Norcroft C compiler for RISC OS when used with the softpcs procedure call standard, avoiding use of FPA floating point instructions in the compiled code, with ARM registers used to return floating point values from functions (and also to pass in floating point parameters, as already happens with APCS-32). double values are returned (and double parameters are passed) in ARM registers in FPA word order.

apcs_softpcs provides functions to support all the basic floating point operators that may be emitted by the compiler (e.g. _dadd(x,y) for double precision x + y; _feq(x,y) for single precision x == y).

VFP instructions are used to implement all of those basic operators for both double and single precision. Note that no separate long double support is needed for these as it is equivalent to double on ARM Norcroft C. However, if VFP is not present on the system on which an application created with apcs_softpcs is run, apcs_softpcs will automatically fall back to using FPA instructions to implement these operators.
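One common way to arrange a fallback of this kind is to pick an implementation once at start-up and call through a function pointer thereafter. The following is only a sketch of that general pattern and is not taken from the apcs_softpcs sources. All names are hypothetical, and both back ends are plain C stand-ins for what would really be VFP or FPA code.

```c
typedef double (*sketch_binary_op)(double, double);

/* Stand-in for an implementation that would use VFP instructions. */
double sketch_add_vfp_path(double x, double y)
{
    return x + y;
}

/* Stand-in for the fallback that would use FPA instructions. */
double sketch_add_fpa_path(double x, double y)
{
    return x + y;
}

/* Callers go through this pointer; it defaults to the fallback. */
sketch_binary_op sketch_dadd = sketch_add_fpa_path;

/* Choose the back end once, given the result of a hardware check.
 * Returns 1 if the VFP path was selected, 0 otherwise. */
int sketch_select_backend(int vfp_present)
{
    sketch_dadd = vfp_present ? sketch_add_vfp_path : sketch_add_fpa_path;
    return vfp_present ? 1 : 0;
}
```

The cost of the indirection is one extra load per call, which would be in keeping with the small penalty mentioned for the fallback case.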

apcs_softpcs also provides functions which wrap those standard C library functions which return double (or float) values in floating point register F0, converting that to the ARM register pair {R0,R1} in FPA word order for a double value, R0 for a float value.
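As an illustration of the register layout, here is a minimal sketch that splits a double into two 32-bit words. It assumes that "FPA word order" means the most significant word (sign, exponent and top of the mantissa) comes first, i.e. in R0, with the least significant word in R1. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* The two ARM registers that would carry a double under /softfp. */
typedef struct {
    uint32_t r0;   /* assumed: most significant word */
    uint32_t r1;   /* assumed: least significant word */
} sketch_reg_pair;

/* Split an IEEE 754 double into the assumed {R0,R1} layout. */
sketch_reg_pair sketch_split_fpa_order(double d)
{
    uint64_t bits;
    sketch_reg_pair regs;

    memcpy(&bits, &d, sizeof bits);
    regs.r0 = (uint32_t) (bits >> 32);   /* sign, exponent, top of mantissa */
    regs.r1 = (uint32_t) bits;           /* low 32 bits of mantissa */
    return regs;
}
```

Under that assumption 1.0 would travel as R0 = 0x3FF00000, R1 = 0x00000000.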

VFP instructions (and even just ARM instructions in some cases) are used to implement many of these functions without using the C library fallback, see the ReadMeMore file.

If a new enough VFPSupport module (0.15 on) is present on the system on which an application created with apcs_softpcs is run, apcs_softpcs will use the elementary function tables provided by the VFPSupport module to implement the transcendental functions (cos, exp, and friends) for both double and single precision, otherwise falling back to using the C library code as before. VFPSupport module 0.17 or later is recommended.

Performance?

I obtained a 5-6 times speed improvement when inverting large matrices in PipeDream using VFP arithmetic provided by apcs_softpcs support compared to a standard FPA build using FPEmulator on the same system (ARMX6).

If VFP is not available on the system on which an application created with apcs_softpcs is run, there is a small performance penalty for falling back to using FPA instructions for compiler support functions, especially for systems with FPA floating point hardware (that’s very few of them!). This also applies to standard C library math functions.

Dec 19, 2023 11:48am

Colin Ferris (399) 1755 posts

Did someone mention FPE used in Draw?

A use for Stuart’s Lib :-)

Dec 19, 2023 11:57am

Stuart Swales (8827) 1257 posts

FPE used in Draw

The correct thing to do is to eliminate the repeated FP use there for the win (FP use can be sensible for transform setup).

Jan 1, 2024 6:11pm

Matthew Phillips (473) 690 posts

RiscOSM is routinely compiled with the apcs_softpcs library and the application !Run file will automatically select the appropriate !RunImage for your machine. You can tell if you are running the VFP version by checking the Info box to see if the version field includes VFP. Stuart was very helpful getting this working, and ROOL sorted out a compiler bug we encountered.
