VFP advice/tutorial
Steve Drain (222) 1620 posts |
I have been looking back through the forum and at many other sites, but I feel that I just do not have enough confidence to use VFP – I do not want to use NEON any time soon.
Firstly, there was mention of the VFP support module a couple of years back, but the link in those early posts no longer works, nor does a site search. The module is listed by *Modules, but is there documentation?
Secondly, the BASIC assembler was updated to use the new mnemonics, but it does not provide any help. When this was discussed a year or so ago, I produced a trial BASICHelp module with space to include VFP and NEON help. I feel now that I could probably fill in some of those spaces myself, but would be happier if a more knowledgeable person did it. Is this a project worth pursuing, or is there the prospect of the BASIC/BASICTrans modules being worked on soon?
Lastly, there is a scattering of VFP assembler code around, but it is not of a tutorial nature. I would really like some simple BASIC assembler examples to get me going. As VFP does not have the transcendental functions of the FPE, any VFP code that does that would also be very handy. I have seen that Charm could be used to produce VFP code, but that would need me to learn Charm. ;-) |
Trevor Johnson (329) 1645 posts |
Did your search find these? (I can’t remember whether or not they’re BASIC assembler and can’t check now.) |
Trevor Johnson (329) 1645 posts |
As for the VFPSupport documentation, has VFPSupport vanished too? |
Steve Drain (222) 1620 posts |
The link suggests it does not exist. That does not mean it speaks the truth. |
Steve Drain (222) 1620 posts |
More or less. I do not want NEON, nor VFP for vectors, but I would like to see how it compares to the FPE for convenience and speed. The examples use ExtASM; I found them a bit tricky to follow. |
Jeffrey Lee (213) 6048 posts |
“As for the VFPSupport documentation, has VFPSupport vanished too?”
I get the feeling those pages vanished because they weren’t linked to from any other wiki pages. Although in the case of the VFPSupport page it took a long time for them to get deleted! No matter; I should be able to recreate it (and finally split it up into one page per SWI) when I get home. |
Trevor Johnson (329) 1645 posts |
Good job you keep copies/have things committed to memory! |
Jeffrey Lee (213) 6048 posts |
It turns out I didn’t have a local copy of the page, so there might be one or two details missing compared to the original (e.g. I think the original mentioned which errors were returned by calls). But I think this new version is a bit more descriptive than the old one. |
Rick Murray (539) 13821 posts |
Hi, So looking at this – “If a program attempts to execute a VFP/NEON instruction without having a context active then the instruction will abort.” – if I wanted to put together a short single tasking program to test various VFP/NEON instructions, would I need to wrap it in context SWIs? |
Steve Drain (222) 1620 posts |
Many thanks to Jeffrey for recreating the pages. It still leaves me struggling a bit, and Rick’s comment echoes my request for some tutorial examples. It is clearly not as ‘simple’ as using the FPA/FPE. My interest lies in providing double precision floats in Basalt, working within BASIC V. The FPE is very slow for this, but I suspect the context switching might make VFP a bit cumbersome, too. Any further thoughts on providing BASIC help? |
Jeffrey Lee (213) 6048 posts |
Yes.
CreateContext, with user mode flag, letting VFPSupport allocate the memory is probably the simplest (and DestroyContext to get rid of it at the end). That way you won’t have to worry about the application space flag or allocating the context memory yourself.
The same way you should deal with file handles, RMA allocations, network sockets, etc – register environment handlers (or for BASIC use ON ERROR) and make sure your code cleans up after itself. Ideally the OS would be capable of cleaning up this kind of thing for you, but teaching the kernel about process creation/destruction is a task for another day. |
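The clean-up advice above is the familiar "acquire, use, always release" pattern. As a rough illustration only, here is a minimal Python sketch of that shape. It is not RISC OS code: `FakeVFPSupport`, `create_context`, `destroy_context` and `run_with_context` are hypothetical stand-ins invented for this example and do not reflect the real VFPSupport SWI interface. The point is simply that the destroy step runs even when the work in the middle fails, which is the job ON ERROR or an environment handler would do in a real program.

```python
class FakeVFPSupport:
    """Hypothetical stand-in for a context provider; NOT the real VFPSupport API."""

    def __init__(self):
        self.live = set()   # contexts currently allocated
        self._next_id = 1

    def create_context(self):
        # Pretend to allocate a context and hand back an identifier for it.
        ctx = self._next_id
        self._next_id += 1
        self.live.add(ctx)
        return ctx

    def destroy_context(self, ctx):
        # Release the context; harmless if it has already gone.
        self.live.discard(ctx)


def run_with_context(vfp, work):
    """Create a context, run work(ctx), and always destroy the context afterwards."""
    ctx = vfp.create_context()
    try:
        return work(ctx)
    finally:
        # Runs on normal exit and on error alike, so nothing is leaked.
        vfp.destroy_context(ctx)


def failing_work(ctx):
    # Simulates a program that hits an error part-way through.
    raise RuntimeError("something went wrong while the context was active")


vfp = FakeVFPSupport()
try:
    run_with_context(vfp, failing_work)
except RuntimeError:
    pass
print(len(vfp.live))  # 0: the context was destroyed despite the error
```

The same structure applies to file handles or RMA blocks: pair every allocation with a release that cannot be skipped.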
Rick Murray (539) 13821 posts |
As you are working within the realm of BASIC itself, would it not suffice to create a VFP context when a program is initialised, and destroy it when the program quits? Not sure how this works with the Wimp re. task switching, but one could argue that managing VFP contexts is a job for this task switcher to do, not the underlying application… [much as how it preserves FP right now – indeed this could be bolted into the existing FP code, couldn’t it?] |
Jeffrey Lee (213) 6048 posts |
The Wimp will switch VFP contexts automatically (unlike with FP, where you actually need to set a flag to get the FP registers to be preserved over Wimp_Poll!) |
Rick Murray (539) 13821 posts |
Does this work on a Pi? I wanted to put together some example code, however…
…? |
Chris Gransden (337) 1203 posts |
The Raspberry Pi has VFP2, so it only has 16 double-precision registers; VFP3 has 32. Try changing the 32 to 16. It should then work on OMAP3 and 4 as well. |
Rick Murray (539) 13821 posts |
I give up.
Well, that was the plan. I think I’ll stick with integers. :-/ |
Jeffrey Lee (213) 6048 posts |
Use a more recent version which uses the UAL syntax?
I think part of the problem is that you have stuck with integers. VMOV merely moves data between registers, it doesn’t convert between integer and floating point. You want something like this in the middle:
  VCVT.F32.S32 S0,S0
  VCVT.F32.S32 S1,S1
  VCVT.F32.S32 S2,S2
  VCVT.F32.S32 S3,S3
  VCVT.F32.S32 S4,S4
  VCVT.F32.S32 S5,S5
  VCVT.F32.S32 S6,S6
  VCVT.F32.S32 S7,S7
  VMUL.F32 S8,S0,S4
  VCVT.S32.F32 S8,S8
  VCVT.S32.F32 S9,S9
  VCVT.S32.F32 S10,S10
  VCVT.S32.F32 S11,S11
Although even that won’t do exactly what you want, as there are some arcane rules about how vector mode operates, depending on which banks the source and destination registers are in. You will get the output you’re expecting, but it won’t be performing the operation you’re expecting :) |
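For anyone who wants to see the VMOV versus VCVT distinction without a VFP to hand, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not RISC OS or assembler code: the helper names are made up, the `struct` module stands in for the register file, and the float-to-integer step assumes the round-towards-zero, saturating behaviour of VCVT.S32.F32 (NaN inputs are not handled). It multiplies 3 by 5 once with the raw integer bit patterns treated as floats, and once with proper conversions either side.

```python
import struct

def int_bits_as_float(n):
    # Reinterpret a signed 32-bit integer's bit pattern as an IEEE 754 single.
    # This models reading a register as F32 after a plain VMOV of integer data.
    return struct.unpack("<f", struct.pack("<i", n))[0]

def float_bits_as_int(x):
    # Reinterpret a single-precision float's bit pattern as a signed integer.
    return struct.unpack("<i", struct.pack("<f", x))[0]

def to_single(x):
    # Round a Python float (a double) to single precision.
    return struct.unpack("<f", struct.pack("<f", x))[0]

def cvt_f32_s32(n):
    # Value conversion, integer to single (the role of VCVT.F32.S32).
    return to_single(float(n))

def cvt_s32_f32(x):
    # Value conversion, single to integer, rounding towards zero and
    # clamping to the signed 32-bit range (assumed VCVT.S32.F32 behaviour).
    return max(-2**31, min(2**31 - 1, int(x)))

def mul_without_conversion(a, b):
    # Integer bits are treated as floats, multiplied, and the result bits
    # are read back as an integer: no conversion at either end.
    product = to_single(int_bits_as_float(a) * int_bits_as_float(b))
    return float_bits_as_int(product)

def mul_with_conversion(a, b):
    # Convert to float, multiply, convert back.
    product = to_single(cvt_f32_s32(a) * cvt_f32_s32(b))
    return cvt_s32_f32(product)

print(mul_without_conversion(3, 5))  # 0
print(mul_with_conversion(3, 5))     # 15
```

Small integers look like tiny denormal floats when their bits are reinterpreted, so their product underflows to zero, which is why the unconverted version appears to "fail" only at the multiply.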
Steve Drain (222) 1620 posts |
It seems to be tricky. I have not been quite as ambitious as Rick, but I am using double precision. FMULD (VMULD/VFMULD?) was not recognised. I did get VLDR to be recognised, but VMOV assembles and throws an ‘Undefined instruction’ error. I am, of course, working in the absence of documentation for the BASIC VFP mnemonics, so I could be barking up the wrong tree. |
Steve Drain (222) 1620 posts |
I think I have found an anomaly in the documentation. Do these two statements conflict?
I unset bit 0, and now things work as expected – so far, that is. ;-) |
Jeffrey Lee (213) 6048 posts |
The key bit in that text is “supported on some systems”. At the moment I haven’t actually implemented support for that flag, so you’ll be able to use the contexts from user mode regardless of the setting. Depending on how lazy I’m feeling that might change in the future :) |
Steve Drain (222) 1620 posts |
It is interesting that when set I have problems that do not occur when unset. I am still working on this. |
Rick Murray (539) 13821 posts |
I see. Useful cross-reference: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473c/CIHFAHFG.html
;-) I meant, plain ARM, no FP at all. For me, the FP stuff is ‘not obvious’ (and never will be). Binary ops are more my speed…
Hehe – I wouldn’t say “ambitious”, I just wanted to set the system up to multiply four sets of numbers at once. After all, the ARM can do a straight multiply, so for something more exciting, I wanted to do a bunch.
Strange – the only thing that failed for me was the multiply. Take that out, the code runs. But, as you spotted, there might be something in the flags. I can’t test with bit 0 = 0 as I have turned my Pi off (got fed up of watching the rain splatter down the window) and anyway I have to write a letter to a new insurer because wonderful Monsieur-le-President Hollande implemented a new law where every employee in certain sectors must have a medical top-up, part paid by the employer. While this is a laudable action (ha ha Obama!), Mr Hollande didn’t include a mechanism for people who thought ahead and already have such a thing of their own volition. It should have sufficed to show my employer that I had coverage in course. The end. But no… But me? A stupid letter for something that should never have concerned me. And all this must be done in French. Oh joy. Anyway, thanks Steve for spotting what might be the problem. But no, I won’t be trying it tonight. 1 By ARM ARM, I mean this, all 800+ pages of it. It’s a shame there isn’t a published update for the Cortex era, because I much prefer a phone directory to a PDF. (^_^) |
Steve Pampling (1551) 8164 posts |
Excuse me? What made you think your nearest HR held the monopoly on that? |
Rick Murray (539) 13821 posts |
I didn’t. I was just hoping that mine might have notice……oh, never mind… |
Jan Rinze (235) 368 posts |
Just compiled gcc 4.7.4 for RISC OS to see if I can get it to compile with -mhard-float. Either gcc is not telling ld to use hard-float or there is no support for hard-float in ld. |