No more big 32-bit cores for RISC OS from 2022
Jeffrey Lee (213) 6048 posts |
I’m not sure how well QEMU AArch32-on-x86 performs, but when I tried Google’s Android emulator a couple of years ago I found it to be completely unusable. It ran so slowly that the Android OS kept killing processes because it thought they weren’t responding.
Since ARM’s business model is to license silicon implementations of the instruction set, they’ll come down pretty hard on anyone who’s trying to create their own unlicensed implementation (including soft-silicon implementations like in an FPGA). I believe ARM2/3 are safe to use (either because they pre-date ARM as a company, or because they’re older than X years – not sure which), and obviously anything ARM release themselves will be safe to use (like the Cortex-M cores, which RISC OS won’t run on because they’re Thumb-only). |
Clive Semmens (2335) 3276 posts |
X = 20. Patents expire after 20 years at the longest (they used to be 16 and then 17 years), so any design more than 20 years old cannot contain anything whose patent hasn’t expired. There can be exceptions for things whose time hadn’t come when they were first patented, but that certainly wouldn’t apply to anything that was manufactured soon after the patent was applied for, and anything actually manufactured more than 20 years ago is definitely patent-free. (https://www.gov.uk/guidance/manual-of-patent-practice-mopp/section-25-term-of-patent) |
Steve Pampling (1551) 8155 posts |
Before reading Jeffrey’s comments about his experience of QEMU performance, my thoughts were about the CPU load and general baggage that running QEMU on another OS would bring. I suspect that the emulator most people expect, and one that would live up to the sort of description Rick put forth, is basically a boot-loader-cum-HAL on steroids. |
Steffen Huber (91) 1949 posts |
Not for the performance range we are now used to. The cheap-and-cheerful FPGAs from Xilinx and Altera/Intel like the ones inside MIST and MISTer (Altera Cyclone III and Cyclone V) will give you – with the current implementation based on the “Amber” CPU core and assuming the cache can be enabled – performance around ARM3 levels. I know that some Amiga guys are carefully optimizing a 680x0 FPGA core (the Apollo Core – see http://www.apollo-core.com/) and think they can far surpass the performance of a stock 68060 while being 100% compatible. So it is probably possible to optimize the Amber design quite a bit, but I would guess that something equivalent to the performance of the 1+ GHz Cortex-A72 or Cortex-A15 we already have will not be possible. By far. |
Andy S (2979) 504 posts |
> the obvious fans of niche computers
I’m visualising a giant propeller mounted on the outside of a rare computer’s case here. |
Clive Semmens (2335) 3276 posts |
The ARM9TDMI came out in 1998 – well out of patent – and it has the 32-bit address space (the ARM60 had it years earlier). You don’t need a licence to manufacture that, or anything using the same instruction set(s) (or a superset of them), as long as you don’t use anything patented after 2000. A new implementation of it could be a lot faster than the original, and very low power. The design of any extension to the instruction set would need careful thought. All you need is a manufacturer…and a big enough market to make the project viable… Oh. |
Theo Markettos (89) 919 posts |
With a decently-pipelined in-order CPU (ie not an ARM3 clone) I would expect to get (and we do get) about 100MHz. That number doesn’t really change much from one FPGA family to another. Newer/fancier/more expensive FPGAs are larger, but not a lot faster (say 50% over a decade). You can throw logic at the problem and build a superscalar core, but it doesn’t necessarily improve performance much. Adding a lot of cache does help, but even if every instruction executes in one cycle at 100MHz it’s still nowhere near a 2GHz hard core. |
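The arithmetic behind that last sentence can be sketched as follows. This is only a back-of-envelope illustration: the function names are made up for the example, and it assumes both cores retire exactly one instruction per cycle, which flatters the soft core (a modern hard core usually retires more than one).

```c
/* Peak instruction throughput in instructions per second, assuming the core
   retires `instr_per_cycle` instructions on every clock cycle. */
double peak_ips(double clock_hz, double instr_per_cycle)
{
    return clock_hz * instr_per_cycle;
}

/* How many times faster the hard core is than the soft core. */
double hard_over_soft(double soft_ips, double hard_ips)
{
    return hard_ips / soft_ips;
}
```

Calling `hard_over_soft(peak_ips(100e6, 1.0), peak_ips(2e9, 1.0))` gives 20, so under these assumptions a perfect one-instruction-per-cycle soft core at 100MHz is still a factor of 20 behind a 2GHz hard core.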
Theo Markettos (89) 919 posts |
Android performance is poor largely because, after Android 4.0, it requires graphics acceleration. I don’t know whether the emulator uses the host GPU at all, but software rendering on Android phones without GPU support is awfully slow. We have existence proofs of aarch32-on-x86_32 after all – VRPC and RPCEmu. They run well enough that people use them for day-to-day work. There’s no reason why QEMU need be any worse (although I haven’t tried it lately). |
Steffen Huber (91) 1949 posts |
The emulation techniques used in V-RPC and RPCEmu are comparatively simple. I recently read an in-depth piece about dynamic recompilation techniques currently used inside GraalVM, and what Microsoft and Apple currently do with x86-on-ARM. Impressive. The level of optimization and the resulting performance you can achieve with a JIT nowadays is staggering. But every JIT achieving those levels of performance is the work of many, many highly-qualified engineers over years of optimization work. It does not seem that this is an area where Open Source solutions provide anything competitive. |
David J. Ruck (33) 1629 posts |
I suggest that the vast majority of RISC OS users are still users because they care about that 99% of old software. It’s familiarity and inertia that keep them using the OS, despite better solutions existing on other platforms. Having just a desktop and a couple of built-in programs isn’t going to retain those users. In this respect moving to 64bit is no different from tackling other major RISC OS issues such as pre-emptive multitasking: if it completely breaks compatibility with existing applications, you aren’t going to take the user base with you. So in the first instance emulation of a 32 bit platform will be a necessity to maintain application compatibility, and that’s probably where most of the existing RISC OS user base will go and stay. That’s not to say it isn’t worth trying to develop a 64 bit native RISC OS, but it’s pointless making it a simple machine translation of the current 32 bit RISC OS. It would take no advantage of any 64 bit processor features, and keep every single crippling limitation of the current OS. The new 64 bit OS needs to be completely redesigned to take full advantage of multicore processors, with pre-emptive multitasking, threading, memory protection, a Linux driver model and hypervisor readiness. The vast majority of APIs need to be redesigned for 64 bit, as there is no need to maintain any compatibility. Only a few of the existing users (mainly those reading this) will be interested in this new OS; its future will lie in attracting new application developers and new users. There is no guarantee it would be a success, but it’s a certainty that 32 bit RISC OS is going the same way as every other legacy platform. |
Lothar (3292) 134 posts |
> Maybe, if the source is available, a machine translation could do 99% of the work
How about this approach? We use it to “port” uC code from AArch32 to AArch64
1. Move AArch32 code piecewise into C as inline assembler e.g.
    asm(
        ".syntax unified \n\t"
        "LDR R0, =(1 << (20)) \n\t"   // R0=GPIO1
        "LDR R1, =0x40048080 \n\t"    // R1=SYSAHBCLKCTRL0=0x40048080
        "LDR R2, [R1] \n\t"
        "ORRS R2, R2, R0 \n\t"
        "STR R2, [R1] \n\t"
    );
2. Without the need to really understand what it does, rewrite it as low-level C
    unsigned long *SYSAHBCLKCTRL0;
    SYSAHBCLKCTRL0 = (unsigned long *) 0x40048080;
    *SYSAHBCLKCTRL0 |= (1 << (20));
3. Compile it into AArch64 |
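Step 2 above could be written a little more defensively, along the lines of the sketch below. This is an illustration rather than Lothar's code: the helper `reg_set_bit` and the two macro names are hypothetical, and it assumes the register is 32 bits wide. A fixed-width `uint32_t` is used because `unsigned long` is 64 bits on typical LP64 AArch64 toolchains, and `volatile` stops the compiler from optimising away the load and store that the original `LDR`/`ORRS`/`STR` sequence performs.

```c
#include <assert.h>
#include <stdint.h>

/* Address and bit number copied from the post above; they only mean
   something on that particular microcontroller. The macro names are
   hypothetical. */
#define SYSAHBCLKCTRL0_ADDR 0x40048080u
#define GPIO1_CLOCK_BIT     20u

/* Set one bit in a 32-bit memory-mapped register (read-modify-write).
   `reg` is volatile so the compiler really performs the load and the store. */
void reg_set_bit(volatile uint32_t *reg, unsigned bit)
{
    assert(bit < 32);                 /* a 32-bit register has bits 0..31 */
    *reg |= (UINT32_C(1) << bit);
}
```

On the target this would be called as `reg_set_bit((volatile uint32_t *)(uintptr_t)SYSAHBCLKCTRL0_ADDR, GPIO1_CLOCK_BIT);`. Because the register is passed in as a pointer, the same function can be exercised on a desktop machine by pointing it at an ordinary variable.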
Paolo Fabio Zaino (28) 1858 posts |
FPGAs (at this time) could only be useful for RetroComputing. One of the best SoftCores available (as mentioned by Steffen) is the Apollo 68080, which is an impressive achievement: a very well designed 68000 with full 32bit added (instead of the 16/32 of the original 68000), an FPU and (for the core 2.x releases) even an MMU, as well as a ton of optimisations to the pipeline and an added superscalar architecture. This is almost a miracle given the complexity behind optimising CISC. Now to the numbers (I have a Vampire V4 standalone here, which is the fastest implementation to date, with the latest SoftCore release): it’s ~121 times faster than an original Amiga 1200, which means (in “RISC OS” user community terms) almost as fast as a 200MHz StrongARM on a RiscPC (however the Vampire V4 has better DMA and memory, so disc transfer can reach 10MB/s, so on that it’s faster than a RiscPC). For the Archimedes there is the Archie SoftCore, which is an ARM2a+MEMC, VIDC, IOC (it should be the Amber, I think, Steffen) with performance a bit higher than 80% of the original A3000. RISC OS 2 and RISC OS 3 run well and the latest SoftCore release also supports VHD HDD images (so retrocomputing on RISC OS is fine and will live for many, many years to come). As many have stated, the other problem with implementing modern ARMs on FPGA would be patents and licenses, on top of slower performance compared to the ASICs we are using these days.
So, this week AMD announced the new Zen 3 CPUs (on sale from Nov 5th), which right now are the fastest x86 available on performance per core and are available with up to 16 cores. QEMU on such a beast will probably keep up with an RPi2 in ARM emulation (I need more Zen 3 architecture details, and possibly tests, to be sure). However, the reason I mention this is that AMD CPUs are usually cheaper to buy than Intel, so performance per core plus a more affordable price could potentially represent a way forward in emulation.
That’s not traditional JIT, it’s Binary Translation, done at installation time; I already described that on the Apple macmini thread. I have an Apple DTK here; anyone interested in running specific tests on it can ping me on chatcube. The difference between traditional JIT binary translation and what Apple (and Microsoft) do now is that the application gets fully translated into ARM64 at installation time, so when you run it, it performs at native ARM performance, with basically no JIT overhead. BUT the trick there is that the original application was already 64bit (x86_64), so the entire translation rests on the same bases/assumptions. The problem comes when we binary translate ARM32 to ARM64… the translated application still expects a 32bit API, which, in this case, has to be emulated, and that causes performance penalties (depending on a lot of factors). Finally, binary translation can be applied to applications, but not to the OS. The OS must be native 64bit. The way I see this: if (and only if) it turns out that somebody (maybe Cloverleaf, maybe RISC OS Developments) finds the funds (or volunteers the time), the OS could be redesigned from the ground up using C (or Rust) on ARM64. (Sorry ROOL) but we should move to GCC as the native compiler (or better, to LLVM); the lack of proper debugging tools, 64bit support and full ARM feature support in the DDE makes it suitable only for retro coding or for IoT development on old 32bit ARM. However, a redesign is not just conversion. It’s time to actually redesign the OS. It’s OK to think so because the 64bit RISC OS (even in the case someone miraculously finds a way to do a basic migration to 64bit) won’t be compatible with the original 32bit RISC OS, unless someone adds: About the old apps, if they can be converted to ARM64, great; if not, then either running an emulator on RISC OS itself or trying the Binary Translation way (which, again, is not easy) would probably suffice at the beginning of the new 64bit OS.
With that said, everything that may look impossible right now will stay impossible until someone makes it possible ;) My 0.5c… |
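The point above about translated ARM32 code still expecting a 32bit API can be pictured with a small thunk. This is a rough sketch under stated assumptions, not how any real translator or RISC OS component works: every name here (`guest_mem`, `guest_to_host`, `host_count_bytes`, `thunk_count_bytes`) is hypothetical, the guest address space is shrunk to 4KB for the example, and the "OS call" is an invented one. What it shows is that each call from the 32bit side has to have its 32bit addresses checked and converted to 64bit host pointers, which is where the extra cost comes from.

```c
#include <stddef.h>
#include <stdint.h>

#define GUEST_MEM_SIZE 4096u   /* tiny guest address space, for illustration only */

/* The memory the translated 32-bit application sees as addresses 0..4095. */
uint8_t guest_mem[GUEST_MEM_SIZE];

/* Convert a 32-bit guest address range into a host pointer, or return NULL
   if any part of the range lies outside the guest's memory. */
void *guest_to_host(uint32_t guest_addr, uint32_t len)
{
    if (guest_addr >= GUEST_MEM_SIZE || len > GUEST_MEM_SIZE - guest_addr)
        return NULL;
    return &guest_mem[guest_addr];
}

/* A native 64-bit implementation of some (invented) OS call: count how many
   bytes in a buffer equal `value`. */
size_t host_count_bytes(const uint8_t *buf, size_t len, uint8_t value)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == value)
            n++;
    return n;
}

/* Thunk for the 32-bit side: arguments arrive as 32-bit register values and
   the result must fit back into a 32-bit register. UINT32_MAX is used here
   as a made-up error code for a bad address. */
uint32_t thunk_count_bytes(uint32_t r0_buf, uint32_t r1_len, uint32_t r2_value)
{
    const uint8_t *buf = guest_to_host(r0_buf, r1_len);
    if (buf == NULL)
        return UINT32_MAX;
    return (uint32_t)host_count_bytes(buf, r1_len, (uint8_t)r2_value);
}
```

A real design would also have to deal with structures in guest memory that themselves contain 32bit pointers, which this sketch does not attempt.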
Peter Howkins (211) 236 posts |
A few things here that bear mentioning.
1) The date of badness is not 2022, it was 2018, with the release of the first big ARM core (Cortex A76) that does not have a 32bit supervisor mode to run the RISC OS kernel and all the modules in.
2) The amount of work in a simple ‘port’ from arm32 to C or arm64 suggested here is orders of magnitude more than all the effort that has been expended on RISC OS in the last 20 years.
3) Deciding to also mix in a hit-list of all the big-ticket items missing from RISC OS is turning an impossibly large task into a comical one. E.g. process awareness, PMT, kernel threading, proper virtual memory, multi-core, multi-user, user/system privilege splitting, dynamic libraries in user-space.
4) It will break all the software, pretty much entirely, except a few incredibly simple basic apps. It’ll make the 26→32 bit addressing changes look trivial.
5) If you ever succeed in completing the port, you have an OS that might just about have caught up with its contemporaries, with no apps to run on it. At this point why expend the effort? There are a lot of other OSes that do not run RISC OS apps already.
6) Automatic decompilation of arm32 to C is a bit of a dead end here. Most decompilation techniques rely on the machine code/assembler having a sensible/predictable ABI for things like function calls (C compiled code does, hand written assembler doesn’t). Also, detecting whether each register is holding a pointer or a numeric value, and generating the relevant code sequences, becomes important, and that is not programmatically determinable.
I like speculating about this stuff as much as the next guy, but 64 bit RISC OS is not happening ever. Anyone asking for money for this task is either delusional or wilfully exploiting you. |
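One way to see why the pointer-versus-number question in point 6 matters is the sketch below. It is only an illustration with made-up function names: on arm32 an instruction such as `ADD R0, R0, #4` looks identical whether R0 holds an address or a count, but a 64 bit translation has to pick a width, and picking 32 bits for something that was really an address silently corrupts it.

```c
#include <stdint.h>

/* What a naive word-for-word translation does: it treats every register
   value as a 32-bit number, so the upper half of a 64-bit value is lost. */
uint32_t as_word(uint64_t value)
{
    return (uint32_t)value;
}

/* Returns 1 if a 64-bit address survives a round trip through a 32-bit
   word unchanged, 0 if it gets truncated. */
int fits_in_word(uint64_t addr)
{
    return (uint64_t)as_word(addr) == addr;
}
```

An address like `0x40048080` passes through unharmed, whereas something like `0x0000007FDEADBEEF`, an arbitrary value chosen to lie above 4GB, comes back as `0xDEADBEEF`. A translator therefore cannot simply keep everything 32 bits wide, and it cannot widen everything either without changing the layout of data structures that the rest of the code depends on.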
Clive Semmens (2335) 3276 posts |
I fear this is reasonably accurate. I might believe it if someone said it was only one and a half orders of magnitude – but that’s more than enough to make the whole idea completely Pi-in-the-sky. My tongue-in-cheek proposal for a souped-up ARM9 remanufacture is possibly marginally less implausible, but it too is pi in the sky (when you die…) Pray for a long run of Pi4s, not forgetting that the contribution of RISCOS users to the demand for it isn’t enough to make an iota of difference to the prospect of that long run. </wet blanket mode> |
Clive Semmens (2335) 3276 posts |
I suspect this is also true. My attachment to !Draw and !Zap is precisely that I’ve not found any better (or at least equally or nearly equally good) solutions for those particular applications on other platforms. It’s possible they exist, but if so I’m pretty sure they only exist the wrong side of a rather large financial barrier. My attachment to BBC BASIC is purely intellectual laziness, an unwillingness to learn a new language; all the other computer languages I know are either lost in the mists of time or HORRIBLE. The effort involved in rewriting !Draw and !Zap to run on Another Platform would be orders of magnitude less than the effort of writing a 64-bit RISCOS, but sadly I think Mr Ruck is probably right in his assessment that my motivations are those of a small minority. |
Steve Pampling (1551) 8155 posts |
> 2) The amount of work in a simple ‘port’ from arm32 to C or arm64 suggested here is orders of magnitude more than all the effort that has been expended on RISC OS in the last 20 years.
and more… Thank you, Peter, far better than anything I could ever conceive of writing. When I said
I was rather assuming people would see the mountain of a conversion task and realise that option 2 is really the only viable one [1] for doing anything on a 64-bit only board.
[1] Unless some miracle happens and a large number of developers turn up to convert the whole OS and applications in a fashion all the old users miraculously like. |
Dave Higton (1515) 3497 posts |
How do you eat an elephant? One slice at a time. I’ll remind you again: Julie Stamp has already started. |
George T. Greenfield (154) 748 posts |
I suppose the question arises: would running under emulation on a 64-bit ARM actually be faster than running natively on the fastest available 32-bit ARM boards? |
Kuemmel (439) 384 posts |
@Peter: Regarding that dropped 32bit supervisor mode since the A76… it made me wonder why Timothy could run RISC OS as a Linux app on a Neoverse N1 (which should be the same family as the A76, though I couldn’t find anything definite on its 32 bit SVC support by googling)? Or does RISC OS – as a Linux application – not care about the 32 bit SVC mode? …Maybe that’s also a hint on the way forward… RISC OS as an app on top… |
Clive Semmens (2335) 3276 posts |
And if there’s only one (or a few) of you, most of the elephant is somewhat decayed long before you’ve finished it, no matter how greedy you are, or how good your freezer(s). |
David Feugey (2125) 2709 posts |
Absolutely. Anyway, we’ll have better 32bit chips, and new boards, for a few years. Boards that will become obsolete around 5 years later. So we have time. After these (10?) years, we’ll still have 32bit Cortex-R and Cortex-M offerings, probably with an MMU on them and much better performance. A chance for RISC OS in the embedded market. Some parallel legal work could be done around the ARMv1, v2, v3, with new work around softcores. These cores could be used on FPGAs or be developed as full ASICs/microprocessors. And could compete with RISC-V, for example. And of course, we have RISC OS on Linux, with native performance, and the possibility to run it on Cortex-A78/X1 and the next/last 32bit design.
Correct. RISC OS is not only an OS with a GUI. It’s an ecosystem, with a lot of software.
Not on ARM, where it’s KVM virtualisation.
Yep. RISC OS is an 8bit-like OS for 32bit systems. With a better size/requirements than Linux. And an ‘acceptable’ GUI with ‘quite’ modern applications. Why should we want to compete with Windows or macOS, when we can compete with embedded OSes, while being probably the best retrocomputing solution? Realistic projects could be: For this last idea, we could rely on the work done on the RISC OS 3 ROM. Today, we can disassemble it, and so replace a lot of components with more modern ones. The main kernel is another story, but perhaps not impossible to change too. Benefits: |
Doug Webb (190) 1158 posts |
So the issue here is how long before we run out of available processors with a 32bit mode, which means we have to move across to 64bit and/or emulate. The thing is that if you do a port of RISC OS today you really ought to be factoring in the availability of that board. Boards like the iMX6, which are designed for, say, the automotive industry, have to declare a product life cycle, and hence if you are careful you can do a port (or ports) for products that will keep the market going for a reasonable amount of time whilst other work goes on. RISC OS currently does not make use of multiple cores, so on any system it still has lots of potential for speed increases. In addition it does not always have the required functionality, e.g. WiFi, so there are still potential improvements there. So as work goes on to bring the OS up to date in these areas, as long as it is done in a portable language and the API design is looked at carefully, you can minimise any further work required to make a 32/64 bit change, if indeed that is ever required. So whilst 2022 seems a hard date, in reality it is not, and that gives time to move things forward, to service the existing market, and perhaps to bring in new blood who may wish to cut their teeth helping to move the OS over to some new 64bit version or to provide a layer that helps that move. I don’t underestimate the task and the issues faced, but everything depends on a clear idea of what the end game is (i.e. a new 64bit OS or future emulation/virtualisation), and on the road map and plan for how to get there. Until someone, or a group such as ROOL/ROD, agrees on that vision we can all throw in ideas, but it does need some leadership on which way to go. |
David Feugey (2125) 2709 posts |
Yep. I hope the next Pi5 will still be compatible with RISC OS. I’m also betting a lot on ADFFS and other compatibility solutions :) |
Doug Webb (190) 1158 posts |
All I will say is RISC-V if rumours are to be believed… |
Steve Pampling (1551) 8155 posts |
While it’s much needed and laudable work, it’s a long job. There is also the fact that any transfer to a 64-bit platform would still kill off a large proportion, if not the vast majority, of legacy (32-bit safe) software and put a stake into the undead 26-bit stuff. The source being all in C does avoid needing to understand any underlying machine instruction quirks [1], so that would probably speed any future changes.
[1] Having got to the point where I half understand and can write chunks of assembler I feel I know various of those quirks, but the mental model and knowledge of the quirks of C is something else. |