Where to start kernel hacking?
NancySadkov (10280) 34 posts |
Well, RISC OS had modern style apps running out of arc fs, when everyone else had to guess how to unpack these ARJ files and what BAT file to run inside this folder (that is still a thing on Windows, since Microsoft completely failed with its appstore). When I said “modern,” I meant that RISC OS has modern user space and the general vector of movement, while Unix is stuck in the mainframe time-sharing era with tty everywhere, tracing history back to early telegraph systems. The kernel is just something making the userspace work. In fact, as a user you don’t really care if it is a BSD or Linux, as long as it has CC and make, allowing you to build and run Unix software; you just expect it to do the magic for that to happen. There are a few rough parts (apparently inspired by the classic IBM/VMS), like the leaky filesystem abstraction (i.e. ADFS::HardDisc4.$) instead of just `.` for root. But nothing major like with the Unix-style package system and `make install`, which is just defective by design. As I see it, people are currently trying to bring the monolithic Unix userspace to RISC OS and there is a separate package manager for that, but it is ultimately a bad idea, just like the idea of exposing all `*` commands in a single namespace (i.e. that prevents you from, say, having separate alternative network packages or two different C compilers). With Unix, people go to great lengths to compile anything, down to invoking chroot. And no Linux distro fork could ever solve that. But with RISC OS I think that can be fixed since RISC OS uses Apps, so just place the App’s stuff into a personal namespace, which could have just anything. That should avoid DLL/dependency hell. There is Android now, but it is becoming more and more of a walled garden, while apps there are geared at mobile phone use, not workstation productivity. |
Paolo Fabio Zaino (28) 1882 posts |
“Had”, it’s definitely not “have”. Linux in this day and age boots entirely from a compressed image, has tons of modularity and can even be updated live (without even rebooting the OS). seL4 is mathematically proven to be secure and it’s a fast micro-kernel. There are plenty of fantastic architectures out there, RISC OS ain’t one of them. Sure, if people sit down and redesign it correctly, nothing is stopping it from becoming an amazing OS, but right now it’s just a fantastic BBC Micro pumped with steroids, and no one could have imagined we could have in 2023 a crazy fast BBC Micro with a lot of memory and the size of a tiny little box. This is what makes RISC OS amazing, not comparing it with actual modern OSes. RISC OS has no concept of user-space vs kernel space; that is a light division that exists only because of the way SWI works. A “user-space” app can write into the kernel space and knock off the OS at any moment, either on purpose or involuntarily. Normally we pass pointers between user-space and kernel like they are the same place. RISC OS has no concept of processes, nor threads, and its architecture is limited to doing the minimum that was required for an Acorn to run (now people have put some effort into making the minimum for a Raspberry Pi to run). It doesn’t even use all the resources offered by a Pi. Other OSes work really well compared to RO and that is why no one cares about the “almighty never seen power of RISC OS”. And that is how it should be if the RISC OS community doesn’t want it to be as almighty as a very few users tend to describe it here and there. The concept of a self contained App was indeed created by Acorn back at the end of the 80s, but now it’s available on macOS and has been extended and improved, and macOS runs on ARM at low power and it’s a really nice OS presenting all the benefits of the old RISC OS and more. 
With that said, if you like RISC OS (again) that is great, but I’ll give you time to familiarise yourself with all the idiosyncrasies and inconsistencies. Enjoy your journey :) |
NancySadkov (10280) 34 posts |
Thanks. But for me the main issue is the absence of a tool-chain. I’m unemployed (mental issues), so can’t afford it, but even if I could, it really creates a barrier to entry. And then the compiler should come with source code, which I think DDE doesn’t include. So even if you convert everything to C, you won’t be able to re-target the compiler to get RISC OS running on your fav pregnancy test. The features like multi-threading and paging are again something orthogonal to the core OS design. Original Unix ran without MMU and you can run Linux without MMU or memory protection, and users won’t even notice that (until some software crashes the kernel instead of a segfault). These are again “left as an exercise to the reader.” But sticking with the classic British Micros spirit, I see it as an educational project, while writing a compiler or a thread-scheduler is not the worst undergrad project. |
Steve Pampling (1551) 8188 posts |
OK, so they finally realised they were being idiots. Still going to use NotePad++ because… |
NancySadkov (10280) 34 posts |
Ok. I made a github for the public domain RISC OS toolchain I work on. Also published my previously commercial NCM preprocessor as public domain. It needs to be cleaned up a bit and integrated with subc. It offers advanced macro features, which should be disabled for C99. There is also the prototype transpiler for the C extensions I used in my projects. Besides RAII, it does everything C++ doesn’t allow you to do, but you really want to do. Again, it should be implemented as an extension for people who dislike both C++ and ObjC.

char *int.cstr {
  int n = *this;
  int l = snprintf(NULL, 0, "%d", n);
  char *s = (char*)mAlloc(l+1);
  sprintf(s, "%d", n);
  return s;
}

char *U32.cstr {
  int n = *this;
  int l = snprintf(NULL, 0, "0x%x", n);
  char *s = (char*)mAlloc(l+1);
  sprintf(s, "0x%x", n);
  return s;
}

char *F32.cstr {
  F32 n = *this;
  int l = snprintf(NULL, 0, "%g", n);
  char *s = (char*)mAlloc(l+1);
  sprintf(s, "%g", n);
  return s;
}

char *F64.cstr {
  F64 n = *this;
  int l = snprintf(NULL, 0, "%g", n);
  char *s = (char*)mAlloc(l+1);
  sprintf(s, "%g", n);
  return s;
}

char *vec3.cstr {
  vec3 v = *this;
  int l = snprintf(NULL, 0, "(%g,%g,%g)", v.x,v.y,v.z);
  char *s = (char*)mAlloc(l+1);
  sprintf(s, "(%g,%g,%g)", v.x,v.y,v.z);
  return s;
}

S8 char.asS8 {return atoi(this);}
U8 char.asU8 {return atoi(this);}
S16 char.asS16 {return atoi(this);}
U16 char.asU16 {return atoi(this);}
S32 char.asS32 {return atoi(this);}
U32 char.asU32 {return atoi(this);} //FIXME: handle the full range
F32 char.asF32 {return atof(this);}
F64 char.asF64 {return atof(this);}

CFile *CFile.write(void *bytes, int len);
int CFile.read(void *bytes, int len);

#CFile_readType(type) {
inline type CFile.read#<type>() {
  type t;
  this.read(&t,sizeof(t));
  return t;
}}

CFile_readType(S8)
CFile_readType(U8)
CFile_readType(S16)
CFile_readType(U16)
CFile_readType(S32)
CFile_readType(U32)
CFile_readType(S64)
CFile_readType(U64)
CFile_readType(F32)
CFile_readType(F64)

CStrs *char.split(char delim) {
  auto r = new(CStrs);
  auto d = this.dup; //in case it resides in ROM
  auto q = d;
  do {
    auto p = q;
    while (*q && *q != delim) q++;
    if (!*q) {
      r.push(p.dup);
      break;
    }
    *q = 0;
    r.push(p.dup);
  } while(*++q);
  delete(d);
  return r;
}

/* ... */

CFile *in = new(CFile);
if (!in.ropen(filename)) goto fail;
PLYObject *hdr = read_header(in);
if (!hdr) goto fail;
//say(hdr->format);
if (hdr->format.ne("ascii 1.0")) goto fail;
char *l;
foreach(t,*hdr->tbls) {
  say("loading:", t->name,"[",t->nrows,"]");
  t->rows = mAlloc(t->nrows * t->cols.len * sizeof(F64));
  F64 *p = t->rows;
  F64s *list = &t->list;
  for (i = 0; i < t->nrows; i++) {
    l = in.line;
    mBegin(0);
    //FIXME: this needs a macro `with(words, l.split(' '))`
    auto words = l.split(' ');
    mBegin(gply);
    int k = 0;
    for (j = 0; j < t->cols.len; j++) {
      auto col = &t->cols.elts[j];
      if (col->size_type) {
        *p++ = list.len;
        //printf("%d\n", list.len);
        char *w = words.elts[k++];
        double nelems = string_as_number(col->size_type->id,w);
        list.push(nelems);
        int inelems = nelems;
        int nwords = words.len - k;
        if (inelems != nwords) {
          say("PLY: list is missing elems: ", inelems," != ", nwords);
        }
        for ( ; k < words.len; k++) {
          w = words.elts[k];
          double item = string_as_number(col->type->id,w);
          list.push(item);
          nelems++;
        }
      } else {
        char *w = words.elts[k++];
        // convert everything to F64
        *p++ = string_as_number(col->type->id,w);
      }
    }
    mEnd();
    mEnd();
  }
}
delete(in); |
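For readers who have not used NewC: the `CFile_readType(type)` macro in the post above stamps out one typed reader per type. A rough plain-C equivalent can be sketched with preprocessor token pasting. This is only an illustration under assumptions: `DEFINE_READ_TYPE` and the generated `read_<type>` names are hypothetical, it uses the stdio `FILE` type rather than the `CFile` type from the post, and it reports a short read through its return value instead of returning the value directly.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: generates read_<type>(f, out), which reads one
   value of the given type from f in host byte order.
   Returns 1 on success and 0 on a short read or error. */
#define DEFINE_READ_TYPE(type)                                  \
    static inline int read_##type(FILE *f, type *out) {         \
        return fread(out, sizeof *out, 1, f) == 1;              \
    }

DEFINE_READ_TYPE(int8_t)
DEFINE_READ_TYPE(uint8_t)
DEFINE_READ_TYPE(int16_t)
DEFINE_READ_TYPE(uint16_t)
DEFINE_READ_TYPE(int32_t)
DEFINE_READ_TYPE(uint32_t)
DEFINE_READ_TYPE(float)
DEFINE_READ_TYPE(double)
```

A caller would then write something like `uint32_t n; if (!read_uint32_t(f, &n)) goto fail;`. The NewC version reads more naturally at the call site because the method hangs off the file object; the C version has to spell the type into the function name.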
NancySadkov (10280) 34 posts |
GET/INCLUDE, WHILE, MACRO and [|] work now: Took longer than expected to get there. Unfortunately now the fun parts are over and it is about adding the rest of the features and checking that the output is binary equivalent to ObjAsm. If somebody wishes to help, please supplement the RISC OS or some other rich code base with a set of ObjAsm produced AOF files. Meanwhile I’m still looking at the best path towards a public domain build system. |
Lauren Etc. (8147) 52 posts |
For what it’s worth, I’m not the most adept with compiler stuff but I was curious so I did get ncc to build on RISC OS via gcc, once I fixed what appeared to be a syntax error in the parser.y:
It looks like it’s a cfront-style transpiler for something called New C, but I’m not familiar with that. It does look interesting though. Could you elaborate on that? Given the earlier discussion, was any of this AI assisted? I was also able to build ncm. I didn’t take much of a crack at getting subc set up to build on RISC OS, but I’m not sure it’s been modified anyway? |
Colin Ferris (399) 1819 posts |
Err – why not use the GNU assembler instead of ObjAsm? |
Rick Murray (539) 13872 posts |
Because they aren’t the same?
RISC OS has never supported ELF. It’s a softloaded extension that does that.
Anything else, it looks for the RunType system variable to see what to do to get the file running (BASIC is " |
NancySadkov (10280) 34 posts |
1. GAS is not compatible with ObjAsm. In fact, even GAS’s x86 syntax manages to be completely incompatible with Intel’s reference. I heard the reason for that was to dissuade people from using GAS to compile existing commercial software.
It will require a lot of work. Because it is not a complete C compiler. Author kept it simple for the purposes of his book (i.e. it doesn’t have a proper libc or even a macroprocessor). There is also Small C compiler, but it will be harder to mod into something working. Same way with Clang – it is too big to easily maintain. Luckily RISC OS comes with its own libc (Sources.Lib.RISC_OSLib.c), so the only thing we need is a Norcroft C compatible compiler.
It is a set of lightweight extensions to C99. It introduces `auto` variables, typeof and a way to define methods on structs and builtin types. I.e. you can take `int` and define a cstr method on it. So NewC supplies a `say` macro, which just calls cstr on its arguments and passes the result to printf. Obviously it supports vectors and matrices (see vec3.h). Basically NewC does the bare minimum to turn C into a comfortable language for larger projects. The prototype fared really well on my Voxpie voxel editor, so the idea is to add NewC as an extension to SubC, since I already have to reuse NCM for it. I have never bothered to document NewC, since I only used it myself, but the general operation is as follows: A few quirky parts:
Hope ELF will always stay as a 3rd party plugin. Makes more sense to continue developing AIF if something more complex will be required (i.e. if ROS gets ported to x86). |
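To make the `cstr`/`say` description above a little more concrete, here is a minimal sketch of the same idea in standard C11 using `_Generic`. It is not how NewC implements it (that is undocumented, as the post says), and it needs C11, so it would not build with a C99-only compiler. All names here (`int_cstr`, `uint_cstr`, `double_cstr`, `str_cstr`, `CSTR`, `SAY1`) are hypothetical, and plain `malloc` stands in for the `mAlloc` from the earlier post.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Each helper measures with snprintf(NULL, 0, ...), then allocates and
   formats. The caller owns the returned string and must free() it. */
char *int_cstr(int n) {
    int len = snprintf(NULL, 0, "%d", n);
    if (len < 0) return NULL;
    char *s = malloc((size_t)len + 1);
    if (s) snprintf(s, (size_t)len + 1, "%d", n);
    return s;
}

char *uint_cstr(unsigned int n) {
    int len = snprintf(NULL, 0, "0x%x", n);
    if (len < 0) return NULL;
    char *s = malloc((size_t)len + 1);
    if (s) snprintf(s, (size_t)len + 1, "0x%x", n);
    return s;
}

char *double_cstr(double n) {
    int len = snprintf(NULL, 0, "%g", n);
    if (len < 0) return NULL;
    char *s = malloc((size_t)len + 1);
    if (s) snprintf(s, (size_t)len + 1, "%g", n);
    return s;
}

/* Strings are simply duplicated so every result can be freed the same way. */
char *str_cstr(const char *p) {
    size_t len = strlen(p);
    char *s = malloc(len + 1);
    if (s) memcpy(s, p, len + 1);
    return s;
}

/* Pick the conversion function from the static type of the argument. */
#define CSTR(x) _Generic((x),          \
        int:          int_cstr,        \
        unsigned int: uint_cstr,       \
        float:        double_cstr,     \
        double:       double_cstr,     \
        char *:       str_cstr,        \
        const char *: str_cstr)(x)

/* Single-argument stand-in for `say`: convert, print, free. */
#define SAY1(x) do {                           \
        char *say_tmp_ = CSTR(x);              \
        if (say_tmp_) {                        \
            fputs(say_tmp_, stdout);           \
            free(say_tmp_);                    \
        }                                      \
    } while (0)
```

With this, `SAY1(42); SAY1(" items\n");` prints the number followed by the text. The obvious limitation compared with what the post describes is that `_Generic` needs every type listed in one place, whereas a method defined per type can be added wherever the type is declared.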
Theo Markettos (89) 919 posts |
I’m very late to this party, but puzzled why nobody has mentioned GCCSDK. It contains:
It’s been over a decade since somebody tried to cross compile RISC OS with GCCSDK but that page details the state of things at that time. Some of those issues have now gone away (the ‘shared makefiles’ mean there’s a Unix-friendly build system, now using git not CVS) but no doubt there are numerous further compatibility things lurking. I suggest it would be most productive to start by running the build on RISC OS, replacing ‘cc’ with ‘gcc’ (will need some flags changing, I think the makefiles may already have them) and ‘objasm’ with ‘as’ and see how it goes. Eventually you’ll find something won’t compile – either that means changing the source or fixing up asasm or other tool. |
Rick Murray (539) 13872 posts |
That screams of disinformation. Why would people writing a free to use assembler give a crap about what it is used to build? Artificial barriers would only get in the way of its adoption by people. More likely Intel have a good legal department who shook a stick and said “grrr” and thus things were made a little different not to fall foul of America’s crazy legal system.
GPL “infects” source code. Using a GPL tool to make something else doesn’t carry any specific penalty. Granted, I’m not sure about distributing it with the source, but the simple answer is just don’t. Rather like how RISC OS doesn’t come with ObjAsm and CC built in…
? Sure, you’re going to maybe have to dick around with file extensions, but that’s pretty par for the course given every other mainstream OS uses them, so network shares, emulators, Git… all needs extensions to be messed around with in some way or other.
It doesn’t need one. RISC OS has a standard library as a standalone (ansilib) and as a shared module (Shared C Library).
Plus, possibly, a linker… unless DRLink was ported to 32 bit?
AIF effectively isn’t developed and never was. Much of the stuff in there is for the debugger and is ignored by RISC OS.
I think the only thing RISC OS pays attention to is the presence of decompression code in order to perform the decompression itself (given a lot of the built-in decompression code failed on a machine with split caches). It’s in FileSwitch → FSControl if you fancy a rummage. When it comes to x86 executables, in some imaginary world where there’s an x86 RISC OS, I can guarantee that AIF will be worse than useless as the x86 doesn’t work like that. That’s part of the reason ELF is a mess. It has to cope with the fact that there are 16 bit machines, 32 bit machines, word aligned machines, little endian and big endian and maybe even middle endian machines, and…… Of course, RISC OS being RISC OS would probably cope like this:
|
Rick Murray (539) 13872 posts |
Well, that was unexpected. I copied the first five words of the Windows 3.1 TaskManager to the Ovation’s AIF header.
+00 00A25A4D ADCEQ R5, R2, R13, ASR #20
+04 00000003 ANDEQ R0, R0, R3
+08 00000020 ANDEQ R0, R0, R0, LSR #32
+0C 0007FFFF <undefined instruction>
+10 40650100 RSBMI R0, R5, R0, LSL #2
The result? Nothing happened. Well, actually something did happen. It is executing the code. The various unknown instructions, until it gets to this:
+6C CMP R3, #0
+70 MOVLE PC, R14
at which point it bails out. There are several unknown instructions (&7FFFF, &5B5B4, and &199C), but since the top nibble is zero this makes them EQ conditional and that’s not a status that is met, so these bogus instructions are simply skipped over. Entirely replacing the ARM code with the Windows 3.1 code, it actually starts up and makes it as far as &8200 where the instruction So, as you can clearly see, AIF… doesn’t do a lot within RISC OS itself. |
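The reason those junk words get skipped can be checked mechanically: in the classic 32-bit ARM encoding, bits 31..28 of every instruction hold a condition code, and 0000 means EQ (execute only if the Z flag is set). A small sketch in C, with hypothetical function names, that pulls out the condition field and evaluates it against a set of flags:

```c
#include <stdint.h>

/* ARM condition mnemonics, indexed by bits 31..28 of an instruction word. */
const char *const arm_cond_names[16] = {
    "EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
    "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"
};

/* Extract the 4-bit condition field from an instruction word. */
unsigned arm_cond_field(uint32_t insn) {
    return (unsigned)(insn >> 28);
}

const char *arm_cond_name(uint32_t insn) {
    return arm_cond_names[arm_cond_field(insn)];
}

/* Returns 1 if an instruction with this condition would execute given
   the N, Z, C, V flags (each 0 or 1), else 0. */
int arm_cond_passes(unsigned cond, int n, int z, int c, int v) {
    switch (cond & 0xFu) {
    case 0x0: return z;               /* EQ */
    case 0x1: return !z;              /* NE */
    case 0x2: return c;               /* CS */
    case 0x3: return !c;              /* CC */
    case 0x4: return n;               /* MI */
    case 0x5: return !n;              /* PL */
    case 0x6: return v;               /* VS */
    case 0x7: return !v;              /* VC */
    case 0x8: return c && !z;         /* HI */
    case 0x9: return !c || z;         /* LS */
    case 0xA: return n == v;          /* GE */
    case 0xB: return n != v;          /* LT */
    case 0xC: return !z && n == v;    /* GT */
    case 0xD: return z || n != v;     /* LE */
    case 0xE: return 1;               /* AL */
    default:  return 0;               /* NV: "never" on the original ARM;
                                         later architectures reuse this
                                         encoding for other instructions */
    }
}
```

Feeding it the words from the dump: `arm_cond_field(0x00A25A4D)`, `arm_cond_field(0x00000003)` and `arm_cond_field(0x0007FFFF)` all give 0 (EQ), while `0x40650100` gives 4 (MI), which matches the disassembly above. With Z clear on entry, every EQ word is a no-op regardless of what the remaining 28 bits say.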
Steffen Huber (91) 1958 posts |
On the very first page of this discussion, GNU GCC/GPL were ruled out categorically, so I refrained. It usually ends up anyway with discussions around ELF and “it’s not native enough”, which too often successfully derailed any reasonable discussion in the past. |
NancySadkov (10280) 34 posts |
I think one can introduce SWI_InitMe call, as the first word in the file or use one of the undefined ARM opcodes, for the newer AIFs, and then dynamic linker will read the header, doing unpacking and relocation, and whatever is required to improve portability between OS versions. But you can still have raw ARM files.
The thing with assembler/c-compiler/make is that one could want to add them into an existing code base. I.e. both assembler and linker could be integrated with the C compiler to avoid creating intermediate files. And that by extension requires the C compiler to be GPL. That is why I think a public domain toolchain could be useful in general. Now, at one point my personal Lisp-based programming language, Symta, used GCC to compile into source code, but Lisp supports `eval` (a way to compile code during runtime), while carrying GCC around didn’t feel right. Especially when one wants the compiled programs to be small in size. Now integrating GCC (a huge unmaintainable monster) and relicensing everything under GPL was not an option. Making a C compiler is not a small feat, so I just went with a personal VM, which required designing a bytecode and an assembly language for it, but was still more doable than a C compiler. I’m still looking towards compiling to native code, but again, I want to use a C compiler for that purpose to ease integration with FFI. Yet that still needs an easy to mod C compiler, so one could add builtin features specific to the application. That is one of the reasons I’m designing this assembler to be a single file include, so it could be used as a part of another program, like a debugger or a Lisp system. And given that I already need such utilities, why not make them useful for somebody else? RISC OS has a good amount of software in ObjAsm, which is closed source, so having an open source tool could help maintain it. I.e. people could mod the assembler to do conversion to their favourite GAS if they so desire. |
Steve Fryatt (216) 2107 posts |
We already have asasm, which came from as. I don’t know how compatible it is with ObjASM, but the README seems to suggest that at least some of the RISC OS sources have been built with it. For what it’s worth, I use it to build all of my modules using ObjASM source code, and have never found anything that it doesn’t do by the ObjASM manual. |
NancySadkov (10280) 34 posts |
asasm’s source code footprint is comparable in size to RISC OS’s Kernel (which is in assembly). And that is excluding all asasm dependencies (GNU binutils?). And it doesn’t look particularly well coded, with stuff like Input_MatchStringLower (“extension_register_count}”) going on for hundreds of lines. It would have taken you less effort to rewrite RISC OS into C at this point. And I keep wondering, if you need yet another GNU/Linux, why even bother with RISC OS? RISC OS seems like the complete opposite of everything Unix/Linux is. It is like learning Japanese just to translate the Bible/Quran into it. |
Clive Semmens (2335) 3276 posts |
I suspect there has been at least one person who has done pretty much exactly that, Nancy! :-) Although they may have done some preaching in Japan too, of course. I’ve met that kind of person in India. Their Hindi is better than mine, but it’s not so good that their Hindi Bibles wouldn’t have benefited from some cooperation with native speakers… |
Rick Murray (539) 13872 posts |
Isn’t that what missionaries used to do in order to bring religion to the natives they wanted to indoctrinate?
Well, we do say “RISC OS isn’t Linux”. But… What does needing (or not) another Linux have to do with whether or not asasm is any good? I’m not following the logic here…
Why? Here’s one for you to ponder. Will your compiler be able to create module code? If so, the libraries and stuff that it links with are not the same as applications. It’s the responsibility of the linker to ensure that all the attributes match. Likewise, not mixing 26 bit and 32 bit code… [in technical terms, basically if you’re a module and you’re in SVC mode then you’ll need to preserve R14 around SWI calls because calling SVC mode from SVC mode…] |
Dave Higton (1515) 3555 posts |
This has never been a good idea, and it still isn’t today, for the reasons Rick just gave. And more. |
NancySadkov (10280) 34 posts |
asasm is part of the GCC toolchain. If you start moving into this direction, you will bring in the rest of the GNU stuff and the GPL with it. At this point you may as well rewrite everything into Rust. I heard they are now doing that to Linux kernel, right after they added C++ support.
Depending on your build setup. But generally intermediates introduce a proportional slowdown, while inviting the use of hacks like RAM FS. Especially if you try to use GCC for something it was never designed to do, like being a part of a Lisp system, which tends to produce a lot of throwaway code, which calls not only for the absence of intermediate files but for a way to send pre-parsed and pre-checked syntax trees to the compiler. It may also expect the compiler to avoid inlining and behave in well-defined ways. I had an issue with GCC inlining everything, resulting in enormously sized files, but there was no option to disable the high level optimizations, leaving only the peephole ones.
GCC and associated tools don’t do it well, because you can’t easily integrate them any way you like, while they also require a Unix environment to function. They make a ton of assumptions.
Isn’t that a part of the calling convention? Something GCC is not very good at, since GCC doesn’t allow user to customize it. Because GCC was designed for a single compiler Unix system, where it dictates the entire ABI, including calling convention. To modify calling convention, you have to patch GCC and then somehow recompile it – an adventure on its own, since GCC uses super arcane bootstrapping, generating yacc parsers for simple config files, doing a ton of other nonsense. In fact, GCC even needs Perl to function!!! Luckily not Perl 6 ( See https://gcc.gnu.org/install/prerequisites.html ). Hope they will rewrite GCC in Rust one day. |
Steffen Huber (91) 1958 posts |
And this is relevant because…? Comparison to ObjAsm might be interesting, but irrelevant in practice I guess.
I would be surprised if the original AsAsm functionality would need binutils, but that might have changed when ELF output was introduced. Anyway, I fail to see the relevance.
So the obvious strategy would be to take AsAsm and improve its code. Because once you developed your replacement of ObjAsm and ironed out the last few bugs, it is likely that your codebase will also look like it was not particularly well-coded.
I would guess you are wrong by two orders of magnitude.
Does the existence of GCC on a platform automatically turn it into GNU/Linux? Amiga? Atari ST? DOS? Windows? Solaris? AIX? HP-UX? What you say makes little sense to me.
It is certainly very different. Which is one of the reasons why cross-compiling RISC OS on faster and more powerful platforms like Linux on x64 is such a good idea. Hence GCCSDK. |
NancySadkov (10280) 34 posts |
Maintainability. Bloated code-bases full of poorly written code (i.e. GNU software) are hard to modify. And assembler is obviously one of the more important parts of the OS. Since everything else depends on it. Only CPU is more important.
I would rather contribute to public domain. Then everyone could use it for any purpose.
If you spent this amount of effort to maintain a compatible assembler, why not remove any need for it instead? I.e. use GAS ARM, which doesn’t need any modding to GCC. Or go further and just use Linux, adding a WINE style emulation for RISC OS programs, then call it say RINE. I think some people tried that to a small degree by running RISC OS as a Linux kernel module or something. Guess they can be proud.
Nope. Just like bringing a ravenous bear into your house doesn’t turn your house into a forest.
Sounds a bit chauvinistic. |
Paolo Fabio Zaino (28) 1882 posts |
Ok, given some “imprecise” information has been shared, here is a set of corrections, just to avoid the proliferation of personal opinions shared as “facts”:
The source size of AsASM is roughly 938KB (that includes source comments and everything in the src directory).
No, an assembler normally assembles (doesn’t compile, aka doesn’t transform or add code). Hence there is absolutely nothing in an assembler that can introduce code that may force any type of licensing, unless the user voluntarily links or includes such extra code. AsASM in particular can produce AOF and RISC OS native formats, so nothing to see here, move along ;) It’s a different story for a linker and other tools of course.
Correct, as far as I know, it was fully compatible with ObjASM and also incorporated some extra extensions from other ARM toolchains, so it’s perfectly capable of producing the right results from ObjASM sources (which also reinforces the previous comment, ‘cause it shouldn’t be adding anything else to the binary)
No, unless you integrate that with your core system, but even in that case, if integrated as a sub-system, it’s still an extension. See Windows POSIX Sub-System, or even Windows WSL (which is a full Linux infrastructure, Windows is still Windows).
This to me feels like an extremely generic comment, sorry. An assembler (whatever assembler) can’t guarantee the OS infrastructure and syscall set. For that you need some degree of emulation, even when adopting techniques like JIT or AOT compilation.
That is generally a terrible idea which will lead to an exponentially increased complexity, for which there is no need or use for. If you’re trying to integrate a front-end, optimizer, checker and assembler into a single process, it’s still convenient to have separate representations of the user’s code at every stage even in the case you’re building for a VM, it just makes your life easier especially when you need to maintain your code on the long term. But of course you’re free to implement your code the way you prefer and if turns out to be messy people are also totally free to completely ignore the tool.
This is indeed true, GCC and everything built for GNU Linux indeed make a ton of assumptions, however most of them are what is considered the common set of assumptions of modern Operating Systems. I am not arguing whether such a choice is the right choice or not, that is for each person reading this forum to decide. However, a good portion of such assumptions are now standardised across many OSes and architectures. RISC OS on the other hand is an 80s microcomputer operating system; at that time even the concept of an Operating System was extremely weak and most micros came just with ROM based libraries of Service Function Calls and a BASIC interpreter using them, so, while RISC OS had a better start than a Commodore 64, it certainly lacks most of such modern assumptions.
RISC OS IS the complete opposite of Unix/Linux. It’s designed to work as close as possible to the Acorn MOS, that is its nature. Over the years people added on top of that without refactoring the core assumptions, hence it’s literally the evolution of Acorn MOS, like an “MS-DOS 20” would be instead of Windows NT, which became a “different timeline”. Putting the UnixLib on top of RISC OS is an interesting experiment, which a group of people is enjoying doing and that some users like. But given RISC OS is completely irrelevant to modern computing, what is wrong with it? It’s an OS to have fun with, and that makes your project as valid as the UnixLib and associated tools. So, I am not sure why you seem to have a problem with this. You want to prove you are right? Great, sit down, code your stuff and prove it through your code, there is nothing better than this for you and for others. I had a quick look at your work, you’ve just started your journey, keep at it, don’t stop, certainly at some point you’ll have learned a lot and/or produced something very useful. As far as RISC OS is concerned? Oh, the amount of problems on RO is so high that a public domain assembler is just a drop in the ocean, so after that you’ll have plenty of challenges to pick and have fun with ;) |
Rick Murray (539) 13872 posts |
Give it half an hour… |