RPi 4B with RISC-OS 5.28 & 5.29 Lockups
Pages: 1 2
|
You guys need to brush up on your 2’s complement arithmetic. Really. |
|
Should work, as -20 plus 10 is -10, which will be the same as going from a big hex number to a slightly larger hex number (I’m writing this on my phone so can’t ask BASIC to […]). That being said, the Wimp Poll/Idle source is, uh, interesting. A Byzantine maze of exceptions. So I’m not going to go looking to see if it’s a signed or unsigned comparison… |
|
Already tested that, ages ago, when I still thought giving it a large unsigned integer (a.k.a. a negative signed one) would work. It didn’t (i.e. it returned immediately). |
|
For the purposes of OS_ReadMonotonicTime and Wimp_PollIdle, where we’re just doing additions and subtractions and testing relative values, it shouldn’t matter. If you never compare two time values directly, but always subtract one from the other and compare the result to zero, things cancel out just fine. That’s why you get constructs like this one, paraphrased a bit from the PRM (page 3-185, in the description of Wimp_PollIdle):
Assuming, of course, that internally, both OS_ReadMonotonicTime and Wimp_PollIdle are trading in unsigned 32-bit ints. The PRM doesn’t explicitly say they are, but the OS_ReadMonotonicTime description implies it. |
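A minimal sketch of the subtract-then-compare idiom, written in Python rather than BBC BASIC so that the 32-bit wrap-around has to be spelled out. The helper names `to_signed32` and `time_reached` are illustrative assumptions and do not come from the PRM or the Wimp source.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def time_reached(now, target):
    """True once 'now' is at or past 'target', judged by the sign of the difference."""
    return to_signed32(now - target) >= 0

# -20 and -10 as 32-bit patterns: a big hex number and a slightly larger one.
earlier = -20 & MASK32        # 0xFFFFFFEC
later = -10 & MASK32          # 0xFFFFFFF6
print(time_reached(later, earlier))           # True

# The same test still works when the counter wraps between the two readings.
before_wrap = 0xFFFFFFF6
after_wrap = 0x0000000A
print(time_reached(after_wrap, before_wrap))  # True
print(after_wrap > before_wrap)               # False: a direct comparison gets it wrong
```

The direct comparison on the last line is the one that breaks at the wrap; the subtraction does not care where in the range the two readings sit, as long as they are less than half the range apart.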
|
And Wimp_PollIdle doesn’t. Tested extensively, some time during the 2010s. |
|
The result is treated as signed. It doesn’t matter whether the numbers being compared are signed or unsigned, so long as (a) they are both of the same type, and (b) the difference is less than half the range of the numbers. Like I say, brush up on your 2’s complement arithmetic. IIRC there used to be a problem in the Wimp at the wrap-round of the monotonic timer, but it was fixed decades ago. |
|
The relevant code seems to be […]
from hereabouts down to the […]. That looks to be very similar to the […]
construct from the Wimp_PollIdle PRM entry? If “now” is greater than “return time” then return? PS. That’s got to be a contender for a “most idiotic comment” award, hasn’t it? I can see that you’re using […] |
|
Gentlemen, there’s a degree of confusion here.
1. The time parameter for Wimp_PollIdle is an absolute MonotonicTime. If you want to yield for one second, do […]
2. (Related point) Don’t do anything just because you tried it and it didn’t obviously catch fire.
3. If you want to be called back immediately, don’t use PollIdle – just use Wimp_Poll as the gods intended.
The code in question is this simple (my labels) – the pink bit: […]
Task_R2 is the MonotonicTime the task asked to idle until. It will therefore NOT receive a null event until MonotonicTime is greater than or equal to that number, by subtracting it and comparing with zero. But also note that null events are the lowest priority event, and your task will be called back regardless of the idle time if anything else at all happens. For the record, the events are delivered in this priority order (i.e. first one that happens gets delivered): […]
HTH |
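To illustrate point 1, here is a rough Python model of the difference between passing an absolute time and passing a relative delay. It assumes the null-event test is the signed subtract-and-compare described in this post; `null_event_due` and the sample clock reading are hypothetical, and nothing here is taken from the Wimp source.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def null_event_due(monotonic_time, idle_until):
    """Model of the test: due when (time - idle_until) >= 0 as a signed value."""
    return to_signed32(monotonic_time - idle_until) >= 0

now = 500_000            # hypothetical clock reading, in centiseconds
one_second = 100

correct_target = (now + one_second) & MASK32   # absolute time, one second ahead
wrong_target = one_second                      # relative delay passed by mistake

print(null_event_due(now, correct_target))               # False: still waiting
print(null_event_due(now + one_second, correct_target))  # True: the second has elapsed
print(null_event_due(now, wrong_target))                 # True: already in the past
```

With this clock reading the mistaken relative value behaves like a plain Wimp_Poll, which is why such a bug can go unnoticed in casual testing.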
|
In context, it gets credit for someone considering whether a signed comparison was required. Regrettably we have not always been so lucky in that regard – see VDU-12345678 in a TaskWindow for example. |
|
Incidentally, the above priority list is why it is a Very Bad Idea™ for a task to send a message to itself – Wimp_Poll will just immediately return without a context switch. Do not mistake the WindowManager for any kind of multitasking scheduler. There is no concept of time-slicing or time-starvation. It’s a very simple hack of the single-tasking Wimp and it’s a testament to legions (generations, even) of Wimp programmers that the desktop works as well as it does. |
|
I wonder if the confusion here is coming from people not realising that there’s a limit to the time delay which can be applied to Null events by Wimp_PollIdle, due to the wrap-around of the arithmetic? The delay can only be half of the total timespan allowed by OS_ReadMonotonicTime: &FFFFFFFF centiseconds is just over 497 days, so we can only delay by 248-and-a-half days before the comparison wraps around from “in the future” to “in the past”. Given this, Paolo’s -1 will be “in the past” until the machine has been running for more than 248 days, and so for that time it will return immediately with a Null event (just as if Wimp_Poll were used). However, after 248 days, it will start to cause Wimp_PollIdle to block Null events until 497 days have elapsed. Then there will be another 248 days where it returns immediately, and so on. In a similar vein, a “large unsigned integer” is more likely to be “in the past” than “in the future” for the first few days after the system has booted. Testing the Wimp’s behaviour is tricky unless the monotonic timer is tampered with. |
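The figures above can be checked with a few lines of Python. This is only a sketch: it assumes the signed subtract-and-compare test discussed earlier in the thread, and `null_event_due` and `uptime_to_cs` are made-up helper names.

```python
MASK32 = 0xFFFFFFFF
CS_PER_DAY = 100 * 60 * 60 * 24    # centiseconds in one day

def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def null_event_due(monotonic_time, idle_until):
    """Due when (time - idle_until) >= 0 as a signed 32-bit value."""
    return to_signed32(monotonic_time - idle_until) >= 0

def uptime_to_cs(days):
    """Monotonic time after the given uptime, wrapped to 32 bits."""
    return (days * CS_PER_DAY) & MASK32

full_range_days = 0xFFFFFFFF / CS_PER_DAY   # a little over 497 days
half_range_days = 0x80000000 / CS_PER_DAY   # about 248.5 days

# A fixed idle time of -1 (&FFFFFFFF), checked at various uptimes.
fixed_target = -1 & MASK32
timeline = [null_event_due(uptime_to_cs(d), fixed_target)
            for d in (1, 200, 300, 490, 500)]
print(full_range_days, half_range_days)
print(timeline)    # [True, True, False, False, True]
```

The `False` entries are the stretch between roughly day 248 and day 497 in which a task using the fixed value would stop receiving Null events.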
|
First of all, thank you everyone for checking this! :) @ Steve
Not quite, the BPL instruction uses the N flag (Negative), so if N is not set… hence your BBC BASIC code should be:
Also, I need to make an apology: I have mentioned -1, but I do not use -1 as a magic number. Sorry, I was in a hurry due to other things going on at the same time and cut my content way too much! For my test, I used:
Where user_delay% in my test was set to -1; normally it is set to 0 for a redraw and to 7 otherwise. In my understanding that should return immediately. As explained by nemo, I do not consider RO an OS with a real multi-tasking scheduler (as I have mentioned billions of times, for me it’s an Acorn MOS pumped with steroids), so using an already expired time sounds like it should just return immediately to me. In other words, the difference between +0 and +(-1) should be along the lines of: with +0 there might be something else happening (like trying to switch to another task), while with +(-1) there is no chance of that, and it just returns straight to my task and, in my case, executes the next chunk of work. I guess I am wrong then. Again, thanks for checking it :) |
|
Yes, one of the first clients that was updated to use Wimp_PollIdle (and incorrectly) was the internal MailMan at Acorn. After the requisite interval had elapsed on one of the manglement desktops, it no longer polled for mail. Ironically, this particular mangler had wangled one of the early A440 production systems so he could boot either RISC OS or RISCiX and was ‘going to be doing that all the time’. Clearly not for 248 days, they hadn’t… [Edit: as Nemo infers below, they did in fact barely use the system.] |
|
Steve garbled
Days. If you manage to 1. Use RISC OS without crashing or resetting for eight months; and 2. Write a program that does so very little that the idle null poll is the first thing it hears about; then you win a special prize from the RISC OS faeries. |
|
Sorry guys, but I am lost: what have all these messages to each other got to do with my locking-up issue? Daniel. |
|
It moved off onto the orthogonal topic of tasks ceasing to perform their function (“locking up”) because of a failure to understand Wimp_PollIdle. |
|
@ Daniel It started with people trying to understand why you are experiencing that locking up, but unfortunately it has moved off onto a side discussion because some people have made quite an enormous set of assumptions, which led to some confusing conclusions. In your case (and even in the others mentioned here) I don’t think that even a mistaken use of Wimp_PollIdle is causing the issue. Here is why:
1) Some people mentioned that using now% - 1 would cause problems at some point in the future. This is not quite right (it can happen only in an extremely remote condition). The monotonic timer will reset after a while: in the end it is a 32-bit number, so when it reaches its maximum it restarts from zero. This seems to be the sole element evaluated by those who think it will cause problems in the future, but there is more to consider:
a) The Wimp_PollIdle interval only decides which NULL events are sent to our task; it doesn't preclude other events and messages. This is absolutely crucial, because in the rare case of "locking up" by using -1, as soon as a new event comes in the interval will be reset to proper numbers and everything will be back to normal. So I agree with Nemo, someone is definitely confused about how Wimp_PollIdle works and affects RO; hopefully this will help.
b) NULL events are the lowest priority AFAIR, so any other event will reach a task, and that will also trigger a reset of the now% - 1, which will then solve the problem.
c) To actually get into the case where the number produced is going to lock a task, we need to execute the now% - 1 just as the timer is resetting, and that task must only be accepting NULL events. That requires a very specific configuration, which I have never seen done tbh, also because a task should accept at the very minimum a signal to quit. But even in this case, on modern RO there is a chance to kill that specific task using [ALT]+[F12].
2) Someone suggested using Wimp_Poll in cases where we wish to receive ALL the NULL events, and that is true, but not quite the same. In fact using Wimp_Poll should equate to using Wimp_PollIdle with now% + 0 or something, not the same as now% - 1, which will literally return immediately to the original task; no process "swap" will happen at all.
In your case, if the system is locking up, it’s most likely something else causing the problem, and I would start by investigating which modules you have loaded etc. Hope this helps. |
|
Paolo monospaced
No. now%-1 is pointless (use Wimp_Poll) but harmless – wrap-around is never an issue in that case. The problem is in the theoretical case of using a fixed number, whether -1, 0 or RND, which in the worst-case might not return for eight months.
Yes.
No. There’s no difference. I posted the lines of code above – the Wimp will return to your task if MonotonicTime is >= your idle time. This will happen only if there is no other event to be delivered anywhere, but will happen [1] regardless of whether your idle time was now%, now%-1 or now%-2147483647. As Dave has pointed out, this is a simple matter of twos-complement arithmetic.
[1] Your task will also only be called back immediately if there are no other tasks waiting for nulls – they’re delivered in round-robin manner to avoid one task monopolising things. |
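A short Python check of the three idle times mentioned above, plus the first value that falls outside the "in the past" half of the range. It assumes the >= test is a signed 32-bit subtract-and-compare, as discussed in this thread; the function names and the sample clock readings are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def null_event_due(monotonic_time, idle_until):
    """Due when (time - idle_until) >= 0 as a signed 32-bit value."""
    return to_signed32(monotonic_time - idle_until) >= 0

def due_for_offsets(now, offsets):
    """Evaluate the test for idle times of now - offset, wrapped to 32 bits."""
    return [null_event_due(now, (now - off) & MASK32) for off in offsets]

offsets = (0, 1, 2147483647, 2147483648)

# Hypothetical clock readings: one mid-range, one shortly after boot.
print(due_for_offsets(123_456_789, offsets))   # [True, True, True, False]
print(due_for_offsets(5, offsets))             # [True, True, True, False]
```

Only the last offset, exactly half the 32-bit range, flips from "in the past" to "in the future"; the result does not depend on the clock reading itself.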
|
That was, originally, my understanding as well, but I swear I have seen my Launchpad redrawing the icons faster than when using now% + 0. I’ll re-test again tonight (it may have been just some peculiar combination of things).
In my test, that was probably the case, as it was the only user executed task running. In any case, if I see again that visible difference I’ll make a video and post it somewhere. |
|
Hi Paolo, You mentioned testing modules. What is the best way to do that, and is there a copy of !Boot that has a minimum number of modules loaded? Thanks. Daniel. |