
RPi 4B with RISC-OS 5.28 & 5.29 Lockups

 
Jan 22, 2023 1:20pm
Daniel Garrod (9459) 16 posts

Hi everyone!

I run The Jolly Roger BBS on my Pi4 and just lately I have been experiencing hard lock ups running RISC-OS. I have tried both versions, 5.28 and 5.29.
The lockups happen at random intervals, anything from 10 minutes or 2 hours up to 3-4 days. Has anyone else had lockups running RISC-OS on a Raspberry Pi4B 4GB?

Kind regards,
Daniel.

 
Jan 22, 2023 2:16pm
Colin Ferris (399) 1422 posts

What programs are you running?

 
Jan 22, 2023 2:39pm
Martin Avison (27) 1312 posts

Does an Alt-Break show a program to stop, and does RISC OS then start running? Or is a complete restart required?
What is the date of your 5.29? There are many different versions of the development ROMs!

 
Jan 22, 2023 3:48pm
Daniel Garrod (9459) 16 posts

The Pi4 is currently running RISC-OS 5.29 (06.01.23) with ArmBBS BBS software and a Telnet server; Alarm is also used for backup tasks. Nothing much else is running. I am using the nightly HardDisc4 boot sequence. I have tried the 5.28 SD image as well; it locks up very randomly on both versions.

When it locks up, the only way to recover is to hold the power button down until it goes off, then power on again.

 
Jan 22, 2023 8:26pm
Jon Abbott (1421) 2356 posts

Has anyone else had lockups running RISC-OS on a Raspberry Pi4B 4GB?

Not a Pi4, but I was seeing them on my Pi3 – since that post I’ve not used the Pi3 long enough to see if it still occurs with 5.29/5.30

 
Jan 22, 2023 8:39pm
Stuart Swales (8827) 847 posts

Though there was this “Out-of-spec USB devices can cause a system crash (Pi 4 only)” on https://www.riscosopen.org/wiki/documentation/show/RISC%20OS%20bugs%20specific%20to%20the%20Raspberry%20Pi

Wasn’t there some issue with early Pi 4 firmware that could cause a GPU lockup?

 
Jan 22, 2023 9:17pm
Andrew McCarthy (3688) 435 posts

RE: Out-of-spec USB devices. I assumed all disc adapters were equal; I discovered they are not. Here are a couple of links that may help if you think there may be an issue.

https://jamesachambers.com/raspberry-pi-4-usb-boot-config-guide-for-ssd-flash-drives/
https://www.jeffgeerling.com/blog/2020/uasp-makes-raspberry-pi-4-disk-io-50-faster

An out-of-date Line Editor module will cause lockups if using the task window.

I have a Pi 4, and I haven’t seen a lockup for some time now.

Is it possible that one of your programs has a memory leak? You may need to set up a remote debug session. Or can you trace it to a specific action or event?

I see you say you are using a nightly build. Try a new build, using one of the stable releases.

A final thought. How is the disc drive? Have you tried a verify or checked it with DiskKnight?

 
Jan 22, 2023 9:21pm
Stuart Swales (8827) 847 posts

I’ve experienced a ‘user lockout’ as I have PS/2 keyboard and mouse attached to an old KVM then wired via PS/2-to-USB adaptor into a Pi 4. Maybe every other day on switching over to the Pi it will appear to be unresponsive to keyboard or mouse input, but ‘gets better’ when I disconnect and then reconnect the USB lead at the Pi end.

 
Jan 22, 2023 9:45pm
Andrew McCarthy (3688) 435 posts

That reminds me, my Pi 3 gave me a few headaches (lockups); I traced it to a dodgy cabled mouse; no more issues since I switched to a USB dongle wireless mouse.

 
Jan 23, 2023 7:59am
Paul Sprangers (346) 365 posts

I’ve experienced a ‘user lockout’ as I have PS/2 keyboard and mouse attached to an old KVM then wired via PS/2-to-USB adaptor into a Pi 4. Maybe every other day on switching over to the Pi it will appear to be unresponsive to keyboard or mouse input, but ‘gets better’ when I disconnect and then reconnect the USB lead at the Pi end.

I have exactly the same experience, albeit with a wireless keyboard/mouse and a new KVM. But could it be related to Daniel’s lockups?

 
Jan 23, 2023 9:52am
Colin Ferris (399) 1422 posts

Is there a way of logging the PC to a circular buffer, or some other way of narrowing down the fault?

 
Jan 23, 2023 11:40am
Daniel Garrod (9459) 16 posts

Just come back from going out. I have just tried to log in via VNC to check on the BBS; ping and VNC doesn’t work, so it has locked up again :(

I am running the Pi4 headless, with no keyboard or mouse connected.

 
Jan 23, 2023 12:02pm
Martin Avison (27) 1312 posts

You may find that my Reporter will give some clues about the lockup, as it will log any command activity and may even catch an error. You would need Logging on to capture details even after a lockup, but that may slow things down a little. See website for download, then !Help for details.

 
Jan 23, 2023 12:16pm
Daniel Garrod (9459) 16 posts

Thanks Martin, I will try it out and see what happens…

 
Jan 23, 2023 12:29pm
Andrew McCarthy (3688) 435 posts

just lately I have been experiencing hard lock-ups

My interpretation of what you’ve said above: it’s been running with no problems for a while, and now it’s started to lock up. You have made no changes or applied any updates. There is plenty of disc space and no errors.

Am I correct?

I wouldn’t be using the nightly builds on a live service unless you have good backups or doing so fixed a particular issue.

Based on what you’ve said, I would want to create logs: local first, then remote.

 
Jan 23, 2023 1:57pm
Daniel Garrod (9459) 16 posts

I originally had the Pi4 running on 5.28, and it would lock up at random times. That is why I went over to 5.29, which didn’t solve the issue. I have put the latest upgrades on the Pi4 in the hope that it would help the situation, but it hasn’t worked… I have started collecting data using Martin’s Reporter tool, so we will see what comes of it.

 
Jan 23, 2023 2:12pm
Rick Murray (539) 12210 posts

and VNC doesn’t work, so it has locked up again :(

For mine, I don’t run anything like VNC. In the sysop tools is access to the command line (currently it’s only possible to log in as a sysop from a local LAN address). Any more than that, I’ll need to get up and turn the monitor on. ;)

My issue is completely different. There are no lockups, but after a while (usually numerous days), my other thing, the weather station logger, simply ceases to work. The app is still running, but it no longer does the every five minute update.

Does anybody know how long it takes the centisecond ticker to hit negative numbers? Maybe that’s part of the reason?
Hmm, what does the Wimp do if one gives a value to PollIdle that’s in the past?

Try yours without VNC, or Alarm, or anything other than the BBS server and the telnet gateway.
Can you log all access attempts? Let’s just say that my server does a quick lookup of the IP address and if it’s Russia or China or certain other less salubrious parts of the world, the connection is immediately dropped. There’s still loads of crap that gets through, though, including one that spews a load of control characters and such. And, of course, attempts to log in as “root” (password “root”), seriously?!?
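As a rough illustration of the "log every attempt, drop the unwanted ones" idea, here is a minimal Python sketch. It is not Rick's server code: the function names are made up for this example, and the blocklist uses reserved documentation address ranges as stand-ins for whatever lookup a real server would do.

```python
import ipaddress
import logging

# Hypothetical blocklist. These are reserved documentation ranges used as
# placeholders; a real server would use its own lookup or database.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def should_drop(peer: str) -> bool:
    """Return True if the peer address falls inside any blocked network."""
    address = ipaddress.ip_address(peer)
    return any(address in network for network in BLOCKED_NETWORKS)


def record_attempt(peer: str, log: logging.Logger) -> bool:
    """Log the connection attempt and return True if it should be dropped."""
    dropped = should_drop(peer)
    log.info("connection from %s: %s", peer, "dropped" if dropped else "accepted")
    return dropped


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    logger = logging.getLogger("bbs-access")
    for candidate in ("203.0.113.7", "192.0.2.1"):
        record_attempt(candidate, logger)
```

Because every attempt is logged with a timestamp before the decision is acted on, the last entries written before a lockup show what the machine was dealing with at the time.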

 
Jan 23, 2023 5:59pm
Paolo Fabio Zaino (28) 1307 posts

@ Rick

Hmm, what does the Wimp do if one gives a value to PollIdle that’s in the past?

I can’t speak for the specifics here Rick, but I use a PollIdle of -1 (so the negative value as you mention it), when my Launchpad has to redraw a lot of icons (hundreds and hundreds). In terms of user experience, that makes the return to the App for the next NULL event pretty much immediate, almost like a single task app. However, given that NULL events have low priority, it doesn’t make the App fully behave like a single task app; everything else stays responsive and, obviously, CPU usage grows.

This allows a really fast background (and multi-tasking) redraw of a large amount of icons (please note that icons redraw on Launchpad is multi-tasking, aka doesn’t lock the desktop, doesn’t matter how many icons one has on their installation).

IIRC, the only exception to the above that I have noticed is, with very low (or negative) values, if the App is doing I/O (for instance accessing a file), it effectively becomes a single task app, aka the other Apps in the desktop freeze for a while, until I/O completes (this even when the I/O is done in multi-tasking like I do it on Launchpad, aka read a portion of a file and then return control to desktop and read the next portion on the next NULL event and so on and so forth).

Hope this helps; again, not sure how useful this info is in this specific case.

 
Jan 23, 2023 6:19pm
Stuart Swales (8827) 847 posts

PollIdle of -1

‘A decision that I later came to bitterly regret’

Doesn’t it ‘go negative’ after about six months uptime? Yeah, 124^H^H^H 248 days. So after that point, -1 will suddenly transform from ‘a long time ago’ to ‘waaay in the future’.
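The figures can be checked with a little arithmetic. This is a minimal Python sketch, assuming only that the ticker is a 32-bit count of centiseconds (100 ticks per second); the helper name is made up for the example.

```python
# Centisecond ticks in one day: 100 ticks/s * 86400 s/day.
TICKS_PER_DAY = 100 * 60 * 60 * 24


def days_until(ticks: int) -> float:
    """Convert a centisecond tick count into days of uptime."""
    return ticks / TICKS_PER_DAY


# Read as a signed 32-bit value, the counter turns negative once bit 31 is set.
signed_limit_days = days_until(2**31)
# Read as an unsigned 32-bit value, it wraps back to zero after 2**32 ticks.
unsigned_wrap_days = days_until(2**32)

print(f"goes negative after {signed_limit_days:.1f} days")  # 248.6 days
print(f"wraps to zero after {unsigned_wrap_days:.1f} days")  # 497.1 days
```

So 248 days is the point at which a signed reading goes negative, and roughly twice that is the full wrap of the 32-bit range.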

Waaay back I did have a module that incremented MetroGnome with an additional number (asm-time determined) of ticks per TickerV to get times to advance faster than usual.

 
Jan 23, 2023 6:22pm
Steve Fryatt (216) 1802 posts

Hmm, what does the Wimp do if one gives a value to PollIdle that’s in the past?

According to the docs, it will just return a Null event immediately. The PRM has a specific comment on this in its remarks on the SWI.

I use a PollIdle of -1

What’s the intent of that, though? The value passed in R2 is an absolute monotonic time, so -1 is 0xffffffff, or the end of the monotonic time range. So in effect I think you’re saying “don’t return a Null event to me until around 500 days after the machine booted, or the same number of days after time last wrapped around”?

I suppose that as the machine gets close to the 497-plus-a-bit days, there’s a time when applications might start to find the call returning somewhat sooner than expected if their intended delay causes the current time value to overflow when the two are added.
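Both readings of a target time can be modelled in a few lines of Python. This is only a sketch: it does not claim to show how the Wimp actually compares times (that would need checking in the sources), and the helper names are made up. It contrasts a plain unsigned comparison with a comparison based on the sign of the 32-bit difference.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def passed_unsigned(now: int, target: int) -> bool:
    """Naive check: treat both times as plain unsigned numbers."""
    return (now & MASK32) >= (target & MASK32)


def passed_by_difference(now: int, target: int) -> bool:
    """Wrap-tolerant check: look at the sign of the 32-bit difference."""
    return to_signed32(now - target) >= 0


def wake_time(now: int, delay: int) -> int:
    """now + delay, truncated to 32 bits as a register would hold it."""
    return (now + delay) & MASK32


# Case 1: shortly after boot, target of -1 (0xFFFFFFFF).
now = 1000
print(passed_unsigned(now, 0xFFFFFFFF))       # False: target looks far in the future
print(passed_by_difference(now, 0xFFFFFFFF))  # True: target looks 1001 ticks in the past

# Case 2: close to the wrap, the sum now + delay overflows.
now = 0xFFFFFF00
target = wake_time(now, 0x200)                # 0x100 after truncation
print(passed_unsigned(now, target))           # True: returns sooner than intended
print(passed_by_difference(now, target))      # False: still 0x200 ticks away
```

Under the unsigned reading, -1 means "almost 497 days after boot"; under the signed-difference reading, it means "just before now". Which of the two applies to PollIdle is exactly the open question in this thread.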

 
Jan 23, 2023 6:27pm
Steve Fryatt (216) 1802 posts

Yeah, 124 days

248, isn’t it?

So after that point, -1 will suddenly transform from ‘a long time ago’ to ‘waaay in the future’.

Please tell me that we’re treating the output of OS_MonotonicTime as an unsigned 32-bit int…?

 
Jan 23, 2023 6:55pm
Paolo Fabio Zaino (28) 1307 posts

‘A decision that I later came to bitterly regret’

On Launchpad there is a config file that allows this to be changed. The default is to use 0, but on my local system I have experimented with -1. The option to set is the following:

RefreshPriority: 0

It’s in a file called Config and that is located in !DeskCfg:Gadgets.Iconbar.Launchpad

[edit]
To be clear: in my tests, using -1 made Launchpad icons redraw faster than using 0. I think this did not come out well with my bad English. The tests, as usual, lasted only a few hours (between testing, doing something else, leaving the system, coming back and testing again, that kind of process), so I can’t tell what happens after more than that time.
[/edit]

 
Jan 23, 2023 7:00pm
Rick Murray (539) 12210 posts

Please tell me that we’re treating the output of OS_MonotonicTime as an unsigned 32-bit int…?

What’s an unsigned int in BASIC?

There’s this grey area where stuff like “&FFFFFFFF is -1” exists. ;)

 
Jan 23, 2023 7:00pm
Paolo Fabio Zaino (28) 1307 posts

What’s the intent of that, though? The value passed in R2 is an absolute monotonic time, so -1 is 0xffffffff, or the end of the monotonic time range. So in effect I think you’re saying “don’t return a Null event to me until around 500 days after the machine booted, or the same number of days after time last wrapped around”?

And that is what I thought when I tested it; however, the results of my tests are quite the opposite. It is possible that it’s either transformed into a 0 (as the default value above) or that it triggers something (I have had no time yet to dig into the sources for this, tbh).

People can run their own experiments when it is fully available. In the test, what I observed was a visible reduction of “latency” in the start of the multi-tasking icon redraw, compared to, for example, using 1 as the value.

I suppose that as the machine gets close to the 497-plus-a-bit days, there’s a time when applications might start to find the call returning somewhat sooner than expected if their intended delay causes the current time value to overflow when the two are added.

Possible, but we should also check how the value is used when negative. I am not 100% sure that it is just passed straight through, but (again) I could be wrong; I have had no time to check it in ROOL’s sources yet.

 
Jan 23, 2023 7:04pm
Paolo Fabio Zaino (28) 1307 posts

Please tell me that we’re treating the output of OS_MonotonicTime as an unsigned 32-bit int…?

In BBC BASIC calling SYS? I doubt it.

What’s an unsigned int in BASIC?

This, lol, wise words! And yes, it did drive me nuts at the beginning when I got back into BBC BASIC, together with having to “reset” my brain to use a language without data structures and with a peculiar concept of “pointers”.

XD

There’s this grey area where stuff like “&FFFFFFFF is -1” exists. ;)

If someone has time, can we please check how PollIdle treats such a value, pleaseee? I am still working and it’s a busy day here, so I won’t be able to check RISC OS stuff before very late tonight, thanks! :)
