RISC OS Open
Safeguarding the past, present and future of RISC OS for everyone
Forums → Bugs →

Filetype corruption with SparkFS on recent 5.25 builds

33 posts, 12 voices

 
Jul 6, 2018 12:35pm
Avatar Andrew Rawnsley (492) 1353 posts

The fault seems to affect all boards following a change made sometime between May 28th and June 12th.

An issue occurs with SparkFS when zipping small files on recent builds of 5.25 on a range of filing systems.

Essentially, small files (I’ve seen it with files up to 2306 bytes, but a 6 KB file was OK, so maybe 4 KB is the cut-off?) are incorrectly typed, having 0000 load/exec addresses. It is pretty easy to reproduce – zip a small file (e.g. an Obey file or a small text file) and see whether the filetype is preserved in the zip.
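
For readers checking results programmatically: RISC OS encodes the 12-bit filetype in the top of the load address (the &FFFtttxx form), which is why zeroed load/exec means a lost filetype. A quick sketch in Python (the helper name is my own invention):

```python
def decode_filetype(load_addr):
    """Return the 12-bit RISC OS filetype held in a load address,
    or None if the file has plain load/exec addresses instead.
    A load address of the form &FFFtttxx carries filetype &ttt."""
    if (load_addr & 0xFFF00000) == 0xFFF00000:
        return (load_addr >> 8) & 0xFFF
    return None

# A correctly typed Obey file (&FEB):
print(hex(decode_filetype(0xFFFFEB00)))  # 0xfeb
# The corrupted case from this thread: load/exec both zero
print(decode_filetype(0x00000000))       # None - filetype lost
```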

David’s on the case, but since this seems to have been triggered by an OS change, it would be helpful to have confirmation from ROOL (or whoever made recent Fileswitch etc changes) before he starts modifying code based on beta-status changes.

 
Jul 6, 2018 1:23pm
Avatar David Pilling (401) 41 posts

Guessing… it seems from the CVS log – “However, the flag check appeared to be inverted, so an FS with it set would call OS_File, and one clear would call OS_GBPB.” – that maybe the bits that say “use block save/load” have changed meaning, and now whatever is doing the copying is using open/write/close file – so that’s one issue.

Second issue: from debug trials we know that small files are now being saved using open/write/close, where they used to be saved with a “save block of memory” operation. It is also apparent that, in this situation, something outside SparkFS is setting the load/exec addresses to zero. What seems to happen is that the outside code reads the load/exec, gets back zero, and then sets them both to zero – so possibly it is SparkFS returning the wrong answer.

It is a fair point “why only SparkFS”.

https://www.riscosopen.org/wiki/documentation/show/Filing%20System%20Information%20Word

says

20 Use Open/GetBytes/Close instead of FSEntry_File 255
19 Use Open/PutBytes/Close instead of FSEntry_File 0

and SparkFS has these bits clear; the intention was to allow memory block writes and reads, because this is more efficient for it.
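
Those two bits can be tested mechanically. A small sketch (bit positions from the wiki page above; the constant and function names are my own):

```python
USE_OPEN_GETBYTES_CLOSE = 1 << 20  # b20: instead of FSEntry_File 255 (load)
USE_OPEN_PUTBYTES_CLOSE = 1 << 19  # b19: instead of FSEntry_File 0 (save)

def prefers_block_transfers(fs_info_word):
    """True if the FS leaves both bits clear, i.e. it asks FileSwitch to
    use whole-file FSEntry_File load/save rather than open/bytes/close."""
    return not (fs_info_word & (USE_OPEN_GETBYTES_CLOSE | USE_OPEN_PUTBYTES_CLOSE))

print(prefers_block_transfers(0))        # True - SparkFS's declared preference
print(prefers_block_transfers(1 << 19))  # False - opted in to open/putbytes/close
```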

 
Jul 6, 2018 5:16pm
Avatar John Williams (567) 768 posts

Obviously the consequence of this new behaviour is that downloaded apps which have been zipped/archived will not have their essential small files correctly file-typed and so will not work anymore.

Just thought I should state the obvious!

We normally say, often in capital letters, that archives MUST be de-archived under RISC OS to preserve filetypes, but …

I look forward to a swift solution!

Edit: Tried an archive with InfoZip – got a lot of question-mark typed files!

 
Jul 6, 2018 10:16pm
Avatar Steffen Huber (91) 1847 posts

SparkPlug will still work…no complicated Image Filing System stuff with FileSwitch involvement there!

Has someone tested ArcFS? Or any other image filing system like TBAFS?

 
Jul 7, 2018 6:56am
Avatar Sprow (202) 1048 posts

It is a fair point “why only SparkFS”. FS info word says b20/b19 and SparkFS has these bits clear

My understanding is SparkFS is an image filing system though, so only bit 27 of the FS info word is relevant when registering (there’s no wiki page to link to for the image FS info word, but it’s on PRM 2-536).

That the load/exec ends up as zero suggests the sequence of calls is happening in a different order to other saves using OS_GBPB. Filer_Action has some heuristics whereby large files will be opened, GBPB’d, and closed over several Wimp polls, whereas small files are Load/Save’d in one gulp. A little BASIC to force OS_File to be used every time would simplify diagnosis.

REM Target a SparkFS image, doesn't work
SYS "OS_File",10,"$.new/zip.TestObey",&FEB,,&9000,&9100
REM Target a StrongHelp image, does work
SYS "OS_File",10,"<HelpData$Dir>.OS.TestObey",&FEB,,&9000,&9100

Currently SparkFS should see

  • Create
  • OpenUp
  • GBPB
  • Close

On closing R2/R3 (the load/exec) will contain zero if either the file hasn’t changed, or ImageEntry_Args 9 returned zero to declare that SparkFS can’t stamp files. That’d be my prime area to look at – is ImageEntry_Close being called with R2=0 R3=0? If I stop FileSwitch after the Create step, the above test BASIC snippet results in an Obey file as expected.

Because the image FS info word only implements b27, all image FSs are in effect whitelisted as though b19 and b20 are set. It’s possible the logic was the right way round before (ImageEntry_File 0 never gets called now, which does seem odd; ImageEntry_File 255 has never existed) and b19=0 b20=1 is the implied setting, but either way we should get to the bottom of whether it’s the order of calling or SparkFS’s response to that order; it should be tolerant of both methods (since it’s up to FileSwitch which it uses).
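
To make the failure mode concrete, here is a toy model (Python, purely illustrative – the class and its methods are my invention, not the real SparkFS or FileSwitch code) of the sequence above: the correct load/exec arrive at Create, zeros arrive at Close, and an FS that restamps unconditionally on Close wipes the type.

```python
class ToyImageFS:
    """Toy stand-in for an image FS's stamp handling."""
    def __init__(self, guard_close):
        self.guard_close = guard_close  # honour R2=R3=0 on Close?
        self.load = 0
        self.exec_ = 0

    def create(self, load, exec_):
        # Create step: the correct load/exec are passed in here.
        self.load, self.exec_ = load, exec_

    def close(self, load, exec_):
        # ImageEntry_Close: FileSwitch passes R2/R3; both zero means
        # "nothing to restamp" (the case uncovered in this thread).
        if self.guard_close and load == 0 and exec_ == 0:
            return
        self.load, self.exec_ = load, exec_

# Sequence from the post above: Create (real stamp), OpenUp, GBPB, Close (zeros).
for guard in (False, True):
    fs = ToyImageFS(guard)
    fs.create(0xFFFFEB00, 0x12345678)   # an Obey file, type &FEB
    fs.close(0, 0)
    print(guard, hex(fs.load))  # False -> 0x0 (type lost); True -> 0xffffeb00
```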

 
Jul 7, 2018 7:14am
Avatar John Williams (567) 768 posts

On the original thread, Chris Mahoney has established that it is the FileSwitch update to 2.87 on 7th June that caused the problem, by reverting to the previous version with the current ROM image.

As part of this, the following change was made:

FileSwHdr.s: Remove use_fsfile_Save and use_fsfile_Load switches, since clients declare at runtime what their preference is (since RISC OS 3).

 
Jul 8, 2018 11:13am
Avatar David Pilling (401) 41 posts

Sprow – good points. SparkFS dates from before image file systems, so it can be used as a normal file system (at the same time) or image support can be turned off. Both types of file system sit on top of common code.

Does SparkFS update the file type etc on file close – yes – but there is a comment in the code:

“Arrgghhh! only update if file has been modified”

apparently there was trouble in this area once upon a time.

I have to tinker some more…

Later – I’ve reproduced the problem.

 
Jul 9, 2018 11:06am
Avatar David Pilling (401) 41 posts

“Is ImageEntry_Close being called with R2=0 R3=0?” – yes.

“ImageEntry_Args 9 returned zero to declare that SparkFS can’t stamp files.” – Args 9 is not being called.

Is that it then, not in my domain?

 
Jul 9, 2018 11:31am
Avatar David Pilling (401) 41 posts

OK, out of band, someone has said maybe the create file operation that precedes all the copying is going wrong. I know correct load/exec info is being passed in there. I’m now going to explore that theory…

 
Jul 9, 2018 12:19pm
Avatar Andrew Rawnsley (492) 1353 posts

Two (possibly dumb) questions for Sprow…

1) The FileSwitch changes refer to “monitors” – since the “normal” meaning of “monitor” (i.e. a display device) doesn’t really make sense in this context, I assume it refers to bits of code monitoring things. I’d be interested to know exactly what is meant.

2) It sounds like one of the changes flipped the behaviour of the OS to match the published spec (and likely caused this issue for David, but I could be wrong). After 30 years of software running on the OS, I’m wondering why “change OS” was considered preferable to “change published spec to match what OS actually does”.

 
Jul 9, 2018 12:40pm
Avatar David Pilling (401) 41 posts

It seemed to me the idea was that the specification was changing, but perhaps it is just Filer_Action’s understanding of the spec that has changed. But the point Sprow has raised is that these block operation flags only apply to non-image file systems.

Anyway SparkFS should work in either block or byte mode.

I’ve done more debugging. I get FSEntry_Close with load and exec both zero, and I use those values to overwrite non-zero ones. But I am not getting an Args 9.

 
Jul 9, 2018 8:46pm
Avatar Sprow (202) 1048 posts

It seemed to me the idea was that the specification was changing, but perhaps it is just Filer_Action’s understanding of the spec that has changed. But the point Sprow has raised is that these block operation flags only apply to non-image file systems.

Correct – image FSs don’t get a say because only b27 of their info word applies, you have to support both saving methods (whereas a non-image FS can opt out and avoid FSEntry_File completely).

I’ve done more debugging. I get FSEntry_Close with load and exec both zero, and I use those values to overwrite non-zero ones. But I am not getting an Args 9.

I looked into this and it seems the PRM is a bit vague. When a file within the image is closed ImageEntry_Args 9 isn’t called at all, because the ‘modified’ flag is artificially cleared, and FileSwitch always gives 2 x zeros.

I think to satisfy FileSwitch all you need is, in pseudo code:

IF (load | exec) THEN do_restamp(load, exec);

like it says on PRM 2-555. It doesn’t need qualifying on whether you’re an image FS or not, since FSEntry_Close and ImageEntry_Close have the same specification, as do FSEntry_Args 9 and ImageEntry_Args 9.
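
That one-liner, written out as a runnable sketch (Python purely for illustration; `close_entry` and `restamp` are hypothetical names – in the real module this logic would live in SparkFS’s Close entry):

```python
def close_entry(load, exec_addr, restamp):
    # Restamp only when FileSwitch passes a non-zero load/exec pair in
    # R2/R3; both zero means "leave the catalogue entry alone".
    if load | exec_addr:
        restamp(load, exec_addr)

stamps = []
close_entry(0, 0, lambda l, e: stamps.append((l, e)))                    # ignored
close_entry(0xFFFFEB00, 0x12345678, lambda l, e: stamps.append((l, e)))  # applied
print(stamps)
```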

 
Jul 9, 2018 8:55pm
Avatar David Pilling (401) 41 posts

PRM page 2-545: “Restamping takes place if the file has been modified and …Args 9 returned a non-zero value in R2” – well, …Args 9 has not been called, and if it had been, a non-zero value would have been returned in R2 by SparkFS.
I believe that is the end of the matter: SparkFS is implementing what the PRM specifies.

As to the switch from block operations to bytes: PRM 2-522 is clear about the meanings of the bits. If they’ve changed, or something is interpreting them differently, it doesn’t matter to SparkFS – it supports both block and byte operations.

 
Jul 9, 2018 9:08pm
Avatar David Pilling (401) 41 posts

And I’ll tell you another thing: opening a SparkFS archive in non-image mode, block operations are used, so it is an image FS thing. (A quick way of doing this is to create a Spark directory archive – there is no image support for them.)

 
Jul 10, 2018 4:05pm
Avatar David Pilling (401) 41 posts

Just to point out my previous two comments were posted without seeing the one by Sprow which precedes them – both of us were writing at the same time. I’m not arguing.

Another data point: we know that files of less than 4 KB show this problem, so what happens with a 5 KB file, which doesn’t? Well, it is written in two chunks, Args 9 is invoked, and of course it gets the right type. Ah – this is because it does the amusing thing of setting load/exec to DEADDEAD whilst polling.

What to do? I can fiddle SparkFS to resolve this problem – but that means all users will have to upgrade their SparkFS. Or maybe this is a chance to tweak RISC OS so that it follows the docs and behaves consistently.

Let me know.

 
Jul 10, 2018 4:18pm
Avatar John Williams (567) 768 posts

What to do? I can fiddle SparkFS to resolve this problem – but that means all users will have to upgrade their SparkFS. Or maybe this is a chance to tweak RISC OS so that it follows the docs and behaves consistently.

I vote tweak RISC OS so that it follows the docs and behaves consistently.

 
Jul 11, 2018 7:58am
Avatar Sprow (202) 1048 posts

Just to point out my previous two comments were posted without seeing the one by Sprow which precedes them – both of us were writing at the same time. I’m not arguing.

That’s a relief – I thought we were in for a PRM page number quoting standoff at dawn!

Another data point: we know that files of less than 4 KB show this problem, so what happens with a 5 KB file, which doesn’t? Well, it is written in two chunks, Args 9 is invoked, and of course it gets the right type. Ah – this is because it does the amusing thing of setting load/exec to DEADDEAD whilst polling.

There’s nothing magic about 4 KB or 5 KB; that’s because you’re using Filer_Action to trigger the copy, and it layers extra stuff (like stamping with DEADDEAD) on top, which is confusing matters. Take another look at the BASIC one-liner I gave, and change it to 5 KB, 10 KB, 20 KB, whatever – it’s all about what happens at the end of the OS_File.

I also mentioned it seems the PRM is a bit vague, in particular the description for FSEntry_Close (and ImageEntry_Close):

If your filing system returned from the FSEntry_Args 9 (or ImageEntry_Args 9) call with R2 and R3 both zero, then they will also have that value here, and you should not try to restamp the file.

That doesn’t match what FileSwitch actually does, and has done since at least as far back as RISC OS 3.60, when the CVS history began (I’ve not yet bothered to disassemble RISC OS 3.10’s copy; strictly speaking that’s what the PRM is describing, and it could have changed between 1992 and 1996). If I rephrase the PRM text to describe what FileSwitch actually does, it would say:

  • If R2 and R3 are both zero you should not try to restamp the file
  • This can be because either your filing system returned from the FSEntry_Args 9 (or ImageEntry_Args 9) call with R2=R3=0 or because FileSwitch does not wish you to restamp the file (for example, because nothing changed)

In other words: there are ways R2 and R3 can become zero other than from you saying so in Args 9. Surveying the other (writable) FSs I have to hand confirms this interpretation:

FS                                     Image?  Handles R2=R3=0?  How
StrongHelp                             Yes     Yes               Returns R2=R3=0 for Args 9; never even looks at R2/R3 during Close
DOSFS                                  Yes     Yes               Returns valid file stamp for Args 9; special-cases 0s during Close and skips writing those entries
NetFS                                  No      Maybe             Passes R2/R3 on to the File Server; not sure what it does with them
FileCore (aka RamFS/ADFS/SCSIFS/SDFS)  No      Yes               Returns valid file stamp for Args 9; special-cases 0s during Close and skips writing those entries
SparkFS                                Yes     No                Sets the load/exec to zero
RPCEmu HostFS                          No      Yes               Returns early if R2=R3=0
NFS                                    No      Yes               Skips server write unless R2 non-zero

If you could humour me and try my pseudo suggestion:

IF (load | exec) THEN do_restamp(load, exec);

I reckon that’ll sort it.

Why does any of this matter? FileSwitch is a key bit of the filing system stack and is doubtless going to change in future; in particular, non-blocking IO and better file caching are two areas I can think of where we may elect to use FSEntry_File in its original mode in some situations but not others. An unintended side effect of the change 5 weeks ago is that we’ve found this quirk in SparkFS. If we can get an updated SparkFS percolating out there, then there’s no particular harm in changing FileSwitch back in the short term (buying a bit more time), so everyone will be updated by the time FileSwitch actually needs to use this (previously unexercised) code path.

 
Jul 11, 2018 10:55am
Avatar Chris Evans (457) 1613 posts

If the changes that break SparkFS are to remain, they need significant justification – on a practical level as well as a technical one. There is almost certainly unmaintained software out there that will also fail.

 
Jul 11, 2018 11:02am
Avatar Chris Evans (457) 1613 posts

Like Andrew I’d also be interested to know what “Fix for missing monitor name after load” means, which appears to be the (only?) reason for the change.

 
Jul 11, 2018 1:51pm
Avatar Jeff Doggett (257) 229 posts

Fat32Fs also gets this right. There is a comment in its FSEntry_Close which says “R2=R3=0 means don’t restamp file”.

 
Jul 11, 2018 3:03pm
Avatar Steffen Huber (91) 1847 posts

Reading the referenced bits of the PRMs (I never implemented an FS myself!), I would conclude that both the current SparkFS implementation as well as the current FileSwitch implementation follow what the PRMs say. Both interpretations are valid. So unless there are very very very good reasons for the new FileSwitch behaviour, I would prefer to reinstate old behaviour. FileSwitch is extremely critical. Even if very specific behaviour is changed, this needs a very good reason, especially if both the old behaviour and the new behaviour matches the docs.

I would even argue that if we are going to significantly change FileSwitch in the future (64bit file size, non-blocking I/O, file caching and possibly a lot of other things), we need to make sure that “new” clients wanting to use the new features need to register specifically as “new” with FileSwitch so that the old stuff can be left unchanged. It is extremely complicated to keep things working unchanged when changing things!

There are a lot of FSs to be tested if the changed behaviour stays as it is. TBAFS. ArcFS. LanManFS. LanMan98. ShareFS. Sunfish. ImageNFS. NFS. X-Files. Win95FS. Memphis. HostFS in V-RPC. LayerFS. raFS. These are just those that come to my mind immediately (and yes, this is a bit sad…).

 
Jul 11, 2018 4:44pm
Avatar nemo (145) 2437 posts

I’m afraid that the documentation has always been so ropey that actual implementation has always outranked theoretical API.

Find a way to retain compatibility even if it has to be special-cased by FS number. There is precedent in FileSwitch with regard to pathname processing.

 
Jul 11, 2018 5:21pm
Avatar Rick Murray (539) 12521 posts

Find a way to retain compatibility

Easy. Change it back.

If there is a need for a “new improved recipe”, this should be denoted by way of a flag or some other appropriate mechanism.

actual implementation has always outranked theoretical API.

True – as you often remind me when I quote bits of the PRMs about how things are supposed to work. We’re aware that the “documentation” is describing a quarter-century-old (plus) incarnation of the OS, with a “set of patches” known as book 5a… and even then there are “issues”. :-)

That said, if a core part of the OS has done “X” since RISC OS 3, it does need a good reason to change in a way that risks affecting existing software.

FileSwitch is a key bit of the filing system stack and is doubtless going to change in future,

Exactly. New behaviour should be new behaviour. Old should be old. And new should be fudged to old when and where applicable. I’ll leave it up to a smarter person than me to work out how to deal with applications only capable of 32 bit file sizes if they try to touch a file with a 64 bit file size…
…my personal opinion here is to have a secondary API, like “OS_Find64” or something, and that’s the one that can access Big Files. The regular OS_Find will simply error upon trying to open a large file (logic: if you’re using the wrong SWI, you’re not capable!).
This is to hopefully prevent a problem such as the earlier DOSFS situation where it was (originally) only capable of dealing with partitions up to 2GB, but would happily open and work with larger partitions. The end result (in my experience) being a disc structure so messed up that it crashed the FAT driver when running chkdsk!

 
Jul 12, 2018 11:57am
Avatar David Pilling (401) 41 posts

I’m going to do a new version of SparkFS that handles the zero load/exec address case. More later.

 
Jul 12, 2018 4:08pm
Avatar David Pilling (401) 41 posts

…www.davidpilling.com/primroses/sfs145beta.zip is a version which implements “if load and exec parameters are zero then don’t stamp the file”. Maybe this is what Acorn meant to say, maybe that is what the PRMs mean. If no one has any complaints I’ll make 1.45 the latest version.

RISC OS documentation needs updating or at some point all this will happen again.

Pages: 1 2