Possible bug in AcornHTTP
Pages: 1 2
Matthew Phillips (473) 709 posts |
Thanks – I may look at the Browse code to see whether it would cope with HTTP_AddCookie generating errors. Another limitation of AcornHTTP is that it does not guard against supercookies (e.g. a cookie set for a top-level domain like .com or a high-level domain like .ac.uk). It should be reasonably easy to add protection to the module by using the Public Suffix List This would entail supplying the text file of the suffix list for the module to use. Should this go in !Boot.Resources.!Internet.files like the CertData file, do you think? |
Dave Higton (1515) 3477 posts |
I can think of no more logical place. |
Matthew Phillips (473) 709 posts |
Another little puzzle. In the code which reads configuration for the module, for example, the saved cookies file, there is a function I don’t quite understand static FILE *config_try_open(const char *dotsuffix, const char *mode, int del) { FILE *f = fopen(config_set_name(CHOICES_DIR, dotsuffix), mode); (void) del; return f; } What is the “(void) del;” doing in there? I assume it compiles away to nothing, but what is the point of it? Does it just avoid the compiler warning that a parameter is unused? Why did they not just remove the parameter, as this function is only used in this file? |
Jeffrey Lee (213) 6048 posts |
Correct.
When you’re removing a feature from code, a common approach is to touch the minimal amount of code necessary (either to try and avoid breaking the code by removing/changing too much, or because you want to remove the feature quickly with minimal effort). E.g. if it’s a feature which is enabled/disabled by a #define, you’d only touch the relevant #if/#else/#endif blocks and leave everything else alone. That can easily lead to bits of dead code like the above. |
Matthew Phillips (473) 709 posts |
Thanks Jeffrey. If I do major work on this module then I may just clean some of this out of the way to avoid confusion. I am intending to update it to implement RFC 6265 and remove a lot of (buggy) code relating to the obsolete Set-Cookie2 and other things which no modern browsers support. I have been having a good read through the code to understand what it currently does, and I’m occasionally spotting things that make me wonder “can that possibly work properly?!”. The latest crazy bug I have uncovered is in check_domain_valid which gets its knickers truly in a twist when checking Domain attribute values without a leading dot (now the standard, then frowned upon). I have verified, using a test script, that if you fetch a page from abcdef.co.uk then the module will accept Set-Cookie headers in the response where Domain=abcde.co.uk or Domain=abcd.co.uk as well as Domain=abcdef.co.uk If the domain does have a leading dot, things are not much better because the host/target test is performed with strstr, without checking where the match occurs, so if you fetch a page from www.abcdef.co.uk and it has a Set-Cookie with Domain=.abcdef.co or even Domain=.abcdef.o then those are also accepted! (The process reverses domains before matching, so www.abcdef.co.uk becomes uk.co.abcdef.www which might help you understand how .abcdef.o matches.) I have been testing by using a PHP script I am hosting on my own domain which accepts parameters such as name, value, path, domain and then builds a cookie to send back. This makes it easy for me to test the RISC OS end using a BASIC programme. Is it appropriate to provide tests written in BASIC to go into GitLab, or would C be preferable? Does anyone know of a public cookie-building facility that can be used to test cookie code in this way, as I would rather not submit test scripts that rely on PHP hosted on my personal domain? I think some test scripts would be a valuable addition to the source, given the number of bugs lurking in this module! |
Jeffrey Lee (213) 6048 posts |
Either is fine. However, rather than tokenised BASIC, I think it’s preferable to use BASICtxt (&FD1), to make it source control friendly. Note that it was only last year that RISC OS 5 gained an Alias$@RunType action for BASICtxt files (despite BASIC having support for loading the files for many years), so you might need to set up the action manually: set Alias$@RunType_FD1 Basic -quit "%0" %*1 (or the same thing but without the quotes, to avoid an expression evaluation issue that started tripping up software that was checking if the action had already been defined)
Not offhand. I suspect that’s the kind of thing that would normally be tested by having the client proxy all the requests to a local server (either a normal server like Apache/nginx, or a special test server to allow things like handling of malformed responses to be tested) |
Chris Hall (132) 3544 posts |
Yes, when I played around with cookies and perl scripts to render off line material, I set up a local host using xitamu-25 on my PC. I could then test server-side scripting in a ‘sandpit’. |
Matthew Phillips (473) 709 posts |
Further stupid bugs in this module: When determining which cookies in memory match for sending along with a request, the domain and path of the cookie only need to be substrings of the domain and path of the request. There is no test to make sure they match from the start, so any cookies set for abcde.org will also be sent to abcde.org.uk, and any cookie with a path set to /abc will also be sent to /abcde or /def/abc. The cookie path is not reloaded from the cookie file correctly, so the saved path will gradually get shorter and almost disappear. (I am working on fixing these bugs, but I’m still unfortunately at the stage of understanding how the code is supposed to work before making the fixes. The trouble is, I am just understanding more and more the ways it does not work!) |
Matthew Phillips (473) 709 posts |
Merge request now opened to fix almost all the bugs listed above. |
Chris Mahoney (1684) 2160 posts |
Well done. I bet this looked like a relatively quick and simple job when you started! |
Matthew Phillips (473) 709 posts |
It certainly did! And ironically, the problem that led me to find all these bugs was the unprocessed cookie queue getting full: a cookie queue that I had no idea was building up because the flag in bit 16 of R0 returned by URL_GetURL was undocumented. I’ve now written some documentation! Edit: It’s actually R0 bit 16 of URL_ReadData which indicates cookies. You might like to revise your C library to look out for bit 16 being set. If it’s set, the cookies should be processed. If you want to just throw them out (safest option) then repeatedly call HTTP_EnumerateCookies with R0 and R1 both equal to zero. If you get R1<>0 on exit, pass that into HTTP_ConsumeCookie in R2 with R0 set to 0 to discard the cookie. You may also like to consider setting bit 30 of R0 in your call to URL_GetURL (also undocumented previously). This will prevent AcornHTTP from automatically adding cookies to your request that have been accepted into the store by other processes. I’m thinking about how the whole API could be adapted to allow for separate cookie stores for separate applications. This would be preferable, I think. |
Rick Murray (539) 13742 posts |
Just a really trivial silly thing, but… The document we are after including the protocol I had to read that multiple times to understand it. Perhaps “requesting” would be a better word than “after”? And, for that matter, “resource” might be better than “document” (is a JPEG a document?). Sorry. My mind would bother me all night long if I didn’t say something. Now I have, so I can |
Dave Higton (1515) 3477 posts |
Eats shoots and leaves… |
Matthew Phillips (473) 709 posts |
Yes, that is oddly worded, but not my contribution. Dave created that page, but the phrasing dates back to at least 1998 with the Acorn URL Fetcher API Specification. It looks like Dave changed the example URL from http://www.acorn.co.uk/ to http://www.bbc.co.uk I meant to say that the ironic thing was that if the cookie queue had been better documented, and I had been processing it properly, I would never have discovered the crash which occured when it filled up! |
Chris Mahoney (1684) 2160 posts |
I’ll take a look at that when I get a chance. It doesn’t sound too tricky (touch wood!) Just to confirm, the ‘old’ module sets bit 16? So I should handle it regardless of module version as all 32-bit versions of AcornHTTP are likely to behave that way? |
Rick Murray (539) 13742 posts |
Great. Since it’s not your feet I’ll be trampling, I’ll go ahead and change it. ;) |
Matthew Phillips (473) 709 posts |
Yes, bit 16 in the URL_ReadData RO return value has been supported for at least the last twenty years, as has bit 30 of R0 on entry to URL_GetURL It’s a pity that URL_Status does not also return bit 16 because otherwise the exit value of R0 is identical to that of URL_ReadData. This could be changed, of course! |
Chris Mahoney (1684) 2160 posts |
I’m a little sleepy at the moment and also don’t have ready access to a test environment with cookies, so can someone (doesn’t have to be Matthew!) please just give this a quick sanity check?
|
Pages: 1 2