Keyboard Handling
nemo (145) 2437 posts |
I’m having a big think about keyboard handling. I’ve not yet reached a conclusion, but I have some thoughts and I’d welcome feedback.

Terminology:

Keyboard Driver
Detects physical button movements via some electronic interface and issues KeyV low-level key transitions. Partially identified by Keyboard ID, sort of.

Keyboard ID
This is really keyboard geometry – it’s unrelated to the keycaps that happen to be fitted, and cares not for which interface the keyboard is attached to.

Keycaps
The buttons you press have symbols on them. If you swap them around the computer won’t notice. This is actually very important and I’ll come back to it.

Low-level keys
The arbitrary numbering of physical keys on the keyboard geometry identified by Keyboard ID. Roughly speaking numbered from top left across rows, starting at 0. There’s no guarantee of correlation between different KeyboardIDs (though we try) or what’s printed on the keycap.

Kernel KeyV claimant
Sends KeyV key transitions to the registered Keyboard Handler in return for keypress codes. Supplies an interface for the Break key. Uses tables to avoid calling the Handler for most keys, and for generating internal key Events. Contains hard-wired assumptions about the Alphabet.

Keypress codes
As inserted into the keyboard buffer with OS_Byte,138, with the BBC MOS era interpretation of b7 that means accented Latin characters (for example) take two bytes. When using the UTF8 (sic) Alphabet, this means the already lengthy UTF-8 sequences are doubled in length. All so function and cursor keys can be one byte. Too late now.

Internal key numbers
Also known as INKEY numbers, these are logical (not physical) keys, related to the physical arrangement on a BBC Micro and all its progeny. These are used for the key up/down Event, and for OS_Byte,129. The numbers are mostly fixed, some don’t exist and are emulated, and some change function depending on what Keyboard Handler you have.

Alphabet
The current Encoding (or CodePage) – the character set available. This is what decides whether 160 means ‘hard space’ (Latin1) or ‘dagger’ (MacRoman). The Kernel’s KeyV claimant doesn’t believe that EBCDIC exists. Converting low-level key transitions into the right keypress codes for the Alphabet is the job of the Keyboard Handler, with some assistance from the Kernel. Most Alphabets are supplied by the International module. Other alphabet modules are available, it is easily extensible. If you want to use a different alphabet with the FontManager (or in the desktop) you’ll need an appropriate Encoding file. This limits the repertoire to that of the available /Base sets, such as /Base0, or requires a matching symbol font. This is a complicated subject and concerns FontManager, not keyboard handling.

Country/Keyboard/Territory
There are some APIs that claim to set or specify the “keyboard by number”. They don’t. They select a keyboard by country number. Countries are a confused amalgam of countries, regions, languages, scripts, devices and sea breezes, probably. Many countries imply an alphabet, often a sensible one. In the absence of a configured country, it is inferred from the Keyboard ID, hilariously. Additional countries can be supplied by module, but they’re just an indirect way of specifying an Alphabet and implying a language for internationalisation of applications. Except ‘language’ is not ‘country’.

Territory
A country with extra smarts – the Territory module provides a unified API for regional variations of number formatting found in different territories and countries. It also tries to produce collation tables for sorting and comparing text, and tables for changing case. Unfortunately that is dependent on Alphabet, not on country, but the modules ignore that. So this only works if you don’t change the alphabet from the default for the country. Disturbingly, it doesn’t even recognise UTF-8, so will corrupt UTF-8 sequences while trying to lowercase them for example.

Keyboard Handler
This is the big cheese with the big name. Usually called InternationalKeyboard, though it should perhaps more conservatively be referred to as EuropeanKeyboard. There are many implementations of this module, that do different things. It performs a number of unrelated tasks which I’ll now go into.

The keyboard handler has to be intimately aware of the keyboard geometry, and hence which low-level key corresponds to Return, Shift etc. However, it has to assume what your other keycaps are – it can’t tell whether it is a QWERTY or AZERTY keyboard, it must simply be asserted by which InternationalKeyboard module you happen to be running. Closer to home, it also decides whether Backspace is Delete or End is Copy. Which really depends on an arrangement of neurons in your head.

It is also responsible for generating the correct keypress codes for your Alphabet, which you can change from moment to moment with

The keyboard handler also, paradoxically, is responsible for “changing keyboards”, which actually means changing keyboard handler by country. This is done via Alt-Ctrl keypresses – F1 and F2 select the configured and UK keyboard, and F12 allows the keyboard country to be specified by dial code. Stupidly, this is hard-wired inside the handler, which means additional country/territory modules can’t be selected – the keyboard handler can only select countries it already knows about. In other words, it can only select itself. |
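As an aside, two of the Alphabet points above can be illustrated with a short Python sketch. It uses Python's own `latin-1` and `mac_roman` codecs as stand-ins for the RISC OS Alphabets, and `lowercase_latin1` is a hypothetical byte-wise case table written for this illustration; it is not the Territory module's actual code.

```python
# The same byte means different characters under different Alphabets.
byte_160 = bytes([160])
as_latin1 = byte_160.decode("latin-1")      # U+00A0, the 'hard space'
as_macroman = byte_160.decode("mac_roman")  # U+2020, the dagger


def lowercase_latin1(data: bytes) -> bytes:
    """Lowercase byte by byte using Latin1 rules (hypothetical table)."""
    out = bytearray()
    for b in data:
        is_ascii_upper = 0x41 <= b <= 0x5A
        # 0xC0-0xDE are upper-case letters in Latin1, except 0xD7 (multiply sign)
        is_latin1_upper = 0xC0 <= b <= 0xDE and b != 0xD7
        out.append(b + 0x20 if is_ascii_upper or is_latin1_upper else b)
    return bytes(out)


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


utf8_capital_e_acute = "\u00c9".encode("utf-8")       # b"\xc3\x89"
mangled = lowercase_latin1(utf8_capital_e_acute)      # lead byte 0xC3 becomes 0xE3
```

The lead byte of the UTF-8 sequence happens to look like an upper-case Latin1 letter, so a table that only knows Latin1 rewrites it and the sequence stops being valid UTF-8.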
nemo (145) 2437 posts |
The reason for the brain dump is that I’m considering changing some of this (with a view to making some of it work). I’ll amass my thoughts on that after dinner perhaps. |
Rick Murray (539) 13406 posts |
Yes, Steve and I have talked at length about this deplorable state of affairs (re. Territory), where “Welsh” is a country (rather than a UK regional language), “Switzerland” is a country (with four languages), and “Dvorak”…is also a country!
Only three little (!) requests:
|
Clive Semmens (2335) 3130 posts |
(Thumbs very much up emoji) (that anyone with the necessary skills to do anything about it is actually thinking about this issue). |
Tristan M. (2946) 1036 posts |
Would all this keyboard handling modification include a way to put keys (and hopefully combinations) into the buffer programmatically minus the migraine / having to write a driver? |
nemo (145) 2437 posts |
Keyboard handlers are almost completely tabular. That they have to be ‘compiled’ is silly. This is indeed the plan, and I reserved a filetype for this a long time ago.
Actually, no. In Unicode it’s absolutely clear what the accented characters are. All that would be necessary is to say this key combination is the acute dead key… and then you’d automagically get áćéíĺńóŕśúýź precomposed characters and b́d́f́ǵh́j́ḱḿṕq́t́v́ẃx́ letter/accent grapheme clusters. And I was able to type all of that because I’m using my Windows keyboard handler at the moment, tee hee. In fact, my venerable MMK InternationalKeyboard module supported 14 different dead accents, for all the Latin alphabets, BFont, MacRoman, WinAnsi, and even PDFDocEnc (with help from the CerilicaAlphabets module).
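A minimal Python sketch of that dead-key idea, assuming the handler only needs to know which combining mark the dead key stands for (`apply_dead_key` is a name made up for this illustration). Unicode normalisation form NFC then yields a precomposed character where one exists and leaves a letter plus combining accent otherwise.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # the mark an 'acute' dead key would stand for


def apply_dead_key(accent: str, base: str) -> str:
    """Combine a dead-key accent with the next letter typed.

    NFC composes the pair into a single precomposed character when
    Unicode defines one; otherwise the two code points are kept.
    """
    return unicodedata.normalize("NFC", base + accent)


e_acute = apply_dead_key(COMBINING_ACUTE, "e")  # one precomposed code point
b_acute = apply_dead_key(COMBINING_ACUTE, "b")  # stays letter + combining mark
```

No per-alphabet accent table is needed in this scheme: the same call covers every base letter.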
Again, the MMK keyboard had all the usual multimedia keys, and its custom PS2 driver delivered them as low-level keys in the range 104-125. Interestingly, the Kernel provides gating for low-level keys 0-159, but is happy for higher key numbers to exist – the handler would have to gate them appropriately but there’s no real restriction. Doesn’t the Pandora driver issue transitions in the low &2xx range? Have I imagined that?
Sigh. There’s no excuse for that. Multimedia keyboards have been supported on RISC OS since at least 1999. <puts fingers to temples>
Well. It all depends on what you mean by “key codes”.

You press the key on your keyboard, and its Keyboard Driver receives magic numbers on some interface and recognises it was the PageUp key. (If it were PS2, I could tell you those numbers)

The Keyboard Driver issues key transitions on KeyV, using appropriate numbers for the keyboard geometry. This will probably be 33, and is unique.

The Keyboard Handler provides a table to the Kernel which tells it that low-level key 33 should activate Internal Key 63. This is fixed, and is unique.

It also provides a different table which tells the Kernel to put the codes &8F, &9F, &AF and &BF in the keyboard buffer, depending on Shift and Ctrl. These are the same codes as the Up key, although not (RIP Andrew Preview) in the same order. So the codes in the keyboard buffer are not unique.

The Wimp empties the keyboard buffer, futzes with the codes, and delivers KeyPress messages to the tasks. As you say, these are not unique for PageUp/Shift-Up.

However, DeepKeys is well aware of all this, so buffers the current modifier state in parallel with the keypress codes. When the Wimp sends a KeyPress message, DeepKeys appends the modifier and internal key numbers to the message. The resulting combination of Wimp code, modifier state and internal key number is very unique for tasks that are aware of it, and completely backwards compatible for everything else. The last point is the important one.

As it happens, my task coding infrastructure has recently evolved to the point of having configurable key bindings, reflected in the menus. This required extending the Wimp keycodes to uniquely identify every reasonable combination. Having done that it made keyboard handling so much easier in-task, that I’m wondering if that could be either a DeepKeys message extension or additional SWI.
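The chain described above can be modelled in a few lines of Python. This is only a sketch: the PageUp numbers (low-level key 33, internal key 63) and the four buffer codes come from the post, while the Up key's numbers (89 and 57) and the exact order of codes per modifier are assumptions made for illustration.

```python
SHIFT, CTRL = 1, 2  # modifier bits used only within this sketch

# Low-level key -> internal key number.
INTERNAL_KEY = {
    33: 63,  # PageUp (numbers from the post)
    89: 57,  # Up (assumed numbers)
}

# Internal key -> buffer code for each modifier combination.
# Same four codes for both keys, in a different order (order assumed).
BUFFER_CODE = {
    63: {0: 0x9F, SHIFT: 0x8F, CTRL: 0xBF, SHIFT | CTRL: 0xAF},
    57: {0: 0x8F, SHIFT: 0x9F, CTRL: 0xAF, SHIFT | CTRL: 0xBF},
}


def plain_keypress(low_level_key: int, modifiers: int) -> int:
    """What a task sees without DeepKeys: just the buffer code."""
    return BUFFER_CODE[INTERNAL_KEY[low_level_key]][modifiers]


def deep_keypress(low_level_key: int, modifiers: int) -> tuple:
    """The DeepKeys-style view: code plus modifier state plus internal key."""
    internal = INTERNAL_KEY[low_level_key]
    return (BUFFER_CODE[internal][modifiers], modifiers, internal)


all_events = [(key, mods) for key in INTERNAL_KEY for mods in range(4)]
distinct_plain = {plain_keypress(k, m) for k, m in all_events}
distinct_deep = {deep_keypress(k, m) for k, m in all_events}
```

Eight physical combinations collapse into four plain codes, but remain eight distinct events once the modifier state and internal key number travel with the code.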
That has always been possible – either the BBC MOS way of using OS_Byte,138, or the RISC OS way of using InsV. The previously-mentioned DeepKeys extends the KeyV calls so it is possible to buffer keypress code, modifier state and internal key number simultaneously, so one can unambiguously buffer Return, Enter, Ctrl-M and Ctrl-13 as desired, and all the multivarious modified versions (including while holding the ‘Logo’ key down for example – DeepKeys has always treated Logo as a modifier as well as a key in its own right). |
nemo (145) 2437 posts |
I am continuing to research the ultimate length of the piece of string marked ‘keyboard’ that I ill-advisedly tugged. Some observations:

‘Country’ is mostly broken and useless. They are not what they pretend to be. ‘Country’ is a bad way of selecting keyboards, and the de-facto way of selecting user interface language. That these two things are only tentatively related in the 8-bit world, and basically orthogonal in the Unicode world is rather embarrassing.

Country is mainly used by competent programmers to select user interface language by providing multiple Messages and Templates files, usually in a directory named with the country number (or allowing such a thing to be retrofitted). However, how many competent programmers have coded their task such that if resources for country 156 are not present, it’ll try country 6, country 17 or even country 31 before forcing Little English on an explicitly French speaking user? I haven’t, so that’ll probably be zero.

What we have is inherited from the BBC Master Compact, mostly. And it’s wrong. And who can tell me what that list of numbers means, without looking them up? Imagine how much easier it would be if they were ISO language codes: fr-CH, fr, fr-CA, fr-BE, respectively.

‘Territory’ is mostly broken and useless. And not just because “Master” is theoretically a valid territory number. Its API covers four different functions apart from admin: Calendar, Text, Numerics and Keyboard. Whether you want a date displayed as a Japanese Dynasty Era or Rest-of-World 2019; and whether you like negative monetary values formatted with parentheses; is NOT related to whether you have a Dvorak keyboard or Devanagari characters on your keytops.

If you’re Swiss, you may well wish to switch between French, German and Italian – you probably won’t change your physical keyboard, but you may want the keypresses to change. If you’re trying to write English on your Devanagari keyboard, you definitely will. |
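The fallback described above (try the exact choice, then the bare language, then its other regions, and only then English) could be sketched as follows in Python. The function names and the ordering of sibling regions are assumptions for illustration; nothing like this exists in the current API.

```python
def fallback_chain(preferred: str, available: set, last_resort: str = "en") -> list:
    """List the tags to try, best first.

    Order: the exact tag, the bare language, any other regions of the
    same language that are available (sorted for determinism), then
    the last resort.
    """
    language = preferred.split("-")[0]
    candidates = [preferred, language]
    candidates += sorted(t for t in available if t.startswith(language + "-"))
    candidates.append(last_resort)
    chain = []
    for tag in candidates:
        if tag not in chain:  # drop repeats, keep first position
            chain.append(tag)
    return chain


def pick_resources(preferred: str, available: set, last_resort: str = "en"):
    """Return the first tag in the chain that the application ships, or None."""
    for tag in fallback_chain(preferred, available, last_resort):
        if tag in available:
            return tag
    return None
```

With this, a Swiss French user of an application that ships only Canadian French and English resources gets `fr-CA` rather than English.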
Clive Semmens (2335) 3130 posts |
And if you’re trying to write Devanagari, Russian and Greek all on the same English keyboard. Who cares what’s on the keycaps? If one’s a touch typist, one types with one’s eyes on the screen, not on the keyboard. Or quite often with one’s eyes shut. And Nemo, you’re a brave soul and I applaud you. (As a little aside, does one put an apostoffice in “one’s” or, what with “ones” being a possessive pronoun like “its,” does “ones” not have one?) |
Chris Mahoney (1684) 2100 posts |
The OED says that the “apostoffice” (am I missing a joke there?) should be present. |
Clive Semmens (2335) 3130 posts |
Probably not. Just my fingers type what they hear my brain say, not what my brain tells them to type. Or sumfink. |
Steve Drain (222) 1620 posts |
Oh, yes!
Or a number suffix. This is the ‘official’ method, but there are others widely adopted, eg: ResFind. Therein lies one of the stumbling blocks to rationalising language provision.
There is a long tail of discussions on this. For myself, I wrote a system for language provision that had primary, secondary and fallback (English) languages. It involves a code system variable that I called Obey$Path and it also allowed for UTF resources. This was part of a Region configuration.
I just used the two-character iso code.
Indeed!
Also time zones:
And currency:
It is instructive to see how Windows handles these in its Regional and Language settings. I have got a long way into writing a ‘universal territory’ module. This interacts with the Territory Manager as any territory module does, but it holds the data in separate files: Messages for the text and Data for the character tables. A user/programmer can ensure the appropriate combination. However, I am a bit of a dilettante in this and it is written using BASIC assembler in an idiosyncratic way. ;-) Still, it has proven to me that such a thing can work and still be backward compatible. One problem is that there are two-character ISO codes for countries, which I use, but there is no number system. For territory numbers I have used the two bytes of the code as lsb and msb. Territory Manager seems happy with this. |
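For readers following along, the numbering scheme in the last few sentences can be written out in Python. This sketch assumes the first letter of the ISO code is the least significant byte and that the code is upper case; the post does not say which way round the bytes go.

```python
def territory_number(iso_code: str) -> int:
    """Pack a two-letter ISO country code into a 16-bit number.

    Assumption: first letter is the lsb, second letter is the msb.
    """
    if len(iso_code) != 2 or not iso_code.isascii() or not iso_code.isalpha():
        raise ValueError("expected a two-letter ISO country code")
    lsb, msb = iso_code.upper().encode("ascii")
    return lsb | (msb << 8)


def iso_code_from_number(number: int) -> str:
    """Recover the two-letter code from a number made by territory_number."""
    return bytes([number & 0xFF, (number >> 8) & 0xFF]).decode("ascii")
```

Every number produced this way is well above 255, so it cannot collide with the existing single-byte country numbers.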
nemo (145) 2437 posts |
The “apostrophes are for possession” assertion is one of the biggest crimes of British education and should be surgically removed from every head in the country. The explanation I use is this: the apostrophe is the little spot of glue that squeezed out when you stuck bits of words together to make a new word that isn’t in the dictionary. The word “cat” is in the dictionary. The word for more than one cat, “cats”, is in the dictionary. But the word we use to mean “belonging to the cat” is not in the dictionary (though the related word “everything” is) so we have to make it by sticking an “s” on the end. Just like when we cut “is not” into pieces and stick them together as “isn’t”. |
nemo (145) 2437 posts |
The inestimable Steve volunteered
If you build it they will come. We can’t make the existing API work properly because it’s built on sand (made of crushed M128s) but we can provide something sturdier alongside.
Indeed. The smarts should be centralised, but the resources are provided by the App. So there should be a

However, one might want to be able to merge Messages files, as it seems wasteful to duplicate entire English Messages files just because of “cancelled/canceled”. I say this in light of my

Resources
|-en
| |-Messages
| |-US
| | '-Messages
| '-AU
|   '-Messages
'-fr
  |-Messages
  |-CA
  | '-Messages
  '-CH
    '-Messages
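A rough Python sketch of the merging idea: load the language-level Messages first, then let a regional file override only the tokens that differ. The `token:value` parsing here is a deliberate simplification of the Messages file format, and `merged_messages` and its tag-keyed dictionary are hypothetical names standing in for the directory tree above.

```python
def parse_messages(text: str) -> dict:
    """Parse simplified Messages text: 'token:value' lines, '#' comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        token, _, value = line.partition(":")
        entries[token] = value
    return entries


def merged_messages(files: dict, tag: str) -> dict:
    """Merge Messages from general to specific for a tag such as 'en-US'.

    'files' maps tags ('en', 'en-US', ...) to Messages text. Missing
    levels are skipped, so 'en-AU' with no file of its own is just 'en'.
    """
    merged = {}
    parts = tag.split("-")
    for depth in range(1, len(parts) + 1):
        text = files.get("-".join(parts[:depth]))
        if text is not None:
            merged.update(parse_messages(text))
    return merged
```

Under this scheme the US file would need to contain only the "canceled" line, not a full copy of the English messages.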
I include those in Calendar and Numerics because of the cross-contamination of “StandardDateAndTime” and “RepresentationTable”.
And Linux, and ICU.
Everyone takes one step back and applauds our brave and noble volunteer. Naturally I would point out that for Territory to be useful it should already know how to format numbers… but all of the Text stuff is obsoleted by Unicode – if you have the UnicodeData then everything else follows. (And what is the point of being able to collate, say, Swedish, if you can’t compare it with French? Unicode solves that).
Well I still use a customised version of Aasm, so we’re all normal here, Napoleon.
“Territory” numbers are currently Country numbers, which are definitely bytes and definitely not countries. For backwards compatibility one would hope to get sensible results from existing APIs when in a newly-coined state. For example, if I configured my machine to have Canadian English, Swiss French and Brazilian Spanish languages, are the slightly esoteric country numbers more useful than the simpler numbers? ie, perhaps 1=en, 6=fr, 5=sp, rather than 1=GB, 6=FR, 5=ES, IYSWIM. Some other API provides the more detailed and nuanced (and probably hierarchical) state. |
Tristan M. (2946) 1036 posts |
Does anybody remember my recent-ish experiments with using an Atmega32u8 as a mouse and / or keyboard device in RO? I found that the keyboard device was difficult because RO could only use a keyboard in boot (or was it BIOS?) mode. The correct nomenclature escapes me sorry. It did however seem to be a cause of some keyboard support issues. |
nemo (145) 2437 posts |
I don’t, and the forum’s search doesn’t seem to find it either. Did you solve your problem? |
Tristan M. (2946) 1036 posts |
It would seem phone posting propagated a previous typo. It should have been ATMega32u4 https://www.riscosopen.org/forum/forums/11/topics/12363?page=1#posts-84641 RO was only willing to support a keyboard which supports boot mode, which IIRC is somewhat limited. I believe there should be some sort of mode change after the initial detection but it doesn’t happen with RO. |
nemo (145) 2437 posts |
I’ve zero idea how USB is implemented in RO5, since my flirtation with USB ended with an early prototype of the Simtec USB board on a RiscPC. I’ll have a glance at the sources. <pokes sources> Ok, so because your keyboard doesn’t describe itself as supporting ‘boot protocol’ it is ignored by the USB driver. Does your keyboard present multiple endpoints for N-key rollover? If so, are these used dynamically (ie only when more than six keys are held) or does it put some keys on one endpoint and some on another? If it’s the former case – dynamic – then it would work fine if the USB driver relaxed its insistence on boot protocol. If it’s the latter then I can understand not wishing to have to support that. I wonder what benefit there is from refusing to recognise a keyboard because of apparent lack of ‘boot protocol’ – is it not the case that the worst that would happen is some or all keys would not work? How is that worse than the entire keyboard not working? And since it is highly likely that most non-compliant keyboards would work anyway (as long as you don’t type with boxing gloves) I suspect this restriction is overly virtuous. Do the USB experts have a more satisfactory explanation? |
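For context on the six-key limit mentioned above, here is a Python sketch of the standard 8-byte USB HID boot-protocol keyboard report: one modifier byte, one reserved byte, and six key slots. The function name and return shape are illustrative only and have nothing to do with the RISC OS driver sources.

```python
MODIFIER_BITS = [
    "LeftCtrl", "LeftShift", "LeftAlt", "LeftGUI",
    "RightCtrl", "RightShift", "RightAlt", "RightGUI",
]
ERROR_ROLLOVER = 0x01  # fills every slot when too many keys are held


def parse_boot_report(report: bytes):
    """Decode an 8-byte boot-protocol keyboard report.

    Returns (modifiers, keys): a set of modifier names and a list of
    keyboard-page usage IDs, or None for keys on rollover error.
    """
    if len(report) != 8:
        raise ValueError("boot keyboard reports are exactly 8 bytes")
    modifiers = {
        name for bit, name in enumerate(MODIFIER_BITS) if report[0] & (1 << bit)
    }
    slots = report[2:8]  # byte 1 is reserved
    if all(usage == ERROR_ROLLOVER for usage in slots):
        return modifiers, None
    return modifiers, [usage for usage in slots if usage != 0]
```

Six slots is where the "more than six keys" limit comes from; keyboards offering N-key rollover have to use a different report layout or extra endpoints.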
nemo (145) 2437 posts |
Is this right? RO5’s USB keyboard handler doesn’t support multimedia keys?! :-O |
Steve Pampling (1551) 7932 posts |
I suspect the remit was “make it work like a genuine old Acorn keyboard”, after all when was the last keyboard made that required “KeyNo_Copy”? The venerable RPC upstairs has a keycap that says “End”. Anyway, what is this new-fangled “multimedia” stuff you speak of? :) Edit: To paraphrase an old saying “112 keys should be enough for anyone” |
nemo (145) 2437 posts |
I’ve skimmed through the source (god how I hate looking at the official sources, I don’t know how people put up with it. Kudos to Jeff et al) and I can’t find the inevitable usage page 7 restriction, but it must be there somewhere. The driver would need to pay attention to usage page 12 for the multimedia keys, it would be very little additional code (because of reuse) and one additional table. Just let me check… yup, it is 2019. No multimedia keys. Insistence on boot protocol. 2019. Crikey. |
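To show how small the "one additional table" might be, here is a Python sketch that routes HID usages to low-level key numbers by usage page. The usage IDs are the standard HID ones (page 7 Keyboard, page 12 Consumer); the low-level numbers given to the multimedia keys are invented for illustration, and only the PageUp value of 33 comes from earlier in the thread.

```python
KEYBOARD_PAGE = 0x07   # 'usage page 7'
CONSUMER_PAGE = 0x0C   # 'usage page 12'

# Existing style of table: keyboard usage -> low-level key.
KEYBOARD_TO_LOW_LEVEL = {
    0x4B: 33,  # PageUp
}

# The one additional table: consumer usage -> low-level key.
# The low-level numbers here are hypothetical.
CONSUMER_TO_LOW_LEVEL = {
    0xE2: 104,  # Mute
    0xE9: 105,  # Volume Increment
    0xEA: 106,  # Volume Decrement
    0xCD: 107,  # Play/Pause
}

TABLES = {
    KEYBOARD_PAGE: KEYBOARD_TO_LOW_LEVEL,
    CONSUMER_PAGE: CONSUMER_TO_LOW_LEVEL,
}


def low_level_key(usage_page: int, usage_id: int):
    """Look up the low-level key for a usage, or None if unsupported."""
    return TABLES.get(usage_page, {}).get(usage_id)
```

The lookup code is shared between pages, which is the reuse the post alludes to.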
Steffen Huber (91) 1945 posts |
I think what you linked to is the “Boot keyboard support” inside the low-level USB driver which is active initially on boot. Later on, the USB keyboard driver from NetBSD is used: https://www.riscosopen.org/viewer/view/mixed/RiscOS/Sources/HWSupport/USB/NetBSD/build/c/usbkboard?rev=1.9;content-type=text%2Fplain Not sure if this is better or worse, please have a look :-) |
nemo (145) 2437 posts |
They’re identical. BTW I haven’t forgotten about the Devanagari keyboard, it’s on the list. |
nemo (145) 2437 posts |
The IANA language/script/region registry would seem to be the obvious way to unambiguously define translations of applications’ messages, which is currently done by several home-spun methods involving the misleading ‘Country’ number, ‘Country’ name, or other mystic heuristic. This would allow sensible substitution, so if your preferred

Type: language
Subtag: aa
Description: Afar
Added: 2005-10-16
%%
Type: language
Subtag: ab
Description: Abkhazian
Added: 2005-10-16
Suppress-Script: Cyrl
%%
Type: language
Subtag: ae
Description: Avestan
Added: 2005-10-16
%%
Type: language
Subtag: af
Description: Afrikaans
Added: 2005-10-16
Suppress-Script: Latn
%%
Type: language
Subtag: ak
Description: Akan
Added: 2005-10-16
Scope: macrolanguage
%%
Type: language
Subtag: am
Description: Amharic
Added: 2005-10-16
Suppress-Script: Ethi
%%
Type: language
Subtag: an
Description: Aragonese
Added: 2005-10-16
%%
Type: language
Subtag: ar
Description: Arabic
Added: 2005-10-16
Suppress-Script: Arab
Scope: macrolanguage

This can be trivially compressed by 80%:

aa:,Afar
ab:/Cyrl,Abkhazian
ae:,Avestan
af:/Latn,Afrikaans
ak:,Akan
am:/Ethi,Amharic
an:,Aragonese
ar:/Arab,Arabic

And if you don’t need the wordy labels, it’s less than 40KB:

aa
ab:/Cyrl
ae
af:/Latn
ak
am:/Ethi
an
ar:/Arab

This would be enough to parse and manipulate the language-extlang-script-region-variant-extensions format, normalise, substitute and generalise in a modestly sized module. Then, just as Templates and Resource files are often similarly localised, but I can’t think of a cheap way of merging them. It’s not impossible, it’s just a lot more work and less well defined. |
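The compression shown above is easy to reproduce. Here is a Python sketch that parses `%%`-separated records and emits both compact forms; it assumes one field per line and one Description per record, which holds for the sample above but not necessarily for every entry in the real registry.

```python
def parse_registry(text: str) -> list:
    """Split registry text into records (dicts of field name -> value)."""
    records = []
    for chunk in text.split("%%"):
        record = {}
        for line in chunk.strip().splitlines():
            key, sep, value = line.partition(": ")
            if sep:
                record[key.strip()] = value.strip()
        if record:
            records.append(record)
    return records


def compress(records: list, with_labels: bool = True) -> list:
    """Emit 'subtag:/Script,Description' lines, or 'subtag[:/Script]' without labels."""
    lines = []
    for record in records:
        if record.get("Type") != "language":
            continue
        script = record.get("Suppress-Script")
        script_part = "/" + script if script else ""
        if with_labels:
            lines.append(f"{record['Subtag']}:{script_part},{record['Description']}")
        elif script:
            lines.append(f"{record['Subtag']}:{script_part}")
        else:
            lines.append(record["Subtag"])
    return lines
```

Fields such as Added and Scope are simply dropped, which is where most of the saving comes from.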
nemo (145) 2437 posts |
As for keyboards, it’s silly that keyboard layouts are switched by “dialling code” (which is not, in general, a dialling code and, in the triflingly tiny case of the USA and Canada, can’t tell one from the other) but those codes must be hard-wired into the Keyboard Handler. This is bad. I propose a new Service_International reason code to select keyboard by “dialling code”, so the Handler responds to Alt-Ctrl-F12 (or some alternative where those keys are not available) by inputting a code and then delegating it to other modules. This would allow multiple Handlers to coexist in a future-proof way. It would also allow other codes to be coined for more awkward cases – we already have codes for Dvorak and Esperanto (which aren’t countries, honestly), so there should be codes for USA and Canada, amongst others. The user may wish to coin their own codes. Also, Ctrl-Alt-F1 selects the UK keyboard layout (because Rule Britannia, apparently) and Ctrl-Alt-F2 selects the configured keyboard (because Know Your Place), but F3-F11 ought to be available for binding to the user’s preferred layouts. Again, these codes should be delegated via Service_International. We do need to move away from the Master-Compact-era Territory/Country/Keyboard/Alphabet model. It just cannot cope with the wider world. |
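The delegation proposed above might look something like this in outline. It is a Python model only: the real mechanism would be a service call offered to each module in turn, and every name here (`LayoutRegistry`, the handler functions, the coined code 1001) is hypothetical. The codes 44 and 33 are the UK and French dialling codes.

```python
class LayoutRegistry:
    """Offers a 'dialling code' to each registered handler until one claims it."""

    def __init__(self):
        self._handlers = []
        self._function_keys = {}

    def register(self, handler):
        """Add a handler: a callable taking a code, returning a layout name or None."""
        self._handlers.append(handler)

    def select_by_code(self, code: int):
        for handler in self._handlers:
            layout = handler(code)
            if layout is not None:  # claimed, stop offering it around
                return layout
        return None

    def bind_function_key(self, number: int, code: int):
        """Bind Ctrl-Alt-F3..F11 to a code (F1, F2 and F12 stay reserved)."""
        if not 3 <= number <= 11:
            raise ValueError("only F3 to F11 are available for binding")
        self._function_keys[number] = code

    def press_function_key(self, number: int):
        code = self._function_keys.get(number)
        return None if code is None else self.select_by_code(code)


def european_handler(code: int):
    """Stands in for an existing handler with hard-wired countries."""
    return {44: "UK", 33: "France"}.get(code)


def user_handler(code: int):
    """Stands in for an add-on module with a user-coined code."""
    return {1001: "Canada"}.get(code)
```

Because handlers are asked in turn, a new module can add codes without the existing handler knowing anything about them.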
Steve Pampling (1551) 7932 posts |
Shortcut key assignment ought to be easier to change – then again, available distinct physical key combinations ought to have distinct codes (that’s a pig’s ear, which you know or DeepKeys probably wouldn’t exist)
Can’t really go with ctrl-alt-f1 for USA-like country codes, as we fall short on the F keys way before F44.
I think everyone agrees, but people are worried about breaking various applications. |