Desktop-wide Unicode support
Guide target £7,500
RISC OS 5 has a Unicode Font Manager. Currently, various other areas of the operating system either don’t make use of it or aren’t aware of some of the later Unicode standards that postdate the Font Manager work. This bounty revisits all the remaining locations where Unicode might be encountered.
Overview
In the context of the desktop, we’re specifically talking about UTF-8, a byte-oriented encoding of Unicode that is backwards compatible with ASCII: every ASCII character is encoded as the same single byte.
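As a small illustration of that compatibility, the sketch below (in Python, chosen only for brevity; the helper name `describe` is made up for this example) shows that ASCII text is byte-for-byte identical in UTF-8, while other characters become multi-byte sequences in which every byte has its top bit set.

```python
# ASCII characters keep their single-byte values in UTF-8.
ascii_text = "Edit"
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Non-ASCII characters become multi-byte sequences.
pound = "£".encode("utf-8")   # U+00A3
euro = "€".encode("utf-8")    # U+20AC


def describe(encoded: bytes) -> str:
    """Return the bytes as space-separated upper-case hex."""
    return " ".join(f"{b:02X}" for b in encoded)


print(describe(pound))  # C2 A3
print(describe(euro))   # E2 82 AC
```

Because no byte of a multi-byte sequence falls in the ASCII range, software that only looks for ASCII delimiters (such as `.` in a path) is not confused by UTF-8 text.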
To complete the transition to Unicode, to the point where the entire desktop experience can run in UTF-8, the following main areas need addressing:
- Edit
  - To recognise the byte order mark (BOM) header and switch to UTF-8 editing
  - To navigate by characters rather than bytes when moving the caret
  - Search and replace UTF-8 sequences
  - Copy and paste UTF-8 via the global clipboard
- Toolbox TextArea gadget
  - To navigate by characters rather than bytes when moving the caret
  - Copy and paste UTF-8 via the global clipboard
  - Adopting the latest UTF-8 standard (RFC3629)
- Configurability of the desktop alphabet
- The printing system
  - Updating the PostScript printer to output Unicode (the bit image drivers already do so)
  - Adapting FontPrint to reflect mappings between fonts where possible
- File names with UTF-8 in them
  - Directory listing display, sort order, name matching
  - Translating from the filing system’s representation to UTF-8
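Two of the items above, BOM recognition and caret movement by character, can be sketched as follows. This is only an illustration in Python under stated assumptions: the function names are hypothetical, the buffer is assumed to hold well-formed UTF-8, and a "character" here means one code point (combining sequences are not grouped). It does not describe how Edit or the TextArea gadget are actually implemented.

```python
UTF8_BOM = b"\xEF\xBB\xBF"  # U+FEFF encoded as UTF-8


def has_utf8_bom(data: bytes) -> bool:
    """True if the file contents start with the UTF-8 byte order mark."""
    return data.startswith(UTF8_BOM)


def is_continuation(byte: int) -> bool:
    """UTF-8 continuation bytes have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def caret_right(buf: bytes, pos: int) -> int:
    """Return the byte offset of the next character boundary after pos."""
    if pos >= len(buf):
        return len(buf)
    pos += 1
    while pos < len(buf) and is_continuation(buf[pos]):
        pos += 1
    return pos


def caret_left(buf: bytes, pos: int) -> int:
    """Return the byte offset of the previous character boundary before pos."""
    if pos <= 0:
        return 0
    pos -= 1
    while pos > 0 and is_continuation(buf[pos]):
        pos -= 1
    return pos


text = "a£€b".encode("utf-8")    # 61 | C2 A3 | E2 82 AC | 62
print(caret_right(text, 1))      # 3: steps over the two bytes of £
print(caret_left(text, 6))       # 3: steps back over the three bytes of €
```

The same boundary test (skip bytes matching `10xxxxxx`) is also what keeps a search-and-replace or a clipboard copy from splitting a character in the middle of its byte sequence.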
Technical details
Where there is duplication of effort, it may be worthwhile changing components to make use of UnicodeLib rather than carry their own local implementations, in particular where UTF-8 support was added historically only to be superseded by RFC3629. However, this may not be possible for assembler components.
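For reference, RFC 3629 limits UTF-8 to sequences of at most four bytes covering U+0000 to U+10FFFF, and treats overlong forms and encoded surrogates (U+D800 to U+DFFF) as ill-formed, whereas the earlier definition allowed sequences of up to six bytes. The sketch below shows one way those checks could look. It is written in Python for brevity, the function name `decode_rfc3629` is hypothetical, and it is not the UnicodeLib interface.

```python
def decode_rfc3629(buf: bytes, pos: int = 0):
    """Decode one code point starting at buf[pos].

    Returns (code_point, length_in_bytes), or None if the sequence is
    not well-formed UTF-8 as defined by RFC 3629.
    """
    if pos >= len(buf):
        return None
    lead = buf[pos]
    if lead < 0x80:
        return lead, 1                      # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        need, cp, minimum = 1, lead & 0x1F, 0x80
    elif 0xE0 <= lead <= 0xEF:
        need, cp, minimum = 2, lead & 0x0F, 0x800
    elif 0xF0 <= lead <= 0xF4:
        need, cp, minimum = 3, lead & 0x07, 0x10000
    else:
        # Stray continuation byte, overlong lead C0/C1, or F5..FF
        # (which includes the old five- and six-byte lead bytes).
        return None
    if pos + need >= len(buf):
        return None                         # truncated sequence
    for b in buf[pos + 1:pos + 1 + need]:
        if (b & 0xC0) != 0x80:
            return None                     # not a continuation byte
        cp = (cp << 6) | (b & 0x3F)
    if cp < minimum:
        return None                         # overlong encoding
    if 0xD800 <= cp <= 0xDFFF:
        return None                         # surrogates are not characters
    if cp > 0x10FFFF:
        return None                         # beyond the Unicode range
    return cp, need + 1


print(decode_rfc3629("€".encode("utf-8")))      # (8364, 3)
print(decode_rfc3629(b"\xF8\x88\x80\x80\x80"))  # None: five-byte form
```

A component that still accepts the five- and six-byte forms would need its lead-byte and range checks tightened along these lines, or replaced by a shared implementation.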
Some filing systems, in particular DOSFS and CDFS, already support UCS-2 or similar extended character sets, even if not UTF-8 directly. The network filing systems NFS and ShareFS could encounter a Unicode-aware partner and hence need to exchange data too. The local filing systems PipeFS and ResourceFS may also encounter creation of files with UTF-8 names.
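Translating a name held in a 16-bit Unicode encoding into UTF-8 could look roughly like the sketch below. It is illustrative only: the function name is hypothetical, Python is used for brevity, and which byte order a given filing system or disc format uses is an assumption the caller must supply. Unpaired surrogates are replaced with U+FFFD here, which is one possible policy rather than a requirement of the bounty.

```python
def utf16_name_to_utf8(raw: bytes, big_endian: bool = False) -> bytes:
    """Convert a file name stored as 16-bit units into UTF-8 bytes."""
    order = "big" if big_endian else "little"
    # Split into 16-bit units; a trailing odd byte is ignored.
    units = [int.from_bytes(raw[i:i + 2], order)
             for i in range(0, len(raw) - 1, 2)]
    out = bytearray()
    i = 0
    while i < len(units):
        cp = units[i]
        i += 1
        if (0xD800 <= cp <= 0xDBFF and i < len(units)
                and 0xDC00 <= units[i] <= 0xDFFF):
            # Combine a high/low surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        elif 0xD800 <= cp <= 0xDFFF:
            cp = 0xFFFD  # unpaired surrogate: substitute U+FFFD
        if cp < 0x80:
            out.append(cp)
        elif cp < 0x800:
            out += bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
        elif cp < 0x10000:
            out += bytes([0xE0 | (cp >> 12),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
        else:
            out += bytes([0xF0 | (cp >> 18),
                          0x80 | ((cp >> 12) & 0x3F),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
    return bytes(out)


stored = "Résumé".encode("utf-16-le")
print(utf16_name_to_utf8(stored) == "Résumé".encode("utf-8"))  # True
```

The reverse direction, and the handling of characters that are meaningful in RISC OS path names, would also need deciding per filing system and is not covered by this sketch.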
Note: The use of Unicode in FileCore and LanManFS is covered in other bounties and is not part of this work. We also don’t expect to support Unicode in NetFS.
It is also expected that there will be minor remedial work needed in the Font Manager and Window Manager which will become apparent during the above larger tasks, such as unambiguously displaying missing glyphs within a font.
Deliverables
- Updated source code to
  - Edit
  - the Toolbox TextArea
  - the printing system
  - the Font Manager and Window Manager, if applicable
  - the alphabet Configure plug-in
  - affected filing systems as listed above
- Application note(s) explaining to authors how to use Unicode in the desktop
- Revised text for the User Guide, if substantial in nature
Donations | 59 |
---|---|
Guide target | £7,500.00 |
Total | £2,135.00 (28%) |
State | Open |