Changes between Version 3 and Version 4 of UTF8Notes


Timestamp:
10/03/11 15:43:16
Author:
noz
Comment:

Updated remaining ports

  • UTF8Notes

    v3 v4  
    2525(What about screen dumps?) 
    2626 
     27---- 
     28 
    2729== Internals == 
    2830 
     
    3133The main screen and each of the other terminal screens are represented internally as two arrays, one of "attributes" ('''byte attr''') and one of "characters" ('''wchar_t char'''). The characters to be displayed are stored as Unicode, in whatever the platform's native '''wchar_t''' representation of a Unicode character is. When strings are printed to the screen, they are converted from UTF-8 (as '''char *''') to wide characters ('''wchar_t *''') using '''z-term.c''':''Term_mbstowcs()''. This allows the conversion function to be overloaded if a particular platform needs it. 
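The overloading described above can be sketched as a function pointer that a port may set. This is a minimal illustration, not the code in '''z-term.c''': the hook type, the global ''mbcs_hook'' variable, and the names ''term_mbstowcs_sketch()'' and ''ascii_only_hook()'' are assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical hook type, shaped like the standard mbstowcs(). */
typedef size_t (*mbcs_hook_fn)(wchar_t *dest, const char *src, size_t n);

/* Hypothetical stand-in for a per-port hook; NULL means "use the C library". */
static mbcs_hook_fn mbcs_hook = NULL;

/* Convert a multibyte string to wide chars, preferring the port's hook. */
size_t term_mbstowcs_sketch(wchar_t *dest, const char *src, size_t n)
{
    if (mbcs_hook)
        return mbcs_hook(dest, src, n);
    return mbstowcs(dest, src, n);
}

/* Example override: widens each byte unchanged, so it is only right for ASCII. */
size_t ascii_only_hook(wchar_t *dest, const char *src, size_t n)
{
    size_t i = 0;

    while (i < n && src[i] != '\0') {
        dest[i] = (wchar_t)(unsigned char)src[i];
        i++;
    }
    if (i < n)
        dest[i] = L'\0';
    return i;
}
```

A port with a broken or locale-dependent ''mbstowcs()'' would install its own converter once at start-up; every caller then goes through the same entry point.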
    3234 
    33 When they are displayed, the wide chars are put on the screen in different ways, depending on the port (see below). In the case of graphics tiles, things are slightly different. The "character" is still stored as a '''wchar_t''', but only the bottom 7 bits are used, as an index into a large 2-D bitmap, containing the tiles along the x axis, and the attributes ("colour") on the y axis. (''Check this bit'') In the original design, I had hoped to treat tiles as  a special case of a font, and allow all unicode character support, but the tiles are multi-coloured, so this cannot work.  
     35When they are displayed, the wide chars are put on the screen in different ways, depending on the port (see below). In the case of graphics tiles, things are slightly different. The "character" is still stored as a '''wchar_t''', but only the bottom 7 bits are used, as an index into a large 2-D bitmap, containing the tiles along the x axis, and the attributes ("colour") on the y axis (''Check this bit''). A tile used to be indicated by the top bit being set in both the attribute and the character, but it is now indicated only by the top bit of the attribute. In the original design, I had hoped to treat tiles as a special case of a font, and allow all Unicode character support, but the tiles are multi-coloured, so this cannot work.  
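The tile encoding described above might look roughly like this. It is a sketch under stated assumptions: that the "top bit" of the attribute byte is 0x80, and that the bitmap row is the bottom 7 bits of the attribute. The function and macro names are invented for the example, and the text itself marks this area as needing a check.

```c
#include <stdbool.h>
#include <wchar.h>

typedef unsigned char byte;

/* Assumption: the "top bit" of the attribute byte is 0x80. */
#define TILE_FLAG 0x80
#define TILE_MASK 0x7F

/* A cell holds a tile when the top bit of its attribute is set. */
bool cell_is_tile(byte attr)
{
    return (attr & TILE_FLAG) != 0;
}

/* Column in the tile bitmap (x axis): bottom 7 bits of the character. */
int tile_column(wchar_t ch)
{
    return (int)(ch & TILE_MASK);
}

/* Row in the tile bitmap (y axis): assumed to be the bottom 7 bits of the attribute. */
int tile_row(byte attr)
{
    return attr & TILE_MASK;
}
```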
    3436 
    3537=== Textblock === 
     
    4143In reading the edit files, all strings are maintained in UTF-8 until needed. 
    4244Glyphs are read in directly to a wchar_t type. 
     45 
     46---- 
    4347 
    4448== Ports == 
     
    5256=== X11 === 
    5357 
     58Wide chars on the canvas are drawn directly to the window using ''!XwcDrawImageString()'' in ''Infofnt_test_std()''. Fonts are now rendered using '''XFontSet'''s, rather than the previous '''XFontStruct'''. 
     59 
    5460=== GCU === 
     61 
     62Now requires the "wide" version of ncurses (i.e. ncursesw), and will fail to build if this is not present. 
     63 
     64Wide characters from the canvas are written directly to the screen using ''mvwaddnwstr()'' in ''Term_text_gcu()''. 
     65 
     66Some of the default symbols have changed as follows: 
     67 
     68||= Feature =||= From =||= To =|| 
     69||Floor||Period '.' (U+002E)||MIDDLE DOT '·' (U+00B7)|| 
     70||Magma||?? (0x03)||MEDIUM SHADE '▒' (U+2592)|| 
     71||Quartz Vein||?? (0x03)||LIGHT SHADE '░' (U+2591)|| 
     72||Granite Wall||?? (0x02)||DARK SHADE '▓' (U+2593)|| 
     73 
     74This has the added advantage that standard fonts can be used, and it is not necessary to resort to hacking fonts to get "solid walls". 
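As an illustration of the new defaults in the table above, the "To" column can be written as a small lookup table of Unicode code points. The struct, array, and function names are hypothetical; the real game stores these mappings in its own data files.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical pairing of a feature name with its default glyph. */
struct feature_glyph {
    const char *feature;
    wchar_t glyph;
};

/* Code points taken from the "To" column of the table. */
static const struct feature_glyph gcu_glyphs[] = {
    { "Floor",        0x00B7 }, /* MIDDLE DOT */
    { "Magma",        0x2592 }, /* MEDIUM SHADE */
    { "Quartz Vein",  0x2591 }, /* LIGHT SHADE */
    { "Granite Wall", 0x2593 }, /* DARK SHADE */
};

/* Look up a feature's glyph; returns L'?' for names not in the table. */
wchar_t glyph_for_feature(const char *feature)
{
    size_t i;

    for (i = 0; i < sizeof gcu_glyphs / sizeof gcu_glyphs[0]; i++) {
        if (strcmp(gcu_glyphs[i].feature, feature) == 0)
            return gcu_glyphs[i].glyph;
    }
    return L'?';
}
```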
    5575 
    5676=== Windows === 
    5777 
     78Windows does not properly support UTF-8 using the standard C library routines for locale, so the ''term->mbcs_hook'' function is defined to use the Windows-native ''MultiByteToWideChar()'' function, and the external files are assumed to be in UTF-8. Wide chars from the canvas are written directly to the screen using ''ExtTextOutW()'' in ''Term_text_win()''. 
     79 
    5880=== OSX === 
     81 
     82Work in progress (I believe). 
    5983 
    6084=== GTK === 
    6185 
     86No work has been done to change the GTK port to support UTF-8, so it will probably not even compile. 
     87 
    6288=== Android === 
     89 
     90There are significant problems in adapting an Android port to this change, as older versions of Android lack proper support for wide chars. I understand that '''wchar_t''' is implemented as an 8-bit quantity, and that some of the support functions, such as ''mbstowcs()'', are missing or broken. It may be possible to work around the latter by overloading ''Term_mbstowcs()''. ''Please update this if you make any significant progress.''
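One way such an override could avoid the C library altogether is a hand-written UTF-8 decoder. The following is a rough sketch, not tested on Android: the function name is hypothetical, it does not reject overlong encodings or surrogates, and where '''wchar_t''' is too narrow for a code point it simply substitutes '?'.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/*
 * Decode UTF-8 from src into at most n wide chars in dest, without calling
 * mbstowcs(). NUL-terminates dest if room remains. Returns the number of
 * wide chars written, or (size_t)-1 on a malformed sequence.
 */
size_t utf8_to_wcs_sketch(wchar_t *dest, const char *src, size_t n)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t count = 0;

    while (*s && count < n) {
        unsigned long cp;
        int extra;

        /* The lead byte gives the sequence length and the top payload bits. */
        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return (size_t)-1;
        s++;

        /* Each continuation byte must look like 10xxxxxx and adds 6 bits. */
        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80)
                return (size_t)-1;
            cp = (cp << 6) | (unsigned long)(*s & 0x3F);
            s++;
        }

        /* A narrow wchar_t cannot hold every code point: fall back to '?'. */
        if (cp > (unsigned long)WCHAR_MAX)
            cp = '?';
        dest[count++] = (wchar_t)cp;
    }

    if (count < n)
        dest[count] = L'\0';
    return count;
}
```

With an 8-bit '''wchar_t''' this would still lose everything above that type's range, so it only addresses the missing or broken ''mbstowcs()'', not the storage problem.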