Using Unicode characters encoded as HTML entities also lets you have many languages on the page simultaneously, not just Latin characters with some diacritics, as in ISO-8859-1.
The cons are that some browsers (I believe only very old ones by now, and perhaps some extremely lightweight ones, e.g. on PDAs) won't process and display HTML entities correctly.
Does anybody want to add to this?
--AV
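To illustrate the point about mixing several languages on one page, here is a minimal Python sketch, assuming the page is to be stored as plain ASCII. The helper name `to_char_refs` and the sample string are made up for illustration. Note that the standard `xmlcharrefreplace` error handler emits numeric character references (`&#NNNN;`), not named entities.

```python
import html

def to_char_refs(text: str) -> str:
    """Return text as pure ASCII, with every non-ASCII character
    replaced by a decimal numeric character reference (&#NNNN;)."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Illustrative sample: Latin thorn, a Cyrillic letter, and two CJK characters.
sample = "\u00fe \u0438 \u65e5\u672c"
encoded = to_char_refs(sample)
print(encoded)  # &#254; &#1080; &#26085;&#26412;

# An HTML-aware decoder recovers the original characters.
assert html.unescape(encoded) == sample
```

This only handles non-ASCII characters; escaping `&`, `<` and `>` is a separate step (for example with `html.escape`, applied before the conversion).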
Well, it looks like we should be using Unicode throughout then. Anyone know where we can get a definitive list of them for pretty much any language we might need? It might be a useful page to have up here with links. sjc
They are not Unicode (although, as AV says, modern browsers will generally map them to Unicode). There's a complete list (for HTML4.01) at http://www.w3.org/TR/html4/sgml/entities.html . Those in the first table should work even in fairly old browsers, but most of those in the other two tables are less well supported. Other characters can be obtained by using Unicode instead, and you can get a complete list of Unicode codes from http://www.unicode.org/Public/UNIDATA/ by downloading the UnicodeData.txt file (plus the huge Unihan.txt file if you want Chinese-Japanese-Korean characters too). --Zundark, 2001 Oct 14
Thanks muchly. sjc
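For anyone who would rather query the tables than read them, a small Python sketch follows. It assumes Python's standard library is acceptable as a stand-in for the two lists above: `html.entities.name2codepoint` holds the HTML 4 entity names, and `unicodedata` carries a copy of the Unicode character database. The function name `describe_entity` is made up for illustration.

```python
import unicodedata
from html.entities import name2codepoint

def describe_entity(name: str) -> str:
    """Describe an HTML 4 named entity: code point, character, Unicode name."""
    cp = name2codepoint[name]  # raises KeyError if the name is not defined
    ch = chr(cp)
    return f"&{name}; = U+{cp:04X} {ch} ({unicodedata.name(ch)})"

print(describe_entity("thorn"))  # &thorn; = U+00FE þ (LATIN SMALL LETTER THORN)
print(describe_entity("eth"))    # &eth; = U+00F0 ð (LATIN SMALL LETTER ETH)
```

The entity name is an HTML-level label; the code point it maps to is the Unicode part.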
Using Unicode named entities is a way to bypass this restriction. By writing a name, you no longer assume that the software used to write the page agrees with the software displaying it. That assumption is often incorrect, since the browser does not have enough information about the authoring software to figure out how it chose to encode some of the symbols. With Unicode, however, you explicitly tell the web browser "give me a thorn". A good (HTML 4.0-compliant) browser should then look up the font table and try to display the symbol, no matter how it is represented internally.
The conclusion is that using Unicode is the only fully correct practice. It enables people from all over the world to see the page correctly the moment they arrive at it. It is also extensible, so it allows several alphabets to be used on a single page. Unicode named entities are supported by most recent browsers (both IE and Netscape since version 4). Although older browsers might have problems with Unicode, any solution optimized for them will break a much larger number of Unicode-compliant systems that for some reason do not use the same encodings for some characters.
--Uriyan
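The encoding-mismatch argument above can be sketched in a few lines of Python. This is only an illustration under assumed conditions: the two encodings (`latin-1` and `cp1251`) are picked as examples of what an author and a reader might each be using, and `html.unescape` stands in for the browser's entity lookup.

```python
import html

raw = b"\xfe"  # one byte, as an 8-bit page might store thorn

# The same byte means different things under different assumed encodings.
as_latin1 = raw.decode("latin-1")  # thorn (U+00FE)
as_cp1251 = raw.decode("cp1251")   # Cyrillic small letter yu (U+044E)
print(as_latin1 == as_cp1251)      # False

# The named entity is plain ASCII, so it decodes identically either way
# and always resolves to U+00FE.
entity = b"&thorn;"
resolved = {html.unescape(entity.decode(enc)) for enc in ("latin-1", "cp1251")}
print(resolved == {"\u00fe"})      # True
```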
Sing a song of Saxons
In the Wapentake of Rye
Four an twenty eaoldormen
Too eaold to die ...
That's more like Middle English, I suppose, but they also have a great take on the Beowulf there. --AV
Certainly have, I just recently bought myself a new copy, the old one was falling apart. One of the great critical analyses of English history, and considerably more accurate than many serious takes on the subject. sjc