On Tue, Jan 13, 2004 at 03:28:03PM +0100, Benct Philip Jonsson wrote:
> At 14:34 12.1.2004, Mark J. Reed wrote:
> >RM> Does "&schwa;" exist?
> >
> >Nope.  Although one of the nice things about XHTML is that XML lets you
> >define your own entities, so that you can add it if you like.
>
> How is that done?  Can it be done in a stylesheet?

The later replies on this thread by me and Tristan had an example,
but here's what is required:

1.  A browser that will apply HTML-type rendering to an XHTML document.

        It appears that Internet Explorer doesn't do this.  If it
        recognizes a document as XHTML and parses it as an XML document,
        all you get is pretty-printed source, rather than a visual rendering.
        Or, if it parses it as an HTML document, then it is rendered, but
        XML-specific tricks like custom entities don't work.
        If anyone knows a way to convince IE to both parse as XML and
        render as HTML, please let me know.

        Mozilla/Firebird/Gecko works fine; I haven't tried Opera or
        any other browsers.

2.  The document must be recognized by the browser as XHTML, not just
    HTML.

        That means the Content-Type sent by the web server has to
        be "application/xhtml+xml", not "text/html".  A modern web
        server will do the right thing if the file is named with
        a .xhtml suffix; any browser that meets condition 1 will
        probably also do the right thing if you open a local file
        with such a suffix.

        Note that if the web server is not configured properly, you can't
        fake it with a <meta http-equiv> element, because by the time that
        element is processed the browser has already decided whether it's HTML
        or XHTML.

3.  The document has to be legal X[HT]ML.

        Among other things, this means that the document should start with
        the XML declaration, and nothing may precede it - no leading
        whitespace, even:

                <?xml version="1.0"?>

        An XML document is assumed to be Unicode encoded as UTF-8 (or
        UTF-16, if it starts with a byte-order mark) by default; if you're
        using another character set, such as Latin-1, you must say so in
        the XML declaration, like so:

                <?xml version="1.0" encoding="iso-8859-1"?>

        After that you need a <!DOCTYPE> declaration (see below), and then
        you can finally get things rolling with the opening tag of the
        <html> element - which needs some extra attributes.  The rest of
        the document has to be valid XHTML: lowercase element names, all
        empty tags explicitly marked, all attribute values quoted, etc.

4.  Any custom entities are defined in the DOCTYPE declaration.
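
If you can't change the web server's configuration for condition 2, one way
to experiment on your own machine is a tiny test server that sends the right
Content-Type itself.  What follows is only a sketch built on Python's
standard http.server module; the names XHTMLRequestHandler and serve are
made up for this example, and it is meant for local testing, not production.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


class XHTMLRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that labels .xhtml files as XHTML (hypothetical name)."""

    # Copy the stock extension table and add .xhtml explicitly, so the
    # result does not depend on the system's MIME configuration.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".xhtml": "application/xhtml+xml",
    }


def serve(port=8000):
    """Serve the current directory on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), XHTMLRequestHandler).serve_forever()
```

Calling serve() from the directory that holds your .xhtml files should then
let a browser fetch them with "application/xhtml+xml" in the Content-Type
header.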

XML defines an entire (infinitely large) family of markup languages; the
<!DOCTYPE> declaration tells the parser which particular language is in
use for the document containing it, by pointing to a formal description of
that language (called a Document Type Definition, or DTD).  For describing
a web page, the particular language is XHTML, but even then, there are
several dialects to choose from.  If your web page is using any of the
older presentation-type markup (<body bgcolor=>, <font>, <center>, etc.;
basically anything that is supposed to be done with stylesheets these days),
you need to label it as XHTML 1.0 Transitional:

<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Otherwise, you should label it as XHTML 1.1:

<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.1//EN"
     "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

You also need to specify the default namespace and language
in the <html> tag:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">

The first time you try to get all this working, you should probably
make sure the document validates as-is before trying to add the
custom entities.  You can check it at http://validator.w3.org.
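
Before going to the validator, a quicker local check is whether the file is
at least well-formed XML, which catches things like whitespace in front of
the XML declaration or an unclosed <br>.  Here is a minimal sketch using
xml.etree.ElementTree from Python's standard library; the function name
well_formedness_error is hypothetical.  Note that well-formed is a weaker
condition than valid, so this does not replace the validator.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(source):
    """Return None if source parses as XML, else the parser's error message."""
    try:
        ET.fromstring(source)
    except ET.ParseError as err:
        return str(err)
    return None


# A properly closed empty tag parses cleanly.
good = '<?xml version="1.0"?>\n<p>line one<br/>line two</p>'

# Whitespace before the XML declaration is rejected by the parser.
leading_space = ' <?xml version="1.0"?>\n<p>line one</p>'

# HTML-style <br> without a close is a mismatched tag in XML.
unclosed = '<?xml version="1.0"?>\n<p>line one<br>line two</p>'

for label, text in [("good", good),
                    ("leading space", leading_space),
                    ("unclosed br", unclosed)]:
    print(label, "->", well_formedness_error(text))
```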

Custom entities are defined by adding <!ENTITY> declarations inside the
<!DOCTYPE> declaration.  For instance, if you want entities for the
IPA vowel symbols:

 <!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.1//EN"
     "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
     [
        <!ENTITY alpha  "&#x0251;">
        <!ENTITY talpha "&#x0252;">
        <!ENTITY openo  "&#x0254;">
        <!ENTITY reve   "&#x0258;">
        <!ENTITY schwa  "&#x0259;">
        <!ENTITY eps    "&#x025B;">
        <!ENTITY reveps "&#x025C;">
        .
        .
        .
        <!ENTITY ups    "&#x028A;">
        .
        .
        .
     ]
 >

Then you include those entities in the text just like the built-in ones:

    <p>Phonemically, the word &lt;about&gt; is /&schwa;'b&alpha;&ups;t/</p>
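
To see all the pieces together, here is a sketch that embeds a complete
minimal document in a Python string and parses it with the standard
library's xml.etree.ElementTree, just to confirm that the custom entities
expand to the intended characters.  The <head> and <title> content, the
shortened entity list, and the system identifier written out as the full W3C
URL are assumptions made for the example.  This parser does not fetch the
external DTD, so it says nothing about validity or about how a browser will
render the page.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The backslash after the opening quotes keeps the XML declaration at the
# very start of the string, with no leading whitespace.
DOCUMENT = """\
<?xml version="1.0"?>
<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.1//EN"
     "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
     [
        <!ENTITY alpha  "&#x0251;">
        <!ENTITY schwa  "&#x0259;">
        <!ENTITY ups    "&#x028A;">
     ]
>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head><title>Custom entity test</title></head>
  <body>
    <p>Phonemically, the word &lt;about&gt; is /&schwa;'b&alpha;&ups;t/</p>
  </body>
</html>
"""

# Entities declared in the internal subset are expanded during parsing.
root = ET.fromstring(DOCUMENT)

# Elements live in the XHTML namespace, so the lookup must include it.
paragraph = root.find(f".//{{{XHTML_NS}}}p")
print(paragraph.text)
```

If the declarations are right, the printed sentence should contain the IPA
characters themselves in place of the entity references.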

-Mark