[slightly edited]

From: Arjun Ray
To: xml-dev@lists.xml.org
Date: Wed, 18 Sep 2002 07:45:05 +0000
Message-ID: <3d2gou4orr2sc3perbt2o5vos9bhf7igfh@4ax.com>
Subject: [xml-dev] XMap: A Mechanism for Mapping Names

Eventually this will become a paper with all the gory details, but here is a preview of a markup scheme which I believe

  a. constitutes an alternate non-invasive syntax for XML Namespaces.
  b. allows annotation of values in multiple taxonomies simultaneously.
  c. enables name mapping in the style of Architectural Forms.

1. Basics.

XMap is a method to record the details of name (re)mapping in the values of specially defined extra attributes. These values are lists of tokens, most of which are names of attributes; the exceptions are keywords with defined semantics.

The basic idea is to associate names in one taxonomy with names in another by a list of pairs (e1 l1 e2 l2 ...), where external name e1 is associated with local name l1, external name e2 with l2, and so on. The general interpretation is that an attribute specification l1="v" in the local taxonomy - i.e. the instance markup - would be (circumstantially) the same as an attribute specification e1="v" in the external taxonomy.

This list will be the value of a _mapping control attribute_ (to nominate it in true SGML longwinded style), whose local name is not constrained: it can be chosen to avoid conflicts. The specification of this local name will be in another list of pairs comprising the value of a _mapping support attribute_. The name of this support attribute is constrained because it must be recognized by generic software: it is not tied to any application semantic (i.e. recognizing it would not require understanding of the application markup and its meaning.) A reserved name in XML is easy: use the 'xml' prefix. Hence the name 'xmlmap' for the mapping support attribute.
(In SGML, even this name would have to be declared, probably through a processing instruction in the prolog; this technique could work for XML too.)

The pairs in the value of the mapping support attribute (xmlmap) associate a _distinguished name_ with the local name of a mapping control attribute. Several interpretations of this distinguished name are possible, such as, inter alia: the name of a local attribute whose value is a Namespace URI, the name of an SGML architecture, or even the name of a declared notation. XMap has rules to determine which interpretation applies. The realities of the XML world dictate that the default interpretation, requiring no explicit recording, is that of "XML Namespaces".

An example, somewhat deliberately obfuscated in order to bring out the generality, using categories from the HTML taxonomy:

The support attribute xmlmap identifies zz as the name of a namespace attribute, and yy as the corresponding control attribute to translate local names to names in the namespace identified by zz. The effect of this markup is to identify four "translated" attribute specifications,

    href="some.gif"
    type="simple"
    role="http://example.org/some/role"
    title="Sleeping Kitty"

understood in the light of the namespace specification,

    zz="http://www.w3.org/1999/xlink"

Had all the external names except 'href' been used as-is locally, the control attribute would have been:

    yy="href src type type role role title title"

The redundancy can be obviated by the keyword ":auto", which is paired with the name of a local attribute, whose value is a list of the names that are taken as-is in the translation. Thus, the markup could be

2. "Unnamed" Attributes.

Generic identifiers (which are really unnamed attributes) are mapped by the keyword ':gi'. By default, generic identifiers are not mapped.

Elements in SGML/XML have another unnamed attribute besides the GI: the data content of the element. The keyword ':data' identifies this.
(This is of use to AF wonks; it's of no relevance to XML Namespaces.)

Mapping an external GI has the following possibilities:

    :gi foo   - The external GI is the value of the local foo attribute.
    :gi :gi   - The external GI is the same as the local GI.
    :gi :data - The external GI is the text content of the local element.
    :gi :null - The external GI is some category known to the external
                taxonomy (this is the default).

3. Instance markup and DTDs/Schemas

XMap is based on instance markup because the association with an external taxonomy could be circumstantial. When it isn't - i.e. the association is systematic for a local element of a given type - the instance markup can be economised by appropriate ATTLIST declarations in the DTD (often in the internal subset.) E.g.

4. Some Topical Notes.

One important feature of XMap is that all mappable names are in attribute *values*, where there is no ambiguity when the semantic of the attribute name (a control, typically) is known. Thus, compound names with internal syntax are not necessary: documents can be parsed with simple APIs such as SAX1. This is a simplicity win, for the cost of usually at most two extra attributes.

There is also the feature that data values can be shared by authorial intent (as opposed to relying on heuristics such as string equivalence for potentially repeated values) by simply entering the value once and mapping all other names. The xlink:href versus href/src problem simply vanishes.

Perhaps more topically, XMap covers all the notational requirements of XML Namespaces, rendering colonification unnecessary, and permits the same defaulting rules for "applicable namespaces" (when the namespace is not identified by URI in the starttag but merely declared by name in the xmlmap attribute.)

The connection with AFs (and thus the details I've left out here) probably doesn't need comment.
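
The mapping rules of sections 1 and 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names (token_pairs, resolve_xmap), the layout of the result, the element name 'img' and the attribute name 'xx' in the sample are all invented here. Only the default "XML Namespaces" interpretation is handled, only the :auto and :gi keywords are recognized (in the forms listed in section 2), and the defaulting rules for "applicable namespaces" are not attempted.

```python
import xml.etree.ElementTree as ET


def token_pairs(value):
    """Split a whitespace-separated token list into (first, second) pairs."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError(f"odd number of tokens in {value!r}")
    return list(zip(tokens[0::2], tokens[1::2]))


def resolve_xmap(elem):
    """Return one dict per mapping declared in the element's xmlmap attribute."""
    results = []
    for dist_name, control_name in token_pairs(elem.get("xmlmap", "")):
        external_gi = None  # ':gi :null' is the default: the GI is not mapped
        translated = {}
        for external, local in token_pairs(elem.get(control_name, "")):
            if external == ":auto":
                # 'local' names an attribute listing the names taken as-is
                for name in elem.get(local, "").split():
                    if name in elem.attrib:
                        translated[name] = elem.attrib[name]
            elif external == ":gi":
                if local == ":gi":
                    external_gi = elem.tag
                elif local == ":data":
                    external_gi = (elem.text or "").strip()
                elif local != ":null":
                    external_gi = elem.get(local)
            elif local in elem.attrib:
                # ordinary pair: the local value is read under the external name
                translated[external] = elem.attrib[local]
        results.append({
            "name": dist_name,
            # assumed default: dist_name is a local attribute holding a URI
            "namespace": elem.get(dist_name),
            "gi": external_gi,
            "attributes": translated,
        })
    return results


# Hypothetical instance markup: 'img' and 'xx' are assumptions.
SAMPLE = (
    '<img xmlmap="zz yy" zz="http://www.w3.org/1999/xlink" '
    'yy="href src :auto xx" xx="type role title" '
    'src="some.gif" type="simple" '
    'role="http://example.org/some/role" title="Sleeping Kitty"/>'
)

mapping = resolve_xmap(ET.fromstring(SAMPLE))[0]
print(mapping["namespace"])
print(mapping["attributes"])
```

Note that the sketch looks only at attribute values split on whitespace; no name in the markup needs internal syntax, which is the point made in section 4 about simple parsers sufficing.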