<?xml version="1.0"?>
<?xml-stylesheet href="slides.xsl" type="text/xsl"?>
<?cocoon-process type="xslt"?>

<!-- Written by Harold Boley "boley@dfki.de" -->

<slides>

  <title>Subtag</title>

  <slide>
    <center>
      <xhtml><big><b>Subsumption Semantics for XML Tags*</b></big></xhtml>
      <br/> <br/> <br/>
      <i>
      Version: Apr 12, 2000
      <br/> <br/>
      First Version Prepared for:
      <br/>
      <a href="http://www.dagstuhl.de/DATA/Title/00121.html">Dagstuhl Seminar 00121, Semantics for the Web</a>
      <br/>
      March 19-24, 2000
      </i>
      <br/> <br/>
      Harold Boley
      <br/>
      DFKI GmbH
      <br/> <br/> <br/> <br/> <br/> <br/> <br/> <br/>
      <small>* "Practice what you preach": XML source of these slides at <a href="http://www.dfki.uni-kl.de/~boley/xdocs/subtag.xml">subtag.xml</a> (<a href="http://www.dfki.uni-kl.de/~boley/xdocs/subtag.xml">subtag.xml.txt</a>);
      <br/>
      transformed to HTML via <a href="http://www.dfki.uni-kl.de/~sintek/">Michael Sintek</a>'s SliML stylesheet at <a href="http://www.dfki.uni-kl.de/~boley/xdocs/slides.xsl">slides.xsl</a>
      </small>
    </center>
  </slide>


  <slide>
    <title>The Ontological XML Imperative</title>
    <big><i>
    Problem of Semantics for the Web:
    </i></big>
    <br/> <br/>
    <center>
      <big>One, <b>Standardized Semantics</b></big>
      <br/> <br/>
      <big>Would Boost the Web Further Than</big>
      <br/> <br/>
      <big>Many, Non-Standardized Semantics</big>
    </center>
    <br/> <br/> <br/>
    <big><i>
    Proposed Solution Step:
    </i></big>
    <br/> <br/>
    <center>
      <big>Incorporate <b>Subsumption Semantics</b></big>
      <br/> <br/>
      <big>Right Into the Web's <b><a href="http://www.w3.org/XML/">XML</a> x.0 Tags (x>1):</b></big>
      <br/> <br/>
      <big>Build <b>Taxonomy Into DTDs</b>, Leave Axioms To Schemas!</big>
      <br/> <br/>
      <big>(<a href="http://www.ontology.org/">Ontology</a> = Taxonomy + Axioms)</big>
    </center>
  </slide>


  <slide>
    <title>Attacking a Problem with XML</title>
    <itemize>
      <item>Main building blocks of XML DTDs are <i>chaining</i> (<i>whole-part</i>; inversely, <i>part-of</i>) relations between a parent element and an (ordered) sequence of child elements</item>
      <item>XML DTDs cannot specify <i>subsumption</i> (inversely, <i>isa</i> or <i>kind-of</i>) relations between a tag and an (unordered) set of subtags, as needed for the taxonomy backbone of ontologies</item>
      <item>Subsumption of element tags is thus handled ad hoc or in non-DTD schema languages
        <itemize>
          <item><a href="http://www.w3.org/TR/PR-rdf-schema/">RDF Schema</a>: <tt>subClassOf</tt> specifies subsumption (not specific to tags)</item>
          <item><a href="http://www.w3.org/TR/xmlschema-1/">XML Schema (Part 1)</a>: <tt>derivedBy="extension"</tt> right-appends children (no subsumption specification)</item>
        </itemize>
      </item>
      <item>Various non-DTD tag-subsumption and -inheritance semantics thus lead to diverging XML uses</item>
    </itemize>
  </slide>


  <slide>
    <title>An XML Sample</title>
<p>
Consider this DTD for a <tt>sales</tt> database table:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company, item, quantity) >

<!ELEMENT company        (#PCDATA) >
<!ELEMENT item           (#PCDATA) >
<!ELEMENT quantity       (#PCDATA) >]]></code></small></box>
<br/>
<p>
Also, one element for a corresponding tuple:
</p>
    <box bgcolor="FFCCCC"><small><code><![CDATA[

<sales>
  <company> Onoffbook </company>
  <item> XML4You </item>
  <quantity> 12417 </quantity>
</sales>]]></code></small></box>
  </slide>


  <slide>
    <title>Differentiating Parent Subtags: From Copying to Subsumption</title>
<p>
Now, for the E-commerce era, let us differentiate <tt>online-sales</tt> from <tt>offline-sales</tt> tags,
which should both `inherit' their children from the neutral <tt>sales</tt> element:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company, item, quantity) >
<!ELEMENT online-sales   (company, item, quantity) >
<!ELEMENT offline-sales  (company, item, quantity) >

<!ELEMENT company        (#PCDATA) >
<!ELEMENT item           (#PCDATA) >
<!ELEMENT quantity       (#PCDATA) >]]></code></small></box> 
<p>
Instead of such copied child declarations or corresponding <tt>ENTITY</tt> declarations,
we propose a new XML version (NEXML) with a true tag-inheritance construct <tt>SUBSUMES</tt> for DTDs,
shortening the first three lines to:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company, item, quantity) >

<!SUBSUMES sales online-sales >
<!SUBSUMES sales offline-sales >]]></code></small></box>
  </slide>


  <slide>
    <title>How NEXML Uses SUBSUMES: Child Inheritance</title>
<p>
Element instances with <tt>online-sales</tt> and <tt>offline-sales</tt> tags thus obtain
children with <tt>sales</tt>-declared <tt>company</tt>, <tt>item</tt>, and <tt>quantity</tt> tags (note the 329 `unclassified-rest' copies while Onoffbook is heading towards its offline/online break-even point for the book "XML4You"):
</p>
    <box><small><code><![CDATA[

<!SUBSUMES sales online-sales >
<!SUBSUMES sales offline-sales >

<!ELEMENT sales          (company, item, quantity) >

<!ELEMENT company        (#PCDATA) >
<!ELEMENT item           (#PCDATA) >
<!ELEMENT quantity       (#PCDATA) >]]></code></small></box>
    <box bgcolor="FFCCCC"><small><code><![CDATA[

<sales>
  <company> Onoffbook </company>
  <item> XML4You </item>
  <quantity> 329 </quantity>
</sales>

<online-sales>
  <company> Onoffbook </company>
  <item> XML4You </item>
  <quantity> 12417 </quantity>
</online-sales>

<offline-sales>
  <company> Onoffbook </company>
  <item> XML4You </item>
  <quantity> 15182 </quantity>
</offline-sales>]]></code></small></box>
  </slide>


  <slide>
    <title>Element Attributes vs. Element Children</title>
    <itemize>
      <item>(NE)XML elements, with attributes and children, correspond to frame/OOP instances; their DTDs correspond to frame/OOP classes</item>
      <item>Children are ordered, with inheritance for DTD <i>declarations</i> only</item>
      <item>Attributes are unordered, hence (inspired by frame/OOP) inheritance
         <itemize>
           <item>in DTDs: performed for <i>declarations</i></item>
           <item>in processors: permitted for <i>values</i></item>
         </itemize>
      </item>
      <item>"<tt>#REQUIRED</tt>" in NEXML, by attribute inheritance, means "required for all descendant leaf elements"</item>
    </itemize>
  </slide>


  <slide>
    <title>How NEXML Uses SUBSUMES: Attribute Inheritance</title>
<p>
Element instances with <tt>online-sales</tt> and <tt>offline-sales</tt> tags also obtain
<tt>sales</tt>-declared attributes such as <tt>year</tt> and <tt>price</tt> (the DTD
inherits both attribute <i>declarations</i>; an element processor should inherit the <tt>year</tt>
attribute's <tt>2000</tt> <i>value</i>):
</p>
    <box><small><code><![CDATA[

<!SUBSUMES sales online-sales >
<!SUBSUMES sales offline-sales >

<!ATTLIST sales year  CDATA #REQUIRED >
<!ATTLIST sales price CDATA #REQUIRED >]]></code></small></box>
    <box bgcolor="FFCCCC"><small><code><![CDATA[

<sales        year="2000">
  ...             |
</sales>          |
                  v
<online-sales             price="27.50">
  ...
</online-sales>

<offline-sales            price="22.50">
  ...
</offline-sales>]]></code></small></box>
  </slide>
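

  <slide>
    <title>Attribute Value Inheritance: A Worked Reading</title>
<p>
As a sketch only, assume a processor that copies the <tt>year</tt> value down the <tt>SUBSUMES</tt> hierarchy,
as the arrow in the preceding example suggests. Later processing stages would then see the two subtag elements
as if they had been written as follows (how such a processor pairs a <tt>sales</tt> element with its subtag
elements is an open design question, not something fixed here):
</p>
    <box bgcolor="FFCCCC"><small><code><![CDATA[

<online-sales  year="2000" price="27.50">
  ...
</online-sales>

<offline-sales year="2000" price="22.50">
  ...
</offline-sales>]]></code></small></box>
<p>
Only the <tt>2000</tt> <i>value</i> is supplied by the processor; the <i>declarations</i> of <tt>year</tt> and
<tt>price</tt> for both subtags already follow from the DTD.
</p>
  </slide>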


  <slide>
    <title>Differentiating Child Subtags: From Copying to Subsumption</title>
<p>
Let us now differentiate <tt>household-item</tt> from <tt>business-item</tt> tags,
which should both `inherit' their child context (siblings) from the neutral <tt>item</tt> element:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company, item, quantity) >
<!ELEMENT sales          (company, household-item, quantity) >
<!ELEMENT sales          (company, business-item, quantity) >

<!ELEMENT company        (#PCDATA) >
<!ELEMENT item           (#PCDATA) >
<!ELEMENT household-item (#PCDATA) >
<!ELEMENT business-item  (#PCDATA) >
<!ELEMENT quantity       (#PCDATA) >]]></code></small></box>

<p>
Instead of such copied sibling declarations we can use the
XML choice (<tt>|</tt>) construct,
shortening the first three lines to:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company,
                          (item | household-item | business-item),
                          quantity) >]]></code></small></box>
<p>
However, to express the subsumption semantics hidden within this choice, we again use our
tag-inheritance construct <tt>SUBSUMES</tt> for NEXML DTDs,
obtaining:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company, item, quantity) >

<!SUBSUMES item household-item >
<!SUBSUMES item business-item >]]></code></small></box>
  </slide>


  <slide>
    <title>Combining Parent and Child Subtags: "Multiplying Out" Subsumption</title>
<p>
We can also combine the parent differentiation <tt>online-sales</tt>/<tt>offline-sales</tt> with a child differentiation such as <tt>household-item</tt>/<tt>business-item</tt>, obtaining (<tt>#PCDATA</tt> declarations omitted):
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company, item, quantity) >
<!ELEMENT sales          (company, household-item, quantity) >
<!ELEMENT sales          (company, business-item, quantity) >

<!ELEMENT online-sales   (company, item, quantity) >
<!ELEMENT online-sales   (company, household-item, quantity) >
<!ELEMENT online-sales   (company, business-item, quantity) >

<!ELEMENT offline-sales  (company, item, quantity) >
<!ELEMENT offline-sales  (company, household-item, quantity) >
<!ELEMENT offline-sales  (company, business-item, quantity) >]]></code></small></box>
<p>
As in the child-differentiation example, this can be regarded as the result of "multiplying out" choices from a shorter DTD:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company,
                          (item | household-item | business-item),
                          quantity) >

<!ELEMENT online-sales   (company,
                          (item | household-item | business-item),
                          quantity) >

<!ELEMENT offline-sales  (company,
                          (item | household-item | business-item),
                          quantity) >]]></code></small></box>
<p>
As in the parent- and child-differentiation examples, it can also be regarded as the XML result of "multiplying out" <tt>SUBSUMES</tt> from a more semantic NEXML DTD:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company, item, quantity) >

<!SUBSUMES sales online-sales >
<!SUBSUMES sales offline-sales >

<!SUBSUMES item household-item >
<!SUBSUMES item business-item >]]></code></small></box>
  </slide>
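

  <slide>
    <title>Combined Subtags: A Sample Element</title>
<p>
For illustration, here is one element that uses both a parent subtag and a child subtag. It should be accepted by the
choice-based XML DTD and, under the intended reading of <tt>SUBSUMES</tt>, by the NEXML DTD as well.
Classifying "XML4You" as a <tt>business-item</tt> is an assumption made only for this example; the quantity is
the offline figure used earlier:
</p>
    <box bgcolor="FFCCCC"><small><code><![CDATA[

<offline-sales>
  <company> Onoffbook </company>
  <business-item> XML4You </business-item>
  <quantity> 15182 </quantity>
</offline-sales>]]></code></small></box>
  </slide>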


  <slide>
    <title>Left and Right Child Extensions in NEXML DTDs</title>
    <enumerate>
      <item>Child chaining and tag subsumption can alternate arbitrarily</item>
      <item>Since children are ordered, a subtag can extend inherited child content to the left and/or right</item>
      <item>This is indicated in NEXML DTDs by using <tt>ELEMENT</tt> declarations with <i>two</i> (possibly empty) parenthesized child sequences for subtags: the left and right extensions</item>
      <item>We do not employ multiple-supertag inheritance, simply because a specification corresponding to item 3 would become much harder when multiple child sequences have to be merged</item>
    </enumerate>
  </slide>
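

  <slide>
    <title>Left and Right Child Extensions: A Worked Reading</title>
<p>
A sketch of how the two parenthesized sequences might be read, using declarations from the refined example.
The NEXML input:
</p>
    <box><small><code><![CDATA[

<!SUBSUMES sales online-sales >
<!SUBSUMES online-sales web-sales >
<!SUBSUMES online-sales email-sales >

<!ELEMENT sales          (company, item, quantity) >
<!ELEMENT online-sales   (portal, delivery) (authentization) >
<!ELEMENT web-sales      (header) () >
<!ELEMENT email-sales    () (user, subject) >]]></code></small></box>
<p>
Assuming that a subtag's content is its left extension, followed by the content inherited from its supertag,
followed by its right extension, the corresponding XML 1.0 declarations would be:
</p>
    <box><small><code><![CDATA[

<!ELEMENT sales          (company, item, quantity) >
<!ELEMENT online-sales   (portal, delivery,
                          company, item, quantity,
                          authentization) >
<!ELEMENT web-sales      (header,
                          portal, delivery, company, item, quantity, authentization) >
<!ELEMENT email-sales    (portal, delivery, company, item, quantity, authentization,
                          user, subject) >]]></code></small></box>
  </slide>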


  <slide>
    <title>A Refined NEXML DTD Example</title>
<p>
This is the full DTD
of the original NEXML sample, refined by <tt>online-sales</tt> subtags and
<tt>item</tt> children, one of which roots the top level of a <tt>product</tt> taxonomy
(<tt>sales</tt>' subtags <tt>online-sales</tt> and <tt>offline-sales</tt>
are extended by children on both sides; the <tt>online-sales</tt> subtag <tt>web-sales</tt>
is extended only to the left; <tt>email-sales</tt>, only to the right):
</p>
    <box><small><code><![CDATA[

<!SUBSUMES sales online-sales >
<!SUBSUMES sales offline-sales >
<!SUBSUMES online-sales web-sales >
<!SUBSUMES online-sales email-sales >
<!SUBSUMES product book >
<!SUBSUMES product cd >
<!SUBSUMES product video >

<!ATTLIST sales year  CDATA #REQUIRED >
<!ATTLIST sales price CDATA #REQUIRED >
<!ATTLIST online-sales weight   CDATA #IMPLIED >
<!ATTLIST online-sales oversize CDATA #IMPLIED >
<!ATTLIST offline-sales stock CDATA #REQUIRED >
<!ATTLIST web-sales href CDATA #REQUIRED >
<!ATTLIST email-sales mailto CDATA #REQUIRED >
<!ATTLIST product code CDATA #REQUIRED >

<!ELEMENT sales          (company, item, quantity) >
<!ELEMENT online-sales   (portal, delivery) (authentization) >
<!ELEMENT offline-sales  (store) (location) >
<!ELEMENT web-sales      (header) () >
<!ELEMENT email-sales    () (user, subject) >
<!ELEMENT item           (wrapper, product) >

<!ELEMENT company        (#PCDATA) >
<!ELEMENT quantity       (#PCDATA) >
<!ELEMENT portal         (#PCDATA) >
<!ELEMENT delivery       (#PCDATA) >
<!ELEMENT authentization (#PCDATA) >
<!ELEMENT store          (#PCDATA) >
<!ELEMENT location       (#PCDATA) >
<!ELEMENT header         (#PCDATA) >
<!ELEMENT user           (#PCDATA) >
<!ELEMENT subject        (#PCDATA) >
<!ELEMENT wrapper        (#PCDATA) >
<!ELEMENT product        (#PCDATA) >
<!ELEMENT book           (#PCDATA) >
<!ELEMENT cd             (#PCDATA) >
<!ELEMENT video          (#PCDATA) >]]></code></small></box>
  </slide>
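

  <slide>
    <title>A Sample Element for the Refined DTD</title>
<p>
As an illustration, a <tt>web-sales</tt> element that this DTD would presumably accept. It assumes that left
extensions precede and right extensions follow the inherited children, that <tt>book</tt> may stand where
<tt>product</tt> is declared, and that required attributes are inherited along <tt>SUBSUMES</tt>.
The "<tt>...</tt>" entries are placeholders, not actual data; the optional <tt>weight</tt> and
<tt>oversize</tt> attributes are left out:
</p>
    <box bgcolor="FFCCCC"><small><code><![CDATA[

<web-sales year="2000" price="27.50" href="...">
  <header> ... </header>
  <portal> ... </portal>
  <delivery> ... </delivery>
  <company> Onoffbook </company>
  <item>
    <wrapper> ... </wrapper>
    <book code="..."> XML4You </book>
  </item>
  <quantity> 12417 </quantity>
  <authentization> ... </authentization>
</web-sales>]]></code></small></box>
  </slide>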


  <slide>
    <title>A Tree-Like Diagram Form of NEXML DTDs</title>
    <itemize>
      <item>Chained (ordered) children branch horizontally, in green; subsumed (unordered) tags branch vertically, in red</item>
      <item>Left and right child extensions are shown via geometric branch positions</item>
      <item>Attributes are written (in italics) directly below their elements' tags</item>
      <item>For repeated tag occurrences, the tree-like structure needs repeated tag nodes, which could be expanded/contracted via "<tt>+</tt>"/"<tt>-</tt>"-buttons or <a href="http://smi-web.stanford.edu/pubs/SMI_Abstracts/SMI-1999-0806.html">FlexDAGs</a></item>
      <item>Further EBNF-like DTD syntax can be accommodated by labels (<tt>*</tt>, <tt>+</tt>, <tt>?</tt>) on branches and a new kind of branch (<tt>|</tt>)</item>
    </itemize>
  </slide>


  <slide>
    <title>The Refined NEXML DTD Diagram</title>
<xhtml><img src="dtdtree.gif"/></xhtml>
  </slide> 


  <slide>
    <title>Implementation Approaches</title>
    <itemize>
      <item>Two approaches for implementing NEXML in validators, browsers, stylesheet processors, and other tools:
        <enumerate>
          <item>DTD preprocessor: Reduce NEXML DTDs to XML 1.0 DTDs by "multiplying out"
the <tt>SUBSUMES</tt> hierarchy as shown in the examples</item>
          <item>XML successor: Develop NEXML, with direct support for
the <tt>SUBSUMES</tt> hierarchy, into a new XML x.0 for W3C standardization</item>
        </enumerate>
      </item>
      <item>Approach 1 can explode the size of the generated DTDs; in certain cases this can be avoided by XML choices (<tt>|</tt>) and <tt>ENTITY</tt> declarations</item>
      <item>Still, approach 2 should be attempted, utilizing efficient subsumption algorithms, e.g. from <a href="http://dl.kr.org/">description logics</a></item>
      <item>The remaining XML 1.0 features (<tt>ENTITY</tt>, <tt>NOTATION</tt>, etc.) can be carried over to XML x.0, keeping XML 1.0 DTDs upward compatible</item>
    </itemize>
  </slide>
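

  <slide>
    <title>DTD Preprocessor: A Sample Expansion</title>
<p>
A sketch of what approach 1 might generate for the <tt>product</tt> part of the refined example.
It assumes that the preprocessor turns subsumed child tags into an XML choice and copies inherited
<tt>ATTLIST</tt> declarations to each subtag; the exact output of such a preprocessor is not fixed by these slides.
The NEXML input:
</p>
    <box><small><code><![CDATA[

<!SUBSUMES product book >
<!SUBSUMES product cd >
<!SUBSUMES product video >

<!ATTLIST product code CDATA #REQUIRED >

<!ELEMENT item           (wrapper, product) >]]></code></small></box>
<p>
A possible XML 1.0 output:
</p>
    <box><small><code><![CDATA[

<!ELEMENT item           (wrapper, (product | book | cd | video)) >

<!ATTLIST product code CDATA #REQUIRED >
<!ATTLIST book    code CDATA #REQUIRED >
<!ATTLIST cd      code CDATA #REQUIRED >
<!ATTLIST video   code CDATA #REQUIRED >]]></code></small></box>
<p>
The <tt>#PCDATA</tt> declarations for <tt>wrapper</tt>, <tt>product</tt>, <tt>book</tt>, <tt>cd</tt>, and
<tt>video</tt> would carry over unchanged. Here the choice keeps the expansion compact; copied parent
declarations, as in the earlier examples, are where the size can grow quickly.
</p>
  </slide>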


  <slide>
    <title>Conclusions</title>
    <itemize>
      <item>NEXML augments XML 1.0 by <tt>SUBSUMES</tt> declarations and enhanced <tt>ELEMENT</tt> declarations for subsumption DTDs</item>
      <item>A corresponding XML x.0 version would be immediately useful by enabling a standardized subsumption semantics for the information on the Web</item>
      <item>Further refinements could, e.g., comprise overlapping vs. disjoint subsumptions (special integrity constraints)</item>
      <item>Subsumption DTDs constitute a taxonomy that can also be accessed by axioms (e.g., general ICs) in schemas, giving us the full power of ontologies</item>
      <item>By moving expressivity from RDF Schema and XML Schema into XML x.0, these separate standards could be reconciled (cf. <a href="http://www.w3.org/TR/schema-arch">The Cambridge Communique</a>)</item>
    </itemize>
  </slide>


</slides>
