Problem 2; 24-Oct-2000

XML: Foundations, Techniques, Applications; Summer 2000

Harold Boley; DFKI, Univ. Kaiserslautern

Consider the following proposal of a node-labeled ordered tree for addresses:

                       address
                     /         \
                  /               \
               /                     \
          name                         place
         / |  \                       /     \
       /   |    \                  /           \
    first  in  last            street           town
      |    |     |              /  \            /  \
    Xaver  M.  Linde          /      \        /      \
                       appellation  number  zip  appellation
                            |         |      |        |
                      Wikingerufer    7    10555   Berlin

a1) Does the repeated occurrence of the label 'appellation' cause a problem for the unambiguous representation of all parts of the address information?
Hint: Is it even necessary to check whether every leaf of the tree, ordered from Xaver to Berlin, is uniquely denoted by the "path name" of node labels leading to it from the address root?
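
The "path name" check mentioned in the hint can also be tried out mechanically. The following Python sketch is only one possible way to do so: the nested-tuple encoding of the tree, the "/" separator, and the helper names leaf, leaf_paths, ambiguous_paths and relabel are assumptions made for illustration and are not part of the problem statement.

```python
from collections import Counter

def leaf(text):
    """A leaf carries only its text and has no children."""
    return (text, [])

# The address tree from the figure, encoded as (label, children) pairs.
address = ("address", [
    ("name", [
        ("first", [leaf("Xaver")]),
        ("in", [leaf("M.")]),
        ("last", [leaf("Linde")]),
    ]),
    ("place", [
        ("street", [
            ("appellation", [leaf("Wikingerufer")]),
            ("number", [leaf("7")]),
        ]),
        ("town", [
            ("zip", [leaf("10555")]),
            ("appellation", [leaf("Berlin")]),
        ]),
    ]),
])

def leaf_paths(node, prefix=()):
    """Yield (path name, leaf text) pairs in left-to-right order."""
    label, children = node
    if not children:
        yield "/".join(prefix), label
        return
    for child in children:
        yield from leaf_paths(child, prefix + (label,))

def ambiguous_paths(tree):
    """Return every path name that leads to more than one leaf."""
    counts = Counter(path for path, _ in leaf_paths(tree))
    return sorted(path for path, n in counts.items() if n > 1)

def relabel(node, old, new):
    """Copy the tree, renaming inner nodes labeled `old` to `new`."""
    label, children = node
    if not children:
        return node  # leaf text is left untouched
    new_label = new if label == old else label
    return (new_label, [relabel(child, old, new) for child in children])

if __name__ == "__main__":
    for path, text in leaf_paths(address):
        print(path, "->", text)
    print("a1) ambiguous path names:", ambiguous_paths(address))
    a2_tree = relabel(address, "appellation", "name")
    print("a2) ambiguous path names:", ambiguous_paths(a2_tree))
```

Running the sketch lists the path name of every leaf and then reports any path name shared by several leaves, first for the a1)-tree and then for the a2)-variant obtained via relabel.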

a2) What if the nodes labeled 'appellation' had been labeled 'name', too?

a3) Independently of a1) and a2), improve the readability of the tree via a simple relabeling.

b1,2) Could well-formed XML elements be given for the tree versions from a1) and a2)?

b3) Give a well-formed XML element representing your a3)-improved tree.
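
Candidate elements for b1) to b3) can be checked for well-formedness with any XML parser. A minimal Python sketch, assuming the standard-library module xml.etree.ElementTree is available; the helper name is_well_formed and the placeholder elements a and b are hypothetical and deliberately unrelated to the address tree:

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as a single well-formed XML element."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

if __name__ == "__main__":
    # Properly nested tags; sibling elements may share a name.
    print(is_well_formed("<a><b>x</b><b>y</b></a>"))  # True
    # Overlapping tags are not well-formed.
    print(is_well_formed("<a><b>x</a></b>"))          # False
```

This only tests well-formedness. ElementTree does not validate a document against a DTD, so the validity question in d3) needs a validating parser.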

c) Give a b3)-equivalent Prolog term.

d1) Try to write a DTD that exactly defines addresses according to the original a1)-tree or b1)-element. Explain why this is (im)possible.

d2) Try a modified DTD that exactly defines addresses according to the a2)-tree or b2)-element, with 'appellation' relabeled to 'name'. Explain why that is (im)possible.
Hint: Contrast PCDATA with subelements.

d3) Modify things such that a DTD exactly defines addresses according to your readability-improved a3)-tree or b3)-element or c)-term. In what sense is the b3)-element valid with respect to this DTD?