Description:
At the forefront of interoperability using XML in an Internet environment is the issue of
semantic translation; that is, the ability to properly interpret the elements, attributes,
and values contained in an XML file. In many cases, specific domains have
standardized the way data are represented in XML. When no such standard exists, some
type of mediation is required to interpret XML-formatted data that do not adhere to
pre-defined semantics. The prototype X-Map was developed to investigate what is
required to mediate semantic interoperability between heterogeneous domains. An
essential component of this system is structural analysis of data representations in the
respective domains. When mediating XML data between similar but non-identical
domains, we cannot rely solely on semantic similarities of tags and/or the data content
of elements to establish associations between related elements, especially over the
Internet. To complement these discovered associations, one can attempt to build on
relationships based on the respective domain structures and the position and
relationships of evaluated elements within those structures. For this purpose, the
domains are represented as hierarchical trees in XML syntax; a more general solution
handles arbitrary graphs. A structural analysis algorithm builds on associations
discovered by other analyses, using these associations to aid in discovering further links
that could not have been discovered by purely static examination of the elements and
their aggregate content. Three methodologies are presented by which the
algorithm maximizes the number of relevant mappings or associations derived from the
XML structures. The paper concludes with comparative results obtained using these
three methodologies.
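
The idea of extending known associations through tree position can be illustrated with a minimal sketch. This is not the X-Map algorithm itself; the two propagation rules, the function names, and the sample documents below are hypothetical, and the sketch assumes each tag occurs only once per tree. Given seed associations found by some other analysis, it pairs the parents of associated elements and pairs the single remaining unmatched child on each side, repeating until nothing new is found.

```python
import xml.etree.ElementTree as ET

def parent_map(root):
    """Map each tag to its parent's tag (assumes tags are unique per tree)."""
    return {child.tag: parent.tag for parent in root.iter() for child in parent}

def children_map(root):
    """Map each tag to the list of its children's tags."""
    return {el.tag: [c.tag for c in el] for el in root.iter()}

def propagate(root_a, root_b, seeds):
    """Extend seed associations (tag in A -> tag in B) using tree position.

    Hypothetical rules, chosen only for illustration:
      1. If two elements are associated, associate their unmatched parents.
      2. If two associated elements each have exactly one unmatched child,
         associate those children.
    """
    mapping = dict(seeds)
    parents_a, parents_b = parent_map(root_a), parent_map(root_b)
    kids_a, kids_b = children_map(root_a), children_map(root_b)
    changed = True
    while changed:  # repeat until no rule yields a new association
        changed = False
        for a, b in list(mapping.items()):
            pa, pb = parents_a.get(a), parents_b.get(b)
            if pa and pb and pa not in mapping and pb not in mapping.values():
                mapping[pa] = pb
                changed = True
            free_a = [c for c in kids_a.get(a, []) if c not in mapping]
            free_b = [c for c in kids_b.get(b, []) if c not in mapping.values()]
            if len(free_a) == 1 and len(free_b) == 1:
                mapping[free_a[0]] = free_b[0]
                changed = True
    return mapping

# Two invented domain structures describing the same kind of record.
doc_a = ET.fromstring(
    "<person><name/><address><street/><city/></address></person>")
doc_b = ET.fromstring(
    "<individual><fullName/><location><road/><town/></location></individual>")

# Seed associations assumed to come from a separate tag or content analysis.
seeds = {"street": "road", "name": "fullName"}
result = propagate(doc_a, doc_b, seeds)
print(result)
```

In this example the pair `city`/`town` shares no lexical similarity, yet it is linked because both are the only unmatched children of the associated pair `address`/`location`. A practical system would need to score such candidates rather than accept them outright, and would need to handle repeated tags and non-tree graphs.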