Update: On "Preserving unknown XML-tags and their content" in the Kolab-Format specification
Florian v. Samson
florian.samson at bsi.bund.de
Mon Jun 20 10:01:19 CEST 2011
On "Preserving unknown XML-tags and their content" in the Kolab-Format
specification, v0.2 (2011-06-19)
Some thoughts and suggestions on preserving unknown (i.e. future) XML-tags
and their content in the Kolab-Format specification (after some
brainstorming in private emails and phone-calls):
1. Kolab-clients SHOULD retain all unknown XML-tags and their enclosed (i.e.
embedded within the opening and closing tag) content, regardless of their
position in the XML-tag-tree and of their content (e.g. sub-tags); but all
Kolab-clients MUST retain all top-level XML-tags along with their full
content within a Kolab-object.
{Some argue that only the latter statement should be put into the Kolab-Format
specification. I do see the hurdles for various clients to implement the
preservation of *all* unknown XML-tags regardless of their position in the
XML-tree, but strongly believe that limiting tag-preservation to top-level
XML-tags only vastly reduces the extensibility of the Kolab-Format, as
mixed environments with both old and new clients then become extremely
problematic, if not impossible. Hence the rather weak "SHOULD" (for this
kind of statement) instead of a "MUST", but I think the aim and general
direction must be made absolutely clear.}
1b. If an XML-tag appears multiple times within one Kolab-XML-object, ALL
occurrences of this XML-tag MUST be preserved.
{Currently some clients only preserve their first occurrence, discarding all
other occurrences. Explicitly forbidding this behaviour simplifies some
things. More detailed reasoning can be requested from Hendrik.}
2. IMHO a non-issue: possible semantic issues. But in order to avoid them,
clear guidance for future extensions (i.e. XML-tags) to the Kolab-Format
is needed.
{Bernhard pointed out semantic issues, when old clients hit newly defined
XML sub-tags (i.e. unknown to them), embedded in existing, known XML-tags,
and such a client alters other known fields. Yes, the content of the
unknown (hence untouched) XML-tags may become out of sync in relation to
known tags which may be altered by the old client,
but I think ...
I. ... it is easily overcome by defining the XML-tags and their content
properly, so they do not contain duplicate or dependent information. In
case of orthogonal (i.e. completely unrelated/uncorrelated) information,
this is a non-issue.
II. ... furthermore, that is a minor issue even without such carefully
specified XML-tags, compared to limiting the extensibility of the
Kolab-Format to top-level XML-tags only, when rule 1 (above) is mandatory.
}
a. Duplicate information in XML-tag definitions
The contents of XML-tags MUST NOT contain duplicate information (even
if it is expressed in different ways), except for nested (sub-) tags, which
SHOULD NOT contain information which is already provided by a tag's content
at a higher level in the XML-tree.
b. Dependent information in XML-tag definitions
The contents of XML-tags MUST NOT contain dependent information (even
if it is expressed in different ways), except for adjacent XML-tags (i.e.
on the same level in the XML-tree), which MAY contain dependent information
(although this is not recommended for new tags). Nested (sub-) tags (i.e.
on a lower level in the XML-tree) implicitly contain information, which
depends on the higher-level tags they are embedded in.
{I would love to see the "adjacent tags"-clause being taken out, but
am unsure whether that is feasible.}
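To make the out-of-sync concern behind 2a and 2b concrete, here is a small
sketch. It assumes a hypothetical new tag "x-duration-minutes" whose value
depends on two adjacent known tags; the tag names, the date format and both
helper functions are made up for the illustration and are not taken from the
specification:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Illustrative object: the hypothetical tag repeats information that is
# already implied by the start and end dates (dependent information).
OBJECT = (
    "<event>"
    "<start-date>2011-06-20T10:00:00Z</start-date>"
    "<end-date>2011-06-20T11:00:00Z</end-date>"
    "<x-duration-minutes>60</x-duration-minutes>"
    "</event>"
)

def old_client_move_end(xml_text, new_end):
    """An old client changes a known field and preserves the unknown tag."""
    root = ET.fromstring(xml_text)
    root.find("end-date").text = new_end
    return ET.tostring(root, encoding="unicode")

def is_consistent(xml_text):
    """A new client checks whether the dependent tag still matches."""
    root = ET.fromstring(xml_text)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = datetime.strptime(root.find("start-date").text, fmt)
    end = datetime.strptime(root.find("end-date").text, fmt)
    minutes = int(root.find("x-duration-minutes").text)
    return end - start == timedelta(minutes=minutes)
```

The object starts out consistent; after
old_client_move_end(OBJECT, "2011-06-20T12:00:00Z") the preserved tag still
says 60 minutes while the dates span two hours. A tag carrying orthogonal
information could not drift in this way.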
3. The depth of nesting XML-tags MUST be limited to 7 levels at most.
{A suggestion from Bernhard in order to limit the size of and parsing
efforts for Kolab-objects, which otherwise may be used for
Denial-of-Service attacks. I picked the "7" as an educated guess (i.e.
some considerations done, but not at all exhaustive).}
a. Please discuss!
b. Do we need similar, but separate statements for XML-attributes?
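As input for the discussion of item 3, a minimal sketch of how a client might
enforce the depth limit while parsing, assuming Python's standard
xml.etree.ElementTree. Counting the root element as level 1 is my assumption
(the proposal does not say where counting starts), and MAX_DEPTH and
exceeds_depth are hypothetical names:

```python
import io
import xml.etree.ElementTree as ET

MAX_DEPTH = 7  # the limit proposed in item 3

def exceeds_depth(xml_bytes, limit=MAX_DEPTH):
    """Return True if any element is nested deeper than `limit` levels.

    The root element counts as level 1 (an assumption, see above).
    """
    depth = 0
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, _element in events:
        if event == "start":
            depth += 1
            if depth > limit:
                # Stop reading events as soon as the limit is exceeded.
                return True
        else:
            depth -= 1
    return False
```

For example, exceeds_depth(b"<a>" * 8 + b"</a>" * 8) reports a violation,
while the same structure with seven levels passes. This only bounds the
nesting depth; limiting the overall object size would need a separate check.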
Cheers
Florian