Update: On "Preserving unknown XML-tags and their content" in the Kolab-Format specification

Florian v. Samson florian.samson at bsi.bund.de
Thu Jul 14 11:42:22 CEST 2011


Hi Georg,


Am Montag, 20. Juni 2011 um 10:28:12 schrieb Georg C. F. Greve:
>
> Thanks a lot for picking this up and driving the discussion forward!

Well, rather a "sorry for the late answer" from my side, and "me driving 
this" will be a slow ride, as I am extremely busy and on vacation for the 
next month.
Still, I will try not to let this thread go, and to continue to contribute 
to it.

> On Monday 20 June 2011 10.01:19 Florian v. Samson wrote:
> > 1. Kolab-clients SHOULD retain all unknown XML-tags and their embraced
> > (i.e. embedded within the opening and closing tag) content, regardless
> > of their position in the XML-tag-tree and of their content (e.g.
> > sub-tags); but all Kolab-clients MUST retain all top-level XML-tags
> > along with their full content within a Kolab-object.
>
> This would be okay for me. What was your argument for choosing the
> slightly less forceful "should" instead of "must"? 

Hurdles for implementing this in multiple existing client libraries. 

> Where would you agree that not preserving tags is reasonable? 

Good question, which led me to this conclusion:
The already accepted KEPs demand quite invasive code changes (and partially
design changes) in the various client libraries which implement the
interpretation and generation of Kolab-XML objects (actually the explanatory
text in the proposal stated that in v0.2), and multiple proposed KEPs
reinforce this tendency.  Hence my answer is "nowhere".

Hence I changed this to MUST in the recent draft v0.3 (attached) and
eliminated the now superfluous special cases and comments on that.

> > 1b. If an XML-tag appears multiple times within one Kolab-XML-object,
> > ALL occurrences of this XML-tag MUST be preserved.
>
> Ack.
>
> > 2. IMHO a non-issue: Possible semantic issues.  But in order to avoid
> > them, clear guidance for future extensions (i.e. XML-tags) to the
> > Kolab-format is needed.
> > {Bernhard pointed out semantic issues, when old clients hit newly
> > defined XML sub-tags (i.e. unknown to them), embedded in existing,
> > known XML-tags, and such a client alters other known fields.  Yes, the
> > content of the unknown (hence untouched) XML-tags may become out of
> > sync in relation to known tags which may be altered by the old client,
> > but I think ...
> >  I. ... it is easily overcome by defining the XML-tags and their
> > content properly, so they do not contain duplicate or dependent
> > information.  In case of orthogonal information (i.e. completely
> > unrelated/uncorrelated), this is a non-issue.
> >  II. ... furthermore, that is a minor issue even without such carefully
> > specified XML-tags, compared to limiting the extensibility of the
> > Kolab-Format to top-level XML-tags only, when rule 1 (above) is
> > mandatory.
>
> Also, a question one may ask is in which way the alternative - i.e.
> losing information - is preferable.

IMHO "never", as ...

> It may be possible to "fix" an object that was modified by an older, 
> unaware client. It is usually not possible to re-generate information.

... you nicely pointed out.

Draft text for guidance provided in v0.3 (attached).

> > 3. The depth of nesting XML-tags MUST be limited to 7 levels at most.
> > {A suggestion from Bernhard in order to limit the size of and parsing
> > efforts for Kolab-objects, which otherwise may be used for
> > Denial-of-Service attacks.  I picked the "7" as an educated guess (i.e.
> > some considerations done, but not at all exhaustive).}
>
> I would be interested in the threat scenario.

Denial of Service for all clients accessing and interpreting that 
Kolab-object.

> The storage format is accessed only by authorized clients, 

Uh, who "authorises" Kolab-clients?

> so clients that have been authorized by the user to access the 
> information. 

No, definitely not.
- Step 1: A malicious Kolab-user creates a few Kolab-XML bombs, each with a
nesting level > 10,000 (those Kolab-objects will be a few tens of KB in
size; alternatively, one could use a larger number of smaller Kolab-objects)
and puts them into his shared IMAP-folder, to which he grants all users on
this Kolab-server access.
- Step 2: All users on this Kolab-server will be DoSed on their next sync,
regardless of the individual client used.

Hence, no "authorised Kolab-client" is involved, and not even a malicious one.
The malicious Kolab-user can upload the forged Kolab-object and set the
access rights on this IMAP-folder to "List & Read" for all IMAP-users with
any IMAP-tool or -client, including those which do not understand Kolab-XML.
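
As a rough illustration of the size estimate in step 1, here is a minimal
Python sketch.  The one-letter tag name "a" and the helper nested_document()
are assumptions made up for this example; they are not part of the
Kolab-Format.

```python
def nested_document(depth: int, tag: str = "a") -> str:
    """Build a document of `depth` nested, otherwise empty elements."""
    # Each nesting level costs one opening and one closing tag.
    return f"<{tag}>" * depth + f"</{tag}>" * depth


# Nesting level of 10,000, as in step 1 above.
doc = nested_document(10_000)

# "<a>" is 3 bytes and "</a>" is 4 bytes, i.e. 7 bytes per level.
size_kb = len(doc.encode("utf-8")) / 1000
print(f"{size_kb} KB")
```

With a one-letter tag name each nesting level costs 7 bytes, so 10,000
levels come to 70 KB.  Longer tag names increase the size accordingly.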

> If the client is malicious - which seems to be assumed here - 

Not necessarily, see above.

> I don't see  how an XML nesting bomb is the most likely scenario when 
> the client might as well delete/falsify/make inconsistent the entire 
> object/database. 

Ugh, that is way more complicated: why should an attacker take that route
when there is an easier one?
And eliminating the easiest attack vector (which is also the easiest one to
eliminate) is always a step that turns a formerly harder attack vector into
the easiest one, and thus makes it interesting for attackers.

> > b. Do we need similar, but separated statements for XML-attributes?
>
> Yes, imho. But I am not sure it would need to be separate, the two seem
> closely enough related.

Can you (or someone) please come up with a suggestion (i.e. a paragraph WRT
XML-attributes), which fits somewhere in this context and the draft text?


Cheers
	Florian
-------------- next part --------------
On "Preserving unknown XML-tags and their content" in the Kolab-Format specification, v0.3 (2011-06-24) 


Some thoughts and suggestions on preserving unknown (i.e. future) XML-tags and their content in the Kolab-Format specification, which IMO shall result in a KEP:

1. Kolab-clients MUST retain all unknown XML-tags and their embraced (i.e. embedded within the opening and closing tag) content, regardless of their position in the XML-tag-tree and of their content (e.g. sub-tags) within a Kolab-object.

1b. If an XML-tag appears multiple times within one Kolab-XML-object, ALL occurrences of this XML-tag MUST be preserved, regardless of their individual position in the XML-tree and their content.
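
{Illustration only, not part of the proposed rules: a minimal Python sketch of how a client could satisfy rules 1 and 1b by editing the parsed tree in place and serialising the whole tree again, instead of rebuilding the object from the fields it knows.  The sample object, the tag names <x-future> and <x-nested>, and the helper update_summary() are assumptions made up for this example.}

```python
import xml.etree.ElementTree as ET

# Hypothetical Kolab-XML object; <x-future> and <x-nested> stand in for
# tags which this client does not know.
SOURCE = """<note version="1.0">
  <uid>example-uid-1</uid>
  <summary>Old summary</summary>
  <x-future><x-nested>keep me</x-nested></x-future>
  <x-future>second occurrence</x-future>
</note>"""


def update_summary(xml_text: str, new_summary: str) -> str:
    """Change only the known <summary> tag and write everything else back."""
    root = ET.fromstring(xml_text)
    summary = root.find("summary")
    if summary is not None:
        summary.text = new_summary
    # Serialising the whole tree keeps unknown tags at any level (rule 1)
    # and every repeated occurrence of a tag (rule 1b).
    return ET.tostring(root, encoding="unicode")


result = update_summary(SOURCE, "New summary")
print(result)
```

{A client which instead maps the known tags onto its own data structure and regenerates the XML from that structure would need to carry the unknown sub-trees along explicitly.}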

2. IMHO a non-issue: Possible semantic issues.  But in order to avoid them, clear guidance for future extensions (i.e. XML-tags) to the Kolab-format is needed.
{Bernhard pointed out semantic issues, when old clients hit newly defined XML sub-tags (i.e. unknown to them), embedded in existing, known XML-tags, and such a client alters other known fields.  Yes, the content of the unknown (hence untouched) XML-tags may become out of sync in relation to known tags which may be altered by the old client, but I think ...
 I. ... it is easily overcome by defining the XML-tags and their content properly, so they do not contain duplicate or dependent information.  In case of orthogonal information (i.e. completely unrelated/uncorrelated), this is a non-issue.
 II. ... furthermore, that is a minor issue even without such carefully specified XML-tags, compared to limiting the extensibility of the Kolab-Format to top-level XML-tags only, when rule 1 (above) is mandatory.
}
   a. Duplicate information in XML-tag definitions
      The contents of XML-tags MUST NOT contain duplicate information (even if it is expressed in different ways), except for nested (sub-) tags, which SHOULD NOT contain information which is already provided by a tag's content at a higher level in the XML-tree.
   b. Dependent information in XML-tag definitions
      The contents of XML-tags MUST NOT contain dependent information (even if it is expressed in different ways), except for adjacent XML-tags (i.e. on the same level in the XML-tree), which MAY contain dependent information (although this is not recommended for new tags).  Nested (sub-) tags (i.e. on a lower level in the XML-tree) implicitly contain information, which depends on the higher-level tags they are embedded in.
      {I would love to see the "adjacent tags"-clause being taken out, but am unsure whether that is feasible.}

3. The depth of nesting XML-tags MUST be limited to 7 levels at most. 
{A suggestion from Bernhard in order to limit the size of, and the parsing effort for, Kolab-objects, which otherwise may be used for Denial-of-Service attacks against all clients accessing and interpreting such a Kolab-object.  I picked the "7" as an educated guess (i.e. some considerations done, but not at all exhaustive).}
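
{Illustration only, not part of the proposed rules: a minimal Python sketch of how a client could enforce rule 3 while parsing, so that it aborts as soon as the limit is exceeded instead of building the whole tree first.  It uses the event-based expat parser from the Python standard library.  The names MAX_DEPTH, NestingTooDeep and check_nesting() are made up for this example, and counting the root tag as level 1 is an assumption, as the draft does not yet say where counting starts.}

```python
import xml.parsers.expat

MAX_DEPTH = 7  # the limit proposed in rule 3


class NestingTooDeep(Exception):
    """Raised when an object nests XML-tags deeper than allowed."""


def check_nesting(xml_bytes: bytes, max_depth: int = MAX_DEPTH) -> int:
    """Return the deepest nesting level, or raise NestingTooDeep."""
    depth = 0
    deepest = 0

    def start(name, attrs):
        nonlocal depth, deepest
        depth += 1  # assumption: the root tag counts as level 1
        if depth > max_depth:
            # Raising here aborts the parse at the offending tag.
            raise NestingTooDeep(f"<{name}> at nesting level {depth}")
        deepest = max(deepest, depth)

    def end(name):
        nonlocal depth
        depth -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return deepest
```

{A client would call check_nesting() on the raw Kolab-XML before handing it to its normal parser and treat NestingTooDeep as a malformed object.  How such an object is then handled (skipped, reported to the user, ...) is left open here.}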


Florian

