[RFC] SRS Section 1

Hui Zhou zhouhui at wam.umd.edu
Tue Feb 1 19:42:32 PST 2005

On Tue, Feb 01, 2005 at 06:18:39PM -0700, Kevin P. Fleming wrote:
>While other formats can also provide this, XML also has already-existing 
>tools for validating that a document conforms to a given design, which 
>is very valuable in a tool like this. 

XML by itself is very simple and straightforward. One can come up 
with a parser within hours in almost any language. I came to hate XML 
after it became a monster. Now whenever XML is used, it comes along 
with all that hype-like talk of validation, XSL, RELAX NG (which I 
don't have the slightest idea about). 

Let's look at XML as the W3C specifies it. First, the parser needs to 
handle encodings, UTF at the least. Do we need that for alfs? Then 
there are tags, attributes, and text. I just fail to see the need 
for attributes. I understand the need in HTML, but why does XML need 
them? In its simplest form XML is just a tree structure, where only 
tags and data nodes are necessary. Then the W3C specifies validation. 
For alfs, if the tool can parse the profile correctly, isn't that 
enough? Even after validation, how much confidence can one gain that 
a profile is correct? There are millions of errors a human may make, 
such as typos in a package name; what percentage of them will 
validation catch in the alfs case anyway? Then there are entities. Do 
we need them? Certainly we can use them as the old profiles did, but 
personally those entities have given me more trouble than not using 
them at all. Just try hand-editing one profile and see if you have 
the same feeling. Frankly, I never finished reading the official 
spec, which is necessary reading for anyone writing a conformant 
parser. Oh, I just tried print preview in my Opera browser, and the 
XML spec page is so huge that it froze my browser!
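
To make the "only tags and data nodes" idea concrete, here is a rough 
sketch in Python of a parser for that subset. It assumes no attributes, 
entities, comments, or encoding handling at all, so it is not an XML 
parser in the W3C sense; the name parse_simple and the tuple layout are 
made up for illustration.

```python
import re

# One token is either a tag (<name> or </name>) or a run of text between tags.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>|([^<]+)")

def parse_simple(text):
    """Parse a tags-and-text-only subset into nested (name, children) tuples."""
    root = ("", [])            # dummy root that collects top-level nodes
    stack = [root]
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            # Anything else (attributes, comments, stray '<') is rejected.
            raise ValueError("unexpected input at offset %d" % pos)
        closing, name, data = m.groups()
        if data is not None:
            if data.strip():   # ignore whitespace-only text between tags
                stack[-1][1].append(data.strip())
        elif closing:
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError("mismatched </%s>" % name)
            stack.pop()
        else:
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
        pos = m.end()
    if len(stack) != 1:
        raise ValueError("unclosed <%s>" % stack[-1][0])
    return root[1]
```

Feeding it something like "<package><name>bash</name></package>" gives 
back a nested list of (tag, children) tuples, and a mismatched or 
unclosed tag raises ValueError, which is about all the "validation" 
this subset offers.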

But that is not enough: they came up with XSL, XInclude, ..., and all 
those fancy names make me feel I am falling behind. I learned Perl in 
one day, Python in half a day; I am afraid I will never finish learning 

Libxml2, being W3C conformant, is a monster too. I read a blog entry 
by Daniel Veillard a few weeks back saying that he had been using 
some random tool to detect possible vulnerabilities; it turned up 
tons of issues, which kept him busy for 10 days. The point is that it 
has reached an unmaintainable state. Basically, to use libxml2 to 
parse some XML data, no matter how simple, one is relying on the 
security of libxml2, which is beyond a normal programmer's ability to 
audit! 

So I am not really against XML; I just wish people would realize what 
they need, instead of buying whatever the W3C sells. 

>Yes, XML is harder to edit by 
>hand, and more verbose than other formats. Each of us needs to decide 
>whether the additional effort required to edit/maintain the profile is 
>more or less valuable than the time savings generated when the profile 
>is run (many, many times over).

Any profile format, as long as it is robust and parsable, can be run 
many, many times over. So that's not a reason for XML.

>For that matter, I have long envisioned that the protocol between the 
>client and the server would _also_ be XML (and previous discussions 
>headed in that direction as well); 

Yes, if there is a need to transfer profiles between client and 
server, use the same format; why double the work? Whether XML is the 
best choice is quite questionable, though.

>there's no particular reason not to 
>do so (there is not a large amount of communication, so bandwidth 
>consumption is not a concern), and it falls right into XML's primary 
>purpose: descriptive data communication. If you build your own protocol, 
>you have to deal with parsing/validation, versioning, serialization and 
>deserialization of data elements, etc. 

I question the need for validation. As for all the other jobs, they 
are not that big a deal if one confines the protocol to one's needs. 
If one desires to add unnecessary bells and whistles, XML already has 
tons.
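
As a sketch of how small a protocol confined to one's own needs can 
be, here is a Python example of a one-line header carrying a version, 
a command, and a payload length, followed by the payload. Everything 
in it is an assumption for illustration: the "ALFS/1" tag, the header 
layout, and the encode/decode names are made up, and commands are 
assumed to contain no spaces.

```python
VERSION = 1  # hypothetical protocol version, not an agreed alfs value

def encode(command, payload):
    """Serialize one message: 'ALFS/<version> <command> <length>\\n' + payload."""
    body = payload.encode("utf-8")
    header = "ALFS/%d %s %d\n" % (VERSION, command, len(body))
    return header.encode("ascii") + body

def decode(message):
    """Parse one message, checking the version tag and the payload length."""
    header, sep, body = message.partition(b"\n")
    if not sep:
        raise ValueError("missing header line")
    parts = header.decode("ascii").split(" ")
    if len(parts) != 3:
        raise ValueError("malformed header")
    tag, command, length = parts
    if tag != "ALFS/%d" % VERSION:
        raise ValueError("unsupported version: " + tag)
    if int(length) != len(body):
        raise ValueError("length mismatch")
    return command, body.decode("utf-8")
```

Parsing, versioning, and (de)serialization are all covered in a couple 
of dozen lines here, though only because the format is deliberately 
limited to this one use.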

>(and that's assuming you use a 
>text-based protocol... if you use a binary protocol, you also get to 
>deal with endianness, word length, etc.)

Again, designing a protocol for one's own use is very simple and 
straightforward. Designing a protocol for the public (which includes 
a large percentage of ignorant users) is very difficult (it needs a 
W3C committee) and more often than not never gets it right.

Hui Zhou

More information about the alfs-discuss mailing list