News database format migration to XML (Yeah, it's possible)

Anderson Lizardo andersonlizardo at yahoo.com.br
Sun Nov 30 10:36:36 PST 2003


Hi,

Some time ago, Jeremy Utley suggested using an XML database to store news 
items (see 
http://archives.linuxfromscratch.org/mail-archives/website/2003-August/000457.html). 
At that time, I showed that this approach would have at least one 
drawback, as described at 
http://archives.linuxfromscratch.org/mail-archives/website/2003-August/000463.html. 
Now I think this format is feasible, if we use some Perl "magic".

Right now, we are using MIME databases. They are nice and work well from 
the script's point of view, but they are somewhat awkward to modify by 
hand because of their strict structure; MIME was never designed to be 
edited manually. Also, our current manage_news.pl script is becoming too 
complex and full of workarounds. Adopting XML should reduce that 
considerably.

XML, on the other hand, was designed to be readable and editable by 
humans. Anyone with basic markup language knowledge knows how to 
modify an XML file. So we will give XML a try. The migration will not be 
difficult, as I can write a conversion script (mime2xml.pl) and I 
already have some ideas for the new manage_news.pl script.
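
As a rough illustration of what such a conversion could look like, here 
is a minimal sketch. It is written in Python with the standard library 
only, not the planned mime2xml.pl, and it assumes a hypothetical layout 
for the MIME database: one multipart message with one part per news 
item, where each part carries Subject and Date headers and an XHTML 
body. The real database layout and the names mime_to_xml and 
SAMPLE_MIME are assumptions made for the example.

```python
import email
import xml.etree.ElementTree as ET

# Hypothetical MIME database: one multipart message, one part per news item.
SAMPLE_MIME = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="news"

--news
Content-Type: text/html
Subject: First item
Date: Sun, 30 Nov 2003 10:36:36 -0800

<p>this is a test.</p>
--news--
"""


def mime_to_xml(raw):
    """Convert the assumed MIME layout into an RSS-like <channel> element."""
    message = email.message_from_string(raw)
    channel = ET.Element("channel")
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the container, keep only the news items
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = part.get("Subject", "")
        ET.SubElement(item, "pubDate").text = part.get("Date", "")
        # Assumes the body is well-formed XHTML; it is kept as real
        # child elements instead of being escaped.
        body = part.get_payload().strip()
        item.append(ET.fromstring("<description>" + body + "</description>"))
    return channel


if __name__ == "__main__":
    print(ET.tostring(mime_to_xml(SAMPLE_MIME), encoding="unicode"))
```

A body that is not well-formed XHTML would make ET.fromstring raise a 
ParseError, so a real converter would need to clean up or escape such 
items first.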

The format for the database will be RSS. It's well documented and Perl 
has a nice parser for it (XML::RSS). Note that it will not be valid 
RSS; it will be _based on_ RSS. To avoid the tag-encoding inconvenience 
described at 
http://archives.linuxfromscratch.org/mail-archives/website/2003-August/000463.html, 
the script will accept, on input, data like

<description>
<p>this is a test.</p>
</description>

This is not valid according to the RSS specifications. In other words, 
the database will be well-formed XML (because XHTML is XML-based), but 
not valid RSS.
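
To make the difference concrete, the sketch below contrasts the two 
forms. It is written in Python with the standard library only, not with 
XML::RSS, and the helper name inner_markup is made up for the example. 
It reads the inline markup back out of a <description> element like the 
one above, and shows the escaped text that strict RSS would expect in 
its place.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# An item in the relaxed, RSS-based format: XHTML sits directly inside
# <description> instead of being entity-encoded.
INLINE = "<item><description><p>this is a test.</p></description></item>"


def inner_markup(element):
    """Return an element's content (text plus serialized children)."""
    parts = [element.text or ""]
    for child in element:
        # tostring() also includes the text that follows the child.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


if __name__ == "__main__":
    description = ET.fromstring(INLINE).find("description")
    markup = inner_markup(description).strip()
    print("inline :", markup)
    # What strict RSS would require the editor to type by hand:
    print("escaped:", escape(markup))
```

The point of the relaxed format is that whoever edits the database only 
ever types the first form; any escaping needed for a strictly valid 
feed could be done by the script on output.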

Suggestions are welcome, and I will start the implementation ASAP.
-- 
Anderson Lizardo



