Draft of archive_news.pl script

Anderson Lizardo andersonlizardo at yahoo.com.br
Sun Aug 24 19:11:56 PDT 2003


Jeroen Coumans wrote:
> > With "hard" archiving (I mean move the old news to separate files)
> > it is still possible. I will send the finished archive_news.pl
> > script ASAP, so you can see how it works.
>
> Okay great :)

I've attached a tar.gz file with the following files:
* archive_news.pl - the Perl script for archiving
* create_archive.sh - a Bash script that automates the creation of the news 
archive. To archive the whole website, you have to create the 
{alfs,blfs,lfs,hints}/archive-{top,bottom}.html files and uncomment the 
corresponding commands.
* archive-bottom.html and archive-top.html - templates specific to the 
archive. You may want to modify them.

To test this script, unpack the attached package in www/test/ and run 
the create_archive.sh script. It will create a news/ directory with the 
archive of generalnews.txt inside.

Some observations/issues:
- The links on the website (including those in the news content) are 
relative URIs. Because of this, some links break in the news archives. 
As a workaround, I inserted a <base href="../../" /> tag in the 
archive-top.html file. But now fragment links to the current file, like 
"#header", do not work.
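
To illustrate the trade-off, here is a minimal Python sketch (not part of 
the attached package) of how a browser resolves links with and without 
the <base> tag. The host name and the archive page path are hypothetical 
and chosen only so that "../../" points back at the site root; 
urllib.parse.urljoin stands in for the browser's URI resolution.

```python
from urllib.parse import urljoin

# Hypothetical location of an archived news page (assumption, for illustration).
page = "http://example.org/test/news/2003/08.html"

# What <base href="../../" /> resolves to when placed on that page.
base = urljoin(page, "../../")

def resolve(href, base_url):
    """Resolve a link the way a browser would against base_url."""
    return urljoin(base_url, href)

# Without <base>, a site-relative link is resolved against the archive page
# and ends up under news/2003/, where the target does not exist.
print(resolve("images/logo.png", page))

# With <base>, the same link resolves against the site root again.
print(resolve("images/logo.png", base))

# Side effect: a fragment-only link now points at the base document
# instead of the archive page itself.
print(resolve("#header", base))
```

The last line shows why "#header" stops working: the fragment is attached 
to the base URI rather than to the page being viewed.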

- For this script to parse the XHTML files correctly, the news items 
have to be indented (like the current files in CVS). E.g.:
<p>some text here</p>
	<h3 id="news1_id">News title</h3>
		<h4>Author Name - YYYY/MM/DD</h4>
		<p>This is the news content.</p>
		<p>Paragraph 2</p>
		...
	<h3 id="news2_id">Another news</h3>
		<h4>Author Name - YYYY/MM/DD</h4>
		<p>Paragraph</p>
		...
<p>Some text here</p>

This requirement exists because archive_news.pl otherwise cannot 
determine where the last news item ends (every other item ends where the 
next "<h3>" tag is found). So, when it finds a tag without indentation, 
it assumes that the last news item ends there.
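
The rule above can be sketched in a few lines of Python. This is only an 
illustration of the splitting logic as described here, not the code of 
archive_news.pl; the function name split_news and the sample input are 
made up for the example.

```python
def split_news(lines):
    """Split lines into news items, each returned as a list of lines.

    An item starts at an <h3> tag and ends either at the next <h3> tag
    or at the first non-blank line that has no leading whitespace.
    """
    items, current = [], None
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("<h3"):
            # A new <h3> closes the previous item (if any) and opens the next.
            if current is not None:
                items.append(current)
            current = [line]
        elif current is not None:
            if stripped and not line[0].isspace():
                # Unindented tag: assume the last news item ends here.
                items.append(current)
                current = None
            else:
                current.append(line)
    if current is not None:
        items.append(current)
    return items

sample = [
    '<p>some text here</p>',
    '\t<h3 id="news1_id">News title</h3>',
    '\t\t<h4>Author Name - YYYY/MM/DD</h4>',
    '\t\t<p>This is the news content.</p>',
    '\t<h3 id="news2_id">Another news</h3>',
    '\t\t<h4>Author Name - YYYY/MM/DD</h4>',
    '\t\t<p>Paragraph</p>',
    '<p>Some text here</p>',
]
items = split_news(sample)
print(len(items))
```

Without the indentation, the final "<p>Some text here</p>" line would be 
indistinguishable from a paragraph of the last news item.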

- This is a crude hack, and I recommend we use the method described in 
http://archives.linuxfromscratch.org/mail-archives/website/2003-August/000469.html.

This script can be adapted to convert the current news to this format, 
and I could write a script to parse it.

The lessons (or lectures?) at university begin tomorrow, so I will have 
less time to work on LFS (basically, only weekends). But, as far as I 
can see, the most important things on the TODO (needed for page launch) 
are done. :-)
-- 
Anderson Lizardo


-------------- next part --------------
A non-text attachment was scrubbed...
Name: create_archive.tar.gz
Type: application/x-tar
Size: 4930 bytes
Desc: create_archive.tar.gz
URL: <http://lists.linuxfromscratch.org/pipermail/website/attachments/20030824/661e590b/attachment.tar>

