Draft of archive_news.pl script

Jeroen Coumans jeroen at linuxfromscratch.org
Tue Aug 26 00:04:22 PDT 2003

Hi Anderson Lizardo. You said the following on 08/26/03 04:42:
> Jeroen Coumans wrote:
>>>- The links on the website (including that on the news contents)
>>>are relative URIs. Because of this, some links will become broken
>>>on the news archives. As a workaround, I inserted a <base
>>>href="../../" /> tag on the archive-top.html file. But now,
>>>fragment links to the current file, like "#header", will not work.
>>Can the relative URLs be converted to absolute URLs in the script?
>>This should make all links work again.
> The conversion is possible, but it doesn't fully resolve the problem. 
> See why:
> Suppose we have one file named http://www.lfs.org/lfs/news.html, with 
> these links:
> <a href="#header">Link1</a>
> <a href="../blfs/news.html">Link2</a>
> By default, the base url is the current directory/file. So, the links 
> above are converted to:
> <a href="http://www.lfs.org/lfs/news.html#header">Link1</a>
> <a href="http://www.lfs.org/lfs/../blfs/news.html">Link2</a>
> The links above work fine until we mv news.html to 
> http://www.lfs.org/news/lfs/YYYY/MM.html:
> <a href="http://www.lfs.org/news/lfs/YYYY/MM.html#header">Link1</a>
> <a href="http://www.lfs.org/news/lfs/YYYY/../blfs/news.html">Link2</a>
> Link1 still works, because the current file is actually the same as 
> before (just moved to another place). But Link2 doesn't. And what if we 
> change the base URL to http://www.lfs.org/lfs/news.html (the original one)? 
> The links are expanded to this:
> <a href="http://www.lfs.org/lfs/news.html#header">Link1</a>
> <a href="http://www.lfs.org/lfs/../blfs/news.html">Link2</a>
> All links work now, but the navigational link "#header" will now be an 
> anchor to another file, not the current one.
> Finally, the conversion is possible, but it will have at least one 
> side-effect: broken "navigational" links (like #rootcontent, 
> #generalnav, etc.). My suggestion is to remove these links from the 
> archive-{top,bottom}.html templates.
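
The three cases described above can be reproduced with a short sketch. It
is written in Python rather than Perl, purely as an illustration and not as
part of archive_news.pl; the concrete archive path 2003/08.html stands in
for YYYY/MM.html, and the helper name resolve_all is made up for this
example. Note that urljoin also collapses the "../" segments.

```python
from urllib.parse import urljoin

# Original location of the news page and an example archive location
# (2003/08 is only a stand-in for YYYY/MM).
ORIGINAL = "http://www.lfs.org/lfs/news.html"
ARCHIVED = "http://www.lfs.org/news/lfs/2003/08.html"

# The two links from the example: a fragment link and a relative link.
links = ["#header", "../blfs/news.html"]

def resolve_all(base, hrefs):
    """Resolve every href against the given base URL."""
    return [urljoin(base, h) for h in hrefs]

# Resolved where the page was written: both links are correct.
at_original = resolve_all(ORIGINAL, links)

# Resolved at the archive location: the fragment follows the file,
# but the relative link now points below /news/lfs/ and is broken.
at_archive = resolve_all(ARCHIVED, links)

for before, after in zip(at_original, at_archive):
    print(before, "->", after)
```

Resolving everything against ORIGINAL while the file lives at ARCHIVED
gives the third case: the relative link is right, but "#header" now points
into the original file instead of the archived one.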

I'll keep trying: instead of a base URL which is added to all links, what 
about a conversion table which decides how links should be converted based 
on how they start?
- every link which starts with # will stay the same.
- links which start with ../ will be converted to 

This would take care of all issues, right? We can correctly determine the 
implied location of relative links and convert these to absolute links. 
We can also differentiate between page bookmarks and URLs to files. So this 
should be possible. The conversion should then take place in the permanent 
archive.
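
A rough sketch of such a rule-based conversion, again in Python only for
illustration. It assumes that relative links should be resolved against the
page's original URL, which is my reading of the idea rather than something
settled in this thread; convert_link and convert_html are made-up names,
and the regular expression only handles double-quoted href attributes.

```python
import re
from urllib.parse import urljoin

# Matches double-quoted href attributes only (a simplification).
HREF = re.compile(r'href="([^"]*)"')

def convert_link(href, original_url):
    """Decide how to convert a link based on how it starts."""
    if href.startswith("#"):
        # Page bookmark: leave untouched so it keeps pointing at
        # the current (archived) file.
        return href
    # Everything else is resolved against the original location;
    # links that are already absolute come back unchanged.
    return urljoin(original_url, href)

def convert_html(html, original_url):
    """Rewrite every href in an HTML fragment using convert_link."""
    return HREF.sub(
        lambda m: 'href="%s"' % convert_link(m.group(1), original_url),
        html,
    )

page = '<a href="#header">Link1</a> <a href="../blfs/news.html">Link2</a>'
print(convert_html(page, "http://www.lfs.org/lfs/news.html"))
```

Because fragment links are never made absolute, this would only work
without a <base> tag in the archive templates.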

>>page creation.  BTW I assume that method allows for multiple <p>'s?
> Yep, and any other XHTML tag you want.

Good :)

> Yes. The method involves the following steps:
> 1) Someone should write a news item and insert it at the top of news.txt
> 2) The Perl script, run daily from fcron, will parse each news.txt, 
> archive the news, and output the 5 most recent ones to temporary files 
> (eg: {lfs,blfs}/news.tmp).
> 3) Another script, made by you, will cat the correct files (including 
> Changelog, general news, etc.) and create the respective news.html.
> One important thing is that there must be specific templates (the 
> *-{top,bottom}.html files) for current news and for archived news, 
> because of the differences in the base URL and in the specific 
> left-side menu.
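
The core of step 2 could look roughly like the sketch below. It is Python
for illustration only, not the actual Perl script. The format of news.txt
is not described in this thread, so the sketch assumes the items have
already been parsed into (date, html) pairs, newest first; that
representation and the name split_news are hypothetical. Only the
YYYY/MM.html archive layout and the count of 5 come from the discussion.

```python
from datetime import date

RECENT_COUNT = 5  # number of items shown on the current news page

def split_news(items):
    """Split parsed news items into recent items and monthly archives.

    items: list of (date, html) tuples, newest first (assumed format).
    Returns (recent, archives), where archives maps a relative path
    such as "2003/08.html" to the list of item bodies for that month.
    """
    recent = items[:RECENT_COUNT]
    archives = {}
    for day, html in items:
        path = "%04d/%02d.html" % (day.year, day.month)
        archives.setdefault(path, []).append(html)
    return recent, archives

# Example with made-up items: six from August 2003, one from July 2003.
items = [(date(2003, 8, 26 - i), "<p>item %d</p>" % i) for i in range(6)]
items.append((date(2003, 7, 30), "<p>older item</p>"))
recent, archives = split_news(items)
print(len(recent), sorted(archives))
```

Every item goes into its monthly archive, including the recent ones, so
links to an archived item stay valid once it drops off the current page.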

No problem. I already have an index.html ready for the news section.

> Another thing: on news sites, like Slashdot, Linux Today, etc., it's 
> common to see links to previous news. When we have to make a link to 
> old news, we should use an absolute URL like 
> http://www.lfs.org/news/lfs/2003/08.html#newsid. When the archives 
> go online, you should convert some occurrences of links to old news 
> to this format.

Okay. Strip out the www. part; it's better to just use http://lfs.org

> Good!
>>BTW we can mail lfs-chat to see if there are other perl-coders
>>willing to work on this if your time is that limited...?
> No problem :-). This way, the TODO list will become empty more quickly.

I'll mail them then. Take care,

Jeroen Coumans (jeroen at linuxfromscratch.org)
FAQ: http://linuxfromscratch.org/faq/ (faq at linuxfromscratch.org)
Website: http://test.linuxfromscratch.org (website at linuxfromscratch.org)
