Editing PDFs

Declan. Moriarty declan.moriarty at ntlworld.ie
Tue Jul 22 02:07:46 PDT 2003

On Tue, Jul 22, 2003 at 01:24:43PM +0800, Ng, Wey-Han enlightened us
> Quoting "Declan. Moriarty" <declan.moriarty at ntlworld.ie>:
> > On Mon, Jul 21, 2003 at 12:04:53PM -0600, Stephen Bosch enlightened
> > us thusly
> > > I always wince when I hear this -- many people want to do this,
> > > but in fact there's already an Adobe PDF-compatible format
> > > designed to do this -- Adobe Forms. The nice thing about the PFF
> > > is that there is space in the protocol for modifiable blanks and
> > > the like.
> > 
> > What are you telling me? That the form should have been created
> > differently?
> More or less. There is a slightly different format from pdf that is
> called Adobe Forms, pff for short. I can't tell you more details
> apart from the fact that it exists. I have filled out forms posted by
> the US gov in pff, but have no idea what tools you can use to create
> it. Maybe search for it on the Adobe site; they should have at least
> some information about it.

Sorry to be harsh, but that's a non-starter. The form is written in pdf.
I am not choosing the forms I want to fill in. They are chosen for me.

> > > >Gs I have, and Acroread. Gs supplies pdf2ps, and ps2pdf, as well
> > > >as pdfopt, which optimises the things sizewise, I gather. So I
> > > >could go this route: pdf --> ps --> editor --> ps --> pdf -->
> > > >pdfopt, but you who know me will realise that I, my fonts, or
> > > >locales would fall over somewhere on that route ;-).  I presume
> > > >there is a wysiwyg ps file editor somewhere in linux?
> What do you mean your fonts or locales would fall over? If you mean
> some characters in your filled-in text are left blank (so that
> instead of "meaning" it would print "m an n " after you have filled
> in "meaning" with your app), then it is a problem with lazily
> downloaded fonts. This happens with soft fonts: because it makes no
> sense to download fonts your document will not use, the print driver
> only downloads font information for a character when it encounters
> that character in your document.

Look, I'm basically a hardware guy: I can understand the chips that make
up your pc in terms of lines going high and low, and find hardware
faults on them. I can't read C, and have a poor record with mastering
obstinate software. I get wary when I see a processing chain that long.

> If you really need to edit your ps document, you might want to type a
> line of all the characters you could possibly use to fill in the
> blanks, set it in the same font you want the filled-in text to use,
> and make it white. That way the font information will be "downloaded"
> when you print to the document, so it is available when you try to
> use it, and the line will not show up as a funny line in your form.
> Just remember that the line has to come before the blanks you intend
> to fill, or it will not work.
> > Er, I did look at postscript code. I got page after page of it when
> > a certain print program was loaded (mentioning no software by name
> > :).  And you are right - it is very messy.
> Let's not forget that postscript is a programming language and not a
> document format, and I suspect pdf is similar in concept but somehow
> made portable. It is messy because what you see in the postscript file
> is instructions to the printer to draw lines/dots/whatever on the
> piece of paper.  Programs like gs translate it to a soft paper that
> ghostview can use to draw on the screen, but at this point the
> document would have lost most of its structure and formatting
> information.

> > Everything can write to ps. They call it printing; can nothing read
> > it?
> Read it? Yes. Manipulate/edit it? Maybe, but it is difficult. There
> are a number of reasons why it is difficult to edit. The two off the
> top of my head are:
> 1. Formatting information is not always captured in a ps/pdf file;
> only shapes are captured (a character is considered a shape in
> ps/pdf). Even though the text can be extracted easily, its
> association with the structure of the document cannot. For example,
> how can one tell if that line is just a line or if it is the border
> of a table?
> 2. The font set might not be complete due to the lazy download I
> mentioned above. And because of #1, formatting information might not
> be in the document, so it might not be apparent which fonts to use.
> Furthermore, font substitution might make the resulting document look
> really different. Which is not what you want for a form filler, right?
> > Can gs output an editable format (like rtf)?
> See above.

I take that as a negative. I did see drivers for png, jpeg and a couple
of bitmap formats in ghostscript. So the picture is that all processing
can basically do is make a form of bitmap from a pdf or ps. This is fine
for printing, but no use for identifying the shapes in the bitmap, which
any rtf format editor would want to be able to do.

> > When I googled selectively I got nothing: nada; Zero. When I googled
> > less selectively, I got tripe. There's a lot of tripe on the
> > Internet.
> I would expect so. :>
> HTH.
> PS: Who can tell if I have done this before. :D

So I could work this way:

pdf --> bmp --> add over text --> "print" (= .ps) --> ps2pdf

The first stage could be done by ghostscript. Adding over text can
certainly be done in windows, so I presume gimp can do it also, and
print it to ps, and ghostscript supplies ps2pdf.
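
For anyone wanting to try that route, here is a rough sketch of the two
ghostscript stages, written in Python only so the command lines can be
built and inspected before anything is run. It is an illustration, not
something tested against a real form. The file names (page-%03d.png,
filled.ps), the function name and the 150 dpi default are made up for
the example, and png16m is used instead of a bmp device simply because
png is one of the drivers mentioned above. The "add over text" step is
still done by hand in an image editor.

```python
import shlex


def bitmap_route_commands(pdf_in, pdf_out, dpi=150):
    """Return the ghostscript commands for pdf -> bitmap and ps -> pdf.

    The step in between (adding text over the bitmap and "printing"
    it to a .ps file) is manual and is not represented here.
    """
    # Hypothetical file names, chosen only for this example.
    page_png = "page-%03d.png"  # gs replaces %03d with the page number
    filled_ps = "filled.ps"     # what the image editor "prints"

    # Stage 1: render every page of the pdf to a 24-bit png.
    rasterise = [
        "gs", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m",
        f"-r{dpi}",
        f"-sOutputFile={page_png}",
        pdf_in,
    ]

    # Stage 2: turn the printed postscript back into a pdf.
    back_to_pdf = ["ps2pdf", filled_ps, pdf_out]

    return [rasterise, back_to_pdf]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be checked.
    for cmd in bitmap_route_commands("form.pdf", "form-filled.pdf"):
        print(shlex.join(cmd))
```

Running the script only prints the two command lines. Each list could
be handed to subprocess.run() once the paths are right, but note that
the result is a picture of the form, so the text in it is no longer
selectable.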

Just for the pig iron (I got around this task btw) I should try this.


	With best Regards,

	Declan Moriarty.