Working With Adobe Acrobat Distiller Part 1
An Adobe Acrobat file is an electronic document (E-DOC) constructed in a proprietary hypertext markup language derived from PostScript, the Adobe page description language used in PostScript printing.
Basically, this means that a PDF file is similar to a web page (in fact, HyperText Markup Language is what HTML stands for), only the markup language (the tags) is different and quite complex.
The original commercial hypertext tool, which some of you may remember, was the quite popular HyperCard for the old Macintosh 68000-series machines. It used to come free with Macs, but only a Macintosh could read the databases and animated manuals you created with it.
Of course, there is also HTML 1.0, the web-based hypertext language that is the granddaddy of them all!
All these languages use tags. A tag is a type of marking not seen or read by a person viewing the page. We’ve covered making web pages over several issues, and those reading our series will know some of the HTML types of tags. Anyone can create their own tag format. Most word processors work this way, with hidden tags kept in a separate file or set off by a marker that lets the machine know not to put them on the page. A tag is an information “aside” that gives the words you want the user to see the color, size and intensity those words should have.
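To see the split between tags and visible words, here is a minimal Python sketch using the standard html.parser module. The sample markup is made up for this example, and the class name is just a label I chose. It sorts a line of HTML into the hidden tags and the text a reader actually sees.

```python
from html.parser import HTMLParser


class TagSplitter(HTMLParser):
    """Separate the hidden tags from the text a reader actually sees."""

    def __init__(self):
        super().__init__()
        self.tags = []      # markup the viewer never sees
        self.visible = []   # words that end up on the page

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.visible.append(data)


# Hypothetical snippet of marked-up text, made up for this example.
sample = '<p>This word is <b>bold</b> and this one is <font color="red">red</font>.</p>'

splitter = TagSplitter()
splitter.feed(sample)
splitter.close()

visible_text = "".join(splitter.visible)
print(visible_text)    # the words alone
print(splitter.tags)   # the hidden asides
```

The reader only ever sees the first line of output. The second line is the information "aside" that tells the machine how to dress those words.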
Adobe uses tags for everything, so its files are bloated with its own code, much as FrontPage bloats a web page by adding text modifiers for each line or word.
PDF files have been around for a while. I first heard of them in the early 1990s from Kamron at Mac Market in Reseda. The beauty of these files is that they are cross-platform and can be web based, so you can construct catalogs in PDF format that can be viewed using Internet Explorer, Netscape or any other compatible browser that takes the Adobe plug-in. Linux, Macintosh, Unix, SPARC, AIX and Windows users can also view the document on their machines using the same file and see the text and color pictures exactly as you intended.
This beats out a word processor format like RTF (Rich Text Format), Word or WordPerfect, where you don’t always see what the writer intended. The fonts will vary. The picture placement might change. Some elements, such as tables, may not load.
It also beats out HTML in that you can readily format the pages with contents, links and access to other files or documents, and they can also link up to the internet.
But to pull this off, Adobe had to build in some very unfriendly features.
PDF elements can be riddled with different tags, and you can only work with a given tag by using a given tool. Thus in editing one word on a PDF page you may have to use three or four tools and change a variety of parameters. Discarding a link with graphics and text in a formatted table can require several steps using many tools; otherwise remnants of some part of the PDF process will remain (and be accessible to the reader or user).
HTML is, however, not much different. To construct a graphical and text-based link with different fonts and colors requires a lot of different tags. You must open and close these properly. When removing all or part of the link you should remove all the remaining modifiers just to be on the safe side. Thus a single link could have five to ten modifier tags.
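As an illustration of how modifier tags pile up on one link, the Python sketch below counts the tags in a made-up link that has text, font, color and a picture. The file names and wording are placeholders I invented, not an example from our site.

```python
from html.parser import HTMLParser

# A made-up link; the file names are placeholders, not from the article.
link_html = (
    '<a href="more.html">'
    '<font face="Arial" color="blue" size="3">'
    '<b><i><u>Read MORE</u></i></b>'
    '</font>'
    '<img src="arrow.gif" border="0" alt="arrow">'
    '</a>'
)


class TagCounter(HTMLParser):
    """Record every opening and closing tag wrapped around a link."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.closed = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)

    def handle_endtag(self, tag):
        self.closed.append(tag)


counter = TagCounter()
counter.feed(link_html)
counter.close()

print(len(counter.opened), "opening tags:", counter.opened)
print(len(counter.closed), "closing tags:", counter.closed)

# img has no closing tag; every other tag that was opened must be closed.
unclosed = [t for t in counter.opened if t != "img" and t not in counter.closed]
print("left open:", unclosed)
```

Delete the link text by hand and miss one of those closing tags, and the leftover modifier keeps acting on whatever comes next on the page.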
The way you make PDF files is with an expensive piece of software (even used copies sell for $200) called Distiller, which either works standalone to import a variety of file types or connects to most of your software as a virtual printer. You must remember that this file format is based heavily on PostScript files, which are the original way we worked with laser printers back in the 1980s. This printing format came from Adobe, who found a novel way to use relatively obsolete technology (PostScript is still around, but TrueType fonts sort of put the PS technology on the back burner).
Most of you are totally unaware that when you print a document, and this goes as far back as the late 1970s, your software sends a hidden code or programming language to the printer. The standard for printers up to today is known as the Epson standard. All printers have to emulate this standard if they want to work with unknown software!
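The hidden codes are easier to picture with an example. The Python sketch below assumes three commands from the Epson ESC/P command set (ESC @ to reset the printer, ESC E and ESC F to turn bold on and off). The function name and the sample text are made up for illustration, and the code only builds the bytes; it does not talk to a real printer.

```python
ESC = b"\x1b"  # the escape byte that introduces a printer command

# Three ESC/P commands; these are the only ones this sketch relies on.
INIT = ESC + b"@"      # reset the printer
BOLD_ON = ESC + b"E"   # start bold (emphasized) printing
BOLD_OFF = ESC + b"F"  # stop bold printing


def build_print_job(before: str, bold: str, after: str) -> bytes:
    """Return the bytes a program might send for one line with a bold word."""
    return (
        INIT
        + before.encode("ascii")
        + BOLD_ON + bold.encode("ascii") + BOLD_OFF
        + after.encode("ascii")
        + b"\r\n"
    )


job = build_print_job("Sale ends ", "Friday", " at noon")
print(job)
```

The escape bytes never appear on paper. Like the tags in a web page, they only tell the machine how to print the words around them.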
When laser printers came out, the HP LaserJet became the de facto standard for sending printing codes for any document.
Adobe PostScript was an attempt to create another de facto standard, and because of the popularity of HP and Epson it never really caught on. Then along came TrueType scalable fonts, which more or less sealed the fate of PostScript: it was no more likely to gain a further foothold in the computer industry than an Amiga, Macintosh or Linux system was to threaten PC sales, now or in the future.
Adobe, however, beat everyone else on the concept of cross-platform document exchange, building the first truly portable E-DOC format that was embraced by users. Adobe helped to create the concept of POD, E-zines and E-Books.
My own introduction came from a 1992 copy of VIP BASIC for Macintosh that came on CD-ROM and had no manual. This $500 programming language was sold in a jewel case. It was the shape of things to come. Back then PDF was almost unheard of, except in deep computer sales and IT work. AOL only had a few million members, Internet Explorer was a joke (anyone who was anyone used Netscape 3.0 Gold) and all you got was 5 hours of web time a month for $9.95. HotDog was the choice of web developers. There was no FrontPage. I wrote one of the first no-code commercial web page makers (Ought Medal) that let 10-year-olds put up web pages with no programming.
Adobe thrust PDF slowly on the world. They made readers for all significant platforms (Amiga and Atari ST are not supported, nor is the Commodore C-64) and offered them for free. They got the professional programming world to start making their white papers and technical manuals in PDF format for free distribution over the web to registered users of their software or hardware.
Adobe lets you import a variety of documents, including those directly off the internet, essentially making a hard copy of a web site complete with pictures and links. The only snag is that Acrobat makes this in standard page format, so long, contiguous web pages filled with names or text get broken into separate 8 ½ x 11 pages. This can be a problem for things like very wide web pages (tables that go on forever) or tall tables. These will get cut off, truncated or turned into multiple pages that can be a bit disjointed. Thus, if you work with internet-based HTML pages you may have to re-work these pages a bit before PDF conversion, because changing the PDF file after conversion is a difficult, if not impossible, task!
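You can estimate the damage before you convert. The rough Python sketch below assumes a half-inch margin on every side and simple cutting with no scaling; neither assumption comes from Acrobat itself, and the function names are my own. It works out how many pages a tall web page is cut into and how much of a wide table falls off the edge.

```python
import math

PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def pages_needed(content_height_in: float, margin_in: float = 0.5) -> int:
    """Number of letter pages a tall block of content is cut into."""
    usable_height = PAGE_HEIGHT_IN - 2 * margin_in
    return max(1, math.ceil(content_height_in / usable_height))


def overflow_width(content_width_in: float, margin_in: float = 0.5) -> float:
    """Inches of a wide table that fall off the right edge of the page."""
    usable_width = PAGE_WIDTH_IN - 2 * margin_in
    return max(0.0, content_width_in - usable_width)


print(pages_needed(45.0))     # a 45-inch-tall page of names
print(overflow_width(12.0))   # a 12-inch-wide table
```

If the overflow is more than zero, narrow the table in the HTML first. It is far easier to fix there than in the finished PDF.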
PDF files are constructed more like databases than word processing documents. This page, for example, is a continuous flow of words with an occasional line feed (paragraph space) breaking up the flow.
PDF files, on the other hand, store groups of words on a line by line basis and sometimes a given line will contain more than one set of word groups. This happens with all links to other pages or to the internet, as the link is kept separate.
There is, for example, no way to globally re-format the text on this page in Acrobat Distiller into justified text. You must do that in some other format, such as on a word processor or by adding an HTML modifier before Distilling.
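A rough mental model helps explain why. The Python sketch below is not the real PDF file layout; the TextRun class, the coordinates and the sample text are all made up for illustration. It treats a page as a list of separately placed word groups, with a link stored as its own group on the same line.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextRun:
    """One group of words placed at a fixed spot on the page."""
    x: float                    # distance from the left edge, in points
    y: float                    # distance from the bottom edge, in points
    text: str
    link: Optional[str] = None  # target if this group is a link


# Hypothetical page: two lines, the second split because part of it is a link.
page = [
    TextRun(72, 720, "PDF files store groups of words"),
    TextRun(72, 706, "line by line; see the "),
    TextRun(190, 706, "next article", link="page 12"),
]


def lines_on_page(runs):
    """Group runs that share a baseline (same y) into lines, top to bottom."""
    lines = {}
    for run in runs:
        lines.setdefault(run.y, []).append(run)
    return [
        "".join(r.text for r in sorted(group, key=lambda r: r.x))
        for y, group in sorted(lines.items(), reverse=True)
    ]


for line in lines_on_page(page):
    print(line)
```

Nothing in that list says where a paragraph begins or ends, so there is no flow of words to re-justify. Every group would have to be moved by hand.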
What you see is what you get. You can modify it a little after you set up the PDF page or file, but this modification is primarily for indexing and linking. It is very difficult, if not impossible, to add, for example, a new picture to an existing page. It is much easier to do this in another program such as Word or PageMaker.
Acrobat uses a type of Artificial Intelligence (AI) in the making of documents. It can often read existing tables of authorities, indexing and contents written in programs like PageMaker, FrontPage or Word, and it will convert these into PDF format.
It also converts web site links to other pages into PDF format, even if you add these pages at a later date using a different file name.
As an example, we are working on our hard copy for this magazine and I imported some files directly off our site into Distiller. Then I found elements I had left out, brought these in as separate files, then inserted them into the middle of the document. Adobe instantly made most of the links between documents on our contents pages and the actual files, with no manual work required to change these from HTML links to PDF links.
This process, however, only works when you import the document as a call from another document. For example, I imported our “teaser” pages for each department and then clicked on the individual story links. Adobe imported these new pages, put them into the document (and you can do this out of sequence if you want) and converted the file links to page links within the PDF document. If I click on the MORE link in a teaser it takes me to the article found elsewhere in the PDF document.
I had to manually call in outside files. Our “backpage” links to different months or years. I converted these links to point to separate PDF files for those months, in the folder or on disk, and Acrobat will open that PDF file when you click on the picture link.
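The difference between links Acrobat converts by itself and links you must handle by hand can be sketched in a few lines of Python. This is only my guess at the bookkeeping involved, not Adobe's actual method, and the file names and page numbers are invented for the example.

```python
# Hypothetical file names and page numbers, made up for this sketch.
imported_pages = {
    "teaser.html": 1,
    "story-one.html": 2,
    "story-two.html": 5,
}

teaser_links = ["story-one.html", "story-two.html", "backpage.html"]


def convert_links(links, page_index):
    """Map each HTML link to a page in the PDF, or flag it as outside."""
    converted = {}
    for href in links:
        if href in page_index:
            converted[href] = f"page {page_index[href]}"
        else:
            # Not imported, so the link cannot be turned into a page jump.
            converted[href] = "external (needs manual work)"
    return converted


link_targets = convert_links(teaser_links, imported_pages)
for href, target in link_targets.items():
    print(href, "->", target)
```

Pages brought in through a link end up in the index and get a page jump. Anything left outside the document, like the backpage files, has to be pointed at a separate PDF file by hand.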
Acrobat also looks for “formatting” from your word processor. If you write using a formatted process, such as we do with the CSS files for style sheets, when Acrobat sees a heading or sub-heading it automatically indexes this into the contents. If you don’t use pre-formatted automation then you must manually do this process for contents.
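Here is a small Python sketch of the general idea, again using the standard html.parser module. It is not how Acrobat does it internally; the class name and the sample article markup are made up. It picks out heading and sub-heading tags and turns them into an indented contents list, while plain body text is ignored.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Collect h1/h2 headings into an indented contents list."""

    LEVELS = {"h1": 0, "h2": 1}

    def __init__(self):
        super().__init__()
        self.outline = []     # (level, title) pairs in document order
        self._level = None    # set while inside a heading tag
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.LEVELS:
            self._level = self.LEVELS[tag]
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.LEVELS and self._level is not None:
            self.outline.append((self._level, "".join(self._parts).strip()))
            self._level = None


# Made-up article markup for the example.
article = """
<h1>Working With Distiller</h1>
<p>Intro text with no heading style.</p>
<h2>Importing Web Pages</h2>
<p>More body text.</p>
<h2>Locking a File</h2>
"""

builder = OutlineBuilder()
builder.feed(article)
builder.close()

for level, title in builder.outline:
    print("  " * level + title)
```

A document with no heading styles gives a program like this nothing to find, which is why unformatted files leave you building the contents by hand.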
Finally, Acrobat Distiller allows you (in more modern versions) to lock a file and even lock out certain aspects, even to users with Distiller. You can prevent copy and paste. You can prevent editing. You can even prevent printing, and all of this can be password protected using 128-bit encryption. This allows an author to make E-Books that can’t be readily copied. It allows them to “brand” a book with a user’s name and address, which helps prevent them from doing Kazaa sharing since they can’t remove the brand without considerable expertise at hacking into a PDF file.
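The lock-outs work like a set of switches. In the Python sketch below, the bit values are the ones I understand the PDF specification's standard security handler to use, so treat them as an assumption to check against the spec. The class and function names are made up, and the password and encryption side is not shown at all.

```python
from enum import IntFlag


class Permission(IntFlag):
    """Rights an author can grant; values assumed from the PDF permission bits."""
    PRINT = 1 << 2      # bit 3
    MODIFY = 1 << 3     # bit 4
    COPY = 1 << 4       # bit 5
    ANNOTATE = 1 << 5   # bit 6


def allowed(granted: Permission, wanted: Permission) -> bool:
    """True only if every wanted right is in the granted set."""
    return (granted & wanted) == wanted


# A hypothetical e-book: readers may print, but not copy or edit.
ebook = Permission.PRINT

print(allowed(ebook, Permission.PRINT))
print(allowed(ebook, Permission.COPY))
print(allowed(ebook, Permission.PRINT | Permission.MODIFY))
```

Each right is switched on or off separately, which is what lets you allow printing while still blocking copy and paste.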
This makes Acrobat Distiller a very flexible commercial tool for this new millennium of digital information, giving you the option to share information openly or on a limited basis.
Next issue we’ll get into the nuts and bolts of actually creating these different elements.