How to Stop "Pretty Printing" In HTML Output
Our on-line help system has an unusual constraint: Engineering gets extremely nervous when the size of the help system on disk increases. The reason is that our help is deployed as part of a proprietary operating system that has no file system. In other words, the hardware appliance on which the operating system resides has hard drives but no abstract concept of a file system.
When the appliance is booted, the entire operating system is loaded into RAM. Because we're now switching to ePublisher Pro output, our help has more than doubled in size -- from approximately 15MB to more than 30MB.
At Round Up 2009, Ben suggested that I eliminate "pretty printing" in the Dynamic HTML output. Although I have only tried it with Dynamic HTML, I expect this simple, one-parameter override to pages.xsl will also work on other output formats.
Before and After
For an idea of what "pretty printing" is, consider the following examples.
Before (that is, with pretty printing):
<table cellspacing="0" summary="">
  <tr>
    <td>
      <a href="toc.html"><img src="images/toc.gif" alt="Table of Contents" border="0" /></a>
    </td>
    <td>
      <a href="Action_AuthGuest_h.html"><img src="images/prev.gif" alt="Previous" border="0" /></a>
    </td>
    <td>
      <a href="Action_ForceAuth_h.html"><img src="images/next.gif" alt="Next" border="0" /></a>
    </td>
    <td>
      <a href="ix.html"><img src="images/index.gif" alt="Index" border="0" /></a>
    </td>
  </tr>
</table>
Notice that all the tags are indented from the left margin, making the HTML look nice and clean; hence the name "pretty printing".
After:
<table cellspacing="0" summary=""><tr><td><a href="toc.html"><img src="images/toc.gif" alt="Table of Contents" border="0" /></a></td><td><a href="Action_AuthGuest_h.html"><img src="images/prev.gif" alt="Previous" border="0" /></a></td><td><a href="Action_ForceAuth_h.html"><img src="images/next.gif" alt="Next" border="0" /></a></td><td><a href="ix.html"><img src="images/index.gif" alt="Index" border="0" /></a></td></tr></table>
Now, while the preceding example isn't pretty, who do you expect to read it anyway? <g>
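To get a feel for how much the indentation costs, here is a minimal Python sketch. It is not part of ePublisher Pro; it uses the standard xml.dom.minidom module to serialize a shortened, two-cell version of the navigation table above both ways and compare the byte counts. The function name serialized_sizes and the two-space indent are my own assumptions for illustration.

```python
from typing import Tuple
from xml.dom import minidom

# Sample markup modeled on the navigation table above (shortened to two cells).
SAMPLE = (
    '<table cellspacing="0" summary=""><tr>'
    '<td><a href="toc.html"><img src="images/toc.gif" '
    'alt="Table of Contents" border="0" /></a></td>'
    '<td><a href="ix.html"><img src="images/index.gif" '
    'alt="Index" border="0" /></a></td>'
    '</tr></table>'
)


def serialized_sizes(markup: str) -> Tuple[int, int]:
    """Return (pretty_bytes, compact_bytes) for the same XML markup."""
    root = minidom.parseString(markup).documentElement
    pretty = root.toprettyxml(indent="  ")  # one tag per line, indented
    compact = root.toxml()                  # no added white space
    return len(pretty.encode("utf-8")), len(compact.encode("utf-8"))


if __name__ == "__main__":
    pretty_size, compact_size = serialized_sizes(SAMPLE)
    print(f"pretty:  {pretty_size} bytes")
    print(f"compact: {compact_size} bytes")
    print(f"saved:   {pretty_size - compact_size} bytes")
```

The saving on one small table is tiny, but it repeats on every nested element of every page, which is why it adds up across a large help system.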
Why It Matters to Me
As I mentioned earlier, every KB and MB can be critical for our help system. I don't want to exclude content, but getting rid of white space in the HTML seems harmless enough.
What Effect Did it Have?
The total size of the help system (images included) decreased by 5.4MB (15%) on an on-line help system consisting of 2,158 HTML files. (The number and size of images didn't change.) This is a really nice decrease for us and it costs us nothing except some white space.
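If you want to take the same kind of before-and-after measurement on your own output, something like the following Python sketch could help. It is an illustration only: the function name size_by_extension is hypothetical, and you would point it at whatever folder your generated help lands in. It totals file sizes per extension so you can see whether the change came from the HTML files or the images.

```python
import os
import sys
from collections import defaultdict
from typing import Dict


def size_by_extension(root: str) -> Dict[str, int]:
    """Sum file sizes (in bytes) under root, grouped by lowercase extension."""
    totals: Dict[str, int] = defaultdict(int)
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            totals[ext] += os.path.getsize(os.path.join(folder, name))
    return dict(totals)


if __name__ == "__main__":
    # Usage: python help_size.py <output-folder>  (defaults to the current folder)
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    totals = size_by_extension(root)
    megabyte = 1024 * 1024
    for ext, size in sorted(totals.items(), key=lambda item: -item[1]):
        label = ext or "(none)"
        print(f"{label:10} {size / megabyte:8.2f} MB")
    print(f"{'total':10} {sum(totals.values()) / megabyte:8.2f} MB")
```

Run it once against the output generated with pretty printing and once against the output generated without it, then compare the .html rows.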
How Did I Do It?
There are two main steps:
- Make a one-parameter change to pages.xsl.
- Deploy pages.xsl to the format override directory.
For background about what I changed, see the description of the indent parameter here: http://www.w3schools.com/xsl/el_output.asp
Changing pages.xsl
This is based on using Dynamic HTML output, but I expect you can do the same thing on any output format that supports pretty printing. To be on the safe side, I backed up my original pages.xsl first and then, as always, I copied it to my project's format override directory before I edited it.
To find your project's format override directory:
- Start ePublisher Pro.
- Open your help project.
- Click View > Format Override Directory.
- Double-click the Transforms folder if it exists; if not, create it. Here's where it will be located: <path-to-wep-file>\Formats\Dynamic HTML\Transforms
To back up the original pages.xsl:
- Locate this folder: <webworks-epub-pro-install-path>\Formats\Dynamic HTML\Transforms. For example, C:\Program Files\WebWorks\ePublisher Pro\Formats\Dynamic HTML\Transforms.
- Click pages.xsl.
- Press Control+C (Copy).
- Press Control+V (Paste). This creates a file in the same directory; the file is named Copy of pages.xsl. In the event of disaster you can fall back to using this file to start completely over.
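If you would rather script the backup than click through it, here is a minimal Python sketch of the same idea. The function name back_up is hypothetical, and the install path is only the example path from the steps above, so adjust it for your machine. The sketch refuses to overwrite an existing backup so that your pristine copy stays pristine.

```python
import shutil
from pathlib import Path


def back_up(original: Path) -> Path:
    """Copy original to 'Copy of <name>' in the same folder; return the new path."""
    backup = original.with_name(f"Copy of {original.name}")
    if backup.exists():
        # Never clobber an earlier backup of the untouched file.
        raise FileExistsError(f"Backup already exists: {backup}")
    shutil.copy2(original, backup)  # copy2 also preserves timestamps
    return backup


if __name__ == "__main__":
    # Example install path from the steps above; yours may differ.
    transforms = Path(
        r"C:\Program Files\WebWorks\ePublisher Pro\Formats\Dynamic HTML\Transforms"
    )
    pages = transforms / "pages.xsl"
    if pages.exists():
        print(f"Backed up to: {back_up(pages)}")
    else:
        print(f"Not found: {pages}")
```

Writing under Program Files usually requires administrator rights, so you may need to run this from an elevated prompt.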
To do the exciting part:
- Copy pages.xsl from its original location to your project's format override directory.
- Open pages.xsl in a text editor.
- Locate the line that starts with:
<xsl:variable name="VarWriteResult"
I changed the sixth argument (the indent setting, which is the first 'yes' in the call) from yes to no, based on the description of Document(node-set nodeSet, string path, ...) here: http://wiki.webworks.com/DevCenter/Documentation/ExtensionObjects?highlight=%28wwexsldoc%29
Before:
<xsl:variable name="VarWriteResult" select="wwexsldoc:Document($VarResult, $ParamSplit/@path, wwprojext:GetFormatSetting('encoding', 'utf-8'), 'xhtml', '1.0', 'yes', 'no', 'no', '-//W3C//DTD XHTML 1.0 Transitional//EN', 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd')" />
After:
<xsl:variable name="VarWriteResult" select="wwexsldoc:Document($VarResult, $ParamSplit/@path, wwprojext:GetFormatSetting('encoding', 'utf-8'), 'xhtml', '1.0', 'no', 'no', 'no', '-//W3C//DTD XHTML 1.0 Transitional//EN', 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd')" />
- Save your changes to pages.xsl and generate your help project.
If you have problems, reread the references, check your syntax, and try again. Also, make sure you're editing pages.xsl in the format override folder and not under the WebWorks installation folder.
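One quick way to confirm which copy of pages.xsl has the change is to read the indent argument back out of the file. The Python sketch below is a hedged illustration, not a WebWorks tool: the function name indent_setting is hypothetical, and the pattern assumes the call is written as in the Before and After lines above, with 'xhtml' and '1.0' immediately before the indent value. If your pages.xsl spells the call differently, the function returns None.

```python
import re
from typing import Optional

# Assumes the argument order shown in the article: ..., 'xhtml', '1.0', <indent>, ...
INDENT_ARG = re.compile(
    r"""name="VarWriteResult".*?'xhtml',\s*'1\.0',\s*'(yes|no)'""",
    re.DOTALL,  # tolerate the declaration being wrapped across lines
)


def indent_setting(xsl_text: str) -> Optional[str]:
    """Return 'yes' or 'no' for the VarWriteResult call, or None if not found."""
    match = INDENT_ARG.search(xsl_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        '<xsl:variable name="VarWriteResult" select="wwexsldoc:Document('
        "$VarResult, $ParamSplit/@path, "
        "wwprojext:GetFormatSetting('encoding', 'utf-8'), "
        "'xhtml', '1.0', 'no', 'no', 'no')\" />"
    )
    print(indent_setting(sample))
```

To check a real file, read it with open(path, encoding="utf-8").read() and pass the text in. The override copy should report no, and the copy under the installation folder should still report yes.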
References Again For Your Convenience
http://wiki.webworks.com/DevCenter/Documentation/ExtensionObjects?highlight=%28wwexsldoc%29
(Find sec. 4.4, "ExslDocument").