wvHtml – Target Dir

In what I hope is the first of many problem-solving posts, I show you how to use the wvHtml command to work with documents located in directories other than the current working directory.

Our requirement:

To display Word documents in Netscape 7 on Red Hat by converting them to a platform-independent format

The solution:

wvWare – specifically wvHtml

What it does:

Converts Word documents to HTML 4.0

Website:

http://wvware.sourceforge.net/

Testing platform:

RedHat with command line entry

Basic Usage of wvHtml

You can download wvWare from the SourceForge site.

After installation, the commands can be accessed from any path, as described in the basic usage guide.

The following code (when run in the directory containing your document) will take word_in.doc, convert it into an HTML file called word_out.html and place it in the current directory.

wvHtml word_in.doc word_out.html
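If there are several documents in the same directory, the command above can be run in a loop. This is only a minimal sketch, assuming a POSIX shell; the function name convert_all_here and the WVHTML override variable are hypothetical names made up for illustration and are not part of wvWare.

```shell
# Hypothetical helper: convert every .doc file in the current directory.
# WVHTML is an illustrative override so another program (or a test) can
# be substituted for the real wvHtml command.
convert_all_here() {
    for doc in *.doc; do
        # When nothing matches, the pattern stays unexpanded; skip it.
        [ -e "$doc" ] || continue
        # word_in.doc -> word_in.html, written to the current directory.
        "${WVHTML:-wvHtml}" "$doc" "${doc%.doc}.html" || return 1
    done
}
```

Calling convert_all_here from the directory containing your documents runs one wvHtml command per file, with each output named after its input.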

Quality of wvHtml Output

Don’t expect a perfect copy of your Word document, or even valid HTML 4.0 (it adds an attribute called ‘name’ if you use the built-in heading styles). What you will get is a very fast representation of all the text with basic formatting. Images, tables and lists all show up in the output:

Test Word Document
Sample Output

HTML Additions

  • Uses the HTML 4.0 Transitional doctype:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
  • Adds the correct content encoding:
    My test on Greek characters:
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    English characters:
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15">
  • Adds its own generator meta tag:
    <meta name="generator" content="wvWare/wvWare version 1.0.0">
  • Uses the first line of the Word document as the title
  • Adds a commented-out footer claiming that the page is valid (commented out, I assume, because it does not produce valid HTML):
    <address>
    <a href="http://wvware.sourceforge.net/"><img src="wvSmall.gif" height=31 width=47 align=left border=0 alt="wvWare"></a>
    <a href="http://validator.w3.org/check/referer"><img src="vh40.gif" height=31 width=88 align=right border=0 alt="Valid HTML 4.0!"></a>
    Document created with <a href="http://wvware.sourceforge.net/">wvWare/wvWare version 1.0.0</a>
    </address>
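
Because the declared charset varies with the document, a calling application may want to read it back from the generated file. The following is a rough sketch using sed, assuming the meta tag looks like the two samples above; declared_charset is a hypothetical helper name, not something wvWare provides.

```shell
# Hypothetical helper: print the charset declared in an HTML file's
# Content-Type meta tag, e.g. UTF-8 or iso-8859-15.
declared_charset() {
    # Capture everything after "charset=" up to a quote, ">" or space,
    # and keep only the first match in the file.
    sed -n 's/.*charset=\([^"> ]*\).*/\1/p' "$1" | head -n 1
}
```

For example, declared_charset word_out.html prints the charset value from the first matching line, or nothing if no charset is declared.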

Head Banging Moment – error: ‘word_in.doc’ unreadable

As with a lot of documentation, the guide only fully explains how to use the command from the current directory, not how I need to use it: run dynamically from another application. It does say to use --targetdir when referencing other directories, but it does not show the full command.

I was getting the error message above because I wasn’t referencing the full path to the input Word document. After much searching and experimentation, this is how you do it:

wvHtml --targetdir=/path/to/file /path/to/file/word_in.doc word_out.html
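
The same call can be wrapped in a small shell function, so that a calling application only has to pass the document path and the output directory. This is a minimal sketch, assuming a POSIX shell and that the output file should be named after the input; convert_doc and the WVHTML override variable are hypothetical names, not wvWare features.

```shell
# Hypothetical wrapper around the --targetdir form of the command.
# WVHTML is an illustrative override for the program that gets run.
convert_doc() {
    input=$1        # path to the .doc file, relative or absolute
    target_dir=$2   # existing directory that receives the HTML output

    # Fail early with a clearer message than the "unreadable" error.
    if [ ! -r "$input" ]; then
        echo "convert_doc: cannot read '$input'" >&2
        return 1
    fi
    if [ ! -d "$target_dir" ]; then
        echo "convert_doc: no such directory '$target_dir'" >&2
        return 1
    fi

    # Resolve the full path to the input document, since a bare file
    # name is what triggered the error described above.
    input_dir=$(cd "$(dirname "$input")" && pwd) || return 1
    abs_input="$input_dir/$(basename "$input")"

    # Name the output after the input: word_in.doc -> word_in.html
    out_name="$(basename "$input" .doc).html"

    "${WVHTML:-wvHtml}" --targetdir="$target_dir" "$abs_input" "$out_name"
}
```

With this in place, convert_doc /path/to/file/word_in.doc /path/to/file runs the same command as above, except that the output is called word_in.html rather than word_out.html.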

One Response to “wvHtml – Target Dir”

  1. Jamie (not so famous li'll bro) says:

    I would write something educated / helpful / intelligent but i have zero idea what any of the above means. But i thought i’d take the opportunity to delurk!
