HTMLflow - HTML flow charter and link checker ver 01.00 10/23/95

HTMLflow is a product and copyright of On-the-Net, LLC.

See the HTMLflow home page for current versions and further developments.

Mail questions and comments to tech@on-the-net.com

The HTMLflow program runs on the server, directly against the real web documents. This allows more extensive testing than is possible when pages are fetched as http links.

Unix Versions:
Command line and Web form

DOS Versions:
Command line with parameter file.

Installation
UNIX versions:
Download the version for your platform. HTMLflow includes a platform-specific executable program and the web pages/CGI scripts that control the HTMLflow program.

To install:

  1. Expand the provided Unix archive (html.tar.gz or html.tar.Z), with a command such as "tar xvf html.tar.Z", in the directory where htmlflow is going to run.
  2. Edit the initial variables in runflow:
            #!/usr/bin/perl
            #        first line has Location of Perl:
            $config_path = "./";               # Real path for config file
            $url_path    = "/a-flow/";         # URL path for control form
            $cgi_path    = "/cgi-bin/a-flow/"; # URL path for CGI script
            $pgm_path    = "";                 # Path to HTMLflow
    	
  3. Make sure that the directory is a legal cgi-bin directory and that the directory is a+rw for the web server.
  4. Initialize the runflow.html page by typing 'runflow' in the directory.
  5. Run HTMLflow by opening the runflow.html page with a URL like:
    http://localhost/a-flow/runflow.html

DOS version:
Installation:
Place HTMLflow.exe in a directory in the PATH

Use:
The DOS version of HTMLflow uses the same command line options as the UNIX version. Because the DOS command line is limited, the DOS version also accepts a parameter file that holds the command line values.

The parameter file is specified by using the command: htmlflow @param-file files-to-test.htm

The param-file is a text file with one value on each line. A flag (such as -l) is one value, and its matching argument is a second value on the following line.

Ex:
htmlflow -l \mypages\ \mypages\index.htm or
htmlflow @parm \mypages\index.htm with the file parm containing:
-l
\mypages\
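The parameter file format described above can be sketched in a few lines of Python. This is only an illustration of the idea, not HTMLflow's own code: the function name expand_param_files is hypothetical, and skipping blank lines is an assumption that the description above does not confirm.

```python
import os
import tempfile


def expand_param_files(argv):
    """Replace each '@name' argument with the values listed in file 'name'.

    Assumption: one value per line; blank lines are skipped.
    """
    expanded = []
    for arg in argv:
        if arg.startswith("@") and len(arg) > 1:
            with open(arg[1:], "r") as handle:
                for line in handle:
                    value = line.strip()
                    if value:
                        expanded.append(value)
        else:
            expanded.append(arg)
    return expanded


def demo():
    """Write the 'parm' file from the example and expand a command line."""
    with tempfile.TemporaryDirectory() as tmp:
        parm = os.path.join(tmp, "parm")
        with open(parm, "w") as handle:
            handle.write("-l\n\\mypages\\\n")
        return expand_param_files(["@" + parm, "\\mypages\\index.htm"])
```

Under these assumptions, the expanded argument list is the same one the first form of the example passes directly on the command line.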

Flag definitions:
(The latest set of flags can be determined by running htmlflow with no parameters)
-a file: Outfile of all URL references
-b file: Block definition control file
-d dir: Real directory (must end in / or \)
-l dir: Real local directory for unspecified paths (must end in / or \)
-e file: Outfile of all errors
-f file: Outfile of all files referenced
-i file: Default index name
-m file: Outfile of missing files
-r file: Outfile of structure report
-v path: Virtual home
-x[#]: Debug level
-X file: Outfile of external references

Operations:
HTMLflow traces all local (and soon external) links in all HTML documents connected to the documents specified. It checks that every file referenced exists. In the process it creates a structure chart, as text or HTML, and a number of other analysis files.
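The tracing idea can be illustrated with a short Python sketch. This is not how HTMLflow itself is implemented; it is a simplified walk that assumes relative links only, treats any link with a scheme (http:, mailto:) as external, and skips root-relative links because those need the real/virtual directory mapping. The names LinkCollector and trace are hypothetical.

```python
import os
import tempfile
from html.parser import HTMLParser
from urllib.parse import urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def trace(start_file):
    """Walk local links from start_file; return (found, missing, external)."""
    found, missing, external = set(), set(), set()
    queue = [os.path.normpath(start_file)]
    while queue:
        path = queue.pop()
        if path in found or path in missing:
            continue  # already visited
        if not os.path.isfile(path):
            missing.add(path)
            continue
        found.add(path)
        if not path.lower().endswith((".htm", ".html")):
            continue  # only HTML documents are scanned for further links
        collector = LinkCollector()
        with open(path, "r", errors="replace") as handle:
            collector.feed(handle.read())
        for link in collector.links:
            target = urldefrag(link)[0]
            if not target:
                continue  # same-page anchor such as "#top"
            if urlparse(target).scheme:
                external.add(target)
                continue
            if target.startswith("/"):
                continue  # root-relative link: needs a directory mapping, skipped here
            queue.append(os.path.normpath(os.path.join(os.path.dirname(path), target)))
    return found, missing, external


def demo():
    """Build a two-page site with one broken link and trace it."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "index.htm"), "w") as handle:
            handle.write('<a href="a.htm">A</a> <a href="gone.htm">B</a> '
                         '<a href="http://www.host.com/">C</a>')
        with open(os.path.join(tmp, "a.htm"), "w") as handle:
            handle.write('<a href="index.htm#top">back</a>')
        found, missing, external = trace(os.path.join(tmp, "index.htm"))
        names = lambda paths: sorted(os.path.basename(p) for p in paths)
        return names(found), names(missing), sorted(external)
```

In the demo, both pages are found, gone.htm is reported missing, and the http link is collected as external rather than followed.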

Input files:
HTMLflow starts with an initial file, or with a directory and its implied index file. It traces all links, checks that the linked files exist, and in turn traces those files. To map anchor URLs to physical files, HTMLflow must be able to relate each real file to a virtual URL. This is done by providing a real directory value and a matching virtual home path.

For example, http://www.host.com/index.html might be /www/mypages/index.html on a UNIX system, or c:\pages\index.htm on a DOS system.

To do this, HTMLflow must be run with the command line:

htmlflow -dl /www/mypages/ /www/mypages/index.html

If the virtual home directory is not specified with the '-v' switch, it is assumed to be '/'. This example uses '-dl' because the real directory for the home directory and the local directory for unspecified paths are the same directory. This is the normal situation.
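The relation between a virtual URL and a real file can be sketched as follows. This is a minimal illustration under stated assumptions, not HTMLflow's actual logic: the function url_to_real_path and its parameter names are hypothetical, and the handling of URLs outside the virtual home (returning None) is a choice made for the example.

```python
def url_to_real_path(url_path, real_dir, virtual_home="/", default_index="index.html"):
    """Map a virtual URL path to a real file path.

    real_dir plays the role of the -d value and must end in '/' or '\\'.
    virtual_home plays the role of the -v value and defaults to '/'.
    Returns None when the URL lies outside the virtual home.
    """
    if not real_dir.endswith(("/", "\\")):
        raise ValueError("real directory must end in / or \\")
    if not virtual_home.endswith("/"):
        virtual_home += "/"
    if not url_path.startswith(virtual_home):
        return None
    relative = url_path[len(virtual_home):]
    if relative == "" or relative.endswith("/"):
        relative += default_index  # a bare directory implies the index file
    separator = real_dir[-1]  # reuse the separator style of the real directory
    return real_dir + relative.replace("/", separator)
```

With the values from the example, url_to_real_path("/index.html", "/www/mypages/") gives /www/mypages/index.html, and url_to_real_path("/", "c:\\pages\\", default_index="index.htm") gives c:\pages\index.htm.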

Block definitions:
For many web sites, all pages start and end with a common block of URL links. This allows easy navigation and gives each page a consistent look. These common 'blocks' of links make a structure chart very cluttered and confusing. HTMLflow allows these blocks to be defined in a 'Block definition control file'. The control file has a very simple format: the name of each block is on the left margin and the URL for each entry in the block is indented. Blank lines are ignored and comments are allowed on lines starting with '#'.

A blocks file is specified by (-b) on the htmlflow command line:
-b block-file-name

For example, a blocks file:

HEAD1
                left.htm
                right.htm
                home.htm
FOOT1
                left.htm
                next.htm
                home.htm
                mailto:webmaster@mysys.com
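The control file format described above is simple enough to read with a short parser. The following Python sketch is an illustration only, not HTMLflow's own parser; the function name parse_blocks is hypothetical, and rejecting an indented entry that appears before any block name is an assumption.

```python
def parse_blocks(text):
    """Return {block name: [URLs]} from block definition text."""
    blocks = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip() or raw.startswith("#"):
            continue  # blank line or comment line
        if raw[0] in " \t":
            # Indented line: a URL entry of the current block.
            if current is None:
                raise ValueError("URL entry before any block name: " + raw.strip())
            blocks[current].append(raw.strip())
        else:
            # Line on the left margin: the name of a new block.
            current = raw.strip()
            blocks.setdefault(current, [])
    return blocks


EXAMPLE = """\
# navigation blocks
HEAD1
                left.htm
                right.htm
                home.htm

FOOT1
                left.htm
                next.htm
                home.htm
                mailto:webmaster@mysys.com
"""
```

Calling parse_blocks(EXAMPLE) yields two blocks, HEAD1 with three entries and FOOT1 with four.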

Default index.html file:
When a local URL specifies just a directory, the 'default index file' is used as the name of the HTML file to be processed.
The built-in value is 'index.html'. It can be changed using (-i), for example:
-i home.htm

Output files:
Structure tree report: -r[h] report-filename (-r for text or -rh for html)
The structure tree consists of three sections: structure chart, module index, and block definitions (if any blocks are defined). Each module is listed with all of its hyper references, with nesting shown on the first reference to each module. When the report is generated as HTML, the module references are linked to each module's details, and the links that make up the contents of each module are active links to the real contents.

The module index is an alphabetical listing of all modules and their line numbers in the structure tree. In the HTML version, these are hot links.

List of all URL references: -a list-filename
Specify a real directory: -d dir
Specify a local directory for unspecified files: -l dir
List of all errors: -e list-filename
List of all files referenced: -f list-filename
List of all missing files: -m list-filename
        The missing file list has no directory path. This can be used to locate URLs with the correct filename but the wrong path.
List of all external references: -X list-filename
        This list can be used to check for invalid external URLs more efficiently.
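One way to use the missing file list for finding URLs with the right filename but the wrong path is sketched below in Python. This is a hedged illustration: it assumes the file names have already been read from the -m list (the exact layout of that file is not described here), and the function name find_misplaced is hypothetical. The comparison is case-sensitive, which may need adjusting on DOS.

```python
import os
import tempfile


def find_misplaced(missing_names, root_dir):
    """Map each missing file name to the paths under root_dir with that name."""
    wanted = set(missing_names)
    matches = {name: [] for name in wanted}
    for folder, _subdirs, files in os.walk(root_dir):
        for name in files:
            if name in wanted:
                matches[name].append(os.path.join(folder, name))
    return matches


def demo():
    """One missing name exists in a subdirectory; the other exists nowhere."""
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "images"))
        open(os.path.join(tmp, "images", "logo.gif"), "w").close()
        result = find_misplaced(["logo.gif", "nowhere.htm"], tmp)
        return {name: [os.path.relpath(p, tmp) for p in paths]
                for name, paths in result.items()}
```

A name that maps to one or more paths points to a URL with the wrong directory; a name with an empty list is a file that really does not exist under the directory searched.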

Report file sizes:
BIG!! This program is really designed to be used locally on a server or on a set of local test files. For example, a structure tree for a web site with 240 files is 250K. This takes only a few seconds to load on a directly connected system but much longer over a slow link.
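To get a rough feel for the difference, the load time can be estimated with simple arithmetic. The sketch below is an idealized estimate only: it assumes 1K = 1024 bytes, ignores protocol overhead and latency, and the modem speeds used are illustrative assumptions, not measurements.

```python
def transfer_seconds(size_kbytes, link_bits_per_second):
    """Ideal transfer time in seconds, with no protocol overhead or latency."""
    return size_kbytes * 1024 * 8 / link_bits_per_second


# Assumed link speeds in bits per second (hypothetical examples).
MODEM_14_4 = 14400
MODEM_28_8 = 28800
```

Under these assumptions, transfer_seconds(250, MODEM_14_4) is about 142 seconds and transfer_seconds(250, MODEM_28_8) is about 71 seconds for the 250K report.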