The "Tag Manual" for JafSoft's text conversion utilities

This documentation can be downloaded as part of the documentation set in .zip format



2. Using the pre-processor

The pre-processor was originally introduced into AscToHTM to allow users more flexibility in the HTML they generate.

The pre-processor allows AscToHTM and AscToRTF to be used as authoring tools, as opposed to simple text conversion or migration tools.

Pre-processor lines are not normally output to the HTML or RTF generated. Instead they are used to modify the conversion process in a number of ways.


2.1 Marking up sections of text

The pre-processor can be used to mark sections in your document so that the program processes them as you intend.

Note:
The software does attempt to spot much user-formatted text automatically, but this is a difficult area and prone to error. Hence the use of these directives can reduce the error rate on such occasions.

Examples include :-

SECTION

This directive is used to divide the document up into named sections that may then be conditionally included/excluded from a particular conversion.

BEGIN/END_TABLE

BEGIN/END_DELIMITED_TABLE

BEGIN/END_COMMA_DELIMITED_TABLE


These pairs of directives are used to bracket tables of various types in the source text. The software will attempt to detect plain text tables, but if this goes wrong adding these commands can correct the analysis.

Within these tables you can use other TABLE pre-processor commands to tailor the HTML generated (see "The TABLE commands").

BEGIN/END_CONTENTS

Used to mark up a contents list in the source document. The software will attempt to automatically detect the presence and location of any contents list in the document, but the algorithm can be problematic, and only really works for numbered headings.

BEGIN/END_HTML

Delimits a section of raw HTML code to be copied to the output file unchanged.

BEGIN/END_CODE

BEGIN/END_DIAGRAM

BEGIN/END_PRE


Delimits sections of pre-formatted text. CODE refers to software samples whilst DIAGRAM refers to ASCII art. PRE is the more general "pre-formatted" text, although currently all 3 have the same implementation.

BEGIN/END_IGNORE

Delimits text that should be ignored. This could be anything from comments to copyright statements in the original source file that shouldn't appear in the converted document.


2.2 Commands that influence the indexing of the document

Certain directives can be used to alter the document properties. Often these affect how the document will be searched and indexed.

In HTML these mostly lead to tags in the <HEAD>..</HEAD> of each page. Often these tags produce no visible effect.

In RTF these lead to fields in the document properties being filled in.

Examples include :-

TITLE

DESCRIPTION

KEYWORDS

STYLE_SHEET (HTML Only)


The DESCRIPTION and KEYWORDS commands may be continued on subsequent lines provided they also begin with the same $_$_<command> directive.
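
The continuation rule can be illustrated with a small script. The following Python sketch is not part of AscToHTM. The helper name keyword_lines, the comma-separated keyword layout and the line width are assumptions made purely for illustration. It splits a long keyword list so that every emitted line begins with the same $_$_<command> directive.

```python
def keyword_lines(keywords, directive="KEYWORDS", width=60):
    """Split a keyword list over several directive lines.

    Every emitted line starts with the same $_$_<command> prefix, as the
    manual requires for continuation lines. The comma-separated layout and
    the width limit are assumptions of this sketch, not documented rules.
    """
    prefix = "$_$_" + directive + " "
    lines, current = [], ""
    for word in keywords:
        candidate = word if not current else current + ", " + word
        if current and len(prefix) + len(candidate) > width:
            # The line is full: emit it and start a continuation line.
            lines.append(prefix + current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(prefix + current)
    return lines
```

Calling keyword_lines(["alpha", "beta", "gamma"], width=22) yields three lines, each starting with $_$_KEYWORDS, which could then be pasted at the top of the source file.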


2.3 Useful one-line pre-processor commands

A large number of one-line directives exist. Those for tables are listed in the section on "The TABLE commands". Others include :-

CONTENTS_LIST

HTML_LINE

INCLUDE

LINERULE

NAVIGATION_BAR

TOC



2.4 Useful in-line tags

A large number of in-line tags are available. These can be used to produce a number of useful effects. They include :-

BR (line break)

GOTO

HYPERLINK

TIMESTAMP

SPACES

SUPER and SUB

VARIABLE



2.5 The TABLE commands

These directives are used to tailor the HTML generated in any tables AscToHTM creates. They are placed either

At the top of the file
Directives placed here become defaults for the whole file, and will replace any policies that have been set (see the section on "Table Generation" in the AscToHTM manual)

Inside a BEGIN_TABLE ... END_TABLE section
Directives placed here will apply only to the table marked up by these commands (see 7.1.2).

The table commands are described (naturally enough) in the following table.

Directive                    Value   Effect
---------------------------  ------  ----------------------------------------
TABLE_ALIGN                  Align   Specifies the alignment of the whole
                                     table.
TABLE_BGCOLOR                Colour  Colour of background
TABLE_BORDER                 Number  Size of border. 0 = None
TABLE_BORDERCOLOR            Colour  Colour of border
TABLE_CAPTION                Text    Table caption. Added centred at the top
TABLE_CELL_ALIGN             Align   Specifies the default alignment of
                                     cells. Left, right or center
TABLE_CELLSPACING            Number  Spacing between cells.
TABLE_CELLPADDING            Number  Padding inside each cell
TABLE_COLO(U)R_ROWS          (none)  If present this specifies that the
                                     odd and even rows of the table should
                                     be coloured differently. See also the
                                     "Colour data rows" policy.
TABLE_CONVERT_XREFS          (none)  If present, indicates that any section
                                     cross-references in the table may
                                     be converted to hyperlinks
                                     (see also the policy line
                                     "Convert TABLE X-refs to links")
TABLE_EVEN_ROW_COLO(U)R      Colour  When data rows are to be coloured
                                     this specifies the colour of the
                                     even numbered rows.
TABLE_HEADER_ROWS            Number  Number of header rows. These
                                     will be placed in <TH> .. </TH> markup
TABLE_HEADER_COLS            Number  Number of header columns.
                                     These will be marked up in bold
TABLE_IGNORE_HEADER          (none)  If present, indicates that the first
                                     few lines (i.e. the header) should be
                                     ignored when calculating the column
                                     structure of the table.
                                     See also policy "Ignore table header
                                     during analysis"
TABLE_LAYOUT                 Layout  Explicit structure of table in terms of
                                     number of columns and their widths.
                                     See also policy "Default TABLE layout"
TABLE_MAY_BE_SPARSE          (none)  If present, indicates that the TABLE
                                     may be sparse (see also the policy
                                     "Expect sparse tables")
TABLE_MIN_COLUMN_SEPARATION  Number  Number of spaces to be taken as a
                                     column separator when analysing the
                                     table (see also the policy
                                     "Minimum TABLE column separation").
TABLE_ODD_ROW_COLO(U)R       Colour  When data rows are to be coloured
                                     this specifies the colour of the
                                     odd numbered rows.
TABLE_WIDTH                  Text    The width of the table (see also the
                                     policy "Default TABLE width")

Colours should be HTML colours, which will be placed in the various attributes of the <BODY> tag and other tags. The program simply transcribes your value into the output file.
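
As an illustration of the second placement (directives inside a BEGIN_TABLE ... END_TABLE section), the following Python sketch brackets a plain-text table and adds per-table directives. It is not part of AscToHTM. The helper name mark_up_table is hypothetical, and the sketch assumes that the BEGIN/END and TABLE directives take the same $_$_ prefix used elsewhere in this manual, with the value following the directive name after a space.

```python
def mark_up_table(table_lines, **options):
    """Bracket a plain-text table with BEGIN_TABLE / END_TABLE.

    Keyword arguments become TABLE_* directives placed inside the bracketed
    section, so they would apply to this table only. A value of None stands
    for a directive that takes no value, such as TABLE_MAY_BE_SPARSE.
    The "$_$_NAME value" line layout is an assumption of this sketch.
    """
    out = ["$_$_BEGIN_TABLE"]
    for name, value in options.items():
        directive = "$_$_TABLE_" + name.upper()
        out.append(directive if value is None else f"{directive} {value}")
    out.extend(table_lines)
    out.append("$_$_END_TABLE")
    return out
```

For example, mark_up_table(rows, border=2, header_rows=1) returns the original rows preceded by the opening directive and the two TABLE lines, and followed by the closing directive.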


2.6 The CHANGE_POLICY command

NOTE:
This feature has the potential to cause mayhem, and as such is offered to users on an "as is" basis. That is, we offer no support for getting this feature to have the effect a user may desire.

This directive allows you to change a particular policy in part of a document. This is a potentially powerful feature, allowing you to tailor the conversion of your file in different sections of that file, or to embed the policy particular to a file in commands inserted at the top of the file itself.

The syntax of the command line is

$_$_CHANGE_POLICY <Policy Line>

where <Policy Line> is a policy line as it would appear in a policy file, and (usually) as it appears in the Policy manual.

For example the following would all be valid directives

        $_$_CHANGE_POLICY Background Colour : red
        $_$_CHANGE_POLICY Ignore multiple blank lines : Yes

However, how and when they take effect will depend on the policy.

For example, the background colour would only take effect if splitting the file up, and only on the next file generation. This works, BTW, so if anyone wants to split a file into many pages, all different colours, then be my guest.

There are many caveats to this behaviour :-

Not all policies may be changed in this way. In particular policies that open other policy files are not supported. Even if a policy is "changed", it does not follow that changing the policy will have an effect.

It is unlikely that this feature can be sensibly used to influence the analysis of a file, other than when placed at the top of the file only. Used in such a manner it is simply an alternative to using a separate policy file.

Output policies are referenced at different times. Only those that are referenced after the line is read from the source file may be influenced, thus things like output file name may have no effect.

Not all policies, once changed, can be changed back. This is particularly true of policies that contain values to be added to a list. This is an issue that may be addressed in later versions.

Messing with policies can cause unpredictable behaviour. For example if you alter the section splitting parameters, then the chances of a section cross-reference elsewhere in the document being calculated as a correct hyperlink diminish.

That's why this feature is offered UNSUPPORTED.

To further complicate matters, AscToHTM uses a read-ahead, write-behind buffer, which means that you may need to experiment with the placing of your policy change to within 40 lines (the size of the buffer).

This problem has been alleviated since version 3.2.
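
The safest use described above, embedding a file's policies at the top of the file itself, can be sketched in Python as follows. This is not part of AscToHTM. The helper name embed_policies is hypothetical, and the policy lines passed in are assumed to be valid lines copied from a policy file.

```python
def embed_policies(source_text, policy_lines):
    """Prepend one $_$_CHANGE_POLICY line per policy to the source text.

    Blank policy lines are skipped. Whether a given policy can actually be
    changed this way depends on the policy, as the caveats above explain.
    """
    header = [
        "$_$_CHANGE_POLICY " + line.strip()
        for line in policy_lines
        if line.strip()
    ]
    return "\n".join(header + [source_text])
```

Given the policy line "Ignore multiple blank lines : Yes", the result is the original text preceded by a single $_$_CHANGE_POLICY line, which is equivalent in intent to supplying that line in a separate policy file.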


2.7 Definition blocks and variables

Using pre-processor tags you can define "blocks" of text known as "definition blocks".

Definition blocks allow blocks of output to be defined out of sequence. That is, the content is defined in one location and may then be instantiated at a number of different locations.

A definition block has the form

        $_$_DEFINE_BLOCK <block name>
        ..
        text that forms the block
        ..
        $_$_END_BLOCK

The text inside the block may contain in-line tags, but it cannot contain any other tag directives.

To invoke a block use the EMBED_BLOCK or INSERT_BLOCK commands.

One tag that is particularly useful inside blocks is the VARIABLE tag. You can define variables throughout the document and then quote them inside a definition block.

A possible example of use would be the addition of "page" footers. You could define the text that goes inside a page footer, and include in it a variable called PAGE_NUMBER. You can then re-define the PAGE_NUMBER and output a new page boundary with the commands

        $_$_DEFINE_VARIABLE PAGE_NUMBER 21
        $_$_INSERT_BLOCK PAGE_FOOTER

having previously defined a PAGE_FOOTER block.

It should perhaps be pointed out that "pages" are anathema to HTML, but should you want this feature this is a possible implementation.
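
The page footer idea can be sketched as a small Python script that prepares a source file. This is not part of AscToHTM. The helper name add_page_footers and the representation of "pages" as lists of text lines are assumptions, and the sketch presumes that a PAGE_FOOTER block quoting the PAGE_NUMBER variable has already been defined earlier in the file.

```python
def add_page_footers(pages, first_page=1, block_name="PAGE_FOOTER"):
    """Append a footer to each page of text.

    After every page the PAGE_NUMBER variable is re-defined and the footer
    block is inserted, using the two commands shown in the manual. The block
    named by block_name must be defined elsewhere in the source file.
    """
    lines = []
    for number, page in enumerate(pages, start=first_page):
        lines.extend(page)
        lines.append(f"$_$_DEFINE_VARIABLE PAGE_NUMBER {number}")
        lines.append(f"$_$_INSERT_BLOCK {block_name}")
    return lines
```

Each page therefore ends with the same pair of commands, differing only in the page number, so the footer text itself needs to be written once.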


2.8 HTML colours

Some tags accept colour values. These values should be HTML colours which - for example - may be placed in the various attributes of the <BODY> tag.

You can enter any value acceptable to HTML. Normally a value is expressed as a 6-digit hexadecimal value in the range 000000 (black) to FFFFFF (white), but certain colours such as "white", "blue", "red" etc may also be recognised by HTML. AscToHTM simply transcribes your value into the output file. The list of colours recognised in the HTML standard is

Colour   HTML Hex value
-------  --------------
Black    #000000
Silver   #C0C0C0
Gray     #808080
White    #FFFFFF
Maroon   #800000
Red      #FF0000
Purple   #800080
Fuchsia  #FF00FF
Green    #008000
Lime     #00FF00
Olive    #808000
Yellow   #FFFF00
Navy     #000080
Blue     #0000FF
Teal     #008080
Aqua     #00FFFF

Only these values will be converted by the software to the equivalent names. Other names exist outside the standard which may not be universally supported.
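
Because the program transcribes colour values without checking them, an author may wish to check values before use. The following Python sketch is one possible author-side check and is not part of AscToHTM. The helper name to_hex is hypothetical. It accepts the sixteen standard names listed above or a 6-digit hexadecimal value, and returns None for anything else.

```python
import re

# The sixteen colour names defined by the HTML standard, as tabulated above.
HTML_COLOURS = {
    "black": "#000000", "silver": "#C0C0C0", "gray": "#808080",
    "white": "#FFFFFF", "maroon": "#800000", "red": "#FF0000",
    "purple": "#800080", "fuchsia": "#FF00FF", "green": "#008000",
    "lime": "#00FF00", "olive": "#808000", "yellow": "#FFFF00",
    "navy": "#000080", "blue": "#0000FF", "teal": "#008080",
    "aqua": "#00FFFF",
}


def to_hex(value):
    """Return the #RRGGBB form of a colour value, or None if unrecognised."""
    text = value.strip()
    if text.lower() in HTML_COLOURS:
        return HTML_COLOURS[text.lower()]
    if re.fullmatch(r"#?[0-9A-Fa-f]{6}", text):
        # Normalise to upper case with a leading '#'.
        return "#" + text.lstrip("#").upper()
    return None
```

A None result does not necessarily mean that the value is invalid, only that it is a name outside the standard list and so may not be universally supported.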


2.9 English/American spellings

As far as possible tags support both British English and American English spellings. This mainly occurs with the word "colour" (or "color"), so for example the directives

$_$_TABLE_ODD_ROW_COLOUR ....

and

$_$_TABLE_ODD_ROW_COLOR ....

are equivalent.
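
When searching or comparing source files that mix the two spellings, it can help to reduce directive names to a single form first. The Python sketch below is not part of AscToHTM. The helper names are hypothetical, and the choice of the American spelling as the common form is arbitrary and says nothing about how the software stores names internally.

```python
def normalise_spelling(directive):
    """Rewrite the British spelling COLOUR as COLOR in a directive name."""
    return directive.replace("COLOUR", "COLOR")


def same_directive(first, second):
    """Report whether two directive names differ only in that spelling."""
    return normalise_spelling(first) == normalise_spelling(second)
```

With these helpers, same_directive("$_$_TABLE_ODD_ROW_COLOUR", "$_$_TABLE_ODD_ROW_COLOR") is true.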




Converted from a single text file by AscToHTM
© 1997-2001 John A. Fotheringham