The "Tag Manual" for JafSoft's text conversion utilities

This documentation can be downloaded as part of the documentation set in .zip format



1 Introduction

JafSoft Limited have produced a number of text conversion programs.

These programs share the same text analysis engine, and should do a good job of understanding the structure of the text document and replicating it in the target format.

However, users will frequently want control over how the output looks, and occasionally the text analysis will go wrong. For that reason the software supports a tagging system, which this document describes.


1.1 Overview of the pre-processor

During the analysis process the software reads the source files line-by-line. The pre-processor recognises special keywords in two ways :-

Directives

"Directives" consist of a single line in the source file beginning with the string "$_$_" followed by a recognised keyword and any additional "attributes" that the directive supports.

In-line tags

In-line tags, as the name implies, can occur anywhere in the source lines. They are enclosed between the special strings "[[" and "]]". Between these strings the tag consists of a keyword and then any attributes that tag supports.

In both cases the tag or directive cannot be split over multiple lines. That is, a directive must be on a line by itself, and an in-line tag must be wholly contained on a single line.

Examples of a directive and in-line tags are shown below.

      $_$_LINERULE 75%

      The BR tag means this text will be broken [[BR]] into two lines.

becomes :-


The BR tag means this text will be broken
into two lines.


Some tags can be expressed in either directive or in-line form.


1.2 Directives

Directives exist on a line by themselves in the source text. They have the form

$_$_<keyword> <attribute_list>

where the "$_$_<keyword>" must occur at the start of the line and the <keyword> must be recognised.

If the "$_$_<keyword>" is not at the start of the line, the directive is ignored and treated as end-user text. This device has been used to date to aid conversion of AscToHTM's own documentation to HTML.

The format of the <attribute_list> depends on the particular tag, but a general description is given in 1.4.
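
The rules above can be illustrated with a minimal Python sketch. It is not the software's actual implementation: the function name parse_directive, the known_keywords set, and the assumption that a keyword ends at the first whitespace are all hypothetical, introduced only for illustration.

```python
DIRECTIVE_PREFIX = "$_$_"


def parse_directive(line, known_keywords):
    """Return (keyword, attribute_text) for a recognised directive, else None.

    Hypothetical helper: names and keyword syntax are assumptions,
    not taken from the JafSoft software.
    """
    # The prefix must sit at the very start of the line; otherwise the
    # line is treated as ordinary end-user text.
    if not line.startswith(DIRECTIVE_PREFIX):
        return None
    rest = line[len(DIRECTIVE_PREFIX):].rstrip("\r\n")
    # The keyword must follow the prefix immediately.
    if not rest or rest[0].isspace():
        return None
    # Assumption: the keyword runs up to the first whitespace.
    parts = rest.split(None, 1)
    keyword = parts[0]
    if keyword not in known_keywords:
        return None
    attribute_text = parts[1].strip() if len(parts) > 1 else ""
    return keyword, attribute_text
```

With known_keywords set to {"LINERULE"}, the line "$_$_LINERULE 75%" yields the pair ("LINERULE", "75%"), while the same text preceded by spaces yields None and would be passed through as end-user text.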


1.3 In-line tags

In-line tags may occur anywhere within end-user text, but not on directive lines or inside other in-line tags. In-line tags have the form

[[<keyword> <attribute_list>]]

that is, the <keyword> and its <attribute_list> lie between the "[[" and "]]" delimiters. Initially the start and end delimiters must lie on the same line of the source text.

The <keyword> must be recognised, even if the tag is not yet fully implemented.

Note, the delimiters "[[" and "]]" are themselves dynamically configurable.

The format of the <attribute_list> depends on the particular tag, but a general description is given in 1.4.
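
As a rough illustration of the form described above, the following Python sketch scans one source line for in-line tags. The function name find_inline_tags is hypothetical, and the delimiters are passed as parameters only to reflect that they are configurable. The sketch makes no attempt to check that the keyword is recognised, and it does not handle a closing delimiter that appears inside a quoted attribute.

```python
def find_inline_tags(line, open_delim="[[", close_delim="]]"):
    """Return a list of (keyword, attribute_text) pairs found on one line.

    Hypothetical helper for illustration; not the software's own parser.
    """
    tags = []
    pos = 0
    while True:
        start = line.find(open_delim, pos)
        if start == -1:
            break
        end = line.find(close_delim, start + len(open_delim))
        if end == -1:
            # No closing delimiter on this line, so there is no complete tag.
            break
        inner = line[start + len(open_delim):end].strip()
        # Assumption: the keyword runs up to the first whitespace.
        parts = inner.split(None, 1)
        keyword = parts[0] if parts else ""
        attribute_text = parts[1] if len(parts) > 1 else ""
        tags.append((keyword, attribute_text))
        pos = end + len(close_delim)
    return tags
```

Because the scan works on a single line at a time, a tag whose delimiters fall on different lines is never matched, which mirrors the restriction stated above.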


1.4 Tag attribute lists

The <attribute_list> should be a comma-delimited set of attribute values. The number and types of attributes expected will depend on the tag concerned.

Some tags allow attributes to be optionally omitted, and a default value used instead. If the attribute being omitted is not the last on the list, then a place-saving comma should be supplied.

Examples

      TAG 1,2,3,4     // full list
      TAG 1,,3,4      // argument 2 is missing
      TAG 1,,3,       // arguments 2 and 4 are missing
      TAG 1,,3        // arguments 2 and 4 are missing
      TAG 1           // arguments 2, 3 and 4 are missing

If a mandatory argument is missing (one for which no default value is permitted), a TAG_ERROR will be signalled.

Each attribute in the list will be of a particular type. Supported types are

If a particular attribute value is incorrect for the expected type, then a TAG_ERROR will be signalled.

String attributes should be placed in quotes if they contain commas themselves. It's probably good practice to place them in quotes in any case. Quotes within string attributes should be doubled up, e.g.

"This string has the word ""quotes"" in quotes"

becomes

This string has the word "quotes" in quotes

Here are some examples

      TAG 1,2                 // arg 1 is '1', arg 2 is '2'
      TAG ,2                  // arg 1 is missing, arg 2 is '2'

      TAG "one",2             // arg 1 is 'one', arg 2 is '2'
      TAG "say one",2         // arg 1 is 'say one', arg 2 is '2'

      TAG "a,b", 2            // arg 1 is 'a,b', arg 2 is '2'
      TAG "say ""one""", 2    // arg 1 is 'say "one"', arg 2 is '2'

      TAG                     // both arguments missing, defaults will be used
      TAG ,                   // both arguments missing, defaults will be used
      TAG ,,                  // both arguments missing, defaults will be used
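
The splitting rules in this section (place-saving commas, optional quoting, doubled quotes) can be sketched in Python as follows. This is an illustrative reading of the rules, not the software's own parser: the function name split_attributes and the expected parameter are hypothetical, omitted attributes are represented as None, and ValueError merely stands in for the TAG_ERROR that the software would signal. Type checking of individual values is not attempted.

```python
def split_attributes(text, expected=None):
    """Split a comma-delimited attribute list into a list of values.

    Omitted attributes become None. If `expected` is given, the result
    is padded (or trailing omitted values trimmed) to that length.
    """
    values = []
    i, n = 0, len(text)
    while True:
        while i < n and text[i].isspace():
            i += 1
        if i < n and text[i] == '"':
            # Quoted string: commas are literal, "" stands for one quote.
            i += 1
            chars = []
            while i < n:
                if text[i] == '"':
                    if i + 1 < n and text[i + 1] == '"':
                        chars.append('"')
                        i += 2
                        continue
                    i += 1
                    break
                chars.append(text[i])
                i += 1
            else:
                raise ValueError("unterminated quoted string")
            value = "".join(chars)
            while i < n and text[i].isspace():
                i += 1
            if i < n and text[i] != ",":
                raise ValueError("unexpected text after closing quote")
        else:
            # Unquoted value: runs up to the next comma or end of text.
            j = text.find(",", i)
            if j == -1:
                j = n
            raw = text[i:j].strip()
            value = raw if raw else None
            i = j
        values.append(value)
        if i >= n:
            break
        i += 1  # step over the comma
    if expected is not None:
        # Trailing place-saving commas may overshoot the expected count.
        while len(values) > expected and values[-1] is None:
            values.pop()
        if len(values) > expected:
            raise ValueError("too many attributes")
        values.extend([None] * (expected - len(values)))
    return values
```

For a tag that takes four attributes, split_attributes("1,,3", expected=4) gives ["1", None, "3", None], leaving it to the caller to substitute defaults or to report an error where a mandatory attribute is None.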


1.5 Error handling

1.5.1 Unrecognised and unimplemented tags

Unrecognised and unimplemented tags will be signalled via messages. These messages will be given different severities. The software allows messages to be filtered by severity and type, so in this way the testing and production versions of the software can be made to report differently.


1.5.2 Parse errors

All failures to fully parse a tag will be reported. The actual error recovery (if any) will vary on a tag-by-tag basis. Usually the tag will simply be ignored, removed from the end-user text and signalled as in error. Occasionally a certain default behaviour may be possible.

For example, a failure to fully parse the attribute list of a LINERULE tag would probably default to outputting a simple <HR> into the HTML, rather than completely ignoring the LINERULE request.
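
The fallback behaviour described for LINERULE might look something like the Python sketch below. Everything specific here is an assumption made for illustration: the function name render_linerule, the idea that the single attribute is a percentage width, the WIDTH attribute in the generated HTML, and the wording of the error message are not taken from the software.

```python
import re


def render_linerule(attribute_text):
    """Return (html, error_message) for a LINERULE tag.

    Hypothetical sketch: on a parse failure it falls back to a plain
    <HR> and reports the problem instead of dropping the rule.
    """
    text = attribute_text.strip()
    # Assumption: the attribute is a percentage width such as "75%".
    match = re.fullmatch(r"([0-9]{1,3})%", text)
    if match and 1 <= int(match.group(1)) <= 100:
        return '<HR WIDTH="%s">' % text, None
    # Parse failure: signal the error but still emit a simple rule.
    message = "TAG_ERROR: could not parse LINERULE attributes %r" % attribute_text
    return "<HR>", message
```

The point of returning both values is that the caller can write the fallback rule into the output and still pass the message on to whatever reporting mechanism is in use.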


1.6 Tagging restrictions

Directives must be on a line by themselves.

You can have multiple in-line tags on one line, but no single in-line tag may be spread over multiple lines.



Converted from a single text file by AscToHTM
© 1997-2001 John A. Fotheringham