This document is about a type of markdown format that
I use for my own documentation. This document also serves as
a test document for the scripts «mark.latex.pss» and «mark.html.pss»
etc. Those 2 scripts use a word-by-word parsing technique with the
pep/nom pattern engine to recognise the patterns in the document.
It is also important that this document should include 'insignificant'
patterns in order to test the ability of the mark scripts to
reduce tokens properly. Since this is a mark-down format, errors
in formatting should not cause the parser to crash or produce no output.
That means the parser should recover gracefully in all cases and at
least produce plain-text output (if not properly formatted).
This format is simply one that I enjoy using, and I don’t claim that
it has any general merit or that it is in any way 'better' than
the standard markdown or CommonMark format.
Images
Images are inserted into documents with a double square
bracket [[ followed by an image filename such as ../img/parsetree.png
and a closing ]]. The image tokens like [[ ]] and ../img/parsetree.png need to be
space delimited. An example image:
Images can also have a caption quotation, a width specifier and a
page position (float) indicator, in that order. Apart from the image
file name the other attributes are all optional, so we can specify
just the position indicator, for example.
The position indicators are currently >>> (float right), <<< (float left)
and ccc (center align).
Image widths can be measured in %, pt, cm, mm or em.
An example image format.
apple2e.png logo.spirals.png pp.interactive.screenshot.png logo.circles.jpg logo.tricircle.png logo.lang.ibm.png parsetree.png
An image with 50% page width
Old versions of LaTeX don’t always like dots in file names, but mark.latex.pss
should deal with this.
A centered image at 20% width
A centered image
An image floating left at 20% width
Spirals
An image floating right at 40% width
An image with width measured in 60pt points
Whitespace in the image format should be ignored (except within quotes),
so the following is a valid image format.
[[
../img/logo.tricircle.png
>>>
]]
See if it works
Captions for images
Captions for an image can be provided after the file name,
with text within """ (3 quotes). Only single-line captions are
allowed at the moment.
An example is
The Apple 2e logo
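As a rough illustration of the image format described above, here is a small Python sketch that splits one image into its parts. It is not how the mark scripts work (those use the pep/nom engine). The name parse_image and the returned dictionary are invented for this example. The sketch assumes that the width comes before the position indicator, and it is lenient about where the caption sits.

```python
import re

# Position indicators named in this document.
POSITIONS = {">>>": "right", "<<<": "left", "ccc": "center"}
# A number followed by one of the documented units.
WIDTH = re.compile(r"^\d+(\.\d+)?(%|pt|cm|mm|em)$")
# Text within 3 quotes; '.' does not match a newline, so only
# single-line captions are recognised.
CAPTION = re.compile(r'"""(.*?)"""')


def parse_image(text):
    # Hypothetical helper: returns a dict, or None if the text is not an image.
    text = text.strip()
    if not (text.startswith("[[") and text.endswith("]]")):
        return None
    body = text[2:-2]
    caption = None
    match = CAPTION.search(body)
    if match:
        caption = match.group(1).strip()
        body = body[:match.start()] + " " + body[match.end():]
    # Whitespace (including newlines) outside the caption is ignored.
    tokens = body.split()
    if not tokens:
        return None
    filename, rest = tokens[0], tokens[1:]
    width = position = None
    if rest and WIDTH.match(rest[0]):
        width = rest.pop(0)
    if rest and rest[0] in POSITIONS:
        position = POSITIONS[rest.pop(0)]
    if rest:
        return None  # unknown trailing tokens: not a valid image
    return {"file": filename, "caption": caption,
            "width": width, "position": position}
```

When parse_image returns None, a caller could simply emit the text unchanged, which matches the goal that formatting errors should still produce plain-text output.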
Keyboard keys
Typing Enter or Ins should render as some sort of
keystroke. This is not very important.
Code lines and code blocks
Code blocks open with at least 3 '-' characters starting a line
and close with at least 3 ',' characters. So ---- and ,,,,, do not delimit
a code block here, because the ---- does not start a line.
Code lines are marked by >> also starting a line.
Code lines and code blocks can be preceded by a line starting with
an asterisk (*). The asterisk line is considered to be the description
of the following code. Here are some examples:
Some logo code to make a square.
repeat 4 [ fd 40 rt 90 ]
Logo code to make an octagon
repeat 8 [
fd 40 rt 45
]
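The rules for code lines, code blocks and asterisk descriptions can be sketched in Python as follows. This is only an illustration, not the pep/nom implementation. The name find_code is invented here. The sketch assumes that >> must be followed by a space (so that a >>> position indicator on its own line is not mistaken for a code line), and that the closing commas start their own line.

```python
def find_code(lines):
    # Hypothetical helper: returns a list of (description, code) pairs.
    results = []
    description = None
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith("*"):
            # An asterisk line describes the code that follows it.
            description = line[1:].strip()
        elif line.startswith(">> "):
            results.append((description, line[2:].strip()))
            description = None
        elif line.startswith("---"):
            # Collect lines until one that starts with at least 3 commas.
            j = i + 1
            block = []
            while j < len(lines) and not lines[j].lstrip().startswith(",,,"):
                block.append(lines[j])
                j += 1
            if j < len(lines):
                results.append((description, "\n".join(block)))
                i = j
            # If no terminator is found, the dashes are left as ordinary text.
            description = None
        elif line.strip():
            # Any other non-blank line cancels a pending description.
            description = None
        i += 1
    return results
```

An unterminated block is deliberately ignored rather than treated as an error, in keeping with the idea that the parser should recover gracefully.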
Section headings
Section headings consist of all upper-case lines. Quotes are
also allowed in section headings like this
COMMAND "PUSH"
«subsection» headings
Subsections have the same format as section headings but end
the line with 4 dots like this ....
Currently no sub-sub-section headings are supported, because I
don’t use them.
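A minimal Python sketch of the heading rules, for illustration only. The name classify_heading is invented here, and the sketch assumes that "all upper-case" means the line has at least one letter and no lower-case letters, so quotes, digits and spaces are allowed.

```python
def classify_heading(line):
    # Hypothetical helper: returns (kind, title), or None for ordinary text.
    text = line.strip()
    is_subsection = text.endswith("....")
    title = text[:-4].rstrip() if is_subsection else text
    letters = [c for c in title if c.isalpha()]
    if not letters or any(c.islower() for c in letters):
        return None
    return ("subsection" if is_subsection else "section", title)
```

A line made only of dots has no letters, so it is not treated as a heading.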
Links and file names
File names and folders should be automatically linked or formatted
by looking at the file name extension and maybe a leading slash.
So /books/pars/index.txt should be marked up as a filename.
Any text beginning with www. or http:// or https:// etc should
probably be linked in HTML output or rendered differently
in other formats.
Source code for the pep/nom system is at http://bumble.sf.net/books/pars/
The site www.craftinginterpreters.com has links to a good book
about compiler and interpreter writing.
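The word-level recognition of links and file names could look something like the Python sketch below. It is a heuristic illustration, not the logic of the mark scripts. The name classify_word and the set KNOWN_EXTENSIONS are assumptions made for this example; the real scripts may recognise a different set of extensions.

```python
URL_PREFIXES = ("www.", "http://", "https://")
# Assumed extension list, taken from file names that appear in this document.
KNOWN_EXTENSIONS = {"txt", "png", "jpg", "pss", "html", "c"}


def classify_word(word):
    # Hypothetical helper: returns "link", "file" or "text".
    core = word.rstrip(".,;:!?")  # drop trailing sentence punctuation
    if core.startswith(URL_PREFIXES):
        return "link"
    if core.startswith("/") and len(core) > 1:
        return "file"  # a leading slash suggests a file or folder
    name, dot, extension = core.rpartition(".")
    if dot and name and extension.lower() in KNOWN_EXTENSIONS:
        return "file"
    return "text"
```

Checking a fixed extension list keeps ordinary words and numbers such as 3.14 from being marked up as file names, at the cost of missing unknown extensions.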
Lists
This 'plain-text' format supports ordered, unordered and definition
lists, but not, currently, nested lists. Ordered lists start with o/-
at the beginning of a line and each item starts with a - dash
character. Unordered lists start with u/- and definition lists start
with d/-. The lists are terminated with a blank line.
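The list rules above can be sketched in Python like this. It is only an illustration under some assumptions: the name parse_list is invented, the start token (o/-, u/- or d/-) is taken to open the first item, and a new item is recognised by a dash followed by a space.

```python
LIST_KINDS = {"o/-": "ordered", "u/-": "unordered", "d/-": "definition"}


def parse_list(lines):
    # Hypothetical helper: parses one list starting at lines[0].
    # Returns (kind, items), or None if lines[0] does not start a list.
    if not lines or lines[0][:3] not in LIST_KINDS:
        return None
    kind = LIST_KINDS[lines[0][:3]]
    items = [lines[0][3:].strip()]
    for line in lines[1:]:
        stripped = line.strip()
        if not stripped:
            break  # a blank line terminates the list
        if stripped == "-" or stripped.startswith("- "):
            items.append(stripped[1:].strip())
        else:
            # Items can span multiple lines: join the continuation.
            items[-1] = (items[-1] + " " + stripped).strip()
    if items == [""]:
        items = []  # a list with nothing in it gives an empty list
    return kind, items
```

Nested lists are not handled, matching the current state of the format.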
Ordered lists
First item in an ordered list
second item
within lists other markup should be rendered
like filenames such as file.png and even images
Empty items may make a list nest in LaTeX.
Need to investigate.
A list item with a code block
repeat 4 [ rt 90 ]
second item containing a code line and description
Markup in lists
continue; break;
In the nomdoc format an emline* token is problematic
because it looks forward for the next token, which
causes problems in lists
Here is trouble.
- within lists other markup should be rendered
like filenames such as file.png and even images
Unordered lists
unordered lists start with u/-
each item has a '-' dash and
can span multiple lines
the list is terminated with a blank line.
lists cannot be nested at the moment.
Definition lists
term
definition
item
each item (term and definition) begins with a dash '-' character,
and can span multiple lines.
colon
The ':' character is used to delimit the definition
term from the definition. The newline character also can be
used to start the definition. Definitions can contain other
markup like code definitions
repeat 8 [ rt 45 fd 20 ]
And also filenames like mark.format.txt
no definition
empty
delimiter
starts with a 'd/-'
the definition list ends with
a blank line (a line consisting of only whitespace).
if a list has nothing in it, it should produce an empty list
empty
Special words
It can be enjoyable to mark up certain words like LaTeX with
special formatting, maybe even including a small icon. Candidates
would be Instagram, CommonMark, LaTeX, Pep.
Date lists
Often I write a series of entries under dates in a format
such as 12 aug 2022 on a line by itself. These can be parsed and
translated just like the ordered/unordered/definition lists.
These need to end with a special token since they can contain
blank lines. Also dates like jan 2000 on a new line should
become dates. Month names can be a 3 letter abbreviation like:
jan feb mar apr may jun jul aug sept oct nov dec
or else the full month name in English like:
january february march april may june july august
september october november december
The recognition of the month name is case insensitive so:
jan feb march april may june
should all be seen as month names.
However, invalid dates like 33 jan 2001 should not be parsed as
dates. A date list must begin with a blank line and end with
a special token like [/dates] or [/date] or maybe just a double
blankline.
The date must be in the order day month year, so
mar 23 2000
is not considered a valid date, nor is
1999 aug 20
Also, 35 mar 2000 is not a valid date, because the day number is out of range.
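The date rules above (day month year order, case-insensitive month names, day number in range) can be checked with a short Python sketch. It is an illustration only. The name parse_date is invented, a 4 digit year is assumed, and dates without a day number are rejected, which is the current behaviour described in this document.

```python
import datetime
import re

# Month names exactly as listed in this document (note "sept").
ABBREVIATIONS = "jan feb mar apr may jun jul aug sept oct nov dec".split()
FULL_NAMES = ("january february march april may june july august "
              "september october november december").split()
MONTHS = {}
for number, (short, full) in enumerate(zip(ABBREVIATIONS, FULL_NAMES), start=1):
    MONTHS[short] = number
    MONTHS[full] = number

# The date must start the line; other text may follow it.
DATE = re.compile(r"^(\d{1,2})\s+([A-Za-z]+)\s+(\d{4})\b")


def parse_date(line):
    # Hypothetical helper: returns a datetime.date, or None if invalid.
    match = DATE.match(line)
    if not match:
        return None
    day, month_name, year = match.groups()
    month = MONTHS.get(month_name.lower())
    if month is None:
        return None
    try:
        # datetime.date rejects out-of-range days such as 33 jan.
        return datetime.date(int(year), month, int(day))
    except ValueError:
        return None
```

Letting datetime.date do the range check also catches cases like 30 feb, not only day numbers above 31.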
Following is a set of test lists, to ensure that the datelist* token
is parsing correctly.
A completely empty datelist
10 aug 2001
...
A list with a single word
10 aug 2001
test
A list with a single word and blank line
10 aug 2001
test
A list with 2 empty dates
10 aug 2001
...
...
A date list with blank lines
10 aug 2001
blank lines
...
...
A list with a code line in it
3 aug 2001
include code in list
second list. see mark.format.txt for info.
Maybe include lists in datelists. But not star lines
at the moment.
...
A datelist with a star line just before the end
2 aug 2001
...
Datelist with a list in a date
A datelist with star/code block
2 aug 2001
things done
debug
think
sed s///g
1 January 2010
worked on this system. Blank lines should be allowed within
date lists. The date-list has no special start token, just
a valid date. The first date in the list needs to be on
a line by itself and with a blank line above it. The next
date items can have text after them.
some code
Thought about a logo language parser and drawing with it
in TCL/TK or java.
worked on this file mark.format.txt to provide documentation
for the mark.latex.pss script.
aug 2022
dates without a day number are not (currently) valid dates.
But the date needs to start a line so that:
1 jan 2022 is valid even though there is this text after it.
The date list has a special end token '[/dates]'.
Glossaries and «faqs»
Although I don’t use these lists much it could be handy to
have them in the format. The translation should be similar to
definition lists.
see the «palindrome.pss» file or the /tr folder.
" multiword quoted text""and"
" no end quote
""""
Open the document «mark.html.pss» or inspect pep.c
<new>
www.glintbox.org/file.txt
[[ <code>/test.html</code>
]]