An explanation of the text-format which is rendered by the NOM script eg/text.tohtml.pss
The world definitely doesn’t need another minimal markup format. That much we know. But NOM is good at parsing and transpiling certain grammars, and less good at others (to put it politely). And since I write quite a bit, I like to use my own format which looks good to me.
As with other “minimal markup” formats, the whole idea is to allow the author to write and not get distracted by fancy formatting or the visual clutter of markup languages (like HTML, LATEX, postscript etc).
This document, apart from explaining the text format which can be rendered into [html], also serves as a test document for the NOM script /eg/text.tohtml.pss
Links are pretty important in web docs, so I have included lots of ways to do this.
There are lots and lots of link formats which I find convenient. Here is a list:
add
command which is on this site at www.nomlang.org
"search engine" www.google.com and it should render as search engine
"Gauss" wp://carl.friedrich.gauss and you should get Gauss
rosetta://balanced.brackets or even
rosetta:balanced.brackets And you will get a link to the balanced_brackets code on that site. And yes, you can put some quoted text in front of it like this
"the balance brackets problem" rosetta://balanced.brackets and you should get the balance brackets problem
eg/text.tohtml.pss into GO or JAVA or TCL or [ruby] or [javascript] and run it without even touching the PEP interpreter? I know, bizarre hey? But true.
"the nom language" http://nomlang.org and you will get the nom language as you would expect.
The formatter makes a link to a local file just with some quoted text and the file name
"The text.tohtml.pss format" text.tohtml.format.txt will render as The text.tohtml.pss format
or a www url: "A www link" www.c3.org A www link
rosetta:// rosetta: link to rosettacode.org problem
urbandict:// urban dictionary
oed:// oxford english dict online
Also, Wikipedia links should work with both the wp:// and wp: prefixes. All the links can be preceded by some double quoted text (with a maximum length of one line) which will become the visible text of the link. For example
"click here" wp://parser for more info
"BNF" wp:backus-naur_form
which should render as BNF
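As a rough illustration of how the schema links could be expanded, here is a small Python sketch (not the nom script itself). The base URLs, the function name expand_link, the dot-to-underscore rule and the default link text are all assumptions made for the example; only the wp and rosetta prefixes and the optional quoted text come from the description above.

```python
# Hypothetical base URLs: assumptions for illustration, not taken from the script.
SCHEMAS = {
    "wp": "https://en.wikipedia.org/wiki/",
    "rosetta": "https://rosettacode.org/wiki/",
}

def expand_link(word, text=None):
    """Turn schema:name or schema://name into an HTML anchor, else return None."""
    schema, sep, name = word.partition(":")
    if not sep or schema not in SCHEMAS or not name.lstrip("/"):
        return None
    if name.startswith("//"):
        name = name[2:]            # the "//" is optional, so just drop it
    # Assumption: dots in the name stand for underscores in the page name.
    target = SCHEMAS[schema] + name.replace(".", "_")
    # Assumption: without quoted text, the name itself becomes the link text.
    label = text if text is not None else name
    return f'<a href="{target}">{label}</a>'
```

For example, expand_link("wp:backus-naur_form", "BNF") gives an anchor whose visible text is BNF, while a plain word or an unknown schema returns None and would be left alone.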
google://"distance hobart to colombia"
google:"distance hobart to colombia"
"distance to hobart" google:"distance hobart to colombia"
www.google.com/search?q=distance+hobart+to+colombia
www.google.com/search?q=distance+hobart+to+colombia
distance to hobart
The double quote must immediately follow the :
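The google: rule can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the nom code; the function name google_url and the regular expression are made up for the example. It uses the standard library function quote_plus to turn spaces into plus signs.

```python
import re
from urllib.parse import quote_plus

# The double quote must come straight after "google:" or "google://".
GOOGLE = re.compile(r'google:(?://)?"([^"\n]+)"')

def google_url(text):
    """Return the search URL for the first google:"..." link, or None."""
    match = GOOGLE.search(text)
    if match is None:
        return None
    return "www.google.com/search?q=" + quote_plus(match.group(1))
```

Limiting the quoted text to one line ([^"\n]+) mirrors the one-line limit on quoted link text.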
I can link to the documentation for a Nom script language command
with push
or The Nom language 'push' command
The pop; command is to pop
the stack
The stack is also a part of the PEP machine
The tape is also part of the machine
We can link to the wikipedia context-free language page
Or look up the definition of gargle which is a throat noise.
Look up the urban dictionary, eg automagically. But you don’t have to write :// if you don’t want to. Just write urbandict:blamestorming, for example: blamestorming
Actually the Urban-dictionary just uses a space for multiple words like this: url?term=humble brag or url?term=humble%20brag but this is a slight problem for our plain-text parser /eg/text.tohtml.pss because it is a word parser (mainly). I could copy the code for the google:// schema to solve this (by “parsing ahead”) but I am not sure if I will.
Unordered lists have been implemented (18 mar 2025) and are indicated by items starting with the '- ' word which starts the line. There is no list-start token (just - ). Lists are terminated by a blank line or by the end of the file.
Lists should be able to contain any other element such as links, images, code-lines and blocks and even headings (which is rather silly). There is no 'list heading' format at the moment.
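The list rules can be sketched in Python as follows. This is a simplified illustration and not the nom implementation: the function name render_lists is hypothetical, and treating a non-blank line inside a list as a continuation of the previous item is an assumption. Item text is passed through untouched so that links, images and the like could still be rendered by later steps.

```python
def render_lists(text):
    """Wrap runs of '- ' lines in <ul>; a blank line or end of file closes the list."""
    out = []
    items = None                      # None while we are outside a list

    def close(list_items):
        return "<ul>" + "".join(f"<li>{item}</li>" for item in list_items) + "</ul>"

    for line in text.split("\n"):
        if line.startswith("- "):
            if items is None:
                items = []            # there is no list-start token, just "- "
            items.append(line[2:])
        elif items is not None and line.strip():
            # Assumption: a non-blank line continues the previous item.
            items[-1] += " " + line.strip()
        else:
            if items is not None:     # a blank line terminates the list
                out.append(close(items))
                items = None
            out.append(line)
    if items is not None:             # end of file also terminates the list
        out.append(close(items))
    return "\n".join(out)
```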
Here I am following the markdown style of using asterisks “*” to delimit emphasis of a word or of a phrase. The emphasis cannot extend over several lines.
Also you can use “** double asterisks**” which should boldly emphasize text like this double asterisks or just emphasise a single word
I have limited these formats to a single line so that the entire document doesn’t become italic or italic-bold if you forget to put in the last “*” character. The same applies to double quoted text.
*italic text* or italic *word*
**bold italic text** or bold italic **word**
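The single-line limit is easy to show with regular expressions. The Python sketch below is only an illustration: the name emphasize and the choice of <em> and <strong> tags are assumptions, since the actual HTML produced by the script is not shown here.

```python
import re

# Neither pattern may cross a newline, so a forgotten closing "*"
# cannot turn the rest of the document italic.
BOLD = re.compile(r"\*\*([^*\n]+)\*\*")
ITALIC = re.compile(r"\*([^*\n]+)\*")

def emphasize(text):
    """Render **bold italic** first, then *italic*, within a single line."""
    text = BOLD.sub(r"<strong><em>\1</em></strong>", text)
    return ITALIC.sub(r"<em>\1</em>", text)
```

The double-asterisk pattern has to run first, otherwise the single-asterisk pattern would consume the inner pair of asterisks.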
As John McEnroe used to say to the umpires: You cannot be serious!!
There are no ordered lists at the moment, only the unordered ones. I will only add them if I really am going to use them. And I probably will
#: is a comment
>--- (starting the line, end with space)
Which should look like this (in Html just a <hr/> element).
>> (not starting a line)
force a break
and make html put
a <br/> element
there.
I try to render apostrophes in words like doesn’t can’t isn’t in some slightly elegant way. Also, since I often leave them out, I try to insert them in words like
isnt doesnt thats its whats and so on which should look like isn’t doesn’t that’s it’s what’s
Writers of LATEX and html formatters seem to spend too much time trying to produce nice “quotes". It can be nice to change” dont into don't with curly quotes, or <cant> into can’t etc. A normal apostrophe is like this's.
I’m you’re he’s she’s it’s we’re they’re aren’t can’t couldn’t didn’t doesn’t hadn’t hasn’t haven’t
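Both apostrophe tricks can be sketched in Python with a small word table. This is an illustration only: the table below holds just the words mentioned in this section, the function names are hypothetical, and the real script may work quite differently. Note that blindly turning "its" into "it’s" is wrong for the possessive, which is a trade-off of this simple approach.

```python
import re

# Words written without their apostrophe, mapped to the curly version.
FIXES = {
    "isnt": "isn’t", "doesnt": "doesn’t", "thats": "that’s",
    "its": "it’s", "whats": "what’s", "dont": "don’t", "cant": "can’t",
}
WORD = re.compile(r"\b(" + "|".join(FIXES) + r")\b")

def add_apostrophes(text):
    """Insert the missing apostrophe into whole words found in FIXES."""
    return WORD.sub(lambda match: FIXES[match.group(1)], text)

def curl_apostrophes(text):
    """Replace a straight apostrophe between two word characters with a curly one."""
    return re.sub(r"(?<=\w)'(?=\w)", "’", text)
```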
Quoted text can occur before all links which changes the link text eg: The nom command
Multiline quotes begin with """ starting the line, and also end with """ anywhere. These quotes will be rendered as a quotation.
Also, we can precede a multiline quote with a line beginning with “*” and that will be formatted as the citation for the blockquote. I think the """ triple quotes have to start a line and be followed by at least 1 space, but the terminating """ doesn’t have to.
* John McEnroe, Tennis player
"""
you cannot be serious
"""
Which should look like this.
You cannot be serious! John McEnroe, Tennis player
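A minimal Python sketch of the quotation rule might look like this. It is an illustration, not the nom code: the name render_quotes and the <blockquote> and <cite> output are assumptions, and the sketch accepts either a space or a newline after the opening triple quotes.

```python
import re

# An optional "* citation" line, then """ at the start of a line,
# then everything up to the next """ (which may appear anywhere).
QUOTE = re.compile(
    r'^(?:\* (?P<cite>[^\n]*)\n)?"""(?=[ \n])(?P<body>.*?)"""',
    re.M | re.S,
)

def render_quotes(text):
    """Replace each multiline quote (and its citation line) with a blockquote."""
    def replace(match):
        body = " ".join(match.group("body").split())   # collapse whitespace
        cite = match.group("cite")
        tail = f" <cite>{cite.strip()}</cite>" if cite else ""
        return f"<blockquote>{body}{tail}</blockquote>"
    return QUOTE.sub(replace, text)
```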
The actual (quite nice) formatting of these “quotes” is done by the css in /site.blog.css which should be available to all blogs which are created with the blog script
My plain text image format uses < and > to indicate the image and its attributes. The format is
<:corners:size:alignment:filepath.ext>
The primitive static site generator script /webdev/copy.blog.sh also substitutes the image “/image/test.aux” for a random image from the /image/ folder.
<:R:>>:/image/test.aux>
I use the format “caption text” <image.file.ext> or <image.file.ext> for no caption. And maybe even <:9:image.file.ext> for a width specification
For the css attributes to make the image circular, it first needs to be cropped to a square. Otherwise it will look like an egg etc. Images can be included in a document.
enclose the image file name in <> brackets: <imagefile.jpg>
Include a caption with quoted text "An image caption" <imagefile.jpg>
The image above has no attributes. It will probably just go to the left hand side of the page.
Images without captions are working. There is a bit of a tricky interaction between the <figure> tag and the <img> tag: the figure tag is required for the caption, but the way to float left or right is slightly different. Also, we need to have a width for images in <figure> tags. I wonder if images float to the left of the text above them, or below them. The answer is, they float around the text below them, I think. I also wonder if paragraph tags <p/> make any difference.
Also, it is a good idea to have a sensible default size for an image, otherwise it can take over the whole HTML page.
Apart from a caption images can have a set of optional attributes
This text is between the circular horse image and the circular duck which is floating right.
I don’t think you can use em measurements for images, but I am not sure. The image below is a duck and star with no float.
<:0:>>:image/doodle.network.200w.jpg>
<:o:>>:image/imagename.jpeg> # does the same
<:O:>>:imagename.jpeg>
<:<<:50%:imagename.jpeg>
<imagename.jpeg>
duck (small rounded corner, float left)
The text is supposed to “flow around” floating images. A centre aligned image with big rounded corners with 50% width. But actually :cc: will do nothing because I am not parsing it, just discarding it.
Text should not flow around centre aligned images.
Attributes need to appear in the correct order but all attributes are optional.
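To make the image syntax concrete, here is a hedged Python sketch that splits an image tag into its parts, using the attribute tokens described in this section. It is not the nom implementation: the name parse_image_tag and the returned dictionary are hypothetical, and the sketch recognises each attribute by its token rather than by its position, which is more lenient than the fixed order the format asks for. File paths containing a colon are not handled.

```python
CORNERS = {"O": "circular", "o": "circular",
           "r": "small rounded", "R": "large rounded"}
# The script itself discards "cc"; this sketch simply records it.
ALIGN = {">>": "float right", "<<": "float left", "cc": "centre"}

def parse_image_tag(tag):
    """Split <:attr:attr:path> into a dict; every attribute is optional."""
    if len(tag) < 3 or not (tag.startswith("<") and tag.endswith(">")):
        raise ValueError("not an image tag: " + tag)
    body = tag[1:-1]                  # only strip the outermost < and >
    image = {"corners": None, "width": None, "align": None, "path": body}
    if not body.startswith(":"):
        return image                  # plain <imagefile.jpg>, no attributes
    fields = body[1:].split(":")
    image["path"] = fields[-1]        # the file path is always the last field
    for field in fields[:-1]:
        if field in CORNERS:
            image["corners"] = CORNERS[field]
        elif field in ALIGN:
            image["align"] = ALIGN[field]
        elif field:
            image["width"] = field    # assumption: anything else is a width
    return image
```

Only the outer brackets are stripped because the alignment tokens >> and << themselves contain the bracket characters.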
<:corners:width:alignment:imagename.ext>
O|o circular image
# width is multiple of 5em?
# 5emxN width specification.
r R small/large rounded corners
>> << cc float right/ float left/ centre align
<:>>:image/file.jpg>
One of the points of this format is to document code. So I need to be able to include code in the text document. For blocks of code (multiple lines) I start the line with --- (at least 3 dashes) and end the block with a line starting with ,,, (at least 3 commas).
For a single line of code I start the line with
(with at least 1
space after)
The format allows providing a caption for the code (above it) by including a line starting with “*” (followed by a space) above the code line or block.
nom script that deletes whitespace
read; [:space:] { clear; }
print;
nom script
pep -e "read; print; clear;" -i "abcd"
To display and possibly pretty print NOM code use ---+ and ,,,, to delimit the code instead of ---
The pretty printing is done by /eg/nom.snippet.tohtml.pss after the script /eg/text.tohtml.pss is run.
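The block delimiters can be sketched in Python like this. It is a simplified illustration, not the nom script: the name render_code_blocks and the <pre><code> output are assumptions, captions and single code lines are left out, and an unterminated block is assumed to run to the end of the text. Because the test is only "starts with ---", the ---+ variant opens a block too.

```python
import html

def render_code_blocks(text):
    """Wrap lines between a '---' line and a ',,,' line in <pre><code>."""
    out = []
    block = None                       # None while we are outside a code block

    def close(lines):
        return "<pre><code>" + html.escape("\n".join(lines)) + "</code></pre>"

    for line in text.split("\n"):
        if block is None:
            if line.startswith("---"):
                block = []             # at least 3 dashes open a block
            else:
                out.append(line)
        elif line.startswith(",,,"):
            out.append(close(block))   # at least 3 commas close it
            block = None
        else:
            block.append(line)
    if block is not None:              # assumption: unterminated block runs to the end
        out.append(close(block))
    return "\n".join(out)
```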
Markup HTML as an <abbr> with a title (tool tip). There is a big list of these acronyms, like XML C++ JAVA PEP NOM MINIX stdin stdout etc.
The main purpose of them is to make the text kinda more interesting and to provide a quick explanation of an acronym or a technical (programming) idea.
I also had the idea of providing interesting word-derivations as an HTML title= attribute but I haven’t really done any yet except for heuristic. Again, the idea here is just to have fun; none of this is particularly necessary.
It would probably be nice to include an icon after a name such as Wikipedia. Maybe a format like [wikipedia] could do this.
Filenames like /pars/tr/translate.go.pss should be rendered differently. Also, they will be linked if there is some quoted text before them.
Normally in my documents, I use all capital lines as a first level heading. Like this:
SECTION 3
For second level headings I use all capitals followed by 4 dots:
SUBSECTION 1 ....
But for the sake of parsing simplicity, I am just using MARKDOWN headings with 3 levels. So lines starting with # or ## or ### (followed by a space).
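The heading rule is simple enough to show in a few lines of Python. Again this is a sketch and not the nom code; the name render_heading is hypothetical. A line such as "#: is a comment" is left alone because the hash signs must be followed by a space.

```python
import html
import re

# One to three "#" characters, then a space, then the heading text.
HEADING = re.compile(r"^(#{1,3}) (.+)$")

def render_heading(line):
    """Turn '# text', '## text' or '### text' into <h1>, <h2> or <h3>."""
    match = HEADING.match(line)
    if match is None:
        return line
    level = len(match.group(1))
    return f"<h{level}>{html.escape(match.group(2).strip())}</h{level}>"
```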
The HTML formatting is produced by a nom script called text.tohtml.pss which is a simplification of an older script called mark.html.pss and a LaTeX formatter.
The HTML formatter can be translated into other languages such as c, go, java, javascript, perl, python, ruby, and tcl. This can be done using the translation scripts in the /tr/ folder on the nomlang.org site. These scripts are also hosted on sourceforge at bumble.sf.net/books/pars/tr
pep -f tr/translate.go.pss text.tohtml.pss > text.tohtml.go
go build text.tohtml.go
echo ' "nom" http://nomlang.org ' | ./text.tohtml
# or
cat doc.txt | ./text.tohtml > doc.html
pep -f tr/translate.py.pss text.tohtml.pss > text.tohtml.py
chmod a+x text.tohtml.py
cat doc.txt | ./text.tohtml.py > doc.html
Similar translators exist for JAVA, javascript, TCL, c (non-unicode) and others.
There are plenty of other good “plain text” formats and they are all useful for avoiding the torture of writing and rewriting HTML or LATEX or XML or some other maddening markup format. And let's not even talk about wysiwyg word processors.
The big formats are: markdown and the MediaWiki wiki format (from wikipedia). I would like to write translators for my format into markdown and MediaWiki but I will wait until the formatter script is more or less “mature” and has lists and ALL CAPITAL HEADINGS like this.
This format (and script) was begun in Feb 2025 with quite humble intentions but has steadily grown and has been quite simple to debug. This is because I use a mainly word-by-word parsing technique (with some exceptions) which keeps the grammar quite simple. This is why the script /eg/text.tohtml.pss has been quite successful and will almost certainly be a permanent replacement for /eg/mark.html.pss and /eg/mark.latex.pss which became, to be brutally honest, infernally complex because the grammar was complex.
The format has been successfully used on the NOM documentation site www.nomlang.org and also on a general blog that I write called www.shrob.org.