Pep and Nom


the "nom" language and another minimal markup format

An explanation of the text-format which is rendered by the NOM script eg/text.tohtml.pss

The world definitely doesn’t need another minimal markup format. That much we know. But NOM is good at parsing and transpiling certain grammars, and less good at others (to put it politely). And since I write quite a bit, I like to use my own format which looks good to me.

As with other “minimal markup” formats, the whole idea is to allow the author to write and not get distracted by fancy formatting or the visual clutter of markup languages (like HTML, LATEX, postscript etc).

This document, apart from explaining the text format which can be rendered into [html], also serves as a test document for the NOM script /eg/text.tohtml.pss

links

Links are pretty important in web docs, so I have included lots of link formats which I find convenient. Here is a list:

To render a url as a link just write it beginning with http:// etc, delimited by spaces. To change the link text include a quoted word or phrase before it.

The formatter makes a link to a local file just with some quoted text and the file name

"The text.tohtml.pss format" text.tohtml.format.txt will render as The text.tohtml.pss format

or a www url: "A www link" www.c3.org A www link
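The quoted-text-then-url rule can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the NOM script itself: the regular expression, the function name render_links and the choice to prefix www links with http:// are all assumptions.

```python
import html
import re

# Assumed pattern: optional "quoted text", then a word starting with http(s):// or www.
LINK = re.compile(r'(?:"([^"\n]+)"\s+)?((?:https?://|www\.)\S+)')

def render_links(line: str) -> str:
    """Replace each url (with optional quoted text before it) by an <a> element."""
    def repl(m: re.Match) -> str:
        text, url = m.group(1), m.group(2)
        # Assumption: a bare www address gets an http:// prefix in the href.
        href = url if url.startswith("http") else "http://" + url
        return '<a href="%s">%s</a>' % (html.escape(href), html.escape(text or url))
    return LINK.sub(repl, line)

print(render_links('or a www url: "A www link" www.c3.org'))
```

Because the quoted text may not contain a newline, a forgotten closing quote cannot swallow more than one line.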

fake schema links

All the following are ok

    rosetta:// rosetta: link to rosettacode.org problem
    urbandict://  urban dictionary
    oed:// oxford english dict online
  

Also Wikipedia links should work with wp:// and wp: All the links can be preceded with some double quoted text (with a maximum length of one line) which will become the visible text of the link. For example

visible link text
 "click here" wp://parser for more info

make a wikipedia link
 "BNF" wp:backus-naur_form

which should render as BNF
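A rough Python sketch of how a fake schema word such as wp:backus-naur_form could be expanded into a real url. The url templates in SCHEMES and the function name expand are assumptions made for illustration; the real script may build its urls differently.

```python
import re

# Assumed url templates; the real script may use different target urls.
SCHEMES = {
    "wp": "https://en.wikipedia.org/wiki/%s",
    "rosetta": "https://rosettacode.org/wiki/%s",
}

# A schema name, a colon, an optional //, then the rest of the word.
FAKE = re.compile(r'^([a-z]+):(?://)?(\S+)$')

def expand(word: str) -> str:
    """Return the real url for a fake-schema word, or the word unchanged."""
    m = FAKE.match(word)
    if not m or m.group(1) not in SCHEMES:
        return word
    return SCHEMES[m.group(1)] % m.group(2)

print(expand("wp:backus-naur_form"))
print(expand("wp://parser"))
```

Making the // optional in the pattern is what lets wp: and wp:// behave the same way.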

make a google search link

    google://"distance hobart to colombia"
    google:"distance hobart to colombia"
    "distance to hobart" google:"distance hobart to colombia"
  

www.google.com/search?q=distance+hobart+to+colombia
www.google.com/search?q=distance+hobart+to+colombia
distance to hobart

The double quote must immediately follow the :
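The google link rule can be sketched in Python like this. The function name google_url and the regular expression are assumptions; the output url shape is the one shown above.

```python
import re
from urllib.parse import quote_plus

# The double quote must immediately follow the ':' (or the '://').
GOOGLE = re.compile(r'google:(?://)?"([^"\n]+)"')

def google_url(token: str):
    """Return a search url for a google: token, or None if it is not one."""
    m = GOOGLE.fullmatch(token)
    if m is None:
        return None
    # quote_plus turns spaces into '+', as in the urls shown above.
    return "www.google.com/search?q=" + quote_plus(m.group(1))

print(google_url('google:"distance hobart to colombia"'))
```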

I can link to the documentation for a Nom script language command with push or The Nom language 'push' command.

The pop; command is to pop the stack. The stack is also a part of the PEP machine. The tape is also part of the machine.

We can link to the wikipedia context-free language page

Or look up the definition of gargle which is a throat noise.

Look up a word in the urban dictionary, eg automagically. But you don’t have to write :// if you don’t want to. Just write urbandict:blamestorming, for example: blamestorming

Actually the Urban-dictionary just uses a space for multiple words like this: url?term=humble brag or url?term=humble%20brag but this is a slight problem for our plain-text parser /eg/text.tohtml.pss because it is a word parser (mainly). I could copy the code for the google:// schema to solve this (by “parsing ahead”) but I am not sure if I will.
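To show what “parsing ahead” could mean for a word parser, here is a small Python sketch. The quoted syntax urbandict:"humble brag" is hypothetical (it is modelled on the google: format and is not implemented), and the names join_quoted and urbandict_term are invented for this example.

```python
from urllib.parse import quote

def join_quoted(words):
    """Merge words so a token like urbandict:"humble brag" stays in one piece."""
    out, pending = [], None
    for w in words:
        if pending is not None:
            pending += " " + w          # keep reading ahead
            if w.endswith('"'):
                out.append(pending)
                pending = None
        elif ':"' in w and not w.endswith('"'):
            pending = w                 # quote opened but not closed yet
        else:
            out.append(w)
    if pending is not None:             # unterminated quote: emit what we have
        out.append(pending)
    return out

def urbandict_term(token):
    """Build the query part of the url, with spaces encoded as %20."""
    phrase = token.split(":", 1)[1].strip('"')
    return "term=" + quote(phrase)

words = join_quoted('see urbandict:"humble brag" now'.split())
print(words)
print(urbandict_term(words[1]))
```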

lists

Unordered lists have been implemented (18 mar 2025). Each item is a line which starts with the '- ' word. There is no list-start token (just - ). Lists are terminated by a blank line or by the end of the file.

Lists should be able to contain any other element such as links, images, code-lines and blocks and even headings (which is rather silly). There is no 'list heading' format at the moment.
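The list rule (items start with '- ', a blank line or the end of the file closes the list) might look like this in Python. This is a sketch only: the function name render_lists is invented, and treating a non-blank line inside a list as a continuation of the previous item is an assumption, not something the format description states.

```python
import html

def render_lists(lines):
    """Wrap runs of '- ' lines in <ul>...</ul>; other lines pass through."""
    out, items = [], None               # items is None when outside a list

    def close():
        nonlocal items
        if items is not None:
            out.append("<ul>")
            out.extend("<li>%s</li>" % html.escape(i) for i in items)
            out.append("</ul>")
            items = None

    for line in lines:
        if line.startswith("- "):
            if items is None:
                items = []
            items.append(line[2:].strip())
        elif items is not None and line.strip():
            items[-1] += " " + line.strip()   # assumed: continuation of an item
        else:
            close()                           # a blank line ends the list
            out.append(line)
    close()                                   # so does the end of the file
    return out

print(render_lists(["intro", "- one", "- two", "", "after"]))
```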

emphasis

Here I am following the markdown style of using asterisks “*” to delimit emphasis of a word or a phrase. The emphasis cannot extend over several lines.

Also you can use “** double asterisks**” which should boldly emphasize text like this double asterisks or just emphasise a single word.

I have limited these formats to a single line so that the entire document doesn’t become italic or italic-bold if you forget to put in the last “*” character. The same applies to double quoted text.
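The single-line limit is easy to express with a pattern that refuses to cross a newline. A Python sketch follows; the tag choices (<em> for single asterisks, <strong><em> for double) and the name emphasise are assumptions for illustration.

```python
import re

# Neither pattern may cross a newline, so a forgotten '*' only affects one line.
BOLD = re.compile(r'\*\*([^*\n]+)\*\*')
ITALIC = re.compile(r'\*([^*\n]+)\*')

def emphasise(text: str) -> str:
    """Render **bold italic** first, then *italic*."""
    text = BOLD.sub(r"<strong><em>\1</em></strong>", text)
    return ITALIC.sub(r"<em>\1</em>", text)

print(emphasise("*italic text* or italic *word*"))
print(emphasise("an unclosed *asterisk\nstays as *it is"))
```

Double asterisks are handled first so that the single-asterisk pattern never sees them.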

emphasise text

   *italic text* or italic *word*
   **bold italic text** or bold italic **word**
  

As John McEnroe used to say to the umpires: You cannot be serious!!

ordered lists

There are no ordered lists at the moment (unordered lists are described above). I will only add them if I really am going to use them. And I probably will.

miscellaneous

#: starting a line is a comment at the moment. It is removed from the output.
 #: is a comment

make a horizontal "rule" (line) in the document
 >---  (starting the line, end with space)

Which should look like this (in Html just a <hr/> element).


force a line break, this could be a "poor mans list"
 >> (not starting a line)
So we could
force a break
and make html put
a <br/> element there.
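The three line-level rules in this section (comment, horizontal rule, forced break) can be sketched together in Python. The function name render_line is invented, and returning None for a dropped comment line is just a convention of this sketch.

```python
def render_line(line: str):
    """Apply the comment, rule and break markers to one line of text."""
    if line.startswith("#:"):
        return None                     # comments are removed from output
    if line.startswith(">---"):
        return "<hr/>"
    words = line.split(" ")
    # '>>' forces a break only when it is not the first word on the line.
    out = ["<br/>" if w == ">>" and i > 0 else w for i, w in enumerate(words)]
    return " ".join(out)

print(render_line("So we could >> force a break"))
```

Working word by word means that a >> inside an image tag such as <:R:>>:/image/test.aux> is left alone, because it is never a word on its own.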

apostrophes

I try to render apostrophes in words like doesn’t can’t isn’t in some slightly elegant way. Also, since I often leave them out I try to insert them in words like

isnt doesnt thats its whats and so on which should look like isn’t doesn’t that’s it’s what’s
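A small Python sketch of the apostrophe idea. The word table FIXES only holds the words listed above and the function name fix_apostrophes is invented. Note that a blanket its to it’s rule will also change the possessive “its”, which is a known weakness of any table-driven approach like this.

```python
import re

# Only the words mentioned in the text; the real list is presumably longer.
FIXES = {
    "isnt": "isn’t", "doesnt": "doesn’t", "thats": "that’s",
    "its": "it’s", "whats": "what’s",
}
WORD = re.compile(r"\b(%s)\b" % "|".join(FIXES))

def fix_apostrophes(text: str) -> str:
    """Curl straight apostrophes inside words, then insert missing ones."""
    text = re.sub(r"(?<=\w)'(?=\w)", "’", text)
    return WORD.sub(lambda m: FIXES[m.group(1)], text)

print(fix_apostrophes("isnt doesnt thats its whats"))
```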

quotes

Writers of LATEX and html formatters seem to spend too much time trying to produce nice “quotes”. It can be nice to change dont into don’t with curly quotes, or cant into can’t etc. A normal apostrophe is like this's.

I’m you’re he’s she’s it’s we’re they’re aren’t can’t couldn’t didn’t doesn’t hadn’t hasn’t haven’t

Quoted text can occur before all links which changes the link text eg: The nom command

multiline blockquotes

Multiline quotes begin with """ starting the line and end with """ anywhere. These quotes will be rendered as a quotation.

Also, we can precede a multiline quote with a line beginning with “*” and that will be formatted as the citation for the blockquote. I think the """ triple quotes have to start a line and be followed by at least 1 space, but the terminating """ doesn’t have to.

a blockquote with a citation, example

    * John McEnroe, Tennis player
    """ 
      you cannot be serious 
    """
  

Which should look like this.

You cannot be serious! John McEnroe, Tennis player
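Here is a Python sketch of the blockquote rule, assuming (as guessed above) that the opening """ starts a line and is followed by at least one space. The output tags, the name render_blockquotes and the collapsing of whitespace inside the quote are assumptions of this example.

```python
import html
import re

# Optional "* citation" line, then """ at the start of a line plus a space,
# then everything up to the next """ (which may appear anywhere).
BLOCK = re.compile(r'(?:^\* ([^\n]*)\n)?^"""[ \t]+(.*?)"""',
                   re.MULTILINE | re.DOTALL)

def render_blockquotes(text: str) -> str:
    def repl(m: re.Match) -> str:
        cite = m.group(1)
        body = " ".join(m.group(2).split())     # collapse line breaks
        tail = "<cite>%s</cite>" % html.escape(cite.strip()) if cite else ""
        return "<blockquote>%s%s</blockquote>" % (html.escape(body), tail)
    return BLOCK.sub(repl, text)

src = '* John McEnroe, Tennis player\n""" \n  you cannot be serious\n"""\n'
print(render_blockquotes(src))
```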

The actual (quite nice) formatting of these “quotes” is done by the css in /site.blog.css which should be available to all blogs which are created with the blog script

images

My plain text image format uses < and > to indicate the image and its attributes. The format is

 <:corners:size:alignment:filepath.ext>

The primitive static site generator script /webdev/copy.blog.sh also substitutes the image “/image/test.aux” for a random image from the /image/ folder.

put a random image on the page, with big rounded corners, floating left
 <:R:>>:/image/test.aux>

I use the format “caption text” <image.file.ext> or <image.file.ext> for no caption. And maybe even <:9:image.file.ext> for a width specification.

For the css attributes to make the image circular, it first needs to be cropped to a square. Otherwise it will look like an egg etc. Images can be included in a document.

making images.
 enclose the image file name in <> brackets: <imagefile.jpg>
 Include a caption with quoted text "An image caption" <imagefile.jpg>

The image above has no attributes. It will probably just go to the left hand side of the page.

Images without captions are working. There is a bit of a tricky interaction between the <figure> tag and the <img> tag: the figure tag is required for the caption but the way to float left or right is slightly different. Also, we need to have a width for images in <figure> tags. I wonder if images float to the left of the text above them or below them. The answer is, they float around the text below them, I think. I also wonder if paragraph tags <p/> make any difference.

Also, it is a good idea to have a sensible default size for an image, otherwise it can take over the whole HTML page. Apart from a caption, images can have a set of optional attributes.

circular horse (caption, 20em width, default float)

This text is between the circular horse image and the circular duck which is floating right.

A duck (50% border radius, floating right)

duck star (missing 1st ':',small rounded corners,no float,caption)

I don’t think you can use em measurements for images, but I am not sure. The image below is a duck and star with no float.

duck star: big rounded corners,float left, caption

make a circular/ellipse image that floats to the left with width 100px
 <:0:>>:image/doodle.network.200w.jpg>
 <:o:>>:image/imagename.jpeg> # does the same

a circular image with width not specified and a caption
 <:O:>>:imagename.jpeg> 

make a rectangular image that floats right taking up 50% of page width
 <:<<:50%:imagename.jpeg>

make an image with its original size
 <imagename.jpeg>

duck (small rounded corner, float left)

The text is supposed to “flow around” floating images. A centre aligned image with big rounded corners with 50% width. But actually :cc: will do nothing because I am not parsing it, just discarding it.

Text should not flow around centre aligned images.

image text attributes

Attributes need to appear in the correct order but all attributes are optional.

image text attribute order
 <:corners:width:alignment:imagename.ext>

valid image text attribute values.

    O|o circular image
    # width is multiple of 5em?
    # 5emxN width specification.
    r R small/large rounded corners
    >> << cc float right/ float left/ centre align
  

Demonstrating optional attributes.
 <:>>:image/file.jpg>
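Splitting an image word into its parts can be sketched in Python as below. This is not how the NOM script does it: the function name parse_image is invented, and the sketch recognises each attribute by its shape rather than by its position, which is more forgiving than the ordered format described above. The alignment token is kept exactly as written.

```python
def parse_image(token: str):
    """Split an image word like <:R:>>:/image/test.aux> into its parts."""
    if len(token) < 3 or not (token.startswith("<") and token.endswith(">")):
        return None
    inner = token[1:-1]
    attrs = {"corners": None, "width": None, "align": None}
    if inner.startswith(":"):
        *fields, path = inner[1:].split(":")    # the last field is the file path
    else:
        fields, path = [], inner                # no attributes at all
    for f in fields:
        if f in ("O", "o", "r", "R"):
            attrs["corners"] = f
        elif f in (">>", "<<", "cc"):
            attrs["align"] = f                  # kept as written, not interpreted
        elif f:
            attrs["width"] = f                  # anything else is taken as a width
    attrs["path"] = path
    return attrs

print(parse_image("<:R:>>:/image/test.aux>"))
```

Stripping the outer < and > before splitting on ':' is what makes the >> and << alignment tokens harmless.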

code

One of the points of this format is to document code. So I need to be able to include code in the text document. For blocks of code (multiple lines) I start the line with --- (at least 3 dashes) and end the block with a line starting with ,,, (at least 3 commas).

For a single line of code I start the line with
(with at least 1 space after)

The format allows providing a caption for the code (above it) by including a line starting with “*” (followed by a space) above the code line or block.

a nom script that deletes whitespace

    read; [:space:] { clear; } 
    print;
  

run a one line nom script
 pep -e "read; print; clear;" -i "abcd"
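The block delimiters (a line starting with --- opens a code block, a line starting with ,,, closes it) can be sketched in Python like this. The function name render_code_blocks, the <pre> output and the handling of an unterminated block are assumptions; captions are left out to keep the sketch short.

```python
import html

def render_code_blocks(lines):
    """Turn --- ... ,,, blocks into <pre> elements; other lines pass through."""
    out, block = [], None               # block is None when outside a code block
    for line in lines:
        if block is None:
            if line.startswith("---"):
                block = []              # opening delimiter, start collecting
            else:
                out.append(line)
        elif line.startswith(",,,"):
            out.append("<pre>%s</pre>" % html.escape("\n".join(block)))
            block = None
        else:
            block.append(line)
    if block is not None:               # assumed: end of file closes the block
        out.append("<pre>%s</pre>" % html.escape("\n".join(block)))
    return out

doc = ["text", "---", "read; [:space:] { clear; }", "print;", ",,,", "more"]
print(render_code_blocks(doc))
```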

nom code

To display and possibly pretty print NOM code use

 ---+ and ,,,,
to delimit the code instead of ---

The pretty printing is done by /eg/nom.snippet.tohtml.pss after the script /eg/text.tohtml.pss is run.

acronyms

Acronyms are marked up in HTML as an <abbr> with a title (tool tip). There is a big list of these acronyms like XML C++ JAVA PEP NOM MINIX stdin stdout etc.

The main purpose of them is to make the text kinda more interesting and to provide a quick explanation of an acronym or a technical (programming) idea.
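A Python sketch of the acronym idea, using a tiny table. The three entries and the name mark_acronyms are only for illustration; the real list is much bigger.

```python
import html
import re

# A tiny illustrative table; the real list is much bigger.
ACRONYMS = {
    "XML": "Extensible Markup Language",
    "stdin": "standard input",
    "stdout": "standard output",
}
ABBR = re.compile(r"\b(%s)\b" % "|".join(map(re.escape, ACRONYMS)))

def mark_acronyms(text: str) -> str:
    """Wrap known acronyms in <abbr> with the explanation as a tool tip."""
    def repl(m: re.Match) -> str:
        word = m.group(1)
        return '<abbr title="%s">%s</abbr>' % (html.escape(ACRONYMS[word]), word)
    return ABBR.sub(repl, text)

print(mark_acronyms("read XML from stdin"))
```

An entry such as C++ would need a different boundary test, because \b does not match after a '+' character.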

etymologies

I also had the idea of providing interesting word-derivations as an HTML title= attribute but I haven’t really done any yet except for heuristic . Again, the idea here is just to have fun, none of this is particularly necessary.

icons and logos

It would probably be nice to include an icon after a name such as Wikipedia. Maybe a format like [wikipedia] could do this.

filenames

Filenames like /pars/tr/translate.go.pss should be rendered differently. Also they will be linked if there is some quoted text before them.

headings

Normally in my documents, I use all capital lines as a first level heading. Like this:

 SECTION 3
For second level headings I use all capitals followed by 4 dots:
 SUBSECTION 1 ....

But for the sake of parsing simplicity, I am just using MARKDOWN headings with 3 levels. So lines starting with # or ## or ### (followed by a space).
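The three-level heading rule is short enough to sketch in Python. The function name render_heading is invented for this example.

```python
import html
import re

# One to three '#' characters, then a space, then the heading text.
HEADING = re.compile(r"^(#{1,3}) (.*)$")

def render_heading(line: str) -> str:
    """Turn '# ', '## ' and '### ' lines into <h1>..<h3>; leave others alone."""
    m = HEADING.match(line)
    if not m:
        return line
    level = len(m.group(1))
    return "<h%d>%s</h%d>" % (level, html.escape(m.group(2).strip()), level)

print(render_heading("## links"))
```

Requiring the space after the # characters keeps a #: comment line from being mistaken for a heading.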

implementation

The HTML formatting is produced by a nom script called text.tohtml.pss which is a simplification of an older script called mark.html.pss and a LaTeX formatter.

The HTML formatter can be translated into other languages such as c, go, java, javascript, perl, python, ruby, and tcl. This can be done using the translation scripts in the /tr/ folder on the nomlang.org site. These scripts are also hosted on sourceforge at bumble.sf.net/books/pars/tr

how to build the html formatter script with go

    pep -f tr/translate.go.pss text.tohtml.pss > text.tohtml.go
    go build text.tohtml.go
    echo ' "nom" http://nomlang.org ' | ./text.tohtml
    # or 
    cat doc.txt | ./text.tohtml > doc.html
 

create a python version of the formatter and run it

   pep -f tr/translate.py.pss text.tohtml.pss > text.tohtml.py
   chmod a+x text.tohtml.py
   cat doc.txt | ./text.tohtml.py > doc.html
 

Similar translators exist for JAVA, javascript, TCL, c (non-unicode) and others.

other text formats

There are plenty of other good “plain text” formats and they are all useful for avoiding the torture of writing and rewriting HTML or LATEX or XML or some other maddening markup format. And let's not even talk about wysiwyg word processors.

The big formats are: markdown and the MediaWiki wiki format (from wikipedia). I would like to write translators for my format into markdown and MediaWiki but I will wait until the formatter script is more or less “mature” and has lists and ALL CAPITAL HEADINGS like this.

history

This format (and script) was begun in Feb 2025 with quite humble intentions but has steadily grown and has been quite simple to debug. This is because I use a mainly word-by-word parsing technique (with some exceptions) which keeps the grammar quite simple. This is why the script /eg/text.tohtml.pss has been quite successful and will almost certainly be a permanent replacement for /eg/mark.html.pss and /eg/mark.latex.pss which became, to be brutally honest, infernally complex because the grammar was complex.

The format has been successfully used on the NOM documentation site www.nomlang.org and also on a general blog that I write called www.shrob.org.