## The "Nom" language and another minimal markup format

An explanation of the text-format which is rendered by the [nom] script
file://eg/text.tohtml.pss

>:image/doodle.horse.duck.swirl.300.jpg>

The world definitely doesn't need another *minimal markup* format. That
much we know. But [nom] is good at parsing and transpiling certain
grammars, and less good at others (to put it politely). And since I write
quite a bit, I like to use my own format which looks good to me.

As with other "minimal markup" formats, the whole idea is to allow the
author to write and not get distracted by fancy formatting or the
visual clutter of markup languages (like HTML, [latex] , postscript etc).

This document, apart from explaining the text format which can be rendered
into [html], also serves as a test document for the [nom] script
file:///eg/text.tohtml.pss

## LINKS

Links are pretty important in web docs, so I have included lots of ways
to write them. There are lots and lots of link formats which I find
convenient. Here is a list:

- just write the *filename* with a leading */* like this
  /eg/text.tohtml.format.html and that should become a link
- use *link://something* and *file://something* but this seems a bit
  obsolete now given the above.
- link to a [nom] command with *nom://* or *nom:* for example: write
  the text *nom://add* to link to the documentation for the nom://add
  command which is on this site at www.nomlang.org
- just write the url with a *www.* at the start like this *www.nomlang.org*
  which gets rendered as www.nomlang.org
- now amazingly, you can precede any of the above formats with some
  quoted text like this: *"words and text"* (it does have to be double
  quoted) so you can just write
>> "search engine" www.google.com
  and it should render as "search engine" www.google.com
- And all this should work in lists too.
- But we arent finished - not nearly. There are lots of "convenience"
  formats like link to wikipedia with *wp://* or *wp:* .
  So just write *wp://carl.friedrich.gauss* and it should render as a link
  to wp://carl.friedrich.gauss and you may have noticed that I dont like
  writing *underscores* so just write dots instead
- and you can put some link text in front of it like
>> "Gauss" wp://carl.friedrich.gauss
  and you should get "Gauss" wp://carl.friedrich.gauss
- But there are lots of others too ... like link to the urban dictionary
  with *urbandict://* like the word urbandict://palaver or to the Oxford
  English dictionary with *oed://* etc
- But wait there is more. There is *pep://* for the [pep] machine, there
  is *nomsyn://* for [nom] syntax and so on and so forth.
- Then there is the *rosetta://* convenience schema, which links to the
  www.rosettacode.org site. Just write
>> rosetta://balanced.brackets
  or even
>> rosetta:balanced.brackets
  And you will get a link to the rosetta://balanced.brackets code on that
  site. And yes you can put some quoted text in front of it like this
>> "the balance brackets problem" rosetta://balanced.brackets
  and you should get
  "the balance brackets problem" rosetta://balanced.brackets
- again amazingly, the quoted text doesnt even have to be on the same
  line as the url or the fake url. It can be on the line before
- all this power and convenience is brought to you by the fact that [nom]
  is a proper parsing/compiling system, not some dodgy regex engine with
  extensions. So the formatting script, which is /eg/text.tohtml.pss can
  handle all this without any problem.
- Did you know that you can convert the formatting script
  eg/text.tohtml.pss into [go] or [java] or [tcl] or [ruby] or
  [javascript] and run it without even touching the [pep] interpreter?
  I know, bizarre hey? But true.
- If you dont want to autolink a file name then you can write it without
  a starting */* slash, or put it in quotes or *emphasise* it with
  asterisks.
- And of course if you start a url with *http://* or *https://* or
  *nntp://* (lets not forget good old newsgroups) then that will get
  nicely linked as it should. So just write:
>> "the nom language" http://nomlang.org
  and you will get "the nom language" http://nomlang.org as you would
  expect.
- I think I have actually run out of linking formats, but dont worry,
  more will probably get added at some point.
- See below for hopefully more detailed info about all this, if you
  really need it, but I dont see why you would.

To render a url as a link just write it beginning with http:// etc,
delimited by spaces. To change the link text include a quoted word or
phrase before it.

The formatter makes a link to a local file just with some quoted text
and the file name: *"The text.tohtml.pss format" text.tohtml.format.txt*
will render as "The text.tohtml.pss format" text.tohtml.format.txt

or a *www* url: *"A www link" www.c3.org*
"A www link" www.c3.org

### FAKE SCHEMA LINKS

* All the following are ok
-----
rosetta:// rosetta:   link to rosettacode.org problem
urbandict://          urban dictionary
oed://                oxford english dict online
,,,,

Also Wikipedia links should work with *wp://* and *wp:*

All the links can be preceded with some double quoted text (with a
maximum length of one line) which will become the visible text of the
link.
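
To make the fake schema idea concrete, here is a rough Python sketch of
how a schema word could be expanded into a real url. This is *not* how
text.tohtml.pss does it (that script parses word by word, and this sketch
uses a regular expression only for brevity). The function name, the url
templates, and the choice to turn dots into underscores for every schema
are all assumptions made for illustration.

```python
import re

# Hypothetical url templates: the real formatter may build different urls.
SCHEMAS = {
    "wp": "https://en.wikipedia.org/wiki/{}",
    "rosetta": "https://rosettacode.org/wiki/{}",
    "urbandict": "https://www.urbandictionary.com/define.php?term={}",
}

# Optional double-quoted label (one line only), then a schema name,
# then ':' or '://', then the target word.
LINK = re.compile(r'(?:"([^"\n]*)"\s+)?\b([a-z]+):(?://)?([^\s"]+)')


def expand_links(text):
    """Replace known fake-schema links in text with html anchors."""

    def to_anchor(match):
        label, schema, target = match.groups()
        if schema not in SCHEMAS:
            # Unknown schema (http, nom, ...): leave the text untouched.
            return match.group(0)
        # Assumption: dots stand in for underscores in the target.
        url = SCHEMAS[schema].format(target.replace(".", "_"))
        return '<a href="{}">{}</a>'.format(url, label or target)

    return LINK.sub(to_anchor, text)


print(expand_links('see "Gauss" wp://carl.friedrich.gauss'))
print(expand_links("rosetta:balanced.brackets"))
```

When there is no quoted text before the link, the sketch simply reuses the
target word as the visible text.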
Some examples:

* visible link text
>> "click here" wp://parser for more info

* make a wikipedia link
>> "BNF" wp:backus-naur_form

which should render as "BNF" wp:backus-naur_form

* make a google search link
----
google://"distance hobart to colombia"
google:"distance hobart to colombia"
"distance to hobart" google:"distance hobart to colombia"
,,,,

google://"distance hobart to colombia" >>
google:"distance hobart to colombia" >>
"distance to hobart" google:"distance hobart to colombia" >>
The double quote must immediately follow the :

I can link to the documentation for a Nom script language command
with nom://push or "The Nom language 'push' command" nom://push

"The pop; command" nom://pop is to nom://pop the stack

The "stack" pep://stack is also a part of the *PEP* machine
The pep://tape is also part of the machine

We can link to the
"wikipedia context-free language page" wp://context.free.language
Or look up the definition of "gargle" oed://gargle which is
a oed://throat noise.

Lookup the urban dictionary eg urbandict://automagically
But you dont have to write *://* if you don't want to. Just
write: *urbandict:blamestorming* For example: urbandict:blamestorming

Actually the Urban-dictionary just uses a space for multiple words like
this: *url?term=humble brag* or *url?term=humble%20brag* but this is
a slight problem for our plain-text parser file:///eg/text.tohtml.pss
because it is a *word* parser (mainly). I could copy the code for the
*google://* schema to solve this (by "parsing ahead" ) but I am not sure
if I will.

## LISTS

Unordered lists have been implemented (18 mar 2025) and are indicated by
items starting with the '- ' word which starts the line. There is no
list-start token (just *- * ). Lists are terminated by a blank line or by
the end of the file. Lists should be able to contain any other element
such as links, images, code-lines and blocks and even headings (which is
rather silly). There is no 'list heading' format at the moment.
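
The list rules above (an item starts with '- ' at the start of a line, an
item can run over several lines, and a blank line or the end of the file
ends the list) can be sketched in a few lines of Python. This is only an
illustration of the rules, not the logic of the nom script. The function
name is hypothetical, and joining continuation lines with a single space
is an assumption.

```python
def find_lists(text):
    """Group '- ' items into lists; a blank line or end of text ends a list."""
    lists, items = [], []
    for line in text.splitlines():
        if line.strip() == "":
            # A line with only spaces (or nothing) terminates the list.
            if items:
                lists.append(items)
                items = []
        elif line.startswith("- "):
            # A new item: '- ' must start the line.
            items.append(line[2:].strip())
        elif items:
            # Any other line inside a list continues the current item.
            items[-1] += " " + line.strip()
    if items:
        # The end of the text also terminates a list.
        lists.append(items)
    return lists


sample = "intro\n- one\n  more\n- two\n\npara\n- three"
print(find_lists(sample))
```

Lines outside a list (like "intro" and "para" in the sample) are simply
skipped by this sketch.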
- The 1st line: Simple lists only contain text
- With one line per item.
- The parsing of lists was interesting to say the least.
- How to make a list in this particular *plain-text* document format.
  List items can span multiple lines and can contain any other
  type of element such as a centered image
  <:cc:/image/image.test.jpg>
- lists are terminated with blank lines
- which are lines containing only spaces or nothing.

## EMPHASIS

Here I am following the "markdown" www.markdownguide.org/basic-syntax
style of using *asterisks* "*" to delimit emphasis of a word or
of a phrase. The emphasis cannot extend over several lines.

Also you can use "** double asterisks**" which should boldly emphasize text
like this ** double asterisks** or just emphasise a single **word**

I have limited these formats to a single line so that the entire document
doesnt become *italic* or **italic-bold** if you forget to put in the
last "*" character. The same applies to double quoted text.

* emphasise text
----
*italic text* or italic *word*
**bold italic text** or bold italic **word**
,,,,

As John McEnroe used to say to the umpires **You cannot be serious!!**

## ORDERED LISTS

There are *no* ordered lists at the moment (only the unordered lists
described above). I will only add them if I really am going to use them.
And I probably will

## MISCELLANEOUS

* #: starting a line is a comment at the moment. It is removed from output
>> #: is a comment

* make a horizontal "rule" (line) in the document
>> >--- (starting the line, end with space)

Which should look like this (in Html just a <hr> element).

>----

* force a line break, this could be a "poor mans list"
>> >> (not starting a line)

So we could >> force a break >> and make html put >> a <br> element there.

## APOSTROPHES

I try to render apostrophes in words like doesn't can't isn't in some
slightly elegant way. Also, since I often leave them out I try to insert
them in words like *isnt doesnt thats its whats * and so on which should
look like isnt doesnt thats its whats

## QUOTES

Writers of [latex] and html formatters seem to spend too much time trying
to produce nice "quotes". It can be nice to change *dont* into don't with
curly quotes, or into can't etc. A normal apostrophe is like this's.

I'm you're he's she's it's we're they're
aren't can't couldn't didn't doesn't hadn't hasn't haven't

Quoted text can occur before all links which changes the link text
eg: "The nom command" nom://push

## MULTILINE BLOCKQUOTES

Multiline quotes begin with """ starting the line and also end with """
anywhere. These quotes will be rendered as a **quotation** . Also, we can
precede a multiline quote with a line beginning with "*" and that will
be formatted as the *citation* for the blockquote.

I think the *"""* triple quotes have to start a line and be followed by at
least 1 space, but the terminating *"""* doesnt have to.

* a blockquote with a citation, example
--------
* John McEnroe, Tennis player
""" you cannot be serious """
,,,,

Which should look like this.

* John McEnroe, Tennis player
""" You cannot be serious! """

The actual (quite nice) formatting of these "quotes" is done by the css
in file:///site.blog.css which should be available to all blogs which are
created with the "blog script" file:///webdev/copy.blog.sh

## IMAGES

My plain text image format uses < and > to indicate the image and
its attributes. The format is
>> <:corners:size:alignment:filepath.ext>

The primitive static site generator script /webdev/copy.blog.sh also
substitutes the image "random.img" for a random image from the /image/
folder.

* put a random image on the page, with big rounded corners, floating right
>> <:R:>>:random.img>

I use the format "caption text" before the image, or nothing for no
caption.
And maybe even <:9:image.file.ext> for a width specification

For the css attributes to make the image circular, it 1st needs to be
cropped to a square. Otherwise it will look like an egg etc

Images can be included in a document.

* making images.
>> enclose the image file name in <> brackets:
>> Include a caption with quoted text "An image caption"

The image above has no attributes. It will probably just go to the left
hand side of the page.

<:R:>>:image/doodle.horse.duck.swirl.300.jpg>

Images without captions are working. There is a bit of a tricky
interaction between the <figure> tag and the <img> tag. The *figure*
tag is required for the caption but the way to float left or right is
slightly different. Also, we need to have a width for images in <figure>
tags.

I wonder if images float to left of the text above them. Or below them.
The answer is, they float around the text below them, I think. And I
wonder if paragraph tags <p> </p> make any difference.

<:<<:image/doodle.horse.duck.swirl.300.jpg>

Also, it is a good idea to have a sensible default size for an image
otherwise it can take over the whole HTML page.

Apart from a caption images can have a set of optional attributes

"circular horse (caption, 20em width, default float)"
<:o:4:image/doodle.horse.duck.swirl.300.jpg>

This text is between the circular horse image and the circular duck
which is floating right.

"A duck (50% border radius, floating right)"
<:o:>>:image/doodle.duck.star.900.jpg>

"duck star (missing 1st ':',small rounded corners,no float,caption) "

I dont think you can use *em* measurements for images, but I am not sure.
The image below is a duck and star with no float.

"duck star: big rounded corners,float left, caption"
<:R:<<:image/doodle.duck.star.900.jpg>

* make a circular/ellipse image that floats to the right with width 100px
>> <:0:>>:image/doodle.network.200w.jpg>
>> <:o:>>:image/imagename.jpeg> # does the same

* a circular image with width not specified and a caption
>> <:O:>>:imagename.jpeg>

* make a rectangular image that floats left taking up 50% of page width
>> <:<<:50%:imagename.jpeg>

* make an image with its original size
>> duck (small rounded corner, float left) <:r:<<:image/doodle.duck.floating.300.jpg>

The text is supposed to "flow around" floating images.

A centre aligned image with big rounded corners with 50% width.
But actually :cc: will do nothing because I am not parsing it, just
discarding it.

<:R:cc:image/doodle.ink.ripples.600.jpg>

Text should not flow around centre aligned images.

## Image text attributes

Attributes need to appear in the correct order but all attributes are
optional.

* image text attribute order
>> <:corners:width:alignment:imagename.ext>

* valid image text attribute values.
-----
O|o        circular image
#          width is multiple of 5em?
#          5emxN width specification.
r R        small/large rounded corners
>> << cc   float right/ float left/ centre align
,,,,,

* Demonstrating optional attributes.
>> <:>>:image/file.jpg>

<:<<:image/doodle.horse.dog.swirl.900w.jpg>

## Code

One of the points of this format is to document code. So I need to be
able to include code in the text document. For *blocks* of code (multiple
lines) I start the line with --- (at least 3 dashes) and end the block
with a line starting with ,,, (at least 3 commas). For a single line of
code I start the line with >> (with at least 1 space after)

The format allows providing a *caption* for the code (above it) by
including a line starting with "*" (followed by a space) above the code
line or block.

* a nom script that deletes whitespace
-------
read;
[:space:] { clear; }
print; clear;
,,,,,

* run a one line nom script
>> pep -e "read; print; clear;" -i "abcd"

### NOM CODE

To display and possibly pretty print [nom] code use
>> ---+
and ,,,, to delimit the code instead of ---

The pretty printing is done by /eg/nom.snippet.tohtml.pss after the
script /eg/text.tohtml.pss is run.

## ACRONYMS

Markup [html] as an element with a title (tool tip).
There is a big list of these acronyms like [xml] [c++] [java] [pep] [nom]
[minix] [stdin] [stdout] etc

The main purpose of them is to make the text urbandict://kinda more
interesting and to provide a quick explanation of an *acronym* or a
technical (programming) idea.

## ETYMOLOGIES

I also had the idea of providing interesting word-derivations as an
[html] *title=* attribute but I haven't really done any yet except
for [heuristic] . Again, the idea here is just to have fun, none of this
is particularly necessary.

## ICONS AND LOGOS

It would probably be nice to include an icon after a name such as
Wikipedia. Maybe a format like [wikipedia] could do this.

## FILENAMES

Filenames like /pars/tr/translate.go.pss should be rendered differently.
Also they will be linked if there is some quoted text before them.

## HEADINGS

Normally in my documents, I use all capital lines as a first level heading.
Like this:
>> SECTION 3

For second level headings I use all capitals followed by 4 dots:
>> SUBSECTION 1 ....

But for the sake of parsing simplicity, I am just using [markdown]
headings with 3 levels. So lines starting with # or ## or ###
(followed by a space).

## IMPLEMENTATION

The [html] formatting is produced by a nom script called
file://text.tohtml.pss which is a simplification of an older script
called "mark.html.pss" mark.html.pss and
"A LaTeX formatter" mark.latex.pss

The "HTML formatter" /eg/text.tohtml.pss can be translated into other
languages such as c, go, java, javascript, perl, python, ruby, and tcl.
This can be done using the translation scripts in the "/tr/" file:///tr/
folder on the http://nomlang.org site . These scripts are also hosted on
sourceforge at http://bumble.sf.net/books/pars/tr

* how to build the html formatter script with go
-------
pep -f tr/translate.go.pss text.tohtml.pss > text.tohtml.go
go build text.tohtml.go
echo ' "nom" http://nomlang.org ' | ./text.tohtml
# or
cat doc.txt | ./text.tohtml > doc.html
,,,

* create a python version of the formatter and run it
-----
pep -f tr/translate.py.pss text.tohtml.pss > text.tohtml.py
chmod a+x text.tohtml.py
cat doc.txt | ./text.tohtml.py > doc.html
,,,,

Similar translators exist for [java] javascript, [tcl] c (non-unicode)
and others.

### OTHER TEXT FORMATS

There are plenty of other good "plain text" formats and they are all
useful for avoiding the torture of writing and rewriting [html] or
[latex] or [xml] or some other maddening markup format. And lets not
even talk about *wysiwyg* word processors.

The big formats are: markdown and the wikimedia wiki format (from
wikipedia). I would like to write translators for my format into markdown
and wikimedia but I will wait until the formatter script is more or less
"mature" and has lists and ALL CAPITAL HEADINGS like this.
### HISTORY

This format (and script) was begun in Feb 2025 with quite humble
intentions but has steadily grown and has been quite simple to debug.
This is because I use a mainly *word-by-word* parsing technique (with
some exceptions) which keeps the grammar quite simple. This is why the
script file:///eg/text.tohtml.pss has been quite successful and will
almost certainly be a permanent replacement for file:///eg/mark.html.pss
and file:///eg/mark.latex.pss which became, to be brutally honest,
*infernally* complex because the grammar was complex.

The format has been successfully used on the [nom] documentation site
www.nomlang.org and also on a general blog that I write called
www.shrob.org .