Alexander the Great: What wish can I grant you?
Diogenes: Stand out of my light.
Diogenes
documentation for the syntagma language
The following information was extracted from /tr/syntagma.pss;
at the moment it is the best documentation available for the
language.
begin blocks only execute once at the beginning of the script
begin {
# create a time variable. The names $line, $char, $counter and
# $1, $2, etc. are reserved. Variables must be declared in the
# begin block.
var $time;
var $server = 'ssh://etc';
}
single line comments allowed
#* multiline comments between these *#
print 'hello'; exit 3; quit; delete;
at eof, delete the pattern space, print text and exit with code '4'
EOF { delete; print 'yes'; exit 4; }
delete all instances of 'green' in the pattern space text.
delete 'green';
ignore all whitespace (and delete it)
ignore [:space:];
ignore: [:space:]; # the same
ig [:space:]; # the same
delete: [:space:]; # the same
delete one char from the left of the pattern space
ltrim;
print "line: $line, char: $char"; # interpolate line number with $line etc
use the accumulator counter
print "counter is $counter";
non interpolating between single quotes
print ' $delimiter is a syntagma system variable';
concatenation of everything with the dot operator
user variables won't interpolate.
begin { var $name := 'smooth'; }
a = b c { @1 := "$1+$2" . $1 . ' and ' . $counter . ' name:' . $name; }
double quotes are allowed and interpolate variables
print "problem at line $line \n";
print text with a newline at the end
println "hi";
interpolate special variables in the print string, but only
in double quotes.
println "the line count is $line";
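The quoting rules above can be modelled outside the language. The following Python sketch is only an illustration, not part of syntagma: the function name `render`, the `SPECIAL` table, and its values are assumptions made for the example. It treats single-quoted text literally and, in double-quoted text, substitutes only the special variables, leaving user variables such as `$name` untouched.

```python
import re

# Hypothetical values for the special variables; a real script would
# get these from the lexer state.
SPECIAL = {"line": 12, "char": 4, "counter": 3}

def render(body, quote, special=SPECIAL):
    """Return the text a print statement would emit for a quoted string."""
    if quote == "'":
        # Single-quoted strings are not interpolated.
        return body

    def substitute(match):
        name = match.group(1)
        # Only special variables are replaced; anything else is left as written.
        return str(special[name]) if name in special else match.group(0)

    return re.sub(r"\$([A-Za-z]+)", substitute, body)
```

For instance, `render("line: $line, char: $char", '"')` substitutes both variables, while the same text passed with `"'"` as the quote comes back unchanged.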
make token 'capword' if the text begins with A-Z
[:alpha:]+ { capword: begins [A-Z]; }
star lexing for zero or more characters, with get class
[0-9]+ { get [.]*; ends "." { println "integer with dots"; }}
different ways to make literal tokens
you have to define them before you can use them in rules
match empty { exit; }
match 'abcd' { print 'hi'; exit; }
match not empty { print "Extra char on line $line"; exit 2; }
matching within text blocks
[a-z]+ {
# match a,aa,aaa,aaaa etc, same as [a]
alist: only 'a';
# match a,ba,ab,aa,bb,bbb etc, same as [ab]
ablist: only 'ab';
ablist: [ab]; # same as above
list: [ab]; # the same
}
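As a rough model of the `only` test described in the comments above, the Python sketch below checks that a lexeme is built exclusively from a given set of characters. The function name `only` is borrowed from the keyword for readability; treating the empty string as a non-match is an assumption, based on the block being entered only after `[a-z]+` has matched at least one character.

```python
def only(text, allowed):
    """True when text is non-empty and every character is in allowed.

    Assumption: an empty lexeme never matches, since the enclosing
    class pattern needs one or more characters.
    """
    return bool(text) and all(ch in allowed for ch in text)
```

Under this reading, `only("abba", "ab")` holds and `only("abc", "ab")` does not.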
[:digit:]+ {
# 'not only' doesn't work because it compiles to ![0] in nom.
0number: begins '0' and not only [0];
}
for all alphabetic sequences, if the text begins with '<'
and ends with '>' then, if the text begins with '<a ' make a
"link" parse token, and if not, make a "tag" parse token
[:alpha:]+ {
match begins '<' and ends '>' {
link: begins '<a ';
tag: ;
}
}
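The nested test above amounts to a small decision tree. The Python sketch below spells it out under two assumptions that the source does not state explicitly: statements inside a block are tried in order with the first match winning, and a token definition with an empty test acts as a catch-all. The function name `classify` is made up for the example.

```python
def classify(text):
    """Return the parse token name for text, or None if the outer test fails."""
    if text.startswith("<") and text.endswith(">"):
        # Assumption: the first matching definition wins.
        if text.startswith("<a "):
            return "link"
        # Assumption: an empty test matches anything that reaches it.
        return "tag"
    return None
```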
match empty { print "missing char at char $char"; exit 2; }
punct: NOT [:alnum:]+ ; # negated classes
x: not 'a';
register: '[' to ']' ; # 1st item is only 1 char presently
name: [.:] to '.' ; # from '.' or ':' to the next '.'
item: "/" to ":end" ; # 2nd item can be a string
item: '/' TO '/' ; # ?? same but throws error if no end '/'
file: '/' between [:space:]; # up to but not including any space char.
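One way to picture the `to` form is as a scan from a single start character up to the next occurrence of an end string. The Python sketch below is a guess at that behaviour, not the language's implementation: the function name `scan_to` is hypothetical, including the end delimiter in the token is an assumption, and returning `None` for a missing end delimiter stands in for whatever the lower-case `to` actually does in that case.

```python
def scan_to(text, pos, start_chars, end):
    """Return the delimited token starting at pos, or None if there is none.

    start_chars -- set of single characters that may open the token
    end         -- string that closes the token (may be longer than 1 char)
    """
    if pos >= len(text) or text[pos] not in start_chars:
        return None
    stop = text.find(end, pos + 1)
    if stop == -1:
        # Assumption: no closing delimiter means no token is produced.
        return None
    # Assumption: the closing delimiter is part of the token.
    return text[pos:stop + len(end)]
```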
[:alpha:]+ {
keyword: 'is'|'to'|'go';
name: 'tree';
name: [:alpha:]; # this is the default, no plus required
}
[a-z]+ {
num: 'one'|'two';
# print an error message and quit if no matches
print 'invalid word\n'; exit 2;
}
negated class blocks
NOT [:space:]+ {
key: '/find/';
print 'not a space'; exit;
}
space: ' ';
newline : '\n';
literal: [;:] ; # def of literal tokens (only in lex part)
------------------------
the parsing section - these rules must all come after the
lexing rules above
check the value of the second word in this parse rule
phrase = word word {
"green" == $2 { ...}
[:space:] == $2 { ...}
not begins "the" == $story { ...}
}
alternation with same-length RHS sequences and a rule block
a b = '[' c ']' | '[' c ']' { print "alternation\n"; }
alternation with unequal length sequences, but no rule block.
a b = c d | e f g;
optionals between <...>
a = b < x y | p q > c;
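An optional group can be read as shorthand for several plain rules. The Python sketch below lists the right-hand sides that `a = b < x y | p q > c;` would stand for, assuming that `<...>` means "at most one of these alternatives". That reading and the function name `expand_optional` are assumptions made for illustration.

```python
def expand_optional(before, options, after):
    """List every right-hand side produced by one optional group.

    before, after -- token names around the group
    options       -- the alternatives inside <...>, each a list of names
    """
    sequences = [before + after]  # the group is skipped entirely
    for option in options:
        sequences.append(before + option + after)
    return sequences
```

For the rule above, the call `expand_optional(["b"], [["x", "y"], ["p", "q"]], ["c"])` gives the three sequences `b c`, `b x y c` and `b p q c`.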
look-behind syntax with +(...)
expression = +('/'|'*') number ;
look-behind syntax with +(...) and a rule-block. The attribute variable
refers to the 1st token after the lookbehind.
expression = +('/'|'*') number { @1 = "($1)"; }
lookahead syntax with +(...)
a = ex '*' ex +('/'|'*') ;
look ahead with negative rules but tokens must be quoted which
is silly unless we are dealing with literal tokens
a b = c d +(not "f" and not "j");
a = c d +(not ';' and not '.');
look ahead syntax with code block. The attributes of '.' ',' and x
are automatically copied to their new positions on the stack.
a b = x y z +('.' x | ',' x) {
@1 := "$1 or $2"; @2 := "$1 and $2";
println "found xyz followed by .x or ,x";
exit 1;
}
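The lookahead rule above can be pictured as a reduction on a token stack in which the trailing lookahead tokens are matched but left in place. The Python sketch below is a simplified model under stated assumptions: the stack is a list of (token, attribute) pairs, the function name `reduce_with_lookahead` is invented, and the new tokens simply inherit the old attributes position by position instead of running the rule block's `@1 := ...` assignments.

```python
def reduce_with_lookahead(stack, lhs, rhs, lookaheads):
    """Try one reduction in place; return True if the stack was changed.

    stack      -- list of (token, attribute) pairs, top of stack last
    lhs        -- token names that replace the matched rhs
    rhs        -- token names that must sit just below the lookahead
    lookaheads -- alternative token sequences allowed after the rhs
    """
    names = [token for token, _ in stack]
    for ahead in lookaheads:
        pattern = list(rhs) + list(ahead)
        if len(names) >= len(pattern) and names[-len(pattern):] == pattern:
            start = len(stack) - len(pattern)
            # The lookahead tokens keep their attributes and move down.
            kept = stack[len(stack) - len(ahead):]
            old = stack[start:start + len(rhs)]
            # Assumption: new tokens take the old attributes by position.
            new = [(name, old[i][1] if i < len(old) else "")
                   for i, name in enumerate(lhs)]
            stack[start:] = new + kept
            return True
    return False
```

With a stack holding `x y z . x`, the rule `a b = x y z +('.' x | ',' x)` would leave `a b . x`, and the `.` and `x` entries would carry their original attributes.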
lookahead with
o = colour shape +(';' | block) ;
option = name digit ';' ; # use literal char token in parse rule
object = colour ':' shape; # lit token, but must define earlier
() = space word; # just delete tokens in the parse section
check if the stack contains only a list token at end of file
eof { stack (list) { print "list found\n"; }}
eof { stack (a b | x y) { print "list found\n"; }}
eof { () = x y { println "at eof parse stack ends with 'x y'"; }}
check if the parse stack is list or number or float.
eof { parse (list|number|float) { print "list found\n"; }}
EOF: words = words word; # only reduces at end of stream
EOF {
name = first second;
}