
#*  

ABOUT 

  Creating an [ebnf] style language with [nom] as the compile target.
  I will use a W3C ebnf style with no commas between tokens.

  It would be nice to have a more natural language that *targets* [nom]

  This compiles simple [ebnf] to [nom]. It is the first example of
  using nom as the target of a nom script. Another strange oed://corollary
  arises: we can use this new language to implement a compiler for
  itself.

BUGS

  textmatches have dodgy parsing?
  
  there is a problem with the replace in lookahead compilation if there is
  another instance of the pattern in the workspace.
  This could be solved with a replacestart command in pep/nom.
  
  lit: 'a'; compiles the same as a. is this ok? 

  I really need "replace ^'text' 'new';" and "replace 'old'$ 'new';"
  where ^ and $ are anchors to the start and end of the workspace.
  Without this, a lot of my compilation is potentially buggy.
  For example, with lookahead compilation, I want to replace the first
  few tokens of a sequence, but I cant be sure I am only replacing the
  first.
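
  The anchoring problem can be shown outside nom. The Python sketch below
  is only an analogy: replace_start is a hypothetical helper (not a pep/nom
  command), the token strings are invented, and it assumes that an
  unanchored replace rewrites every occurrence of the pattern.

```python
def replace_start(workspace, old, new):
    """Replace `old` only when it is a prefix of the workspace."""
    if workspace.startswith(old):
        return new + workspace[len(old):]
    return workspace

# a workspace of parse tokens in which "b" appears twice
workspace = "b*c*b*"

# an unanchored replace rewrites both "b" tokens
unanchored = workspace.replace("b*", "a*")

# the anchored version only touches the first token
anchored = replace_start(workspace, "b*", "a*")

print(unanchored)
print(anchored)
```

  An end anchor would work the same way with endswith() and a slice from
  the left.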

  
DONE

  some conditions 
  Compiling uneven alternation sequences using the tape variable LHS and the ';'
  token attribute to save the partially compiled code.
  - finished attrule parsing to assign to @2 @3 etc.
  - variables like $server, which will then interpolate in strings
  - but user defined vars dont interpolate.
  - done: make altgroup work for rsequence ( altgroup | etc
  - done: star matches: eg [:space:]* { ... } this would match 0 or more characters
    of this class. it is useful in lexing. but it shouldnt be used on the RHS
    of a lex rule: eg: space: [:space:]*; # bad!  because the space token could
    contain nothing, which is silly, and also this could cause an infinite 
    loop.
  - done: condition: concat ==/!= quoted. tricky, same sysvariables.
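
  The warning above about star matches on the RHS of a lex rule can be
  illustrated with a small Python analogy. This is not how pep/nom lexes;
  lex_all and its limit parameter are made up for the sketch, and the
  limit exists only so that the stuck case terminates.

```python
import re

def lex_all(text, pattern, limit=10):
    """Apply one lex pattern repeatedly; give up after `limit` tokens."""
    regex = re.compile(pattern)
    position, count = 0, 0
    while position < len(text) and count < limit:
        match = regex.match(text, position)
        if match is None:
            break
        position = match.end()  # an empty match leaves position unchanged
        count += 1
    return position, count

# zero or more spaces: keeps matching the empty string at position 0
stuck = lex_all("abc", r"\s*")
# one or more spaces: consumes input, then stops
fine = lex_all("  abc", r"\s+")

print(stuck)
print(fine)
```

  The first call never advances and only stops because of the limit, which
  is the infinite loop that an empty space token would cause.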


TODO
 
  - conditions /../ with rule blocks. then conditions and alternations
    and lookaheads, lookbehinds, altgroups and so on.
  - conditions within rule blocks, supersedes old textmatch parsing.
  - a = b c (d|e) (x y|p q); which parses as
    LHS '=' rsequence '(' altbuild ) (altgroup);
    This is pretty easy because I have already written lots of rules like 
    this.

TESTING

  * first working program
  >> pep -f syntagma.pss -i "[:alpha:]+{color:'blue'|'green';delete;}digit:[0-9]+;lit:';';delete; option = color digit ';'; eof { print 'at eof\n';exit 0;} " > junk.pss

  The syntagma program below seems to compile correctly with
  this syntagma.pss script. See the phrases section for lots of syntagma
  syntax.

  * an example program, with lexing and a grammar rule 
  --------
    # comments can be written with '#'
    # * multiline comments are also
       ok, following a ebnf style * #
    [:alpha:]+ {
      color:'blue'|'green'|'red'|'orange';
      word: *;    # define default token within a lex block
    }

    space: [:space:];    # a space token contains a single \r\n\t or ' ' etc
    integer: [0-9]+ ; 
    # double quotes, single quotes or classes can be used
    # lit is a special lexing keyword
    lit: ';'|":"|[@#$];  

    # keywords like 'not','empty' or 'delete' are not case sensitive
    NOT EMPTY { delete; }  

    # parse rules start here. The syntagma grammar knows how to 
    # work out where lexing ends and parsing begins. The parse rule assignment
    # operator is '='
    option = color integer ';' ; 

    # parse rules can have alternations with |
    # literal tokens like ':' can be used in parse rules, but must be defined
    # above with the 'lit' or 'literal' lexing keyword
    position = integer ':' integer | integer ';' integer ; 
    # parse-rules can have code that executes when the rule matches
    position = integer ':' integer | integer ';' integer {
      print "found position at line $line!\n"; 
    }
    EOF { print 'at eof\n'; exit 0;} 
  ,,,,

  >> pep -f syntagma.pss -i 'com = word param; block = word newword;'
  * sample output of syntagma.pss when compiling with nom script 
  ------+
    # sample input BNF rules (white-space doesnt matter):
    #   com = word param ; 
    #   block = word newword ;
    # output (produced by this script)
    pop;pop;
    "word*param*" {
      clear; add "com*"; push; .reparse
    }
    push;push;
    pop;pop;
    "word*newword*" {
      clear; add "block*"; push; push; push; .reparse
    }
    push;push
  ,,,,

  This is pretty cool, because we now have an ebnf-to-nom compiler
  that produces executable and translatable (to go/java/tcl/python/ruby etc)
  [nom] code. The language has a lexing and parsing syntax.

  The syntagma language may not be as efficient as hand coded [nom] because
  it does redundant "pushes" nom://push and "pops" nom://pop between
  code blocks, but it is easier to write and probably less prone to 
  errors. 
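
  The shape of the compiled output above can be sketched in a few lines of
  Python. This is not the syntagma.pss implementation (which is written in
  nom); compile_rule is a hypothetical name, and the sketch only handles
  plain token names, with no literals, alternation or rule blocks.

```python
def compile_rule(rule):
    """Compile one 'lhs = tok tok ... ;' rule into nom text."""
    lhs, rhs = rule.strip().rstrip(";").split("=")
    tokens = rhs.split()
    pops = "pop;" * len(tokens)      # one pop per RHS token
    pushes = "push;" * len(tokens)   # restore the stack if no match
    pattern = "".join(tok + "*" for tok in tokens)
    return (
        f'{pops}\n'
        f'"{pattern}" {{\n'
        f'  clear; add "{lhs.strip()}*"; push; .reparse\n'
        f'}}\n'
        f'{pushes}'
    )

compiled = compile_rule("com = word param ;")
print(compiled)
```

  The matching pops and pushes around every block are the redundancy
  mentioned above: each rule unwinds and restores the stack on its own.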

  * compiling syntax for syntagma
  ----
    link = quotedtext url { @1 = "<a href=$1>$2</a>"; }
  ,,,,

  I may also allow '.' as a string concatenator. $1 refers to 
  the attribute of the first token on the RHS (right-hand side) of the 
  bnf grammar rule. The compiling block takes the place of the ';' in
  the syntax above.
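
  What the $1, $2 references mean can be sketched in Python. This is an
  analogy only: interpolate and the attribute values are invented for the
  example, and the compiled nom walks the tape with ++ and -- rather than
  indexing a list.

```python
import re

def interpolate(template, attributes):
    """Replace $1..$9 with the attribute of the matching RHS token."""
    def lookup(match):
        index = int(match.group(1))
        return attributes[index - 1]   # $1 is the first RHS token
    return re.sub(r"\$([1-9])", lookup, template)

# hypothetical attributes of the two RHS tokens
attributes = ["first", "second"]
result = interpolate("<a href=$1>$2</a>", attributes)
print(result)
```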

ALTERNATION

  Alternation has now been implemented, including for unequal
  length alternation branches (with no following code block) (18 may 2026)
  Also alternation (all branches same length) with code blocks such as

  >> a = b e | c d { print "parsing"; }

  I have been thinking about how to implement alternation in syntagma
  and nom* for quite some time, and I have come up with a few promising
  ideas. Originally I thought that this was going to be impossible
  or nearly impossible (but it isnt).

  * parsing acrobatics and alternation
  --------
    example1: a = b c | d e f ;
    compile: 
      pop;pop; "b*c*" { clear; add "a*"; push; .reparse } push;push;
      pop;pop;pop; "d*e*f*" { clear; add "a*"; push; .reparse } push;push;push;
    example2: a = b c | d e { @1 = "$1 and $2"; }
    compile:
      pop;pop; "b*c*","d*e*" { 
        clear; get; add " and "; ++; get; --; put;
        clear; add "a*"; push; .reparse 
      } push;push;
  ,,,

  The second example is probably only useful if there are the same
  number of parse tokens in each branch of the alternation ?
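
  The second example relies on nom's ',' operator to test several
  patterns at once. A rough Python sketch of that compilation, assuming
  equal-length branches and leaving out the rule-block body, follows;
  compile_alternation is a hypothetical helper, not part of syntagma.

```python
def compile_alternation(lhs, branches):
    """Compile equal-length alternation branches into one nom block."""
    lengths = {len(branch) for branch in branches}
    if len(lengths) != 1:
        raise ValueError("branches must all have the same length")
    n = lengths.pop()
    # one quoted stack pattern per branch, joined with nom's OR operator
    tests = ",".join(
        '"' + "".join(tok + "*" for tok in branch) + '"'
        for branch in branches
    )
    body = f'clear; add "{lhs}*"; push; .reparse'
    return "pop;" * n + " " + tests + " { " + body + " } " + "push;" * n

line = compile_alternation("a", [["b", "c"], ["d", "e"]])
print(line)
```

  With unequal branches a single pop list cannot serve every pattern,
  which is why the sketch refuses them.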

  * use a variable on the tape, like this in nom
  >> begin { mark 'LHS'; ++ }

NOTES

  Below is a remarkably simple way to implement 'lookahead' in 
  syntagma.

  phantom tokens within blocks are nice. This can be used to enforce
  what sort of things can go in that block, eg, lexrules, textrules

  Since the nom* engine or pep* is completely text based, there
  is only one data type, so [:digit:]+ matches a string of digits,
  but they remain text.
  
  Classes are very simple such as [a-z] or [abc] or [:alnum:] so
  you cant actually combine them like [a-gxyz]. that wont work.
  This is because nom classes are (too) simple. 

TOKENS

  textrules* can only be used inside a block like [:alpha:]+ { ... } because
  the compiler first has to read a block of text to match multi-character
  text. lexrules can occur anywhere in the lexing section, or in blocks in the
  parsing section.  Actions can be multiple

   LHS*     left-hand-side of the ebnf rule, before the '='
   RHS*     right-hand-side of the rule, but I am not using this at the moment.
   altsequence*  a built rsequence (altbuild) group
   altbuild*     used for building an rsequence followed by (alternation)
   alttail* an alternation of sequences, tailwise reduction for unequal    
            eg: a | b | c { ... }
   altgroup*   alternation groups in (..) or <...> or after = This 
               can be combined with alttail when unequal length branch
               sequences are encountered.
   sysvar*     a variable like $counter $line $server. can be user defined.
   attvar*     a numeric variable like $1,$2 etc referring to a token attribute
   class*      a simple class of characters [a-z] [abcd] [:space:]
   charquoted* is a single quoted character like: 'x'
   charmatch*  matches a single char. eg: not "x"
   textmatch*  matches string like: not begins "tt"
   condition*  a parse rule condition 
   andcondition*  an AND concatenation of conditions
   orcondition*  an OR concatenation of conditions
   quoted*     text between quotes: 'and' 
   concat*     a concatenated string. Interpolated strings become this
   text*       a non interpolated string (between ' and ' not " and ")
               I think this just becomes a concat*
   interp*     makes special vars interpolate in text
   sequence*   a sequence/list of tokens before the '=' in a rule
   rsequence*  a sequence of parse tokens after the '='
   leftattvar* attribute of token on LHS of rule.
   token*    one grammar token (alphabetic word)
   action*   print,delete,quit etc can go in the lex block or rule block
   attrule*  an assignment to a token attribute, eg @1 = $1.$2;
   ruleblock*  code within the {...} after a rule*
   rule*     one grammar parse rule like 'command = name semicolon ;' 
   ruleset*  a list of grammar rules 
   lexrule*     lexing rule, eg: word: [:alpha:]+ ; 
   lexruleset*  a set of lexing rules 
   textrule*    lexing rule involving text like 'and'
   textruleset* a set of textrules (and lexrules) - equivalent to ruleblock* for
                the lexing blocks.
   notset*   used in lookaheads for negativity.
   andmatch*   using AND logic with classes/charquoted/quoted 
             example: [a-z] and begins 'x'
   ormatch*    an OR set of classes,quotes etc eg, 'a'|'word'|[a-z]|'b'
   charset*  an OR set of chars eg: 'a'|'b'|'\n' 
   :=*       for attribute assignment.

   literal tokens:
   to       for lexing up to and including end delimiter
   between  for lexing before an end delimiter
   begins
   ends     for text ends-with
   and      for AND logic

   ... many others
   .  used for string concatenation, eg: $server:="http://".$3;
   {..} grouping of actions, parse rules, lexrules etc 
   (..) for grouping, especially of alternation, eg: a = b (c|d) e; (not implemented)
   /../ for parse-rule conditions, eg: a = b c /$1==$2/;
   +(..) lookahead groups.
   |  alternation or OR logic
   +  for classes and lexing 
   =  for grammar reduction
   :  for tokenisation (lexing) assignment
   ;  for statement end
   
PROPOSED PHRASES

  'text' does not interpolate
  "text" does.

PHRASES

  lexing rules are indicated by the ':' assignment, and 
  grammar rules by the '=' assignment.

  The following phrases compile.
  ------

  # begin blocks only execute once at the beginning of the script
  begin {
    # create a time variable. The names $line,$char,$counter and the 
    # $1,$2, etc are reserved. Variables must be declared in the 
    # begin block.
    var $time;
    var $server = 'ssh://etc';
  }

  # single line comments allowed
  #<star> multiline comments between these <star>#
  print 'hello'; exit 3; quit; delete; 
  # at eof, delete the pattern space, print text and exit with code '4'
  EOF { delete; print 'yes'; exit 4; }

  # delete all instances of 'green' in the pattern space text.
  delete 'green'; 

  # ignore all whitespace (and delete it)
  ignore [:space:]; 
  ignore: [:space:];  # the same
  ig [:space:];       # the same
  delete: [:space:];  # the same

  # delete one char from the left of the pattern space
  ltrim; 
  print "line: $line, char: $char"; # interpolate line number with $line etc

  # use the accumulator counter
  print "counter is $counter";

  # non interpolating between single quotes
  print ' $delimiter is a syntagma system variable';

  # concatenation of everything with the dot operator
  # user variables wont interpolate.
  begin { var $name := 'smooth'; }
  a = b c {@1 := "$1+$2 ".$1.' and '.$counter.' name:'.$name; }

  # double quotes are allowed and interpolate variables
  print "problem at line $line \n";

  # print text with a newline at the end
  println "hi"; 

  # interpolate special variables in the print string, but only 
  # in double quotes.
  println "the line count is $line";

  # make token 'capword' if the text begins with A-Z
  [:alpha:]+ { capword: begins [A-Z]; }

  # star lexing for zero or more characters, with get class
  [0-9]+ { get [.]*; ends "." { println "integer with dots"; }}

  # different ways to make literal tokens
  # you have to define them before you can use them in rules
  lit: 'x'; lit: [0-4]|'a'|'b'|';' ;
  literal: [(){}] ;  # braces as literal tokens

  # lex zero or more alphanumeric characters from the input stream
  get [:alnum:]*; 
  # lex zero or more non space characters from the input stream
  get not [:space:]*; 

  # make multicharacter literals and non-literals within a lex block 
  [:=]+ {
    lit: ':'|'=';
    assign: ':='; equals: '==';
    * { 
      println "strange syntax on line $line"; exit 1;
    }
  }

  # multicharacter literals 
  [:alpha:]+ { lit: "while"|"if"|"begin"|"end"; }

  begins '<' and ends '>' { print "tag"; tag: *; }
  # make token 'word' for all alpha numeric sequences 
  word: [:alnum:]+ ;
  name: * ;          # default lex rule

  # I think I will dispose of the 'match' keyword
  match empty { exit; }
  match 'abcd' { print 'hi'; exit; }
  match not empty { print 'Extra char on line $line'; exit 2; }

  [a-z]+ { 
    # match a,aa,aaa,aaaa etc, same as [a]
    alist: only 'a';
    # match a,ba,ab,aa,bb,bbb etc, same as [ab]
    ablist: only 'ab';
    ablist: [ab];    # same as above 
    list: [ab];      # the same 
  }

  [:digit:]+ {
    # 'not only' doesnt work because it compiles to ![0] in nom.
    0number: begins '0' and not only [0]; 
  }

  # for all alphabetic sequences, if the text begins with '<'
  # and ends with '>' then, if the text begins with '<a ' make a 
  # "link" parse token, and if not, make a "tag" parse token
  [:alpha:]+ {
    match begins '<' and ends '>' { 
      link: begins '<a ';
      tag: *;
    }
  }

  match empty { print 'missing char at char $char'; exit 2; }
  punct: NOT [:alnum:]+ ;  # negated classes
  x: not 'a';
  register: '[' to ']' ;   # 1st item is only 1 char presently
  name: [.:] to '.' ;      # from '.' or ':' to the next '.' 
  item: "/" to ":end" ;  # 2nd item can be a string
  item: '/' TO '/' ;     # ?? same but throws error if no end '/'  
  file: '/' between [:space:] ;  # up to but not including any space char.
  [:alpha:]+ {
    keyword: 'is'|'to'|'go'; 
    name: 'tree';
    name: [:alpha:];   # this is the default, no plus required
  }
  [a-z]+ {
    num: 'one'|'two';
    # print an error message and quit if no matches
    print 'invalid word\n'; exit 2;
  }

  # negated class blocks
  NOT [:space:]+ { 
    key: '/find/';
    print 'not a space'; exit; 
  }  

  space: ' ';
  newline : '\n';
  # literal: [;:] ;      # def of literal tokens (only in lex part)

  # ------------------------
  # the parsing section - these rules must all come after the
  # lexing rules above

  # check the value of the second word in this parse rule
  phrase = word word {
    "green" == $2 { ...}
    [:space:] == $2 { ...}
    not begins "the" == $story { ...}
  }

  # alternation with same length sequences
  block = '[' statement ']' | '[' statementset ']' ;

  # alternation same length RHS sequences and rule block
  a b = '[' c ']' | '[' c ']' { print "alternation\n"; } 

  # alternation with preceding token sequence.
  a = b c d (x y| p q| s t);

  # alternation with preceding token sequence and ruleblock
  a = b c d (x y| p q| s t) { println "in rule-block"; }

  # alternation with preceding and succeeding token sequence and ruleblock
  a = b c d (x y| p q| s t) n m { println "in rule-block"; }

  # alternation with preceding and succeeding token sequence
  a = b c d (x y| p q| s t) n m; 

  # 2 alternation groups with token sequence in middle. 
  a = (x y| p q| s t) n m (x|y); 

  # 2 alternation groups with token sequence in middle and ruleblock
  a = (x y| p q| s t) n m (x|y) { print "lots of alternation"; } 

  # 2 alternation groups with 2 preceding token sequences 
  a = c d (x y| p q| s t) n m (x|y);

  # 2 alternation groups with 2 preceding token sequences with rule block
  a = c d (x y| p q| s t) n m (x|y) { print "lots of alternation"; } 

  # alternation with unequal length sequences, but no rule block.
  # this uses a tail-reduction technique with the alttail* token
  a b = c d | e f g;

  # alternation group that starts the right-hand side of a rule
  x = (x y|p q) a b c { 
    @1:="xxx"; 
    println "x"; 
  }

  # also, with no rule block
  x = (x y|p q) a b c;

  # optionals between <...>  ?
  a = b < x y | p q > c;

  # conditional parse rule reductions with /.../
  pal = char pal char /$1 == $3/;

  # conditions with text matches and attribute variables.
  a = c d e /$2 matches [:space:]/;
  a = b c /$2 not begins "AB"/;
  a = b c /matches [abc] $2/;
  a = b c d /$3 ends ">"/;

  # conditions with system variables
  a = b c /$line ends "0"/;

  # conditions and user variables
  a = b c /$server begins "http://"/;

  # compound conditions with AND logic 
  a = b c /$line matches [0-3] and $2 not == "x"/;

  # compound conditions with OR logic 
  a = b c /$line matches [0-3] or $2 not == "x"/;

  # conditions with not equals != !== or not ==
  a b = c d e /$1 != $3/;
  a b = c d e /$1 != "http://$server"/;
  a b = c d e /"($3)" not == $1/;

  # interpolated strings and user variables are ok in conditions 
  # but not system variables (yet) eg $line $char $counter
  a b = c d e /"($1)" == $3/;

  # user defined variables can be compared to attribute variables.
  a b = c d e /$server == $3/;

  #   

  # look-behind syntax with +(...)
  expression = +('/'|'*') number ;

  # look-behind syntax with +(...) and a rule-block. The attribute variable
  # refers to the 1st token after the lookbehind.
  expression = +('/'|'*') number { @1 = "($1)"; } 

  # lookahead syntax with +(...)
  a = ex '*' ex +('/'|'*') ;
  # look ahead with negative rules but tokens must be quoted which
  # is silly unless we are dealing with literal tokens
  a b = c d +(not "f" and not "j");
  a = c d +(not ';' and not '.');

  # look ahead syntax with code block. The attributes of '.' ',' and x
  # are automatically copied to their new positions on the stack.
  a b = x y z +('.' x | ',' x) {
    @1 := "$1 or $2"; @2 := "$1 and $2"; 
    println "found xyz followed by .x or ,x";
    exit 1;
  }

  # lookahead with 
  o = colour shape +(';' | block) ;
  option = name digit ';' ;  # use literal char token in parse rule
  object = colour ':' shape; # lit token, but must define earlier
  () = space word;       # just delete tokens in the parse section

  # check if the stack contains only a list token at end of file
  eof { stack (list) { print "list found\n"; }}
  eof { stack (a b | x y) { print "list found\n"; }}
  eof { () = x y { println "at eof parse stack ends with 'x y'"; }}
  # check if the parse stack is list or number or float. 
  eof { parse (list|number|float) { print "list found\n"; }}
  EOF: words = words word;   # only reduces at end of stream
  EOF {
    name = first second;
  }

  ,,,,

HISTORY

   6 june 2026
     added ruleblocks for conditions after simple rsequences.

   5 june 2026
     added AND logic and OR logic to conditions within /../. These
     conditions will also work inside ruleblocks, whereas textmatches
     by themselves will work in lexing blocks.

     trying to develop the semantics of text-matches and conditions.
     text-matches can be used after lexing commands, but conditions
     are used in parse rules, and involve a variable such as 
     $1, $line, etc.

   4 june 2026
     wrote some condition syntax, which at the same time breaks
     my old textmatch syntax, but it wasnt much good. need to fix.

   3 june 2026

     Made lots and lots of alternation syntax using (altbuild) and altgroup
     in different combinations. Working with the perl interpreter as well.

     changed orsets to ormatches and andsets to andmatches. This is 
     because there are similar logic functions for tokens which are 
     subtly different and I want to distinguish them. For example
     'begins' doesnt make much sense with a token.

   2 june 2026

     made a look-behind (the opposite of look-ahead) syntax as follows
     >> a = +(x|y) b c;
     so "b c" will only reduce to "a" if "b c" is preceded by "x" or "y".
     This was very easy to implement, much easier than look-ahead, and it
     should make scripts neater in annoying cases like Wirth's PL/0.
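
     The look-behind rule can be modelled on a plain list used as a parse
     stack. This Python sketch is only a model of the idea, not the nom
     that syntagma emits; reduce_with_lookbehind is a hypothetical name.

```python
def reduce_with_lookbehind(stack, behind, rhs, lhs):
    """Reduce trailing `rhs` to `lhs` if the token before it is in `behind`."""
    n = len(rhs)
    preceded = len(stack) > n and stack[-n - 1] in behind
    if preceded and stack[-n:] == rhs:
        # the look-behind token itself stays on the stack
        return stack[:-n] + [lhs]
    return stack

# models the rule: a = +(x|y) b c;
reduced = reduce_with_lookbehind(["x", "b", "c"], {"x", "y"}, ["b", "c"], "a")
untouched = reduce_with_lookbehind(["z", "b", "c"], {"x", "y"}, ["b", "c"], "a")
print(reduced)
print(untouched)
```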

     Debugged somewhat string concatenation and interpolation.

   1 june 2026

     may have fixed a bug in lookaheads: eg, consider the syntagma
     parse rule: n = a b +(x y | p q);

     * nom compilation of syntagma rule, bug!
     ------
     pop;pop; B"a*b*" {
       E"x*y*",E"p*q*" {
         # .... more code
       }
     }
     push;push;
     ,,,,
     The problem is that this will also match and reduce "a b fox y" or 
     "a b slop q", because the token "fox*" also ends with "x*".
     The solution is to write this: E"*x*y*",E"*p*q*"
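
     The same suffix pitfall can be reproduced with Python's str.endswith,
     which stands in here for nom's E"..." test. The stack strings are
     invented for the illustration.

```python
# parse stack as text, with the nom token delimiter after each token
stack = "a*b*fox*y*"

# the loose test matches because the token "fox" happens to end in "x"
loose = stack.endswith("x*y*")

# a leading delimiter forces the match to start on a token boundary
strict = stack.endswith("*x*y*")

# the intended stack still matches the strict test
intended = "a*b*x*y*".endswith("*x*y*")

print(loose, strict, intended)
```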

     adding multicharacter literals, which can be useful for keywords, if 
     you want to parse that way and not have a 'keyword' token. 
     >> [:alpha:]+ { lit: "begin"|"end"|"if"; * { println "error"; exit 1; }}

  29 may 2026

    - done: made 3 types of vars $attvar eg $1, $uservar eg: var $server = '...';
      and sysvar,eg: $counter, $delimiter,$escapechar.
    - done: concat processing. eg: concat '.' attvar|concat '.' sysvar| 
      concat '.' concat... '.' is the concatenation operator for strings and
      variables. Uservars have to use .$name. because i dont think they
      can interpolate.
    - done:aliases: igmar, igmap, igmal, igmaj, igmapy all aliases for 'interpretation'
      engines for syntagma. See /tr/engine.perl.sh

    - rename alt* to alttail* tailwise reduction
    - make altgroup* front wise reduction of alternations
    - allow uruleblocks for unbalanced alternations, but dont allow attrule*
      within them
    - conditional syntax; a=b c /$1==$2/{ stuff; }
    - token match syntax: a b c { do something }
    - showvars() function: display counter, lines, chars etc.
    - block types doc{..} for help/documentation test{..} for tests and outputs
      error{..} for error trapping. begin{..} done already.

  26 may 2026
    fixed an ignore bug. need to check for everything.

  25 may 2026

    attribute copy appears to be working! quite tricky.

    working on lookahead rules with a rule code block. nearly 
    complete except for lookahead attribute copy. Just need to 
    get 'push;' list from the LHS token, but may need a variable.
    Or do a fancy "clop;" etc using .reparse continually until
    only "push;" left????
  24 may 2026

    need to turn "charquoted*" into "token*" on the RHS of parse
    rules. Then 'not token*' becomes 'nottoken' 

    optionals seem easier. the hardest is lookahead with rule
    blocks. Optionals can have a block, but it can only have
    actions and lexrules not attrules because we dont know 
    how long the sequence is.

  23 may 2026

    made begin blocks and vardefs etc
    made alternation groups, working for stack(altgroup) but need 
    to parse with " lhs = rsequence (altgroup| etc ". This is 
    so I can build the nom compiled code. I can store the compiled
    code in the altgroup token.

    made a println printline function. made print and println work with
    interpolated text (concat*) token. reformed the comparison syntax to 
    allow no comparison == operator. made some debug rules in the error
    section. 
    todo:

     - begin blocks
     - variable declarations with "var $name;" or "var $name := 'text'; "
     - var decs should go in the begin block.

  22 may 2026
    made a string interpolation token interp*
    need to make begin blocks. redo $1=='green' parsing to make
    it more flexible.
    need to do var declarations in the begin block.

    * made a check attribute value and variable syntax like this
    -------
      "green" == $2 { ...}
      [:space:] == $2 { ...}
      not begins "the" == $story { ...}
    ,,,

  21 may 2026
    lookahead rules with no ruleblock seem to be compiling well.
    Need to add ruleblock, also alternations.
    Also, need to add +(not token) syntax, and 
    +(not ';' and not x) which is a negative lookahead syntax.

  19 may 2026
    made @1,@2 etc. work

    had the idea of lookahead grouping in rules eg
    >> a = b (c|d|e);
    so b will reduce to a, but only if b in followed by c,d or e
    The parse stack would be: sequence '=' rsequence lookahead ;

    This would compile as
    >> "b*c*","b*d*","b*e*" { replace "b*" "a*"; push; push; .reparse }
    but there is a problem with the replace if there is another b* pattern.

  18 may 2026
    wrote the example script /eg/s.url.pss which shows lots of nice
    syntagma syntax.

    implemented unequal length alternation lists with no following 
    code block, such as
    >> a = b c d | e f | g | h | i j ;
    The compilation technique is nothing short of amazing even to me,
    who wrote it. The parse-reduction is actually 'tail-wise' so 
    that the branches of the alternation start reducing when the 
    ';' literal token is seen. Each branch has a list of 'pop's
    saved in the preceding '|' token attribute, which also indicates
    how many tokens are in that branch. For example, with 
    "...| h | i j ..." the "ij" branch has "pop;pop;" saved in the 
    previous '|' literal token and the "h" branch has "pop;" saved 
    in its previous | token. 

    So the nom code compares the 2 pop lists in each '|' token
    and if they are different (meaning the token sequence lists are
    of different lengths) then it immediately compiles the nom
    code for "i j" and saves it in the ';' token attribute (actually
    it adds it to that attribute). So the following 
    >> pop;pop; "i*j*"{ ...LHS...} 
    is added to the ';' attribute and parsing continues.
    But in order to get the code for the LHS* token it actually 
    has to use a "tape variable", which is just a named tape array
    cell at the top of the tape. This is because of the following
    parse sequence
    >> LHS '=' rsequence | rsequence | ... | rsequence | rsequence ';'
    Because of the tail-reduction, nom has no idea where LHS is on 
    the stack, and we have to do tail-reductions because of code blocks.
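
    The pop-list bookkeeping described above can be sketched in Python.
    This is a simplified forward pass in which the LHS is known up front;
    the real compiler works tail-wise and fetches the LHS from a tape
    variable. All function names here are hypothetical.

```python
def branch_pop_lists(rule):
    """Return (tokens, pop list) pairs for each alternation branch."""
    rhs = rule.split("=", 1)[1].strip().rstrip(";")
    result = []
    for branch in rhs.split("|"):
        tokens = branch.split()
        result.append((tokens, "pop;" * len(tokens)))
    return result

def has_unequal_branches(rule):
    """True when the pop lists differ, ie the branch lengths differ."""
    return len({pops for _, pops in branch_pop_lists(rule)}) > 1

def compile_branches(rule):
    """Emit one nom block per branch, each with its own pops and pushes."""
    lhs = rule.split("=", 1)[0].strip()
    lines = []
    for tokens, pops in branch_pop_lists(rule):
        pattern = "".join(tok + "*" for tok in tokens)
        pushes = "push;" * len(tokens)
        block = f'{{ clear; add "{lhs}*"; push; .reparse }}'
        lines.append(f'{pops} "{pattern}" {block} {pushes}')
    return "\n".join(lines)

rule = "a = b c d | e f | g | h | i j ;"
pop_lists = [pops for _, pops in branch_pop_lists(rule)]
print(pop_lists)
print(compile_branches(rule))
```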

    

  17 may 2026
    making attrules for assigning token attributes with @1 := "$2 .. $3";
    lots of progress. alternations with code blocks working.
    rewrote rule parsing, which is now much better and allows
    alternation. I think the nom://until command should really have
    a class argument as well as text, eg: until [abc];
    
  15 may 2026
    added double quotes eg token: "a"|"b";
    still cant do alternation in parse rules like:
    >> colour = r g b | c m k ;
    but see the alternation notes section for a way to do it.

    tidied up parsing of 'a' to 'b' etc. made 'match' sometimes
    optional (need to complete). made ignore rule. made better
    grammar* final token parsing. still need a way to match parse
    stack at eof? or try:
    ------
      eof { token = token { print 'yes'; }}
    ,,,,

  14 may 2026
    added 'only': only 'a' means [a]+
    add AND logic, eg: 123number: [:digit:] AND begins [123] 
    which lexes the token '123number' if the text consists only
    of digits and begins with 1,2 or 3.

    added "begins" and "ends" and "not begins" and "begins not" 
    and so on.
  13 may 2026
    I think this is almost good enough to write a sed syntax checker
    as an example of what it can do.

    also need to do, actual composition rules like 
    >> obj = colour shape { $0 := $1.'\n'.$2 ; }

    lots of new syntax, eg: match, star '*' match empty {}
    {} = space word ;  delete token sequences
    word: * ;   # default lexing rules, matches everything even empty

  12 may 2026
    started to adapt this from the toybnf.pss script. A lot
    of progress, all sorts of lexing syntax is now working - see
    the phrases section above. lots and lots of progress - literal 
    tokens, actions like print,exit,delete etc

*#
  begin {
    # I need this variable for variable length sequence alternation
    # such as: a = b c | d ;
    mark "LHS"; ++;
    mark "pushlist"; ++;   # one push for each token in the LHS
  }

  read; put;
  # line-relative char numbers, but this is overridden by
  # the [:space:] hoover.
  [\n] { nochars; }

  # multiline comments follow the format of nom. (* ... *) look nicer 
  # but I may want to do something with () later
  "#" { 
    (eof) { clear; .reparse } read; 
    !B"#*" { "#\n" { clear; .reparse } whilenot [\n]; } 
    B"#*" { 
      clear; add "starting at line "; lines; put;
      clear; until "*#";
      !E"*#" { 
        clear; add "unterminated multiline comment '#* ... *#'\n"; get;
        print; zero; a-;a-; quit;
      }
    }
    put; clear; add "comment*"; push; .reparse 
  }

  # ignore white-space
  [:space:] { while [:space:]; clear; }

  # literal tokens, () for lookahead token set grouping
  # many of these literal tokens contain "pop;" list which is
  # put there by the rsequence token rules and the notset token    
  # so I will clear the attribute 
  [@0+:{}|().] { add "*"; push; --; put; ++; .reparse }

  # these are used for optionals. like () and +() and | they
  # can also contain a pop; list which indicates the length of the 
  # rsequence which follows. 

  '<','>' { add "*"; push; --; put; ++; .reparse }


  # lex '=' '==' etc
  '=' { while [=]; add "*"; push; --; put; ++; .reparse }

  # I store unequal alternation sequence compiled code here 
  ';' { clear; add " "; put; clear; add ";*"; push; }

  # alternation corresponds directly to noms ',' operator
  # I store pop; lists here, so I need to add the "," nom OR 
  # operator by hand
  '|' { clear; put; add "|*"; push; }

  # used for negation especially with '!='
  '!' { clear; put; add "!*"; push; }

  # the condition operator 
  # example: a = b c /$1==$2/;
  '/' { clear; put; add "/*"; push; }

  # the star means everything or anything, not sure about this?
  '*' { clear; add "!''"; put; clear; add "star*"; push; }

  # variables 
  # examples: $1 $2 or $name 
  # 
  "$" { 
    clear; while [:alnum:]; put;
    [:digit:] { 
      nop;
      clip; !"" {
        clear; add "Attribute values ($1,$2,$3...) maximum $9\n";
        print; zero; a-;a-; quit;
      }
      get; 

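      # mushroom replacement technique: each digit N expands to
      # "++;" N-1 ";--" until only "1" is left, which becomes " get".
      # eg: $3 becomes "++;++; get; --;--;"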
      replace "9" "++;8;--"; replace "8" "++;7;--";
      replace "7" "++;6;--"; replace "6" "++;5;--";
      replace "5" "++;4;--"; replace "4" "++;3;--";
      replace "3" "++;2;--"; replace "2" "++;1; --";
      replace "1" " get"; add ";"; 
      # remove extra space from lone get.
      " get;" { clop; } put;

      clear; add "attvar*"; push; .reparse
    }


    # special variables. Maybe should have a different syntax
    "line","char","counter","text" {
      # can i get the system delimiter? and escape char?

      # integer accumulator
      "counter" { clear; add "count;"; }
      # automatic number of lines read from input
      "line" { clear; add "lines;"; }
      # automatic number of characters read from input
      "char" { clear; add "chars;"; }
      # current text of workspace. This can be omitted often
      "text" { clear; add ""; }
      put; clear; add "sysvar*"; push; .reparse
    }

    # I can make the fetch code here, or make it when the var* token
    # is actually used. Same applies above. I am relying on 
    # replace '; get;' '; put;'; 
    # for assignment??
    
    clear; add 'mark "here"; go "'; get; add '"; get; go "here";'; put;
    clear; add "uservar*"; push; .reparse 
  }

  # digits for token attribute assignment 1-9,
  [1-9] { put; clear; add "digit*"; push; .reparse }

  # [:digit:] { while [:digit:]; put; clear; add "number*"; push; .reparse }

  [:alpha:] { 
    # read the whole word; the default nom parse-token delimiter '*'
    # is added further down.
    while [:alpha:]; put;

    # keywords are matched case-insensitively, because I don't like
    # errors caused by capital letters. This means that token names
    # can't use these words??
    lower;

    "begin","parse","stack","only","and","or","begins","ends",
    "var","match","matches","txt","empty","not","to",
    "check","ig","ignore","lex","next","between","twixt","lit","literal","eof",
    "print","println","trim","ltrim","rtrim","delete","del","exit","quit" { 

      # put the nom command in the attribute
      # fix: divide into 'commands' and others.
      # but should functions work on variables, like trim($1) etc????
      "ltrim" { clear; add "clop"; put; clear; add "ltrim"; } 
      "rtrim" { clear; add "clip"; put; clear; add "rtrim"; } 
      "trim" { clear; add "clip; clop"; put; clear; add "trim"; } 
      "exit","quit" { clear; add "quit"; put; clear; add "exit"; }
      "del","delete" { clear; add "clear"; put; clear; add "delete"; }
      "ig" { clear; add "ignore"; }

      "var" { clear; add "declare"; } 
      "parse" { clear; add "stack"; }
      "and" { clear; add "."; put; clear; add "and"; } 
      "matches" { clip; clip; } 
      "begins" { clear; add "B"; put; clear; add "begins"; } 
      "ends" { clear; add "E"; put; clear; add "ends"; } 
      "not" { clear; add "!"; put; clear; add "not"; } 
      "empty" { clear; add "''"; put; clear; add "empty"; } 
      "eof" { clear; add "(eof)"; put; clip; clop; } 
      "twixt" { clear; add "between"; }
      "literal" { clear; add "lit"; }
      add "*"; push; .reparse
    }
    # case sensitive
    clear; get;
    # normal token
    add "*"; put; clear; add "token*"; push; 
  }

  
  "'" { 
    until "'"; put; 
    "''" { clear; add "empty single quote\n"; print; zero; a-;a-; quit; }
    !E"'" { clear; add "unfinished single quote\n"; print; zero; a-;a-; quit; }
    clip; clop; clip; 
    # either 'x' or '\n' etc
    "","\\" { clear; add "charquoted*"; push; .reparse }
    clear; add "quoted*"; push; .reparse
  }

  # double quoted text. variables are interpolated into this text;
  # single quoted text is not interpolated. But I use the same parse-token,
  # possibly unwisely.
  '"' { 
    until '"'; 
    '""' { clear; add "empty double quote\n"; print; zero; a-;a-; quit; }
    !E'"' { clear; add "unfinished double quote\n"; print; zero; a-;a-; quit; }
    # dont convert to single quotes, 
    put; clip; clop; clip; 
    # either "x" or "\n" etc
    "","\\" { clear; add "charquoted*"; push; .reparse }
    clear; add "quoted*"; push; .reparse
  }

  "[" { 
    until "]"; put; 
    "[]" { clear; add "empty class\n"; print; zero; a-;a-; quit; }
    !E"]" { clear; add "unfinished class\n"; print; zero; a-;a-; quit; }
    clear; add "class*"; push; .reparse 
  }

  # unlexed character
  !"" { 
    put; clear;
    add " 
     [syntagma]
       The character '"; get; add "' at line "; lines; add " is not
       (currently) part of the amazing syntagma* syntax. But you
       could put it in quotes, or you could check your work and
       see if it's you, not me. 
    \n";
    replace "\n    " "\n";
    print; zero; a-;a-; quit;
  }

parse>
  # show the parse-stack reductions. a doubled hash makes it easier
  # to remove from the output with sed '/^##/d'
  add "## line:"; lines; add " char:"; chars; add " "; print; clear; 
  unstack; print; stack; 
  (eof) { add " EOF"; } 
  # show last attribute value if required for debugging.
  # add " ("; --; get; ++; add ")"; replace "\n" "\n##    ";
  add "\n"; print; clear;

  # ---------------
  # ERROR parsing. search for 'one token' etc to find these

  # -------------------
  # errors: one token
  pop;

  # -------------------
  # errors: two tokens
  pop;

  "begins*not*","ends*not*" {
    clear; add "
    # ----------------------------------------
    # Syntagma flow:
    #   the 'not' logic word should precede, not follow, the 'begins'
    #   and 'ends' matching words.
    #   example: $1 not begins '<a'  # good
    # ----------------------------------------
    "; replace "\n    " "\n"; add "\n";
    print; zero; a-;a-; quit;
  }


  "rsequence*=*" {
    clear; add "
    # ----------------------------------------
    # Syntagma :
    #   missing semi-colon after rule? Found an out of place parse-rule 
    #   reduction operator (=) at line "; lines; add ".
    #   example: a = b c    x = y z;   # missing ';' after 'a = b c'
    # ----------------------------------------
    #*
    "; replace "\n    " "\n"; add "\n";
    add "rsequence* - "; get; add "\n";
    add "        =* - "; ++; get; add "\n*#\n";
    print; zero; a-;a-; quit;
  }

  # incorrect sequences
  E"*}*" { 
    B":=*",B"=*",B"+*",B"sysvar*",B"uservar*",
    B"attvar*",B"quoted*",B"concat*" {
     clear; add '
     # ----------------------------------------
     # Syntagma :
     #   misplaced } or some other weird and wonderful problem with your
     #   syntax. Yes you. nothing to do with me, the infallible parser
     #   writer. 
     #   example: a = b c { @1 := "($1,$2)"; } # nice use of braces 
     #     wrong: a = b c { @1 := "($1,$2)" }  # not so nice (missing ;) 
     # ----------------------------------------
     #*
     '; replace "\n    " "\n"; add "\n";
     add " <token>* - "; get; add "\n";
     add "       }* - "; ++; get; add "\n*#\n";
     print; zero; a-;a-; quit;
    }
  }

  # incorrect sequences
  E"*;*" { 
    B"rule*",B"attrule*",B"lexrule*",B"textrule*" {
     clear; add '
     # ----------------------------------------
     # Syntagma :
     #   duplicated ; or other syntax error. blocks between {..} do
     #   not require a terminating ; in syntagma. please desist.
     #   example: a = b c;     # nice use of semicolon
     #     wrong: a = b c{ print "hi"; } ;  # extra trailing ; 
     # ----------------------------------------
     #*
     '; replace "\n    " "\n"; add "\n";
     add " <rule>* - "; get; add "\n";
     add "      ;* - "; ++; get; add "\n*#\n";
     # show message and quit with non-zero exit code.
     print; zero; a-; a-; quit;
    }
  }


  ":=*concat*".(eof) {
    clear; add '
    # ----------------------------------------
    # Syntagma :
    #   concatenation and interpolation of double quoted strings. 
    #   example: @1 := "$2,$3".$line; 
    # ----------------------------------------
    #*
    '; replace "\n    " "\n"; add "\n";
    add "    :=* - "; get; add "\n";
    add "concat* - "; ++; get; add "\n*#\n";
    # show message and quit with non-zero exit code.
    print; zero; a-; a-; quit;
  }

  !B"beginblock*".E"vardef*" {
    clear; add "
    # ----------------------------------------
    # Syntagma :
    #   variable declarations need to go in the begin block
    #   example: begin { var $name := 'bob'; } 
    # ----------------------------------------
    #*
    "; replace "\n    " "\n"; add "\n";
    add "      ?* - "; get; add "\n";
    add " vardef* - "; ++; get; add "\n*#\n";
    print; zero; a-; a-; quit;
  }

  # incorrect + sequences 
  B"+*".!"+*" {
    !E"(*".!E";*".!E"|*".!E"{*" {
      # build error message.
      clip; replace "*" " was followed by a "; put; 
      clear; add "\n# [strange syntax]: A "; get; put; print;
      clear; add "
      # ----------------------------------------
      # Syntagma advice:
      #   A lookahead alternation group is enclosed in +(...) 
      #   The '+' is also used to express 'one or more' in lexing rules
      #   If the '+' is enclosed in quotes, then it can be used as
      #   a literal token.
      #
      #   example: a = b c +(x|y);  # good, parse rule with lookahead tokens 
      #   example: e = e '+' e +(')' | ';');  # good, + and literal '+'
      # ----------------------------------------
      "; replace "\n    " "\n"; add "\n";
      print; zero; a-; a-; quit;
    }
  }

  # semicolon errors and others
  "lexrule*;*","attrule*;*","rule*;*","ruleset*;*",
  ";*;*",";*=*",";*:*","(*;*" {
    clip; replace "*" " was followed by a "; put; 
    clear; add "\n# [strange syntax]: A "; get; put; print;
    clear; add "
    # ----------------------------------------
    #   semi-colons ';' are used to terminate
    #   statements, lexing rules, parsing rules, and pretty much 
    #   everything in syntagma syntax. You dont need 2 of them in a 
    #   row, just 1 is enough. 
    #
    #   example: lit:[-+*/];  # good, only 1 semi.
    #   example: a = b|c|d;   # good, '=' with preceding sequence 
    # ----------------------------------------
    "; replace "\n    " "\n"; add "\n";
    print; zero; a-; a-; quit;
  }

  "leftattvar*=*","uservar*=*","sysvar*=*" {
    clear; add "
    #*
    # ----------------------------------------
    # Syntagma kibitzer:
    #   The assignment operator for variables and attributes is := not =
    #   (which is the parse-rule reduction operator)
    #   example: begin { var $name := 'hoop'; }  # good
    #   example: begin { var $name = 'bilk'; }  # not good
    # ----------------------------------------
    "; replace "\n    " "\n"; add "\n";
    add "sysvar* - "; get; add "\n";
    add "     =* - "; ++; get; add "\n*#\n";
    print; zero; a-; a-; quit;
  }

  !B"delete*".!B"ignore*".
  !B"uservar*".!B"sysvar*".!B"attvar*".!B"leftattvar*".!B"token*".!B"lit*" {
    E"*:*" {
      clear; add "
      # ----------------------------------------
      # Syntagma :
      #   ':' is the lexing assignment operator. It should be
      #   preceded by a token name or the keyword 'lit'. 
      #   ':' is also used in the attribute assignment token := 
      #   Please don't use 'lit' or any other keyword as a token name 
      #   because the syntagma script will not compile, sorry.
      #    example:  lit: '.'|',';  # correct
      #    example: name: '.'|',';  # correct
      #    example: @1 := '$1 / $2';# correct
      #      wrong: var: [a-z]+;    # var is a keyword
      #
      #    keywords are:
      #     begin,parse,stack,only,and,or,begins,ends,var,match,matches,txt,
      #     empty,not,to,check,ig,ignore,lex,next,between,twixt,lit,literal,eof,
      #     print,println,trim,ltrim,rtrim,delete,del,exit,quit
      # ----------------------------------------
      "; replace "\n    " "\n"; add "\n";
      add " ?* - "; get; add "\n";
      add " :* - "; ++; get; --; add "\n";
      print; zero; a-;a-; quit;
    }
  }

  "begins*class*" {
    clear; 
    add "# The 'begins' keyword cannot be combined with text classes \n";
    add "# on line "; lines; add "\n";
    print; zero; a-;a-; quit;
  }

  "|*)*" {

    clear; add "
    # ----------------------------------------
    # Syntagma :
    #   an alternation | followed by a group bracket ) is 
    #   probably an error. What do you think?
    #   example: x = y z (a|b|c);
    # ----------------------------------------
    "; replace "\n    " "\n"; add "\n";
    add " |* - "; get; add "\n";
    add " )* - "; ++; get; --; add "\n";
    print; zero; a-;a-; quit;

  }

  # -------------------
  # errors: three tokens
  pop;

  #*
  (eof){ 
    "textmatch*attvar*","charmatch*attvar*" {
      clear; add '
      # ----------------------------------------
      # Syntagma:
      #   conditions with text-matches. quoted text, in this
      #   context, will not interpolate.
      #
      #   example: $2 begins "time" 
      #   example: $2 not begins "time" 
      # ----------------------------------------
      '; replace "\n    " "\n"; add "\n";
      add " textmatch* - "; get; add "\n";
      add "    attvar* - "; ++; get; add "\n";
      print; zero; a-;a-; quit;
    }
  }
  *#

  "|*rsequence*+(*" {
    clear; add "
    # ----------------------------------------
    # Syntagma flow:
    #   alternation before a lookahead group hasn't been implemented (yet)
    #   The issue occurred on line "; lines; add ". 
    #
    #   example: a = b c +(x y|p q);      # good
    #   example: a = b|c +(x y|p q);      # not so good.
    # ----------------------------------------
    "; replace "\n    " "\n"; add "\n";
    print; zero; a-;a-; quit;
  }

  B"+(*lookgroup*",B"(*altgroup*" { !"+(*lookgroup*".!"(*altgroup*" {
    !E")*".!E"|*" {
      replace "*" " "; ++; ++; ++; put; --; --; --;
      clear; add "
      # ----------------------------------------
      # Syntagma: 
      #   brackets appear to be mismatched: +( and ( should be
      #   terminated with )
      # ----------------------------------------
      "; replace "\n      " "\n"; add "\n";
      add "  (* or +(* - "; get; add "\n";
      add "     group* - "; ++; get; --; add "\n";
      add "         ?* - "; ++; ++; get; --; --; add "\n"; 
      add "parse stack - "; ++; ++; ++; get; --; --; --; add "\n";
      print; zero; a-;a-; quit;
    }}
  }

  B"<*altgroup*".!"<*altgroup*" {
    !E">*".!E"|*" {
      replace "*" " "; ++; ++; ++; put; --; --; --;
      clear; add "
      # ----------------------------------------
      # Syntagma: 
      #   brackets appear to be mismatched: < should be
      #   terminated with >
      # ----------------------------------------
      "; replace "\n      " "\n"; add "\n";
      add "         <* - "; get; add "\n";
      add "  altgroup* - "; ++; get; --; add "\n";
      add "         ?* - "; ++; ++; get; --; --; add "\n"; 
      add "parse stack - "; ++; ++; ++; get; --; --; --; add "\n";
      print; zero; a-;a-; quit;
    }
  }

  # -------------------
  # errors: four tokens
  pop;

  (eof){ 
    "concat*==*attvar*","uservar*==*attvar*","sysvar*==*attvar*" {
      clear; add '
      # ----------------------------------------
      # Syntagma slide:
      #   parsing and compiling condition* tokens
      #
      #   example: a = b c / $2 == "($line)" /
      #   example: a = b c / $2 == $server /
      #   example: a = b c / $line == $1 /
      # ----------------------------------------
      '; replace "\n    " "\n"; add "\n";
      add "  concat* - "; get; add "\n";
      add "      ==* - "; ++; get; add "\n";
      add "  attvar* - "; ++; get; add "\n\n";
      print; zero; a-;a-; quit;
    }
  }

  "lit*:*class*star*","token*:*class*star*" {
    clear; add "
    # ----------------------------------------
    # Syntagma persuasion:
    #   The kleene-star '*' in this context means 'zero or more' of
    #   something. But you dont want a token that is nothing, at least
    #   not yet...surely?
    #
    #   example: lit: [-=()];      # good
    #   example: word: [:alpha:]+; # also ok 
    #   example: word: [:alpha:]*; # not so good, could be nothing
    # ----------------------------------------
    "; replace "\n    " "\n"; add "\n";
    add "#     class - "; ++; ++; get; add "\n";
    add "#  operator - * \n";
    print; zero; a-;a-; quit;
  }



  "lit*:*class*+*" {
    clear; add "
    # ----------------------------------------
    # Syntagma koan:
    #   literal tokens probably should not be instantiated from  
    #   a class sequence because there will be too many of them.
    #   So please dont use the 'one-or-more' operator '+' after
    #   the class.
    #
    #   example: lit: [-=()];      # good
    #   example: lit: [a-z];       # also ok 
    #   example: lit: [-=()]+;     # not so good.
    # ----------------------------------------
    "; replace "\n    " "\n"; add "\n";
    add "#     class - "; ++; ++; get; add "\n";
    add "#  operator - + \n";
    print; zero; a-;a-; quit;
  }


  # -------------------
  # errors: five tokens
  pop;

  # -------------------
  # errors: six tokens
  pop;

  # -------------------
  # errors: seven tokens
  pop;

  # -------------------
  # errors: 8 tokens or less
  pop;

  # -------------------
  # errors: 9 tokens or less
  pop;

  # -------------------
  # errors: 10 tokens or less
  pop;

  # -------------------
  # errors: 11 tokens or less
  pop;

  # no lexing 
  (eof) {

    # incomplete programs 
    "condition*","token*","notset*","nottoken*","notsequence*","var*","attvar*",
    "attrule*","tomatch*","betweenmatch*","pattern*","ormatch*",
    "charset*","andmatch*","class*","quoted*","charquoted*","number*" {
      swap; add "\nis a "; get; add " token"; add "\n\n"; 
      print; zero; a-;a-; quit;
    }

    "charmatch*","textmatch*" {
      clear; add "
      # ----------------------------------------
      # Syntagma :
      #   a charmatch* matches one character, eg: not 'x'
      #   a textmatch* matches text, eg: not begins '<a' 
      #*
      "; replace "\n    " "\n"; add "\n";
      add "  <match>* - "; get; add "\n*#\n";
      print; zero; a-;a-; quit;
    }

    # check the content of left-hand-side rule token sequences
    # there are two helper variables at the top of the tape array. 
    # example: a b c = ...
    "sequence*" {
       clear;
       add "LHS sequence="; get; add "\n";
       add "'pushlist' cell="; mark "."; go "pushlist"; get; go "."; add "\n";
       add "'LHS' cell="; mark "."; go "LHS"; get; go "."; add "\n";
       add "\n\n"; print; zero; a-;a-; quit;
    }

    # altgroups combined with a preceding sequence
    # "LHS*=*rsequence*(*altgroup*|*","LHS*=*rsequence*(*altgroup*)*",
    "LHS*=*rsequence*(*altbuild*|*","LHS*=*rsequence*(*altbuild*)*" {
      clear; add "
      # ----------------------------------------
      # Syntagma: 
      #   sequences with alternations
      #   example: a = b c (x y|p q|
      #   example: a = b c (x y|p q|r s)
      # ----------------------------------------
      #*
      "; replace "\n      " "\n"; add "\n";
      add "        LHS* - "; get; add "\n";
      add "          =* - "; ++; get; add "\n";
      add "  rsequence* - "; ++; get; add "\n"; 
      add "          (* - "; ++; get; add "\n"; 
      add "   altbuild* - "; ++; get; add "\n"; 
      add "    |* or )* - "; ++; get; add "\n*#\n"; 
      print; zero; a-;a-; quit;
    }

    "LHS*=*rsequence*+(*lookgroup*)*{*ruleblock*" {
      clear; add "
      # ----------------------------------------
      # Syntagma: 
      #   lookahead tokens with code 
      # ----------------------------------------
      "; replace "\n      " "\n"; add "\n";
      add "       LHS* - "; get; add "\n";
      add "         =* - "; ++; get; add "\n";
      add " rsequence* - "; ++; get; add "\n"; 
      add "        +(* - "; ++; get; add "\n"; 
      add " lookgroup* - "; ++; get; add "\n"; 
      add "         )* - "; ++; get; add "\n"; 
      add "         {* - "; ++; get; add "\n"; 
      add " ruleblock* - "; ++; get; add "\n"; 
      print; zero; a-;a-; quit;
    }

    "LHS*=*rsequence*<*altgroup*>*rsequence*",
    "LHS*=*rsequence*<*rsequence*>*rsequence*",
    "LHS*=*rsequence*<*notset*>*rsequence*" {
      clear; add "
      # ----------------------------------------
      # Syntagma: 
      #   optionals between <...>
      # ----------------------------------------
      "; replace "\n      " "\n"; add "\n";
      add "       LHS* - "; get; add "\n";
      add "         =* - "; ++; get; add "\n";
      add " rsequence* - "; ++; get; add "\n"; 
      add "         <* - "; ++; get; add "\n"; 
      add "rseq/group* - "; ++; get; add "\n"; 
      add "         >* - "; ++; get; add "\n"; 
      add " rsequence* - "; ++; get; add "\n"; 
      print; zero; a-;a-; quit;
    }

    # print each token and attribute value.
    "textmatch*{*" {
      clear; 
      add "textmatch* - "; get; add "\n";
      add "        {* - "; ++; get; --; add "\n"; 
      print; zero; a-;a-; quit;
    }

    "starmatch*{*" {
      clear; 
      add "#* \n";
      add "starmatch* - "; get; add "\n";
      add "        {* - "; ++; get; --; add "\n"; 
      add "*#\n";
      print; zero; a-;a-; quit;
    }

    # left hand side of parse rule.
    "LHS*=*" {
      clear; 
      add " LHS* - "; get; add "\n";
      add "   =* - "; ++; get; add "\n"; 
      print; zero; a-;a-; quit;
    }


    # print each token and attribute value. +( should have a list of pops
    # this should help debugging lookahead syntax
    "(*altgroup*)*","(*altgroup*|*","<*altgroup*>*","<*altgroup*|*" {
      clear; add "
      # ----------------------------------------
      # Syntagma: 
      #   alternation groups are used for optionals, lookaheads, and 
      #   alternation within a rule. Usually each 'branch' of the alternation
      #   needs to have the same number of tokens or literal characters.
      # ----------------------------------------
      "; replace "\n      " "\n"; add "\n";
      add "   <* or (* - "; get; add "\n";
      add "  altgroup* - "; ++; get; --; add "\n";
      add ">* |* or )* - "; ++; ++; get; --; --; add "\n"; 
      print; zero; a-;a-; quit;
    }

    # print each token and attribute value. +( should have a list of pops
    # this should help debugging lookahead syntax
    "+(*lookgroup*)*","+(*lookgroup*|*" {
      clear; 
      add "       +(* - "; get; add "\n";
      add "lookgroup* - "; ++; get; --; add "\n";
      add "  |* or )* - "; ++; ++; get; --; --; add "\n"; 
      print; zero; a-;a-; quit;
    }

    # alternation groups and optionals 
    "LHS*=*rsequence*(*altgroup*)*","LHS*=*rsequence*<*altgroup*>*",
    "LHS*=*rsequence*(*rsequence*)*","LHS*=*rsequence*<*rsequence*>*" {
      clear; add "
      # ----------------------------------------
      # Syntagma:
      #   nearly a parse rule. <..> is used for optionals and 
      #   (..) is used for grouping alternatives
      # ----------------------------------------
      "; replace "\n      " "\n"; add "\n";
      add "      LHS* - "; get; add "\n";
      add "        =* - "; ++; get; add "\n";
      add "rsequence* - "; ++; get; add "\n";
      add "  <* or (* - "; ++; get; add "\n";
      add " altgroup* - "; ++; get; add "\n";
      add "  >* or )* - "; ++; get; add "\n"; 
      print; zero; a-;a-; quit;
    }

    # debug lookbehind syntax
    "LHS*=*+(*lookgroup*)*rsequence*" {
      clear; add "
      # ----------------------------------------
      # Syntagma:
      #   The lookbehind syntax
      #   example: a = +(x y|p q) b c d;
      #   example: a = +(x y|p q) b c d { ... }
      # ----------------------------------------
      #*
      "; replace "\n      " "\n"; add "\n";
      add "      LHS* - "; get; add "\n";
      add "        =* - "; ++; get; add "\n";
      add "       +(* - "; ++; get; add "\n";
      add "lookgroup* - "; ++; get; add "\n";
      add "        )* - "; ++; get; add "\n"; 
      add "rsequence* - "; ++; get; add "\n*#\n";
      print; zero; a-;a-; quit;
    }
    # debug lookahead groups
    "LHS*=*rsequence*+(*lookgroup*)*;*" {
      clear; add "
      # ----------------------------------------
      # Syntagma:
      # ----------------------------------------
      "; replace "\n      " "\n"; add "\n";
      add "      LHS* - "; get; add "\n";
      add "        =* - "; ++; get; add "\n";
      add "rsequence* - "; ++; get; add "\n";
      add "       +(* - "; ++; get; add "\n";
      add "lookgroup* - "; ++; get; add "\n";
      add "        )* - "; ++; get; add "\n"; 
      add "        ;* - "; ++; get; add "\n"; 
      print; zero; a-;a-; quit;
    }

    "textmatch*==*var*" {
      clear; add "
      # ----------------------------------------
      # Syntagma syntax:
      #  the program is incomplete. 
      # ----------------------------------------
      "; replace "\n      " "\n";
      # print each token and attribute value. 
      add "textmatch* - "; get; add "\n";
      add "       ==* - "; ++; get; --; add "\n";
      add "      var* - "; ++; ++; get; --; --; add "\n"; 
      print; zero; a-;a-; quit;
    }

    "LHS*=*rsequence*" {
      clear; add "
      # ----------------------------------------
      # Syntagma:
      #   you wrote a partial program. add a ';' to complete the rule
      #   and a lexing rule as well.
      # ----------------------------------------
      "; replace "\n      " "\n";
      add "      LHS* - "; get; add "\n";
      add "        =* - "; ++; get; --; add "\n";
      add "rsequence* - "; ++; ++; get; --; --; add "\n"; 
      print; zero; a-;a-; quit;
    }


    "{*action*}*" {
      clear; add "
      # ----------------------------------------
      # Syntagma syntax:
      #  the program is incomplete. 
      # ----------------------------------------
      "; replace "\n      " "\n";
      # print each token and attribute value. 
      add "      {* - "; get; add "\n";
      add " action* - "; ++; get; --; add "\n";
      add "      }* - "; ++; ++; get; --; --; add "\n"; 
      print; zero; a-;a-; quit;
    }

    "(*notset*)*" {
      clear; add "
      # ----------------------------------------
      # Syntagma syntax:
      #  a 'notset' is for negative lookaheads and groups
      #  example: (not (a b) and not (b c))
      # ----------------------------------------
      "; replace "\n      " "\n";
      # print each token and attribute value. 
      add "      (* - "; get; add "\n";
      add " notset* - "; ++; get; --; add "\n";
      add "      )* - "; ++; ++; get; --; --; add "\n"; 
      print; zero; a-;a-; quit;
    }




    "rule*","ruleset*" {
      clear;
      add "\n";
      add "# ----------------------------------------\n";
      add "# Syntagma friendly advice: \n";
      add "#   You need at least 1 lexing rule with your parsing rules\n";
      add "# Example (a well-known esoteric language):\n";
      add "#   lit: '['|']';         # lex literal tokens []\n";
      add "#   inst: [-+><.,];       # lex instructions -+><.,\n";
      add "#   block = '[' inst ']' | '[' prog ']' | '[' ']'; # a parse rule \n";
      add "#   prog = inst inst | inst block | prog inst | prog block; \n";
      add "#   eof { ()=prog { print 'valid BF program \\n'; exit;}} \n";
      add "# ----------------------------------------\n\n";
      get; add "\n\n"; 
      print; zero; a-;a-; quit;
    }

    "textrule*","textruleset*" {
      clear; add "
      # ----------------------------------------
      # Syntagma syntax therapy:
      #  text-rules are for using within blocks either in the lexing phase
      #  or the parsing phase, here is an example:
      #    [:alpha:]+ { shape: 'circle'|'square'; word:*; }
      # ----------------------------------------
      "; replace "\n     " "\n";
      get; add "\n\n"; print; zero; a-;a-; quit;
    }

    # interpolated text
    "print*concat*" {
      clear; add "
      # ----------------------------------------
      # Interpolated text: 
      #    example:  
      #    print '$1:$2'; 
      # ----------------------------------------
      "; replace "\n     " "\n";
      ++; get; --; add "\n\n"; print; zero; a-;a-; quit;
    }

  }


  push;push;push;push;push;push;push;push;push;push;push;

  # end of error parsing
  #-----------------------
  # 1 token parsing
  pop;

  # currently ignoring comments but it would be nice to transfer
  # to compiled nom code.
  "comment*" { clear; .reparse }

  #-----------------------
  # 2 token parsing
  pop;

  # not equal to in conditions 
  "!*=*","!*==*","not*==*" { clear; add "!=*"; push; .reparse }

  # example: /$1=="<a>" and $line matches [1-3]/
  "condition*and*" {
    clear; add "andcondition*and*"; push;push; .reparse
  }

  "condition*or*" {
    clear; add "orcondition*or*"; push;push; .reparse
  }


  "class*star*" {
    clear; 
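    # A sketch of the intended result, traced by hand:
    #   [:space:]* should compile to:  while [:space:]; put;
    # A class attribute that starts with '!' would instead give a
    # whilenot command, because of the replace below.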
    add 'while '; get; replace "while !" "whilenot ";
    add '; put;'; put;
    clear; add "starmatch*"; push; .reparse 
  }

  # introducing new tokens, charmatch, andcharmatch, or charmatch (charset)
  # these are to make the grammar more logical and to accommodate condition* 
  # parsing, which also uses textmatch and charmatch.
  # example: not "x";

  # other sorts of conditions, but I dont know how to handle else logic
  # properly.
  # example: $3 not begins "xx"
  "textmatch*attvar*","textmatch*sysvar*","textmatch*uservar*" {
    clear; ++; add "clear; "; get; --; add "\n  "; 

    # contains a trick to imitate if/else logic
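    # A hand-traced sketch, assuming the condition: $3 not begins "xx"
    # has already been reduced to textmatch*attvar* by the rules in this
    # section. The attribute built here would then be roughly:
    #   clear; ++;++; get; --;--;
    #     !B"xx" { !'TRUE' { d; } add 'TRUE'; } !'TRUE' { d; add 'FALSE'; }
    # so the workspace ends up holding 'TRUE' when the match succeeds
    # and 'FALSE' otherwise.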
    get; add " { !'TRUE' { d; } add 'TRUE'; } !'TRUE' { d; add 'FALSE'; }"; 
    put;
    clear; add "condition*"; push; .reparse
  }

  "not*charquoted*" {
    clear; add "!"; ++; get; --; put; 
    clear; add "charmatch*"; push; .reparse
  }

  "not*quoted*" {
    clear; add "!"; ++; get; --; put; 
    clear; add "textmatch*"; push; .reparse
  }

  # this is considered a text-match because it refers to multiple
  # characters. 
  # example: begins "x";
  "begins*charquoted*","begins*quoted*" {
    clear; add "B"; ++; get; --; put;
    clear; add "textmatch*"; push; .reparse
  }

  "begins*textmatch*" {
    clear; ++; get; 
    B"B",B"!B" { 
      clear; 
      add "Syntagma: \n";
      add "Duplicated 'begins' keyword? at line "; lines; add "\n";
      print; zero; a-; quit; 
    } 
    B"E",B"!E" { 
      clear; 
      add "Syntagma: \n";
      add "The 'ends' and 'begins' words cannot be used together \n";
      add "on the same text (at line "; lines; add ")\n";
      print; zero; a-; quit; 
    } 
    clear; add "B"; get; --; put;
    clear; add "textmatch*"; push; .reparse
  }

  "ends*charquoted*","ends*quoted*" {
    clear; add "E"; ++; get; --; put;
    clear; add "textmatch*"; push; .reparse
  }

  "ends*textmatch*" {
    clear; ++; get; 
    B"E",B"!E" { 
      clear; 
      add "Syntagma: \n";
      add "Duplicated 'ends' keyword? at line "; lines; add "\n";
      print; zero; a-; quit; 
    } 
    B"B",B"!B" { 
      clear; 
      add "Syntagma: \n";
      add "The 'ends' and 'begins' words cannot be used together \n";
      add "on the same text (at line "; lines; add ")\n";
      print; zero; a-; quit; 
    } 
    clear; add "E"; get; --; put;
    clear; add "textmatch*"; push; .reparse
  }

  "not*textmatch*" {
    clear; ++; get; 
    B"!" { 
      clear; 
      add "Syntagma: \n";
      add "Duplicated 'not' keyword? at line "; lines; add "\n";
      print; zero; a-; quit; 
    } 
    clear; add "!"; get; --; put;
    clear; add "textmatch*"; push; .reparse
  }

  "not*token*" {
    clear; add '!"'; ++; get; --; add '"'; put;
    clear; add "nottoken*"; push; .reparse
  }

  "(*nottoken*","+(*nottoken*","(*notsequence*","+(*notsequence*" {
    clear; add "(*notset*"; push; push; .reparse   
  }

  # a phantom beginblock
  "begin*{*" {
    add "beginblock*"; push; push; push; .reparse
  }
  
  # accumulate statements into the begin block
  "beginblock*action*","beginblock*lexrule*","beginblock*vardef*" {
    clear; get; !"" { add "\n"; } ++; get; --; put;
    clear; add "beginblock*"; push; .reparse
  }

  # integrate the begin block into the script.
  "start*lexrule*" {
    clear; get; ++; get; --; put;
    clear; add "lexruleset*"; push; .reparse
  }

  # some simple literal token combinations

  # +( will be the lookahead group token. This will also store the
  # list of pop;pop;... just like = and | and ( - if I do alternation groups
  # which I will.
  "+*(*" { clear; put; add "+(*"; push; .reparse }

  # this is the token attribute assignment operator
  ":*=*" {
    clear; add ":=*"; push; .reparse 
  }

  # simplifying parse rules with context token unification
  # example: begins 'x' and ends 'y' {
  # example: 'a'|'b'|[1-9] {
  "andmatch*{*","charset*{*","ormatch*{*","quoted*{*","star*{*",
  "empty*{*","charquoted*{*","class*{*" {
    clear; add "textmatch*{*"; push; push; .reparse
  }

  # simplifying parse rules with context token unification
  # example: begins 'x' and ends 'y' == $1 {...}
  # example: "green" == $colour {...}
  # compile: clear; mark "here"; go "colour"; get; go "here"; "green" {...}
  # example: begins "green" == $3 {...}
  # compile: clear; ++;++; get; --;--; B"green" {...}

  # comparisons of variables with textmatches like classes, strings, etc
  # or just comparison with the pattern-space. But I want to parse
  # conditions properly and also interpolate
  # fix: conditions
  B"andmatch*",B"ormatch*",B"star*",B"empty*",B"class*" {
    E"==*",E"var*",E"attvar*" {
      E"==*" { clear; add "textmatch*==*"; }
      E"var*" { clear; add "textmatch*var*"; }
      E"attvar*" { clear; add "textmatch*attvar*"; }
      push; push; .reparse
    }
  }

  # example: $count '1' (or) == empty (or) $1 [0-9]
  # fix: this rule needs to be adjusted for new condition* parsing.
  B"var*",B"attvar*" {
    E"andmatch*",E"ormatch*",E"star*",E"quoted*",E"empty*",E"class*" {
      push; clear; add "textmatch*"; push; .reparse
    }
  }

  # reverse the order of comparison
  # example: $1 not begins "a" 
  "attvar*textmatch*","sysvar*textmatch*","uservar*textmatch*" {
    push; clear; pop; ++;++; put; clear; add "textmatch*"; get; --;--; 
    push; push; --;--; get; ++; swap; --; put; ++;++;
    clear; .reparse
  }

  # use a lexblock phantom token here? no because that allows
  # empty lex rule blocks which seems silly.

  # context-induced parse-token simplification 
  "class*and*","quoted*and*","charquoted*and*","charmatch*and*","textmatch*and*" {
     clear; add "andmatch*and*"; push; push; .reparse
  }

  # this is a nice way to ensure that only the right sort of 
  # tokens can go into a block {...} that follows a parse-reduction rule
  # I am not sure if I should allow lex rules here but I will for now.
  # example: a = b c { exit; }
  "ruleblock*action*","ruleblock*textrule*","ruleblock*attrule*",
  "ruleblock*lexrule*" {
    # join token with newline unless the 1st is a phantom ruleblock
    clear; get; !"" { add "\n"; } ++; get; --; put;
    clear; add "ruleblock*"; push; .reparse
  }

  # interpolate variables into double quoted text in the right context
  # The quoted* will become a concat* after it is interpolated or not.
  # also interpolate in conditions.
  "print*quoted*","println*quoted*",":=*quoted*",".*quoted*",
  "==*quoted*","!=*quoted*" { 
    add "interp*"; push; push; push; .reparse
  }

  "print*charquoted*" { clear; add "print*quoted*"; push; push; .reparse }
  "println*charquoted*" { clear; add "println*quoted*"; push; push; .reparse }

  # This is a way to interpolate variables into a string.
  "quoted*interp*" {
    clear; get; 
    # single quotes are not interpolated. 
    B"'".E"'" { 
      clip;clop; put; clear; add 'add "'; get; add '";'; put;
      clear; add "concat*"; push; .reparse
    } 
    B'"'.E'"' { clip;clop; } put;
    clear; add 'add "'; get; add '"'; put;
    
    # special line and char and counter 'variables'
    # the number of lines read from the input stream
    replace "$line" '"; lines; add "';
    # the number of chars read from the input stream
    replace "$char" '"; chars; add "';
    # access the pep machine accumulator
    replace "$counter" '"; count; add "';
    # text is the text in the current tape cell 
    replace "$text" '"; get; add "';
    # get the parse-stack?
    # replace "$stack" "'; ++;++;++;put;--;--;--; d;stack;swap;get; add '";

    # can I replace any variable here?
    
    # the $n variables which are token attribute values
    # fix: this is already in the attvar, just need to do get!!.
    replace "$1" '"; get; add "';
    replace "$2" '"; ++; get; --; add "';
    replace "$3" '"; ++;++; get; --;--; add "';
    replace "$4" '"; ++;++;++; get; --;--;--; add "';
    replace "$5" '"; ++;++;++;++; get; --;--;--;--; add "';
    replace "$6" '"; ++;++;++;++;++; get; --;--;--;--;--; add "';
    replace "$7" '"; ++;++;++;++;++;++; get; --;--;--;--;--;--; add "';
    replace "$8" '"; ++;++;++;++;++;++;++; get; --;--;--;--;--;--;--; add "';
    replace "$9" '"; ++;++;++;++;++;++;++;++; get; --;--;--;--;--;--;--;--; add "';
    # fix: syntagma will allow 16 tokens in a rule

    # an optimisation!! remove empty add commands
    # replace "add '';" ""; replace 'add "";' '';

    # remove extra space from lone get.
    " get;" { clop; } add ";"; put;
    clear; add "concat*"; push; .reparse
  }
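  # a worked trace of the rule above. This is only a sketch, and it 
  # assumes that the quoted* attribute still holds its double quotes.
  # example: "at line $line"
  # compile: add "at line "; lines; add "";
  # The empty add ""; is left in place by this rule (the optimisation
  # above is commented out). Within this section it is only stripped
  # when concats are joined by the 3 token concat*.* rule.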

  # LHS token attribute assignment
  # example: @3
  # compile: ++;++; put; --;--;
  # example: @4
  # compile: ++;++;++; put; --;--;--;
  "@*digit*" {
    clear; add "@"; ++; get; --; 
    replace "@9" "++;@8;--"; replace "@8" "++;@7;--";
    replace "@7" "++;@6;--"; replace "@6" "++;@5;--";
    replace "@5" "++;@4;--";
    replace "@4" "++;@3;--"; replace "@3" "++;@2;--";
    replace "@2" "++; @1; --"; replace "@1" "put"; add ";";

    put;
    clear; add "leftattvar*"; push; .reparse
  }

  # variable length alternation sequences will compile code into 
  # this '{' token attribute, so I want to make sure that it is 
  # empty. no, fix: 
  "rsequence*{*" { 
    clear; ++; put; --; add "rsequence*{*"; 
    # dont reparse because you get an infinite loop
  }

  # set up rsequence parsing, also in alternation-groups
  # example: a = b c ( e f | g h ) i j ; # alternation group
  # example: a = b c +( e f | g h );   # lookahead group
  # example: a = b < e | h > x y;      # rsequence in and after optional 
  "=*token*","|*token*","(*token*",")*token*",
  "+(*token*","<*token*",">*token*" {
    push;
    # store pop list in = attribute
    clear; --; add "pop;"; put; ++;
    clear; add '"'; get; add '"'; 
    # dont double-wrap not-tokens in quotes
    B'"!"'.E'""' { clip; clop; } put;
    # reverse not ends with. fix:
    B'E!' { clop; clop; put; clear; add "!E"; get; } put;
    clear; add "rsequence*"; push; .reparse
  }

  # just put a pop; list in | this is used by lookgroups etc 
  "|*charquoted*" {
    clear; add "pop;"; put; clear; add "|*charquoted*";
  }

  # example: '=' rsequence = '=' name ;
  # example: '(' rsequence = '(' name ;

  # in some contexts, like in parse rules, quoted text is really a literal
  # token, so lets realise that idea by removing quotes and adding the 
  # token-star. I should use this technique with charquoted as well.
  # example: a = 'begin' block 'end';
  # example: a '--' = a b c '--' ;

  "=*quoted*","(*quoted*","+(*quoted*","<*quoted*",
  "sequence*quoted*","rsequence*quoted*" {
    push; clear; get; 
    # convert 'text' to text* for literal tokens
    # handle quoted and unquoted text.
    B"'".E"'" { clip; clop; add "*"; }
    B'"'.E'"' { clip; clop; add "*"; } 
    !E"*" { add "*"; } put; 
    clear; add "token*"; push; .reparse
  }

  # trying to convert quoted text to tokens in the parse-rule context.
  "quoted*=*","quoted*token*","quoted*sequence*" {
    replace "quoted*" "token*"; push; push; 
    # convert 'text' to text* for literal tokens
    # handle quoted and unquoted text.
    --; --; get;
    B"'".E"'" { clip; clop; add "*"; }
    B'"'.E'"' { clip; clop; add "*"; } 
    !E"*" { add "*"; } put; clear;
    ++; ++; .reparse
  }




  # set up rsequence parsing with literals, also in alternations
  # see the 3 token rule for '|*charquoted*' etc
  # also for lookahead groups
  # example: '=' rsequence = '=' name ;
  # example: '(' rsequence = '(' name ;
  "=*charquoted*","(*charquoted*","+(*charquoted*","<*charquoted*" {
    push;
    # convert 'x' to x* for literal tokens
    clear; get; 
    # fix: also handle negated literal characters? these are
    # useful in lookaheads and other circumstances.
    # but I think I need a separate token. notcharquoted*
    # example: !";" -> !";*" ????
    clip; clop; add "*"; 
    # fix: # B"'",B"!'" { add "'"; } 
    put; 
    # store pop list in = or ( or +( or [ attribute
    clear; --; add "pop;"; put; ++;
    clear; add '"'; get; add '"'; put;
    clear; add "rsequence*"; push; .reparse
  }



  # get the next character into the pattern space or nothing if EOF
  # example: next;
  # compile: !(eof) { read; }

  "next*;*" {
    clear; add "!(eof) { read; }"; put;
    clear; add "lexrule*"; push; .reparse
  }

  "between*to*","between*not*","between*ends*","between*begins*" {
    replace "between*" ""; clip; put;
    clear; add "cant mix 'between' and '"; get; 
    add "' key words (at line "; lines; add ")\n"; print; zero; a-;a-; quit;
  }
  
  "to*between*","to*not*","to*ends*","to*begins*" {
    replace "to*" ""; clip; put;
    clear; add "cant mix 'to' and '"; get; 
    add "' key words (at line "; lines; add ")\n"; print; zero; a-;a-; quit;
  }
  # reduce number of tokens
  # example: [a-z] to -> parse: pattern*to*
  "class*to*","charquoted*to*","quoted*to*" {
    clear; add "pattern*to*"; push; push; .reparse
  }

  # example: to '/end' -> parse: to*pattern*
  # no classes here, because nom://until cant do it.
  "to*charquoted*","to*quoted*" {
    clear; add "to*pattern*"; push; push; .reparse
  }

  # example: [a-z] between -> parse: pattern*between*
  "class*between*","charquoted*between*","quoted*between*" {
    clear; add "pattern*between*"; push; push; .reparse
  }

  # example: between [:space:] -> parse: between*pattern*
  "between*charquoted*","between*quoted*" {
    clear; add "between*pattern*"; push; push; .reparse
  }
  
  # the 'match' or matches keyword turns things into a textmatch
  "match*class*","match*quoted*","match*charquoted*" {
     clear; ++; get; --; put;
     clear; add "textmatch*"; push; .reparse
  }

  # the 'match' or matches keyword may be optional sometimes 
  "match*eof*","match*empty*","match*ormatch*","match*andmatch*",
  "match*tomatch*","match*betweenmatch*" {
     clop;clop;clop;clop;clop;clop; push; get; --; put; ++; 
     clear; .reparse
  }

  # syntactic sugar, 
  # example: only 'abc'  or 'a'
  # compile: [abc] or [a]
  "only*quoted*","only*charquoted*" {
    clear; ++; get; --;
    B"B","E","!" { 
      clear; add "cant combine 'only' with 'begins/ends/not'\n";
      print; zero; a-;a-; quit;
    } 
    clip; clop; put;
    clear; add "["; get; add "]"; put;
    clear; add "class*"; push; .reparse
  }

  # allow negation of classes etc, but not.quoted becomes textmatch
  "not*class*","not*empty*","not*eof*" {
    replace "not*" ""; ++; ++; put; --; --;
    clear; add "!"; ++; get; --; put; 
    clear; ++; ++; get; --; --; push; .reparse 
  }

  # allow negation of tokens, wrap in quotes
  "not*token*" {
    clear; add '!"'; ++; get; --; add '"'; put; 
    clear; add "token*"; push; .reparse 
  }


  # text begins with and text ends with. 
  #*

  # fix: This rule is superseded by textmatch and charmatch word.
  "begins*quoted*","begins*charquoted*","ends*quoted*","ends*charquoted*" {
    B"ends*" { replace "ends*" ""; }
    B"begins*" { replace "begins*" ""; }
    push;
    --; get; ++; get; 
    # 'begins-not' needs to be 'not-begins' in nom etc
    B"E!" { clop; clop; put; clear; add "!E"; get; } 
    B"B!" { clop; clop; put; clear; add "!B"; get; } 
    # print; zero; a-;a-; quit;
    --; put; ++;
    clear; .reparse
  }
  *#

  "comment*comment*" { 
    clear; get; add "\n"; ++; get; --; put;
    clear; add "comment*"; push; .reparse 
  }

  # how to include comments
  #*
  "comment*lexrule*","lexrule*comment*","lexruleset*comment*" { 
    clear; get; add "\n"; ++; get; --; put;
    clear; add "lexruleset*"; push; .reparse 
  }
  *#

  "token*token*","sequence*token*" {
    # count tokens to calculate "push;" later
    # but I probably dont need the accumulator for this.
    a+;

    # add 1 or 2 pushes to the 'pushlist' tape cell
    # this is used when calculating sequence length differences for example
    # in lookahead attribute copy. The "pushlist" is initialised with one
    # push;
    mark "here"; go "pushlist";
    B"token*" { clear; add "push;push;"; } 
    B"sequence*" { clear; add "push;"; } 
    swap; get; put; go "here";

    clear; get; ++; get; --; put; 
    clear; add "sequence*"; push; .reparse
  }

  # allow literal chars in sequences if they have already been 
  # declared with lit: [abc]; (or) lit: ';'|':';
  # eg: option = word number ';' ;
  "token*charquoted*","sequence*charquoted*" {
    # count tokens to calculate "push;" later
    a+;
    # convert 'x' to x* for literal tokens
    clear; ++; get; clip; clop; add "*"; put; --;
    clear; get; ++; get; --; put; 
    clear; add "sequence*"; push; .reparse
  }

  # allow literal chars to begin sequences if they have already been 
  # declared with lit: [abc]; (or) lit: ';'|':';
  # eg: option = '(' obj ')' ;
  # charquoted.sequence should not occur.

  "charquoted*token*","charquoted*sequence*" {
    # count tokens to calculate "push;" later
    a+;
    # convert 'x' to x* for literal tokens
    clear; get; clip; clop; add "*"; put;
    clear; get; ++; get; --; put; 
    clear; add "sequence*"; push; .reparse
  }

  # eg: opt = '(' ')' ;
  "charquoted*charquoted*" {
    a+;
    # convert 'x' to x* for literal tokens
    clear; get; clip; clop; add "*"; put;
    clear; ++; get; clip; clop; add "*"; put; --;
    clear; get; ++; get; --; put; 
    clear; add "sequence*"; push; .reparse
  }

  # need to construct the LHS here. using 'stack' is much easier
  # but feels a bit lazy, and I quite like being reminded how many 
  # tokens I am pushing.
  # example: a b c = 
  # compile: "clear; add 'a*b*c*'; push;push;push; .reparse"
  #      or: "clear; add 'a*b*c*'; stack; .reparse"
  "token*=*","sequence*=*" {
    # initialise the pushlist cell to 1 token
    B"token*" { 
      clear; mark "."; go "pushlist"; add "push;"; put; go ".";
    }
    # later have to transform this count number into
    # push; or push;push; etc
    clear; get; a+; count; put; clear; 
    # reset the token counter for the RHS 
    zero; 
    clear; add 'clear; add "'; get; add '#;';
    # 6 token limit for left-hand-side which is more than enough
    # look-ahead or context?
    replace "1#;" '"; push;';
    replace "2#;" '"; push;push;';
    replace "3#;" '"; push;push;push;';
    replace "4#;" '"; push;push;push;push;';
    replace "5#;" '"; push;push;push;push;push;';
    replace "6#;" '"; push;push;push;push;push;push;';
    add " .reparse"; put;
    # save into top of tape for variable length alternations
    mark "here"; go "LHS"; put; go "here";
    clear; add "LHS*=*"; push; push; .reparse
  }

  #*
  no, old rule, remove
  "token*;*","sequence*;*" {
    clear; get; a+; count; put;
    clear; zero; add "RHS*"; push; .reparse
  }
  *#

  # just simplify parse rules, while maintaining the separation
  # between the lexing and parsing sections.
  "lexrule*rule*" { clear; add "lexruleset*rule*"; }
  "lexruleset*rule*" { clear; add "lexruleset*ruleset*"; }

  "lexruleset*ruleset*".(eof) { 
    clear; 
    add "# -------------------------------------\n";
    add "# nom script created by www.nomlang.org/eg/syntagma.pss\n\n";
    add "begin { nop; }\nread; put; \n"; get; 

    # if the parser doesn't consume or delete a character from the
    # input stream, then it is an error. stop the show.
    add "\n!'' { \n";
    # fix: escape \n \t etc so they are visible.
    # add "  replace '\\n' '\\\\n'; \n"; 
    add "  put; clear; add 'unlexed character \"'; get; add '\" ';\n";
    add "  add 'at line '; lines; add ' of input.\\n'; \n";
    add "  add 'All characters in the input should be lexed or ignored\\n'; \n";
    add "  print; clear; zero; a-; a-; quit; \n";
    add "}";
    add "\n\nparse>\n"; 
    add "# show the parse-stack reductions.\n";
    add 'add "## line:"; lines; add " char:"; chars; ';
    add 'add " "; print; clear; \n';
    add 'unstack; print; stack; (eof) { add " EOF"; }  \n';
    add '# show last attribute if required.\n';
    add '# add " ("; --; get; ++; add ")"; \n';
    add '# replace "\\n" "\\n##      ";\n';
    add 'add "\\n"; print; clear;\n';
    ++; get; --; put; 
    clear; add "grammar*"; push; .reparse
  }

  # lists of textrules (eg: keyword:'to'|'is'|'a';)
  # if we mix lexrules with text then they become textrules
  "textrule*action*","textruleset*action*",
  "textrule*textrule*","textrule*lexrule*","textruleset*textrule*",
  "lexrule*textrule*","lexruleset*textrule*","textruleset*lexrule*" {
    # dont add a newline to a phantom block
    clear; get; !"" { add "\n"; } ++; get; --; put;
    clear; add "textruleset*"; push; .reparse
  }

  "lexrule*lexrule*","lexruleset*lexrule*","action*lexrule*",
  "lexrule*action*","lexruleset*action*" {
    clear; get; add "\n"; ++; get; --; put;
    clear; add "lexruleset*"; push; .reparse
  }

  "rule*rule*","ruleset*rule*","rule*action*",
  "action*rule*","ruleset*action*" {
    clear; get; add "\n"; ++; get; --; put;
    clear; add "ruleset*"; push; .reparse
  }

  "delete*;*","trim*;*","ltrim*;*","rtrim*;*" {
    clear; get; add "; put; "; put;
    clear; add "action*"; push; .reparse
  }
  "exit*;*" {
    clear; get; add ";"; put;
    clear; add "action*"; push; .reparse
  }


  "action*action*" {
    clear; get; add "\n"; ++; get; --; put;   
    clear; add "action*"; push; .reparse
  }

  # do not allow actionblock to contain attribute rules.
  "actionblock*action*","actionblock*lexrule*" {
    clear; get; add "\n"; ++; get; --; put;   
    clear; add "actionblock*"; push; .reparse
  }

  # reduce token diversity for string concatenations
  "text*.*","sysvar*.*","uservar*.*","attvar*.*" {
    clear; add "concat*.*"; push; push; .reparse
  }

  #-----------------------
  # 3 token parsing
  pop;

  # condition token simplification
  # example: /$2=="<>" and $line ends "0"/ --> condition

  "/*andcondition*/*","/*orcondition*/*" {
    clear; add "/*condition*/*"; push;push;push; .reparse
  }

  "andcondition*and*condition*" {
    clear; get; 
    add "\n  'TRUE' {\n"; 
    ++;++; swap; replace "\n" "\n  "; swap; add "    "; get; --;--; 
    add "\n  }"; put;
    clear; add "andcondition*"; push; .reparse
  }

  "orcondition*or*condition*" {
    clear; get; 
    add "\n  !'TRUE' {\n"; 
    ++;++; swap; replace "\n" "\n  "; swap; add "    "; get; --;--; 
    add "\n  }"; put;
    clear; add "orcondition*"; push; .reparse
  }


  # some rules for conditions and condition sets like andcondition
  
  # just switch the order of condition. This makes the compilation to
  # nom slightly easier
  # example: $2 == "($3)" --> "($3)" == $2
  "attvar*==*concat*","attvar*==*uservar*","attvar*==*sysvar*",
  "attvar*!=*concat*","attvar*!=*uservar*","attvar*!=*sysvar*",
  "sysvar*==*concat*","uservar*==*concat*","sysvar*!=*concat*","uservar*!=*concat*" {

    # reverse tokens
    ++;++;++; 
    push;push; put; clear; pop; ++; swap; get; put; --; 
    clear; pop; ++;++; swap; get; --;--;
    --;--;--;
    push;push;push;

    --;--;--; get; ++;++; swap; --;--; put; ++;++;++;
    clear; .reparse 
  }

  # switch quoted==$1 twice to get interpolation
  # eg: "($1)" == $2 -> reverse -> reverse with concat*
  "quoted*==*attvar*","quoted*==*sysvar*","quoted*==*uservar*" {
    replace "quoted*==*" ""; add "==*quoted*";
    push;push;push;
    clear; --;--;--; get; ++;++; swap; --;--; put; ++;++;++;
    clear; .reparse 
  }


  # a condition, that may occur between /../ or before braces?
  # but quoted will get interpolated if "..." and converted to 
  # concat* in any case.
  # example: / $2 == "two: $2" /

  # example:  $2 == "two: $2" /
  # fix: do concat==sysvar, but this is tricky because you cant just do (==)
  "sysvar*==*attvar*","uservar*==*attvar*","attvar*==*attvar*",
  "concat*==*attvar*","concat*==*uservar*" {
    clear; add "clear;\n  "; get; add "\n  "; 
    ++;++; swap; 
    replace 
      "get;" 
      "\n  (==) { d; add 'TRUE'; } !(==).!'TRUE' { d; add 'FALSE'; }"; 
    swap; get; --;--; put;
    clear; add "condition*"; push; .reparse
  }
  # example:  $2 != "two: $2" /
  # example:  $2 not == "two: $2" /
  "sysvar*!=*attvar*","uservar*!=*attvar*","attvar*!=*attvar*",
  "concat*!=*attvar*","concat*!=*uservar*" {
    clear; add "clear;\n  "; get; add "\n  "; 
    ++;++; swap; 
    replace 
      "get;" 
      "\n  !(==) { d; add 'TRUE'; } (==).!'TRUE' { d; add 'FALSE'; }"; 
    swap; get; --;--; put;
    clear; add "condition*"; push; .reparse
  }


  # lex zero or more chars from the input stream.
  # example: get [:alpha:]*;     # zero or more alphabetic chars
  # example: get not [:space:]*; # zero or more non-whitespace chars

  "lex*starmatch*;*" {
    clear; ++; get; --; put;
    clear; add "lexrule*"; push; .reparse 
  }

  # concats are how text is interpolated and concatenated.
  "concat*.*concat*","concat*.*text*","concat*.*sysvar*","concat*.*uservar*", 
  "concat*.*attvar*" {
    clear; get; ++;++; get; --;--; 
    # remove empty adds. which occur when a variable is at the start or
    # end of an interpolated string.
    replace 'add "";' ''; replace "add '';" ""; put;
    clear; add "concat*"; push; .reparse
  }

  "notset*and*notsequence*","notset*and*nottoken*" {
     clear; get; ++; ++; add "."; get; --; --; put;
     clear; add "notset*"; push; .reparse 
  }

  # an actionblock* cannot contain attrules* because we dont know the
  # length of the sequence.
  "altgroup*>*{*",">*rsequence*{*" {
    push; push; push; clear; put;
    add "actionblock*"; push; .reparse
  }

  "declare*var*;*" {
    # the var already has fetch code in it...
    clear; ++; get; --; 
    replace 'mark "here"; go' 'mark';
    replace 'get; go "here";' ''; add '++;'; put;
    clear; add "vardef*"; push; .reparse
  }

  # reverse the order
  "var*==*textmatch*","attvar*==*textmatch*" {
    clear; get; ++;++; swap; --;--; put;
    clear; add "textmatch*==*var*"; push; push; push; .reparse
  }

  # A phantom textruleset to start the block
  "==*attvar*{*","==*var*{*","rsequence*)*{*" {
    push; push; push; 
    put; add "textruleset*"; push; .reparse
  }

  # Let's check for empty brackets (because of the phantom token above).
  # fix: put in the error section?
  "{*textruleset*}*","{*ruleblock*}*" {
    ++; swap; "" {
      add "Empty block braces {} found at line "; lines; add "\n"; 
      print; zero; a-;a-; quit;
    } 
    swap; --;
  }


  # for deleting tokens and maybe checking, I was using 0 but I need that
  # for a number.
  # example: () = a b ;
  # compile: "a*b*" { clear; .reparse }
  "(*)*=*" {
    clear; add "clear; .reparse"; put;
    clear; add "LHS*=*"; push; push; .reparse 
  }

  #*
  # a lookahead grouping, for a single token sequence. this is not as
  # useful as (a|b|c) for lookaheads.
  # example: a = b (c d);
  # compile: "b*c*" { replace "b*c*d*" "a*c*d*"; push; push; .reparse }
  #      or: B"b*".E"c*" { replace "b*" "a*"; push; push; .reparse }
    now look at the more complicated
    example: a b = c d e (a b|c e); # must be equal length alternations
    compile: 
    pop;pop;pop;
    B"c*d*e*".!"c*d*e*" { 
      # add a start marker like '#'
      E"a*b*",E"c*e*" { 
        # add start marker, somehow
        replace "#c*d*e*" "a*b*"; push;push;push;push; 
        # !! now need to copy attributes from a b and c e to new
        # positions. this will be challenging.
        .reparse
      }
    }
    push;push;push;
  *#

  # sequence alternations within (...) and <...> are called altgroups
  # example: ('.' x | ',' y )
  "(*rsequence*|*","<*rsequence*|*" {
    replace "rsequence*" "altgroup*"; push; push; push;
    clear; --; --; get; 
    put; ++; ++; clear; .reparse
  }

  "+(*rsequence*)*","+(*rsequence*|*" {
    replace "rsequence*" "lookgroup*"; push; push; push;
    clear; --; --; add "E"; get; 
    # reverse not-ends-with, for not tokens for example
    B'E!' { clop; clop; put; clear; add "!E"; get; }
    put; ++; ++; 
    clear; .reparse
  }

  # this turns quoted text into a token when it is in an alternation in 
  # a parse rule, but not when it is in a lexing 'orset'.
  # example: a = b c | 'begin' ;
  "rsequence*|*quoted*","altgroup*|*quoted*","lookgroup*|*quoted*" {
    push; push;
    # convert 'abc' to "abc*" for literal tokens
    clear; get; 
    B"'".E"'" { clip; clop; add "*"; }
    B'"'.E'"' { clip; clop; add "*"; } 
    !E"*" { add "*"; } put; 
    clear; add "token*"; push; .reparse
  }

  # I need this to avoid a clash with lexing alternations (charset* token)
  # because a charset = charset | charset;
  "rsequence*|*charquoted*","altgroup*|*charquoted*","lookgroup*|*charquoted*" {
    push; push;
    # convert 'x' to "x*" for literal tokens
    # convert !'x' to !'x*' for negated literal tokens
    clear; get; clip; clop; add "*"; put; 
    clear; add '"'; get; add '"'; put;
    # store pop list in = attribute
    clear; --; add "pop;"; put; ++;
    clear; add "rsequence*"; push; .reparse
  }
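  # a worked trace of the rule above. This is only a sketch, and it
  # assumes that the charquoted* attribute is the character in single quotes.
  # example: a = b c | ';'     (the ';' that follows the | )
  # compile: ";*"              (saved as the new rsequence* attribute)
  # and the pop list "pop;" is stored in the attribute of the | token.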

  # part of the new rule parsing code. An rsequence is a sequence
  # of tokens on the right hand side of the = 

  # example: = a b c 
  # compile: = rseq
  # example: + ( '.' b | c d )
  # compile: +(*rsequence*|*rsequence*)*

  "=*rsequence*token*","|*rsequence*token*",
  "(*rsequence*token*","+(*rsequence*token*","<*rsequence*token*",
  ")*rsequence*token*",">*rsequence*token*" {
    # save the context token
    push;
    # store pop list in the '=' or '|' attribute. This will be used
    # for compilation later, but also to check rsequence lengths
    clear; --; get; add "pop;"; put; ++;
    # wrap sequence in quotes
    clear; get; clip; ++; get; add '"'; --; put;
    clear; add "rsequence*"; push; .reparse
  }

  # rule sequences with literals
  # example: = a 'c' 
  # compile: = rseq
  # example: ( a b '#' 
  "=*rsequence*charquoted*","|*rsequence*charquoted*",
  "(*rsequence*charquoted*","+(*rsequence*charquoted*",
  "<*rsequence*charquoted*",")*rsequence*charquoted*" {
    # save the context token
    push;

    # convert 'x' to x* for literal tokens
    clear; ++; get; clip; clop; add "*"; put; --;
    # store pop list in the '=' or '|' attribute. This will be used
    # for compilation later, but also to check rsequence lengths
    clear; --; get; add "pop;"; put; ++;
    # wrap sequence in quotes
    clear; get; clip; ++; get; add '"'; --; put;
    clear; add "rsequence*"; push; .reparse
  }


  # the second item can be a string because we compile to 'until'
  # but second item cant be a class because of 'untils' limitations
  # this compiles an incomplete snippet that will be completed later
  # example: '[' to ']' 
  # example: [:;] to '/end'
  # compile: '[' { until ']'; put; 
  "pattern*to*pattern*" {
    clear; get; add ' { until '; ++; ++; get; --; --;
    add '; put; '; put; clear; add "tomatch*"; push; .reparse 
  }

  # this compiles an incomplete snippet that will be completed later
  # example: '[' between [:space:] 
  # compile: '[' { whilenot [:space:]; put; 
  # example: [:;] until '/'
  # bug: this is allowing 'a' between 'ab' because everything is a 
  #  pattern. 
  "pattern*between*pattern*" {
    clear; ++; ++; get; 
    # convert from quoted to class
    B"'".E"'" { 
      clip; clop; "]" { clear; add "\\]"; } put;
      clear; add "["; get; add "]"; put; 
    }
    --; --; 
    clear; get; add ' { whilenot '; ++; ++; get; --; --;
    add '; put; '; put; clear; add "betweenmatch*"; push; .reparse 
  }


  # and logic, but this cannot be mixed with OR | logic  
  # fix: I can remove all the class.and.quoted rules etc because this
  # is dealt with by:
  #   >> andset and = class and | quoted and | charquoted and ;
  "andmatch*and*quoted*","andmatch*and*class*","andmatch*and*charquoted*",
  "andmatch*and*charmatch*","andmatch*and*textmatch*" {
    clear; get; ++; get; ++; get; --; --; put;
    clear; add "andmatch*"; push; .reparse
  }

  "delete*quoted*;*","delete*charquoted*;*" {
     clear; add "replace "; ++; get; --; add " '';"; put;
     clear; add "action*"; push; .reparse
  }
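  # a sketch of what the rule above emits, assuming the quoted* 
  # attribute still includes its quote characters:
  # example: delete 'abc';
  # compile: replace 'abc' '';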
  
  # print statements
  # example: print "error at line: $line \n";
  # compile: clear; add "error at line: "; lines; add " \n"; print; clear;
  "print*concat*;*" {
    clear; add "clear; "; ++; get; --; add " print; clear;"; put;
    clear; add "action*"; push; .reparse
  }

  # the same but adds a newline
  "println*concat*;*" {
    clear; add "clear; "; ++; get; --; add " add '\\n'; print; clear;"; put;
    clear; add "action*"; push; .reparse
  }

  # example: print 'x';
  "print*charquoted*;*" {
    clear; add "clear; add "; ++; get; --; add "; print; clear;"; put;
    clear; add "action*"; push; .reparse
  }

  # delete from the input stream all following matching chars
  "ignore*class*;*" {
    clear; 
    add "# ignore-rule \n";
    ++; get; add " { while "; get; add "; "; get; add " { clear; } }"; --; put;
    clear; add "lexrule*"; push; .reparse
  }
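  # a sketch of what the rule above emits, assuming the class* attribute
  # is the class text exactly as written in the source:
  # example: ignore [:space:];
  # compile: # ignore-rule 
  #          [:space:] { while [:space:]; [:space:] { clear; } }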

  # eg: EOF: name = capital lowerchars; 
  # example: eof: print "hi";
  "eof*:*rule*","eof*:*action*" {
    replace ":*" "{*"; add "}*";
    push; push; push; push; .reparse
  }
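  # a sketch of the token rewrite done by the rule above. No attribute
  # is changed; only the parse tokens are renamed and a closing brace 
  # token is appended, so that the {*...*}* block rules can take over.
  # example: eof: print "hi";
  # parse: eof*:*action*  -->  eof*{*action*}*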

  # simplify lex parsing, 
  # textrules and lexrules can only occur in the lexing phase of
  # the syntagma script.
  "{*textrule*}*","{*lexrule*}*","{*lexruleset*}*" {
    clear; add "{*textruleset*}*"; 
    push; push; push; .reparse
  }

  # indent code in braces
  "{*textruleset*}*","{*action*}*","{*ruleset*}*","{*rule*}*",
  "{*ruleblock*}*","{*beginblock*}*" {
    push; push; push;
    add "\n"; --; --; get; replace "\n" "\n  "; put; ++; ++;
    clear; pop; pop; pop; 
  }

  # orsets, which are alternations in lexing rules. 
  "quoted*|*quoted*","quoted*|*charquoted*","quoted*|*class*" {
    clear; get; add ","; ++; ++; get; --; --; put;
    clear; add "ormatch*";
  }

  "charset*|*quoted*","charquoted*|*quoted*","class*|*quoted*" {
    clear; get; add ","; ++; ++; get; --; --; put;
    clear; add "ormatch*";
  }

  "ormatch*|*quoted*","ormatch*|*charquoted*","ormatch*|*class*" {
    clear; get; add ","; ++; ++; get; --; --; put;
    clear; add "ormatch*";
  }

  # but these should be able to be expressed by classes like [ab\n]
  # charsets eg: 'a'|'b'|'\n'
  #          eg: [a-z]|'x'|'y'
  "charquoted*|*charquoted*","charquoted*|*class*","charset*|*charquoted*",
  "class*|*charquoted*","charset*|*class*","class*|*class*" {
    clear; get; add ","; ++; ++; get; --; --; put;
    clear; add "charset*";
  }
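  # a worked trace of the charset rules above. This is only a sketch,
  # and it assumes that each attribute holds the text as written.
  # example: 'a'|'b'|[0-9]
  #   'a'|'b'         --> charset* with attribute 'a','b'
  #   charset*|[0-9]  --> charset* with attribute 'a','b',[0-9]
  # The comma-separated list is the form used for OR tests elsewhere
  # in this script (eg: "a*b*","c*.*").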

  # eg: exit 4;
  # compile: zero; a+; a+; a+; a+; quit;
  "exit*digit*;*" {
    clear; add "zero; "; ++; get; --; add "#";
    # a rather silly trick, which only handles the digits 0 to 5.
    # todo: negative numbers
    replace "5#" "4# a+;"; replace "4#" "3# a+;";
    replace "3#" "2# a+;"; replace "2#" "1# a+;";
    replace "1#" "0# a+;"; replace "0#" ""; 
    add " quit;"; put;
    clear; add "action*"; push; .reparse
  }
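  # a worked trace of the replace chain above for "exit 3;". This is
  # only a sketch, and it assumes that the digit* attribute is "3".
  #   zero; 3#
  #   zero; 2# a+;
  #   zero; 1# a+; a+;
  #   zero; 0# a+; a+; a+;
  #   zero;  a+; a+; a+; quit;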

  #-----------------------
  # 4 token parsing
  pop;

  # make an altgroup into an altsequence so that we can combine
  # the rsequence tokens with the altgroup tokens into 1 alternation.
  # this will allow multiple alternation groups in the RHS of a rule.
  # example: a = b c ( x y z |
  "rsequence*(*altgroup*|*" {

    # build the altsequence attribute, the B begin test will be substituted
    # with E in other contexts
    clear; add "B"; get; clip; ++;++; swap; clop; swap; get; put; --;--; 
    clear; add "rsequence*(*altbuild*|*"; push;push;push;push; .reparse
  }

  # allow delete: [class]; and ignore: [class];
  "delete*:*class*;*","ignore*:*class*;*" {
    clear; ++;++; get; --; put; --;
    clear; add "ignore*class*;*"; push;push;push; .reparse
  }

  # allow negation of sequences of tokens on the right-hand-side of 
  # a parse rule. These can be used in "notsets" which are and logic
  # sets of negated tokens or sequences of tokens. 
  # example: e = e '*' e +(not ('*' e) and not ('/' e));
  "not*(*rsequence*)*" {
    clear; ++; ++; add "!"; get; --; --; put;
    # put the pop; list in the previous invisible token ( ) | = etc
    # clear; ++; get; --; --; put; ++;
    clear; add "notsequence*"; push; .reparse
  }

  # like awks begin blocks
  "begin*{*beginblock*}*" {
    clear; add "begin {"; ++;++; get; --;--; add "\n}\n"; put;
    clear; add "start*"; push; .reparse
  }

  # alternation group parsing. need to check for unequal length sequences.
  # example: (a b | c '.')  
  # compile: "a*b*","c*.*"

  "altgroup*|*rsequence*|*","altgroup*|*rsequence*)*",
  "altgroup*|*charquoted*|*","altgroup*|*charquoted*)*",
  "altgroup*|*rsequence*>*","altgroup*|*charquoted*>*" {
    # a pop list is already in | - see 2 token rule for charquoted.
    replace "|*rsequence*" ""; replace "|*charquoted*" ""; 
    push; push;

    # workspace is clear. get the pop; lists in ( and | .The (* token
    # is just before the altgroup* token, but not visible here.
    --; get; --; --; 
    !(==) { 
      clear; 
      add "\n";
      add "The sequences in the alternation group were of unequal length\n";
      add "(rule on line "; lines; add ") \n";
      add "This is currently not allowed in alternation groups (x|y|z) \n";
      print; zero; a-;a-; quit;
    }
    ++; ++; ++;
    clear; --; --; get; ++; ++; add ","; get; --; --; put; ++; ++;
    clear; .reparse
  }

  # lookahead group parsing. need to check for unequal length sequences.
  # example: a b | c '.' |  (or) a b | c '.' )
  # compile: E"a*b*",E"c*.*"
  "lookgroup*|*rsequence*|*","lookgroup*|*rsequence*)*" {
    # here do "lookgroup|charquoted)" as well, but need to put
    # a push list in | - see 2 token rule
    replace "|*rsequence*" ""; replace "|*charquoted*" ""; 
    push; push;

    # workspace is clear. get the pop; lists in +( and |. The +(* token
    # is just before the lookgroup* token, but not visible here.
    --; get; --; --; 
    !(==) { 
      clear; 
      add "\n";
      add "The sequences in the lookahead alternation were of unequal length\n";
      add "(rule on line "; lines; add ") \n";
      add "This is not allowed in lookahead groups +( ...) \n";
      print; zero; a-;a-; quit;
    }
    ++; ++; ++;

    clear; --; --; get; ++; ++; add ",E"; get; --; --; put; ++; ++;
    clear; .reparse
  }


  # make a phantom 'ruleblock*' token, to help with parsing. A phantom
  # token is a token created with an empty attribute value, and without actually
  # parsing anything from the input stream. It must be created in a 
  # particular context, and must avoid interfering with other parse rules.

  # I need to use this also in the lexblocks because it is so good
  "LHS*=*rsequence*{*","LHS*=*alttail*{*","+(*lookgroup*)*{*","(*altgroup*)*{*",
  "lookgroup*)*rsequence*{*","altgroup*)*rsequence*{*","/*condition*/*{*",
  "(*altbuild*)*{*","altbuild*)*rsequence*{*" {
    push;push;push;push; clear; put;
    add "ruleblock*"; push; .reparse
  }
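
  # A sketch of how the phantom token changes the parse stack, assuming
  # the stack holds the four tokens of the first pattern above. The rule
  # pushes the tokens back, clears the workspace, puts that empty workspace
  # into the next tape cell, and then pushes the new token name:
  #   before: LHS*=*rsequence*{*
  #   after:  LHS*=*rsequence*{*ruleblock*    (ruleblock* attribute is empty)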

  #*
  # assign text to LHS token attributes by getting attributes from the RHS
  # example: a = b c { @1 := "$1 : $2"; }
    compile: 
      pop;pop; "b*c*" {
        clear; get; add " : "; ++;get;--; put;
        clear; add "a*"; push; .reparse
      }
      push;push;
  *# 

  "leftattvar*:=*concat*;*","leftattvar*:=*attvar*;*",
  "leftattvar*:=*sysvar*;*","leftattvar*:=*uservar*;*" {
    # get the string/variable concatenation code;
    clear; ++;++; add "clear; "; get;

    # append @1 etc code from leftattvar
    --; --; add "\n"; get; put;
    clear; add "attrule*"; push; .reparse
  }

  # new LHS/RHS rule parsing
 
  # in the parse token attributes for '=' and '|', just preceding
  # the rsequence, we have stored the pop list 'pop;pop;etc'. we
  # can compare these to check if the sequences are the same length.
  # if they are different lengths, then the compilation procedure is
  # quite different, and in the case of '{' possibly nonsensical.
  # if they are unequal we will store a flag in the 1st "UNEQUAL"
  # this needs some rethinking...  
  # example: a = b c | e f ;
  # compile: pop;pop; "b*c*","e*f*" { clear; add "a*"; push; .reparse } push;push;
  # example: a = b | e f ;
  # compile: 
  #   pop; "b*" { clear; add "a*"; push; .reparse } push;
  #   pop;pop; "e*f*" { clear; add "a*"; push; .reparse } push;push;
  #
  # as can be seen, the second compilation is more tricky
  # because we have to separate into 2 blocks. I believe that the 
  # 2nd example requires a variable LHS stored on the tape, because we
  # need to grab that var as soon as we find an unequal sequence....

  "rsequence*|*rsequence*;*","rsequence*|*alttail*;*" {
    # save token sequence in tape cell above ';' attribute
    ++;++;++;++; put; --;--;--;--;
    clear; --; get; ++; ++;   
    # tape test
    # a trick to keep the poplist but flag the alternation as having
    # unequal length sequences. 
    # Here I could compile uneven sequences into the '{' token attribute
    # and remove "|*alttail*" Then when completing the rule, I check '{' for
    # compiled code and include it.
    # !(==) { --; --; replace "pop;" "unequal;"; put; ++; ++; }
    
    # --------------------------------
    # attempting to compile unequal alternations to the ';' attribute
    !(==) { 
      clear; 
      # get the pop; list from the '|' token attribute
      --; ++; add "\n"; get; add "\n"; 
      # add the token match list
      ++; get; add " {\n  "; 
      mark "here"; go "LHS"; get; go "here";
      add "\n}"; ++; swap; get; put; 

      # print; zero; a-;a-; quit;
      --; --; 
      # get the pop; list from '|' attribute and make a push; list
      clear; add "\n"; get; replace "pop;" "push;"; 
      ++; ++; swap; get; --; --;

      # put all compiled code into new ';' attribute
      put; --; 
      clear; add "rsequence*;*"; push; push; .reparse
    }

    # print; zero; a-;a-; quit;
    # restore token sequence. dont need to reparse
    clear; --; ++;++;++;++; get; --;--;--;--;
  }

  "rsequence*|*rsequence*;*","rsequence*|*alttail*;*" {
    # compose alternation
    clear; get; add ","; ++; ++; get; --; --; put;
    # copy ';' attribute down, this may contain compiled code
    # for unequal length alt sequences
    clear; ++; ++; ++; get; --; --; put; --;
    clear; add "alttail*;*"; push; push; .reparse
  }


  # I think variable length sequences in alternations for rules
  # that have a composition block {} are nonsensical, so I will disallow
  # them here
  "rsequence*|*rsequence*{*","rsequence*|*alttail*{*" {
    # save token sequence in { or ; attribute
    ++;++;++; put; --;--;--; clear; --; get; ++; ++;   
    !(==) { 
      clear; 
      add "\n";
      add "The sequences in the alternation were of unequal length\n";
      add "(alternation on line "; lines; add ") \n";
      add "This is not allowed in parse-token reduction rules that \n";
      add "have a following block\n";
      print; zero; a-;a-; quit;
    }
    # restore token sequence. dont need to reparse
    clear; --; ++; ++; ++; get; --;--;--;
  }

  # tail-reduction of RHS token sequences before '{' or ';'
  # this is quite elegant because the rsequences have already been
  # wrapped in quotes.
  # example: ... c d | e g {
  # compile: "c*d*","e*g*" {
  "rsequence*|*rsequence*{*","rsequence*|*alttail*{*" {
    clear; get; add ","; ++; ++; get; --; --; put;
    clear; add "alttail*{*"; push; push; .reparse
  }

  # this could also be an unequal alternation list with code in
  # the ';' attribute.

  # compile a complete rule. The pop;pop; list is stored in the '='
  # LHS should already have its compiled code
  # NOTE: that the ';' token attribute will contain code for unequal 
  # length sequences, and so should be added here.
  # example: a = d e ;
  # compile: pop;pop; "d*e*" { clear; add "a*"; push; .reparse } 
  "LHS*=*rsequence*;*" {

    clear; ++; get; add "\n"; ++; get; add " "; --; --; 
    add "{\n  "; get; add "\n}\n"; put;

    # here: build the push;push; list and add to nom code
    clear; ++; get; --; replace "pop;" "push;"; swap; get; 

    # add the unequal sequence compiled code (from the ';' attribute)
    ++; ++; ++; get; --; --; --; put;
    clear; add "rule*"; push; .reparse
  }

  # normally the rsequences can be same or different lengths.
  # the compilation for unequal length sequences is pretty special.
  # it involves creating separate blocks for each branch and 
  # compiled nom code is saved in the ';' attribute and copied with
  # that token.
  # example: a = b c | e f ;
  # compile: 
  #   pop;pop; "b*c*","e*f*" { clear; add "a*"; push; .reparse } push;push;
  # example: a = b | e f ;
  # compile: 
  #   pop; "b*" { clear; add "a*"; push; .reparse } push;
  #   pop;pop; "e*f*" { clear; add "a*"; push; .reparse } push;push;
  #
  # as can be seen, the second compilation is more tricky

  "LHS*=*alttail*;*" {
    #* remove:
    # check for unequal length sequences in the alternation
    # this is obsolete code, since unequal sequences are compiled
    clear; ++; get; B"unequal;" {  
      clear; 
      add "\n";
      add "The sequences in the alternation were of unequal length\n";
      add "(alternation on line "; lines; add ") ";
      add "... \n";
      print; zero; a-;a-; quit;
    } --; 
    *#

    # build code with pop;pop; list and token match list
    clear; ++; get; add "\n"; ++; get; add " "; --; --; 
    add "{\n  "; get; add "\n}\n"; put; clear;

    # here: build the push;push; list and do swap;get;
    ++; get; --; replace "pop;" "push;"; swap; get;

    # add the unequal sequence compiled code (from the ';' attribute)
    ++; ++; ++; get; --; --; --; 
    put;
    clear; add "rule*"; push; .reparse
  }

  # eg: match '<' to '>' { tag: '<a>'|'<b>'; }
  # compile: 
  #  '<' { until '>'; put; '<a>','<b>' 
  #        { clear; add "tag*"; push; .reparse } }

  "tomatch*{*textruleset*}*","betweenmatch*{*textruleset*}*",
  "tomatch*{*action*}*","betweenmatch*{*action*}*" {
    clear; 
    add "# lex-rule \n";
    get; replace '" { until' '" {\n  until'; 
    # not needed here???
    replace "while !" "whilenot "; 
    # indenting is done above
    ++; ++; get; --; --;
    add '\n}'; put;
    clear; add "lexrule*"; push; .reparse 
  }

  # the second item can be a string because we compile to 'until'
  # but second item cant be a class because of until's limitations
  # example: register: '[' to ']' ;
  # example: register: [:;] to '/end' ;
  # compile: '[' { until ']'; put; clear; add "register*"; push; .reparse }
  "token*:*tomatch*;*" {
    clear; ++; ++; get; --; --; 
    add ' clear; add "'; get; add '"; push; .reparse }'; put;
    clear; add "lexrule*"; push; .reparse 
  }

  # example: register: '[' between [:space:] ;
  # compile: '[' { whilenot [:space:] ; put; clear; add "register*"; push; .reparse }
  "token*:*betweenmatch*;*" {
    clear; ++; ++; get; --; --; 
    add ' clear; add "'; get; add '"; push; .reparse }'; put;
    clear; add "lexrule*"; push; .reparse 
  }


  # this allows a default token for all text, in this context 
  # I want the * to create a default token name even if the 
  # pattern space is empty. But in 'match * { ... }' it only matches
  # if pattern space is not empty, silly???? fix

  # example: shape: * ;
  # compile: !'' { clear; add "shape*"; push; .reparse }
  "token*:*star*;*" {
    clear; 
    add "clear; add '"; get; add "'; push; .reparse"; put;
    clear; add "lexrule*"; push; .reparse
  }

  # example: lit: [,.;];
  # compile: [,.;] { add "*"; push; .reparse }
  # example: lit: ','|':'|'x' ;
  # example: [a-z]+ { lit: 'while'|'do'|'end'; * {println 'error';exit;} }
  # compile: ',',':','x' { add "*"; push; .reparse }

  "lit*:*class*;*","lit*:*charset*;*","lit*:*charquoted*;*" {
    clear; ++; ++; get; --; --; 
    add " { add '*'; push; .reparse }"; put;
    clear; add "lexrule*"; push; .reparse
  }

  # multicharacter literals, but these should go in a block after
  # lexing a sequence.
  "lit*:*quoted*;*","lit*:*ormatch*;*" {
    clear; ++; ++; get; --; --; 
    add " { add '*'; push; .reparse }"; put;
    clear; add "textrule*"; push; .reparse
  }

  # for empty do 'match empty { etc }'
  # eg: EOF { name = capital lowerchars; }
  "eof*{*rule*}*","eof*{*ruleset*}*" {
    clear; add "(eof) {"; ++; ++; get; --; --; add "\n}\n"; put;
    clear; add "rule*"; push; .reparse
  }

  # example: eof { print 'yes'; }
  "eof*{*action*}*" {
    clear; add "(eof) {"; ++; ++; get; --; --; add "\n}\n"; put; 
    clear; add "action*"; push; .reparse
  }


  # eg: EOF { letter: [a-z]; print 'bye'; exit 2; } 
  "eof*{*textruleset*}*" {
    clear; add "(eof) {"; ++; ++; get; --; --; add "\n}\n"; put;
    clear; add "lexrule*"; push; .reparse
  }

  # lex tokens with AND and OR | logic
  # example: keyword: 'is';
  # compile: 'is' { put; clear; add "keyword*"; push; .reparse }
  # example: logic: 'is'|'or'|'and';
  # compile: 'is','or','and' { put; clear; add "logic*"; push; .reparse }
  # example: 0number: [:digit:] AND begins '0' 
  # compile: [:digit:].B'0' { put; clear; add "0number*"; push; .reparse }
  "token*:*quoted*;*","token*:*ormatch*;*","token*:*andmatch*;*" {
    clear; ++; ++; get; --; --;
    add ' { put; clear; add "'; get; add '"; push; .reparse }'; put;
    clear; add "textrule*"; push; .reparse 
  }

  # example: char: [:alpha:];
  # compile: [:alpha:] { clear; add "char*"; push; .reparse }
  "token*:*class*;*" {
    clear; ++; ++; get; --; --;
    add ' { clear; add "'; get; add '"; push; .reparse }'; put;
    clear; add "lexrule*"; push; .reparse 
  }

  # example: space: ' ';
  # compile: ' ' { clear; add "space*"; push; .reparse }
  "token*:*charquoted*;*","token*:*charset*;*" {
    clear; ++; ++; get; --; --;
    add ' { clear; add "'; get; add '"; push; .reparse }'; put;
    clear; add "lexrule*"; push; .reparse 
  }

  
  #*
  fix: remove
  "andmatch*{*textruleset*}*","andmatch*{*action*}*",
  "charset*{*textruleset*}*","charset*{*action*}*",
  "ormatch*{*textruleset*}*","ormatch*{*action*}*",
  "quoted*{*textruleset*}*","quoted*{*action*}*",
  "star*{*textruleset*}*","star*{*action*}*",
  "empty*{*textruleset*}*","empty*{*action*}*",
  "charquoted*{*textruleset*}*","charquoted*{*action*}*",
  "class*{*textruleset*}*","class*{*action*}*" {
    clear; get; add " {"; 
    ++; ++; get; --; --; add "\n}"; put; 
    clear; add "lexrule*"; push; .reparse 
  }
  *#

  # example: match 'ok' { print 'bye!'; exit 0; }
  # compile: 'ok' { ... }
  "textmatch*{*textruleset*}*","textmatch*{*action*}*" {
    clear; get; add " {"; 
    ++; ++; get; --; --; add "\n}"; put; 
    clear; add "lexrule*"; push; .reparse 
  }


  #-----------------------
  # 5 token parsing
  pop;


  # allow negation of sequences of tokens on the right-hand-side of 
  # a parse rule. These can be used in "notsets", which are AND-logic
  # sets of negated tokens or sequences of tokens. 
  # example: e = e '*' e +(not ('*' e) and not ('/' e));

  B";*",B"<*",B">*",B"(*",B")*",B"|*",B"=*" {
    E"not*(*rsequence*)*" {
      clear; ++; ++; add "!"; get; --; --; put;
      # put the pop; list in the previous token ( ) | = etc
      clear; ++; get; --; --; put; ++;
      clear; add "notsequence*"; push; .reparse
    }
  }

  # this should not interpolate the quoted text
  "declare*var*:=*quoted*;*","declare*var*:=*charquoted*;*" {
    # the var already has fetch code in it...
    clear; ++; get; --; replace 'mark "here"; go' 'mark';
    replace 'get; go "here";' ''; 
    add "add "; ++; ++; ++; get; --; --; --; add '; put; ++;'; put;
    clear; add "vardef*"; push; .reparse
  }

  # example: word: [:alpha:]+ ;
  # compile: [:alpha:] { while [:alpha:]; put; clear; add "word*"; push; .reparse }
  # example: word: ![a-z]+ ;
  # compile: ![a-z] { whilenot [a-z]; put; clear; add "word*"; push; .reparse }
  "token*:*class*+*;*" {
    clear; ++; ++; get; add ' { while '; get; --; --; 
    # while ![a-z]; is not valid nom syntax (sadly) 
    replace "while !" "whilenot ";
    add '; put; clear; add "'; get; 
    add '"; push; .reparse }'; put;
    clear; add "lexrule*"; push; .reparse 
  }

  # eg: [a-z]+ { colour: 'green'|'blue'|'x'; }
  # compile: 
  #  [a-z] { 
  #    while [a-z]; put; 
  #    'green','blue','x' { clear; add "colour*"; push; .reparse }
  #  }

  "class*+*{*textrule*}*","class*+*{*textruleset*}*",
  "class*+*{*lexrule*}*","class*+*{*lexruleset*}*",
  "class*+*{*action*}*" {
    clear; get; 
    add ' {\n  while '; get; replace "while !" "whilenot ";
    add '; put;'; 
    # indenting is done above
    ++; ++; ++; get; --; --; --;
    add '\n}'; put;
    clear; add "lexrule*"; push; .reparse 
  }

  #-----------------------
  # 6 token parsing
  pop;

  # build altsequences
  # example: a = b c (x y | p q |
  # This builds a nom alternation, eg: B"b*c*x*y*",B"b*c*p*q*" in altsequence
  "rsequence*(*altbuild*|*rsequence*|*",
  "rsequence*(*altbuild*|*rsequence*)*" {
    # save final token for later
    E")*" { ++;++;++;++;++; clear; add ')'; put; --;--;--;--;--; }
    E"|*" { ++;++;++;++;++; clear; add '|'; put; --;--;--;--;--; }
    # check if sequences in (..) are of same length
    clear; ++; get; ++;++;
    !(==) { 
      clear; 
      add "
        The sequences in the alternation group were of unequal length.
        example: a = b c ( d | e f);  # not ok. (line "; lines; add ") 
        This is only possible in syntagma in non-group alternations 
        example: a = b | c d | e f g; # ok \n\n";
      replace "\n     " "\n";
      print; zero; a-;a-; quit;
    }
    --;--;--; 
    # get existing altsequence then build new branch
    clear; ++;++; get; --;--; add ",B"; get; clip; 
    ++;++;++;++; swap; clop; swap; get; --;--; put; --;--;
    clear; add "rsequence*(*altbuild*"; 

    # restore final token  - )* or |*
    ++;++;++;++;++; get; add '*'; --;--;--;--;--; 

    push;push;push;push; .reparse
  }

  # a syntax to check the value of a token attribute
  # I am avoiding reversing the test because that will match "textmatch{...}"
  # example: "green" == $2 { ... } 
  # compile: clear; ++; get; --; "green" { ... }
  "textmatch*==*attvar*{*textruleset*}*","textmatch*==*var*{*textruleset*}*" {
    # change '++;++; get; --; etc' to '++; swap; --;'
    # change 'go "xxx"; get; go "here" etc' to 'go "xxx"; swap; '
    # then add a swap at the end of the block. This preserves the token sequence?
    clear; ++;++; get; replace "get;" "swap;"; put; --;--;
    clear; 
    add "clear;\n"; ++;++; get; add '\n'; --;--; get; 
    add ' { '; ++;++;++;++; get; --;--;--;--; add "\n}\n";
    ++;++; get; --;--;
    put;
    clear; add "textrule*"; push; .reparse
  }

  # compile a complete rule with a following block. 
  # The pop;pop; list is stored in the '='
  # LHS should already have its compiled code. The code in the block
  # should be compiled before the LHS code.
  # example: a = d e { print 'reduced!\n'; }
  # compile: 
  #  pop;pop; "d*e*" { 
  #    clear; add "reduced!\n"; print; clear;
  #    clear; add "a*"; push; .reparse 
  #  } 
  "LHS*=*rsequence*{*ruleblock*}*","LHS*=*alttail*{*ruleblock*}*" {
    # make push list and store in ';' attribute
    clear; ++; get; replace "pop;" "push;"; ++; ++; put; --; --; --; 
    clear; ++; get; add "\n"; ++; get; add " "; --; --; 
    add "{ "; 
    # add block code
    ++; ++; ++; ++; get; --; --; --; --; add "\n  ";
    # add LHS code
    get; add "\n}\n"; 
    ++; ++; ++; get; --; --; --; put;
    clear; add "rule*"; push; .reparse
  }

  #*
  # compile a complete rule with alternation and a following block. 

  # example: a = d e | f g { print 'reduced!\n'; }
  # compile: 
  #  pop;pop; "d*e*","f*g*" { 
  #    clear; add "reduced!\n"; print; clear;
  #    clear; add "a*"; push; .reparse 
  #  } 
  *#
  
  #-----------------------
  # 7 token parsing
  pop;

  # the new condition syntax. the parse rule only triggers if the 
  # condition is met.
  # example: a = b c /$1 == "text"/;
  # example: a = b c /$1 == $2 and $line not matches [01]/;
  # example: a = b c /$line == $3/;
  # example: a b = b /$line == "4" or $char != "0"/;

  "LHS*=*rsequence*/*condition*/*;*" {

    # append the initial poplist and sequence match code
    clear; ++; get; add "\n"; ++; get; add " {\n  "; 

    # fix: some code to prevent $3 being used in rule like: a = b c/../;
    # because $3 is meaningless here. need to compare to pop; list
    # save token sequence up in tape by converting pop list to 
    # eg "++;++; put; --;--;"
    --;
    swap; replace "pop;" "++;"; swap; get; add " put; ";
    swap; replace "++;" "--;"; swap; get; add "\n  ";
    ++;
    # append condition code and brace block
    ++;++; get; add "\n  "; add '"TRUE" {\n    ';

    # append left-hand-side rule code 
    --;--;--;--; get; add "\n  }\n  "; 

    # append restore token sequence code, and final push list 
    add "clear; ";
    ++;
    swap; replace "--;" "++;"; swap; get; add " get; ";
    swap; replace "++;" "--;"; swap; get; add "\n}\n";
    swap; replace "--;" "push;"; swap; get; add "\n";
    --;

    put;
    clear; add "rule*"; push; .reparse
  }
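
  # A sketch of the code this rule is expected to emit for the first
  # example above, a = b c /$1 == "text"/; traced by hand from the rule
  # body and not verified against the compiler. It assumes that the '='
  # attribute holds 'pop;pop;', that the LHS attribute holds
  # 'clear; add "a*"; push; .reparse', and that the compiled condition
  # leaves "TRUE" in the workspace when the condition is met.
  #   pop;pop;
  #   "b*c*" {
  #     ++;++; put; --;--;
  #     (compiled condition code goes here)
  #     "TRUE" {
  #       clear; add "a*"; push; .reparse
  #     }
  #     clear; ++;++; get; --;--;
  #   }
  #   push;push;
  # The put/get pair saves the token sequence above the attributes and
  # restores it after the condition code has clobbered the workspace.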

  # alternations preceded by a token sequence on the RHS of a rule
  # example: a = b c d (x y| p q| s t);
  "LHS*=*rsequence*(*altbuild*)*;*" {
    # get the combined poplist and save in (
    clear; ++; get; ++;++; get; put; add "\n"; 

    # append the complete token test from altbuild, and convert to equals not begins
    ++; swap; clop; replace '",B"' '","'; swap; get; add " {\n  ";

    # append the LHS code
    --;--;--;--; get; add "\n}\n"; 

    # append the push list
    ++;++;++; swap; replace "pop;" "push;"; swap; get; --;--;--;

    put;
    clear; add "rule*"; push; .reparse
  }
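
  # A hand-traced sketch of the expected output for the example above,
  # a = b c d (x y| p q| s t); not verified against the compiler. It
  # assumes '=' holds 'pop;pop;pop;', '(' holds 'pop;pop;', the LHS holds
  # 'clear; add "a*"; push; .reparse', and altbuild holds the begin tests
  # B"b*c*d*x*y*",B"b*c*d*p*q*",B"b*c*d*s*t*"
  #   pop;pop;pop;pop;pop;
  #   "b*c*d*x*y*","b*c*d*p*q*","b*c*d*s*t*" {
  #     clear; add "a*"; push; .reparse
  #   }
  #   push;push;push;push;push;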

  "stack*(*rsequence*)*{*textruleset*}*","stack*(*altgroup*)*{*textruleset*}*",
  "stack*(*rsequence*)*{*ruleblock*}*","stack*(*altgroup*)*{*ruleblock*}*" {
    clear; 
    add "clear; unstack;\n"; ++;++; get; ++;++;++;
    add " {"; get; add "\n}\nstack;"; 
    --;--;--;--;--; put;
    clear; add "rule*"; push; .reparse
  }

  # parse optionals where there is no following sequence
  # andsets may be of some use here, but I wont worry for now.
  # example: a = b c [ d | e];
  # this is compiled into 2 separate nom blocks.
  # actually just delegate to rs < altgroup > rs ;
  "LHS*=*rsequence*<*altgroup*>*;*","LHS*=*rsequence*<*rsequence*>*;*",
  "LHS*=*rsequence*<*notset*>*;*" {
    # make a push list for rsequence and optionals, save in ';'
    clear; ++; get; ++; ++; get; ++; ++; ++; 
    replace "pop;" "push;"; put; --; --; --; --; --; --; 
    clear; 
    # get pop; list and sequence
    ++; get; add "\n"; ++; get; add " {\n  "; 
    --; --; get; add "\n}\n";
    # now get optional pop; list and sequence alternation
    ++; ++; ++; get; add "\n"; 
    # begins-with rsequence
    --; add "B"; get; add ".!"; get; add " {\n  " ;
    # ends-with optional sequence alternation
    # I believe the replace below is safe because "," wont occur anywhere
    # else in the compiled code. Also, works for [rsequence] but not 
    # for andsets.
    ++; ++; add "E"; get; replace '","' '",E"';
    add " {\n    ";
    # add LHS compiled code
    --; --; --; --; get; add "\n  }\n}\n";
    # get the saved push list
    ++; ++; ++; ++; ++; ++; get; add "\n"; 
    --; --; --; --; --; --;
    put; clear; add "rule*"; push; .reparse
  }
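
  # A hand-traced sketch of the 2 blocks this rule is expected to emit
  # for the example above, a = b c [ d | e]; not verified against the
  # compiler. It assumes '=' holds 'pop;pop;', '<' holds 'pop;', the
  # altgroup attribute is "d*","e*" and the LHS attribute is
  # 'clear; add "a*"; push; .reparse'
  #   pop;pop;
  #   "b*c*" {
  #     clear; add "a*"; push; .reparse
  #   }
  #   pop;
  #   B"b*c*".!"b*c*" {
  #     E"d*",E"e*" {
  #       clear; add "a*"; push; .reparse
  #     }
  #   }
  #   push;push;push;
  # The first block reduces the bare sequence; the second reduces the
  # sequence when it is followed by one of the optional branches.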

  # an alternation group followed by a sequence of tokens. This is only 
  # distinguished from the look-behind syntax by (...) instead of +(...)
  # The ( and ) tokens contain pop lists. 
  # example: x y = (b c | p q) r s; 
  "LHS*=*(*altgroup*)*rsequence*;*" {
    # make initial poplist and save in ) attribute
    clear; ++;++; get; ++;++; get; put; add "\n";
    
    # append altgroup begin test.
    --; swap; clop; swap; add 'B"'; get; 
    replace '","' '",B"'; add " {\n  ";

    # append rsequence test
    ++;++; add "E"; get; add " {\n    ";

    # append the LHS code
    --;--;--;--;--; get; add "\n  }\n}\n";

    ++;++;++;++; 
    swap; replace "pop;" "push;"; swap; get;
    --;--;--;--; put;
    clear; add "rule*"; push; .reparse
  }
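
  # A hand-traced sketch of the expected output for the example above,
  # x y = (b c | p q) r s; not verified against the compiler. It assumes
  # that '(' and ')' each hold 'pop;pop;' and that the altgroup attribute
  # is "b*c*","p*q*". The compiled LHS code is shown as a placeholder.
  #   pop;pop;pop;pop;
  #   B"b*c*",B"p*q*" {
  #     E"r*s*" {
  #       (compiled LHS code goes here)
  #     }
  #   }
  #   push;push;push;push;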

  # fix: also add 'notset*' here, for syntax such as
  #   a = +(not a b and not a c)
  #   "LHS*=*rsequence*+(*notset*)*;*" {

  # this is the new lookbehind syntax
  # example: a = +(x y|p q) b c d;
  "LHS*=*+(*lookgroup*)*rsequence*;*" {
    # make the RHS pop; list and store in )*
    clear; ++;++; get; ++;++; get; put; add "\n";

    # append the lookgroup sequences, and modify E" -> B" for lookbehind
    --; swap; clop; clop; swap; add 'B"'; 
    get; replace '",E"' '",B"'; add " {\n  ";

    # append the rsequence tokens and start block, add 'E"*' to make sure
    # that the look-behind tokens are there
    ++;++; swap; clop; swap; add 'E"*'; get; add " {\n    ";

    # get the lookgroup pop list and make a push list
    --;--;--; swap; replace "pop;" "push;"; swap; get;

    # append the LHS code 
    --;--; add "\n    "; get; add "\n  }\n}\n";
    
    # append the saved combined pop list from )* after converting to push
    ++;++;++;++; swap; replace "pop;" "push;"; swap; get;
    --;--;--;--;
    put;

    clear; add "rule*"; push; .reparse
  }

  # I may have to keep the LHS push; list in a variable 
  # because I need to access it separately to align the tape
  # pointer to the start of the look up group; Or use stack?
  "LHS*=*rsequence*+(*lookgroup*)*;*","LHS*=*rsequence*+(*andmatch*)*;*" {
    E"andmatch*)*;*" {
      # this is a bit dubious..fix:
      clear; ++;++;++;++; get; replace "!'" "!E'"; 
      # also add a 1 pop list for the andset.
      put; --; clear; add "pop;"; put; --;--;--;
    }
    # temporarily fix the LHS
    clear; get; 
    replace "clear; add " ""; replace " .reparse" ""; put;

    # make an rsequence push; list from '=' and store in ';'
    clear; ++; get; replace "pop;" "push;"; ++;++;++;++;++; put;
    --; --; --; --; --; --;
    clear;

    # make a lookgroup push; list from '+(' and store in ')'
    clear; ++; ++; ++; get; replace "pop;" "push;"; ++; ++; put;
    --; --; --; --; --;

    clear;
    # construct pop; list at top of parse block
    # this list consists of rsequence length + lookahead length
    ++; get; ++; ++; get; add "\n";

    # match the rsequence
    # example: B"a*b*c*".!"a*b*c*" {
    --; add "B"; get; add ".!"; get; add " {\n  ";

    # match the lookahead group, but I want to do E"*x*y*",E"*p*q*"
    # this is necessary to match whole tokens and not partial tokens
    # in the lookahead.
    # example: E"*x*y*",E"*p*q*" {
    ++; ++; 
    # add a star in front of each token sequence
    swap; clop; clop; replace ',E"' ',E"*'; swap; add 'E"*';
    get; add " {\n";

    # build the replace command, and push list. The push list is
    # the LHS length + Lookahead length.
    # example: replace "a*b*c*" "new*"; push;push;push;

    # I am just going to check for rsequence in lookahead and halt if true.
    # but this clobbers the current attribute

    # fix: something like this needs to go in the lookahead rule code...
    add "    put; replace "; --; --; 
    swap; clop; swap; add '"*'; get; add ' ""; !(==) {\n';
    add "      clear; add 'lookahead contains reduction sequence.\\n';\n";
    add "      add 'This is an error condition. Please modify \\n';\n";
    add "      add 'the syntagma grammar. \\n'; print; zero; a-;a-; quit;\n";
    add "    }\n";
    add '    replace "'; get; --; --; add " "; 
    # here I could try to use a trick to avoid multiple replace
    # but I cant do it, because LHS has '"a*b*"; push;push;'
    # example:
    #   replace "*a*b*" "****a*b*"; 
    #   replace "a*b*" "new*"; 
    #   replace "****a*b*" "*a*b*"; 
    #   push;push;push;
    # This trick should avoid replacing sequences that dont start
    # the workspace. But a ^ anchor for replace would be better.

    # build the push list
    get; add "\n    ";  
    ++; ++; ++; ++; ++; get; 
    add " .reparse";

    #*
    # ?? copy down all attributes in lookgroup\n";
    # need to realign to the end of rsequence...do this by
    # subtracting the push; list in LHS, but this push list also
    # has the name,... 
      A dodgy strategy: add "add ";get LHS attrib, get +( attribute, now we have
      >> add "l*g*"; push;push;pop;pop;
      >> replace '"; push;' '";clear;push;';
      now we have 
      >> add "l*g*";clear;push; ... pop;
      >> replace "push;" "++;"; replace "pop;" "--;";
      now we have
      >> add "...";clear;--;--;++;++;
      and this will realign the pointer?
    *# 

    add "\n  }\n}\n";

    # build final push; list from ')' and ';' 
    # which is rsequence + lookahead lengths
    get; ++; get; add "\n";
    # print; 
    --; --; --; --; --; --;
    put;
    # clear; add "LHS*=*rsequence*+(*lookgroup*)*;*";
    clear; add "rule*"; push; .reparse
  }

  # ---------------------
  # 8 token parsing
  pop;
  # parse optionals 
  # example: a = b c [ x y | e f ] c;
  # this is compiled into 2 separate nom blocks.

  "LHS*=*rsequence*<*altgroup*>*rsequence*;*",
  "LHS*=*rsequence*<*rsequence*>*rsequence*;*",
  "LHS*=*rsequence*<*notset*>*rsequence*;*" {
    # make a push list for rsequence and optionals, save in ';'
    clear; ++; get; ++; ++; get; ++; ++; get; ++; ++;
    replace "pop;" "push;"; put; --; --; --; --; --; --; --;
    clear; 
    # get both rsequence pop; list and both sequences
    ++; get; ++;++;++;++; get; --;--;--;--; add "\n"; 
    ++; get; ++;++;++;++; get; --;--;--;--; 
    replace '""' ''; add " {\n  "; 
    --; --; get; add "\n}\n";
    # now get optional pop; list and sequence alternation
    ++; ++; ++; get; add "\n"; 
    # begins-with rsequence
    --; add "B"; get; add ".!"; get; add " {\n  " ;
    # ends-with optional sequence alternation
    # I believe the replace below is safe because "," wont occur anywhere
    # else in the compiled code. Also, works for [rsequence] but not 
    # for andsets.
    ++; ++; add "E"; get; replace '","' '",E"';
    add " {\n    ";
    # add LHS compiled code
    --; --; --; --; get; add "\n  }\n}\n";
    # get the saved push list
    ++; ++; ++; ++; ++; ++; get; add "\n"; 
    --; --; --; --; --; --;
    put; clear; add "rule*"; push; .reparse
  }

  # alternations preceded and succeeded by a token sequence 
  # example: a = b c d (x y| p q| s t) n m;
  "LHS*=*rsequence*(*altbuild*)*rsequence*;*" {
    # get the combined poplist and save in )
    clear; ++; get; ++;++; get; ++;++; get; put; add "\n"; --;--;

    # append the complete token test from altbuild, leaving begin test
    ++; get; add " {\n  ";

    # append last rsequence code converting to ends test
    ++;++; add "E"; get; add " {\n    "; --;--;

    # append the LHS code
    --;--;--;--; get; add "\n  }\n}\n"; 

    # append the push list
    ++;++;++;++;++; 
    swap; replace "pop;" "push;"; swap; get; 
    --;--;--;--;--;

    put;
    clear; add "rule*"; push; .reparse
  }


  # ---------------------
  # 9 token parsing
  pop;
  # condition syntax with ruleblocks. the parse rule only triggers if the 
  # condition is met.
  # example: a = b c /$1 == "text"/ { println "xx"; }

  "LHS*=*rsequence*/*condition*/*{*ruleblock*}*",
  "LHS*=*rsequence*/*orcondition*/*{*ruleblock*}*",
  "LHS*=*rsequence*/*andcondition*/*{*ruleblock*}*" {

    # append the initial poplist and sequence match code
    clear; ++; get; add "\n"; ++; get; add " {\n  "; 

    # fix: some code to prevent $3 being used in rule like: a = b c/../;
    # because $3 is meaningless here. need to compare to pop; list

    # save token sequence up in tape by converting pop list to 
    # eg "++;++; put; --;--;"
    --;
    swap; replace "pop;" "++;"; swap; get; add " put; ";
    swap; replace "++;" "--;"; swap; get; add "\n  ";
    ++;

    # append condition code and brace block
    ++;++; get; add "\n  "; add '"TRUE" {';

    # append rule block code.
    ++;++;++; swap; replace "\n" "\n  "; swap; get; --;--;--; 

    # append left-hand-side rule code 
    --;--;--;--; add "\n    "; get; add "\n  }\n  "; 

    # append restore token sequence code, and final push list 
    add "clear; ";
    ++;
    swap; replace "--;" "++;"; swap; get; add " get; ";
    swap; replace "++;" "--;"; swap; get; add "\n}\n";
    swap; replace "--;" "push;"; swap; get; add "\n";
    --;

    put;
    clear; add "rule*"; push; .reparse
  }
  # alternations preceded by a token sequence on the RHS of a rule,
  # with a following rule block
  # example: a = b c d (x y| p q| s t) { print 'reduced!\n'; }
  "LHS*=*rsequence*(*altbuild*)*{*ruleblock*}*" {
    # get the combined poplist and save in (
    clear; ++; get; ++;++; get; put; add "\n"; 

    # append the complete token test from altbuild, and convert to equals not begins
    ++; swap; clop; replace '",B"' '","'; swap; get; add " {";

    # append the ruleblock code 
    ++;++;++; get; add "\n  "; --;--;--;

    # append the LHS code
    --;--;--;--; get; add "\n}\n"; 

    # append the push list
    ++;++;++; swap; replace "pop;" "push;"; swap; get; --;--;--;

    put;
    clear; add "rule*"; push; .reparse
  }

  # an alternation group followed by a sequence of tokens and a rule block.
  # This is only distinguished from the look-behind syntax by (...) instead
  # of +(...). The ( and ) tokens contain pop lists. 
  # example: x y = (b c | p q) r s { print 'reduced!\n'; }
  "LHS*=*(*altgroup*)*rsequence*{*ruleblock*}*" {
    # make initial poplist and save in ) attribute
    clear; ++;++; get; ++;++; get; put; add "\n";
    
    # append altgroup begin test.
    --; swap; clop; swap; add 'B"'; get; 
    replace '","' '",B"'; add " {\n  ";

    # append rsequence test
    ++;++; add "E"; get; add " {";

    # append ruleblock code and indent 2 (already indented 2)
    ++;++; swap; replace "\n" "\n  "; swap; get; --;--; add "\n    "; 

    # append the LHS code
    --;--;--;--;--; get; add "\n  }\n}\n";

    ++;++;++;++; 
    swap; replace "pop;" "push;"; swap; get;
    --;--;--;--; put;
    clear; add "rule*"; push; .reparse
  }
  # fix: also add 'notset* here, for syntax such as
  #   a = +(not a b and not a c)
  #   "LHS*=*rsequence*+(*notset*)*;*" {

  # this is the new lookbehind syntax with a rule block
  # example: a = +(x y|p q) b c d { @1:="$1,$2,$3";}
  # note that the attribute variables $1,etc do not reference the look-behind
  # tokens.
  "LHS*=*+(*lookgroup*)*rsequence*{*ruleblock*}*" {
    # make the RHS pop; list and store in )*
    clear; ++;++; get; ++;++; get; put; add "\n";

    # append the lookgroup sequences, and modify E" -> B" for lookbehind
    --; swap; clop; clop; swap; add 'B"'; 
    get; replace '",E"' '",B"'; add " {\n  ";

    # append the rsequence tokens and start block
    ++;++; add "E"; get; add " {\n    ";

    # get the lookgroup pop list and make a push list
    --;--;--; swap; replace "pop;" "push;"; swap; get; 

    # append the rule block code here, after indenting 2 more spaces
    # (it is already indented 2).
    ++;++;++;++;++; swap; replace "\n" "\n  "; swap; get;
    --;--;--;--;--;

    # append the LHS code 
    --;--; add "\n    "; get; add "\n  }\n}\n";
    
    # append the saved combined pop list from )* after converting to push
    ++;++;++;++; swap; replace "pop;" "push;"; swap; get;
    --;--;--;--;
    put;

    clear; add "rule*"; push; .reparse
  }
  # lookahead  with a rule block. To achieve this we need to copy attributes
  # from the lookahead tokens to their new positions in the parse stack,
  # if the LHS sequence is shorter than the RHS sequence (rsequence).
  # This copy procedure is somewhat verbose. I think I will prohibit the
  # LHS token sequence being longer than the RHS sequence, because it 
  # complicates the attribute copy, and it doesn't seem very useful anyway.

  # The tokens '=' and '+(' contain a list of pops which indicates the
  # length of the following sequence, or alternations group members.
  # I will probably use a variable to hold the pop; or push; list for
  # the LHS since there is nowhere to save it.

  # example: a = b c +(x y | p q) { @1 := "$1/$2"; println 'found a'; }
  # example: n m = b c +('.' | ',') { @1 := "$1/$2"; println 'found a'; }
  "LHS*=*rsequence*+(*lookgroup*)*{*ruleblock*}*" {
    clear; 
    # first create a complete pop; list for rsequence and lookgroup
    # and save it in the '{' token.
    ++; get; ++; ++; get; ++; ++; ++; put; mark "poplist";
    --; --;--; --; --; --; add "\n";
    # build block test eg: B"b*c*" { E"x*y*",E"p*q*" { ...
    ++; ++; add "B"; get; add " {\n  "; ++; ++; get; add " {\n";
    # build the replace, eg: replace "b*c*" "a*";
    --; --; add "    replace "; get; add " "; --; --; 
    swap; replace "clear; add " ""; replace " .reparse" ""; 

    # ------------------------------------
    # some serious juggling here. need the push later to 
    # calculate the length difference between the LHS and rsequence.
    replace '"; push;' '"; #push;'; swap;
    get; add "\n    ";

    # build code to: save new token sequence lower on tape and push later.
    mark "here"; go "poplist"; swap;
    replace "pop;" "++;"; swap; get; add " put; ";
    swap; replace "++;" "--;"; swap; get;
    go "here";
    # add compiled ruleblock code
    add "\n    # ----------";
    add "\n    # code block in {}";
    ++;++;++;++;++;++;++;
    # re-indent the code
    swap; replace "\n" "\n  "; swap; get; 
    --;--;--;--;--;--;--;

    # save built code so far.

    put; clear;

    # copy lookahead token attributes. this is the trickiest part.
    # first create a "diff" pop; list (difference in length between LHS
    # and RHS sequences.)
    
    # compare LHS and rsequence push/pop lists
    # 'pushlist' contains just push; list for LHS 

    mark "here"; go "pushlist"; get;  go "here";

    # get the RHS pop; list from '=' and build the 
    # lookahead attribute copy code there.
    ++; get;  # eg: "push;push;pop;pop;pop;" for "a b= c d e..."

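    # cancel matching push;/pop; pairs so that only the length difference
    # remains (the replaces below cancel up to 4+4+2+1 = 11 pairs)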
    replace "push;push;push;push;pop;pop;pop;pop;" ""; 
    replace "push;push;push;push;pop;pop;pop;pop;" ""; 
    replace "push;push;pop;pop;" ""; 
    replace "push;pop;" ""; replace " " "";

    # do not allow the LHS to be longer than the RHS with these 
    # lookahead rules, because attribute copy becomes too hard...
    B"push;" {
      clear; 
      add "The left hand side token sequence is longer than \n";
      add "the right hand side sequence. This is not permitted \n";
      add "within parse rules with lookahead tokens within +( and ) \n";
      print; zero; a-;a-; quit;
    }
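    # if the LHS and RHS sequences are the same length, the workspace is
    # now empty and there is nothing to copy (handled by the "" test below).
    # otherwise build the attribute copy code.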
    !"" {
      # now only "pop;" list. eg "pop;pop;"
      put; replace "pop;" "++;"; add "++; get; ";
      swap; replace "pop;" "--;"; swap; get; add " put; --;"; 

      # save code eg: "++;++; get; --; put;" (for 2 token difference)
      put; clear;

      # get +( pop; list (lookahead length)
      ++; ++; get; --; --;

      # 1 lookahead token
      "pop;" { clear; add "\n"; get; }
      # 2 lookahead tokens
      "pop;pop;" { 
        clear; add "\n"; get; add "\n";
        add "++;"; get; add "--;"; add "\n";
      }
      # 3 lookahead tokens
      "pop;pop;pop;" { 
        clear; add "\n"; get; add "\n";
        add "++;"; get; add "--;"; add "\n";
        add "++;++;"; get; add "--;--;"; add "\n";
      }
      # 4 lookahead tokens
      "pop;pop;pop;pop;" { 
        clear; add "\n"; get; add "\n";
        add "++;"; get; add "--;"; add "\n";
        add "++;++;"; get; add "--;--;"; add "\n";
        add "++;++;++;"; get; add "--;--;--;"; add "\n";
      }

      put; clear; 
      add "\n# -----------------------."; 
      add "\n# lookahead attribute copy."; get;
      replace "\n" "\n    ";
      # print; zero; a-;a-; quit;
      # etc for more lookahead tokens
    }
    "" { add "\n    # LHS length = RHS length: no attribute copy\n"; }

    --;
    swap; get;

    # build code to: get saved new token sequence from tape and push.
    add "\n    clear; ";
    mark "here"; go "poplist"; swap;
    replace "--;" "++;"; swap; get; add " get; ";
    swap; replace "++;" "--;"; swap; get;
    go "here";
    add " stack;\n  }\n}\n";

    # get the pop list and convert to push list
    ++;++;++;++;++;++; swap; replace "--;" "push;"; swap; get;
    --;--;--;--;--;--;
    put; 
    clear; add "rule*"; push; .reparse
  }

  # ---------------------
  # 10 token parsing
  pop;


  # lookbehind syntax with conditions
  # example: a = +(x y|p q) b c /$1 == $2/; 
  # todo: "LHS*=*+(*lookgroup*)*rsequence*/*condition*/*;*" {}

  # 2 alternation groups on the RHS of a parse rule
  # example: a = (x|y) p q s (n m|o p);
  "LHS*=*(*altgroup*)*rsequence*(*altbuild*)*;*" {
    # get the complete pop list and save in (
    clear; ++;++; get; ++;++; get; ++;++; get; put; add "\n";

    # append the altgroup test
    --;--;--; add "B"; swap; replace '","' '",B"'; swap; get; add " {\n  ";

    # append altbuild test and convert to ends test
    # but convert to E"*a*b*c*" with preceding * to ensure that false
    # matches can't happen if altgroup=altbuild etc. ?
    ++;++;++;++; 
    swap; clop;clop; replace '",B"' '",E"*'; 
    swap; add 'E"*'; get; add " {\n    ";

   
    # get LHS code
    --;--;--;--;--;--;--;
    get; add "\n  }\n}\n"; 

    # append the push list
    ++;++;++;++;++;++; 
    swap; replace "pop;" "push;"; swap; get;
    --;--;--;--;--;--;
    put;
    clear; add "rule*"; push; .reparse
  }

  # alternations preceded and succeeded by a token sequence, with a rule block
  # example: a = b c d (x y| p q| s t) n m { print "wow"; }
  "LHS*=*rsequence*(*altbuild*)*rsequence*{*ruleblock*}*" {
    # get the combined poplist and save in )
    clear; ++; get; ++;++; get; ++;++; get; put; add "\n"; --;--;

    # append the complete token test from altbuild, leaving begin test
    ++; get; add " {\n  ";

    # append last rsequence token test converting to ends test
    ++;++; add "E"; get; add " {"; 

    # append the ruleblock code indented 2 spaces more
    ++;++; swap; replace "\n" "\n  "; swap; get; add "\n    "; --;--;--;--;

    # append the LHS code
    --;--;--;--; get; add "\n  }\n}\n"; 

    # append the push list
    ++;++;++;++;++; 
    swap; replace "pop;" "push;"; swap; get; 
    --;--;--;--;--;

    put;
    clear; add "rule*"; push; .reparse
  }

  # ---------------------
  # 11 token parsing
  pop;

  # 2 alternation groups with preceding sequences
  # this is easy because altbuild already has everything compiled
  # example: a = b c (d|e) f g h (i j|k l);
  "LHS*=*rsequence*(*altbuild*)*rsequence*(*altbuild*)*;*" {
    # get complete pop list and save in second (
    clear; ++; get; ++;++; get; ++;++; get; ++;++; get; put; add "\n";
   
    # append first altbuild test
    --;--;--; get; add " {\n  "; 

    # append 2nd altbuild test and convert to ends test
    ++;++;++;++; 
    swap; clop;clop; replace '",B"' '",E"*'; 
    swap; add 'E"*'; get; add " {\n    ";

    # append LHS code
    --;--;--;--;--;--;--;--; get; add "\n  }\n}\n";

    # append saved push list
    ++;++;++;++;++;++;++; 
    swap; replace "pop;" "push;"; swap; get;
    --;--;--;--;--;--;--;
    put;
    clear; add "rule*"; push; .reparse
  }

  # ---------------------
  # 12 token parsing
  pop;

  # 2 alternation groups on the RHS of a parse rule, with a rule block
  # example: a = (x|y) p q s (n m|o p) { print "wow"; }
  "LHS*=*(*altgroup*)*rsequence*(*altbuild*)*{*ruleblock*}*" {
    # get the complete pop list and save in (
    clear; ++;++; get; ++;++; get; ++;++; get; put; add "\n";

    # append the altgroup test
    --;--;--; add "B"; swap; replace '","' '",B"'; swap; get; add " {\n  ";

    # append altbuild test and convert to ends test
    # but convert to E"*a*b*c*" with preceding * to ensure that false
    # matches can't happen if altgroup=altbuild etc. ?
    ++;++;++;++; 
    swap; clop;clop; replace '",B"' '",E"*'; 
    swap; add 'E"*'; get; add " {";

    # append rule block code
    ++;++;++; 
    swap; replace "\n" "\n  "; swap; get; add "\n    ";
    --;--;--;

    # get LHS code
    --;--;--;--;--;--;--;
    get; add "\n  }\n}\n"; 

    # append the push list
    ++;++;++;++;++;++; 
    swap; replace "pop;" "push;"; swap; get;
    --;--;--;--;--;--;
    put;
    clear; add "rule*"; push; .reparse
  }

  # ---------------------
  # 13 token parsing
  pop;

  # 2 alternation groups with preceding sequences and rule block
  # this is easy because altbuild already has everything compiled
  # example: a = b c (d|e) f g h (i j|k l) { print "wow"; }
  "LHS*=*rsequence*(*altbuild*)*rsequence*(*altbuild*)*{*ruleblock*}*" {
    # get complete pop list and save in second (
    clear; ++; get; ++;++; get; ++;++; get; ++;++; get; put; add "\n";
   
    # append first altbuild test
    --;--;--; get; add " {\n  "; 

    # append 2nd altbuild test and convert to ends test
    ++;++;++;++; 
    swap; clop;clop; replace '",B"' '",E"*'; 
    swap; add 'E"*'; get; add " {";

    # append ruleblock code and indent 2 spaces more
    ++;++;++; 
    swap; replace "\n" "\n  "; 
    swap; get; add "\n    ";
    --;--;--;
    
    # append LHS code
    --;--;--;--;--;--;--;--; get; add "\n  }\n}\n";

    # append saved push list
    ++;++;++;++;++;++;++; 
    swap; replace "pop;" "push;"; swap; get;
    --;--;--;--;--;--;--;
    put;
    clear; add "rule*"; push; .reparse
  }
  # ---------------------
  # 14 token parsing
  pop;



  # some errors at eof. no see above
  (eof) {
    nop;
  }

  (eof) {

    "start*","action*","grammar*" {
      clear; get; add "\n\n"; print; quit;
    }

    # if no parse rules, make an empty one and a grammar 
    "lexruleset*","lexrule*" { 
      push; 
      add "\n# empty rule added to grammar\n"; put;
      clear; add "rule*"; push; .reparse 
    }

    # save the parse stack
    put; clear;
    add "[syntagma script did not parse well]\n";
    add "[parse stack: "; get; add " ]\n";
    #add "[parse stack: "; swap; replace "*" " "; swap; get; add "]\n";
    replace "\n     " "\n"; 
    print; quit;
  }
  
  push;push;push;push;push;push;
  push;push;push;push;push;push;
  push;push; 
