tools/docco/t_parse.py
author robin <robin@reportlab.com>
Tue, 07 Mar 2017 10:00:34 +0000
changeset 4330 617ffa6bbdc8
parent 3789 5bc95e1f3dd4
child 4370 823a8c33ce43
permissions -rw-r--r--
changes for release 3.4.0
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
4330
617ffa6bbdc8 changes for release 3.4.0
robin <robin@reportlab.com>
parents: 3789
diff changeset
     1
#Copyright ReportLab Europe Ltd. 2000-2017
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
     2
#see license.txt for license details
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
     3
#history http://www.reportlab.co.uk/cgi-bin/viewcvs.cgi/public/reportlab/trunk/reportlab/tools/docco/t_parse.py
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
     4
"""
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
     5
Template parsing module inspired by REXX (with thanks to Donn Cave for discussion).
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
     6
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
     7
Template initialization has the form:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
     8
   T = Template(template_string, wild_card_marker, single_char_marker,
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
     9
             x = regex_x, y = regex_y, ...)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    10
Parsing has the form
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    11
   ([match1, match2, ..., matchn], lastindex) = T.PARSE(string)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    12
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    13
Only the first argument is mandatory.
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    14
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    15
The resultant object efficiently parses strings that match the template_string,
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    16
giving a list of substrings that correspond to each "directive" of the template.
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    17
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    18
Template directives:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    19
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    20
  Wildcard:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    21
    The template may be initialized with a wildcard that matches any string
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    22
    up to the string matching the next directive (which may not be a wild
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    23
    card or single character marker) or the next literal sequence of characters
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    24
    of the template.  The character that represents a wildcard is specified
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    25
    by the wild_card_marker parameter, which has no default.
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    26
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    27
    For example, using X as the wildcard:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    28
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    29
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    30
    >>> T = Template("prefixXinteriorX", "X")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    31
    >>> T.PARSE("prefix this is before interior and this is after")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    32
    ([' this is before ', ' and this is after'], 47)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    33
    >>> T = Template("<X>X<X>", "X")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    34
    >>> T.PARSE('<A HREF="index.html">go to index</A>')
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    35
    (['A HREF="index.html"', 'go to index', '/A'], 36)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    36
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    37
    Obviously the character used to represent the wildcard must be distinct
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    38
    from the characters used to represent literals or other directives.
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    39
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    40
  Fixed length character sequences:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    41
    The template may have a marker character which indicates a fixed
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    42
    length field.  All adjacent instances of this marker will be matched
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    43
    by a substring of the same length in the parsed string.  For example:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    44
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    45
      >>> T = Template("NNN-NN-NNNN", single_char_marker="N")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    46
      >>> T.PARSE("1-2-34-5-12")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    47
      (['1-2', '34', '5-12'], 11)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    48
      >>> T.PARSE("111-22-3333")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    49
      (['111', '22', '3333'], 11)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    50
      >>> T.PARSE("1111-22-3333")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    51
      ValueError: literal not found at (3, '-')
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    52
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    53
    A template may have multiple fixed length markers, which allows fixed
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    54
    length fields to be adjacent, but recognized separately.  For example:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    55
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    56
      >>> T = Template("MMDDYYX", "X", "MDY")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    57
      >>> T.PARSE("112489 Somebody's birthday!")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    58
      (['11', '24', '89', " Somebody's birthday!"], 27)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    59
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    60
  Regular expression markers:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    61
    The template may have markers associated with regular expressions.
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    62
    the regular expressions may be either string represenations of compiled.
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    63
    For example:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    64
      >>> T = Template("v: s i", v=id, s=str, i=int)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    65
      >>> T.PARSE("this_is_an_identifier: 'a string' 12344")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    66
      (['this_is_an_identifier', "'a string'", '12344'], 39)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    67
      >>>
3023
47087db79506 Extra blank-line to fix doc string for cheesecake index.
damian
parents: 2963
diff changeset
    68
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    69
    Here id, str, and int are regular expression conveniences provided by
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    70
    this module.
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    71
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    72
  Directive markers may be mixed and matched, except that wildcards cannot precede
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    73
  wildcards or single character markers.
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    74
  Example:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    75
>>> T = Template("ssnum: NNN-NN-NNNN, fn=X, ln=X, age=I, quote=Q", "X", "N", I=int, Q=str)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    76
>>> T.PARSE("ssnum: 123-45-6789, fn=Aaron, ln=Watters, age=13, quote='do be do be do'")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    77
(['123', '45', '6789', 'Aaron', 'Watters', '13', "'do be do be do'"], 72)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    78
>>>
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    79
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    80
"""
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    81
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    82
import re, string
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
    83
from reportlab.lib.utils import ascii_letters
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    84
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    85
#
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    86
# template parsing
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    87
#
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    88
# EG: T = Template("(NNN)NNN-NNNN X X", "X", "N")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    89
#     ([area, exch, ext, fn, ln], index) = T.PARSE("(908)949-2726 Aaron Watters")
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    90
#
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    91
class Template:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    92
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    93
   def __init__(self,
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    94
                template,
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    95
                wild_card_marker=None,
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    96
                single_char_marker=None,
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    97
                **marker_to_regex_dict):
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    98
       self.template = template
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
    99
       self.wild_card = wild_card_marker
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   100
       self.char = single_char_marker
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   101
       # determine the set of markers for this template
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   102
       markers = list(marker_to_regex_dict.keys())
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   103
       if wild_card_marker:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   104
          markers.append(wild_card_marker)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   105
       if single_char_marker:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   106
          for ch in single_char_marker: # allow multiple scm's
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   107
              markers.append(ch)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   108
          self.char = single_char_primary = single_char_marker[0]
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   109
       self.markers = markers
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   110
       for mark in markers:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   111
           if len(mark)>1:
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   112
              raise ValueError("Marks must be single characters: "+repr(mark))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   113
       # compile the regular expressions if needed
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   114
       self.marker_dict = marker_dict = {}
3723
99aa837b6703 second stage of port to Python 3.3; working hello world
rptlab
parents: 3721
diff changeset
   115
       for mark, rgex in marker_to_regex_dict.items():
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   116
           if isinstance(rgex,str):
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   117
              rgex = re.compile(rgex)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   118
           marker_dict[mark] = rgex
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   119
       # determine the parse sequence
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   120
       parse_seq = []
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   121
       # dummy last char
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   122
       lastchar = None
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   123
       index = 0
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   124
       last = len(template)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   125
       # count the number of directives encountered
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   126
       ndirectives = 0
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   127
       while index<last:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   128
          start = index
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   129
          thischar = template[index]
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   130
          # is it a wildcard?
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   131
          if thischar == wild_card_marker:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   132
             if lastchar == wild_card_marker:
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   133
                raise ValueError("two wild cards in sequence is not allowed")
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   134
             parse_seq.append( (wild_card_marker, None) )
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   135
             index = index+1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   136
             ndirectives = ndirectives+1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   137
          # is it a sequence of single character markers?
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   138
          elif single_char_marker and thischar in single_char_marker:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   139
             if lastchar == wild_card_marker:
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   140
                raise ValueError("wild card cannot precede single char marker")
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   141
             while index<last and template[index] == thischar:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   142
                index = index+1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   143
             parse_seq.append( (single_char_primary, index-start) )
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   144
             ndirectives = ndirectives+1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   145
          # is it a literal sequence?
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   146
          elif not thischar in markers:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   147
             while index<last and not template[index] in markers:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   148
                index = index+1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   149
             parse_seq.append( (None, template[start:index]) )
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   150
          # otherwise it must be a re marker
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   151
          else:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   152
             rgex = marker_dict[thischar]
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   153
             parse_seq.append( (thischar, rgex) )
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   154
             ndirectives = ndirectives+1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   155
             index = index+1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   156
          lastchar = template[index-1]
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   157
       self.parse_seq = parse_seq
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   158
       self.ndirectives = ndirectives
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   159
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   160
   def PARSE(self, s, start=0):
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   161
       ndirectives = self.ndirectives
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   162
       wild_card = self.wild_card
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   163
       single_char = self.char
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   164
       parse_seq = self.parse_seq
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   165
       lparse_seq = len(parse_seq) - 1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   166
       # make a list long enough for substitutions for directives
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   167
       result = [None] * ndirectives
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   168
       current_directive_index = 0
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   169
       currentindex = start
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   170
       # scan through the parse sequence, recognizing
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   171
       for parse_index in range(lparse_seq + 1):
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   172
           (indicator, data) = parse_seq[parse_index]
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   173
           # is it a literal indicator?
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   174
           if indicator is None:
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   175
              if s.find(data, currentindex) != currentindex:
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   176
                 raise ValueError("literal not found at "+repr((currentindex,data)))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   177
              currentindex = currentindex + len(data)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   178
           else:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   179
              # anything else is a directive
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   180
              # is it a wildcard?
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   181
              if indicator == wild_card:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   182
                 # if it is the last directive then it matches the rest of the string
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   183
                 if parse_index == lparse_seq:
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   184
                    last = len(s)
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   185
                 # otherwise must look at next directive to find end of wildcard
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   186
                 else:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   187
                    # next directive must be re or literal
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   188
                    (nextindicator, nextdata) = parse_seq[parse_index+1]
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   189
                    if nextindicator is None:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   190
                       # search for literal
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   191
                       last = s.find(nextdata, currentindex)
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   192
                       if last<currentindex:
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   193
                          raise ValueError("couldn't terminate wild with lit "+repr(currentindex))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   194
                    else:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   195
                       # data is a re, search for it
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   196
                       last = nextdata.search(s, currentindex)
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   197
                       if last<currentindex:
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   198
                          raise ValueError("couldn't terminate wild with re "+repr(currentindex))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   199
              elif indicator == single_char:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   200
                 # data is length to eat
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   201
                 last = currentindex + data
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   202
              else:
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   203
                 # other directives are always regular expressions
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   204
                 last = data.match(s, currentindex) + currentindex
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   205
                 if last<currentindex:
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   206
                    raise ValueError("couldn't match re at "+repr(currentindex))
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   207
              #print("accepting", s[currentindex:last])
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   208
              result[current_directive_index] = s[currentindex:last]
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   209
              current_directive_index = current_directive_index+1
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   210
              currentindex = last
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   211
       # sanity check
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   212
       if current_directive_index != ndirectives:
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   213
          raise SystemError("not enough directives found?")
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   214
       return (result, currentindex)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   215
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   216
# some useful regular expressions
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   217
USERNAMEREGEX = \
3789
5bc95e1f3dd4 changes to userguide code
robin
parents: 3723
diff changeset
   218
  "["+ascii_letters+"]["+ascii_letters+string.digits+"_]*"
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   219
STRINGLITREGEX = "'[^\n']*'"
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   220
SIMPLEINTREGEX = "["+string.digits+"]+"
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   221
id = re.compile(USERNAMEREGEX)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   222
str = re.compile(STRINGLITREGEX)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   223
int = re.compile(SIMPLEINTREGEX)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   224
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   225
def test():
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   226
    global T, T1, T2, T3
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   227
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   228
    T = Template("(NNN)NNN-NNNN X X", "X", "N")
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   229
    print(T.PARSE("(908)949-2726 Aaron Watters"))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   230
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   231
    T1 = Template("s --> s blah", s=str)
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   232
    s = "' <-- a string --> ' --> 'blah blah another string blah' blah"
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   233
    print(T1.PARSE(s))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   234
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   235
    T2 = Template("s --> NNNiX", "X", "N", s=str, i=int)
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   236
    print(T2.PARSE("'A STRING' --> 15964653alpha beta gamma"))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   237
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   238
    T3 = Template("XsXi", "X", "N", s=str, i=int)
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   239
    print(T3.PARSE("prefix'string'interior1234junk not parsed"))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   240
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   241
    T4 = Template("MMDDYYX", "X", "MDY")
3721
0c93dd8ff567 initial changes from 2to3-3.3
rptlab
parents: 3617
diff changeset
   242
    print(T4.PARSE("122961 Somebody's birthday!"))
2963
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   243
c414c0ab69e7 reportlab-2.2: first stage of major re-org
rgbecker
parents:
diff changeset
   244
3023
47087db79506 Extra blank-line to fix doc string for cheesecake index.
damian
parents: 2963
diff changeset
   245
if __name__=="__main__": test()