deferred class BACKTRACKING_REGULAR_EXPRESSION_BUILDER

The frame class of all the regular expression builders.

Direct parents

non-conformant parents

REGULAR_EXPRESSION_ITEM_GLOBALS, REGULAR_EXPRESSION_STRING_SCANNER

Known children

conformant children

POSIX_REGULAR_EXPRESSION_BUILDER

Summary

exported features

make

behaviours

internal behaviour

parsing

results

build

make

basic

error management

scanning

assertions

character classes

character class naming

and/or basics

Details

make

Initialise the attributes.

is_case_insensitive: BOOLEAN

Is the match case insensitive? Default is False

is_case_sensitive: BOOLEAN

Is the match case sensitive? Default is True

set_case_sensitive

Set the match as case sensitive.

ensure

  • definition: is_case_insensitive = False and is_case_sensitive = True

set_case_insensitive

Set the match as case insensitive.

ensure

  • definition: is_case_insensitive = True and is_case_sensitive = False

does_any_match_newline: BOOLEAN

Does the "any character" mark match a newline? Default is False

set_any_match_newline

The "any character" mark will match a newline.

ensure

  • definition: does_any_match_newline = True

set_any_dont_match_newline

The "any character" mark will not match a newline.

ensure

  • definition: does_any_match_newline = False

does_match_line_boundary: BOOLEAN

Do the begin/end marks match line boundaries? Default is False

does_match_text_boundary: BOOLEAN

Do the begin/end marks match text boundaries? Default is True

ensure

  • definition: Result = not does_match_line_boundary

set_match_line_boundary

The begin/end marks will match line boundaries.

ensure

  • definition: does_match_line_boundary = True and does_match_text_boundary = False

set_match_text_boundary

The begin/end marks will match text boundaries.

ensure

  • definition: does_match_line_boundary = False and does_match_text_boundary = True

set_default_options

Set the default options

ensure

  • is_case_sensitive
  • not does_any_match_newline
  • does_match_text_boundary

is_greedy: BOOLEAN

Do repeats match a maximal number of times (greedy)? Default is False

set_greedy

Will match a maximal repeat.

ensure

  • definition: is_greedy = True

set_not_greedy

Will match a minimal repeat.

ensure

  • definition: is_greedy = False

is_looking_ahead: BOOLEAN

Is building a look-ahead term?

is_looking_behind: BOOLEAN

Is building a look-behind term?

is_looking_around: BOOLEAN

Is building look-ahead or look-behind?

is_looking_positive: BOOLEAN

Is the current look-around positive (True) rather than negative (False)?

parse_expression (expr: STRING)

Set the expression to parse and parse it. If there is no error, the result is put in feature 'last_regular_expression'. If there is an error, a human-readable explanation is available through the feature 'last_error'.

require

  • expression_not_void: expr /= Void

ensure

  • error_or_result: has_error xor has_result

parse

Parse the current expression. The result, if any, is available through 'last_regular_expression'.

require

  • expression_not_void: expression /= Void
  • internal_state_ok: stack.is_empty and group_stack.is_empty

ensure

  • internal_state_ok: stack.is_empty and group_stack.is_empty
  • error_or_result: has_error xor has_result
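The 'error_or_result' postcondition means a caller tests 'has_error' once and then reads either the error text or the result. The following is a minimal sketch of that protocol in Python, not the Eiffel class itself: ToyBuilder is a hypothetical stand-in whose "parsing" only checks that parentheses balance, and it stores the raw string where a real builder would store a compiled pattern. The error text follows the "Error at position ...: ...." layout described for 'set_error'.

```python
class ToyBuilder:
    """Hypothetical stand-in for an effective builder; only the protocol is modelled."""

    def __init__(self):
        self.last_error = None
        self.last_pattern = None

    @property
    def has_error(self):
        return self.last_error is not None

    @property
    def has_result(self):
        return self.last_pattern is not None

    def parse_expression(self, expr):
        # Mirrors the precondition expression_not_void.
        assert expr is not None, "expression_not_void"
        self.last_error = None
        self.last_pattern = None
        depth = 0
        for position, char in enumerate(expr, start=1):
            if char == "(":
                depth += 1
            elif char == ")":
                depth -= 1
                if depth < 0:
                    self.last_error = f"Error at position {position}: unmatched ')'."
                    return
        if depth > 0:
            self.last_error = f"Error at position {len(expr) + 1}: missing ')'."
        else:
            # Assumption: a real builder would store a compiled pattern here.
            self.last_pattern = expr


builder = ToyBuilder()
for expression in ["a(b|c)", "a)"]:
    builder.parse_expression(expression)
    # Exactly one of the two flags holds after every parse.
    assert builder.has_error != builder.has_result
    print(builder.last_error if builder.has_error else builder.last_pattern)
```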

has_result: BOOLEAN

Did the last 'parse' or 'parse_expression' produce a result in 'last_regular_expression'?

ensure

  • definition: Result = last_pattern.is_valid

last_pattern: BACKTRACKING_REGULAR_EXPRESSION_PATTERN

The last regular expression pattern built by 'parse' or 'parse_expression'

deferred internal_parse

The parse function to be implemented by the effective builders.

require

  • at_first_position: position = expression.lower
  • stack_is_empty: stack.is_empty
  • no_groups: last_group_count = 0 and group_stack.is_empty

ensure

  • error_or_result: has_error or else stack.count = 1 and group_stack.is_empty

stack: FAST_ARRAY [E_][BACKTRACKING_NODE]

The stack of items.

group_stack: FAST_ARRAY [E_][INTEGER]

The stack of groups

last_group_count: INTEGER

The count of groups currently found.

Repeat_infiny: INTEGER

Constant that means "infinite repetition".

emit (item: BACKTRACKING_NODE)

Pushes 'item' on the stack. [..] -> [.., item]

require

  • item_not_void: item /= Void

ensure

  • stack_count_increased_by_one: stack.count = old stack.count + 1
  • stack_not_empty: stack.count > 0

unemit: BACKTRACKING_NODE

Pops the Result from the stack. [.., Result] -> [..]

require

  • stack_not_empty: stack.count > 0

ensure

  • stack_count_decreased_by_one: stack.count = old stack.count - 1
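The stack diagrams for 'emit' and 'unemit' can be illustrated with a small Python model. This is only a sketch: BuilderStack is a hypothetical name, plain strings stand in for BACKTRACKING_NODE objects, and the Eiffel contracts are approximated with assert statements.

```python
class BuilderStack:
    """Toy model of the builder's item stack (not the Eiffel class itself)."""

    def __init__(self):
        self.stack = []

    def emit(self, item):
        # Mirrors the precondition item_not_void: item /= Void
        assert item is not None, "item_not_void"
        self.stack.append(item)          # [..] -> [.., item]

    def unemit(self):
        # Mirrors the precondition stack_not_empty: stack.count > 0
        assert len(self.stack) > 0, "stack_not_empty"
        return self.stack.pop()          # [.., Result] -> [..]


b = BuilderStack()
b.emit("a")
b.emit("b")
top = b.unemit()
print(top, b.stack)  # b ['a']
```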

emit_any_character

Push the item that matches any character.

emit_begin_of_line

Push the item that matches the beginning of a line.

emit_end_of_line

Push the item that matches the end of a line.

prepare_group

Declares that a new group begins.

ensure

  • last_group_count_increased_by_one: last_group_count = old last_group_count + 1
  • group_greater_than_zero: last_group_count > 0
  • group_stack_count_increased_by_one: group_stack.count = old group_stack.count + 1
  • group_pushed: group_stack.last = last_group_count

emit_group

Pushes the "end of group" item and updates the group indicators. [.., X] -> [.., end_group(i)]

require

  • group_greater_than_zero: last_group_count > 0
  • group_stack_not_empty: not group_stack.is_empty
  • enougth_data: stack.count > 0

ensure

  • constant_stack_count: stack.count = old stack.count
  • stack_not_empty: stack.count > 0
  • last_group_count_unchanged: last_group_count = old last_group_count
  • group_stack_count_decreased_by_one: group_stack.count = old group_stack.count - 1
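The bookkeeping described by the 'prepare_group' and 'emit_group' contracts can be sketched in Python. GroupTracker, is_closed and the ("group", index, body) tuple are assumptions made for illustration; how the real class wraps the top of the stack is not stated here. The is_closed helper mirrors the 'valid_group' and 'closed_group' preconditions of 'emit_match_previous_group'.

```python
class GroupTracker:
    """Toy model of the group counters (not the Eiffel class itself)."""

    def __init__(self):
        self.stack = []
        self.group_stack = []
        self.last_group_count = 0

    def prepare_group(self):
        # A new group begins: number it and remember that it is still open.
        self.last_group_count += 1
        self.group_stack.append(self.last_group_count)

    def emit_group(self):
        assert self.last_group_count > 0 and self.group_stack and self.stack
        index = self.group_stack.pop()
        # The top of the stack is replaced, so stack.count is unchanged.
        self.stack.append(("group", index, self.stack.pop()))

    def is_closed(self, group):
        # valid_group and closed_group, as required before a back-reference.
        return 0 < group <= self.last_group_count and group not in self.group_stack


t = GroupTracker()
t.prepare_group()      # opens group 1
t.prepare_group()      # opens group 2
t.stack.append("b")
t.emit_group()         # closes group 2
print(t.is_closed(2), t.is_closed(1))  # True False
t.emit_group()         # closes group 1
print(t.stack)         # [('group', 1, ('group', 2, 'b'))]
```

Under this model, groups are numbered in the order they open and closed in reverse order, and a back-reference to a group is only legal once that group has left 'group_stack'.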

emit_begin_group

This feature is obsolete: use declare_group/emit_group

emit_end_group

This feature is obsolete: use declare_group/emit_group

emit_match_previous_group (group: INTEGER)

Push the item that matches the text previously matched by the group 'group'. [..] -> [.., previous_group(group)]

require

  • valid_group: 0 < group and group <= last_group_count
  • closed_group: not group_stack.has(group)

ensure

  • stack_count_increased_by_one: stack.count = old stack.count + 1
  • stack_not_empty: stack.count > 0

emit_match_single (char: CHARACTER)

Push the item that matches the character 'char' [..] -> [.., char]

ensure

  • stack_count_increased_by_one: stack.count = old stack.count + 1
  • stack_not_empty: stack.count > 0

emit_match_range (lower: CHARACTER, upper: CHARACTER)

Push the item that matches the character range 'lower'..'upper'. [..] -> [.., lower..upper]

require

  • valid_range: lower <= upper

ensure

  • stack_count_increased_by_one: stack.count = old stack.count + 1
  • stack_not_empty: stack.count > 0

emit_match_text (text: STRING)

Push the item that matches the 'text' [..] -> [.., text]

ensure

  • stack_count_increased_by_one: stack.count = old stack.count + 1
  • stack_not_empty: stack.count > 0

begin_collect

Begins collecting a group of items by pushing Void on the stack. After calling 'begin_collect', one of the features 'end_collect_or' or 'end_collect_and' has to be called. This kind of group is intended to manage collections of alternatives or sequences in an optimal way. [..] -> [.., Void]

ensure

  • has_collect: stack.fast_occurrences(Void) > 0
  • emit_group_empty: stack.last = Void
  • emit_group_count_incremented: stack.fast_occurrences(Void) = old stack.fast_occurrences(Void) + 1

is_collect_empty: BOOLEAN

True if the currently begun collect is empty.

require

  • is_collecting: stack.fast_occurrences(Void) > 0

ensure

  • definition: Result = (stack.last = Void)

end_collect_true

Replace an empty collection by TRUE [.., Void] -> [.., TRUE]

require

  • is_collecting: stack.fast_occurrences(Void) > 0
  • collect_empty: is_collect_empty

end_collect_or

Collects the items on the stack down to the collect mark (a Void) and replaces them with a single item that is the 'or' of all of them. The collection must not be empty. The order of evaluation is preserved. The binary 'or' tree is right-recursive for efficiency. [.., Void, X] -> [.., X]; [.., Void, Y, X] -> [.., Y or X]; [.., Void, Z, Y, X] -> [.., Z or (Y or X)]; ...

require

  • is_collecting: stack.fast_occurrences(Void) > 0
  • collect_not_empty: not is_collect_empty

ensure

  • stack_not_empty: stack.count > 0
  • emit_group_count_decremented: stack.fast_occurrences(Void) = old stack.fast_occurrences(Void) - 1

end_collect_and

Collects the items on the stack down to the collect mark (a Void) and replaces them with a single item that is the 'and' of all of them. The collection must not be empty. The order of evaluation is preserved. The binary 'and' tree is right-recursive for efficiency. [.., Void, X] -> [.., X]; [.., Void, Y, X] -> [.., Y and X]; [.., Void, Z, Y, X] -> [.., Z and (Y and X)]; ...

require

  • is_collecting: stack.fast_occurrences(Void) > 0
  • collect_not_empty: not is_collect_empty

ensure

  • stack_not_empty: stack.count > 0
  • emit_group_count_decremented: stack.fast_occurrences(Void) = old stack.fast_occurrences(Void) - 1
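The right-recursive folding performed by 'end_collect_or' and 'end_collect_and' can be sketched in Python. This is an illustrative model only: None plays the role of the Void collect mark, nodes are plain strings and tuples, and the single end_collect function with an 'op' argument is an assumption that stands in for the two Eiffel features.

```python
MARK = None  # stands in for the Void collect mark


def begin_collect(stack):
    stack.append(MARK)                    # [..] -> [.., Void]


def end_collect(stack, op):
    """Fold the items above the innermost mark into one right-nested 'op' tree."""
    assert MARK in stack, "is_collecting"
    items = []
    while stack[-1] is not MARK:
        items.append(stack.pop())         # popped in reverse order: X, Y, Z
    stack.pop()                           # drop the mark itself
    assert items, "collect_not_empty"
    result = items[0]                     # the rightmost item, X
    for item in items[1:]:
        result = (op, item, result)       # Y op X, then Z op (Y op X)
    stack.append(result)


stack = ["prefix"]
begin_collect(stack)
stack.extend(["Z", "Y", "X"])
end_collect(stack, "or")
print(stack)  # ['prefix', ('or', 'Z', ('or', 'Y', 'X'))]
```

Because the loop stops at the innermost mark, nested collects close in the reverse of the order in which they were begun.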

emit_not

Replaces the top of the stack by its negation. [.., X] -> [.., not(X)] (where not(X) is like (X and (CUT and FALSE)) or TRUE)

require

  • enougth_data: stack.count > 0

ensure

  • constant_stack_count: stack.count = old stack.count
  • stack_not_empty: stack.count > 0

emit_not_then_any

Replaces the top of the stack by its negation followed by any. [.., X] -> [.., not(X)] (where not(X) is like (X and (CUT and FALSE)) or ANY)

require

  • enougth_data: stack.count > 0

ensure

  • constant_stack_count: stack.count = old stack.count
  • stack_not_empty: stack.count > 0

emit_true_or

Replaces the top of the stack X with 'true or X'. [.., X] -> [.., true or X]

require

  • enougth_data: stack.count > 0

ensure

  • constant_stack_count: stack.count = old stack.count
  • stack_not_empty: stack.count > 0

emit_or_true

Replaces the top of the stack X with 'X or true'. [.., X] -> [.., X or true]

require

  • enougth_data: stack.count > 0

ensure

  • constant_stack_count: stack.count = old stack.count
  • stack_not_empty: stack.count > 0

emit_controled_or_true

Replaces the top of the stack according to 'is_greedy': if is_greedy then [.., X] -> [.., X or true]

             else [.., X] -> [.., true or X]

controled_or_true_item (x: BACKTRACKING_NODE): BACKTRACKING_NODE

Returns an item for " 'x' or true ". The returned item depends on the flag 'is_greedy': if is_greedy then Result = (X or true)

             else Result = (true or X)
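Why the order of the two alternatives matters can be shown with a tiny backtracking matcher in Python. This is a sketch under assumptions: the tuple node shapes, the 'ends' generator and the function name controlled_or_true_item are all hypothetical, and the only property relied on is that an 'or' node tries its left branch before its right one.

```python
TRUE = ("true",)


def ends(node, text, pos):
    """Yield every end position at which 'node' can match, in preference order."""
    kind = node[0]
    if kind == "true":
        yield pos                                   # matches the empty string
    elif kind == "char":
        if pos < len(text) and text[pos] == node[1]:
            yield pos + 1
    elif kind == "or":
        yield from ends(node[1], text, pos)         # left branch is tried first
        yield from ends(node[2], text, pos)


def controlled_or_true_item(x, is_greedy):
    # Greedy: try X before the empty match. Not greedy: try the empty match first.
    return ("or", x, TRUE) if is_greedy else ("or", TRUE, x)


a = ("char", "a")
greedy = list(ends(controlled_or_true_item(a, True), "a", 0))
lazy = list(ends(controlled_or_true_item(a, False), "a", 0))
print(greedy, lazy)  # [1, 0] [0, 1]
```

Both variants accept the same inputs; they differ only in which match a backtracking engine reports first.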

emit_repeat (mini: INTEGER, maxi: INTEGER)

Takes the top of the stack and replaces it with a construction that evaluates its repetition from 'mini' to 'maxi' times. The boolean feature 'is_greedy' controls whether the matched repeat is of minimal or maximal length. That feature is then reset to its default value (False).

require

  • enougth_data: stack.count > 0
  • mini_is_valid: mini >= 0 and then mini /= Repeat_infiny
  • maxi_is_valid: maxi = Repeat_infiny or else maxi >= mini
  • not_droping: mini = 0 implies maxi /= 0

ensure

  • constant_stack_count: stack.count = old stack.count
  • stack_not_empty: stack.count > 0
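One way to build a bounded repetition out of 'and' and "or true" items is sketched below in Python. This is not necessarily the construction used by the class: the tuple node shapes and the function names are hypothetical, and only a finite 'maxi' is handled, so the 'Repeat_infiny' case is deliberately left out.

```python
TRUE = ("true",)


def or_true(x, is_greedy):
    # Same ordering rule as described for controled_or_true_item.
    return ("or", x, TRUE) if is_greedy else ("or", TRUE, x)


def repeat(x, mini, maxi, is_greedy):
    """Return a tree matching 'x' between mini and maxi times (finite maxi only)."""
    assert mini >= 0 and maxi >= mini, "mini_is_valid / maxi_is_valid"
    assert not (mini == 0 and maxi == 0), "not_droping"
    # Optional tail, built inside-out: (x (x ...)?)?
    tail = None
    for _ in range(maxi - mini):
        inner = x if tail is None else ("and", x, tail)
        tail = or_true(inner, is_greedy)
    # Mandatory prefix: 'mini' copies of x in front of the optional tail.
    result = tail
    for _ in range(mini):
        result = x if result is None else ("and", x, result)
    return result


a = ("char", "a")
print(repeat(a, 1, 2, True))
# ('and', ('char', 'a'), ('or', ('char', 'a'), ('true',)))
```

In this sketch the greedy flag only changes the order of the branches inside each optional level, which is what makes the repeat prefer the longest or the shortest match.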

emit_looking

require

  • enougth_data: stack.count > 0
  • is_looking: is_looking_around

ensure

  • constant_stack_count: stack.count = old stack.count
  • stack_not_empty: stack.count > 0

make

Initialise the attributes.

scanned_string: STRING

The expression currently being built.

set_scanned_string (string: STRING)

Set the 'scanned_string' with 'string'.

ensure

  • has_no_error: not has_error
  • definition: scanned_string = string
  • at_the_begin: position = scanned_string.lower

has_error: BOOLEAN

True when an error was encountered

clear_error

Remove the error flag

ensure

  • has_no_error: not has_error

last_error: STRING

Returns a string recorded for the error.

require

  • has_error: has_error

ensure

  • not_void: Result /= Void

set_error (message: STRING)

Set has_error and last_error. The explanatory error string 'last_error' is created as follows: "Error at position 'position': 'message'.".

require

  • message_not_void: message /= Void
  • has_no_error: not has_error

ensure

  • has_error: has_error

position: INTEGER

The scanned position. It is the position of 'last_character'.

last_character: CHARACTER

The scanned character. The last character read from 'scanned_string'.

valid_last_character: BOOLEAN

True when 'last_character' is valid. Is like 'scanned_string.valid_index(position)'

valid_previous_character: BOOLEAN

True if position-1 is a valid position.

require

  • scanned_string /= Void

ensure

  • definition: Result = scanned_string.valid_index(position - 1)

previous_character: CHARACTER

The character at position-1.

require

  • valid_previous_character

ensure

  • definition: Result = scanned_string.item(position - 1)

valid_next_character: BOOLEAN

True if position+1 is a valid position.

require

  • scanned_string /= Void

ensure

  • definition: Result = scanned_string.valid_index(position + 1)

next_character: CHARACTER

The character at position+1.

require

  • valid_next_character

ensure

  • definition: Result = scanned_string.item(position + 1)

end_of_input: BOOLEAN

True when all the characters of 'scanned_string' are scanned.

ensure

  • implies_last_character_not_valid: Result implies not valid_last_character

goto_position (pos: INTEGER)

Change the currently scanned position to 'pos'. Updates 'last_character' and 'valid_last_character' to reflect the new position value.

require

  • has_no_error: not has_error
  • scanned_string /= Void

ensure

  • has_no_error: not has_error
  • position_set: position = pos
  • validity_updated: valid_last_character = scanned_string.valid_index(position)
  • character_updated: valid_last_character implies last_character = scanned_string.item(position)

read_character

Reads the next character.

require

  • has_no_error: not has_error
  • not_at_end: not end_of_input

ensure

  • next_position: position > old position
  • has_no_error: not has_error

read_integer

Reads an integer value beginning at the currently scanned position. The value read is stored in 'last_integer'.

require

  • has_no_error: not has_error
  • not_at_end: not end_of_input
  • begin_with_a_digit: last_character.is_decimal_digit

ensure

  • has_no_error: not has_error
  • digits_eaten: end_of_input or else not last_character.is_decimal_digit
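The scanning contract of 'read_integer' can be modelled in Python as follows. This is a sketch, not the Eiffel scanner: the Scanner class is hypothetical, positions are 0-based here whereas the Eiffel string starts at 'scanned_string.lower', and only ASCII decimal digits are accepted.

```python
DIGITS = "0123456789"


class Scanner:
    """Toy model of the string scanner (not the Eiffel class itself)."""

    def __init__(self, text):
        self.text = text
        self.position = 0
        self.last_integer = 0

    @property
    def end_of_input(self):
        return self.position >= len(self.text)

    @property
    def last_character(self):
        return self.text[self.position]

    def read_character(self):
        assert not self.end_of_input, "not_at_end"
        self.position += 1

    def read_integer(self):
        assert not self.end_of_input, "not_at_end"
        assert self.last_character in DIGITS, "begin_with_a_digit"
        value = 0
        # Consume digits until the input ends or a non-digit is reached.
        while not self.end_of_input and self.last_character in DIGITS:
            value = value * 10 + int(self.last_character)
            self.read_character()
        self.last_integer = value


s = Scanner("12,34")      # e.g. the bounds inside a repeat such as {12,34}
s.read_integer()
first = s.last_integer    # 12, and the scanner now sits on ","
s.read_character()        # skip the comma
s.read_integer()
print(first, s.last_integer, s.end_of_input)  # 12 34 True
```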

saved_position: INTEGER

The saved position (only one is currently enough).

save_position

Saves the current scanning position.

require

  • not_at_end: not end_of_input

ensure

  • not_at_end: not end_of_input
  • position_kept: position = old position
  • saved_position_set: saved_position = position

restore_saved_position

Restore the scanning position to the last saved one.

ensure

  • position_restored: position = old saved_position
  • not_at_end: not end_of_input

last_string: STRING

A string buffer.

last_integer: INTEGER

An integer buffer.

the_any_character_item: REGULAR_EXPRESSION_ITEM_ANY
the_not_end_of_line_item: REGULAR_EXPRESSION_ITEM_NOT_END_OF_LINE
the_begin_of_line_item: REGULAR_EXPRESSION_ITEM_BEGIN_OF_LINE
the_end_of_line_item: REGULAR_EXPRESSION_ITEM_END_OF_LINE
the_begin_of_text_item: REGULAR_EXPRESSION_ITEM_BEGIN_OF_TEXT
the_real_end_of_text_item: REGULAR_EXPRESSION_ITEM_END_OF_TEXT
the_end_of_text_item: REGULAR_EXPRESSION_ITEM_END_OF_TEXT
the_begin_of_word_item: REGULAR_EXPRESSION_ITEM_BEGIN_OF_WORD
the_end_of_word_item: REGULAR_EXPRESSION_ITEM_END_OF_WORD
the_is_posix_alnum_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_ALNUM
the_is_posix_alpha_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_ALPHA
the_is_posix_ascii_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_ASCII
the_is_posix_blank_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_BLANK
the_is_posix_cntrl_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_CNTRL
the_is_posix_digit_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_DIGIT
the_is_posix_graph_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_GRAPH
the_is_posix_lower_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_LOWER
the_is_posix_print_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_PRINT
the_is_posix_punct_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_PUNCT
the_is_posix_space_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_SPACE
the_is_posix_upper_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_UPPER
the_is_posix_word_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_WORD
the_is_posix_xdigit_item: REGULAR_EXPRESSION_ITEM_IS_POSIX_XDIGIT
has_named_posix_item (name: STRING): BOOLEAN

True if 'name' names a valid POSIX character class.

require

  • name_not_void: name /= Void

named_posix_item (name: STRING): REGULAR_EXPRESSION_ITEM

The item for the valid POSIX character class 'name'.

require

  • name_not_void: name /= Void
  • good_name: has_named_posix_item(name)

ensure

  • good_result: Result /= Void

has_named_perl_item (name: STRING): BOOLEAN

True if 'name' names a valid Perl character class.

require

  • name_not_void: name /= Void

named_perl_item (name: STRING): REGULAR_EXPRESSION_ITEM

The item for the valid Perl character class 'name'.

require

  • name_not_void: name /= Void
  • good_name: has_named_perl_item(name)

ensure

  • good_result: Result /= Void

internal_named_posix_item (name: STRING): REGULAR_EXPRESSION_ITEM

The item for a presumed POSIX character class 'name'.

require

  • name_not_void: name /= Void

internal_named_perl_item (name: STRING): REGULAR_EXPRESSION_ITEM

The item for a presumed Perl character class 'name'.

require

  • name_not_void: name /= Void

the_cut_node: BACKTRACKING_NODE_CUT
the_true_node: BACKTRACKING_NODE_TRUE
the_false_node: BACKTRACKING_NODE_FALSE
the_cut_and_false_node: BACKTRACKING_NODE_CUT_AND_FALSE

Class invariant