The Pattern built-ins implement a simplified and slightly modified subset of UNIX regular expression pattern matching. Find patterns are compiled to locate text within a String, and change patterns are compiled to replace the text that was found.
A find pattern may consist of the following elements:
Element | Meaning in find pattern |
---|---|
c | Match exact character (c is any character here) |
@c | Match escaped character (such as @%, @[, @*) |
? | Match any character except end of line |
% | Match beginning of line |
$ | Match end of line (null string before end of line) |
[...] | Match character class (any one of 'these' characters, such as [abc] or [a-z]) |
[!...] | Match negated character class (all but these characters, such as [! ] for no spaces) |
* | Closure (match zero or more occurrences of the previous element, such as [0-9]* for zero or more digits) |
+ | Closure (match one or more occurrences of the previous element, such as [A-Za-z]+ for one or more letters) |
<> | Brackets tagged material which is specially extracted for later reference or use |
Any element of a find pattern may be 'tagged' with the angle brackets <>. The tagged elements are numbered 1..n from left to right and can be specially extracted from the return value of Pattern.Find() or specially referred to in a change pattern.
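As a minimal sketch of tagging, the snippet below tags the key and the value of a simple "key=value" string separately. The sample text and variable names are illustrative assumptions; the positions of the tagged elements in the returned List follow the description of Pattern.Find().

```oscript
// Hypothetical input; the two <> pairs tag the key and the value.
String text = "timeout=30"
List result = Pattern.Find( text, "<[a-z]+>=<[0-9]+>" )

if ( IsDefined( result ) )
    // result[ 3 ] is the complete match; tagged elements start at result[ 4 ].
    Echo( "match: ", result[ 3 ] )
    Echo( "tag 1: ", result[ 4 ] )
    Echo( "tag 2: ", result[ 5 ] )
end
```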
A character class consists of zero or more of the following elements, surrounded by brackets ([]):
Element | Meaning in character class |
---|---|
c | Literal character, including [ |
a-c | Range of characters (digits, lower or upper case) |
! | Negated character class (if at beginning) |
@c | Escaped character (@!, @-, @@, @]) |
A character loses its special meaning in a character class when it is escaped. An exclamation point (!) is also treated literally when it is not at the beginning of the class, and a dash (-) is treated literally when it is at the beginning or end of the class.
A change pattern maps the results of a find pattern to a replacement pattern, as in a find and replace operation. It consists of zero or more of the following elements:
Element | Meaning in change pattern |
---|---|
c | literal character |
& | ditto (whatever was matched) |
@c | escaped character (such as @&) |
#n | tagged substring insertion |
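The following sketch illustrates the two insertion elements of a change pattern. The input strings are made up for illustration, and the patterns are passed as plain Strings rather than compiled objects.

```oscript
// & inserts whatever the find pattern matched, here each run of digits.
String wrapped = Pattern.Change( "width 10 height 20", "[0-9]+", "(&)" )
Echo( wrapped )

// #1 and #2 insert the first and second tagged substrings,
// so the two words around the comma are written back in reverse order.
String swapped = Pattern.Change( "Doe, Jane", "<[A-Za-z]+>, <[A-Za-z]+>", "#2 #1" )
Echo( swapped )
```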
In any pattern, an escape sequence consists of the character @ followed by a single character:
Element | Meaning in escape sequence |
---|---|
@r | carriage return |
@n | end of line |
@t | tab character |
@c | any other character (including @@) |
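A short sketch of escape sequences in use is given below. The function names are hypothetical helpers, not part of the Pattern package.

```oscript
// Hypothetical helper: replace every tab character with a single space.
function String TabsToSpaces( String line )
    return Pattern.Change( line, "@t", " " )
end

// Hypothetical helper: look for a literal asterisk. Unescaped, * is the
// closure element, so it is escaped with @ to match the character itself.
function List FindAsterisk( String line )
    return Pattern.Find( line, "@*" )
end
```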
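For those familiar with UNIX regular expressions, OScript's pattern built-ins are easy to use once the following differences in notation are kept in mind:
UNIX regexp | OScript pattern |
---|---|
\c | @c (escaped character) |
. | ? (any character except end of line) |
^ | % (beginning of line) |
[^...] | [!...] (negated character class) |
\\(...\\) | <...> (tagged material) |
\1, \2, ... in a replacement | #1, #2, ... (tagged substring insertion) |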
Here is an example of Pattern.Find() which retrieves name value pairs, formatted in a particular way (" name = value;"), from a data string:

    String target = "<data> firstname = Mary;lastname =Joe; phone = 555-5555; address= 4 Some Lane, Nowhere, SS 99999; </data>"
    String search = " *<[A-Za-z]+> *= *<[!;]+>;"
    String s = target

    // Compile once, then reuse the compiled pattern for every search.
    PatFind finder = Pattern.CompileFind( search )
    List result = Pattern.Find( s, finder )

    while ( IsDefined( result ) )
        // result[ 4 ] and result[ 5 ] are the tagged name and value.
        Echo( result[ 4 ], ' = "', result[ 5 ], '"' )

        // Continue searching in the text that follows the match.
        s = s[ result[ 2 ] : ]
        result = Pattern.Find( s, finder )
    end

The output of the example is:

    firstname = "Mary"
    lastname = "Joe"
    phone = "555-5555"
    address = "4 Some Lane, Nowhere, SS 99999"

Here is an example of Pattern.Change() which operates on the same data, but instead produces an alteration of the input string with quoted data values and extraneous spaces removed:

    String target = "<data> firstname = Mary;lastname =Joe; phone = 555-5555; address= 4 Some Lane, Nowhere, SS 99999; </data>"
    String search = " *<[A-Za-z]+> *= *<[!;]+>;"
    String change = '#1="#2";'

    PatFind finder = Pattern.CompileFind( search )
    PatChange replacer = Pattern.CompileChange( change )
    String result = Pattern.Change( target, finder, replacer )

    Echo( "Input: ", target )
    Echo( "Result: ", result )

The output of the example is:

    Input: <data> firstname = Mary;lastname =Joe; phone = 555-5555; address= 4 Some Lane, Nowhere, SS 99999; </data>
    Result: <data>firstname="Mary";lastname="Joe";phone="555-5555";address="4 Some Lane, Nowhere, SS 99999"; </data>

Error returned when an invalid change pattern is compiled.
Error returned when an invalid find pattern is compiled.
The datatype number for the PatChange datatype.
The datatype number for the PatFind datatype.
Method | Description |
---|---|
Pattern.Change() | Performs a find and replace on the target String with the given patterns. |
Pattern.CompileChange() | Returns a compiled version of the specified change pattern. |
Pattern.CompileFind() | Returns a compiled version of the specified find pattern. |
Pattern.Find() | Returns the result of applying the find pattern to the target String. |
Searches the target String for all occurrences of the find pattern, and replaces each occurrence with the specified change pattern.
The target String upon which to perform a search and replace.
The find pattern, either as a String or a compiled PatFind.
The change pattern, either as a String or a compiled PatChange.
If specified and TRUE, case is ignored in comparisons. The default is FALSE, which gives case-sensitive comparisons.
The new String result of the find and replace operation.
See the example in the class description.
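As a further sketch, the optional fourth argument can be used for a case-insensitive replacement. The sample text is an assumption for illustration, and both patterns are passed as Strings.

```oscript
// Hypothetical log text containing the same word in different cases.
String text = "Error: disk full. ERROR: retry failed."

// With TRUE as the fourth argument, case is ignored when matching "error".
String result = Pattern.Change( text, "error", "warning", TRUE )
Echo( result )
```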
Returns a compiled version of the specified String change pattern.
The change pattern to compile.
The successfully compiled PatChange, or an Error if the pattern String could not be compiled.
Returns a compiled version of the specified String find pattern.
The find pattern to compile.
The successfully compiled PatFind, or an Error if the pattern String could not be compiled.
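Because compilation can return an Error instead of a compiled pattern, a caller may want to check the return value before using it. The sketch below assumes that an unterminated character class is rejected as invalid; the exact set of invalid patterns is not listed here. The return value is held in a Dynamic variable so that either outcome can be assigned.

```oscript
// "[a-z" has no closing bracket; it is assumed here to be an invalid pattern.
Dynamic compiled = Pattern.CompileFind( "[a-z" )
List result

if ( IsError( compiled ) )
    Echo( "The find pattern could not be compiled." )
else
    result = Pattern.Find( "abc", compiled )
end
```

The same check applies to the return value of Pattern.CompileChange().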
Finds the first match of the find pattern in the target String and returns a List describing it. If no match is found, Undefined is returned.
The target String upon which the pattern match is performed.
The find pattern, either as a String or a compiled PatFind.
If specified and TRUE, case is ignored in comparisons. The default is FALSE, which gives case-sensitive comparisons.
A List if a match was found, Undefined if no match was found, or an Error if a String pattern could not be compiled. The List contains the following elements:
Element | Meaning |
---|---|
1 | The inclusive start index of the match within the _target_ String. |
2 | The exclusive end index of the match within the _target_ String. |
3 | The complete match text as a String. |
[ 4 onwards ] | Optionally any tagged elements extracted as Strings listed in the order they were tagged. |
Remember that Pattern.Find() returns only the first match found. To find multiple matches, pass the material following each match back to Find() until no further match is found. The find pattern should also be compiled beforehand to avoid the cost of repeated, unnecessary compilation. See the example in the class description.
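The repeated-search approach can be wrapped in a small helper. FindAll is a hypothetical function, not part of the Pattern package, and the sketch assumes that the find pattern is valid and never matches an empty string, since an empty match would leave the remaining text unchanged and the loop would not terminate.

```oscript
// Hypothetical helper: returns the text of every match of search in target.
function List FindAll( String target, String search )
    List matches = {}
    String rest = target

    // Compile once and reuse the compiled pattern for each search.
    PatFind finder = Pattern.CompileFind( search )
    List result = Pattern.Find( rest, finder )

    while ( IsDefined( result ) )
        // result[ 3 ] is the complete match text; append it to the list.
        matches = { @matches, result[ 3 ] }

        // result[ 2 ] is the exclusive end index, so the next search
        // starts with the text that follows the match.
        rest = rest[ result[ 2 ] : ]
        result = Pattern.Find( rest, finder )
    end

    return matches
end
```

A call such as FindAll( "a1 b22 c333", "[0-9]+" ) would then collect each run of digits in the order in which it appears.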
Copyright © 2022 OpenText Corporation. All rights reserved. |