Regular Expression

This action is used to match a regular expression against a field.

The regular expressions are based on the ICU implementation and use the syntax described in the ICU regular expressions documentation.

You specify a source and a destination field. The same field may be specified for both if desired. The source and destination fields must both be metadata fields or both be named variables. If named variables are specified, the fields may contain any of the escape sequences described in Escape Sequences. When named variables are specified, the statement is executed only once, regardless of the execution mode.

The Regular expression field must resolve at runtime to a non-empty string. The Validate button can be used to test the validity of the field.

The Replace template field contains the template used when replacing a matched pattern.

In both the Regular expression and Replace template fields, the following escape sequences will have their appropriate values substituted prior to compiling the regular expression:

\v#
\<name>

All other standard escape sequences are ignored and passed through to the regular expression parser. Any regular expression metacharacters inserted by the substitution are escaped so that the inserted text is treated as a sequence of literal characters. When validating the Regular expression field, each escape sequence is treated as a single space.
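The effect of escaping inserted text can be sketched outside the application. The following is a minimal illustration, not the application's implementation: it uses Python's re module rather than ICU, handles only the \<name> form, and the helper build_pattern and the named variable Sep are hypothetical names chosen for the example.

```python
import re

def build_pattern(template, named, escape_inserted=True):
    """Expand \\<name> placeholders in a pattern template (hypothetical helper)."""
    def expand(match):
        value = named[match.group(1)]
        # Escaping makes every inserted character match itself literally.
        return re.escape(value) if escape_inserted else value
    # A function replacement is inserted as-is, with no backslash processing.
    return re.sub(r"\\<([^>]+)>", expand, template)

named = {"Sep": "a.b"}  # assumed contents of a named variable

literal = build_pattern(r"^\<Sep>\d+", named)
raw = build_pattern(r"^\<Sep>\d+", named, escape_inserted=False)

print(literal)  # the dot is escaped and matches only a literal "."
print(raw)      # the dot is left as a metacharacter and matches any character
```

With escaping enabled, the inserted dot only matches a literal dot. With escaping disabled, which corresponds roughly to inserting the contents unescaped, the same dot matches any character.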

As just mentioned, when a variable or named variable is replaced, its contents are treated as literal characters by default. They are escaped so that the regular expression parser does not interpret them as special sequences. If you want to store an actual regular expression in a variable, you must tell the application not to escape the inserted contents. Both the Regular expression and Replace template fields have an associated option called Do not escape inserted variable contents. The term variable applies to both Variables and Named Variables. When initially saving the expression to a variable, remember that Yate will attempt to process the escape sequences. You can prefix the string with \L, which disables escape sequence processing. For example:

Assume you are matching a sequence of three digits and you want to save the expression in Variable 1.

Set Variable 1 to (\d{3})

will not have the desired effect as Yate will insert the date for the \d sequence.

Set Variable 1 to \L(\d{3})

will work as escaping will have been disabled.

You specify whether you want to match all occurrences, only the first or only the last.
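As a rough sketch of what the three match modes mean, the following uses Python's re module (not ICU). The function select_matches and its mode names are hypothetical, and it assumes that "last" means the final match in a left-to-right, non-overlapping scan.

```python
import re

def select_matches(pattern, text, mode="all", ignore_case=False):
    """Return the match objects selected by mode: 'all', 'first' or 'last'."""
    flags = re.IGNORECASE if ignore_case else 0
    # finditer yields non-overlapping matches from left to right.
    matches = list(re.finditer(pattern, text, flags))
    if mode == "first":
        return matches[:1]
    if mode == "last":
        return matches[-1:]
    return matches

text = "a1 b22 c333"
first = [m.group(0) for m in select_matches(r"\d+", text, "first")]
last = [m.group(0) for m in select_matches(r"\d+", text, "last")]
everything = [m.group(0) for m in select_matches(r"\d+", text)]
```

Slicing with [:1] and [-1:] returns an empty list when nothing matched, so no separate check for the no-match case is needed.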

You can also select case insensitivity and choose to have the action state set or cleared depending on whether or not any changes were made.

There are three functions available:

Replace
The matches are replaced with the evaluated Replace template. The destination field will contain either the initial or modified source field.
Return Matches
The matches are returned in the destination. If more than one match was made, the returned matches will be separated by the default list delimiter (\~). This function is useful when you want to extract information from a field.
Return Ranges
Each match is returned as a range specified as location,length. Note that the locations are relative to the source string after variable escape sequences have been replaced. If more than one match was made, the returned ranges will be separated by the default list delimiter (\~).
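The three functions can be approximated as follows. This is a sketch under stated assumptions rather than the application's behavior: it uses Python's re module instead of ICU, the function names are hypothetical, the "|" delimiter is a placeholder standing in for whatever the default list delimiter resolves to, and locations are counted in Python string indices, which may differ from how the application counts characters.

```python
import re

DELIMITER = "|"  # placeholder for the default list delimiter (assumption)

def regex_replace(pattern, template, source):
    """Replace every match; also report whether anything changed."""
    # Python templates reference groups as \1; ICU templates use $1.
    result = re.sub(pattern, template, source)
    return result, result != source

def return_matches(pattern, source, delimiter=DELIMITER):
    """Join the matched substrings with the list delimiter."""
    return delimiter.join(m.group(0) for m in re.finditer(pattern, source))

def return_ranges(pattern, source, delimiter=DELIMITER):
    """Join location,length pairs, one per match, with the list delimiter."""
    return delimiter.join(
        f"{m.start()},{m.end() - m.start()}" for m in re.finditer(pattern, source)
    )

source = "ab123cd456"
replaced, changed = regex_replace(r"(\d{3})", r"<\1>", source)
matches = return_matches(r"\d{3}", source)
ranges = return_ranges(r"\d{3}", source)
```

The changed flag returned by regex_replace stands in for the action state described earlier: it is true only when the replacement altered the source.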

Unfortunately this function is not available for Snow Leopard users (OS 10.6.x) as the APIs are not available. The statement will be ignored if placed in an action. If the action state is being set, it will always be set to false.



More information on regular expressions may be found at:

Regular Expression Metacharacters

Regular Expression Operators

Regular Expression Replace Template Format

Regular Expression Flag Options

Information on alternate means of parsing or scanning may be found at:

File to Tag From Content

Find and Remove

Replace

Scanner

List Statements