regcomp() – Compiling Regexps

The regcomp() functions

#include <tre/tre.h>
 
int tre_regcomp(regex_t *preg, const char *regex, int cflags);
int tre_regncomp(regex_t *preg, const char *regex, size_t len, int cflags);
int tre_regwcomp(regex_t *preg, const wchar_t *regex, int cflags);
int tre_regwncomp(regex_t *preg, const wchar_t *regex, size_t len, int cflags);
void tre_regfree(regex_t *preg);

The regcomp() function compiles the regex string pointed to by regex to an internal representation and stores the result in the pattern buffer structure pointed to by preg. The regncomp() function is like regcomp(), but regex is not terminated with the null byte. Instead, the len argument is used to give the length of the string, and the string may contain null bytes. The regwcomp() and regwncomp() functions work like regcomp() and regncomp(), respectively, but take a wide character (wchar_t) string instead of a byte string.

The cflags argument is a the bitwise inclusive OR of zero or more of the following flags (defined in the header <tre/regex.h>):

REG_EXTENDED
Use POSIX Extended Regular Expression (ERE) compatible syntax when compiling regex. The default syntax is the POSIX Basic Regular Expression (BRE) syntax, but it is considered obsolete.
REG_ICASE
Ignore case. Subsequent searches with the regexec family of functions using this pattern buffer will be case insensitive.
REG_NOSUB
Do not report submatches. Subsequent searches with the regexec family of functions will only report whether a match was found or not and will not fill the submatch array.
REG_NEWLINE
Normally the newline character is treated as an ordinary character. When this flag is used, the newline character ('\n', ASCII code 10) is treated specially as follows:

  1. The match-any-character operator (dot "." outside a bracket expression) does not match a newline.
  2. A non-matching list ([^...]) not containing a newline does not match a newline.
  3. The match-beginning-of-line operator ^ matches the empty string immediately after a newline as well as the empty string at the beginning of the string (but see the REG_NOTBOL regexec() flag below).
  4. The match-end-of-line operator $ matches the empty string immediately before a newline as well as the empty string at the end of the string (but see the REG_NOTEOLregexec() flag below).
REG_LITERAL
Interpret the entire regex argument as a literal string, that is, all characters will be considered ordinary. This is a nonstandard extension, compatible with but not specified by POSIX.
REG_NOSPEC
Same as REG_LITERAL. This flag is provided for compatibility with BSD.
REG_RIGHT_ASSOC
By default, concatenation is left associative in TRE, as per the grammar given in the base specifications on regular expressions of Std 1003.1-2001 (POSIX). This flag flips associativity of concatenation to right associative. Associativity can have an effect on how a match is divided into submatches, but does not change what is matched by the entire regexp.
REG_UNGREEDY
By default, repetition operators are greedy in TRE as per Std 1003.1-2001 (POSIX) and can be forced to be non-greedy by appending a ? character. This flag reverses this behavior by making the operators non-greedy by default and greedy when a ? is specified.