[TRE-general] Doc update
codemstr at ptd.net
Wed Jun 14 05:08:04 EEST 2006
This patch brings the regex syntax and API docs up to date (and also
fixes a couple of typos).
Dominick Meglio
-------------- next part --------------
*** tre-syntax.old.html 2006-06-09 20:25:09.186123800 -0400
--- tre-syntax.html 2006-06-11 20:09:49.279430400 -0400
***************
*** 2,8 ****
<p>
This document describes the POSIX 1003.2 extended RE (ERE) syntax and
! the basic RE (BRE) syntax as implented by TRE, and the TRE extensions
to the ERE syntax. A simple Extended Backus-Naur Form (EBNF) style
notation is used to describe the grammar.
</p>
--- 2,8 ----
<p>
This document describes the POSIX 1003.2 extended RE (ERE) syntax and
! the basic RE (BRE) syntax as implemented by TRE, and the TRE extensions
to the ERE syntax. A simple Extended Backus-Naur Form (EBNF) style
notation is used to describe the grammar.
</p>
***************
*** 70,77 ****
| <a href="#assertion"><i>assertion</i></a>
| <a href="#literal"><i>literal</i></a>
| <a href="#backref"><i>back-reference</i></a>
! | <b>"(?"</b> [<b>"i" "n" "r"</b>]* (<b>"-"</b> [<b>"i" "n" "r"</b>]*)? <b>")"</b> <i>extended-regexp</i>
! | <b>"(?"</b> [<b>"i" "n" "r"</b>]* (<b>"-"</b> [<b>"i" "n" "r"</b>]*)? <b>":"</b> <i>extended-regexp</i> <b>")"</b>
</pre>
</td></tr>
</table>
--- 70,78 ----
| <a href="#assertion"><i>assertion</i></a>
| <a href="#literal"><i>literal</i></a>
| <a href="#backref"><i>back-reference</i></a>
! | <b>"(?#"</b> <i>comment-text</i> <b>")"</b>
! | <b>"(?"</b> <a href="#options"><i>options</i></a> <b>")"</b> <i>extended-regexp</i>
! | <b>"(?"</b> <a href="#options"><i>options</i></a> <b>":"</b> <i>extended-regexp</i> <b>")"</b>
</pre>
</td></tr>
</table>
***************
*** 87,93 ****
character is not matched.
</p>
!
<h3>Repeat operators</h3>
<a name="repeat-operator"></a>
--- 88,97 ----
character is not matched.
</p>
! <p>
! <tt>Comment-text</tt> can contain any characters except for a closing parenthesis <tt>)</tt>. The text in the comment is
! completely ignored by the regex parser and is used solely for readability purposes.
! </p>
<h3>Repeat operators</h3>
<a name="repeat-operator"></a>
***************
*** 332,338 ****
<p>
A literal is either an ordinary character (a character that has no
other significance in the context), an 8 bit hexadecimal encoded
! character (e.g. <tt>\x1B</tt>9, a wide hexadecimal encoded character
(e.g. <tt>\x{263a}</tt>), or an escaped character. An escaped
character is a <tt>\</tt> followed by any character, and matches that
character. Escaping can be used to match characters which have a
--- 336,342 ----
<p>
A literal is either an ordinary character (a character that has no
other significance in the context), an 8 bit hexadecimal encoded
! character (e.g. <tt>\x1B</tt>), a wide hexadecimal encoded character
(e.g. <tt>\x{263a}</tt>), or an escaped character. An escaped
character is a <tt>\</tt> followed by any character, and matches that
character. Escaping can be used to match characters which have a
***************
*** 380,385 ****
--- 384,412 ----
EREs and BREs.
</p>
+ <h3>Options</h3>
+ <a name="options"></a>
+ <table bgcolor="#e0e0f0" cellpadding="10">
+ <tr><td>
+ <pre>
+ <i>options</i> ::= [<b>"i" "n" "r" "U"</b>]* (<b>"-"</b> [<b>"i" "n" "r" "U"</b>]*)?
+ </pre>
+ </td></tr>
+ </table>
+
+ Options allow compile-time flags to be turned on or off for particular parts of the
+ regular expression. Each option corresponds to a compile-time flag that can be passed to
+ the regcomp API function. An option specified in the first section is
+ turned on; an option specified in the second section (after the <tt>-</tt>) is
+ turned off.
+ <ul>
+ <li>i - Case-insensitive matching.</li>
+ <li>n - Forces special handling of the newline character. See the REG_NEWLINE flag in
+ the <a href="tre-api.html">API Manual</a>.</li>
+ <li>r - Causes the regex to be matched in a right-associative manner rather than the normal
+ left-associative manner. See the REG_RIGHT_ASSOC flag.</li>
+ <li>U - Forces repetition operators to be non-greedy unless a <tt>?</tt> is appended. See the REG_UNGREEDY flag.</li>
+ </ul>
<h2>BRE Syntax</h2>
<p>
*** tre-api.old.html 2006-06-09 20:25:06.432163800 -0400
--- tre-api.html 2006-06-11 17:33:16.242907200 -0400
***************
*** 110,120 ****
considered ordinary. This is a nonstandard extension, compatible with
but not specified by POSIX.</dd>
! <dt><tt>REG_NOSPEC</tt><dt>
<dd>Same as <tt>REG_LITERAL</tt>. This flag is provided for
compatibility with BSD.</dd>
! <dt><tt>REG_RIGHT_ASSOC</tt><dt>
<dd>By default, concatenation is left associative in TRE, as per
the grammar given in the <a
href="http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap09.html">base
--- 110,120 ----
considered ordinary. This is a nonstandard extension, compatible with
but not specified by POSIX.</dd>
! <dt><tt>REG_NOSPEC</tt></dt>
<dd>Same as <tt>REG_LITERAL</tt>. This flag is provided for
compatibility with BSD.</dd>
! <dt><tt>REG_RIGHT_ASSOC</tt></dt>
<dd>By default, concatenation is left associative in TRE, as per
the grammar given in the <a
href="http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap09.html">base
***************
*** 122,127 ****
--- 122,133 ----
This flag flips associativity of concatenation to right associative.
Associativity can have an effect on how a match is divided into
submatches, but does not change what is matched by the entire regexp.
+ </dd>
+
+ <dt><tt>REG_UNGREEDY</tt></dt>
+ <dd>By default, repetition operators are greedy in TRE as per Std 1003.1-2001 (POSIX) and
+ can be forced to be non-greedy by appending a <tt>?</tt> character. This flag reverses this behavior
+ by making the operators non-greedy by default and greedy when a <tt>?</tt> is specified.</dd>
</dl>
</blockquote>