[TRE-general] Tre now core part of Felix
skaller
skaller at users.sourceforge.net
Sun May 21 17:21:36 EEST 2006
For your pleasure -- Tre is now a core part of Felix.
Tre is Ville Laurikari's Tagged Regular Expression engine.
This is the best performing regexp interpreter available.
The current implementation:
* is a fork of version 0.7.2 modified to compile under C++
* has all i18n, wchar_t, multibyte char, and configuration
options like use of alloca() turned off.
* presently supports only compiling a POSIX extended regexp
and matching the regexp against a string.
Enhanced functionality will be added, including a version
that substitutes the matched strings into a target string.
The current binding is a bit low level and may be modified.
Matches are returned in a malloc'ed array which the user
must free after finishing with it.
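
For readers who don't know the C side: the compile-then-match
pattern, with matches handed back in a malloc'ed array, looks
roughly like the sketch below. It is only an illustration, not
the Felix binding. It assumes the system's POSIX <regex.h>
(whose regcomp/regexec interface TRE follows) rather than the
forked Tre sources, and match_groups is a hypothetical helper
name that exists in neither Tre nor Felix.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Tre or Felix).
 * Compiles `pattern` as a POSIX extended regexp, matches it against `s`,
 * and returns a malloc'ed array of match offsets: entry 0 is the whole
 * match, entries 1..re_nsub are the parenthesised groups.
 * On success *n holds the number of entries.
 * Returns NULL if the pattern does not compile, allocation fails,
 * or the string does not match.
 * The caller owns the returned array and must free() it. */
static regmatch_t *match_groups(const char *pattern, const char *s, size_t *n)
{
    regex_t re;
    regmatch_t *m;

    assert(pattern != NULL && s != NULL && n != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;

    *n = re.re_nsub + 1;              /* whole match + one slot per group */
    m = malloc(*n * sizeof *m);
    if (m != NULL && regexec(&re, s, *n, m, 0) != 0) {
        free(m);                      /* no match: nothing to hand back */
        m = NULL;
    }

    regfree(&re);                     /* the compiled regexp is no longer needed */
    return m;
}
```

A caller would pass the pattern and string, walk the array
from 0 to n-1 treating rm_so == -1 as "this group did not
participate", and call free() on the array when done.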
The current source is a compromise.
We would like to use the latest sources, but Tre doesn't
yet build entirely cleanly on 64-bit platforms, and its
build scripts are standard GNU make/libtool things which
may not work on Windows. OTOH the Felix build system
currently cannot compile C code.
Ideally, Felix would build unmodified Tre sources
of the latest version, using our own config and
build scripts: this requires synchronising the
build/config options manually, but not touching
the sources.
An alternative is to provide users a choice between
binding to an already installed Tre system, or building
the Felix version. On Unix, you'd normally want to build
Tre yourself from Ville's sources. On Windows it is
trickier because there's no really standard way to
find the required header files.
Here's an extract of the (sole) regression test:
-----------------------------------------------
#import <flx.flxh>
include "tre.flx";
open Tre;
open C_hack;
open Carray;
print$ "Using tre " tre_version; endl;
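// Compile the pattern; the result is an option (Some regex or None).
var r = tre_regcomp("(a|b)*abb");
var re : tre_regex_t =
  match r with
  | Some ?re => re
  | None => re // HACK! no valid regex here: the test assumes the pattern compiles
  endmatch
;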
var s = "aabbabababb";
res,n,a := tre_regexec re s;
print "Result = "; print res; endl;
print "nmatches = "; print n; endl;
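var i : int;
// Walk the match array: entry 0 is the whole match, the rest are groups.
for_each { i=0; } { i<n } { ++i; }
{
  if int(a.[i].rm_so) == -1 do
    // rm_so == -1 means this group did not take part in the match
    print i; print " -> nomatch\n";
  else
    print i; print "-> match '";
    start := int(a.[i].rm_so);
    finish := int(a.[i].rm_eo);
    print s.[start to finish];
    print "'"; endl;
  done;
}
;
// The match array is malloc'ed by the binding, so release it here.
free a;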
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net