[TRE-general] TRE coredump, regexp syntax, definition of i/d/s
Stavros Macrakis
macrakis at alum.mit.edu
Tue Jan 22 19:05:35 EET 2008
Thanks very much for the TRE package! Great stuff!
I'm afraid I have a few problems and questions which I'd appreciate your
help on. I am using TRE 0.7.5, gcc 3.4.4 under Cygwin/Windows XP.
-----
I have run into a problem where TRE coredumps, but I think that my input is
correct:
echo foo | agrep "((J{~1,1i+1d+1s<2})((a){~1,1i+1d+1s<2})|(~{~1,3d+10i+10s<4})J{~1,1i+1d+1s<2}o)"
==>
Program received signal SIGSEGV, Segmentation fault.
0x6ed0b2ea in tre_tnfa_run_approx (tnfa=0x673ad0, string=0x675988, len=9,
    type=STR_BYTE, match_tags=0x0, match=0x23c9b0,
    default_params={cost_ins = 1, cost_del = 1, cost_subst = 1, max_cost = 0,
      max_ins = 2147483647, max_del = 2147483647, max_subst = 2147483647,
      max_err = 2147483647},
    eflags=0, match_end_ofs=0x23c8f8) at tre-match-approx.c:682
682     for (trans = reach[id].state; trans->state; trans++)
When I remove various parts of the regexp, the problem goes away.
-----
I also seem not to understand the syntax of approximate matching settings.
I thought the following would work:
echo foo | agrep "g{1d+1i+1s<3}"
echo foo | agrep -9 "g{1d+1i+1s<3}"
but I get "Invalid contents of {}" in both cases. But the following works:
echo foo | agrep "g{~9,1d+1i+1s<3}"
Am I misunderstanding something?
-----
I don't seem to understand the definition of
insertion/deletion/substitution. For example, using the following test:
echo XXX | agrep -s "(^PPP$){~8,Ii+Dd+Ss<MAX}"
==> Cost: …
I get the following results:
XXX  PPP  I   D   S   Max  Cost
abc  a    1   1   1   20   3
abc  a    1   1   20  20   4
aaa  a    1   1   1   20   2
aaa  a    1   1   20  20   2
aba  a    1   20  20  20   2
abc  a    1   20  20  20   No
I can't figure out how these costs are calculated – is there some sort of
"explain" mode?
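
As a baseline for comparison, here is a minimal sketch of a conventional weighted edit distance (the textbook dynamic-programming recurrence). It is not TRE's matching algorithm. The function name and the convention used are assumptions for illustration: an insertion is taken to be an extra character in the text, and a deletion a pattern character missing from the text. Under that convention every row in the table above would come to 2 (two insertions at cost 1 each), which is not what agrep reports for several of the rows.

```python
def weighted_edit_distance(pattern, text, ins_cost, del_cost, sub_cost):
    """Minimum total cost of editing `pattern` so that it equals `text`.

    Assumed convention (not taken from the TRE sources):
      insertion    = an extra character present in `text`
      deletion     = a character of `pattern` missing from `text`
      substitution = a character of `pattern` replaced by another one
    """
    rows, cols = len(pattern) + 1, len(text) + 1
    dist = [[0] * cols for _ in range(rows)]

    # Matching a non-empty pattern prefix against empty text: delete everything.
    for i in range(1, rows):
        dist[i][0] = i * del_cost
    # Matching an empty pattern against a text prefix: insert everything.
    for j in range(1, cols):
        dist[0][j] = j * ins_cost

    for i in range(1, rows):
        for j in range(1, cols):
            same = pattern[i - 1] == text[j - 1]
            dist[i][j] = min(
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitute
                dist[i - 1][j] + del_cost,                       # delete from pattern
                dist[i][j - 1] + ins_cost,                       # insert from text
            )
    return dist[-1][-1]


if __name__ == "__main__":
    # (text, pattern, I, D, S) taken from the rows of the table above.
    table_rows = [
        ("abc", "a", 1, 1, 1),
        ("abc", "a", 1, 1, 20),
        ("aaa", "a", 1, 1, 1),
        ("aaa", "a", 1, 1, 20),
        ("aba", "a", 1, 20, 20),
        ("abc", "a", 1, 20, 20),
    ]
    for text, pattern, i_cost, d_cost, s_cost in table_rows:
        cost = weighted_edit_distance(pattern, text, i_cost, d_cost, s_cost)
        print(text, pattern, i_cost, d_cost, s_cost, cost)
```

The sketch ignores the Max bound and the ~8 error limit, since none of the baseline costs come near them.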
Thanks,
-s