regaexec() – Approximate Matching
#include <tre/tre.h>

typedef struct {
  int cost_ins;
  int cost_del;
  int cost_subst;
  int max_cost;
  int max_ins;
  int max_del;
  int max_subst;
  int max_err;
} regaparams_t;

typedef struct {
  size_t nmatch;
  regmatch_t *pmatch;
  int cost;
  int num_ins;
  int num_del;
  int num_subst;
} regamatch_t;

int tre_regaexec(const regex_t *preg, const char *string,
                 regamatch_t *match, regaparams_t params, int eflags);

int tre_reganexec(const regex_t *preg, const char *string, size_t len,
                  regamatch_t *match, regaparams_t params, int eflags);

int tre_regawexec(const regex_t *preg, const wchar_t *string,
                  regamatch_t *match, regaparams_t params, int eflags);

int tre_regawnexec(const regex_t *preg, const wchar_t *string, size_t len,
                   regamatch_t *match, regaparams_t params, int eflags);
The tre_regaexec() function searches string for the best match of the compiled regexp preg, which must have been initialized by a previous call to one of the regcomp functions.
The tre_reganexec() function is like tre_regaexec(), but string is not terminated by a null byte. Instead, the len argument gives the length of the string, and the string may contain null bytes. The tre_regawexec() and tre_regawnexec() functions work like tre_regaexec() and tre_reganexec(), respectively, but take a wide character (wchar_t) string instead of a byte string.
The eflags argument has the same meaning as for the tre_regexec() functions.
The params struct controls the approximate matching parameters:
- int cost_ins: The default cost of an inserted character, that is, an extra character in string.
- int cost_del: The default cost of a deleted character, that is, a character missing from string.
- int cost_subst: The default cost of a substituted character.
- int max_cost: The maximum allowed cost of a match. If this is set to zero, an exact match is searched for, and the results are equivalent to those returned by the regexec() functions.
- int max_ins: Maximum allowed number of inserted characters.
- int max_del: Maximum allowed number of deleted characters.
- int max_subst: Maximum allowed number of substituted characters.
- int max_err: Maximum allowed number of errors (inserts + deletes + substitutes).
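The way these fields combine can be illustrated with a small standalone sketch. It is not TRE's implementation: struct approx_params and struct edit_counts are hypothetical mirrors of the fields described above, defined locally so the code compiles without the TRE headers, and the cost formula assumes every edit is charged at its default cost.

```c
#include <stdbool.h>

/* Hypothetical mirror of the regaparams_t fields listed above, so the
   sketch compiles without the TRE headers. */
struct approx_params {
    int cost_ins, cost_del, cost_subst, max_cost;
    int max_ins, max_del, max_subst, max_err;
};

/* Hypothetical edit counts for one candidate match. */
struct edit_counts {
    int num_ins, num_del, num_subst;
};

/* Weighted cost, assuming each edit is charged its default cost. */
static int edit_cost(const struct approx_params *p,
                     const struct edit_counts *e)
{
    return p->cost_ins * e->num_ins
         + p->cost_del * e->num_del
         + p->cost_subst * e->num_subst;
}

/* True if the candidate stays within the cost limit and every count limit. */
static bool within_limits(const struct approx_params *p,
                          const struct edit_counts *e)
{
    int errors = e->num_ins + e->num_del + e->num_subst;
    return edit_cost(p, e) <= p->max_cost
        && e->num_ins <= p->max_ins
        && e->num_del <= p->max_del
        && e->num_subst <= p->max_subst
        && errors <= p->max_err;
}
```

Under this model the cost limit and the count limits are independent: a candidate can be cheap enough for max_cost and still be rejected because it exceeds max_ins, max_del, max_subst, or max_err.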
The match argument points to a regamatch_t structure. The nmatch and pmatch fields must be filled in by the caller. If REG_NOSUB was used when compiling the regexp, or match->nmatch is zero, or match->pmatch is NULL, the match->pmatch field is ignored. Otherwise, the submatches corresponding to the parenthesized subexpressions are stored in the elements of match->pmatch, which must have room for at least match->nmatch elements. The match->cost field is set to the cost of the match found, and the match->num_ins, match->num_del, and match->num_subst fields are set to the number of inserts, deletes, and substitutes in the match, respectively.
The tre_regaexec() functions return zero if a match with cost no greater than params.max_cost was found. Otherwise they return REG_NOMATCH to indicate no match, or REG_ESPACE to indicate that enough temporary memory could not be allocated to complete the matching operation.