reguexec() – Matching Over Arbitrary Input

#include <tre/tre.h>
 
typedef struct {
  int (*get_next_char)(tre_char_t *c, unsigned int *pos_add, void *context);
  void (*rewind)(size_t pos, void *context);
  int (*compare)(size_t pos1, size_t pos2, size_t len, void *context);
  void *context;
} tre_str_source;
 
int tre_reguexec(const regex_t *preg, const tre_str_source *string, size_t nmatch,
                 regmatch_t pmatch[], int eflags);

The tre_reguexec() function works like the other regexec() functions, except that the input is read through user-specified callback functions instead of from a character array. This makes it possible, for example, to match regexps over arbitrary user-defined data structures.

The tre_str_source structure contains the following fields:

get_next_char
This function must retrieve the next available character. If a character is available, it must be stored in the space pointed to by c, the integer pointed to by pos_add must be set to the number of units advanced in the input (the value must be >= 1), and zero must be returned. If no character is available, the space pointed to by c must be set to zero and a nonzero value must be returned.
rewind
This function must rewind the input stream to the position specified by pos. Unless the regexp uses back references, rewind is not needed and can be set to NULL.
compare
This function compares two substrings of the input stream, each of length len, starting at the positions specified by pos1 and pos2. If the substrings are equal, compare must return zero; otherwise it must return a nonzero value. Unless the regexp uses back references, compare is not needed and can be set to NULL.
context
This is a context variable, passed as the last argument to each of the above functions, for keeping track of the internal state of the user's code.
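
As an illustration of the four fields, the following is a minimal sketch of a string source that reads from an in-memory byte buffer. The names buf_ctx, buf_get_next_char, buf_rewind, buf_compare and buf_source are hypothetical, as is the USE_REAL_TRE macro: when it is not defined, the sketch uses stand-in declarations copied from the synopsis so that it compiles without TRE installed, and it assumes tre_char_t is a plain unsigned char.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef USE_REAL_TRE              /* hypothetical macro: define it when TRE is installed */
#include <tre/tre.h>
#else
/* Stand-ins mirroring the synopsis, so the sketch builds without TRE. */
typedef unsigned char tre_char_t; /* assumption: build without wide characters */
typedef struct {
  int (*get_next_char)(tre_char_t *c, unsigned int *pos_add, void *context);
  void (*rewind)(size_t pos, void *context);
  int (*compare)(size_t pos1, size_t pos2, size_t len, void *context);
  void *context;
} tre_str_source;
#endif

/* Hypothetical context: a byte buffer plus a read cursor. */
typedef struct {
  const unsigned char *data;
  size_t len;
  size_t pos;                    /* index of the next unread byte */
} buf_ctx;

static int buf_get_next_char(tre_char_t *c, unsigned int *pos_add, void *context)
{
  buf_ctx *b = context;
  if (b->pos >= b->len) {
    *c = 0;                      /* no character available */
    return 1;
  }
  *c = b->data[b->pos++];
  *pos_add = 1;                  /* each character advances the input by one unit */
  return 0;
}

static void buf_rewind(size_t pos, void *context)
{
  buf_ctx *b = context;
  assert(pos <= b->len);         /* positions come from summed pos_add values */
  b->pos = pos;
}

static int buf_compare(size_t pos1, size_t pos2, size_t len, void *context)
{
  buf_ctx *b = context;
  /* Treat a range that runs past the end of the buffer as "not equal". */
  if (pos1 > b->len || pos2 > b->len ||
      len > b->len - pos1 || len > b->len - pos2)
    return 1;
  return memcmp(b->data + pos1, b->data + pos2, len) != 0;
}

/* Bundle the callbacks and the context into a tre_str_source. */
static tre_str_source buf_source(buf_ctx *b)
{
  tre_str_source src;
  src.get_next_char = buf_get_next_char;
  src.rewind = buf_rewind;
  src.compare = buf_compare;
  src.context = b;
  return src;
}
```

With the real library, the address of the structure returned by buf_source() would be passed as the string argument of tre_reguexec(). For a regexp without back references, the rewind and compare fields could be left as NULL, as described above.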

The position in the input stream is measured in size_t units. The starting position is zero, and the current position is the position of the last rewind, if any (zero otherwise), plus the sum of the pos_add increments returned since then. Submatch positions filled in the pmatch[] array are given using positions computed in this way.
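
To make the position bookkeeping concrete, here is a small sketch in which one character can advance the input by more than one unit: a source over text with CRLF line endings that delivers each "\r\n" pair as a single '\n' with pos_add set to 2. The names crlf_ctx, crlf_get_next_char, crlf_rewind and position_after are hypothetical, as is the USE_REAL_TRE macro; without it, tre_char_t is assumed to be unsigned char. The helper position_after() only mimics the arithmetic described above and is not part of TRE.

```c
#include <stddef.h>

#ifdef USE_REAL_TRE              /* hypothetical macro: define it when TRE is installed */
#include <tre/tre.h>
#else
typedef unsigned char tre_char_t; /* stand-in, assuming no wide characters */
#endif

/* Hypothetical context: raw bytes that use CRLF line endings. */
typedef struct {
  const char *data;
  size_t len;
  size_t cursor;                 /* byte offset of the next unread unit */
} crlf_ctx;

/* Delivers "\r\n" as one '\n' character that advances the input by 2 units. */
static int crlf_get_next_char(tre_char_t *c, unsigned int *pos_add, void *context)
{
  crlf_ctx *s = context;
  if (s->cursor >= s->len) {
    *c = 0;
    return 1;
  }
  if (s->data[s->cursor] == '\r' && s->cursor + 1 < s->len &&
      s->data[s->cursor + 1] == '\n') {
    *c = '\n';
    *pos_add = 2;
  } else {
    *c = (unsigned char)s->data[s->cursor];
    *pos_add = 1;
  }
  s->cursor += *pos_add;
  return 0;
}

static void crlf_rewind(size_t pos, void *context)
{
  crlf_ctx *s = context;
  s->cursor = pos;
}

/* Rewind to start, read up to n characters, and return the resulting
   position: start plus the sum of the pos_add increments. */
static size_t position_after(crlf_ctx *s, size_t start, size_t n)
{
  size_t pos = start;
  tre_char_t c;
  unsigned int add;
  crlf_rewind(start, s);
  while (n-- > 0 && crlf_get_next_char(&c, &add, s) == 0)
    pos += add;
  return pos;
}
```

For the four-byte input "a\r\nb", the characters seen by the matcher are 'a', '\n' and 'b', at positions 0, 1 and 3. Following the rule above, a match covering only 'b' would be expected to have rm_so set to 3 and rm_eo set to 4, which are byte offsets into the original buffer.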

For an example of how to use tre_reguexec(), see the tests/test-str-source.c file in the TRE source code distribution.