Tcl_RegExpMatch(3)          Tcl Library Procedures          Tcl_RegExpMatch(3)



NAME
       Tcl_RegExpMatch,  Tcl_RegExpCompile,  Tcl_RegExpExec,  Tcl_RegExpRange,
       Tcl_GetRegExpFromObj, Tcl_RegExpMatchObj,  Tcl_RegExpExecObj,  Tcl_Reg-
       ExpGetInfo - Pattern matching with regular expressions

SYNOPSIS
       #include <tcl.h>

       int
       Tcl_RegExpMatchObj(interp, strObj, patObj)

       int
       Tcl_RegExpMatch(interp, string, pattern)

       Tcl_RegExp
       Tcl_RegExpCompile(interp, pattern)

       int
       Tcl_RegExpExec(interp, regexp, string, start)

       void
       Tcl_RegExpRange(regexp, index, startPtr, endPtr)

       Tcl_RegExp
       Tcl_GetRegExpFromObj(interp, patObj, cflags)

       int
       Tcl_RegExpExecObj(interp, regexp, objPtr, offset, nmatches, eflags)

       void
       Tcl_RegExpGetInfo(regexp, infoPtr)


ARGUMENTS
       interp       Tcl interpreter to use for error reporting.  The
                    interpreter may be NULL if no error reporting is
                    desired.

       strObj       Refers to the object from which to get the string to
                    search.  The internal representation of the object
                    may be converted to a form that can be efficiently
                    searched.

       patObj       Refers to the object from which to get a regular
                    expression.  The compiled regular expression is
                    cached in the object.

       string       String to check for a match with a regular
                    expression.

       pattern      String in the form of a regular expression pattern.

       regexp       Compiled regular expression.  Must have been returned
                    previously by Tcl_GetRegExpFromObj or
                    Tcl_RegExpCompile.

       start        If string is just a portion of some other string,
                    this argument identifies the beginning of the larger
                    string.  If it isn't the same as string, then no ^
                    matches will be allowed.

       index        Specifies which range is desired:  0 means the range
                    of the entire match, 1 or greater means the range
                    that matched a parenthesized sub-expression.

       startPtr     The address of the first character in the range is
                    stored here, or NULL if there is no such range.

       endPtr       The address of the character just after the last one
                    in the range is stored here, or NULL if there is no
                    such range.

       cflags       OR-ed combination of compilation flags.  See below
                    for more information.

       objPtr       An object which contains the string to check for a
                    match with a regular expression.

       offset       The character offset into the string where matching
                    should begin.  The value of the offset has no impact
                    on ^ matches.  This behavior is controlled by eflags.

       nmatches     The number of matching subexpressions that should be
                    remembered for later use.  If this value is 0, then
                    no subexpression match information will be computed.
                    If the value is -1, then all of the matching
                    subexpressions will be remembered.  Any other value
                    will be taken as the maximum number of subexpressions
                    to remember.

       eflags       OR-ed combination of the values TCL_REG_NOTBOL and
                    TCL_REG_NOTEOL.  See below for more information.

       infoPtr      The address of the location where information about
                    a previous match should be stored by
                    Tcl_RegExpGetInfo.


DESCRIPTION
       Tcl_RegExpMatch determines whether its string argument matches
       pattern, where pattern is interpreted as a regular expression using
       the rules in the re_syntax reference page.  If there is a match then
       Tcl_RegExpMatch returns 1.  If there is no match then
       Tcl_RegExpMatch returns 0.  If an error occurs in the matching
       process (e.g. pattern is not a valid regular expression) then
       Tcl_RegExpMatch returns -1 and leaves an error message in the
       interpreter result.  Tcl_RegExpMatchObj is similar to
       Tcl_RegExpMatch except it operates on the Tcl objects strObj and
       patObj instead of UTF strings.  Tcl_RegExpMatchObj is generally more
       efficient than Tcl_RegExpMatch, so it is the preferred interface.

       Tcl_RegExpCompile, Tcl_RegExpExec, and Tcl_RegExpRange provide
       lower-level access to the regular expression pattern matcher.
       Tcl_RegExpCompile compiles a regular expression string into the
       internal form used for efficient pattern matching.  The return value
       is a token for this compiled form, which can be used in subsequent
       calls to Tcl_RegExpExec or Tcl_RegExpRange.  If an error occurs
       while compiling the regular expression then Tcl_RegExpCompile
       returns NULL and leaves an error message in the interpreter result.
       Note:  the return value from Tcl_RegExpCompile is only valid up to
       the next call to Tcl_RegExpCompile;  it is not safe to retain these
       values for long periods of time.

       Tcl_RegExpExec executes the regular expression pattern matcher.  It
       returns 1 if string contains a range of characters that match
       regexp, 0 if no match is found, and -1 if an error occurs.  In the
       case of an error, Tcl_RegExpExec leaves an error message in the
       interpreter result.  When searching a string for multiple matches of
       a pattern, it is important to distinguish between the start of the
       original string and the start of the current search.  For example,
       when searching for the second occurrence of a match, the string
       argument might point to the character just after the first match;
       however, it is important for the pattern matcher to know that this
       is not the start of the entire string, so that it doesn't allow ^
       atoms in the pattern to match.  The start argument provides this
       information by pointing to the start of the overall string
       containing string.  Start will be less than or equal to string;  if
       it is less than string then no ^ matches will be allowed.
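
       The repeated-search concern above can be tried out without the Tcl
       library.  The following is only a sketch under stated assumptions:
       it uses the POSIX regcomp and regexec functions as a stand-in, and
       the helper name count_matches is hypothetical.  POSIX signals "this
       is not the start of the whole string" with the REG_NOTBOL flag,
       whereas Tcl_RegExpExec infers it from a start pointer that is lower
       than string.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Count the matches of an extended regular expression in text, resuming
 * each search just after the previous match.  POSIX stand-in for the
 * Tcl_RegExpExec loop described above; this is not Tcl code.
 * Returns the number of matches, or -1 if the pattern does not compile.
 */
int count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];
    const char *cursor = text;
    int count = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0) {
        return -1;
    }
    for (;;) {
        /* Once the cursor has moved, it no longer marks the start of
         * the whole string, so "^" must not match at the cursor. */
        int eflags = (cursor == text) ? 0 : REG_NOTBOL;

        if (regexec(&re, cursor, 1, m, eflags) != 0) {
            break;                        /* no further match */
        }
        count++;
        if (m[0].rm_eo > m[0].rm_so) {
            cursor += m[0].rm_eo;         /* resume after the match */
        } else if (cursor[m[0].rm_eo] != '\0') {
            cursor += m[0].rm_eo + 1;     /* empty match: step one byte */
        } else {
            break;                        /* empty match at end of text */
        }
    }
    regfree(&re);
    return count;
}
```

       With the pattern "^a" and the text "aaa" the loop finds a single
       match; if the flag were omitted on the later searches, each
       remaining "a" would wrongly count as a start-of-string match.  The
       sketch steps by bytes and makes no attempt to handle multi-byte
       characters.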

       Tcl_RegExpRange may be invoked after Tcl_RegExpExec returns;  it
       provides detailed information about what ranges of the string
       matched what parts of the pattern.  Tcl_RegExpRange returns a pair
       of pointers in *startPtr and *endPtr that identify a range of
       characters in the source string for the most recent call to
       Tcl_RegExpExec.  Index indicates which of several ranges is desired:
       if index is 0, information is returned about the overall range of
       characters that matched the entire pattern;  otherwise, information
       is returned about the range of characters that matched the index'th
       parenthesized subexpression within the pattern.  If there is no
       range corresponding to index then NULL is stored in *startPtr and
       *endPtr.
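
       The pointer pair delimits a half-open range inside the source
       string, so the matched text is not NUL-terminated on its own.  A
       minimal sketch of copying such a range into a caller-supplied
       buffer follows.  The helper copy_range is hypothetical and is not
       part of the Tcl API; it deliberately avoids tcl.h so that it can be
       compiled on its own.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the half-open range [start, end) into buf as a NUL-terminated
 * string.  Returns the length of the range, or -1 if there is no such
 * range (NULL pointers) or if buf is too small to hold it.
 */
long copy_range(const char *start, const char *end, char *buf, size_t size)
{
    size_t len;

    if (start == NULL || end == NULL || end < start) {
        return -1;                /* no range for the requested index */
    }
    len = (size_t)(end - start);  /* end points just past the last char */
    if (len + 1 > size) {
        return -1;                /* no room for the text plus the NUL */
    }
    memcpy(buf, start, len);
    buf[len] = '\0';
    return (long)len;
}
```

       In a real program the two pointers would be the values stored
       through startPtr and endPtr by Tcl_RegExpRange after a successful
       Tcl_RegExpExec.  A return of 0 means the range exists but is empty,
       which is distinct from the -1 returned when NULL was stored.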

       Tcl_GetRegExpFromObj, Tcl_RegExpExecObj, and Tcl_RegExpGetInfo are
       object interfaces that provide the most direct control of Henry
       Spencer's regular expression library.  For users that need to modify
       compilation and execution options directly, it is recommended that
       you use these interfaces instead of calling the internal regexp
       functions.  These interfaces handle the details of UTF to Unicode
       translations as well as providing improved performance through
       caching in the pattern and string objects.

       Tcl_GetRegExpFromObj attempts to return a compiled regular
       expression from the patObj.  If the object does not already contain
       a compiled regular expression it will attempt to create one from the
       string in the object and assign it to the internal representation of
       the patObj.  The return value of this function is of type
       Tcl_RegExp.  The return value is a token for this compiled form,
       which can be used in subsequent calls to Tcl_RegExpExecObj or
       Tcl_RegExpGetInfo.  If an error occurs while compiling the regular
       expression then Tcl_GetRegExpFromObj returns NULL and leaves an
       error message in the interpreter result.  The regular expression
       token can be used as long as the internal representation of patObj
       refers to the compiled form.  The cflags argument is a bitwise OR of
       zero or more of the following flags that control the compilation of
       patObj:

         TCL_REG_ADVANCED
                Compile advanced regular expressions (`AREs').  This mode
                corresponds to the normal regular expression syntax
                accepted by the Tcl regexp and regsub commands.

         TCL_REG_EXTENDED
                Compile extended regular expressions (`EREs').  This mode cor-
                responds  to  the  regular expression syntax recognized by Tcl
                8.0 and earlier versions.

         TCL_REG_BASIC
                Compile basic regular expressions (`BREs').  This mode
                corresponds to the regular expression syntax recognized by
                common Unix utilities like sed and grep.  This is the
                default if no flags are specified.

         TCL_REG_EXPANDED
                Compile  the regular expression (basic, extended, or advanced)
                using an expanded syntax that allows comments and  whitespace.
                This  mode causes non-backslashed non-bracket-expression white
                space and #-to-end-of-line comments to be ignored.

         TCL_REG_QUOTE
                Compile a literal string, with all characters treated as
                ordinary characters.

         TCL_REG_NOCASE
                Compile  for  matching  that ignores upper/lower case distinc-
                tions.

         TCL_REG_NEWLINE
                Compile for newline-sensitive matching.  By default, newline
                is a completely ordinary character with no special meaning
                in either regular expressions or strings.  With this flag,
                `[^' bracket expressions and `.' never match newline, `^'
                matches an empty string after any newline in addition to
                its normal function, and `$' matches an empty string before
                any newline in addition to its normal function.
                TCL_REG_NEWLINE is the bitwise OR of TCL_REG_NLSTOP and
                TCL_REG_NLANCH.

         TCL_REG_NLSTOP
                Compile  for  partial  newline-sensitive  matching,  with  the
                behavior of `[^' bracket expressions and `.' affected, but not
                the  behavior  of  `^'  and  `$'.   In this mode, `[^' bracket
                expressions and `.' never match newline.

         TCL_REG_NLANCH
                Compile for inverse partial newline-sensitive matching, with
                the behavior of `^' and `$' (the ``anchors'') affected, but
                not the behavior of `[^' bracket expressions and `.'.  In
                this mode `^' matches an empty string after any newline in
                addition to its normal function, and `$' matches an empty
                string before any newline in addition to its normal
                function.

         TCL_REG_NOSUB
                Compile for matching that reports only success or failure, not
                what was matched.   This  reduces  compile  overhead  and  may
                improve performance.  Subsequent calls to Tcl_RegExpGetInfo or
                Tcl_RegExpRange will not report any match information.

         TCL_REG_CANMATCH
                Compile for matching that reports the potential to complete  a
                partial match given more text (see below).

       Only  one  of  TCL_REG_EXTENDED,  TCL_REG_ADVANCED,  TCL_REG_BASIC, and
       TCL_REG_QUOTE may be specified.

       Tcl_RegExpExecObj executes the regular expression pattern matcher.
       It returns 1 if objPtr contains a range of characters that match
       regexp, 0 if no match is found, and -1 if an error occurs.  In the
       case of an error, Tcl_RegExpExecObj leaves an error message in the
       interpreter result.  The nmatches value indicates to the matcher how
       many subexpressions are of interest.  If nmatches is 0, then no
       subexpression match information is recorded, which may allow the
       matcher to make various optimizations.  If the value is -1, then all
       of the subexpressions in the pattern are remembered.  If the value
       is a positive integer, then only that number of subexpressions will
       be remembered.  Matching begins at the specified Unicode character
       index given by offset.  Unlike Tcl_RegExpExec, the behavior of
       anchors is not affected by the offset value.  Instead the behavior
       of the anchors is explicitly controlled by the eflags argument,
       which is a bitwise OR of zero or more of the following flags:

         TCL_REG_NOTBOL
                The starting character will not be treated as the beginning
                of a line or the beginning of the string, so `^' will not
                match there.  Note that this flag has no effect on how `\A'
                matches.

         TCL_REG_NOTEOL
                The last character in the string will not be treated as the
                end of a line or the end of the string, so `$' will not
                match there.  Note that this flag has no effect on how `\Z'
                matches.

       Tcl_RegExpGetInfo retrieves information about the last match
       performed with a given regular expression regexp.  The infoPtr
       argument contains a pointer to a structure that is defined as
       follows:

              typedef struct Tcl_RegExpInfo {
                  int nsubs;
                  Tcl_RegExpIndices *matches;
                  long extendStart;
              } Tcl_RegExpInfo;

       The nsubs field contains a count of the number of parenthesized
       subexpressions within the regular expression.  If the TCL_REG_NOSUB
       flag was used, then this value will be zero.  The matches field
       points to an array of nsubs+1 values that indicate the bounds of
       each subexpression matched.  The first element in the array refers
       to the range matched by the entire regular expression, and
       subsequent elements refer to the parenthesized subexpressions in the
       order that they appear in the pattern.  Each element is a structure
       that is defined as follows:

              typedef struct Tcl_RegExpIndices {
                  long start;
                  long end;
              } Tcl_RegExpIndices;

       The start and end values are Unicode character indices relative to
       the offset location within objPtr where matching began.  The start
       index identifies the first character of the matched subexpression.
       The end index identifies the first character after the matched
       subexpression.  If the subexpression matched the empty string, then
       start and end will be equal.  If the subexpression did not
       participate in the match, then start and end will be set to -1.
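
       Because the indices are relative to offset, a caller usually has to
       add the offset back before indexing into the original string.  The
       sketch below shows that arithmetic under stated assumptions: the
       RangeIndices type is a local mirror of the Tcl_RegExpIndices layout
       shown above (declared here only so the sketch compiles without
       tcl.h), and absolute_range is a hypothetical helper, not a Tcl
       function.

```c
#include <stddef.h>

/* Local mirror of the Tcl_RegExpIndices layout; an assumption made so
 * that this sketch is self-contained. */
typedef struct RangeIndices {
    long start;
    long end;
} RangeIndices;

/*
 * Convert one entry of the matches array into an absolute character
 * position and a length.  offset is the value that was passed to
 * Tcl_RegExpExecObj.  Returns 1 and fills in *absStart and *length if
 * the subexpression participated in the match, or 0 (leaving both
 * outputs untouched) if start and end are -1.
 */
int absolute_range(const RangeIndices *r, long offset,
                   long *absStart, long *length)
{
    if (r->start < 0 || r->end < 0) {
        return 0;                     /* did not participate */
    }
    *absStart = offset + r->start;    /* indices are relative to offset */
    *length = r->end - r->start;      /* end is one past the last char */
    return 1;
}
```

       A length of 0 with a return value of 1 corresponds to a
       subexpression that matched the empty string.  The results are
       character counts, not byte counts, so they cannot be used directly
       as byte offsets into a UTF-8 buffer.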

       The extendStart field in Tcl_RegExpInfo is only set if the
       TCL_REG_CANMATCH flag was used.  It indicates the first character in
       the string where a match could occur.  If a match was found, this
       will be the same as the beginning of the current match.  If no match
       was found, then it indicates the earliest point at which a match
       might occur if additional text is appended to the string.  If no
       match is possible even with further text, this field will be set to
       -1.
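
       One plausible use of extendStart is deciding how much buffered
       input can be thrown away while waiting for more text to arrive.
       The following is a sketch only: droppable_prefix is a hypothetical
       helper, and it assumes that extendStart and the buffer length are
       counted in characters from the same origin.

```c
/*
 * Number of leading characters that can no longer begin a match and
 * may therefore be dropped from an input buffer.
 *   extendStart  value of the Tcl_RegExpInfo field of the same name
 *   length       number of characters currently buffered
 */
long droppable_prefix(long extendStart, long length)
{
    if (extendStart < 0) {
        return length;        /* no match possible: nothing worth keeping */
    }
    if (extendStart > length) {
        return length;        /* defensive clamp for inconsistent input */
    }
    return extendStart;       /* keep everything from extendStart onward */
}
```

       A result of 0 means the whole buffer must be kept, because a match
       could still begin at its first character.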

SEE ALSO
       re_syntax(n)

KEYWORDS
       match, pattern, regular expression, string, subexpression,
       Tcl_RegExpIndices, Tcl_RegExpInfo



Tcl                                   8.1                   Tcl_RegExpMatch(3)
