FIELDS(3) FIELDS(3)
NAME
fieldread, fieldmake, fieldwrite, fieldfree - field access(2,5) package
SYNTAX
#include "fields.h"
typedef struct {
int nfields;
int hadnl;
char *linebuf;
char **fields;
} field_t;
#define FLD_RUNS 0x0001
#define FLD_SNGLQUOTES 0x0002
#define FLD_BACKQUOTES 0x0004
#define FLD_DBLQUOTES 0x0008
#define FLD_SHQUOTES 0x0010
#define FLD_STRIPQUOTES 0x0020
#define FLD_BACKSLASH 0x0040
extern field_t *fieldread (FILE * file(1,n), char * delims,
int flags, int maxf);
extern field_t *fieldmake (char * line, int allocated,
char * delims, int flags, int maxf);
extern int fieldwrite (FILE * file(1,n), field_t * fieldp,
int delim);
extern void fieldfree (field_t * fieldp);
extern unsigned int field_line_inc;
extern unsigned int field_field_inc;
DESCRIPTION
The fields access(2,5) package eases the common task of parsing and access-
ing information which is separated into fields by whitespace or other
delimiters. Various options can be specified to handle many common
cases, including selectable delimiters, runs of delimiters, and quot-
ing.
fieldread reads one line from a file(1,n), parses it into fields as speci-
fied by the parameters, and returns a field_t structure describing the
result. fieldmake performs the same process on a buffer already in(1,8)
memory. fieldwrite creates an output line from a field_t structure and
writes it to an output file. fieldfree frees a field_t structure and
any associated memory allocated by the package.
The field_t structure describes the fields in(1,8) a parsed line. A well-
behaved should only access(2,5) the nfields, fields, and hadnl elements; all
other elements are used internally by the package and are not guaran-
teed to remain the same even though they are documented here. Nfields
gives the number of fields in(1,8) the parsed line, just like the argc argu-
ment to a C program; fields is a pointer to an array of string(3,n) point-
ers, just like the argv argument to a C program. As in(1,8) C, the last
field pointer is followed by a null pointer, although the field count
is the preferred method of accessing fields. The user may alter
nfields by decreasing it, and may replace any pointer in(1,8) fields without
harm. This is often useful in(1,8) replacing a single field with a calcu-
lated value preparatory to output. The hadnl element is nonzero if(3,n) the
original line was terminated with a newline when it was parsed; this is
used to accurately reproduce the input when fieldwrite is called.
The linebuf element contains a pointer to an internal buffer allocated
by fieldread or provided to fieldmake. This buffer is not guaranteed
to contain anything sensible, although in(1,8) the current implementation
all of the field contents can be found therein.
fieldread reads a single line of arbitrary length from file(1,n), allocating
as much memory as necessary to hold it, and then parses the line
according to its remaining arguments. A pointer to the parsed field_t
structure is returned, with NULL returned if(3,n) an error(8,n) occurs or if(3,n) EOF
is reached on the input file. Fields in(1,8) the input line are considered
to be separated by any of the delimiters in(1,8) the delims parameter. For
example, if(3,n) delimiters of ":.;" are specified, a line containing
"a:b;c.d" would be considered to have four fields.
The default parsing of fields considers each delimiter to indicate a
separate field, and does not allow any quoting. This is similar to the
parsing done by cut(1). This behavior can be modified by specifying
flags. Multiple flags may be OR'ed together. The available flags are:
FLD_RUNS
Consider runs of delimiters to be the same as a single delim-
iter, suppressing all null fields. This is similar to the way
utilities like awk(1) and sort(1,3)(1) treat whitespace, but it is
not limited to whitespace. A run does not have to consist of a
single type of delimiter; if(3,n) both semicolon and colon are
delimiters, ";::;" is a run.
FLD_SNGLQUOTES
Allow field contents to be quoted with single quotes. Delim-
iters and other quotes appearing within single quotes are
ignored. This may appear in(1,8) combination with other quote
options.
FLD_BACKQUOTES
Allow field contents to be quoted with reverse single quotes.
Delimiters and other quotes appearing within reverse single
quotes are ignored. This may appear in(1,8) combination with other
quote options.
FLD_DBLQUOTES
Allow field contents to be quoted with single quotes. Delim-
iters and other quotes appearing within double quotes are
ignored. This may appear in(1,8) combination with other quote
options.
FLD_SHQUOTES
Allow shell-style quoting. In the absence of this option,
quotes are only recognized at the beginning of a field, and
characters following the close(2,7,n) quote are removed from the field
(and are thus lost from the input line). If this option is
specified, quotes may appear within a field, in(1,8) the same way as
they are handled by sh(1). Multiple quoting styles may be used
in(1,8) the same field. If none of FLD_SNGLQUOTES, FLD_BACKQUOTES,
or FLD_DBLQUOTES is specified with FLD_SHQUOTES, all three
options are implied.
FLD_STRIPQUOTES
Remove quotes and backslash sequences from the field while pars-
ing, converting backslash sequences to their proper ASCII equiv-
alent. The C sequences \a, \b, \f, \n, \r, \v, \xnn, and \nnn
are supported. Any other sequence is simply converted to the
backslashed character, as in(1,8) sh(1).
FLD_BACKSLASH
Accept standard C-style backslash sequences. The sequence will
be converted to an ASCII equivalent if(3,n) FLD_STRIPQUOTES is speci-
fied (q.v.).
FLD_NOSHRINK
Don't shrink allocated memory using realloc(3) before returning.
This option can have a significant effect on performance, espe-
cially when fieldfree is going to be called soon after fieldread
or fieldmake. The disadvantage is that slightly more memory
will be occupied until the field structure is freed.
The maxf parameter, if(3,n) nonzero, specifies the maximum number of fields
to be generated. This may enhance performance if(3,n) only the first few
fields of a long line are of interest to the caller. The actual number
of fields returned is one greater than maxf, because the remainder of
the line will be returned as a single contiguous (and uninterpreted,
FLD_STRIPQUOTES or FLD_BACKSLASH is specified) field.
fieldmake operates exactly like fieldread, except that the line parsed
is provided by the caller rather than being read(2,n,1 builtins) from a file. If the
allocated parameter is nonzero, the memory pointed to by the line
parameter will automatically be freed when fieldfree is called; other-
wise this memory is the caller's responsibility. The memory pointed to
by line is destroyed by fieldmake. All other parameters are the same
as for fieldread.
fieldwrite writes a set(7,n,1 builtins) of fields to the specified file(1,n), separating
them with the delimiter character delim (note that this is a character,
not a string(3,n)), and appending a newline if(3,n) specified by the hadnl ele-
ment of the structure. The field structure is not freed. fieldwrite
will return nonzero if(3,n) an I/O error(8,n) is detected.
fieldfree frees the field_t structure passed to it, along with any
associated auxiliary memory allocated by the package (or passed to
fieldmake). The structure may not be accessed after fieldfree is
called.
field_line_inc (default 512) and field_field_inc (default 20) describe
the increments to use when expanding lines as they are read(2,n,1 builtins) in(1,8) and
parsed. fieldread initially allocates a buffer of field_line_inc bytes
and, if(3,n) the input line is larger than that, expands the buffer in(1,8)
increments of the same amount until it is large enough. If input lines
are known to consistently reach a certain size, performance will be
improved by setting field_line_inc to a value larger than that size
(larger because there must be room for a null byte). field_field_inc
serves the same purpose in(1,8) both fieldread and fieldmake, except that it
is related to the number of fields in(1,8) the line rather than to the line
length. If the number of fields is known, performance will be improved
by setting field_field_inc to at least one more than that number.
RETURN VALUES
fieldread and fieldmake return NULL if(3,n) an error(8,n) occurs or if(3,n) EOF is
reached on the input file. fieldwrite returns nonzero if(3,n) an output
error(8,n) occurs.
BUGS
Thanks to the vagaries of ANSI C, the fields.h header file(1,n) defines an
auxiliary macro named(5,8) P. If the user needs a similarly-named macro,
this macro must be undefined first, and the user's macro must be
defined after fields.h is included.
local FIELDS(3)