The Substdio Library

There’s a joke here, whether Bernstein intended it or I just made it up. My bet is on his quirky sense of humor.

As I see it, this library is used in place of the standard Kernighan and Ritchie (K&R) C I/O library routines. Perhaps that makes these sub-standard I/O routines? hehe

Anyway, they are here, so get used to them! They are meant to replace the “fopen, fclose, fread, fprintf” routines found in typical C programs for performing input/output on the standard Unix stream files. If you have read my stralloc page, you know how the K&R string library has also been replaced by the DJB stralloc functions. In the same manner, we find the substdio library replacing the K&R buffered stream I/O library.

The File

One of the coolest things about Unix is that the entire OS is written on the premise that everything that sends or receives stream data is a file. This includes our beloved stream files, network I/O sockets, pipes (inter-process communication, IPC), the STDIN/STDOUT/STDERR console streams that can be redirected, and even processes. Each opened file is allocated an operating system resource known as a file descriptor. I won’t offer any more prologue than that, because it is covered in many texts, including the great and famous Advanced Programming in the UNIX Environment by W. Richard Stevens.

The substdio library uses a structure to manage its files. The library sits somewhere between the raw I/O of the open/read/write/close routines and the buffered fopen/fread/fwrite/fclose routines, both of which come with the standard C library.

#include "substdio.h"

This includes the definitions of the substdio structures and the routines that use them. The definition of the substdio file structure is as follows.

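typedef struct substdio {
  char *x;       /* buffer supplied by the caller */
  int p;         /* number of bytes currently held in the buffer */
  int n;         /* input: free room in the buffer; output: buffer size */
  int fd;        /* file descriptor passed to op */
  int (*op)();   /* routine that performs the real I/O, e.g. read or write */
} substdio;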

Before you can use a file, you must declare it, and you had better do it in static memory (such as the file's global space), because the structure and its buffer must stay alive for as long as the file is in use. You can declare a file like this:

int fdinfo;
char buf[128];
substdio ss;
...
fdinfo = open_read(fn.s);
substdio_fdbuf(&ss,read,fdinfo,buf,sizeof(buf));

The substdio_fdbuf function initializes the file before you use it! These parameters mean:

  • &ss - The name of the substdio structure. As it is a C structure, you must pass it by reference (use the “&” address-of prefix). This avoids copying the whole structure as a memory block, which matters for runtime efficiency, and a pointer is also what the routines expect!
  • ss→op = read - This is really a pointer to an int read() function that is called to perform the I/O. This is what differentiates an input file from an output file! In qmail, this is the ordinary Unix read routine, which the “readwrite.h” file declares. For output files, this may be “write” or your home-grown function.
  • ss→fd = fdinfo - This is the file resource number returned from the open file statement, and is essentially your “FD”, file descriptor. You can see that fdinfo was returned from the open_read function. This is just a DJB routine that does a standard read open in “O_RDONLY | O_NDELAY” mode. You can use open_write to open a file in “O_WRONLY | O_NDELAY” mode instead.
  • ss→x = buf - Your buffer variable. Make sure your buffer is large enough to handle a single record. If you have a buffer of a single character, remember to use the “&” address-of operator to pass its address.
  • ss→n = sizeof(buf) - Yes, it’s the largest number of characters/bytes to read into your buffer.

Once this is executed, it stores these preferences in the substdio structure.
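To make that bookkeeping concrete, here is a minimal sketch of what the initialization amounts to. The demo_substdio type and demo_fdbuf function are hypothetical stand-ins written for this page, not the qmail sources; they assume the field layout shown earlier and swap in a modern prototype for the I/O callback so the sketch compiles on its own.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type; the real struct uses an old-style int (*op)(). */
typedef ssize_t (*demo_op)(int fd, void *buf, size_t len);

/* Hypothetical stand-in for the substdio struct, same five fields. */
typedef struct demo_substdio {
  char *x;     /* the caller's buffer */
  int p;       /* bytes currently held in the buffer */
  int n;       /* input: free room in the buffer; output: buffer size */
  int fd;      /* file descriptor handed to op */
  demo_op op;  /* I/O routine, e.g. read() for an input file */
} demo_substdio;

/* Assumed equivalent of substdio_fdbuf: it only records the settings. */
void demo_fdbuf(demo_substdio *s, demo_op op, int fd, char *buf, int len)
{
  s->x = buf;
  s->fd = fd;
  s->op = op;
  s->p = 0;    /* nothing buffered yet */
  s->n = len;  /* the whole buffer is available */
}
```

Calling demo_fdbuf(&ss, read, fdinfo, buf, sizeof(buf)) performs no I/O at all; the first real read would happen only when a get-style routine finds the buffer empty.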

Once you have grokked all that, you can learn the shortcut, which is implemented as a C macro.

substdio ss1 = SUBSTDIO_FDBUF(write,1,buf1,sizeof(buf1));

This statement defines a substdio file, ss1, and immediately initializes it with the proper settings. This is useful if the file descriptor usage is static and not going to change.

The Functions

The rest of the library is declared in the substdio.h file. The naming convention here is that an “s” suffix means the function works with a regular C-language string constant or ASCIIZ variable. See the stralloc page for more info on this.

Unfortunately, these functions were not designed to work with the stralloc functions wholesale. Therefore, if you want to perform I/O to or from a stralloc variable, you must specify the stralloc “s” (value) and “len” (length) members explicitly.

extern int substdio_flush();
extern int substdio_put();
extern int substdio_bput();
extern int substdio_putflush();
extern int substdio_puts();
extern int substdio_bputs();
extern int substdio_putsflush();
 
extern int substdio_get();
extern int substdio_bget();
extern int substdio_feed();
 
extern char *substdio_peek();
extern void substdio_seek();
 
#define substdio_fileno(s) ((s)->fd)
 
#define SUBSTDIO_INSIZE 8192
#define SUBSTDIO_OUTSIZE 8192
 
/* Author's note: do not use these, see bottom of page */
#define substdio_PEEK(s) ( (s)->x + (s)->n )
#define substdio_SEEK(s,len) ( ( (s)->p -= (len) ) , \
   ( (s)->n += (len) ) )
 
#define substdio_BPUTC(s,c) \
  ( ((s)->n != (s)->p) \
    ? ( (s)->x[(s)->p++] = (c), 0 ) \
    : substdio_bput((s),&(c),1) \
  )
 
extern int substdio_copy();

Well, I think you should understand the usage. Here is a set of examples, gleaned from the qmail code, to illustrate them better.

if (substdio_flush(&ssout) == -1) die_write();
substdio_put(subfdout,sender.s,sender.len);
if (substdio_bput(&ssout,&ch,1) == -1) die_write();
if (substdio_putflush(&sstoqc,fn.s,fn.len) == -1) 
   { cleandied(); return; }
substdio_puts(subfdout,".\n");
if (substdio_bputs(&ssout,"\n")) goto writeerrs;
substdio_putsflush(subfderr,
   "qmail-inject: fatal: out of memory\n"); temp();
substdio_get(&ssin,&ch,1);
substdio_bget(s,buf,len);
n = substdio_feed(&ssin);
 
close(fdinfo); /* Uses the Unix close here */
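If the difference between the put, flush, and putflush flavors is not obvious from the names, the following sketch may help. It is a miniature buffered writer written for this page, with hypothetical demo_ names; it is not DJB's algorithm (the real routines are smarter about large writes and interrupted system calls), only an illustration of the idea that put fills the buffer and flush empties it to the file descriptor.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical miniature of a substdio output file (not the qmail code). */
typedef struct demo_out {
  char *x;  /* buffer */
  int p;    /* bytes waiting in the buffer */
  int n;    /* buffer size */
  int fd;   /* where flushed bytes go */
} demo_out;

/* Write out everything that is waiting; returns 0 on success, -1 on error. */
int demo_flush(demo_out *s)
{
  int done = 0;
  while (done < s->p) {
    ssize_t w = write(s->fd, s->x + done, (size_t) (s->p - done));
    if (w <= 0) return -1;
    done += (int) w;
  }
  s->p = 0;
  return 0;
}

/* Copy len bytes into the buffer, flushing whenever the buffer fills up. */
int demo_put(demo_out *s, const char *buf, int len)
{
  while (len > 0) {
    int room = s->n - s->p;
    int chunk;
    if (room == 0) {
      if (demo_flush(s) == -1) return -1;
      room = s->n;
    }
    chunk = len < room ? len : room;
    memcpy(s->x + s->p, buf, (size_t) chunk);
    s->p += chunk;
    buf += chunk;
    len -= chunk;
  }
  return 0;
}

/* Like the "putflush" variants: buffer the bytes, then push them out now. */
int demo_putflush(demo_out *s, const char *buf, int len)
{
  if (demo_put(s, buf, len) == -1) return -1;
  return demo_flush(s);
}
```

The practical consequence is the same as with the real library: after a plain put, nothing may have reached the file yet, so remember to flush before you close the descriptor or exit.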

Outputting a number

Since there is no substdio function analogous to C’s fprintf, we need another tool. Writing strings is annoying but basic. However, if you want to write a number, this is how it is done in the DJB style. It brings me back to my assembly language programming days! hehe

Use the fmt_ulong() function to translate an (unsigned long int) number into a left-justified ASCII character representation. The function returns the length (number of digits) of the string. It does not add a terminating zero byte, which is why the first method below stores one itself. The DJB style uses these two methods for calling fmt_ulong. As you would expect, the usage is clever, thick, not very readable, and not immediately obvious. Still, there is something elegant in the well thought-out design of his work.

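#include "fmt.h"

/* Method 1: format into a buffer, then zero-terminate it yourself */
char strnum2[FMT_ULONG];
strnum2[fmt_ulong(strnum2,id)] = 0;

/* Method 2: use the returned length directly as the byte count to write */
substdio_put(subfdout,num,fmt_ulong(num,(unsigned long) auto_patrn));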

It’s no replacement for the printf() routines, but it’ll do. Also, it produces left-justified numbers, which sucks when you are trying to print out dates and times. Here’s a quick time-stamp I threw together. I just blindly checked whether the result was a single character, and put an extra zero in front of it if it was. The result is less than pretty.

  clock = time(&clock);
  lt = localtime(&clock);
  numbuf[ fmt_ulong(numbuf, lt->tm_year + 1900) ] = 0;
  stralloc_copys(&logtime, numbuf);
  stralloc_cats(&logtime, "-");
  numbuf[ fmt_ulong(numbuf, lt->tm_mon + 1) ] = 0;
  if (numbuf[1]=='\0') stralloc_cats(&logtime, "0");
  stralloc_cats(&logtime, numbuf);
  stralloc_cats(&logtime, "-");
  numbuf[ fmt_ulong(numbuf, lt->tm_mday) ] = 0;
  if (numbuf[1]=='\0') stralloc_cats(&logtime, "0");
  stralloc_cats(&logtime, numbuf);
  stralloc_cats(&logtime, " ");
  numbuf[ fmt_ulong(numbuf, lt->tm_hour) ] = 0;
  if (numbuf[1]=='\0') stralloc_cats(&logtime, "0");
  stralloc_cats(&logtime, numbuf);
  stralloc_cats(&logtime, ":");
  numbuf[ fmt_ulong(numbuf, lt->tm_min) ] = 0;
  if (numbuf[1]=='\0') stralloc_cats(&logtime, "0");
  stralloc_cats(&logtime, numbuf);
  stralloc_cats(&logtime, ":");
  numbuf[ fmt_ulong(numbuf, lt->tm_sec) ] = 0;
  if (numbuf[1]=='\0') stralloc_cats(&logtime, "0");
  stralloc_cats(&logtime, numbuf);
  stralloc_0(&logtime);
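The repeated single-digit check above could be folded into one padding helper. The sketch below is self-contained and uses hypothetical demo_ functions written for this page: demo_fmt_ulong mimics the behavior described for fmt_ulong (digits only, no terminating zero byte, length returned), and demo_fmt_ulong0 is an assumed helper that pads with leading zeros. It writes into a plain char array instead of a stralloc to keep the example short.

```c
#include <stddef.h>

/* Stand-in for fmt_ulong: writes the digits of u at s without a terminating
   zero byte and returns how many there are. Passing NULL only counts. */
unsigned int demo_fmt_ulong(char *s, unsigned long u)
{
  unsigned int len = 1;
  unsigned long q = u;
  while (q > 9) { ++len; q /= 10; }
  if (s) {
    s += len;
    do { *--s = (char) ('0' + u % 10); u /= 10; } while (u);
  }
  return len;
}

/* Hypothetical helper: same as above, but left-padded with '0' to width n. */
unsigned int demo_fmt_ulong0(char *s, unsigned long u, unsigned int n)
{
  unsigned int len = demo_fmt_ulong(NULL, u);
  unsigned int pad = len < n ? n - len : 0;
  unsigned int i;
  if (s) {
    for (i = 0; i < pad; ++i) s[i] = '0';
    demo_fmt_ulong(s + pad, u);
  }
  return pad + len;
}

/* Builds "YYYY-MM-DD hh:mm:ss" plus a zero byte; out needs 20 bytes
   for a four-digit year. */
void demo_stamp(char *out, int y, int mo, int d, int h, int mi, int se)
{
  out += demo_fmt_ulong0(out, (unsigned long) y, 4);  *out++ = '-';
  out += demo_fmt_ulong0(out, (unsigned long) mo, 2); *out++ = '-';
  out += demo_fmt_ulong0(out, (unsigned long) d, 2);  *out++ = ' ';
  out += demo_fmt_ulong0(out, (unsigned long) h, 2);  *out++ = ':';
  out += demo_fmt_ulong0(out, (unsigned long) mi, 2); *out++ = ':';
  out += demo_fmt_ulong0(out, (unsigned long) se, 2);
  *out = 0;
}
```

With a struct tm from localtime(), the call would look like demo_stamp(buf, lt->tm_year + 1900, lt->tm_mon + 1, lt->tm_mday, lt->tm_hour, lt->tm_min, lt->tm_sec). Advancing the output pointer by the returned length is the same trick as the second fmt_ulong method shown earlier.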

Quickie I/O Operations

There is also a quickie read-n-close function that slurps the file contents into the buffer (like eating spaghetti!) and closes the file descriptor. It’s useful for control or other small files.

if (slurpclose(0,&message,1024) == -1) die_read();

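This reads file descriptor 0 (in this case, STDIN) into a stralloc variable ( message ), 1024 bytes at a time, until it reaches the end of the file. The last argument is the size of each read, not a limit on the total length.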
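For readers who want to see the loop behind such a call, here is a rough sketch. It assumes the third argument is the size of each read rather than a cap on the total, and it uses a hypothetical demo_sa struct and realloc in place of the real stralloc machinery, so treat it as an illustration and not as the qmail implementation.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical growable buffer standing in for a stralloc. */
typedef struct demo_sa {
  char *s;     /* the bytes */
  size_t len;  /* bytes in use */
  size_t a;    /* bytes allocated */
} demo_sa;

/* Read fd to end of file in reads of at most bufsize bytes (bufsize > 0),
   appending to sa, then close fd. Returns 0 at end of file, -1 on error. */
int demo_slurpclose(int fd, demo_sa *sa, size_t bufsize)
{
  for (;;) {
    ssize_t r;
    if (sa->a - sa->len < bufsize) {
      /* Make sure one more full read will fit. */
      size_t want = sa->len + bufsize;
      char *bigger = realloc(sa->s, want);
      if (!bigger) { close(fd); return -1; }
      sa->s = bigger;
      sa->a = want;
    }
    r = read(fd, sa->s + sa->len, bufsize);
    if (r <= 0) { close(fd); return (int) r; }
    sa->len += (size_t) r;
  }
}
```

Note that the descriptor is closed on every exit path, which is what makes the one-line "slurp and forget" usage safe.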

There are also a few control file read routines to help read these files. They are self-explanatory if you just read these examples:

 #include "control.h"
 
if (control_readint(&i,"control/concurrencyremote") == -1) return 0;
if (control_rldef(&stralloc_var,"control/filename",
   default_is_me_file_bool,"default") != 1) return 0;
if (!control_readline(&me,"control/me")) err();

Note: These functions tack a zero-byte ('\0') onto the end of these values, even though they are stralloc variables. The reason for this would be to make the values easy to pass to routines taking ASCIIZ strings instead of stralloc references. Therefore, if you try to manipulate these variables by concatenating a string, it will get appended after the zero-byte. Any code interpreting the value as an ASCIIZ string will stop at the first zero-byte, effectively losing your appended data. I got around this by decrementing the length (--sc.len;) before concatenating more data onto it.
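The decrement trick can be shown in isolation. The following is a small sketch using a hypothetical demo_sa struct in place of a real stralloc; the function names are made up for this page. It steps back over an existing trailing zero-byte before appending, and copies a fresh zero-byte along with the new text so the value stays usable as an ASCIIZ string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a stralloc: counted bytes, not a C string. */
typedef struct demo_sa {
  char *s;
  size_t len;
} demo_sa;

/* Append len bytes of buf; returns 1 on success, 0 if out of memory. */
int demo_catb(demo_sa *sa, const char *buf, size_t len)
{
  char *bigger = realloc(sa->s, sa->len + len);
  if (!bigger) return 0;
  memcpy(bigger + sa->len, buf, len);
  sa->s = bigger;
  sa->len += len;
  return 1;
}

/* Append to a value that may already end in a zero-byte, keeping exactly
   one zero-byte at the very end. */
int demo_append_asciiz(demo_sa *sa, const char *more)
{
  /* Step back over the existing zero-byte, if there is one. */
  if (sa->len > 0 && sa->s[sa->len - 1] == '\0') --sa->len;
  /* The +1 copies the terminating zero-byte of more as well. */
  return demo_catb(sa, more, strlen(more) + 1);
}
```

Checking for the zero-byte before decrementing is a little safer than a bare decrement, since it also works on values that were never terminated.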

Sometimes we want to take an input file and save the values from each input line.

#include "constmap.h"
#include "str.h"

struct constmap maplocals;
if (control_readfile(&locals,"control/locals",1) != 1) return 0;
if (!constmap_init(&maplocals,locals.s,locals.len,0)) return 0;
if (constmap(&maplocals,addr.s,addr.len)) found_it();
/* pass the string length, not sizeof, so the zero-byte is not counted */
if (constmap(&maplocals,asciiz_string,str_len(asciiz_string)))
   found_it();
constmap_free(&maplocals);

Since the constmap tool holds pointers to dynamically allocated data, do not let one go out of scope inside a local block or function without first calling constmap_free() to free up its resources.
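Conceptually, the lookup answers one question: is this key one of the lines that were read from the file? The sketch below illustrates that idea with a plain linear scan. It is a hypothetical demo_lookup written for this page, and it assumes the lines are stored back to back, each followed by a zero-byte; the real constmap presumably does something faster, and this sketch compares bytes exactly.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if key (keylen bytes, terminating zero-byte NOT counted) equals
   one of the zero-terminated entries packed back to back in list, else 0.
   Linear scan with exact byte comparison, for illustration only. */
int demo_lookup(const char *list, size_t listlen, const char *key, size_t keylen)
{
  size_t i = 0;
  while (i < listlen) {
    const char *start = list + i;
    const char *end = memchr(start, '\0', listlen - i);
    size_t entrylen = end ? (size_t) (end - start) : listlen - i;
    if (entrylen == keylen && memcmp(start, key, keylen) == 0) return 1;
    i += entrylen + 1;  /* skip the entry and its zero-byte */
  }
  return 0;
}
```

The key length matters: passing the size of a character array, which counts the terminating zero-byte, makes the key one byte longer than any stored entry, so it never matches.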

Wrap it up, already!

I have discovered a minor bug in the substdio_peek() function in substdi.c. It doesn’t work across buffers. I changed the function to this:

  if (s->n >= s->p) {
    r = substdio_feed(s); if (r <= 0) return r;
  }
  return s->x + s->n;

The funny thing is that this function is not actually used by qmail. However, it is used by the qmail-verh-0.06 patch, which inserts the local and host addresses into the email message during qmail-remote using the ##L and ##H variables respectively. When a ##L or ##H variable falls across the buffer boundary, it can’t be “peeked” and is never matched. My changes correct this. Hopefully, I will be able to get this out into the Qmail community soon :)

Don’t you dare use the #define substdio_PEEK

Even if I corrected the bug in substdio_peek, there is a nasty C macro called substdio_PEEK. This does not contain the corrections. Nasty evil macros. This is certainly one reason why C++ guru Scott Meyers has denounced them in his illuminating text, Effective C++.

If you couldn’t figure out what I was referring to by the acronym “DJB”, those are the initials of the father of qmail, Dan J. Bernstein.

 
qmail/substdio.txt · Last modified: 2005/07/17 15:01 by allen