Stralloc Library
Bernstein has decided not to use the standard K&R C library and has instead opted to write his own. I do not believe that this was done to frustrate anyone wanting to understand or modify his code, but frustration is nonetheless the usual first response. I do not want to debate his wisdom or errors here. (Though damn, he does write great, tight code... even if we cannot read it!)
His STRALLOC library is a set of routines that are meant to replace the standard ASCIIZ C strings. A C string is a sequence of 8-bit characters terminated by a null byte (binary zero). This has the downside that a string cannot contain the zero byte, and it also invites nasty memory allocation problems and writes beyond the bounds of the variable’s memory, a fact far too familiar to many a C programmer. The C++ programmer has had this burden lifted by the wonderful string template class found in the Standard Template Library (STL).
And so was born the STRALLOC. It smacks a bit of C++’s object-oriented model, because all the data lives inside a structure and the routines manipulate that structure. Remember that STRALLOC variables are structures, so you need to pass them by reference (that is, prefix them with the “&” address-of operator) to these subroutines.
The Alloc
The stralloc type is a “subclass” (sub-type) of the abstract alloc type. The alloc type is a data structure with 3 elements:
- *field - a pointer to the data field, dynamically handled
- len - The length of the field
- a - The number of bytes currently allocated for the field, which can be larger than len (see the Feedback section)
Any type of data using dynamic length can be built on this structure. There are macros which create a “typedef” statement. The string version, stralloc, uses this definition:
GEN_ALLOC_typedef(stralloc,char,s,len,a)
which generates a type of “stralloc” which is a dynamic vector of type char. The macro that generates this is
#define GEN_ALLOC_typedef(ta,type,field,len,a) \
  typedef struct ta { type *field; unsigned int len; unsigned int a; } ta;
Therefore, a stralloc type is analogous to the following definition:
typedef struct stralloc {
  char *s;
  unsigned int len;
  unsigned int a; /* bytes allocated, see Feedback */
} stralloc;
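To see that the macro really is generic, here is a small sketch that uses it twice. The intalloc type and its field names (v, n, cap) are made up for this illustration and are not part of the library, and gen_alloc_demo is just a hypothetical checking function.

```c
#include <stddef.h>

/* The macro as quoted above: it stamps out a struct with a typed data
   pointer, a used-length field, and an allocated-size field. */
#define GEN_ALLOC_typedef(ta,type,field,len,a) \
  typedef struct ta { type *field; unsigned int len; unsigned int a; } ta;

/* The string flavour used throughout this page. */
GEN_ALLOC_typedef(stralloc,char,s,len,a)

/* Hypothetical second flavour, invented for this example:
   a growable vector of ints with its own field names. */
GEN_ALLOC_typedef(intalloc,int,v,n,cap)

/* Returns 1 if both generated types carry the members named in the
   macro arguments. */
int gen_alloc_demo(void)
{
  stralloc sa = {0};
  intalloc ia = {0};
  sa.len = 5;   /* "len" came from the 4th macro argument */
  ia.n = 3;     /* here the 4th argument was "n" */
  ia.cap = 8;   /* and the 5th was "cap" */
  return sa.s == NULL && sa.len == 5 && sa.a == 0
      && ia.v == NULL && ia.n == 3 && ia.cap == 8;
}
```

Note that the macro invocation takes no trailing semicolon, because the macro body already ends with one.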
The STRALLOC
Programs use the stralloc by including the following file:
#include "stralloc.h"
Once all these definitions are explained to the C compiler, you can define a stralloc variable as follows:
stralloc name = {0};
You may think “huh?”, and I do not blame you! Ok, what happens here is an initialization of the variable to an empty (null) string. You will find this mostly in file-global variables, where they are statically initialized at load time. (I do not recommend local stralloc variables; see the bottom of the page for the reason.) The {0} initializer zeroes every member of the structure: the pointer s is set to null, meaning it points at nothing yet, and len and a are both 0.
The library consists of a few functions to populate, append to, and search the string. The naming convention of the subroutines (methods?) is that if an “s” is appended to the name, it takes an ASCIIZ string (such as a C string “constant” in double quotes) as a parameter. Similarly, a “b” implies some kind of binary data, passed as a buffer plus a length.
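A quick way to convince yourself of what the initializer does is to check the members directly. This is only a sketch: the struct is written out by hand instead of coming from stralloc.h, and is_empty and init_demo are hypothetical helper names.

```c
#include <stddef.h>

typedef struct stralloc { char *s; unsigned int len; unsigned int a; } stralloc;

/* File-scope variable, statically initialised as on this page. */
static stralloc name = {0};

/* A static variable with no initialiser at all is zeroed by C as well. */
static stralloc implicit;

/* 1 if the stralloc is in its pristine "nothing allocated" state. */
int is_empty(const stralloc *sa)
{
  return sa->s == NULL && sa->len == 0 && sa->a == 0;
}

int init_demo(void)
{
  stralloc local = {0};   /* a local has no default, so {0} is required here */
  return is_empty(&name) && is_empty(&implicit) && is_empty(&local);
}
```

The {0} only names the first member explicitly; C zero-initializes the remaining members of the structure.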
To assign a value to a stralloc string, you may do this:
if (!stralloc_copys(&addr,"hello")) die_nomem();
which is right out of the DJB book of style. (In other words, it is thick with meaning!) What is really happening here is that the copy-ASCIIZ-string function is called, passing the address (reference) of the stralloc structure and the address of an ASCIIZ C string (here a C string constant). This function will allocate memory and copy the “hello” string into the memory pointed to by the “s” element of the stralloc structure. The length will be set accordingly. If memory allocation, etc., completed well, the function will return 1 (true), but if it returns 0 (false) then we have to pass control along to a handler. He uses the die_nomem() function to do just that: exit the program with a message about running out of memory. Therefore, all of his stralloc functions are couched in these terse little if statements. Just be sure that your program can afford to die when it runs out of memory before copying this boilerplate verbatim.
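If your program cannot simply die, the same test can hand the failure back to the caller. The sketch below assumes a simplified stand-in for stralloc_copys built on realloc (it is not Bernstein's implementation), and set_greeting is a hypothetical caller.

```c
#include <stdlib.h>
#include <string.h>

typedef struct stralloc { char *s; unsigned int len; unsigned int a; } stralloc;

/* Simplified stand-in for the library routine, NOT Bernstein's code:
   make room, copy the bytes, record the length. Returns 1 on success,
   0 if memory could not be allocated. */
int stralloc_copys(stralloc *sa, const char *str)
{
  unsigned int n = (unsigned int) strlen(str);
  if (n + 1 > sa->a) {
    char *p = realloc(sa->s, n + 1);
    if (!p) return 0;
    sa->s = p;
    sa->a = n + 1;
  }
  memcpy(sa->s, str, n);   /* the terminating zero byte is not copied */
  sa->len = n;             /* ...and is not counted in len */
  return 1;
}

/* Hypothetical soft-failure variant of the die_nomem() idiom: report
   the error to the caller instead of exiting the program. */
int set_greeting(stralloc *addr)
{
  if (!stralloc_copys(addr, "hello")) return -1;
  return 0;
}
```

After a successful call, addr.len is 5 and addr.s holds the five bytes of “hello” with no guarantee of a terminating zero, which is why the contents should not be passed to printf("%s") as-is.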
The other fun things you can do with a stralloc are:
int stralloc_ready(&sa,len);
int stralloc_readyplus(&sa,len);
int stralloc_copy(&sa,&sa2);
int stralloc_copys(&sa,buf);
int stralloc_copyb(&sa,buf,len);
int stralloc_cat(&sa,&sa2);
int stralloc_cats(&sa,buf);
int stralloc_catb(&sa,buf,len);
int stralloc_append(&sa,buf);
int stralloc_0(&sa);
int stralloc_starts(&sa,buf);

stralloc sa = {0};
stralloc sa2 = {0};
unsigned int len;
char *buf;
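Note that each function returns true if it worked and should be used as in the above style example. The “cat” function concatenates (appends) one stralloc to another, while “cats” appends an ASCIIZ or C string constant to the variable. These are similar to the K&R strcat() function. stralloc_append adds the single byte that buf points to, and stralloc_0 appends a zero byte, which is handy when the contents must be handed to a routine that expects an ASCIIZ string.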
The “copy” functions are like the strcpy() C library function, but the source may be another stralloc (copy), an ASCIIZ string (copys), or a buffer with an explicit length (copyb).
The stralloc_starts() function is used to see if a stralloc string starts with a passed ASCIIZ string, returning true if it does.
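The prefix test is simple enough to sketch in a few lines. This is a stand-in written with the standard memcmp and strlen, under the assumption that the semantics are exactly as described above; starts_demo and its sample data are invented for the example.

```c
#include <string.h>

typedef struct stralloc { char *s; unsigned int len; unsigned int a; } stralloc;

/* Stand-in with the semantics described above: 1 if the first bytes of
   sa equal the ASCIIZ string prefix, else 0. The length is checked
   first because sa->s is not zero-terminated. */
int stralloc_starts(const stralloc *sa, const char *prefix)
{
  size_t n = strlen(prefix);
  if (n == 0) return 1;            /* the empty string starts everything */
  return sa->len >= n && memcmp(sa->s, prefix, n) == 0;
}

/* Hypothetical demo: a stralloc viewing the first 9 bytes of a buffer. */
int starts_demo(void)
{
  char buf[] = "MAIL FROM:<a@example.org>";
  stralloc line = { buf, 9, (unsigned int) sizeof buf };  /* "MAIL FROM" */
  return stralloc_starts(&line, "MAIL")
      && !stralloc_starts(&line, "RCPT")
      && !stralloc_starts(&line, "MAIL FROM:");  /* 10 bytes > len 9 */
}
```

The last check shows that only the len bytes in use take part in the comparison, even though the underlying buffer holds more.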
stralloc_ready makes sure that sa has enough space allocated for len characters. It allocates extra space if necessary.
stralloc_readyplus makes sure that sa has enough space allocated for len characters more than its current length. If sa is unallocated, stralloc_readyplus is the same as stralloc_ready.
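The difference between the two is easiest to see with numbers. Below is a minimal sketch using realloc-based stand-ins (not the library's code, which allocates through its own alloc routines); ready_demo is a hypothetical walk-through.

```c
#include <stdlib.h>

typedef struct stralloc { char *s; unsigned int len; unsigned int a; } stralloc;

/* Simplified stand-ins, NOT the library's code.
   Both return 1 on success and 0 if memory ran out. */
int stralloc_ready(stralloc *sa, unsigned int n)
{
  char *p;
  if (sa->s && n <= sa->a) return 1;   /* already big enough */
  p = realloc(sa->s, n ? n : 1);
  if (!p) return 0;
  sa->s = p;
  sa->a = n ? n : 1;
  return 1;
}

int stralloc_readyplus(stralloc *sa, unsigned int n)
{
  return stralloc_ready(sa, sa->len + n);   /* room for n MORE bytes */
}

/* Hypothetical walk-through of the two calls. */
int ready_demo(void)
{
  stralloc sa = {0};
  int ok = stralloc_ready(&sa, 16);         /* capacity for 16 bytes */
  if (ok) sa.len = 10;                      /* pretend 10 bytes are in use */
  ok = ok && stralloc_readyplus(&sa, 20);   /* now needs 10 + 20 = 30 */
  ok = ok && sa.a >= 30 && sa.len == 10;    /* ready never changes len */
  free(sa.s);
  return ok;
}
```

Neither call changes len; they only guarantee that the allocated size a is large enough for what you are about to store.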
The summary
Okay, once you befriend the stralloc, and understand a bit about how it works, you can use it more easily. In fact, you’ll discover a few times where you’ll run back to stralloc from the K&R library.
The Stralloc library deals with dynamically allocated and re-sized strings. It only really knows about 8-bit data, though you can of course put anything in them.
The last thought is about cleaning up. Bernstein’s program structure (remember he doesn’t use strallocs in local variables, but only in static file variables!!) is a safe model to use here, even if it is not a tidy memory model. So it would be a good rule of thumb to never declare a stralloc variable in a subroutine or local block! If you do, you create a subtle memory leak. (This isn’t Java, after all.) When your subroutine or block finishes, the stralloc structure is de-allocated from the stack, but the data it points to stays allocated with no reference left to it. A typical memory leak. This is where a nice C++ object would use a destructor to clean up after itself, but we are blessed here with pure C.
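If you do need a local stralloc anyway, the leak can be avoided by releasing the buffer by hand on every exit path. This is only a sketch under assumptions: copyb is a realloc-based stand-in for stralloc_copyb, sa_release is a hypothetical helper (no such routine is listed on this page), and count_dots is an invented example function.

```c
#include <stdlib.h>
#include <string.h>

typedef struct stralloc { char *s; unsigned int len; unsigned int a; } stralloc;

/* Stand-in for stralloc_copyb using plain realloc; not the library's code. */
static int copyb(stralloc *sa, const char *buf, unsigned int n)
{
  if (!sa->s || n > sa->a) {
    char *p = realloc(sa->s, n ? n : 1);
    if (!p) return 0;
    sa->s = p;
    sa->a = n ? n : 1;
  }
  memcpy(sa->s, buf, n);
  sa->len = n;
  return 1;
}

/* Hypothetical helper: release the buffer and return the struct to its
   {0} state. With the real library the buffer would go back through
   alloc_free() rather than free(). */
static void sa_release(stralloc *sa)
{
  free(sa->s);
  sa->s = NULL;
  sa->len = 0;
  sa->a = 0;
}

/* A local stralloc that does not leak: it is released before returning. */
int count_dots(const char *host, unsigned int n)
{
  stralloc tmp = {0};
  unsigned int i;
  int dots = 0;
  if (!copyb(&tmp, host, n)) return -1;   /* nothing was allocated */
  for (i = 0; i < tmp.len; i++)
    if (tmp.s[i] == '.') dots++;
  sa_release(&tmp);   /* without this, tmp.s is lost when tmp goes away */
  return dots;
}
```

This is the manual equivalent of the C++ destructor mentioned above, and it only works if no return statement is allowed to skip the release.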
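Another rule of thumb is that when you are done with a particularly long data string, you should set it to “empty”: just call stralloc_copys(&sa,"") to empty its contents. (I expect this should work!) Keep in mind that this resets len to 0 but, as far as I can tell, leaves the buffer allocated for reuse, so it tidies up rather than handing memory back.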
Feedback
Jos Leurs gave me feedback on this page. I haven’t had time to update to reflect his information... so I’ve just presented it here for now until I can re-write this page! Thanks, Jos! –Allen
typedef struct stralloc {
  char *s;
  unsigned int len;
  unsigned int a; /* used?? */
} stralloc;
The meaning of “a” is clear: it is the size of the memory allocated for the stralloc, which is not necessarily the same as len.
Because Bernstein’s alloc mechanism works with an alignment, the memory handed out is always a multiple of this alignment. You can find this in the code of alloc.c. Therefore len is not the same as the assigned memory.
The alloc function hands out memory from a reserved 4k block or, if that is not possible, allocates it using malloc, so I think it is possible to use a stralloc in a subroutine.
Looking at the free code in alloc.c: freeing is not really freeing, since it does nothing if the pointer points into the reserved block (i.e. space <= x < space + SPACE), which is 4k in size. If the pointer points to memory obtained from malloc, the normal free function is called. So freeing only takes place when the memory came from malloc.
Perhaps I did not really understand the code, but in my honest opinion what I see is that Bernstein is just reserving a block of 4k with an alignment of 16 bytes. When he allocates space he returns a pointer to a place in that block. When there is no space left, the normal malloc is called. So if you are using a stralloc in a subroutine, the memory comes either from the 4k block or from malloc. When you call free in that subroutine before returning, malloc memory is freed and memory in the 4k block is not really freed.
What I understood from this piece of code is that Bernstein is just reserving a block of 4k to make allocating less expensive in time. The mechanism will not work very efficiently if you are allocating and freeing a lot, because the 4k block will be used up and never reclaimed. After that the normal malloc and free are used. So we cannot really talk about a leak.
Tell me if I am wrong!
#define ALIGNMENT 16 /* XXX: assuming that this alignment is enough */
#define SPACE 4096 /* must be multiple of ALIGNMENT */

typedef union { char irrelevant[ALIGNMENT]; double d; } aligned;
static aligned realspace[SPACE / ALIGNMENT];
#define space ((char *) realspace)
static unsigned int avail = SPACE; /* multiple of ALIGNMENT; 0<=avail<=SPACE */

/*@null@*//*@out@*/char *alloc(n)
unsigned int n;
{
  char *x;
  n = ALIGNMENT + n - (n & (ALIGNMENT - 1)); /* XXX: could overflow */
  /* my remark: the memory assignment starts with the highest block and
     then goes further down. */
  if (n <= avail) { avail -= n; return space + avail; }
  x = malloc(n);
  if (!x) errno = error_nomem;
  return x;
}

void alloc_free(x)
char *x;
{
  if (x >= space)
    if (x < space + SPACE)
      return; /* XXX: assuming that pointers are flat */
  /* my remark: if x is in the 4k block nothing happens. */
  free(x);
}

Jos Leurs
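For anyone who wants to poke at the mechanism Jos describes, here is a small ANSI C re-creation of the quoted code. It is a sketch, not the original: the names arena_alloc, arena_free, round_up and in_static_block are invented for this page, the errno handling is left out, and the range test goes through uintptr_t instead of comparing raw pointers.

```c
#include <stdint.h>
#include <stdlib.h>

#define ALIGNMENT 16
#define SPACE 4096            /* must be a multiple of ALIGNMENT */

typedef union { char irrelevant[ALIGNMENT]; double d; } aligned;
static aligned realspace[SPACE / ALIGNMENT];
#define space ((char *) realspace)
static unsigned int avail = SPACE;

/* Same arithmetic as the quoted code: the smallest multiple of
   ALIGNMENT that is strictly greater than n. */
unsigned int round_up(unsigned int n)
{
  return ALIGNMENT + n - (n & (ALIGNMENT - 1));
}

/* Hand out memory from the top of the static block while it lasts,
   then fall back to malloc(). */
char *arena_alloc(unsigned int n)
{
  n = round_up(n);
  if (n <= avail) { avail -= n; return space + avail; }
  return malloc(n);
}

/* 1 if x points into the static 4k block. */
int in_static_block(const char *x)
{
  uintptr_t p = (uintptr_t) x;
  uintptr_t lo = (uintptr_t) space;
  return p >= lo && p - lo < SPACE;
}

/* Pointers into the static block are ignored; only malloc()ed
   memory is really freed. */
void arena_free(char *x)
{
  if (in_static_block(x)) return;
  free(x);
}
```

With these definitions, round_up(5) is 16 while round_up(16) and round_up(17) are both 32, so a request that is already a multiple of 16 still costs one extra alignment unit. A first arena_alloc(100) takes 112 bytes from the top of the block, and a request larger than what is left, such as arena_alloc(5000), comes from malloc and is the only kind that arena_free actually releases.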