
Welcome to the COOK homepage

What is COOK?

COOK is not an acronym. The name describes the process of transforming one text file into another, much as cooking food transforms the ingredients and cooking the books transforms the tax payable.

COOK is intended to be used as a

  • Macro preprocessor
  • Document generator
  • Interpreted text-processing language
with an emphasis on transforming and manipulating text files, rather than being a general-purpose language like Perl or C. COOK is designed to make it easy to read zero or more files containing tabular data and then, under the direction of a template file, create an output document. The template file is the COOK "program": the program syntax is embedded in the template file, much as C preprocessor statements are embedded in your C programs.

In many ways COOK is like the C preprocessor or other preprocessors; however, it is designed to be much more powerful, allowing use of the following programming constructs:

  • Structured: uses if, else, endif, for, next and other structured programming syntax.
  • Variables and associative arrays.
  • Read-only list/tree/graph structures initialised from external text files which contain (possibly cross-referenced) tabular data in a quite flexible format. These are accessed using the forall statement.
  • Sorting and searching of the above structures.
  • String manipulations including regular expressions.
  • Integer arithmetic.
  • Subroutines with local variables - called "methods" in the documentation.
  • Output filtering e.g. pagination and tabulation.

What is the Development Status of COOK?

COOK is under active development (only me at the moment). The latest and only release is version 1.0. Being a first release, there will be more than a few problems to shake out.

Implementation is mostly complete for the applications I have in mind, although contact me if you want any extra features. The remaining items which need to be fixed or implemented are:

  • -I include directory handling
  • Floating point. Initially I thought this wouldn't be worth the trouble, but it comes up often enough that it would be useful. Until then, keep track of your own decimal point!
  • Smoother interfacing to the outside world.
  • Better parser error handling. Optimal positioning of those error tokens in the yacc.
  • More extensive string manipulations and mathematical functions.
  • Better 'assembler-friendly' macro definitions.
  • Control over delimiter characters. The example below lies about the ability to change use of braces to other character pairs.
  • Much improved documentation. Most of the documentation is a flat ASCII file, which has had stuff added to it without much planning. I hope to turn it into groff or LaTeX markup.

Some of the above items will only be done if anyone other than me starts to use COOK. After all, I don't need better error handling or doc.

A Little History

COOK started in early 1999 as an attempt to improve part of a documentation-generating tool called Perceps which was based on Perl scripts. I wanted it to run faster and be more flexible. Then I started writing this z2k assembler. I realised that COOK could also be used as a macro preprocessor with slight modification. This explains some of the assembler-oriented features.

Presently, I use COOK whenever I need to do a bit of text processing, without needing the speed and power of straight C or the great flexibility of general purpose scripting languages.

Examples

The best way for you to get an idea of what COOK is about is to see some examples. The first example demonstrates use of COOK as a macro preprocessor. The second example shows how COOK could generate a PostScript file based on some tabular data.

  1. Using COOK instead of cpp.

    Here is the "traditional" way of using cpp (the C preprocessor) to perform some conditional compilation, and variable substitution:

    #include "my.h"
    #define ARRAY_SIZE 20
    #ifdef NEED_TABLE
    int table[ARRAY_SIZE];
    #endif
    int main(int argc, char ** argv)
    {
      ...
    }
    

    The important elements in the above code are the inclusion of a file my.h and the numeric substitution of 20 for ARRAY_SIZE. This is how the same thing would be expressed using COOK:

    #include "my.h"
    {;ARRAY_SIZE=20;}
    {if (NEED_TABLE)}
    int table[{ARRAY_SIZE}];
    {endif}
    int main(int argc, char ** argv)
    \{
      ...
    \}
    

    Note that the #include is the same, but text substitution is performed rather differently: COOK executes code that is enclosed in braces. This is in distinct contrast to cpp, which will substitute text in almost any context.

    Of course, you will have noticed that the price to pay for this control is that the braces in the C syntax need to be "escaped" with a backslash. This will be objectionable to C programmers! Well, this was supposed to be an example; I'm not recommending that anyone throw away cpp in favour of COOK. As it turns out, you can change the brace characters to be any other distinct opening/closing characters [Note: not implemented yet], but that's a bit too much detail for an overview such as this.

    One final point: the macro variable NEED_TABLE seems to be undefined in both cases. Both cpp and COOK allow specification of one or more initial values for macro variables. Not surprisingly, both accept the same syntax:

    cpp|cook -DNEED_TABLE=1 ...
  2. Processing tabular data.

    OK, so the previous example wasn't all that inspiring. This next example shows where COOK has an advantage over most preprocessors, namely in the collation of tabular data stored in text format, and the use of that data to generate some sort of repetitive output.

    Suppose we have a simple weather station which logs the time, outside temperature and wind speed every minute. This data is added to a text file, with one line for each minute's data. Let's suppose the format of each line is

    W 2000/06/20 13:20 23 4

    where the fields are a constant 'W' for a weather report record, the date, the time of day, the temperature in degrees C, and the wind speed in km/h. The log file might have thousands of these records. Our task is to take the data and, for a specified time interval, create a graph of temperature and wind speed as a PostScript file. [If you don't know what PostScript is, it's a page description language for printers. We could have chosen any other output language for this example, such as HTML; the principle is the same.]
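
    To make the record layout concrete, here is a rough Python sketch (not COOK code) of reading one such line into named fields. The names WxRec, date, time, temp and speed anticipate the record definition used later in this example; parse_wx_line is a hypothetical helper invented for illustration.

```python
from typing import NamedTuple


class WxRec(NamedTuple):
    """One weather record, one field per column of the log line."""
    date: str   # e.g. "2000/06/20"
    time: str   # e.g. "13:20"
    temp: int   # degrees C
    speed: int  # km/h


def parse_wx_line(line: str) -> WxRec:
    """Split one whitespace-separated log line into a WxRec (hypothetical helper)."""
    tag, date, time, temp, speed = line.split()
    if tag != "W":
        raise ValueError(f"not a weather record: {line!r}")
    return WxRec(date, time, int(temp), int(speed))


rec = parse_wx_line("W 2000/06/20 13:20 23 4")
print(rec.date, rec.time, rec.temp, rec.speed)
```

    With COOK none of this has to be written by hand; the sketch only shows what information each line carries.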

    We need the following files for this task: the log file itself, which is the raw data; a description of the logfile format (we could call this the meta-data); and the template file which defines the output format. It is the template file which will contain the COOK embedded language, along with a host of PostScript boilerplate. With these three files available, COOK can produce the desired PostScript output file.

    The meta-data must be in a format that COOK understands. It can be included in the raw data, or supplied as a separate file. The latter is more convenient unless you can modify the application which generates the data. Here are the contents of the meta-data file:

    [WxRec] W date time temp(int) speed(int) ;

    This simply declares a record type called WxRec. The first field indicates the character(s) which identify this record type in the log file. Then each field is defined and named. By default, fields are read in as strings. Suffixing the field name with (int) specifies that the field takes integer values.

    The final file needed is the template. This contains the COOK statements necessary to turn the raw data into a properly formatted PostScript output file. Here is the template, although some detail has been omitted:

    %!PS-Adobe-3.0 EPSF-3.0
    %%BoundingBox: 40 40 800 400
      ...
      lots of PostScript boilerplate
      ...
    /initialise_plot
    \{ ... \} bind def
    /plot_temp_and_speed
    \{ ... \} bind def
    %%EndProlog
    %%Page: 1 1
    {
      /* COOK code here... */
      sp = ' ';
      if (!START_DATE) START_DATE = '2000/06/01'; endif
      if (!END_DATE)   END_DATE = '2000/06/30';   endif
      START_DATE sp END_DATE ' initialise_plot';
      forall WxRec w
        sort(w.date, w.time)
        keep(w.date >= START_DATE && w.date <= END_DATE)
        ifnone
          '(There was no data between' START_DATE 
          ' and ' END_DATE ') show';
        endif
        w.date sp w.time 
          sp w.temp sp w.speed ' plot_temp_and_speed';
      next
    }
    %%EOF
    

    The salient point in the above code is the initial output of possibly large quantities of "boilerplate" i.e. pretty-much fixed output which is required by the target processor (a PostScript printer in this case). This output is then followed by the COOK code which is responsible for inserting the dynamic part of the document; in this case, the weather log as a series of lines which call a PostScript routine that was output in the boilerplate section -- presumably, that routine would handle the details of plotting the output e.g. as a series of linetos.
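
    For readers without COOK to hand, the dynamic part of the template can be approximated in Python. This is only a sketch of my reading of the forall block above (filter by date, sort by date and time, emit one PostScript call per record, with a fallback message when nothing matches). The function name plot_calls is hypothetical, and the exact spacing and line breaks of real COOK output may differ.

```python
def plot_calls(lines, start_date="2000/06/01", end_date="2000/06/30"):
    """Return output lines roughly matching the COOK block (assumed behaviour)."""
    out = [f"{start_date} {end_date} initialise_plot"]
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5 or fields[0] != "W":
            continue  # skip anything that is not a weather record
        _, date, time, temp, speed = fields
        # keep(...): YYYY/MM/DD dates compare correctly as plain strings
        if start_date <= date <= end_date:
            records.append((date, time, int(temp), int(speed)))
    records.sort(key=lambda r: (r[0], r[1]))  # sort(w.date, w.time)
    if not records:  # the ifnone branch
        out.append(f"(There was no data between {start_date} and {end_date}) show")
    for date, time, temp, speed in records:
        out.append(f"{date} {time} {temp} {speed} plot_temp_and_speed")
    return out


log = [
    "W 2000/06/20 13:20 23 4",
    "W 2000/05/31 23:59 10 0",  # outside the default interval, so dropped
    "W 2000/06/01 00:00 12 7",
]
print("\n".join(plot_calls(log)))
```

    The COOK version expresses the same selection and ordering declaratively in the forall statement, which is where it saves effort over a hand-written loop.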

    Finally, we invoke COOK to put all this together. Note that the actual Unix command is "cook" in lower case. It is only my convention to write COOK in upper case when talking about the language in general, as opposed to the cook Unix command.

    % cook -f -d wxlog.meta -d log.dat templ.dg | ghostview

    The -d arguments specify the metadata and data files, in that order. The last argument is the template file name, and the output is piped into a PostScript viewer. For historical reasons the template files have a "dg" suffix, although this is not really important. The -f argument is a technical necessity: it tells COOK that the template starts in "boilerplate" mode. Without -f, COOK assumes that the file starts with COOK statements, i.e. as if it had already seen an open brace. If you didn't want to code the -f every time, you could put a closing brace as the first character in the template file.
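
    The two starting modes can be illustrated with a small Python sketch that splits a template into boilerplate and code segments. It is an assumption-laden model, not COOK's actual scanner: the function split_template and its start_in_code parameter are invented here, code sections are assumed not to nest, and a backslash-escaped brace is simply treated as a literal brace in either mode.

```python
def split_template(text, start_in_code=True):
    """Split template text into ("text", s) and ("code", s) segments.

    start_in_code=True models the default described above (as if an open
    brace had already been seen); start_in_code=False models -f.
    """
    segments = []
    mode = "code" if start_in_code else "text"
    buf = []

    def flush():
        # Record the pending segment, skipping empty ones.
        if buf:
            segments.append((mode, "".join(buf)))
            buf.clear()

    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in "{}":
            buf.append(text[i + 1])  # escaped brace: keep it literally
            i += 2
            continue
        if ch == "{" and mode == "text":
            flush()
            mode = "code"
        elif ch == "}" and mode == "code":
            flush()
            mode = "text"
        else:
            buf.append(ch)
        i += 1
    flush()
    return segments


# With -f (boilerplate first), code appears only between braces.
print(split_template("int table[{ARRAY_SIZE}];\n", start_in_code=False))
# Without -f, a leading closing brace switches straight to boilerplate.
print(split_template("}%!PS{sp = ' ';}", start_in_code=True))
```

    In this model a leading closing brace and the -f flag produce the same segmentation, which matches the equivalence described above.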

    Finally, there is at least one example in the COOK tarball, and it is a serious use of COOK. In the test subdirectory there is a template file called pcb.dg. This template accepts a metadata file called pcb.dbdef and a data file called tempsens.pcb. The latter is an unmodified output file from Protel Autotrax for MS-DOS. Autotrax is an EDA (Electronic Design Automation) tool for designing PCBs (Printed Circuit Boards). I think it's available for free now, but it only runs on (ugh...) DOS. As originally shipped, its plot program had some limitations when it came to PostScript output; pcb.dg was an attempt to improve the output quality.

Links

COOK Embedded Language project page.

You might also be interested in PHP (Hypertext Preprocessor) which is a similar concept, with applicability to dynamically generating HTML.

The first major project to use COOK as a macro preprocessor is z2k which is an assembler for a new microcontroller.


All trademarks and copyrights on this page are properties of their respective owners. Forum comments are owned by the poster. The rest is copyright ©1999-2000 VA Linux Systems, Inc.