Go up to Top
Go forward to m4

Introduction

In [1], Rogers describes how to solve a very complicated SQL programming problem with just a couple of queries using the Sybase built-in functions abs, sign and charindex. In [2], Rozenshtein et al, previously developed this idea to solve many distinct classes of problems using the same or similar built-in functions. The same authors have collected an extensive catalog of the uses of these functions in [3].

The ideas found in all these sources are invaluable to SQL programmers but extremely difficult to implement and debug because of their notational complexity.

For example, in [2] something like the following example appears. I have placed the query in the pubs2 database.

    Find the number of books which have exactly one author, 
        between two and four authors ,
        and finally the books with five or more authors.  
      
    select oneonly  = sum(sign((1 - abs(sign(au_ord - 1)))) - 
                                sign((1 - abs(sign(au_ord - 2))))),
           two2four = sum((1 - abs(sign(au_ord - 2))) - 
                                      (1 - abs(sign(au_ord - 5)))),
           many     = sum((1 - abs(sign(au_ord - 5)))) 
    from authortitle 
The authors observe, for example, that the expression sign((1 - abs(sign(au_ord - 1)))) is very difficult to understand intuitively and so introduce the concept of a characteristic function. A characteristic function, delta , for a Boolean expression such as A = B is written delta[A = B]. Its definition is that delta[A = B] = 1 whenever A = B is true and delta[A = B] = 0 otherwise. In their article, the authors then proceed to write the above query as
    select oneonly   = sum(delta[au_ord = 1] - delta[au_ord = 2]), 
           two2four  = sum(delta[au_ord = 2] - delta[au_ord = 5]),
           many      = sum(delta[au_ord = 5])
    from authortitle
Once you become comfortable with characteristic functions, queries become much easier to design and read. However, Transact-SQL can't deal with delta[au_ord = 1] and other similar expressions and so when it comes time to develop executable code we still need to produce an expression like sign((1 - abs(sign(au_ord - 1)))). The macro expander, m4, found on most UNIX systems provides a mechanism for writing characteristic functions in a mneumonic and more importantly, machine processable form. Queries developed using macros must normally be developed in files and then redirected into isql from the command line. For example, the following code, typed in at the UNIX command line, combines preprocessing with m4 followed by execution via isql using a pipe.
      ~> m4 query1.m | isql -U<userID> -P<password>
This is a bit cumbersome when writing ad hoc queries but when developing stored procedures the process is much easier to imagine since this command line syntax needs only be used when creating the procedure. Beginning with a stored procedure file developed using macros, say proc.m, we expand it to proc.sql and then treat this file in the normal way. If errors occur during compilation, we can refer to the proc.sql file for source code analysis but still need to make corrections to proc.m.
pletch@mcs.newpaltz.edu

Up Next