UDRs: COALESCE

This article is written in English and Portuguese
Este artigo está escrito em Inglês e Português

English version:

Introduction
This is another article in the UDR series. In a discussion on IIUG mailing list someone was complaining about the lack of COALESCE function in Informix. The first answer was that nested NVL() calls could be used as a replacement, or an SPL function could be written to implement it. But the same person warned that the SPL procedure would have a significant performance impact. I made a couple of tests and verified the same. So I decided to test a UDR to implement it.
But before diving into it, a few warnings are in order… The true COALESCE() function or SQL construct is a very flexible operation that takes an undetermined number of arguments of unknown and possibly different data types. I’m not sure if something like that can be implemented in a user defined function (UDR). So, for the scope of this article I’ll assume two restrictions: A fixed number of maximum arguments (this would be easy to change) and that all the arguments belong to the same type (although the engine could cast them).
I’d also like to thank John Miller, the Senior Technical Staff Member of the Informix team, and a well known member of the Informix community, for his input, suggestions and code review.

The code
You can find the C UDR source code at the end of this article. I’ll just go through it, to better explain how it works, but the juicy part is the comparison between several methods that will follow.

Lines 1 to 8 are just the usual include sections.

Lines 11 to 15 is the function header. As you can see at the C code level it receives ten LVARCHARs and returns an LVARCHAR. The reason why the LVARCHAR was choose is because it has implicit casts for most if not all the data types. This means that when we define the function at the SQL level we can use any data type we like (or create several functions with different signatures that allows for a broader use).

Lines 17 to 20 include an auxiliary variable declaration and initialization with the function mi_fp_nargs() which returns the number of parameters defined for the function.

Line 23 defines a loop that checks if any of the arguments is not null. If it finds one, it returns that argument. Unfortunately I could not find an easy way to make this piece of code generic (automatically adaptable to a different number of parameters), so a long switch statement was used.

If none of the arguments is non-NULL, then at lines 62-63 it returns a NULL value.

The compilation

As usual I use a simple makefile to generate the dynamic library containing the code:

include $(INFORMIXDIR)/incl/dbdk/makeinc.linux86_64


MI_INCL = $(INFORMIXDIR)/incl
CFLAGS = -DMI_SERVBUILD $(CC_PIC) -I$(MI_INCL)/public $(COPTS)
LINKFLAGS = $(SHLIBLFLAG) $(SYMFLAG) $(LD_SHARED_FLAGS)

all: ix_coalesce

clean:
rm *.udr *.o

ix_coalesce: ix_coalesce.udr
@echo "Library genaration done"

ix_coalesce.o: ix_coalesce.c
@echo "Compiling..."
$(CC) -c $(CFLAGS) -o $@ $?

ix_coalesce.udr: ix_coalesce.o
@echo "Creating the library..."
$(SHLIBLOD) $(LINKFLAGS) -o $@ $?

In the end, after the make command we should have a dynamic linked library that we can use to create the function in SQL. The make process is simply:


tpch@kimball:informix-> make
Compiling...
cc -c -DMI_SERVBUILD -fPIC -I/opt/informix/srvr1170fc5/incl/public -g -o ix_coalesce.o ix_coalesce.c
Creating the library...
gcc -shared -m64 -Bsymbolic -shared -m64 -o ix_coalesce.udr ix_coalesce.o
Library genaration done

tpch@kimball:informix-> ls -lia ix_coalesce.udr
1654803 -rwxr-xr-x 1 informix informix 8273 Aug 13 00:00 ix_coalesce.udr

Creating the function in SQL
Once we get the compiled code in the form of a dynamic loadable library we need to create the function in SQL, referencing the C code function.
This is done with this simple SQL code:

DROP FUNCTION IF EXISTS coalesce_udr;
CREATE FUNCTION coalesce_udr(
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL,
INTEGER DEFAULT NULL
) RETURNING INTEGER
WITH (NOT VARIANT, HANDLESNULLS)
EXTERNAL NAME '/opt/informix/work/coalesce/ix_coalesce.udr(ix_coalesce)'
LANGUAGE C;

A few notes:

  1. I’m defining an UDR that takes INTEGERs as arguments. Internally they’ll be treated as LVARCHARs. If we need COALESCE() for other types of arguments we could create other functions, with the same name but different parameter types. This is called overloading and is perfectly supported in Informix
  2. I used HANDLESNULLS because without it, if we use NULL as arguments the C functions are not called and return NULL
  3. EXTERNAL NAME specifies the “path” to the C code function. In this case it’s the pathname of the dynamic loadable library we created and the function name (inside the library) between parentheses

Tests and speed comparison
So, the purpose of this is to compare several ways to overcome the lack of a native COALESCE SQL construct. The ways I considered were:

  1. Using nested NVL() functions directly in the SQL statement. This is the fastest and was considered the reference
  2. Creating an SPL function called coalesce_basic that receives the arguments and has a nested NVL() statement inside the SPL
  3. Creating an SPL function called coalesce_if that is similar to the previous, but instead of a nested NVL() structure has a sequence of IF statements
  4. The C code function explained above

You can find the code for all this alternatives at the end of the article. After creating the functions I ha…

Auteur : noreply@blogger.com (Fernando Nunes)

No comments yet.

Leave a Reply

%d bloggers like this: