Este artigo está escrito em Inglês e Português
In this article I’ll mention some new functionality in 12.1, but I’ll use that to answer a recent question on IIUG mailing list. The question initially posted was about when did Informix update the index involved in a transaction. But after a few interactions with the original poster, it was clear what the problem was and it’s a bit tricky:
The user was UNLOADing a table that contained a primary key and there were duplicate records (regarding the primary key) in the result set. This naturally looks like a bug, but there is a technical explanation for that. Let’s describe a simple reproduction. For that I wrote three simple SQL scritps that will be run through dbaccess:
DROP TABLE IF EXISTS test;
CREATE TABLE test
PRIMARY KEY (col1,col2)
EXTENT SIZE 16 NEXT SIZE 16
LOCK MODE ROW;
INSERT INTO test VALUES ( 1, 1, 'a');
INSERT INTO test VALUES ( 1, 2, 'a');
INSERT INTO test VALUES ( 1, 3, 'a');
INSERT INTO test VALUES ( 1, 4, 'a');
INSERT INTO test VALUES ( 1, 5, 'a');
UPDATE TEST SET COL3 = 'a1' WHERE col1 = 1 AND col2 = 3;
SET LOCK MODE TO WAIT;
UNLOAD TO test.unl
SELECT *, ROWID FROM test;
DELETE FROM test WHERE col1 = 1 AND col2 = 1;
INSERT INTO test VALUES (1, 1, 'b');
The first script, called “main.sql” will create a table, with row level locking and a particularity that helps the reproduction: Each page will contain only one row. I did this on Linux where the default page size is 2KB. If you run it on AIX (4KB pages) please change the col3 definition to CHAR(4000). Then the script will populate the table with 5 rows. Note that I’m using a composite primary key, but a simple primary key or any unique index should be enough.
Then I start a transaction and do an update on row “3” and just after that I run an external command to sleep for 30 seconds before executing the COMMIT. This gives me plenty of time to run the other scripts in parallel.
Next script is called “unload.sql” and it simply runs an UNLOAD of that table after setting the LOCK MODE TO WAIT. As you may expect this script will be stuck on the lock created by the first one.
The third script has the “magic”. It DELETEs a row with a certain primary key and INSERTs it again, but with a different col3 value.
I will use the following SHELL script to run the reproduction scripts:
dbaccess stores main.sql 1>main.out 2>&1 &
dbaccess stores unload.sql 1>unload.out 2>&1 &
dbaccess stores move_rows.sql 1>move_rows.out 2>&1 &
So… It runs the “main.sql” in background. Sleeps for 5 seconds (enough so that the “main.sql” is able to enter the sleep after locking record 3). Then is launches the “unload.sql” script which will be waiting on the first one, also in the background. And then it runs the script that DELETEs and INSERTs a row with a certain primary key.
This is the resulting unload file:
As you can see we have 6 rows in the file. And only 5 on the table. We have a problem.
The row with primary key (1,1) is duplicated. How can this happen? In order to understand it, we need to understand how the queries are processed and that Informix is not a multi-version database (just like SQL Server, DB2 in most cases, Sybase and some others).
The sequence of events is:
- We have a table with 5 rows. And we lock row number “3”
- In parallel we start a full scan of the same table. We read rows 1 and 2 and we get stuck on row number 3
- While we’re still holding the lock we do another parallel operation where we remove row 1, and insert it again (with a different value for col3 for better understanding) on a different physical location
- We unlock row 3
- The full scan continues and reads row 1 again, on another physical location
If you’re scared with…