Wednesday, April 29, 2009

An severe performance impact by subselect in sql "from" clause (DB2)

Started working on the auto-pricing process performance test, but an very strange thing blocks me: a very simple query runs in very very bad performance, only 1 record inserted per second! The database is DB2 8.5 on zLinux server which runs in normal CPU utilization (35% only).

Originally, I thought there must be some problem in the database, so bad performance should not happen even the SQL hasn't been optimized. But the feedback from DBA is that he checked all things in DB, but can not find any abnormal indication. I really felt overwhelmed!

But today, I finally acknowledge that the bad performance is really caused by my query:

// less 1 record inserted per second, 30000 records insertion need 10 hours to complete (timeout...)
INSERT INTO WWPRT.CABLEPRODUCT_PRICETYPE_JOIN_CN (CABLEPRODUCTID, PRICETYPE, SLAVE) (
     SELECT cableproduct.ID, pricetypes.PRICETYPE, 'N'
     FROM WWPRT.CABLE_PRODUCT_JOIN_CN AS cableproduct,
     (
          SELECT DISTINCT usprice.PRICETYPE FROM WWPRT.PRICE_CN usprice
          JOIN WWPRT.CONF_PRICETYPE pt ON pt.ID = usprice.PRICETYPE
          WHERE usprice.CABLEID = 'USCABF3FVT01'
          AND pt.DOMAIN IN ( SELECT DISTINCT DOMAIN FROM WWPRT.CONF_GAP )

     ) AS pricetypes
     WHERE CABLEID = 'GEOANNF3FVT1'
     AND EXISTS (
         ......
     )
)


The arch-criminal is the subselect statement in "from" clause, the subselect query is performed for each record in WWPRT.CABLE_PRODUCT_JOIN_CN table. (awful!)

I changed it to a new way, using temporary table which query is executed only once.

// 30000 records insertion only costs 9 seconds to complete!
WITH US_PRICETYPES (PRICETYPE) AS (
    SELECT DISTINCT usprice.PRICETYPE FROM WWPRT.PRICE_CN usprice
    JOIN WWPRT.CONF_PRICETYPE pt ON pt.ID = usprice.PRICETYPE
    LEFT OUTER JOIN WWPRT.CONF_GAP gap ON pt.DOMAIN = gap.DOMAIN
    WHERE usprice.CABLEID = 'USCABF3FVT01'
    AND gap.DOMAIN IS NOT NULL
)
SELECT COUNT(*) FROM NEW TABLE
(INSERT INTO WWPRT.CABLEPRODUCT_PRICETYPE_JOIN_CN (CABLEPRODUCTID, PRICETYPE, SLAVE) (
     SELECT cableproduct.ID, pricetypes.PRICETYPE, 'N'
     FROM WWPRT.CABLE_PRODUCT_JOIN_CN AS cableproduct,
     US_PRICETYPES AS pricetypes
     WHERE CABLEID = 'GEOANNF3FVT1'
     AND EXISTS (
         ......
     )
))


The improvement is awesome! From 10 hours to 10 seconds! Although it happens in DB2, but I think this case should also exist in other types of database, but not verified.

--------------------------------------------------------------------------------------
Feedback from Daniel, thank you very much:

If you don't care the result of how many records inserted, I think you can use 'FETCH FIRST 1 ROWS ONLY' instead of count(*).

SELECT CABLEPRODUCTID FROM NEW TABLE
(INSERT INTO WWPRT.CABLEPRODUCT_PRICETYPE_JOIN_CN (CABLEPRODUCTID, PRICETYPE, SLAVE) (
     SELECT cableproduct.ID, pricetypes.PRICETYPE, 'N'
     FROM WWPRT.CABLE_PRODUCT_JOIN_CN AS cableproduct,
     US_PRICETYPES AS pricetypes
     WHERE CABLEID = 'GEOANNF3FVT1'
     AND EXISTS (
         ......
     )
)) FETCH FIRST 1 ROWS ONLY


No comments:

Post a Comment