Hasan Jawaid: Oracle SQL: Merge

Definition:

The MERGE statement was introduced in Oracle 9i to conditionally insert or update data depending on its presence, a process also known as an "upsert". The MERGE statement reduces table scans and can perform the operation in parallel if required.

Oracle’s MERGE statement is tailor-made for situations when you want to do an "upsert" i.e. update existing rows in a table or insert new rows depending on a match condition. This is typically the case when you have to synchronize a table periodically with data from another source (table/view/query). In place of 3 separate unwieldy INSERT, UPDATE and DELETE statements with conditional sub-queries, the all-in-one MERGE does the job in one shot.

To illustrate how MERGE works, lets go back to the example of our STUDENT Table with GMAT scores.

SQL> select * from student;

        ID NAME                 SCORE
---------- --------------- ----------
         1 Jack                   540
         2 Rose
         3 William                650
         4 Caledon                620
         5 Fabrizio               600
         6 Thomas
         7 Ruth                   680
         8 Spacer                 555

8 rows selected.

Consider another table STUDENT_N, which holds updates for the target table STUDENT. Every week, table STUDENT needs to be synchronized with the data in STUDENT_N – any new entries for students who attempted the GMAT to be inserted, plus corrections if any made to the existing details.

Let’s say this is how STUDENT_N looks before it is merged with STUDENT:

SQL> select * from student_n;

        ID NAME                 SCORE
---------- --------------- ----------
         7 Ruth                   690
         8 Spicer                 620
         9 Wallace                600
        10 Lizzy
        11 Brock                  705

As you can see, the following actions are required on table STUDENT:

1 row for id#7 to be corrected for score: Ruth had scored 690, not 680.
1 row for id#8 to be corrected for name: the student is called Spicer, not Spacer.
3 new rows (ids#9,10,11) to be inserted into STUDENT table.

5 rows should get processed in all.

MERGE statement pseudo-code

A bird’s eye view of the semantics of MERGE:

MERGE into <target table>
USING
    <souce table/view/result of subquery>
ON
    <match condition>
WHEN MATCHED THEN
    <update clause>
    <delete clause>
WHEN NOT MATCHED THEN
    <insert clause>

…and finally…

MERGE magic in action

SQL> merge into student a
  2  using
  3    (select id, name, score
  4     from student_n) b
  5  on (a.id = b.id)
  6  when matched then
  7    update set a.name = b.name
  8         , a.score = b.score
  9  when not matched then
 10    insert (a.id, a.name, a.score)
 11    values (b.id, b.name, b.score);

5 rows merged.

Let’s check the values in table STUDENT now.

SQL> select * from student;

        ID NAME                 SCORE
---------- --------------- ----------
         1 Jack                   540
         2 Rose
         3 William                650
         4 Caledon                620
         5 Fabrizio               600
         6 Thomas
         7 Ruth                   690
        11 Brock                  705
        10 Lizzy
         9 Wallace                600
         8 Spicer                 620

11 rows selected.

Sure enough, 5 rows have got merged as expected – 2 updates + 3 inserts.

MERGE enhancements in Oracle 10G

MERGE was introduced in Oracle 9i. Since then it has gone through a revamp to accommodate new features, most importantly:

MERGE in 10G supports insert-only and update-only operations. Its 9i predecessor required both WHEN MATCHED and WHEN NOT MATCHED to be present in the statement, this is no longer necessary in 10G.
You may ask why one would use MERGE for only a single operation – when there is no all-in-one "upsert" happening, what is the point? What benefit does it give us over good old INSERT or UPDATE?

Well, the big advantage is of ease of coding and understanding. MERGE is a lot more intuitive to read – and so a lot less error-prone – than the equivalent UPDATE with correlated sub-queries.

MERGE might also have a performance advantage over INSERT or UPDATE – don’t assume that though, do benchmark and test.
MERGE in 10G supports the delete operation as part of the update clause.
Merge in 10G has improved exception reporting.

Watch out for these when you MERGE

You cannot update any of the columns you are merging on. If you try updating a student’s id in the example above, this error will show up in 10G:

ORA-38104: Columns referenced in the ON Clause cannot be updated
MERGE is a deterministic statement – that is, you cannot update a row of the target table multiple times in the same MERGE statement.
You must have the INSERT and UPDATE privileges on the target table and the SELECT privilege on the source table. To specify the DELETE clause, you must also have the DELETE privilege on the target table.
When using MERGE for the DELETE operation, remember that:

DELETE checks the match condition on the target table, not the source.
DELETE works only on rows updated during MERGE. Any rows in the target table that are not processed during MERGE are not deleted, even if they match the DELETE condition.

SQL> select * from student;

        ID NAME                 SCORE
---------- --------------- ----------
         1 Jack                   540
         2 Rose
         3 William                650
         4 Caledon                620
         5 Fabrizio               600
         6 Thomas
         7 Ruth                   690
        11 Brock                  705
        10 Lizzy
         9 Wallace                600
         8 Spicer                 620

11 rows selected.

SQL> select * from student_n;

        ID NAME                 SCORE
---------- --------------- ----------
         7 Ruth                   690
         8 Spicer                 620
         9 Wallace                600
        10 Lizzy
        11 Brock                  705

SQL> merge into student a
  2  using
  3    (select id, name, score
  4     from student_n) b
  5  on (a.id = b.id)
  6  when matched then
  7    update set a.name = b.name
  8         , a.score = b.score
  9    delete where a.score < 640;

5 rows merged.

SQL> select * from student;

        ID NAME                 SCORE
---------- --------------- ----------
         1 Jack                   540
         2 Rose
         3 William                650
         4 Caledon                620
         5 Fabrizio               600
         6 Thomas
         7 Ruth                   690
        11 Brock                  705
        10 Lizzy

9 rows selected.

In the MERGE statement above, Jack, Caledon and Fabrizio’s records were not deleted from the table STUDENTS though their scores are below 640, since the delete condition works on the source, not the target.

MERGE is a key-preserved operation – that is, for each source row, Oracle has to be able to identify a single target record for update. The simplest method of ensuring this is to join source and target tables by the primary key of the target. If this condition is not satisfied, error ORA-30926 will show up in 10G:

ORA-30926: unable to get a stable set of rows in the source tables
Oracle Database does not implement fine-grained access control during MERGE statements. If you are using fine-grained access control on the target table, you’d probably have to stick to the equivalent INSERT and UPDATE statements. [Please leave a comment if you know a way out of this.]
Even if a record in source and target is identical, it is processed by the merge_update_clause if the match condition returns TRUE. e.g. If the source has 3 records and all 3 are identical to the target, MERGE will report ‘3 rows merged’ though this merge made no difference to the target table.
MERGE tells you the total number of rows processed in the upsert, it does not tell you the individual counts of rows inserted/updated/deleted. If you’re working for a customer who insists on getting individual counts, you’ll either have to persuade them to adapt to the new way of reporting or once again stick to the equivalent INSERT and UPDATE statements.
If you are have a thing for gap less sequences, you’re in for heartache as MERGE tends to produce large sequence gaps. The MERGE statement increments sequence.nextval for both insert and update – read more about it in this case study.