Upsert in SQL
Upsert is an operation that ➊ inserts new records into the database and ➋ updates existing ones. Let’s see how it works in different DBMS. The examples are interactive, so you can read and practice.
We will use the toy
employees
table:┌────┬───────┬────────┬────────────┬────────┐ │ id │ name │ city │ department │ salary │ ├────┼───────┼────────┼────────────┼────────┤ │ 11 │ Diane │ London │ hr │ 70 │ │ 12 │ Bob │ London │ hr │ 78 │ │ 21 │ Emma │ London │ it │ 84 │ │ 22 │ Grace │ Berlin │ it │ 90 │ │ 23 │ Henry │ London │ it │ 104 │ │ 24 │ Irene │ Berlin │ it │ 104 │ │ 31 │ Cindy │ Berlin │ sales │ 96 │ │ 32 │ Dave │ London │ sales │ 96 │ └────┴───────┴────────┴────────────┴────────┘
Let’s say we are adding two new employees:
(25, 'Frank', 'Berlin', 'it', 120) (33, 'Alice', 'Berlin', 'sales', 100)
Hello, Frank. Hello, Alice.
And updating the other two:
- (11, 'Diane', 'London', 'hr', 70) + (11, 'Diane', 'Berlin', 'hr', 70) - (21, 'Emma', 'London', 'it', 84) + (21, 'Emma', 'London', 'it', 95)
Diane is moving to Berlin, while Emma’s salary is increasing to 95.
You can read this article from start to finish, or you can skip to a section about the specific database engine:
MySQL/MariaDB • SQLite • PostgreSQL • SQL Server • Oracle • Others
INSERT
Let’s try the easy way:
insert into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100);
The first table is “before insert”, the second is “after insert”. We’ll use this notation for all examples.
As expected, it fails for Diane and Emma: the employee ID is a primary key, so we can’t have two employees with the same ID. We’ll have to try something else.
MySQL: REPLACE
MySQL’s
replace into
inserts records with new primary/unique keys (in our case, employee IDs) and replaces the ones with existing keys:replace into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100);
Added Frank and Alice, replaced Diane and Emma.
The bad thing about
replace
is that it does an actualdelete
followed byinsert
, which has some unexpected side effects like firingon delete
triggers.Also, you can’t reference the old values when replacing the record. For example, you can’t set
city
=Berlin (prev. London)
for Diane.To ignore the duplicates and only perform inserts, you can use
insert ignore
:insert ignore into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100);
Added Frank and Alice, ignored Diane and Emma.
replace
is probably not the best way to do upserts in MySQL. There is a better alternative ↓MySQL: INSERT…ON DUPLICATE KEY UPDATE
MySQL’s
on duplicate key update
inserts records with new primary/unique keys (in our case, employee IDs) and updates the ones with existing keys:insert into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) on duplicate key update city = values(city), salary = values(salary);
Added Frank and Alice, updated Diane and Emma.
When updating a record, you can reference the old value by column name (
city
) and the new value with thevalues()
function (values(city)
). Now we can actually update a value, not just replace it:insert into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70) on duplicate key update city = concat(values(city), ' (prev. ', city, ')');
Updated the city for Diane and kept the old value for reference.
Starting with MySQL 8.0, you can alias the new values and not use the ugly
values()
function:insert into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) as new on duplicate key update city = new.city, salary = new.salary;
Using the
new
alias for new records.SQLite: INSERT OR…
SQLite’s
insert or replace
inserts records with new primary/unique keys (in our case, employee IDs) and replaces the ones with existing keys:insert or replace into employees (id, name, city, department, salary) values (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100), (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95);
Added Frank and Alice, replaced Diane and Emma.
Like MySQL’s
replace
, SQLite’sor replace
deletes existing duplicate records before inserting the new ones. It fires delete triggers in some cases and does not allow references to old values.SQLite provides a number of alternative strategies instead of
replace
:
ignore
: skips the duplicate records (similar to MySQL’signore
).rollback
: aborts the current statement and rolls back the entire transaction.abort
: aborts the current statement and reverts any changes made by it, but does not roll back the transaction.fail
: aborts the current statement, but does not revert any changes already made by it, and does not roll back the transaction.insert or abort into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100);
Since there are conflicts,
abort
reverts all changes. The table remains as it was.
insert or replace
is probably not the best way to do upserts in SQLite. There is a better alternative ↓SQLite: INSERT…ON CONFLICT
SQLite’s
on conflict do update
inserts records with new primary/unique keys (in our case, employee IDs) and updates the ones with existing keys:insert into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) on conflict do update set city = excluded.city, salary = excluded.salary;
Added Frank and Alice, updated Diane and Emma.
When updating a record, you can reference the old value by column name (
city
) and the new value with theexcluded
prefix (excluded.city
).You can optionally disable some updates using the
where
clause:insert into employees (id, name, city, department, salary) values (21, 'Emma', 'London', 'it', 95), (22, 'Grace', 'Berlin', 'it', 105) on conflict do update set salary = excluded.salary where excluded.salary < 100;
We only allow salaries under 100, so Emma gets updated but Grace does not.
Besides
do update
, there is alsodo nothing
, which ignores conflicts and only inserts new records:insert into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) on conflict do nothing;
Added Frank and Alice, ignored Diane and Emma.
on conflict
even allows you to specify different rules for different uniqueness constraints:create unique index idx_employees_name on employees(name); insert into employees (id, name, city, department, salary) values -- increase Emma's salary to 95 (21, 'Emma', 'London', 'it', 95), -- change Grace's name to Diane (22, 'Diane', 'Berlin', 'it', 90) on conflict (id) do update set city = excluded.city, salary = excluded.salary on conflict (name) do nothing;
We only allow unique names (stupid, I know), and “Diane” is already taken. So Emma’s salary is updated, but Grace’s name is not.
Code playgrounds
Interactive examples in this post are powered by codapi — the platform I’m building. Use it to embed code playgrounds into your online course, documentation, or blog.
PostgreSQL: INSERT…ON CONFLICT
PostgreSQL’s
on conflict do update
is very similar to SQLite’s (in fact, SQLite’s implementation was inspired by PostgreSQL’s). It inserts records with new primary/unique keys (in our case, employee IDs) and updates the ones with existing keys:insert into employees (id, name, city, department, salary) values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) on conflict (id) do update set city = excluded.city, salary = excluded.salary;
Added Frank and Alice, updated Diane and Emma.
When updating a record, you can reference the old value by column name (
city
) and the new value with theexcluded
prefix (excluded.city
).Note that the constraint reference after
on conflict
(it is called conflict target —id
in our case) is required. You can also specify a constraint name:alter table employees add constraint uniq_employees_name unique (name); insert into employees (id, name, city, department, salary) values -- change Grace's name to Diane (22, 'Diane', 'Berlin', 'it', 90) on conflict on constraint uniq_employees_name do nothing;
We only allow unique names, and “Diane” is already taken. Grace’s name remains unchanged.
As you can see from the example above, PostgreSQL supports
do nothing
just like SQLite. It also supports thewhere
clause:insert into employees (id, name, city, department, salary) values (21, 'Emma', 'London', 'it', 95), (22, 'Grace', 'Berlin', 'it', 105) on conflict (id) do update set salary = excluded.salary where excluded.salary < 100;
We only allow salaries under 100, so Emma gets updated but Grace does not.
Unlike SQLite, PostgreSQL does not allow multiple
on conflict
clauses:create unique index idx_employees_name on employees(name); insert into employees (id, name, city, department, salary) values (21, 'Emma', 'London', 'it', 95), (22, 'Diane', 'Berlin', 'it', 90) on conflict (id) do update set city = excluded.city, salary = excluded.salary on conflict (name) do nothing;
This is an error: only one
on conflict
is allowed.
insert...on conflict
is a fine way to do upserts on older PostgreSQL versions. For modern ones (15+) there is a better alternative ↓SQL Standard: MERGE
Now that we’ve seen all the creative ways vendors have implemented upsert in their systems, let’s look at the standard way — the
merge
statement:merge into employees using ( values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) ) as new (id, name, city, department, salary) on employees.id = new.id when matched then update set city = new.city, salary = new.salary when not matched then insert values (id, name, city, department, salary);
Added Frank and Alice, updated Diane and Emma.
Here is what happened:
- We specified the merge source in the
using
clause (here we use thevalues
as the source, butusing
also accepts tables, views and subqueries).- We gave the merge source an alias (
new
) for future reference.- We defined the match criteria between the merge source (our
new
values) and the merge target (theemployees
table) using theon
clause (employees.id = new.id
).- We set the action to take when the merge source and merge target match by the given criteria (update
city
andsalary
with new values).- We set the action to take when the merge source and merge target don’t match (i.e., there are new records in the source, so we should insert them).
Whew. That’s a lot to process. But the
merge
statement can do much more. For example, you can change the inserted values:merge into employees using ( values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) ) as new (id, name, city, department, salary) on employees.id = new.id when matched then update set city = new.city, salary = new.salary when not matched then insert values (id, name || ' (new!)', city, department, salary);
We still update Diane and Emma, but we also change Frank and Alice’s names on insert.
Or even delete the matching records:
merge into employees using ( values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) ) as new (id, name, city, department, salary) on employees.id = new.id when matched then delete when not matched then insert values (id, name, city, department, salary);
Added Frank and Alice, deleted Diane and Emma.
Or ignore new records altogether by omitting the
when not matched
:merge into employees using ( values (11, 'Diane', 'Berlin', 'hr', 70), (21, 'Emma', 'London', 'it', 95), (25, 'Frank', 'Berlin', 'it', 120), (33, 'Alice', 'Berlin', 'sales', 100) ) as new (id, name, city, department, salary) on employees.id = new.id when matched then update set city = new.city, salary = new.salary;
Updated Diane and Emma, ignored Frank and Alice.
You can perform conditional inserts, updates and deletes using
when...and
clause:merge into employees using ( values (21, 'Emma', 'London', 'it', 95), (22, 'Grace', 'Berlin', 'it', 105) ) as new (id, name, city, department, salary) on employees.id = new.id when matched and new.salary < 100 then update set city = new.city, salary = new.salary;
We only allow salaries under 100, so Emma gets updated but Grace does not.
merge
is supported in PostgreSQL (15+), SQL Server (2008+) and Oracle (11+). Of course, each of them has its own quirks. For example, PostgreSQL adds the non-standarddo nothing
action, while SQL Server supports differentwhen not matched
actions depending on where the value is missing — in a source or target dataset.→ Documentation: Standard • PostgreSQL • SQL Server • Oracle
Other database engines
BigQuery, H2, Snowflake, DB2 — all support
merge
(each to some extent).MariaDB supports
replace
andon duplicate key update
, just like MySQL.Clickhouse does not have upsert SQL statements, but it does provide a ReplacingMergeTree table engine that automatically performs a replace-style upsert on insert.
Cockroachdb supports
insert...on conflict
(like PostgreSQL), but with an optional conflict target. It also provides anupsert
statement (ta-da!) that works like an automaticon conflict do update
.DuckDB supports
insert or replace
andinsert or ignore
(like SQLite) as well asinsert...on conflict
(also like SQLite), but without multiple clauses.Summary
We’ve reviewed ways to do upserts in each of the major databases:
replace
oron duplicate key update
in MySQL (MariaDB).insert or replace
andon conflict do
in SQLite.on conflict do
in older PostgreSQL versions.merge
in modern PostgreSQL, SQL Server and Oracle.It’s funny that (almost) no DBMS has actually named the statement
upsert
, despite the popularity of the name.
Upsert in SQL
Pages: 1 2