Category Archives: PostgreSQL

PostgreSQL Upserts

Background

SQL upserts are a combination of an INSERT and/or UPDATE into a single database operation which allows rows to to be added or modified in an atomic, concurrent-safe way.

For a time upserts were somewhat of a sensitive subject in PostgreSQL circles. Several other DBMS’s had some form of support for upserts/conflict resolutions (albeit often with limited assurances) and refugees from these other DBMS’s often got hung up when porting existing queries that heavily relied on this type of functionality.

Prior to Postgresql 9.5 there were various (and less-than-ideal) approaches to get around the lack of native support. The PostgreSQL manual provided a stored procedure example which:

  • attempts to update an existing row
  • detects if no row was updated
  • attempts to insert new row instead
  • catches and throws away duplicate key exceptions
  • repeat the steps above indefinitely until either a row has been updated or inserted

Though this approach works as intended (and is more or less applicable to other DBMS’s), it has a certain smell to it. Ignoring the performance aspects for a moment, one of the bigger practical pain points of this approach was that each table that you might want to upsert would need to have its own corresponding stored procedure.

However thanks to the work of dedicated PostgreSQL developers, native support finally made into the software starting with the 9.5 release. The INSERT statement was expanded with an ON CONFLICT clause that gives it the ability to recover from unique constraint violations.

Demo

Test system setup

If you don’t already have a PostgreSQL testbed, you can use the instruction listed here to quickly start up a PostgreSQL docker.

Create test table

CREATE TABLE upsert_test1 (
     name       TEXT PRIMARY KEY,
     fav_color  TEXT
);

Insert row using upsert

INSERT INTO upsert_test1 (name, fav_color) 
                  VALUES ('Sally', 'blue')
         ON CONFLICT (name) 
            DO UPDATE SET fav_color = 'blue';

Confirm row values


SELECT * FROM upsert_test1;

 name  | fav_color 
-------+-----------
 Sally | blue

Update row using same upsert statement

INSERT INTO upsert_test1 (name, fav_color) 
                  VALUES ('Sally', 'green')
         ON CONFLICT (name) 
            DO UPDATE SET fav_color = 'green';

Confirm row values


SELECT * FROM upsert_test1;

 name  | fav_color 
-------+-----------
 Sally | green

Conclusion

As you can see above the same upsert statement was able to insert a new row or update an existing row as the circumstances dictated. Though you shouldn’t use upserts recklessly as there will still be a performance hit, the additional flexibility it provides makes migrating from other databases much easier.

Setting Up A PostgreSQL Docker Sandbox

Motivation

Sandboxes are very handy for isolated testing and demonstration purposes. As it happens, I began to write a post that was going to use a PostgreSQL Docker container as a sandbox and I decided it was better to break it out into its own post that could be referred to in future posts rather than add complexity to a post where it wasn’t the main focus.

Install Docker

For my docker host system I am using Debian 9 (Stretch). Installation was done following the official Get Docker CE for Debian instructions.

I won’t repeat the commands here as those instructions will be updated appropriately as packages, dependencies and Debian distributions change.

However I will mention I’m not a fan of piping raw URL output directly into sudo commands as some of the official instructions indicate. Anytime the instructions call for it I would suggest instead dumping the output to a file and sanity checking it before feeding it to a command with elevated privileges.

One last installation note is that depending on your environment and requirements you may want to give specific users the ability to control docker. As a result of completing the instructions indicated above a new group ‘docker’ was added, adding users to this group is as easy as:

usermod -aG docker yourUsername

Afterwhich you’ll need to re-login for the group changes to take effect (a simple ‘su – yourUsername’ on the command line you plan to run docker commands on suffices for the short-term)

Firing Up The Sandbox

Downloading and starting the PostgreSQL sandbox is as easy as:

docker run \
            --name pgsandbox1 \
            -d \
            -e "POSTGRES_PASSWORD=password" \
            -p 5432:5432 \
            postgres:10

This tells docker to:

  • Run a (ephemeral) container named ‘pgsandbox1’ based of the postgres:10 image tag
  • In detached mode (-d)
  • Setting the password for the PostgreSQL superuser to ‘password’ via a system environmental variable (-e)
  • And publishes (-p) port 5432 of the container on the host

Kicking Tires

First, we verify the container is running:

$ docker container list

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3b445b6a2dc0 postgres:10 "docker-entrypoint.s…" 3 minutes ago Up 3 minutes 5432/tcp pgsandbox1

Next, we lauch the psql interactive shell

Method 1: Exec psql inside the running docker
This method does not require psql to be installed on the host system:

$ docker exec \
          -it \
          pgsandbox1 \
          psql -U postgres

psql (10.3 (Debian 10.3-1.pgdg90+1))
Type "help" for help.

postgres=#

Method 2: Connect via local psql to the published container port

$ psql \
     --host=127.0.0.1 \
     --port=5432 \
     --username=postgres

Password for user postgres: 

psql (10.3 (Debian 10.3-1.pgdg90+1))
Type "help" for help.

postgres=# 

And as a final check we create a table, add a row to it and run a query:

postgres=# CREATE TABLE test (id SERIAL, name TEXT);
CREATE TABLE

postgres=# \d
             List of relations
 Schema |    Name     |   Type   |  Owner   
--------+-------------+----------+----------
 public | test        | table    | postgres
 public | test_id_seq | sequence | postgres
(2 rows)


postgres=# \d test
                            Table "public.test"
 Column |  Type   | Collation | Nullable |             Default              
--------+---------+-----------+----------+----------------------------------
 id     | integer |           | not null | nextval('test_id_seq'::regclass)
 name   | text    |           |          | 


postgres=# INSERT INTO test (name) VALUES ('Pat');
INSERT 0 1

postgres=# SELECT * FROM test;
 id | name 
----+------
 1 | Pat
(1 row)

Important reminder: The docker container running above is ephemeral; it does not retain or persist data between container restarts.

Final Notes

In the end of this we have a minimalist PostgreSQL instance ready for simple testing and examples. There is obviously a lot of room for enhancements and additional functionality. For those interested in modifications to the simple use-case above I would recommend reading through the base postgres docker image page which includes guidance on how best to do so.