Import JSON into Postgres using COPY
Loading data is a very common task when working with databases, and much of that data comes from external sources, such as JSON or CSV files.
Fortunately, most databases offer some mechanism to import external data into tables, and PostgreSQL is no different. In this article we will focus on importing JSON into PostgreSQL.
The COPY command
The COPY command can be used to copy the contents of a table out to a file, or to copy data from a file into a table.
COPY is a very powerful tool, and it's blazing fast too, letting you import multi-GB files in seconds when used properly.
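For instance, a minimal round trip with a CSV file might look like this (a sketch, assuming an existing users table and a server-side path the database process can read and write):
-- Export a table to CSV, then load the same file back in.
COPY users TO '/tmp/users.csv' WITH (FORMAT csv, HEADER);
COPY users FROM '/tmp/users.csv' WITH (FORMAT csv, HEADER);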
The COPY command allows us to import data in a variety of formats: CSV, TSV and, with a little extra work, JSON. In this article we'll focus on JSON, since it can get tricky depending on how your file is structured:
Step 1. Figure out what kind of JSON file you have
Open your file with a text editor to see how it's formatted.
If your file is structured this way (note that the rows are inside an array):
[
{"id":1,"first_name":"Kelly","last_name":"Freddi","email":"[email protected]","gender":"Male","ip_address":"79.111.233.14"},
{"id":2,"first_name":"Orazio","last_name":"Playhill","email":"[email protected]","gender":"Male","ip_address":"94.111.107.131"},
...
]
You will have to do an extra step to transform the file into Newline Delimited JSON, or NDJSON, which should look like this:
{"id":1,"first_name":"Kelly","last_name":"Freddi","email":"[email protected]","gender":"Male","ip_address":"79.111.233.14"}
{"id":2,"first_name":"Orazio","last_name":"Playhill","email":"[email protected]","gender":"Male","ip_address":"94.111.107.131"}
...
Note how each JSON object is delimited by a newline (without commas), and that we don't include the array start [ and end ] characters.
Note: If your file is already an NDJSON file, you can skip to step 2.
To convert our file from regular JSON to Newline Delimited JSON you can use our converter, which will do it for you, or a command line tool like jq:
$ jq -c '.[]' your_file.json > your_new_file.json
This simple command takes your JSON file in array format and converts it into an NDJSON file: the .[] filter iterates over the elements of the top-level array, and the -c flag prints each element in compact form, one object per line.
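If you want to sanity-check the result before importing it, peek at the first couple of lines:
$ head -n 2 your_new_file.json
Each line should be a complete, self-contained JSON object.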
Now you can use the COPY command to import your data into Postgres.
Step 2. Importing Newline Delimited JSON data
Use psql to connect to your database:
$ psql -h YOUR_HOST -U YOUR_USER -d YOUR_DATABASE
First, create a temporary table where you'll import your data.
CREATE TABLE temp (data jsonb);
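If you run all of these steps in a single psql session, you could also make it a real temporary table, so Postgres drops it automatically when you disconnect:
CREATE TEMPORARY TABLE temp (data jsonb);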
Once the table is created, let's import our data.
We will use psql's \COPY command, which reads a file on your local machine and sends it to the PostgreSQL server, to load the data into the temporary table:
\COPY temp (data) FROM 'my_data.json';
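One caveat: COPY's default text format treats backslashes as escape characters, so JSON strings that contain backslashes (escaped quotes, \n sequences and so on) can get mangled on the way in. A common workaround is to switch to the CSV format with a quote character and delimiter that can't appear in the data; here's a sketch using arbitrary control characters:
\COPY temp (data) FROM 'my_data.json' WITH (FORMAT csv, QUOTE E'\x01', DELIMITER E'\x02');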
Either way, each JSON object is imported into a row of the temporary table. Now you can easily query the data like this:
SELECT data->>'id', data->>'first_name'
FROM temp;
id | first_name
------+--------------
1 | Kelly
2 | Orazio
3 | Sabine
...
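You can filter on the JSON fields the same way. For example, using the gender field from the sample data:
SELECT data->>'id', data->>'email'
FROM temp
WHERE data->>'gender' = 'Male';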
Step 3. Moving the JSON data into a proper table
Now we have our JSON data, but it's all in a single JSONB column. We might want to move it into a proper table to guarantee the types and structure, which is highly recommended.
You can use an INSERT INTO .. SELECT statement to do this easily. Here's an example (you can also use an existing table):
CREATE TABLE users (id SERIAL, first_name TEXT);
INSERT INTO users
SELECT (data->>'id')::integer, data->>'first_name'
FROM temp;
This will insert all the records into the users table, and Postgres will raise errors if there are any type mismatches, which is very helpful for ensuring data integrity.
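The same pattern extends to the rest of the fields. Here's a sketch for the full sample data above, assuming a wider users table with matching columns (casting ip_address to inet is a choice, text would work too):
INSERT INTO users (id, first_name, last_name, email, gender, ip_address)
SELECT (data->>'id')::integer,
       data->>'first_name',
       data->>'last_name',
       data->>'email',
       data->>'gender',
       (data->>'ip_address')::inet
FROM temp;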
Note: If you're inserting a large number of records, indexes and triggers on the destination table can greatly increase the time the insert takes. You can temporarily disable them during the load to speed things up significantly.
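For example, you could turn off the table's user-defined triggers for the duration of the load (a sketch; note that disabling all triggers, including foreign-key checks, requires extra privileges):
ALTER TABLE users DISABLE TRIGGER USER;
-- run the INSERT INTO users ... SELECT from above here
ALTER TABLE users ENABLE TRIGGER USER;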
Make sure you delete the temporary table once you're done!
DROP TABLE temp;
Do it the easy way with Konbert
You might want to consider using a tool to do this automatically for you. Our JSON to Postgres converter takes a JSON file and outputs a SQL file which you can then import into your database.