Automate schema migrations using DizzleORM and GitHub Actions - Manage thousands of tenants with this workflow

Postgres Array data type

Manage collections of elements using arrays

In Postgres, the ARRAY data type is used to store and manipulate collections of elements in a single column. An array can have variable length and one or more dimensions, but must hold elements of the same data type. Postgres provides a variety of functions and operators for working with arrays.

Arrays are particularly useful when dealing with multiple values that are logically related. For instance, they can store a list of phone numbers for a contact, product categories for an e-commerce item, or even multi-dimensional data for scientific or analytical computations.

Try it on Neon!

Neon is Serverless Postgres built for the cloud. Explore Postgres features and functions in our user-friendly SQL editor. Sign up for a free account to get started.

Sign Up

Storage and syntax

Arrays in Postgres are declared by specifying the element type followed by square brackets. For example,

  • INTEGER[] defines an array of integers.
  • TEXT[][] defines a two-dimensional array of text values.
  • NUMERIC[3] defines an array of three numeric values. However, note that Postgres doesn't enforce the specified size of an array.

Array literals in Postgres are written within curly braces {} and separated by commas. For instance,

  • An array of integers might look like {1, 2, 3}.
  • Multidimensional arrays use nested curly braces, like {{1, 2, 3}, {4, 5, 6}}.

The ARRAY constructor syntax can also be used to create arrays. For example,

  • ARRAY[1, 2, 3] creates an array of integers.
  • ARRAY[[1, 2, 3], [4, 5, 6]] creates a two-dimensional array.

Example usage

Consider the case of maintaining a product catalog for an online store. The same product may belong to multiple categories. For example, an iPad could be tagged as 'Electronics', 'Computer', or 'Mobile'. In this case, we can use an array to store the categories for each product.

First, let's create a products table with some sample data:

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    categories TEXT[],
    units_sold INTEGER[][]
);

INSERT INTO products (name, categories, units_sold)
VALUES
    ('Laptop', '{"Electronics","Computer","Office"}', '{{3200, 3300, 3400, 3500}, {3600, 3700, 3800, 3900}}'),
    ('Headphones', '{"Electronics","Audio"}', '{{2400, 2500, 2600, 2700}, {2800, 2900, 3000, 3100}}'),
    ('Table', '{"Furniture","Office"}', '{{900, 950, 1000, 1050}, {1100, 1150, 1200, 1250}}'),
    ('Keyboard', '{"Electronics","Accessories"}', '{{4100, 4200, 4300, 4400}, {4500, 4600, 4700, 4800}}');

The units_sold column is a two-dimensional array that stores the number of units sold for each product. The first dimension represents the year, and the second dimension represents the quarter.

Now, we can access the values in the array column categories, and use it in our queries. For example, the query below finds products belonging to the Electronics category.

SELECT name, categories
FROM products
WHERE 'Electronics' = ANY (categories);

Note that the ANY operator checks if the value specified exists in the array. This query returns the following result:

| id | name       | categories                      |
|----|------------|---------------------------------|
| 1  | Laptop     | {Electronics, Computer, Office} |
| 2  | Headphones | {Electronics, Audio}            |
| 4  | Keyboard   | {Electronics, Accessories}      |

Other examples

Indexing arrays

Elements in an array can be accessed by their index. Postgres arrays are 1-based, meaning indexing starts at 1.

For example, to get the first category of each product:

SELECT name, categories[1] AS first_category
FROM products;

This query returns the following result:

| name       | first_category |
|------------|----------------|
| Laptop     | Electronics    |
| Headphones | Electronics    |
| Table      | Furniture      |
| Keyboard   | Electronics    |

Multiple elements can be accessed using the SLICE operator. For example, to get the first three categories of each product:

SELECT name, categories[1:3] AS first_three_categories
FROM products;

This query returns the following result:

| name       | first_three_categories          |
|------------|---------------------------------|
| Laptop     | {Electronics, Computer, Office} |
| Headphones | {Electronics, Audio}            |
| Table      | {Furniture, Office}             |
| Keyboard   | {Electronics, Accessories}      |

Multidimensional arrays can be accessed using multiple indices. For example, to get the number of units sold in the last quarter of the first year for each product, we can use the query:

SELECT name, units_sold[1][4] AS units_sold_last_quarter
FROM products;

This query returns the following:

| name       | units_sold_last_quarter |
|------------|-------------------------|
| Laptop     | 3500                    |
| Headphones | 2700                    |
| Table      | 1050                    |
| Keyboard   | 4400                    |

Modifying arrays

Array values can be modified using functions or by directly indexing into the array. You can change specific elements of an array, add or remove elements, or even replace the entire array.

For example, the query below replaces the Audio category across all products with Sound.

UPDATE products
SET categories = array_replace(categories, 'Audio', 'Sound')
WHERE 'Audio' = ANY (categories)
RETURNING *;

This query returns the following result:

| id | name       | categories            | units_sold                                   |
|----|------------|-----------------------|----------------------------------------------|
| 2  | Headphones | {Electronics,Sound}   | {{2400,2500,2600,2700},{2800,2900,3000,3100}} |

Array functions and operators

Postgres provides a variety of functions and operators for working with arrays. You can find the full list of functions and operators in the Postgres documentation.

We'll look at some commonly used functions below.

Length of an array

We can query the number of categories each product has been tagged with:

SELECT name, array_length(categories, 1) as category_count
FROM products;

This query returns the following result:

| name       | category_count |
|------------|----------------|
| Laptop     | 3              |
| Headphones | 3              |
| Table      | 2              |
| Keyboard   | 2              |

The array_length function returns the length of the array in the specified dimension. In this case, we specified the first dimension, which is the number of categories for each product.

Expanding an array into rows

We can use the unnest function to expand an array into rows. For example, to get the number of laptops sold in each quarter, we can use the query:

SELECT name, unnest(units_sold) AS units_sold
FROM products
WHERE name = 'Laptop';

This query returns the following result:

| name   | units_sold |
|--------|------------|
| Laptop | 3200       |
| Laptop | 3300       |
| Laptop | 3400       |
| Laptop | 3500       |
| Laptop | 3600       |
| Laptop | 3700       |
| Laptop | 3800       |
| Laptop | 3900       |

We could use the output of unnest to calculate the total number of units sold for each product; for example:

WITH table_units AS (
  SELECT name, unnest(units_sold) AS total_units_sold
  FROM products
)
SELECT name, sum(total_units_sold)
FROM table_units
GROUP BY name;

This query returns the following result:

| name       | sum   |
|------------|-------|
| Keyboard   | 35600 |
| Table      | 8600  |
| Laptop     | 28400 |
| Headphones | 22000 |

Concatenating arrays

We can concatenate two arrays using the || operator. For example, the query below produces a list of all categories across all products.

SELECT ARRAY[1,2,3] || ARRAY[4,5] as concatenated_array;

This query returns the following result:

| concatenated_array |
|--------------------|
| {1,2,3,4,5}        |

Aggregating values into an array

We can use the array_agg function to produce an array from a set of rows. For example, to get a list of all products that are in the Electronics category, we can use the query:

SELECT array_agg(name) AS product_names
FROM products
WHERE 'Electronics' = ANY (categories);

This query returns the following result:

| product_names                |
|------------------------------|
| {Laptop,Headphones,Keyboard} |

Additional considerations

  • Performance and UX: While arrays provide flexibility, they can be less performant than normalized data structures for large datasets. Compared to a set of rows, arrays can also be more tedious to work with for complex queries.

  • Indexing: Postgres lets you create indexes on array elements for faster searches. Specifically, an inverted index like GIN creates an entry for each element in the array. This allows for fast lookups but can be expensive to maintain for large arrays.

  • No type enforcement: Postgres supports defining the size of an array or the number of dimensions in the schema. However, Postgres does not enforce these definitions. For example, the query below works successfully:

    CREATE TABLE test_size (
      id SERIAL PRIMARY KEY,
      arr1 INTEGER[3]
    );
    INSERT INTO test_size (arr1)
    VALUES (ARRAY[1,2,3]), (ARRAY[1,2]);

    It is therefore up to the application to ensure data integrity.

Resources

Need help?

Join our Discord Server to ask questions or see what others are doing with Neon. Users on paid plans can open a support ticket from the console. For more details, see Getting Support.

Last updated on

Was this page helpful?