2018-12-05 23:21:45 +05:30
|
|
|
# Ordering Table Columns in PostgreSQL
|
2018-03-17 18:26:18 +05:30
|
|
|
|
|
|
|
Similar to C structures the space of a table is influenced by the order of
|
|
|
|
columns. This is because the size of columns is aligned depending on the type of
|
2018-12-05 23:21:45 +05:30
|
|
|
the following column. Let's consider an example:
|
2018-03-17 18:26:18 +05:30
|
|
|
|
2018-12-05 23:21:45 +05:30
|
|
|
- `id` (integer, 4 bytes)
|
|
|
|
- `name` (text, variable)
|
|
|
|
- `user_id` (integer, 4 bytes)
|
2018-03-17 18:26:18 +05:30
|
|
|
|
2018-12-05 23:21:45 +05:30
|
|
|
The first column is a 4-byte integer. The next is text of variable length. The
|
|
|
|
`text` data type requires 1-word alignment, and on 64-bit platform, 1 word is 8
|
|
|
|
bytes. To meet the alignment requirements, four zeros are to be added right
|
|
|
|
after the first column, so `id` occupies 4 bytes, then 4 bytes of alignment
|
|
|
|
padding, and only next `name` is being stored. Therefore, in this case, 8 bytes
|
|
|
|
will be spent for storing a 4-byte integer.
|
2018-03-17 18:26:18 +05:30
|
|
|
|
2018-12-05 23:21:45 +05:30
|
|
|
The space between rows is also subject to alignment padding. The `user_id`
|
|
|
|
column takes only 4 bytes, and on 64-bit platform, 4 zeroes will be added for
|
|
|
|
alignment padding, to allow storing the next row beginning with the "clear" word.
|
2018-03-17 18:26:18 +05:30
|
|
|
|
2018-12-05 23:21:45 +05:30
|
|
|
As a result, the actual size of each column would be (ommiting variable length
|
|
|
|
data and 24-byte tuple header): 8 bytes, variable, 8 bytes. This means that
|
|
|
|
each row will require at least 16 bytes for the two 4-byte integers. If a table
|
|
|
|
has a few rows this is not an issue. However, once you start storing millions of
|
|
|
|
rows you can save space by using a different order. For the above example, the
|
|
|
|
ideal column order would be the following:
|
|
|
|
|
|
|
|
- `id` (integer, 4 bytes)
|
|
|
|
- `user_id` (integer, 4 bytes)
|
|
|
|
- `name` (text, variable)
|
|
|
|
|
2019-03-02 22:35:43 +05:30
|
|
|
or
|
2018-12-05 23:21:45 +05:30
|
|
|
|
|
|
|
- `name` (text, variable)
|
|
|
|
- `id` (integer, 4 bytes)
|
|
|
|
- `user_id` (integer, 4 bytes)
|
|
|
|
|
|
|
|
In these examples, the `id` and `user_id` columns are packed together, which
|
|
|
|
means we only need 8 bytes to store _both_ of them. This in turn means each row
|
|
|
|
will require 8 bytes less space.
|
2018-03-17 18:26:18 +05:30
|
|
|
|
|
|
|
For GitLab we require that columns of new tables are ordered based to use the
|
|
|
|
least amount of space. An easy way of doing this is to order them based on the
|
2018-12-05 23:21:45 +05:30
|
|
|
type size in descending order with variable sizes (`text`, `varchar`, arrays,
|
|
|
|
`json`, `jsonb`, and so on) at the end.
|
2018-03-17 18:26:18 +05:30
|
|
|
|
|
|
|
## Type Sizes
|
|
|
|
|
2019-03-02 22:35:43 +05:30
|
|
|
While the [PostgreSQL documentation](https://www.postgresql.org/docs/current/datatype.html) contains plenty
|
2018-03-17 18:26:18 +05:30
|
|
|
of information we will list the sizes of common types here so it's easier to
|
|
|
|
look them up. Here "word" refers to the word size, which is 4 bytes for a 32
|
|
|
|
bits platform and 8 bytes for a 64 bits platform.
|
|
|
|
|
2018-12-05 23:21:45 +05:30
|
|
|
| Type | Size | Alignment needed |
|
2018-03-17 18:26:18 +05:30
|
|
|
|:-----------------|:-------------------------------------|:-----------|
|
|
|
|
| smallint | 2 bytes | 1 word |
|
|
|
|
| integer | 4 bytes | 1 word |
|
|
|
|
| bigint | 8 bytes | 8 bytes |
|
|
|
|
| real | 4 bytes | 1 word |
|
|
|
|
| double precision | 8 bytes | 8 bytes |
|
|
|
|
| boolean | 1 byte | not needed |
|
|
|
|
| text / string | variable, 1 byte plus the data | 1 word |
|
|
|
|
| bytea | variable, 1 or 4 bytes plus the data | 1 word |
|
|
|
|
| timestamp | 8 bytes | 8 bytes |
|
|
|
|
| timestamptz | 8 bytes | 8 bytes |
|
|
|
|
| date | 4 bytes | 1 word |
|
|
|
|
|
|
|
|
A "variable" size means the actual size depends on the value being stored. If
|
|
|
|
PostgreSQL determines this can be embedded directly into a row it may do so, but
|
|
|
|
for very large values it will store the data externally and store a pointer (of
|
|
|
|
1 word in size) in the column. Because of this variable sized columns should
|
|
|
|
always be at the end of a table.
|
|
|
|
|
|
|
|
## Real Example
|
|
|
|
|
2018-12-05 23:21:45 +05:30
|
|
|
Let's use the `events` table as an example, which currently has the following
|
2018-03-17 18:26:18 +05:30
|
|
|
layout:
|
|
|
|
|
|
|
|
| Column | Type | Size |
|
|
|
|
|:------------|:----------------------------|:---------|
|
|
|
|
| id | integer | 4 bytes |
|
|
|
|
| target_type | character varying | variable |
|
|
|
|
| target_id | integer | 4 bytes |
|
|
|
|
| title | character varying | variable |
|
|
|
|
| data | text | variable |
|
|
|
|
| project_id | integer | 4 bytes |
|
|
|
|
| created_at | timestamp without time zone | 8 bytes |
|
|
|
|
| updated_at | timestamp without time zone | 8 bytes |
|
|
|
|
| action | integer | 4 bytes |
|
|
|
|
| author_id | integer | 4 bytes |
|
|
|
|
|
|
|
|
After adding padding to align the columns this would translate to columns being
|
|
|
|
divided into fixed size chunks as follows:
|
|
|
|
|
|
|
|
| Chunk Size | Columns |
|
|
|
|
|:-----------|:------------------|
|
|
|
|
| 8 bytes | id |
|
|
|
|
| variable | target_type |
|
|
|
|
| 8 bytes | target_id |
|
|
|
|
| variable | title |
|
|
|
|
| variable | data |
|
|
|
|
| 8 bytes | project_id |
|
|
|
|
| 8 bytes | created_at |
|
|
|
|
| 8 bytes | updated_at |
|
|
|
|
| 8 bytes | action, author_id |
|
|
|
|
|
2018-12-05 23:21:45 +05:30
|
|
|
This means that excluding the variable sized data and tuple header, we need at
|
|
|
|
least 8 * 6 = 48 bytes per row.
|
2018-03-17 18:26:18 +05:30
|
|
|
|
|
|
|
We can optimise this by using the following column order instead:
|
|
|
|
|
|
|
|
| Column | Type | Size |
|
|
|
|
|:------------|:----------------------------|:---------|
|
|
|
|
| created_at | timestamp without time zone | 8 bytes |
|
|
|
|
| updated_at | timestamp without time zone | 8 bytes |
|
|
|
|
| id | integer | 4 bytes |
|
|
|
|
| target_id | integer | 4 bytes |
|
|
|
|
| project_id | integer | 4 bytes |
|
|
|
|
| action | integer | 4 bytes |
|
|
|
|
| author_id | integer | 4 bytes |
|
|
|
|
| target_type | character varying | variable |
|
|
|
|
| title | character varying | variable |
|
|
|
|
| data | text | variable |
|
|
|
|
|
|
|
|
This would produce the following chunks:
|
|
|
|
|
|
|
|
| Chunk Size | Columns |
|
|
|
|
|:-----------|:-------------------|
|
|
|
|
| 8 bytes | created_at |
|
|
|
|
| 8 bytes | updated_at |
|
|
|
|
| 8 bytes | id, target_id |
|
|
|
|
| 8 bytes | project_id, action |
|
|
|
|
| 8 bytes | author_id |
|
|
|
|
| variable | target_type |
|
|
|
|
| variable | title |
|
|
|
|
| variable | data |
|
|
|
|
|
2018-12-05 23:21:45 +05:30
|
|
|
Here we only need 40 bytes per row excluding the variable sized data and 24-byte
|
2019-03-02 22:35:43 +05:30
|
|
|
tuple header. 8 bytes being saved may not sound like much, but for tables as
|
2018-12-05 23:21:45 +05:30
|
|
|
large as the `events` table it does begin to matter. For example, when storing
|
|
|
|
80 000 000 rows this translates to a space saving of at least 610 MB, all by
|
|
|
|
just changing the order of a few columns.
|