Table of Contents
This chapter provides an overview of the physical storage format used by PostgreSQL databases.
This section describes the storage format at the level of files and directories.
All the data needed for a database cluster is stored within the cluster's data
directory, commonly referred to as PGDATA
(after the name of the
environment variable that can be used to define it). A common location for
PGDATA
is /var/lib/pgsql/data
. Multiple clusters,
managed by different server instances, can exist on the same machine.
The PGDATA
directory contains several subdirectories and control
files, as shown in Table 52.1, “Contents of PGDATA
”. In addition to
these required items, the cluster configuration files
postgresql.conf
, pg_hba.conf
, and
pg_ident.conf
are traditionally stored in
PGDATA
(although in PostgreSQL 8.0 and
later, it is possible to keep them elsewhere).
Table 52.1. Contents of PGDATA
Item | Description |
---|---|
PG_VERSION |
A file containing the major version number of PostgreSQL |
base |
Subdirectory containing per-database subdirectories |
global |
Subdirectory containing cluster-wide tables, such as
pg_database
|
pg_clog |
Subdirectory containing transaction commit status data |
pg_multixact |
Subdirectory containing multitransaction status data (used for shared row locks) |
pg_subtrans |
Subdirectory containing subtransaction status data |
pg_tblspc |
Subdirectory containing symbolic links to tablespaces |
pg_twophase |
Subdirectory containing state files for prepared transactions |
pg_xlog |
Subdirectory containing WAL (Write Ahead Log) files |
postmaster.opts |
A file recording the command-line options the server was last started with |
postmaster.pid |
A lock file recording the current server PID and shared memory segment ID (not present after server shutdown) |
For each database in the cluster there is a subdirectory within
PGDATA
/base
, named after the database's OID in
pg_database
. This subdirectory is the default location
for the database's files; in particular, its system catalogs are stored
there.
Each table and index is stored in a separate file, named after the table
or index's filenode number, which can be found in
pg_class
.relfilenode
.
Note that while a table's filenode often matches its OID, this is
not necessarily the case; some operations, like
TRUNCATE
, REINDEX
, CLUSTER
and some forms
of ALTER TABLE
, can change the filenode while preserving the OID.
Avoid assuming that filenode and table OID are the same.
When a table or index exceeds 1 GB, it is divided into gigabyte-sized segments. The first segment's file name is the same as the filenode; subsequent segments are named filenode.1, filenode.2, etc. This arrangement avoids problems on platforms that have file size limitations. The contents of tables and indexes are discussed further in Section 52.3, “Database Page Layout”.
A table that has columns with potentially large entries will have an
associated TOAST table, which is used for out-of-line storage of
field values that are too large to keep in the table rows proper.
pg_class
.reltoastrelid
links from a table to
its TOAST table, if any.
See Section 52.2, “TOAST” for more information.
Tablespaces make the scenario more complicated. Each user-defined tablespace
has a symbolic link inside the PGDATA
/pg_tblspc
directory, which points to the physical tablespace directory (as specified in
its CREATE TABLESPACE
command). The symbolic link is named after
the tablespace's OID. Inside the physical tablespace directory there is
a subdirectory for each database that has elements in the tablespace, named
after the database's OID. Tables within that directory follow the filenode
naming scheme. The pg_default
tablespace is not accessed through
pg_tblspc
, but corresponds to
PGDATA
/base
. Similarly, the pg_global
tablespace is not accessed through pg_tblspc
, but corresponds to
PGDATA
/global
.