A composite type describes the structure of a row or record; it is in essence just a list of field names and their data types. PostgreSQL allows values of composite types to be used in many of the same ways that simple types can be used. For example, a column of a table can be declared to be of a composite type.
Here are two simple examples of defining composite types:
CREATE TYPE complex AS (
    r       double precision,
    i       double precision
);

CREATE TYPE inventory_item AS (
    name            text,
    supplier_id     integer,
    price           numeric
);
The syntax is comparable to CREATE TABLE, except that only field names and types can be specified; no constraints (such as NOT NULL) can presently be included. Note that the AS keyword is essential; without it, the system will think a quite different kind of CREATE TYPE command is meant, and you'll get odd syntax errors.
Having defined the types, we can use them to create tables:
CREATE TABLE on_hand (
    item      inventory_item,
    count     integer
);

INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);
or functions:
CREATE FUNCTION price_extension(inventory_item, integer) RETURNS numeric
AS 'SELECT $1.price * $2' LANGUAGE SQL;

SELECT price_extension(item, 10) FROM on_hand;
Whenever you create a table, a composite type is also automatically created, with the same name as the table, to represent the table's row type. For example, had we said
CREATE TABLE inventory_item (
    name            text,
    supplier_id     integer REFERENCES suppliers,
    price           numeric CHECK (price > 0)
);
then the same inventory_item composite type shown above would come into being as a byproduct, and could be used just as above. Note however an important restriction of the current implementation: since no constraints are associated with a composite type, the constraints shown in the table definition do not apply to values of the composite type outside the table. (A partial workaround is to use domain types as members of composite types.)
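As a rough sketch of the domain workaround just mentioned: the names positive_price and priced_item below are hypothetical and chosen only for illustration. The CHECK constraint lives in the domain, so it should be enforced whenever a value is converted to the field's type, including outside any table.

```sql
-- Hypothetical names; not part of the examples above.
CREATE DOMAIN positive_price AS numeric CHECK (VALUE > 0);

CREATE TYPE priced_item AS (
    name        text,
    supplier_id integer,
    price       positive_price
);

-- Accepted: the price satisfies the domain's CHECK constraint.
SELECT ROW('fuzzy dice', 42, 1.99)::priced_item;

-- Expected to be rejected: the price violates the domain's CHECK constraint.
SELECT ROW('fuzzy dice', 42, -1.00)::priced_item;
```

The workaround is only partial because a domain can carry CHECK and NOT NULL constraints but has no equivalent of a table constraint such as REFERENCES.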
To write a composite value as a literal constant, enclose the field values within parentheses and separate them by commas. You may put double quotes around any field value, and must do so if it contains commas or parentheses. (More details appear below.) Thus, the general format of a composite constant is the following:
'(val1,val2, ... )'
An example is
'("fuzzy dice",42,1.99)'
which would be a valid value of the inventory_item type defined above. To make a field be NULL, write no characters at all in its position in the list. For example, this constant specifies a NULL third field:
'("fuzzy dice",42,)'
If you want an empty string rather than NULL, write double quotes:
'("",42,)'
Here the first field is a non-NULL empty string, the third is NULL.
(These constants are actually only a special case of the generic type constants discussed in Section 4.1.2.5, “Constants of Other Types”. The constant is initially treated as a string and passed to the composite-type input conversion routine. An explicit type specification might be necessary.)
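For illustration, here are a few ways the type of such a constant might be specified explicitly. This is a sketch that assumes the inventory_item type and on_hand table defined earlier.

```sql
-- Assumes the inventory_item type and on_hand table defined earlier.

-- PostgreSQL-style cast:
SELECT '("fuzzy dice",42,1.99)'::inventory_item;

-- Standard CAST syntax:
SELECT CAST('("fuzzy dice",42,1.99)' AS inventory_item);

-- Type name written before the string constant:
SELECT inventory_item '("fuzzy dice",42,1.99)';

-- No explicit type is needed here; the target column supplies it.
INSERT INTO on_hand VALUES ('("fuzzy dice",42,1.99)', 1000);
```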
The ROW expression syntax may also be used to construct composite values. In most cases this is considerably simpler to use than the string-literal syntax, since you don't have to worry about multiple layers of quoting. We already used this method above:
ROW('fuzzy dice', 42, 1.99)
ROW('', 42, NULL)
The ROW keyword is actually optional as long as you have more than one field in the expression, so these can simplify to
('fuzzy dice', 42, 1.99)
('', 42, NULL)
The ROW expression syntax is discussed in more detail in Section 4.2.11, “Row Constructors”.
To access a field of a composite column, one writes a dot and the field name, much like selecting a field from a table name. In fact, it's so much like selecting from a table name that you often have to use parentheses to keep from confusing the parser. For example, you might try to select some subfields from our on_hand example table with something like:
SELECT item.name FROM on_hand WHERE item.price > 9.99;
This will not work since the name item is taken to be a table name, not a field name, per SQL syntax rules. You must write it like this:
SELECT (item).name FROM on_hand WHERE (item).price > 9.99;
or if you need to use the table name as well (for instance in a multitable query), like this:
SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 9.99;
Now the parenthesized object is correctly interpreted as a reference to the item column, and then the subfield can be selected from it.
Similar syntactic issues apply whenever you select a field from a composite value. For instance, to select just one field from the result of a function that returns a composite value, you'd need to write something like
SELECT (my_func(...)).field FROM ...
Without the extra parentheses, this will provoke a syntax error.
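A concrete version of this pattern might look like the following sketch. The function make_item is a hypothetical helper defined only for this illustration, and the inventory_item type from earlier is assumed to exist.

```sql
-- Hypothetical helper, defined only for this illustration.
-- Assumes the inventory_item type defined earlier.
CREATE FUNCTION make_item(text, integer, numeric) RETURNS inventory_item
AS 'SELECT ROW($1, $2, $3)::inventory_item' LANGUAGE SQL;

-- Parentheses around the call let a single field be selected from the result.
SELECT (make_item('fuzzy dice', 42, 1.99)).price;

-- The same parenthesized form works for expanding all fields.
SELECT (make_item('fuzzy dice', 42, 1.99)).*;
```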
Here are some examples of the proper syntax for inserting and updating composite columns. First, inserting or updating a whole column:
INSERT INTO mytab (complex_col) VALUES((1.1,2.2));

UPDATE mytab SET complex_col = ROW(1.1,2.2) WHERE ...;
The first example omits ROW, the second uses it; we could have done it either way.
We can update an individual subfield of a composite column:
UPDATE mytab SET complex_col.r = (complex_col).r + 1 WHERE ...;
Notice here that we don't need to (and indeed cannot) put parentheses around the column name appearing just after SET, but we do need parentheses when referencing the same column in the expression to the right of the equal sign.
And we can specify subfields as targets for INSERT, too:
INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
Had we not supplied values for all the subfields of the column, the remaining subfields would have been filled with null values.
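To see the null-filling behavior, one could try something like the sketch below. The definition of mytab is an assumption made for this example (the text above never shows it); it uses the complex type defined earlier.

```sql
-- Assumed definition of the example table, using the complex type above.
CREATE TABLE mytab (complex_col complex);

-- Only the r subfield is supplied; the i subfield is left null.
INSERT INTO mytab (complex_col.r) VALUES (1.1);

-- Show the stored value and confirm that the omitted subfield is null.
SELECT complex_col, (complex_col).i IS NULL AS i_is_null FROM mytab;
```

The stored value should display as (1.1,), with nothing after the comma because the second field is null.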
The external text representation of a composite value consists of items that are interpreted according to the I/O conversion rules for the individual field types, plus decoration that indicates the composite structure. The decoration consists of parentheses (( and )) around the whole value, plus commas (,) between adjacent items. Whitespace outside the parentheses is ignored, but within the parentheses it is considered part of the field value, and may or may not be significant depending on the input conversion rules for the field data type. For example, in
For example, in
'( 42)'
the whitespace will be ignored if the field type is integer, but not if it is text.
As shown previously, when writing a composite value you may write double quotes around any individual field value. You must do so if the field value would otherwise confuse the composite-value parser. In particular, fields containing parentheses, commas, double quotes, or backslashes must be double-quoted. To put a double quote or backslash in a quoted composite field value, precede it with a backslash. (Also, a pair of double quotes within a double-quoted field value is taken to represent a double quote character, analogously to the rules for single quotes in SQL literal strings.) Alternatively, you can use backslash-escaping to protect all data characters that would otherwise be taken as composite syntax.
A completely empty field value (no characters at all between the commas or parentheses) represents a NULL. To write a value that is an empty string rather than NULL, write "".
The composite output routine will put double quotes around field values if they are empty strings or contain parentheses, commas, double quotes, backslashes, or white space. (Doing so for white space is not essential, but aids legibility.) Double quotes and backslashes embedded in field values will be doubled.
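The output rules can be observed directly by constructing a few values and looking at how they are displayed. This sketch assumes the inventory_item type defined earlier; the trailing comments show the text form each value is expected to take under the rules just described.

```sql
-- Assumes the inventory_item type defined earlier.
SELECT ROW('dice', 42, 1.99)::inventory_item;        -- (dice,42,1.99)
SELECT ROW('fuzzy dice', 42, 1.99)::inventory_item;  -- ("fuzzy dice",42,1.99)
SELECT ROW('', 42, NULL)::inventory_item;            -- ("",42,)
SELECT ROW('say "hi"', 42, NULL)::inventory_item;    -- ("say ""hi""",42,)
SELECT ROW(E'a\\b', 42, NULL)::inventory_item;       -- ("a\\b",42,)
```

The last statement uses escape string syntax, so the name field holds a single backslash, which the output routine then doubles.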
Remember that what you write in an SQL command will first be interpreted as a string literal, and then as a composite. This doubles the number of backslashes you need (assuming escape string syntax is used). For example, to insert a text field containing a double quote and a backslash in a composite value, you'd need to write
INSERT ... VALUES (E'("\\"\\\\")');
The string-literal processor removes one level of backslashes, so that what arrives at the composite-value parser looks like ("\"\\"). In turn, the string fed to the text data type's input routine becomes "\. (If we were working with a data type whose input routine also treated backslashes specially, bytea for example, we might need as many as eight backslashes in the command to get one backslash into the stored composite field.) Dollar quoting (see Section 4.1.2.2, “Dollar-Quoted String Constants”) may be used to avoid the need to double backslashes.
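As a small sketch of the difference, the two statements below are intended to produce the same name field, a double quote followed by a backslash. The inventory_item type defined earlier is assumed.

```sql
-- Assumes the inventory_item type defined earlier.

-- Escape string syntax: each backslash meant for the composite parser is doubled.
SELECT (E'("\\"\\\\",42,1.99)'::inventory_item).name;

-- Dollar quoting: the composite parser receives the text exactly as written.
SELECT ($$("\"\\",42,1.99)$$::inventory_item).name;
```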
The ROW constructor syntax is usually easier to work with than the composite-literal syntax when writing composite values in SQL commands. In ROW, individual field values are written the same way they would be written when not members of a composite.
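For comparison, here is a sketch of the same kind of value built with ROW, again assuming the inventory_item type defined earlier. The name is written as an ordinary string constant, so only the string-literal level of escaping applies.

```sql
-- Assumes the inventory_item type defined earlier.
-- E'"\\' is a two-character string: a double quote followed by a backslash.
SELECT (ROW(E'"\\', 42, 1.99)::inventory_item).name;
```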