This chapter describes Mnemosyne, the Mnesia database query language, and the syntax, semantics, and rules which apply to Mnesia queries. The following sections are included:
The following notational conventions are used in this chapter:
table
.
<pattern>
.
Mnemosyne is the query language and the optimizing query compiler for the Mnesia Database Management System.
Database queries are used when more complex operations than just a simple key-value lookup are required on a database. A query can find all records in a table that fulfills a given property. For example, think of a table storing the status of subscriber lines in a telephone exchange. A query in such a table can take the format: "Which subscriber lines are 'blocked'?".
A query can also find records on the basis of their relationship to other records in the same table, or in other tables. If the table, which stores subscriber lines, is accompanied by a table, which pairs subscriber numbers with a subscriber line identification, we can modify our previous query and ask: "Which subscriber numbers are 'blocked' ?". This can be answered by constructing a query, which finds the blocked subscriber line identifications in the subscriber line table, and then finds the associated subscriber number in the table, which pairs subscriber number and subscriber line identifications.
However, the proposed solution may not be the most efficient solution, because it depends on what the tables look like in runtime. In other words, how many records the table contains, and the number of different values stored.
In a situation where there are only a couple of subscriber numbers but a million blocked lines, it would be far more efficient to first find the subscribers and then check if their line is blocked. The query evaluation order depends on how large the tables are compared to each other. The evaluation order also depends on key and other value distribution, and if there are any indices defined (refer to Mnesia Chapter 5: Indexing for more information).
The query compiler resolves the evaluation order. We need only express what we want to do, and the query compiler and query evaluator will determine the best evaluation order. Therefore, we can express the query in the most readable form.
Queries in Mnemosyne use first order predicate logic (similar to Prolog), but in an syntax suitable for Erlang. The "query list comprehension" used in Mnemosyne is taken from the functional languages community. The advantage over embedded SQL, is that the constructs integrate smoothly with the Erlang language.
To illustrate the Mnemosyne query language, we will show the Erlang
code for the subscriber line and subscriber number tables
discussed above. We define two tables subscriber
and
line
. Their corresponding record declarations in the
file subscriber.hrl
are:
-record(subscriber, {snb, cost_limit, li}).
-record(line, {li, state}).
The query "which subscriber numbers are blocked?" can also be expressed as "which subscribers have lines which are in state 'blocked'". This query can be coded as follows:
query [ S.snb || % collect the subscriber number S <- table(subscriber), % where S is taken from the subscriber table L <- table(line), % and L is taken from the line table L.state = blocked, % and the state of that line is blocked L.li = S.li % and L and S uses the same li ] end
In the example above, the aim is to get an answer from a logical relation. Consider also the following example:
query [E.name || E <- table(employee), E.sex = female] end
This means "the Erlang list of all female employees". A formulation
closer to the list comprehension is: "the Erlang list of all
names of E such that E is in the table
employee
and E's sex is female".
Some words have a direct correspondence to the elements in the list comprehension notation:
the Erlang list of all |
[ ]
|
such that |
||
|
is in |
<-
|
and |
,
|
Another query component is rules, which can be compared to views in Relational Databases, where the purpose is to define a "virtual table". This "table" looks like an ordinary table to the user, which means that queries can be formulated on stored data (as well as views). In the subscriber line example, a rule can give the subscriber number and the corresponding line status from both tables, and there is no need to create a third table.
The rule is a definition of how to calculate the records on demand from a query. In Erlang modules, rules are written with the same syntax as the bodies in the query list comprehensions. They are exported and can be used by other modules.
Our subscriber
example formulated as a rule would look as
follows:
blocked_subscribers(S, subscriber) :- S <- table(subscriber), L <- table(line), L.state = blocked, L.li = S.li.
This rule can be used in a query in the same manner as a table
but with the keyword rule
substituted for table
.
query [ S.snb || S <- rule(blocked_subscribers) ] end
Database queries can be included in an Erlang program, but there must be a directive in the Erlang file which informs the compiler about its behavior towards queries. This directive is:
-include_lib("mnemosyne/include/mnemosyne.hrl").
The high-level syntax of the query list comprehension is:
query [ <pattern> || <body> ] end
The <body>
is a comma-separated sequence of:
<logical-variable> <- table( <table-name> [ , <table-type> ] )
<logical-variable> <- rule( <rule-name> )
<logical-variable> <- rule( <rule-name> )
<logical-variable> <- rule( <module> : <rule-name> ])
<logical-variable> <- <erlang-list-expression>
<expression> <relop> <expression>
<erlang-test-expression>
The <relop>
operators are:
=
for unification
/=
for not unification
<
for less than
>
for greater than
=<
for equal to or less than
>=
for equal to or greater than.
A <logical-variable>
is written exactly as an Erlang
variable. The <table-name>
, <table-type>
,
<rule-name>
and
<module>
are atoms. The <table-name>
and
<table-type>
can also be
an Erlang variable. The logical variables are local to a list
comprehension and shadows any Erlang variable with the same
name.
The <pattern>
is an Erlang term without function calls. It
may contain (bound) Erlang variables and it usually has one or
more <logical-variable>
, since these are used to get
data out from the query body and into the produced list.
An <expression>
is any Erlang expression which may include
function calls and <logical-variable>
. The variant
<erlang-list-expression>
is an
<expression>
which must produce a list where all elements are
records of the same type.
The <erlang-test-expression>
is an <expression>
which has the values true
or false
.
Erlang variables are allowed in all variations of <expression>
and in <pattern>
. They must be bound in the query list
comprehension.
The constructs used in the Mnemosyne query language have the following meanings:
<logical-variable> <- ...
. This expression means
that the variable is taken from the values in the expression
to the right of the arrow. For example, E <- [#e{a=1},
#e{a=2}]
says that E
takes the values
#e{a=1}
, or #e{a=2}
<-
. These constructs usually generate
values. However, if the logical variable is bound it tests
that value. If a test fails it means that the query tries
another alternative. For example:
query [ X.a || X <- [#e{a=1}, #e{a=2}], X <- [#e{a=3}], .... ] endThe body means that the field 'a' of
X
should be 3 and at
the same time either 1 or 2. So the list of solutions will
always be empty. The test <expression> <relop> <expression>
and the
true
-or-false
returning test
<erlang-test-expression>
simply filters out the solutions. The
purpose of the latter test is to provide user defined data tests.
We will next consider the logical variables associated
records in an expression like x <- table(a)
. We
have already established the following rules and assumptions:
<logical-variable>
must have the same record association as the records produced by the right side of the <-
constructs.
In the example X <- table(a)
, the associated record of
x is a because table a stores
records with the name a. Since release 3.4 of Mnesia it has
been possible to separate record name and its table type.
If the type of the table is different from its name, this can
be specified in Mnemosyne using X <- table(Name, Type)
where Name is the Name of the table and Type is the record name.
Similar to tables, rules produce or test records. The return type
for a rule is by default the name of the rule. Rules can be declared
to return other types.
This makes it possible to construct a rule for some
special cases with a name like blocked_subscriber
which
still produces or tests subscriber records.
In Erlang we must always tell the compiler which record definition
it should use by putting the record name after a hash mark. In
general, this is not needed in Mnemosyne since, in most cases,
the query
compiler can deduce the associated record. That
is the reason S.li
is acceptable instead of the
full S#subscriber.li
. It will not cause an error
if the longer version
was written, but if we do write the record name it must be the
same record name as the one the query compiler deduces. Sometimes
the compiler is unable to find the
associated record. When this happens, an error message is
issued.
It is also preferred to write out the type of the associated record
for performance reasons. If the associated record is part of a complex
constraint, the constraint may be compiled to a function if the
type of the associated record is known (explicitly or
deducable) at Erlang compile time.
![]() |
A function used in a query list comprehension must never directly or indirectly:
|
A rule is composed of clauses and each clause has the structure:
<head> :- <body>
<head>
looks like an Erlang function with one or two arguments,
where the first argument is a variable and the second, optional, argument
an atom. If there is a second argument, it must be present in all clauses and
have the same value.
<body>
has the same syntax as the <body>
. in query list comprehensions
<rule-name> (<return-var>, <record-name>)The syntax used in previous mnemosyne versions, by declaring the the associated recordtype with an
argtype
declaration still works but
is depreciated.
![]() |
The |
Review the rule example.
blocked_subscribers(S, subscriber) :- S <- table(subscriber), L <- table(line), L.state = blocked, L.li = S.li.
It produces a list of subscriber
records.
Rules with a single argument return records of the same
type as the name of the rule. For example, the following
rule produces records of type blocked
-record (blocked, {snb, li}). blocked (X) :- S <- table (subscriber), L <- table(line), L.state = blocked, L.li = S.li, X = #blocked{snb=S#subscriber.snb, li=S#subscriber.li}.
The previous sections described how to define queries. This section describes how to evaluate queries.
The principle is simple: query list comprehensions, compile and optimize the query and return a handle. This handle is then passed on for execution:
Handle = query [ S.snb || S <- table(subscriber), S.li = none] end, AllAnswers = mnesia:transaction( fun() -> mnemosyne:eval(Handle) end)
There are three ways of evaluating queries. The mnemosyne:eval/1
is the simplest of the three. It takes a handle and returns all
solutions. Sometimes we only need to view a few solutions,
examine them and possibly get more. Think of an airline routing
database: you do not want to know all possible connections
between two cities, but usually enough information is given
after observing one or two.
Use the cursor with a query evaluation to produce a few
solutions only. With a handle we create a cursor by calling
mnemosyne:cursor/1
. With the cursor we can repeatedly
call mnemosyne:next_answers
to get more solutions. When
an empty list is returned there are no more possible solutions.
Delete the cursor with mnemosyne:delete_cursor/1
.
Handle = query [ S.snb || S <- table(subscriber), S.li = none] end, AFewAnswers = mnesia:transaction( fun() -> Cursor = mnemosyne:cursor(Handle), % now get at least two but not % more than five solutions: L = mnemosyne:next_answers(Cursor,2,5), mnemosyne:delete_cursor(Cursor), L end)
A query evaluation can be time consuming, but can be broken up by using the cursor with setup_query/1
and init_query/1
:
Handle = query [ S.snb || S <- table(subscriber), S.li = none] end, QuerySetup = mnemosyne:setup_query(Handle), AFewAnswers = mnesia:transaction( fun() -> Cursor = mnemosyne:init_query(QuerySetup), mnemosyne:next_answers(Cursor, 5, 5) end), % Here we may call more init_query-next_answers constructions % with the same Handle. Note that the query is evaluated from % "scratch" because of the call to mnemosyne:init_query/1. mnemosyne:delete_query(QuerySetup)
Because of table updates, a query which is compiled and optimized
may be incorrect when the handle returns. This can be rectified with
the function mnemosyne:reoptimize/1
which takes a handle,
re-optimizes the query and returns a new handle.
This section describes an example which illustrates the use of Mnemosyne. The example given is of a simplified local exchange, with AXE-10 exchange as a model. The purpose of this section is to show different constructs in a telecom environment. It should not be taken as a proposed data model for a modern telecom system.
Our telephone example includes the following components, relationships, and events:
We identify three tables:
subscriber
with subscriber numbers snb
, line interface number li
, and a maximum cost cost_limit
which must not be exceeded.
line
with line interface number li
, and its state
.
account
, a table which stores the cost of calls. It has an snb
field, and the accumulated cost in cost
.
The corresponding record definitions are stored in a file named subscriber.hrl
, which has the following record definitions:
-record(subscriber, {snb, cost_limit, li}).
-record(line, {li, state}).
-record(account, {snb, cost}).
The program file is titled subscriber.erl
. It declares the module name subscriber
, calls the record definition in subscriber.hrl
, and Mnesia query support mnemosyne.hrl
.
-module(subscriber). -compile(export_all). -include("subscriber.hrl"). -include_lib("mnemosyne/include/mnemosyne.hrl").
We then create the required tables and load data by entering table definitions into a file named subscriber.tables
, which has the following content:
{tables, [{subscriber, [{attributes, [snb,cost_limit,li]}]}, {line, [{attributes, [li, state]}]}, {account, [{attributes, [snb, cost]}]} ] }. %% Subscribers {subscriber, 1230, 0, none}. {subscriber, 1231, 0, none}. {subscriber, 1232, 0, none}. {subscriber, 1233, 0, none}. {subscriber, 1234, 100, {li,1}}. {subscriber, 1235, 200, {li,3}}. {subscriber, 1236, 150, {li,2}}. {subscriber, 1237, 0, none}. {subscriber, 1238, 0, none}. {subscriber, 1239, 0, none}. %% Lines {line, {li,0}, blocked}. {line, {li,1}, normal}. {line, {li,2}, normal}. {line, {li,3}, blocked}. {line, {li,4}, blocked}. {line, {li,5}, blocked}. {line, {li,6}, blocked}. {line, {li,7}, blocked}. %% Accounts {account, 1234, 0}. {account, 1235, 0}. {account, 1236, 0}. {account, 1237, 0}.
In our program, this file is called with the statement:
mnesia:load_textfile("subscriber.tables")
To retrieve a list of all free subscriber numbers we call the following function in a transaction:
free_subscriber_numbers() -> mnemosyne:eval( query [ S.snb || S <- table(subscriber), S.li = none] end ).
The rule too_high_cost/0
locates and returns all subscribers with an
accumulated cost that exceeds their limit:
limit_exceeded(S, subscriber) :- S <- table(subscriber), A <- table(account), A.snb = S.snb, A.cost > S.cost_limit.
We could find all subscriber numbers of subscribers who have exceeded their cost limit as follows:
Q = query [ S.snb || S <- rule(limit_exceeded) ] end
Mnesia provides the programmer with a method of matching objects against a pattern. This is the Mnesia matching function:
mnesia:match_object(Pattern) ->transaction abort | ObjList
.
This function matches Pattern
for objects. A Pattern
is a tuple with the name (identity) of the table as the first element.
The table collates all data retrieved.
In comparison to a list comprehension query, mnesia:match_object
is a low level
function. The following two functions both return the same
objects; however, the second example uses matching.
f1() -> Q = query [E || E <- table(employee), E.sex = female] end, F = fun() -> mnemosyne:eval(Q) end, mnesia:transaction(F).
and
f2() -> WildPat = mnesia:table_info(employee, wild_pattern), Pat = WildPat#employee{sex = female}, F = fun() -> mnesia:match_object(Pat) end, mnesia:transaction(F).
The pattern supplied to the mnesia:match_object/1
function must be a valid record, and the first element
of the provided tuple must be a valid table name. The
special element '_'
matches all the records.
There are advantages in using the Mnemosyne query syntax
instead of the mnesia:match_object/1
function:
f2/0
function.
mnesia:match_object/1
function provides are limited in both scope and number.
The mnesia:match_object
function is also performed
during run time, which in turn reduces performance.
It is also possible to use the match function if we want to check the equality of different attributes. Assume we have the following record definition:
-record(foo, {a, b, c}).
The pattern {foo, '$1', '$1', '_'}
then extracts all objects of type
foo
where the first two attributes have the same value.
If the key attribute is bound in a pattern, the match operation is very efficient.
The pattern {foo, 123, '_', elvis}
can be used to extract all
objects with key 123
, and the last attribute set to the atom
elvis
. This is the same as extracting all the elvis
objects from the result of mnesia:read({foo, 123})
, but
more efficient.
If the key attribute in a pattern is given as '_'
, or '$1'
, the
whole foo
table must be searched for objects that match. If the table
is large, this may be a time consuming operation. This can be remedied with
indices (refer to Mnesia Chapter 5: Indexing for more information).
This chapter closes with an example of information extraction from a Company database:
X
.
Dep
department.
The first example demonstrates the query execution with list comprehension notation. The second example illustrates a query coded with a matching function.
The list comprehension based implementation looks as follows:
get_emps(Salary, Dep) -> Q = query [E || E <- table(employee), At <- table(at_dep), E.salary > Salary, E.emp_no = At.emp, At.dept_id = Dep] end, F = fun() -> mnemosyne:eval(Q) end, mnesia:transaction(F).
The data model for the Company database introduced in the Mnesia documentation was designed to facilitate the posting of queries like the one shown above.
To implement the same query by directly searching the database is more complex. The following function does precisely this:
get_emps2(Salary, Dep) -> Epat = mnesia:table_info(employee, wild_pattern), Apat = mnesia:table_info(at_dep, wild_pattern), F = fun() -> All = mnesia:match_object(Epat), High = filter(All, Salary), Alldeps = mnesia:match_object(Apat), filter_deps(High, Alldeps, Dep) end, mnesia:transaction(F). filter([E|Tail], Salary) -> if E#employee.salary > Salary -> [E | filter(Tail, Salary)]; true -> filter(Tail, Salary) end; filter([], _) -> []. filter_deps([E|Tail], Deps, Dep) -> case search_deps(E#employee.name, Deps, Dep) of true -> [E | filter_deps(Tail, Deps, Dep)]; false -> filter_deps(Tail, Deps, Dep) end; filter_deps([], _,_) -> []. search_deps(Name, [D|Tail], Dep) -> if D#at_dep.emp == Name, D#at_dep.dept_id == Dep -> true; true -> search_deps(Name, Tail, Dep) end; search_deps(Name, Tail, Dep) -> false.
The function mnesia:match_object/1
will automatically
make use of indices if any exist. However, no heuristics are
performed in order to select the best index, if
more than one exists.
As can be seen, the list comprehension provides a more elegant solution.
There is a difference when matching record fields in a
mnemosyne list comprehension and in Erlang in general
(for example, a function clause header). The following code
returns true for all employee
where emp_id
is 312
or 400
:
test_employee(#employee{emp_id = 312}) -> true; test_employee(#employee{emp_id = 400}) -> true; test_employee(_) -> false.
That is, it does not check other fields of the employee record. Compare that with the following mnemosyne query:
query [E || E <- table(employee), E <- [#employee{emp_id=312}, #employee{emp_id=400}]]
The query will return all employees from the employee table
whos emp_id
is either 312
or 400
and have the other fields set to the default values
for an employee.
To select all items that have a field set to some values
(disregarding the other fields) the constraint can be put in
separate function. For example, select all employees whos emp_id
is either 312
or 400
independently of other fields:
query [E || E <- table(employee), test_employee(E)] test_employee(#employee{emp_id = 312}) -> true; test_employee(#employee{emp_id = 400}) -> true; test_employee(_) -> false.
If there is only one acceptable value for a record field
it is more efficient to write it
directly in the query. Select employees whos emp_id
is 312:
query [E || E <- table(employee), E#employee.emp_id = 312]