groonga - An open-source fulltext search engine and column store.

8.11.9. sub_filter

8.11.9.1. Summary

sub_filter evaluates query_string in match_column context.

sub_filter can be used in only --filter in select.

8.11.9.2. Syntax

sub_filter requires two arguments - match_column and query_string.

sub_filter(match_column, query_string)

8.11.9.3. Usage

Here are a schema definition and sample data to show usage.

Sample schema:

Execution example:

table_create Comment TABLE_PAT_KEY UInt32
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Comment name COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Comment content COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Blog TABLE_PAT_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Blog title COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Blog content COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Blog comments COLUMN_VECTOR Comment
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Comment blog_comment_index COLUMN_INDEX Blog comments
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Lexicon TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon comment_content COLUMN_INDEX|WITH_POSITION Comment content
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon comment_name COLUMN_INDEX|WITH_POSITION Comment name
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon blog_content COLUMN_INDEX|WITH_POSITION Blog content
# [[0, 1337566253.89858, 0.000355720520019531], true]

Sample data:

Execution example:

load --table Comment
[
{"_key": 1, "name": "A", "content": "groonga"},
{"_key": 2, "name": "B", "content": "groonga"},
{"_key": 3, "name": "C", "content": "rroonga"},
{"_key": 4, "name": "A", "content": "mroonga"},
]
# [[0, 1337566253.89858, 0.000355720520019531], 4]
load --table Blog
[
{"_key": "groonga's blog", "content": "content of groonga's blog", comments: [1, 2, 3]},
{"_key": "mroonga's blog", "content": "content of mroonga's blog", comments: [2, 3, 4]},
{"_key": "rroonga's blog", "content": "content of rroonga's blog", comments: [3]},
]
# [[0, 1337566253.89858, 0.000355720520019531], 3]
Here is the simple usage of sub_filter function which extracts the blog entry
user 'A' commented out.

Execution example:

select Blog --output_columns _key --filter "comments.name @ \"A\" && comments.content @ \"groonga\""
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ]
#       ],
#       [
#         "groonga's blog"
#       ],
#       [
#         "mroonga's blog"
#       ]
#     ]
#   ]
# ]

When executing above query, not only "groonga's blog", but also "mroonga's blog". This is not what you want because user "A" does not mention "groonga" to "mroonga's blog".

Without sub_filter, it means that following conditions are met.

  • There is at least one record that user "A" commented out.
  • There is at least one record that mentioned about "groonga".

Execution example:

select Blog --output_columns _key --filter 'sub_filter(comments, "name @ \\"A\\" && content @ \\"groonga\\"")'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ]
#       ],
#       [
#         "groonga's blog"
#       ]
#     ]
#   ]
# ]

On the other hand, executing above query returns the intended result. Because the arguments of sub_filter is evaluated in comments column's context.

It means that sub_filter requires following condition is met.

  • There are the records that user "A" mentions about "groonga".

8.11.9.4. Parameters

There are two required parameter, match_column and query_string.

8.11.9.4.1. match_column

It specifies match_column equivalent parameter.

See match_columns about match_column.

8.11.9.4.2. query_string

It specifies query equivalent parameter.

See query about query string.

8.11.9.5. Return value

sub_filter returns a value of boolean (true or false).

8.11.9.6. See also