Category expressions
A category expression is a text expression that specifies a subset of categories in a categorical variable. The Function Library uses an MDM document to resolve the expression to a simple list of category numbers. See Value resolution for more information.
A category expression has the form:
{x1 .. y1, x2 .. y2, ... }
Where x .. y defines a range of categories in the category list in the categorical variable to which the expression is applied. x and y can be names or non-negative integer values. (Names and values can be distinguished because names cannot begin with a digit.) You can use more than one range to add to the set of categories. If two ranges overlap, each category is included only once.
Order
The resolved list of category numbers is always in the order of the category list in the variable, so the order in which the ranges are specified has no effect. Provided that x and y are valid categories in the original list, {x .. y} is the same as {y .. x}. Similarly, {x1..y1, x2..y2} is the same as {x2..y2, x1..y1}.
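The ordering and overlap rules can be illustrated with a minimal Python sketch. This is not the Function Library's implementation: the function name resolve_ranges is hypothetical, the category list is modeled as a plain list of names, and the sketch returns names rather than category numbers. It assumes every endpoint exists in the list.

```python
def resolve_ranges(categories, ranges):
    """Resolve inclusive (x, y) name pairs against an ordered category list.

    Hypothetical helper for illustration only. Assumes every x and y
    is present in `categories`.
    """
    position = {name: i for i, name in enumerate(categories)}
    selected = set()
    for x, y in ranges:
        # {x..y} and {y..x} select the same categories.
        low, high = sorted((position[x], position[y]))
        selected.update(range(low, high + 1))
    # The result follows the category-list order; the set removes duplicates
    # that overlapping ranges would otherwise produce.
    return [categories[i] for i in sorted(selected)]


cats = ["a", "b", "c", "d", "e"]
print(resolve_ranges(cats, [("d", "b")]))              # ['b', 'c', 'd']
print(resolve_ranges(cats, [("d", "e"), ("a", "b")]))  # ['a', 'b', 'd', 'e']
print(resolve_ranges(cats, [("a", "c"), ("b", "d")]))  # ['a', 'b', 'c', 'd']
```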
Braces
The enclosing braces, { }, are optional, but they must be used as a pair: an opening brace without a closing brace, or a closing brace without an opening brace, is not allowed. If the expression is enclosed in braces, everything following the closing brace is ignored.
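As a rough illustration of the brace rules, the following Python sketch strips an optional pair of braces and discards anything after the closing brace. The function name strip_braces is hypothetical. How the Function Library reports an unpaired brace is not stated here, so raising ValueError is an assumption, as is rejecting a brace that appears in an expression that does not start with one.

```python
def strip_braces(expression):
    """Return the text between optional enclosing braces.

    Hypothetical helper for illustration only. Raising ValueError for an
    unpaired brace is an assumption, not documented behavior.
    """
    text = expression.strip()
    if text.startswith("{"):
        close = text.find("}", 1)
        if close == -1:
            raise ValueError("opening brace without a closing brace")
        # Everything after the closing brace is ignored.
        return text[1:close]
    if "{" in text or "}" in text:
        raise ValueError("brace without its partner")
    return text


print(strip_braces("{a..c} trailing text"))  # a..c
print(strip_braces("a..c"))                  # a..c
```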
Missing categories
If x does not exist in the variable, it is replaced with the first category in the category list. Similarly, if y does not exist in the variable, it is replaced with the last category in the category list.
Incomplete ranges
If x is omitted and the range is specified as {.. y}, the beginning of the range is taken to be the first category in the category list. Similarly, if y is omitted and the range is specified as {x ..}, the end of the range is taken to be the last category in the category list. If both x and y are omitted and the range is specified as {..}, the entire category list is included.
No range operator
You can include individual categories without the range operator--for example, {x1, x2}. Individual categories are treated as a range, so {x1, x2} is the same as {x1..x1, x2..x2}, except when one of the specified categories does not exist. When x1 does not exist, {x1} results in an empty list, whereas {x1..x1} results in the complete list.
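The endpoint rules for missing categories, incomplete ranges, and single categories can be sketched in Python as follows. This is an illustration under stated assumptions, not the library's code: resolve_element and its parameters are hypothetical, categories are modeled as a list of names, and None stands for an omitted endpoint.

```python
def resolve_element(categories, x=None, y=None, is_range=True):
    """Resolve one inclusive element to a list of category names.

    Hypothetical helper for illustration only. `x` or `y` set to None
    models an omitted endpoint, as in {..y} or {x..}.
    """
    position = {name: i for i, name in enumerate(categories)}
    if not is_range:
        # A single category that does not exist contributes nothing.
        return [x] if x in position else []
    # An omitted or nonexistent x falls back to the first category,
    # an omitted or nonexistent y to the last category.
    low = position.get(x, 0)
    high = position.get(y, len(categories) - 1)
    low, high = sorted((low, high))
    return categories[low:high + 1]


cats = ["a", "b", "c", "d", "e"]
print(resolve_element(cats, "zz", "c"))             # ['a', 'b', 'c']
print(resolve_element(cats, "d", None))             # ['d', 'e']
print(resolve_element(cats))                        # ['a', 'b', 'c', 'd', 'e']
print(resolve_element(cats, "zz", is_range=False))  # []
print(resolve_element(cats, "zz", "zz"))            # ['a', 'b', 'c', 'd', 'e']
```

The last two calls show the difference described above: a missing single category gives an empty list, whereas the equivalent range with two missing endpoints expands to the complete list.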
Exclusive ranges
You can prefix an element with ^ to exclude categories. For example, {x1..y1, ^x2..y2} specifies that all of the categories in the range x2..y2 are to be excluded from the result. Exclusive ranges remove categories from the result of the inclusive ranges, so an exclusive range that does not overlap any inclusive range has no effect.
If the category expression contains only exclusive ranges, the categories are excluded from the entire category list, so {^x1..y1} is the same as {.., ^x1..y1}. However, if an expression has both inclusive and exclusive ranges, as in {x1, ^x2..y2}, and the inclusive ranges evaluate to nothing (for example, because x1 does not exist), the exclusive range is not applied to the entire category list and the result is an empty list.
The order in which you specify inclusive and exclusive ranges does not matter, so {x1..y1, ^x2..y2} is the same as {^x2..y2, x1..y1}. An exclusive range always takes precedence over an inclusive range: an excluded category is omitted from the result, regardless of how many inclusive ranges contain it.
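The way inclusive and exclusive elements combine can be sketched in Python. The function name combine is hypothetical, and the sketch assumes each element has already been resolved to a list of category names. The key distinction is between an expression with no inclusive elements at all and one whose inclusive elements resolved to nothing.

```python
def combine(categories, inclusive, exclusive):
    """Combine already-resolved inclusive and exclusive elements.

    Hypothetical helper for illustration only. `inclusive` and `exclusive`
    are lists of elements, and each element is a list of category names.
    """
    if inclusive:
        # At least one inclusive element was written, even if it is empty.
        base = set().union(*inclusive)
    else:
        # Only exclusive elements: exclude from the entire category list.
        base = set(categories)
    removed = set().union(*exclusive)
    # Exclusion wins regardless of how often a category is included.
    return [c for c in categories if c in base and c not in removed]


cats = ["a", "b", "c", "d", "e"]
print(combine(cats, [["a", "b", "c", "d"]], [["c", "d", "e"]]))  # ['a', 'b']
print(combine(cats, [], [["b", "c"]]))                           # ['a', 'd', 'e']
print(combine(cats, [[]], [["b", "c"]]))                         # []
```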
Special cases
The syntax has been designed to handle a number of special cases so that runtime errors are minimized. This is a summary of these special cases:
{ }
Gives an empty list.
{ x1, }
Considered as two single-category ranges. The second range is considered to be missing and is ignored. x1 is handled in the normal way.
{^x1, }
Always gives an empty list, because the second range is inclusive and evaluates to nothing.
{^x1, ^}
The second range is ignored and x1 is excluded from the entire category list.
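The special cases above can be reproduced with a small end-to-end Python sketch. It is a simplified model rather than the Function Library's algorithm: the function name resolve is hypothetical, categories are looked up by name only (integer values are not handled), the result is a list of names instead of category numbers, and the category list is assumed to be non-empty.

```python
def resolve(expression, categories):
    """Resolve a category expression to names in category-list order.

    Hypothetical sketch for illustration only. Assumes a non-empty
    category list and name-based lookup.
    """
    text = expression.strip()
    if text.startswith("{"):
        text = text[1:text.index("}")]  # ignore anything after the brace
    position = {name: i for i, name in enumerate(categories)}
    included, excluded = set(), set()
    has_inclusive = False
    for part in text.split(","):
        part = part.strip()
        exclude = part.startswith("^")
        if exclude:
            part = part[1:].strip()
        if ".." in part:
            x, y = (side.strip() for side in part.split("..", 1))
            low = position.get(x, 0)
            high = position.get(y, len(categories) - 1)
            low, high = sorted((low, high))
            hits = set(range(low, high + 1))
        else:
            # An empty or unknown single category resolves to nothing.
            hits = {position[part]} if part in position else set()
        if exclude:
            excluded |= hits
        else:
            has_inclusive = True
            included |= hits
    base = included if has_inclusive else set(range(len(categories)))
    return [categories[i] for i in sorted(base - excluded)]


cats = ["a", "b", "c", "d", "e"]
print(resolve("{ }", cats))       # []
print(resolve("{ b, }", cats))    # ['b']
print(resolve("{^b, }", cats))    # []
print(resolve("{^b, ^}", cats))   # ['a', 'c', 'd', 'e']
```

In this model an empty element after a comma still counts as an inclusive element, which is why {^b, } yields an empty list while {^b, ^} excludes b from the entire list.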
Whitespace
Whitespace is optional and is allowed anywhere around the operators .. ^ , { }. Whitespace and special characters anywhere else are considered part of a token, and the token is used to find the category in the original list.
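A minimal Python sketch of this rule for a single element: whitespace next to the ^ and .. operators is dropped, while whitespace inside a token is kept as part of the lookup text. The function name parse_element is hypothetical, and the sketch assumes the element has already been separated from its braces and commas.

```python
def parse_element(element):
    """Split one element into an exclusion flag and its tokens.

    Hypothetical helper for illustration only. Returns (exclude, tokens),
    where tokens has two entries for a range and one for a single category.
    """
    text = element.strip()
    exclude = text.startswith("^")
    if exclude:
        text = text[1:].strip()
    if ".." in text:
        # Whitespace around the range operator is not part of the tokens.
        return exclude, [side.strip() for side in text.split("..", 1)]
    return exclude, [text]


print(parse_element(" ^ a .. c "))  # (True, ['a', 'c'])
print(parse_element("a b..c"))      # (False, ['a b', 'c'])
```

In the second call the space inside "a b" is preserved, so the lookup would search for a category with exactly that text.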