Chapter 2 Structure of Global Variables (I)

This chapter describes the logical view of global variables and outlines how they are physically stored on disk.

Logical structure of global variables

Global variables are named multidimensional arrays stored in a physical InterSystems IRIS® database. Within an application, the mapping of global variables to physical databases is based on the current namespace; a namespace provides a logical, unified view of one or more physical databases.

Global naming conventions and restrictions

A global name specifies its target and purpose. There are two types of global variables, plus a separate set of variables called "process-private globals":

  • Global – this is what is called a standard global; these are often referred to simply as global variables. It is a persistent multidimensional array that resides in the current namespace.
  • Extended global reference – this is a reference to a global located in a namespace other than the current one.
  • Process-private global – this is an array variable that is accessible only to the process that created it.

The naming convention for global variables is as follows:

  • Global variable names begin with a caret (^) prefix. This caret distinguishes a global variable from a local variable.
  • The first character after the caret (^) prefix in a global variable name can be:
    • A letter or the percent character (%) – for standard global variables only. For global variable names, a letter is defined as an alphabetic character in the range ASCII 65 through ASCII 255. If a global name begins with "%" (but not "%Z" or "%z"), the global is reserved for InterSystems IRIS system use. % globals are typically stored in the IRISSYS or IRISLIB databases.
    • A vertical bar (|) or a left square bracket ([) – indicates an extended global reference or a process-private global. Which one depends on the characters that follow.
  • The other characters of a global variable name can be letters, numbers, or the period (.) character.

The percent (%) character cannot be used except as the first character of a global name. The period (.) character cannot be the last character of a global name.

  • Global names can be up to 31 characters long (not counting the caret prefix). You can specify a longer global name, but InterSystems IRIS treats only the first 31 characters as significant.
  • Global names are case sensitive.
  • InterSystems IRIS imposes a limit on the total length of global references, which in turn imposes a limit on the length of any subscript value.
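
The length and case rules above can be illustrated with a short sketch. The global names below are hypothetical and chosen only so that the first name is exactly 32 characters long (31 significant characters plus one more); the behavior shown follows from the rules as stated in this section.

```objectscript
 // Hypothetical names, for illustration only.
 // The first 31 characters after the caret are identical in both names.
 SET ^abcdefghijklmnopqrstuvwxyz12345X="first"
 // Differs only at character 32, so by the 31-character rule
 // this refers to the same global as the line above.
 WRITE ^abcdefghijklmnopqrstuvwxyz12345Y,!
 // Names are case sensitive: these are two different globals.
 SET ^demoCase="lower"
 SET ^DemoCase="upper"
 WRITE ^demoCase,!
 WRITE ^DemoCase,!
 // Clean up the scratch globals
 KILL ^abcdefghijklmnopqrstuvwxyz12345X,^demoCase,^DemoCase
```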

In the IRISSYS database, InterSystems reserves all global names for itself except those beginning with "z", "Z", "%z", and "%Z". In all other databases, InterSystems reserves all global names beginning with "ISC." and "%ISC.".

Example global names and their usage

Here are examples of various global names and how each name is used:

  • ^globalname – a standard global variable
  • ^|"environment"|globalname – environment syntax for an extended global reference
  • ^||globalname – a process-private global variable
  • ^|"^"|globalname – a process-private global variable
  • ^[namespace]globalname – bracket syntax for an explicit namespace in an extended global reference
  • ^[directory,system]globalname – bracket syntax for an implied namespace in an extended global reference
  • ^["^"]globalname – a process-private global variable
  • ^["^",""]globalname – a process-private global variable
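
As a brief sketch of how some of these forms look in use: the example below assumes a namespace named USER exists on the instance (an assumption; substitute any namespace you have), and the global names ^||Scratch and DemoExt are hypothetical.

```objectscript
 // Process-private global: visible only to the current process
 SET ^||Scratch("step")=1
 WRITE ^||Scratch("step"),!
 // Extended reference with bracket syntax (assumes a USER namespace)
 SET ^["USER"]DemoExt="stored via USER"
 // The same global, read back with environment (vertical bar) syntax
 WRITE ^|"USER"|DemoExt,!
 // Clean up
 KILL ^["USER"]DemoExt
 KILL ^||Scratch
```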

Note: Global names can contain only valid identifier characters; by default, these are the characters described above. However, your NLS (National Language Support) configuration may define a different set of valid identifier characters. Global names cannot contain Unicode characters.

Therefore, the following are valid global names:

   SET ^a="The quick "
   SET ^A="brown fox "
   SET ^A7="jumped over "
   SET ^A.7="the lazy "
   SET ^A1B2C3="dog's back."
   WRITE ^a,^A,^A7,!,^A.7,^A1B2C3
   KILL ^a,^A,^A7,^A.7,^A1B2C3 // keeps the database clean

Introduction to global nodes and subscripts

A global typically has multiple nodes, each generally identified by a subscript or a set of subscripts. Here is a basic example:

 set ^Demo(1) ="Cleopatra"

This statement references the global node ^Demo(1), which is one node within the ^Demo global. This node is identified by one subscript.

Here’s another example:

 set ^Demo("subscript1","subscript2","subscript3")=12

This statement refers to the global node ^Demo("subscript1","subscript2","subscript3"), which is another node in the same global. This node is identified by three subscripts.

Here’s another example:

 set ^Demo="hello world"

This statement references the global node ^Demo, which does not use any subscripts.

The global nodes form a hierarchical structure. ObjectScript provides commands that take advantage of this structure. For example, you can delete a node or delete a node and all its children.
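
To illustrate the difference between deleting a single node and deleting a node together with its children, here is a minimal sketch using the ObjectScript ZKILL and KILL commands and the $DATA function. The global ^TreeDemo is a hypothetical scratch global.

```objectscript
 // ^TreeDemo is a hypothetical scratch global
 KILL ^TreeDemo
 SET ^TreeDemo(1)="parent"
 SET ^TreeDemo(1,"a")="child a"
 SET ^TreeDemo(1,"b")="child b"
 // ZKILL removes only the value at ^TreeDemo(1); its descendants remain
 ZKILL ^TreeDemo(1)
 WRITE $DATA(^TreeDemo(1)),!   // 10: no value here, but descendants exist
 WRITE ^TreeDemo(1,"a"),!      // child a
 // KILL removes the node and everything beneath it
 KILL ^TreeDemo(1)
 WRITE $DATA(^TreeDemo(1)),!   // 0: nothing left
 KILL ^TreeDemo
```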

Global variable subscripts

Subscripts have the following rules:

  • Subscript values are case sensitive.
  • The subscript value can be any ObjectScript expression, provided the expression does not evaluate to the empty string (""). The value can include characters of all types, including spaces, non-printing characters, and Unicode characters. (Note that non-printing characters are of little practical use in subscript values.)
  • Before resolving a global reference, InterSystems IRIS evaluates each subscript the same way it evaluates any other expression. In the example below, we set one node of the ^Demo global and then reference that node in several equivalent ways:
 DHC-APP>s ^Demo(1+2+3)="a value"

 DHC-APP>w ^Demo(3+3)
 a value
 DHC-APP>w ^Demo(03+03)
 a value
 DHC-APP>w ^Demo(03.0+03.0)
 a value
 DHC-APP>set x=6

 DHC-APP>w ^Demo(x)
 a value
  • InterSystems IRIS imposes a limit on the total length of global references, which in turn imposes a limit on the length of any subscript value.

Note: The above rules apply to all collations supported by InterSystems IRIS. For older collations that are still in use for compatibility reasons, such as "pre-ISM-6.1", the subscript rules are more restrictive. For example, a character subscript cannot start with a control character, and there is a limit on the number of digits that can be used in an integer subscript.

Global variable nodes

In an application, nodes typically contain the following types of structures:

  1. String or numeric data, including native Unicode characters.
  2. A string with multiple fields separated by a special character:
 SET ^Data(10) = "Smith^John^Boston"

You can split this data into its fields using the ObjectScript $PIECE function.

  3. Multiple fields contained in an InterSystems IRIS $LIST structure. A $LIST structure is a string containing multiple length-encoded values. It does not require a special delimiter.
  4. An empty string (""). In cases where the subscripts themselves are used as the data, no data is stored in the actual node.
  5. A bitstring. If a global variable is used to store part of a bitmap index, the value stored in the node is a bitstring. A bitstring is a string containing a logical, compressed set of 1 and 0 values. You can construct a bitstring using the $BIT functions.
  6. Part of a larger data set. For example, the object and SQL engines store streams (BLOBs) as a sequential series of 32K nodes within a global. Through the stream interfaces, users of a stream are unaware that the stream is stored this way.

Note that no global node can contain a string longer than the string length limit, which is very large (3,641,144 characters in InterSystems IRIS).
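
The first few node structures listed above can be sketched as follows. This is only an illustration: ^NodeDemo is a hypothetical scratch global, and the field values are made up. It uses the standard $PIECE, $LISTBUILD, $LIST, $LISTLENGTH, $BIT, and $BITCOUNT functions.

```objectscript
 // ^NodeDemo is a hypothetical scratch global
 KILL ^NodeDemo

 // Delimited string: extract fields with $PIECE
 SET ^NodeDemo(10)="Smith^John^Boston"
 WRITE $PIECE(^NodeDemo(10),"^",1),!     // Smith
 WRITE $PIECE(^NodeDemo(10),"^",3),!     // Boston

 // $LIST structure: length-encoded fields, no delimiter needed
 SET ^NodeDemo(11)=$LISTBUILD("Smith","John","Boston")
 WRITE $LIST(^NodeDemo(11),2),!          // John
 WRITE $LISTLENGTH(^NodeDemo(11)),!      // 3

 // Bitstring: set bits 1 and 5, then count the bits that are 1
 SET $BIT(^NodeDemo(12),1)=1
 SET $BIT(^NodeDemo(12),5)=1
 WRITE $BITCOUNT(^NodeDemo(12),1),!      // 2

 KILL ^NodeDemo
```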

Global variable collation rules

Within a global, nodes are stored in sorted (collated) order.

Applications typically control the order in which nodes are sorted by applying a transformation to the values used as subscripts. For example, when the SQL engine creates an index on string values, it converts all string values to uppercase and prepends a space character, which ensures that the index is not case sensitive and collates as text (even if numeric values are stored as strings).
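
As a small sketch of both points, the example below walks one subscript level with $ORDER to show the stored order, then applies the kind of transformation described above using $ZCONVERT. The global ^CollDemo and the sample values are hypothetical, and the order noted afterward assumes the default collation.

```objectscript
 // ^CollDemo is a hypothetical scratch global
 KILL ^CollDemo
 SET ^CollDemo("banana")=""
 SET ^CollDemo(10)=""
 SET ^CollDemo("Apple")=""
 SET ^CollDemo(2)=""
 // Walk the first subscript level in stored (collated) order
 SET key=""
 FOR {
     SET key=$ORDER(^CollDemo(key))
     QUIT:key=""
     WRITE key,!
 }
 // A transformation like the one described: uppercase plus a leading space
 SET name="Smith"
 SET indexKey=" "_$ZCONVERT(name,"U")
 WRITE "[",indexKey,"]",!
 KILL ^CollDemo
```

With the default collation, the loop should list the numeric subscripts first in numeric order (2, then 10), followed by the string subscripts (Apple, then banana), regardless of the order in which the nodes were set.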

The maximum length of a global variable reference

The total length of a global variable reference (that is, a reference to a particular global node or subtree) is limited to 511 encoded characters, which may be fewer than 511 typed characters.

To determine the size of a given global variable reference conservatively, use the following guidelines:

  1. For the global variable name: add 1 for each character.
  2. For a purely numeric subscript: add 1 for each digit, sign, or decimal point.
  3. For a subscript that contains non-numeric characters: add 3 for each character.

If the subscript is not purely numeric, the actual length of the subscript varies depending on the character set used to encode the string. A multibyte character can take up to 3 bytes.

Note that an ASCII character can take up 1 or 2 bytes. If the collation performs case folding, an ASCII character can use 1 byte for the character and 1 byte for a disambiguation byte. If the collation does not perform case folding, an ASCII character takes 1 byte.

  4. For each subscript, add 1.

If the sum of these numbers is greater than 511, the reference is too long.
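
The guidelines above amount to simple arithmetic, which can be sketched as a helper method. The class name Demo.RefLength and the method name Estimate are hypothetical, not part of InterSystems IRIS. The sketch treats a subscript as purely numeric only when it is a canonical number (the test s = +s), which is an assumption made here for simplicity.

```objectscript
/// Hypothetical helper class, for illustration only
Class Demo.RefLength
{

/// Conservative estimate of the encoded length of a global reference.
/// name is the global name without the caret; subs are the subscript values.
ClassMethod Estimate(name As %String, subs... As %String) As %Integer
{
    // 1 per character of the global name
    SET total = $LENGTH(name)
    FOR i = 1:1:$GET(subs, 0) {
        SET s = subs(i)
        IF s = +s {
            // canonical number: 1 per digit, sign, or decimal point
            SET total = total + $LENGTH(s)
        } ELSE {
            // anything else: 3 per character
            SET total = total + (3 * $LENGTH(s))
        }
        // plus 1 for each subscript
        SET total = total + 1
    }
    QUIT total
}

}
```

For example, ##class(Demo.RefLength).Estimate("Demo","subscript1","subscript2","subscript3") gives 4 for the name plus 3 × (30 + 1) for the subscripts, or 97, which is well under 511. Estimate("Demo",6) gives 4 + 1 + 1 = 6.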

Because of the way this limit is determined, if you must use long subscripts or long global names, it helps to avoid a large number of subscript levels. Conversely, if you use many subscript levels, avoid long global names and long subscripts. Because you may have no control over the character set in use, it is useful to keep global names and subscripts short.

When in doubt about a particular reference, it is useful to create a test version of the global variable reference that is as long as (or even slightly longer than) the longest reference you expect. The results of these tests provide guidance on whether to revise your naming conventions before you build the application.
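
One way to run such a test is sketched below. It builds deliberately long subscripts and attempts to set a node inside a TRY block, so that any error raised for an over-long reference is caught and reported rather than halting the process. The global ^LenTest and the subscript length of 200 are hypothetical; substitute names and lengths that mirror your own longest expected reference.

```objectscript
 // ^LenTest is a hypothetical scratch global.
 // Build a 200-character subscript made of the letter x.
 SET longSub=$TRANSLATE($JUSTIFY("",200)," ","x")
 TRY {
     // Three long subscripts: well beyond the conservative estimate of 511
     SET ^LenTest(longSub,longSub,longSub)=1
     WRITE "reference accepted",!
 } CATCH ex {
     // ex.Name holds the name of the error that was raised
     WRITE "reference rejected: ",ex.Name,!
 }
 KILL ^LenTest
```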