Types

Casting is a way to convert a value of one type to a compatible type. Sometimes the conversion is exact; other times information is lost. Enumeration and parsing in q also fit into the cast pattern.

The type Operator

The non-atomic unary function (type) can be applied to any entity in q to return its data type expressed as a short. It is a “feature” of q that the data type of an atom is negative whereas the type of a simple list is positive.

q)type 42
-7h
q)type 10 20 30
7h
q)type 98.6
-9h
q)type 1.1 2.2 3.3
9h
q)type `a
-11h
q)type `a`b`c
11h
q)type "z"
-10h
q)type "abc"
10h

Type of a Variable

The type of a variable is the type of the value associated with the variable’s name.

q)a:42
q)type a
-7h
q)a:"abc"
q)type a
10h
q)value `.
a| "abc"

Cast

Since q is dynamically typed, casting occurs at run-time using the binary operator $, which is atomic in both operands. The right operand is the source value and the left operand specifies the target type. There are three ways to specify the target type:

  • A (positive) numeric short type value
  • A char type value
  • A type name symbol

Casts that Widen

In these examples, no information is lost in the cast, because the value is exactly representable in the target type (typically because the target type is wider than the source type). Here are examples using the short type specification in the target.

q)7h$42i / int to long
42
q)6h$42 / long to int
42i
q)9h$42 / long to float
42f

It is arguably most readable to use the symbolic type name.

q)`int$42
42i
q)`long$42i
42
q)`float$42
42f
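
The third way to specify the target, a type char, is not shown above. A minimal sketch, assuming the standard type chars "h", "i", "j" and "f" for short, int, long and float:

```q
/ type chars as the left operand of cast
"j"$42i    / int to long: 42
"i"$42     / long to int: 42i
"f"$42     / long to float: 42f
"h"$42     / long to short: 42h
```

The char form is the most compact of the three, at some cost in readability compared with the symbolic names.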

Casts across Disparate Types

The underlying value of a char is its position in the ASCII collation sequence, so we can cast char to and from integers, provided the integer is less than 256.

q)`char$42
"*"
q)`long$"\n"
10

The underlying value of a date is its count of days from the millennium, so we can cast to and from an int.

q)`date$0
2000.01.01
q)`int$2001.01.01 / millennium occurred on leap year
366i

Casts that Narrow

Some casts lose information. This includes the usual suspects of float to integer and wider integers to narrower ones.

q)`long$12.345
12
q)`short$123456789
32767h

Cast any numeric to a boolean using the C philosophy that zero is 0b and anything else is 1b.
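
A short sketch of the boolean cast described here; the particular input values are chosen only for illustration:

```q
/ zero of any numeric type becomes 0b, anything else becomes 1b
`boolean$0          / 0b
`boolean$0.0        / 0b
`boolean$123        / 1b
`boolean$-12.345    / 1b
`boolean$0 1 2      / cast is atomic in the right operand: 011b
```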

We can also extract constituents from complex types.

q)`date$2015.01.02D10:20:30.123456789
2015.01.02
q)`year$2015.01.02
2015i
q)`month$2015.01.02
2015.01m
q)`mm$2015.01.02
1i
q)`dd$2015.01.02
2i
q)`hh$10:20:30.123456789
10i
q)`minute$10:20:30.123456789
10:20
q)`uu$10:20:30.123456789
20i
q)`second$10:20:30.123456789
10:20:30
q)`ss$10:20:30.123456789
30i

Casting Integral Infinities

When integral infinities are cast to a wider integer type, their underlying bit patterns are simply reinterpreted. Since these bit patterns are legitimate values for the wider type, the cast results in a finite value.

q)`int$0Wh
32767i
q)`int$-0Wh
-32767i
q)`long$0Wi
2147483647
q)`long$-0Wi
-2147483647

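Coercing Types

Assignment into a simple list must strictly match the type of the list; a value of any other type is rejected.

q)L:10 20 30 40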
q)L[1]:42h
'type

This situation can arise when the list and the assignment value are created dynamically. Coerce the type by casting it to that of the target, provided of course that the cast is legitimate.

q)L[1]:(type L)$42h
q)L,:(type L)$43h
q)L
10 42 30 40 43

Cast is Atomic

Cast is atomic in the right operand.

q)"i"$10 20 30
10 20 30i
q)`float$(42j; 42i; 42j)
42 42 42f

Cast is atomic in the left operand.

q)`short`int`long$42
42h
42i
42
q)"ijf"$98.6
99i
99
98.6

Cast is atomic in both operands simultaneously.

q)"ijf"$10 20 30
10i
20
30f

Data to and from Text

A q string is a simple list of char.

Data to Strings

The function string can be applied to any q entity to produce a text representation suitable for console display or storage in a file. Here are the key features of string.

  • The result is always a list of char, never a single char. Thus you will see singleton char lists from single digits.

  • The result contains no q type indicators or other decorations. In general, the result is the most compact representation of the input, which may not actually be convertible (i.e., parsed) back to the original value.

  • Applying string to an actual string (i.e., list of char) probably will not give you what you want.

q)string 42
"42"
q)string 4
,"4"
q)string 42i
"42"
q)a:2.0
q)string a
,"2"
q)f:{x*x}
q)string f
"{x*x}"

The string function is clearly not atomic, since it returns a list even for an atom, but it does apply item-wise to the items of a list.

q)string 1 2 3
,"1"
,"2"
,"3"
q)string "string"
,"s"
,"t"
,"r"
,"i"
,"n"
,"g"
q)string (1 2 3; 10 20 30)
,"1" ,"2" ,"3"
"10" "20" "30"
q)string `Life`the`Universe`and`Everything
"Life"
"the"
"Universe"
"and"
"Everything"

Creating Symbols from Strings

To cast a char or a string to a symbol, use `$.

q)`$"abc"
`abc
q)`$"Hello World"
`Hello World

You can include any characters in a symbol this way, but you may need to escape them in the string.

q)`$"Zaphod \"Z\""
`Zaphod "Z"
q)`$"Zaphod \n"
`Zaphod

The unary `$ is atomic and will thus convert an entire list (or column) of strings to symbols.

q)`$("Life";"the";"Universe";"and";"Everything")
`Life`the`Universe`and`Everything

Parsing Data from Strings

The $ operator is overloaded to parse strings into data of any type exactly as the q interpreter does. This overload is invoked by using an uppercase type char as the target left operand and a string in the right operand. If the specified parse cannot be performed, a null of the target type is returned.

q)"J"$"42"
42
q)"F"$"42"
42f
q)"F"$"42.0"
42f
q)"I"$"42.0"
0Ni
q)"I"$" "
0Ni

Date parsing is flexible with respect to the format of the date.

q)"D"$"12.31.2014"
2014.12.31
q)"D"$"12-31-2014"
2014.12.31
q)"D"$"12/31/2014"
2014.12.31
q)"D"$"12/1/2014"
2014.12.01
q)"D"$"2014/12/31"
2014.12.31

Creating Typed Empty Lists

An empty general list can be given a type by casting it; values appended afterwards must match that type.

q)c1:`float$()
q)c1,:98.6

Notice that an operation that yields a simple list retains the type on an empty result.

q)0#10 20 30
`long$()

Enumerations

Traditional Enumerations

To begin, recall that in traditional languages, an enumerated type is a way of associating a series of names with a corresponding set of integral values. Often the sequence of numbers is consecutive and begins with 0. The association is usually given a name and represents a new type.

A traditional enumerated type serves multiple purposes.

  • It allows a descriptive name to be used instead of an arbitrary number — e.g., ‘red’, ‘green’, ‘blue’ instead of 0, 1 and 2.
  • It enables type checking to ensure that only permissible values are supplied – e.g., choosing a color name from a list instead of remembering its number is less prone to error.
  • It provides namespacing, meaning the same name can be reused in different domains without fear of confusion — e.g., color.blue and note.blue (the flatted fifth).

There is also a subtler, more powerful use of enumerations: normalizing data.

Data Normalization

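Broadly speaking, data normalization seeks to eliminate duplication, retaining only the minimum required data. In the archetypal example, suppose you know that you will have a list of text entries taken from a fixed and reasonably short set of values. Instead of storing the repeated values, store the distinct values once in a list u and represent the original list v as a list k of indices into u.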

q)v:`ccccccc`bbbbbbb`aaaaaaa`ccccccc`ccccccc`bbbbbbb
q)u:distinct v
q)u
`ccccccc`bbbbbbb`aaaaaaa
q)k:u?v
q)k
0 1 2 0 0 1
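
Nothing is lost in this representation: indexing the unique values by the index list rebuilds the original. A minimal sketch, reusing the names v, u and k from the example:

```q
v:`ccccccc`bbbbbbb`aaaaaaa`ccccccc`ccccccc`bbbbbbb
u:distinct v    / unique values, in order of first appearance
k:u?v           / position of each item of v within u: 0 1 2 0 0 1
u k             / indexing u by k rebuilds the original list
v~u k           / 1b
```

The saving comes from storing each long symbol once in u, while k holds only small integers.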

Enumerating Symbols

The process of converting a list of symbols to the equivalent list of indices described in the previous section is called enumeration in q.

It uses (yet another overload of) $ with the name of the variable holding the unique symbols as the left operand and a list of symbols drawn from that domain on the right.

q)`u$v
`u$`ccccccc`bbbbbbb`aaaaaaa`ccccccc`ccccccc`bbbbbbb
q)`int$(`u$v)
0 1 2 0 0 1i

Using Enumerated Symbols

We continue with the example of the previous section, renamed to use the standard sym domain.

q)sym:`g`aapl`msft`ibm
q)v:1000000?sym
q)ev:`sym$v
q)ev
`sym$`g`g`msft`aapl`msft`aapl`msft`ibm`msft`aapl`g`ibm`aapl`msft`msft`aapl`g`..
q)`int$ev
0 0 2 1 2 1 2 3 2 1 0 3 1 2 2 1 0 1 2 0 1 1 2 3 0 1 2 1 2 1 0 0 1 2 1 2 3 3 0..

The enumerated ev can be substituted for the original v in nearly all situations.

q)v[3]
`aapl
q)ev[3]
`sym$`aapl
q)v[3]:`ibm
q)ev[3]:`ibm
q)v=`ibm
000100010010011101000010010100000000100100000001000000001100001001011..
q)ev=`ibm
000100010010011101000010010100000000100100000001000000001100001001011..
q)where v=`aapl
4 5 19 20 21 31 33 34 41 42 43 49 58 59 61 74 81 83 90 94 95 98 114..
q)where ev=`aapl
4 5 19 20 21 31 33 34 41 42 43 49 58 59 61 74 81 83 90 94 95 98 114..
q)v?`aapl
4
q)ev?`aapl
4
q)v in `ibm`aapl
000111010010011101011110010100010110100101110001010000001111011001011..
q)ev in `ibm`aapl
000111010010011101011110010100010110100101110001010000001111011001011..

While the enumerated version is item-wise equal to the original, the entities are not identical.

q)all v=ev
1b
q)v~ev
0b

This is because match (~) also compares types, and an enumerated list does not have the same type as a list of symbols.

Type of Enumerations

Each enumeration is assigned a new numeric data type, beginning with 20h. Starting with q version 3.2, the type 20h is reserved for the conventional enumeration domain sym, whether you use it or not (you should). The types of other enumerations you create will begin with 21h and proceed sequentially. The convention of negative type for atoms and positive type for simple lists still holds. In a fresh q session we see the following.

q)sym1:`g`aapl`msft`ibm
q)type `sym1$1000000?sym1
21h
q)sym2:`a`b`c
q)type `sym2$`c
-22h

Updating an Enumerated List

The normalization provided by an enumeration reduces updating all occurrences of a given value to a single operation. This can have significant performance implications for large lists with many repetitions. Continuing with our example above, suppose the list sym contains the items in a stock index and we wish to change one of the constituents. A single update to sym suffices.

q)sym:`g`aapl`msft`ibm
q)ev:`sym$`g`g`msft`ibm`aapl`aapl`msft`ibm`msft`g`ibm`g..
q)sym[0]:`twit
q)sym
`twit`aapl`msft`ibm
q)ev
`sym$`twit`twit`msft`ibm`aapl`aapl`msft`ibm`msft`twit`ibm`twit..

In contrast, to make the equivalent update to v requires changing every occurrence.

q)v
`g`g`msft`ibm`aapl`aapl`msft`ibm`msft`g`ibm`g..
q)@[v; where v=`g; :; `twit]

Dynamically Appending to an Enumeration Domain

One situation in which an enumeration is more complicated than working with the denormalized data is when you want to add a new value. Continuing with the example above, appending a new item to an ordinary list of symbols is a single operation. Appending the same item to the enumerated list fails with a cast error, because the symbol is not yet in the domain.

q)sym:`g`aapl`msft`ibm
q)v:1000000?sym
q)ev:`sym$v
q)v,:`twtr
q)ev,:`twtr
'cast

The new value must first be added to the unique list.

q)sym,:`twtr
q)ev,:`twtr
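
The two steps can also be combined. The sketch below uses the enum-extend overload of ?, which is not covered in the text above; the short list ev is a made-up stand-in for the large random list in the example:

```q
sym:`g`aapl`msft`ibm
ev:`sym$`g`msft`ibm    / small enumerated list, for illustration only
ev,:`sym?`twtr         / `sym?x appends x to sym if it is absent and returns it enumerated
sym                    / `g`aapl`msft`ibm`twtr
value ev               / `g`msft`ibm`twtr
```

If the symbol is already in the domain, enum extend leaves sym unchanged and simply returns the enumerated value.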

Resolving an Enumeration

An enumerated symbol can be substituted for its equivalent symbol value in most expressions. However, there are some situations in which you need non-enumerated values. One case is converting from one enumeration domain to another, which happens when copying from one kdb+ database to another or in merging two databases.

Given an enumerated symbol, or a list of such, you can recover the un-enumerated value(s) by applying the built-in value. In our on-going example,

q)sym:`g`aapl`msft`ibm
q)v:1000000?sym
q)ev:`sym$v
q)value ev
`aapl`g`msft`msft`ibm`msft`msft`msft`msft`msft`g`ibm`ibm`ibm..
q)v~value ev
1b

Reference: code.kx.com/q4m3/7_Tran…

By Jeffry A. Borror