Function Call Rules
The following rules apply to all functions unless explicitly indicated otherwise in the function description:
- For functions that accept numeric types, if one operand is a floating point operand and the other operand is another numeric type, both operands are converted to FLOAT64 before the function is evaluated.
- If an operand is NULL , the result is NULL , with the exception of the IS operator.
- For functions that are time zone sensitive (as indicated in the function description), the default time zone, UTC, is used if a time zone is not specified.
SAFE. prefix
Syntax:
1 |
SAFE.function_name() |
Desctiption
If you begin a function with the SAFE. prefix, it will return NULL instead of an error. The SAFE. prefix only prevents errors from the prefixed function itself: it does not prevent errors that occur while evaluating argument expressions. The SAFE. prefix only prevents errors that occur because of the value of the function inputs, such as “value out of range” errors; other errors, such as internal or system errors, may still occur. If the function does not return an error, SAFE. has no effect on the output. If the function never returns an error, like RAND , then SAFE. has no effect.
Operators, such as + and = , do not support the SAFE. prefix. To prevent errors from a division operation, use SAFE_DIVIDE. Some operators, such as IN , ARRAY , and UNNEST , resemble functions, but do not support the SAFE. prefix. The CAST and EXTRACT functions also do not support the SAFE. prefix. To prevent errors from casting, use SAFE_CAST.
Example:
In the following example, the first use of the SUBSTR function would normally return an error, because the function does not support length arguments with negative values. However, the SAFE. prefix causes the function to return NULL instead. The second use of the SUBSTR function provides the expected output: the SAFE. prefix has no effect.
1 2 3 4 5 6 7 8 9 |
SELECT SAFE.SUBSTR('foo', 0, -2) AS safe_output UNION ALL SELECT SAFE.SUBSTR('bar', 0, 2) AS safe_output; +-------------+ | safe_output | +-------------+ | NULL | | ba | +-------------+ |
Supported Functions
Kochava Query supports the use of the SAFE. prefix with most scalar functions that can raise errors, including STRING functions, math functions, DATE functions, DATETIME functions, and TIMESTAMP functions. Kochava Query does not support the use of the SAFE. prefix with aggregate, analytic, or user-defined functions.
Conversion Rules
“Conversion” includes, but is not limited to, casting and coercion.
- Casting is explicit conversion and uses the CAST() function.
- Coercion is implicit conversion, which Kochava Query performs automatically under the conditions described below.
- There is a third group of conversion functions that have their own function names, such as UNIX_DATE() .
The table below summarizes all possible CAST and coercion possibilities for Kochava Query data types. “Coercion To” applies to all expressions of a given data type (e.g., a column) , but literals and parameters can also be coerced. See Literal Coercion and Parameter Coercion for details.
From Type | CAST to | Coercion To |
---|---|---|
INT64 | BOOL INT64 NUMERIC FLOAT64 STRING |
FLOAT64 NUMERIC |
NUMERIC | INT64 NUMERIC FLOAT64 STRING |
FLOAT64 |
FLOAT64 | INT64 NUMERIC FLOAT64 STRING |
|
BOOL | BOOL INT64 STRING |
|
STRING | BOOL INT64 NUMERIC FLOAT64 STRING BYTES DATE DATETIME TIME TIMESTAMP |
|
BYTES | BYTES STRING |
|
DATE | DATE DATETIME STRING TIMESTAMP |
|
DATETIME | DATE DATETIME STRING TIME TIMESTAMP |
|
TIME | STRING TIME |
|
TIMESTAMP | DATE DATETIME STRING TIME TIMESTAMP |
|
ARRAY | ARRAY | |
STRUCT | STRUCT |
Casting
Syntax:
1 |
CAST(expr AS typename) |
Cast syntax is used in a query to indicate that the result type of an expression should be converted to some other type.
Example:
1 |
CAST(x=1 AS STRING) |
This results in "true" if x is 1 , "false" for any other non-NULL value, and NULL if x is NULL .
Casts between supported types that do not successfully map from the original value to the target domain produce runtime errors. For example, casting BYTES to STRING where the byte sequence is not valid UTF-8 results in a runtime error.
When casting an expression x of the following types, these rules apply:
From | To | Rule(s) when casting x |
---|---|---|
INT64 | FLOAT64 | Returns a close but potentially not exact FLOAT64 value. |
INT64 | BOOL | Returns FALSE if x is 0, TRUE otherwise. |
NUMERIC | Floating Point | NUMERIC will convert to the closest floating point number with a possible loss of precision. |
FLOAT64 | INT64 | Returns the closest INT64 value. Halfway cases such as 1.5 or -0.5 round away from zero. |
FLOAT64 | STRING | Returns an approximate string representation. |
FLOAT64 | NUMERIC | If the floating point number has more than nine digits after the decimal point, it will round half away from zero. Casting a NaN, +inf or -inf will return an error. Casting a value outside the range of NUMERIC will return an overflow error. |
BOOL | INT64 | Returns 1 if x is TRUE, 0 otherwise. |
BOOL | STRING | Returns “true” if x is TRUE, “false” otherwise. |
STRING | FLOAT64 | Returns x as a FLOAT64 value, interpreting it as having the same form as a valid FLOAT64 literal. Also supports casts from “inf“, “+inf“, “-inf“, and “nan“. Conversions are case-insensitive. |
STRING | NUMERIC | The numeric literal contained in the STRING must not exceed the maximum precision or range of the NUMERIC type, or an error will occur. If the number of digits after the decimal point exceeds nine, then the resulting NUMERIC value will round half away from zero to have nine digits after the decimal point. |
STRING | BOOL | Returns TRUE if x is “true” and FALSE if x is “false” All other values of x are invalid and throw an error instead of casting to BOOL. STRINGs are case-insensitive when converting to BOOL. |
STRING | BYTES | STRINGs are cast to BYTES using UTF-8 encoding. For example, the STRING “©”, when cast to BYTES, would become a 2-byte sequence with the hex values C2 and A9. |
BYTES | STRING | Returns x interpreted as a UTF-8 STRING. For example, the BYTES literal b’\xc2\xa9′, when cast to STRING, is interpreted as UTF-8 and becomes the unicode character “©”. An error occurs if x is not valid UTF-8. |
ARRAY | ARRAY | Must be the exact same ARRAY type. |
STRUCT | STRUCT | Allowed if the following conditions are met:
|
Safe Casting
When using CAST , a query can fail if Kochava Query is unable to perform the cast. For example, the following query generates an error:
1 |
SELECT CAST("apple" AS INT64) AS not_a_number; |
If you want to protect your queries from these types of errors, you can use SAFE_CAST . SAFE_CAST is identical to CAST , except it returns NULL instead of raising an error.
1 2 3 4 5 6 7 |
SELECT SAFE_CAST("apple" AS INT64) AS not_a_number; +--------------+ | not_a_number | +--------------+ | NULL | +--------------+ |
If you are casting from bytes to strings, you can also use the function, SAFE_CONVERT_BYTES_TO_STRING . Any invalid UTF-8 characters are replaced with the unicode replacement character, U+FFFD . See SAFE_CONVERT_BYTES_TO_STRING for more information.
Casting Hex Strings to Integers
If you are working with hex strings (0x123), you can cast those strings as integers:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
SELECT '0x123' as hex_value, CAST('0x123' as INT64) as hex_to_int; +-----------+------------+ | hex_value | hex_to_int | +-----------+------------+ | 0x123 | 291 | +-----------+------------+ SELECT '0x123' as hex_value, CAST('-0x123' as INT64) as hex_to_int; +-----------+------------+ | hex_value | hex_to_int | +-----------+------------+ | 0x123 | -291 | +-----------+------------+ |
Casting Date Types
Kochava Query supports casting date types to/from strings as follows:
1 2 |
CAST(date_expression AS STRING) CAST(string_expression AS DATE) |
Casting from a date type to a string is independent of time zone and is of the form YYYY-MM-DD . When casting from string to date, the string must conform to the supported date literal format, and is independent of time zone. If the string expression is invalid or represents a date that is outside of the supported min/max range, then an error is produced.
Casting Timestamp Types
Kochava Query supports casting timestamp types to/from strings as follows:
1 2 |
CAST(timestamp_expression AS STRING) CAST(string_expression AS TIMESTAMP) |
When casting from timestamp types to string, the timestamp is interpreted using the default time zone, UTC. The number of subsecond digits produced depends on the number of trailing zeroes in the subsecond part: the CAST function will truncate zero, three, or six digits.
When casting from string to a timestamp, string_expression must conform to the supported timestamp literal formats, or else a runtime error occurs. The string_expression may itself contain a time_zone —see time zones. If there is a time zone in the string_expression , that time zone is used for conversion, otherwise the default time zone, UTC, is used. If the string has fewer than six digits, then it is implicitly widened.
An error is produced if the string_expression is invalid, has more than six subsecond digits (i.e. precision greater than microseconds), or represents a time outside of the supported timestamp range.
Casting Between Date and Timestamp Types
Kochava Query supports casting between date and timestamp types as follows:
1 2 |
CAST(date_expression AS TIMESTAMP) CAST(timestamp_expression AS DATE) |
Casting from a date to a timestamp interprets date_expression as of midnight (start of the day) in the default time zone, UTC. Casting from a timestamp to date effectively truncates the timestamp as of the default time zone.
Coercion
Kochava
Query coerces the result type of an expression to another type if needed to match function signatures. For example, if function func() is defined to take a single argument of type INT64 and an expression is used as an argument that has a result type of FLOAT64, then the result of the expression will be coerced to INT64 type before func() is computed.
Literal Coercion
Kochava Query supports the following literal coercions:
Input Data Type | Result Data Type | Notes |
---|---|---|
STRING literal | DATE TIMESTAMP |
Literal coercion is needed when the actual literal type is different from the type expected by the function in question. For example, if function func() takes a DATE argument, then the expression func(“2014-09-27”) is valid because the STRING literal “2014-09-27” is coerced to DATE.
Literal conversion is evaluated at analysis time, and gives an error if the input literal cannot be converted successfully to the target type.
NOTE: String literals do not coerce to numeric types.
Parameter Coercion
Kochava Query supports the following parameter coercions:
Input Data Type | Result Data Type |
---|---|
STRING parameter |
If the parameter value cannot be coerced successfully to the target type, an error is provided.
Additional Conversion Functions
Kochava Query provides the following additional conversion functions:
- DATE functions
- DATETIME functions
- TIME functions
- TIMESTAMP functions
Aggregate Functions
An aggregate function is a function that performs a calculation on a set of values. COUNT, MIN and MAX are examples of aggregate functions.
1 2 3 |
SELECT COUNT(*) as total_count, COUNT(fruit) as non_null_count, MIN(fruit) as min, MAX(fruit) as max FROM UNNEST([NULL, "apple", "pear", "orange"]) as fruit; |
1 2 3 4 5 |
+-------------+----------------+-------+------+ | total_count | non_null_count | min | max | +-------------+----------------+-------+------+ | 4 | 3 | apple | pear | +-------------+----------------+-------+------+ |
The following sections describe the aggregate functions that Kochava Query supports.
ANY_VALUE
1 |
ANY_VALUE(expression) [OVER (...)] |
Description
Returns any value from the input or NULL if there are zero input rows. The value returned is non-deterministic, which means you might receive a different result each time you use this function.
Supported Argument Types
Any
Optional Clause
OVER : Specifies a window. See Analytic Functions.
Returned Data Types
Matches the input data type.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
SELECT ANY_VALUE(fruit) as any_value FROM UNNEST(["apple", "banana", "pear"]) as fruit; +-----------+ | any_value | +-----------+ | apple | +-----------+ SELECT fruit, ANY_VALUE(fruit) OVER (ORDER BY LENGTH(fruit) ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS any_value FROM UNNEST(["apple", "banana", "pear"]) as fruit; +--------+-----------+ | fruit | any_value | +--------+-----------+ | pear | pear | | apple | pear | | banana | apple | +--------+-----------+ |
ARRAY_AGG
1 2 3 |
ARRAY_AGG([DISTINCT] expression [{IGNORE|RESPECT} NULLS] [ORDER BY key [{ASC|DESC}] [, ... ]] [LIMIT n]) [OVER (...)] |
Description
Returns an ARRAY of expression values.
Supported Argument Types
All data types except ARRAY.
Optional Clauses
The clauses are applied in the following order:
- OVER : Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within ARRAY_AGG() .
- DISTINCT : Each distinct value of expression is aggregated only once into the result.
- IGNORE NULLS or RESPECT NULLS : If IGNORE NULLS is specified, the NULL values are excluded from the result. If RESPECT NULLS is specified or if neither is specified, the NULL values are included in the result (an error is raised, however, if an array in the final query result contains a NULL element).
- ORDER BY : Specifies the order of the values.
- For each sort key, the default sort direction is ASC.
- NULLs: In the context of the ORDER BY clause, NULLs are the minimum possible value; that is, NULLs appear first in ASC sorts and last in DESC sorts.
- Floating point data types: see Floating Point Semantics on ordering and grouping.
- If DISTINCT is also specified, then the sort key must be the same as expression .
- If ORDER BY is not specified, the order of the elements in the output array is non-deterministic, which means you might receive a different result each time you use this function.
- LIMIT: Specifies the maximum number of expression inputs in the result. The limit n must be a constant INT64.
Returned Data Types
ARRAY
If there are zero input rows, this function returns NULL.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 |
SELECT FORMAT("%T", ARRAY_AGG(x)) AS array_agg FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x; +-------------------------+ | array_agg | +-------------------------+ | [2, 1, -2, 3, -2, 1, 2] | +-------------------------+ SELECT FORMAT("%T", ARRAY_AGG(DISTINCT x)) AS array_agg FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x; +---------------+ | array_agg | +---------------+ | [2, 1, -2, 3] | +---------------+ SELECT FORMAT("%T", ARRAY_AGG(x IGNORE NULLS)) AS array_agg FROM UNNEST([NULL, 1, -2, 3, -2, 1, NULL]) AS x; +-------------------+ | array_agg | +-------------------+ | [1, -2, 3, -2, 1] | +-------------------+ SELECT FORMAT("%T", ARRAY_AGG(x ORDER BY ABS(x))) AS array_agg FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x; +-------------------------+ | array_agg | +-------------------------+ | [1, 1, 2, -2, -2, 2, 3] | +-------------------------+ SELECT FORMAT("%T", ARRAY_AGG(x LIMIT 5)) AS array_agg FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x; +-------------------+ | array_agg | +-------------------+ | [2, 1, -2, 3, -2] | +-------------------+ SELECT FORMAT("%T", ARRAY_AGG(DISTINCT x IGNORE NULLS ORDER BY x LIMIT 2)) AS array_agg FROM UNNEST([NULL, 1, -2, 3, -2, 1, NULL]) AS x; +-----------+ | array_agg | +-----------+ | [-2, 1] | +-----------+ SELECT x, FORMAT("%T", ARRAY_AGG(x) OVER (ORDER BY ABS(x))) AS array_agg FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x; +----+-------------------------+ | x | array_agg | +----+-------------------------+ | 1 | [1, 1] | | 1 | [1, 1] | | 2 | [1, 1, 2, -2, -2, 2] | | -2 | [1, 1, 2, -2, -2, 2] | | -2 | [1, 1, 2, -2, -2, 2] | | 2 | [1, 1, 2, -2, -2, 2] | | 3 | [1, 1, 2, -2, -2, 2, 3] | +----+-------------------------+ |
ARRAY_CONCAT_AGG
1 |
ARRAY_CONCAT_AGG(expression [ORDER BY key [{ASC|DESC}] [, ... ]] [LIMIT n]) |
Description
Concatenates elements from expression of type ARRAY, returning a single ARRAY as a result. This function ignores NULL input arrays, but respects the NULL elements in non-NULL input arrays (an error is raised, however, if an array in the final query result contains a NULL element).
Supported Argument Types
ARRAY
Optional Clauses
The clauses are applied in the following order:
- ORDER BY: Specifies the order of the values.
- For each sort key, the default sort direction is ASC.
- Array ordering is not supported, and thus the sort key cannot be the same as expression.
- NULLs: In the context of the ORDER BY clause, NULLs are the minimum possible value; that is, NULLs appear first in ASC sorts and last in DESC sorts.
- Floating point data types: see Floating Point Semantics on ordering and grouping.
- If ORDER BY is not specified, the order of the elements in the output array is non-deterministic, which means you might receive a different result each time you use this function.
- LIMIT: Specifies the maximum number of expression inputs in the result. The limit applies to the number of input arrays, not the number of elements in the arrays. An empty array counts as 1. A NULL array is not counted. The limit n must be a constant INT64.
Returned Data Types
ARRAY
Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 |
SELECT FORMAT("%T", ARRAY_CONCAT_AGG(x)) AS array_concat_agg FROM ( SELECT [NULL, 1, 2, 3, 4] AS x UNION ALL SELECT NULL UNION ALL SELECT [5, 6] UNION ALL SELECT [7, 8, 9] ); +-----------------------------------+ | array_concat_agg | +-----------------------------------+ | [NULL, 1, 2, 3, 4, 5, 6, 7, 8, 9] | +-----------------------------------+ SELECT FORMAT("%T", ARRAY_CONCAT_AGG(x ORDER BY ARRAY_LENGTH(x))) AS array_concat_agg FROM ( SELECT [NULL, 1, 2, 3, 4] AS x UNION ALL SELECT NULL UNION ALL SELECT [5, 6] UNION ALL SELECT [7, 8, 9] ); +-----------------------------------+ | array_concat_agg | +-----------------------------------+ | [5, 6, 7, 8, 9, NULL, 1, 2, 3, 4] | +-----------------------------------+ SELECT FORMAT("%T", ARRAY_CONCAT_AGG(x LIMIT 2)) AS array_concat_agg FROM ( SELECT [NULL, 1, 2, 3, 4] AS x UNION ALL SELECT NULL UNION ALL SELECT [5, 6] UNION ALL SELECT [7, 8, 9] ); +--------------------------+ | array_concat_agg | +--------------------------+ | [NULL, 1, 2, 3, 4, 5, 6] | +--------------------------+ SELECT FORMAT("%T", ARRAY_CONCAT_AGG(x ORDER BY ARRAY_LENGTH(x) LIMIT 2)) AS array_concat_agg FROM ( SELECT [NULL, 1, 2, 3, 4] AS x UNION ALL SELECT NULL UNION ALL SELECT [5, 6] UNION ALL SELECT [7, 8, 9] ); +------------------+ | array_concat_agg | +------------------+ | [5, 6, 7, 8, 9] | +------------------+ |
AVG
1 |
AVG([DISTINCT] expression) [OVER (...)] |
Description
Returns the average of non- NULL input values, or NaN if the input contains a NaN.
Supported Argument Types
Any numeric input type, such as INT64. Note that, for floating point input types, the return result is non-deterministic, which means you might receive a different result each time you use this function.
Optional Clauses
The clauses are applied in the following order:
- OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within AVG().
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
Returned Data Types
- NUMERIC if the input type is NUMERIC.
- FLOAT64
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
SELECT AVG(x) as avg FROM UNNEST([0, 2, NULL, 4, 4, 5]) as x; +-----+ | avg | +-----+ | 3 | +-----+ SELECT AVG(DISTINCT x) AS avg FROM UNNEST([0, 2, NULL, 4, 4, 5]) AS x; +------+ | avg | +------+ | 2.75 | +------+ SELECT x, AVG(x) OVER (ORDER BY x ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS avg FROM UNNEST([0, 2, NULL, 4, 4, 5]) AS x; +------+------+ | x | avg | +------+------+ | NULL | NULL | | 0 | 0 | | 2 | 1 | | 4 | 3 | | 4 | 4 | | 5 | 4.5 | +------+------+ |
BIT_AND
1 |
BIT_AND(expression) |
Description
Performs a bitwise AND operation on expression and returns the result.
Supported Argument Types
- INT64
Returned Data Types
INT64
Examples
1 2 3 4 5 6 7 |
SELECT BIT_AND(x) as bit_and FROM UNNEST([0xF001, 0x00A1]) as x; +---------+ | bit_and | +---------+ | 1 | +---------+ |
BIT_OR
1 |
BIT_OR(expression) |
Description
Performs a bitwise OR operation on expression and returns the result.
Supported Argument Types
- INT64
Returned Data Types
INT64
Examples
1 2 3 4 5 6 7 |
SELECT BIT_OR(x) as bit_or FROM UNNEST([0xF001, 0x00A1]) as x; +--------+ | bit_or | +--------+ | 61601 | +--------+ |
BIT_XOR
1 |
BIT_XOR([DISTINCT] expression) |
Description
Performs a bitwise XOR operation on expression and returns the result.
Supported Argument Types
- INT64
Optional Clause
DISTINCT: Each distinct value of expression is aggregated only once into the result.
Returned Data Types
INT64
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
SELECT BIT_XOR(x) AS bit_xor FROM UNNEST([5678, 1234]) AS x; +---------+ | bit_xor | +---------+ | 4860 | +---------+ SELECT BIT_XOR(x) AS bit_xor FROM UNNEST([1234, 5678, 1234]) AS x; +---------+ | bit_xor | +---------+ | 5678 | +---------+ SELECT BIT_XOR(DISTINCT x) AS bit_xor FROM UNNEST([1234, 5678, 1234]) AS x; +---------+ | bit_xor | +---------+ | 4860 | +---------+ |
COUNT
- COUNT(*) [OVER (...)]
- COUNT([DISTINCT] expression) [OVER (...)]
Description
- Returns the number of rows in the input.
- Returns the number of rows with expression evaluated to any value other than NULL.
Supported Argument Types
expression can be any data type.
Optional Clauses
The clauses are applied in the following order:
- OVER: Specifies a window. See Analytic Functions.
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
Return Data Types
INT64
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
SELECT COUNT(*) AS count_star, COUNT(x) AS count_x, COUNT(DISTINCT x) AS count_dist_x FROM UNNEST([1, 4, NULL, 4, 5]) AS x; +------------+---------+--------------+ | count_star | count_x | count_dist_x | +------------+---------+--------------+ | 5 | 4 | 3 | +------------+---------+--------------+ SELECT x, COUNT(*) OVER (PARTITION BY MOD(x, 3)) AS count_star, COUNT(x) OVER (PARTITION BY MOD(x, 3)) AS count_x, COUNT(DISTINCT x) OVER (PARTITION BY MOD(x, 3)) AS count_dist_x FROM UNNEST([1, 4, NULL, 4, 5]) AS x; +------+------------+---------+--------------+ | x | count_star | count_x | count_dist_x | +------+------------+---------+--------------+ | NULL | 1 | 0 | 0 | | 1 | 3 | 3 | 2 | | 4 | 3 | 3 | 2 | | 4 | 3 | 3 | 2 | | 5 | 1 | 1 | 1 | +------+------------+---------+--------------+ |
COUNTIF
1 |
COUNTIF(expression) [OVER (...)] |
Description
Returns the count of TRUE values for expression. Returns if there are zero input rows or expression evaluates to FALSE for all rows.
Supported Argument Types
BOOL
Optional Clause
OVER: Specifies a window. See Analytic Functions.
Return Data Types
INT64
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
SELECT COUNTIF(x<0) AS num_negative, COUNTIF(x>0) AS num_positive FROM UNNEST([5, -2, 3, 6, -10, NULL, -7, 4, 0]) AS x; +--------------+--------------+ | num_negative | num_positive | +--------------+--------------+ | 3 | 4 | +--------------+--------------+ SELECT x, COUNTIF(x<0) OVER (ORDER BY ABS(x) ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS num_negative FROM UNNEST([5, -2, 3, 6, -10, NULL, -7, 4, 0]) AS x; +------+--------------+ | x | num_negative | +------+--------------+ | NULL | 0 | | 0 | 1 | | -2 | 1 | | 3 | 1 | | 4 | 0 | | 5 | 0 | | 6 | 1 | | -7 | 2 | | -10 | 2 | +------+--------------+ |
LOGICAL_AND
1 |
LOGICAL_AND(expression) |
Description
Returns the logical AND of all non- NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.
Supported Argument Types
BOOL
Return Data Types
BOOL
Examples
1 2 3 4 5 6 7 |
SELECT LOGICAL_AND(x) as logical_and FROM UNNEST([true, false, true]) as x; +-------------+ | logical_and | +-------------+ | false | +-------------+ |
LOGICAL_OR
1 |
LOGICAL_OR(expression) |
Description
Returns the logical OR of all non- NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.
Supported Argument Types
BOOL
Return Data Types
BOOL
Examples
1 2 3 4 5 6 7 |
SELECT LOGICAL_OR(x) as logical_or FROM UNNEST([true, false, true]) as x; +------------+ | logical_or | +------------+ | true | +------------+ |
MAX
1 |
MAX(expression) [OVER (...)] |
Description
Returns the maximum value of non- NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows. Returns NaN if the input contains a NaN .
Supported Argument Types
Any data type except: ARRAY STRUCT
Optional Clause
OVER : Specifies a window. See Analytic Functions.
Return Data Types
Same as the data type used as the input values.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
SELECT MAX(x) AS max FROM UNNEST([8, NULL, 37, 4, NULL, 55]) AS x; +-----+ | max | +-----+ | 55 | +-----+ SELECT x, MAX(x) OVER (PARTITION BY MOD(x, 2)) AS max FROM UNNEST([8, NULL, 37, 4, NULL, 55]) AS x; +------+------+ | x | max | +------+------+ | NULL | NULL | | NULL | NULL | | 8 | 8 | | 4 | 8 | | 37 | 55 | | 55 | 55 | +------+------+ |
MIN
1 |
MIN(expression) [OVER (...)] |
Description
Returns the minimum value of non- NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows. Returns NaN if the input contains a NaN.
Supported Argument Types
Any data type except: ARRAY STRUCT
Optional Clause
OVER: Specifies a window. See Analytic Functions.
Return Data Types
Same as the data type used as the input values.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
SELECT MIN(x) AS min FROM UNNEST([8, NULL, 37, 4, NULL, 55]) AS x; +-----+ | min | +-----+ | 4 | +-----+ SELECT x, MIN(x) OVER (PARTITION BY MOD(x, 2)) AS min FROM UNNEST([8, NULL, 37, 4, NULL, 55]) AS x; +------+------+ | x | min | +------+------+ | NULL | NULL | | NULL | NULL | | 8 | 4 | | 4 | 4 | | 37 | 37 | | 55 | 37 | +------+------+ |
STRING_AGG
1 2 |
STRING_AGG([DISTINCT] expression [, delimiter] [ORDER BY key [{ASC|DESC}] [, ... ]] [LIMIT n]) [OVER (...)] |
Description
Returns a value (either STRING or BYTES) obtained by concatenating non-null values.
If a delimiter is specified, concatenated values are separated by that delimiter; otherwise, a comma is used as a delimiter.
Supported Argument Types
STRING BYTES
Optional Clauses
The clauses are applied in the following order:
- OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within STRING_AGG().
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
- ORDER BY: Specifies the order of the values.
- For each sort key, the default sort direction is ASC.
- NULLs: In the context of the ORDER BY clause, NULLs are the minimum possible value; that is, NULLs appear first in ASC sorts and last in DESC sorts.
- Floating point data types: see Floating Point Semantics on ordering and grouping.
- If DISTINCT is also specified, then the sort key must be the same as expression.
- If ORDER BY is not specified, the order of the elements in the output array is non-deterministic, which means you might receive a different result each time you use this function.
- LIMIT: Specifies the maximum number of expression inputs in the result. The limit applies to the number of input strings, not the number of characters or bytes in the inputs. An empty string counts as 1. A NULL string is not counted. The limit n must be a constant INT64.
Return Data Types
STRING BYTES
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 |
SELECT STRING_AGG(fruit) AS string_agg FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit; +------------------------+ | string_agg | +------------------------+ | apple,pear,banana,pear | +------------------------+ SELECT STRING_AGG(fruit, " & ") AS string_agg FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit; +------------------------------+ | string_agg | +------------------------------+ | apple & pear & banana & pear | +------------------------------+ SELECT STRING_AGG(DISTINCT fruit, " & ") AS string_agg FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit; +-----------------------+ | string_agg | +-----------------------+ | apple & banana & pear | +-----------------------+ SELECT STRING_AGG(fruit, " & " ORDER BY LENGTH(fruit)) AS string_agg FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit; +------------------------------+ | string_agg | +------------------------------+ | pear & pear & apple & banana | +------------------------------+ SELECT STRING_AGG(fruit, " & " LIMIT 2) AS string_agg FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit; +--------------+ | string_agg | +--------------+ | apple & pear | +--------------+ SELECT STRING_AGG(DISTINCT fruit, " & " ORDER BY fruit DESC LIMIT 2) AS string_agg FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit; +---------------+ | string_agg | +---------------+ | pear & banana | +---------------+ SELECT fruit, STRING_AGG(fruit, " & ") OVER (ORDER BY LENGTH(fruit)) AS string_agg FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit; +--------+------------------------------+ | fruit | string_agg | +--------+------------------------------+ | NULL | NULL | | pear | pear & pear | | pear | pear & pear | | apple | pear & pear & apple | | banana | pear & pear & apple & banana | +--------+------------------------------+ |
SUM
1 |
SUM([DISTINCT] expression) [OVER (...)] |
Description
Returns the sum of non-null values.
If the expression is a floating point value, the sum is non-deterministic, which means you might receive a different result each time you use this function.
Supported Argument Types
Any supported numeric data types.
Optional Clauses
The clauses are applied in the following order:
- OVER: Specifies a window. See Analytic Functions.
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
Return Data Types
- Returns INT64 if the input is an integer.
- Returns NUMERIC if the input type is NUMERIC.
- Returns FLOAT64 if the input is a floating point value.
Returns NULL if the input contains only NULLs.
Returns Inf if the input contains Inf.
Returns -Inf if the input contains -Inf.
Returns NaN if the input contains a NaN.
Returns NaN if the input contains a combination of Inf and -Inf.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 |
SELECT SUM(x) AS sum FROM UNNEST([1, 2, 3, 4, 5, 4, 3, 2, 1]) AS x; +-----+ | sum | +-----+ | 25 | +-----+ SELECT SUM(DISTINCT x) AS sum FROM UNNEST([1, 2, 3, 4, 5, 4, 3, 2, 1]) AS x; +-----+ | sum | +-----+ | 15 | +-----+ SELECT x, SUM(x) OVER (PARTITION BY MOD(x, 3)) AS sum FROM UNNEST([1, 2, 3, 4, 5, 4, 3, 2, 1]) AS x; +---+-----+ | x | sum | +---+-----+ | 3 | 6 | | 3 | 6 | | 1 | 10 | | 4 | 10 | | 4 | 10 | | 1 | 10 | | 2 | 9 | | 5 | 9 | | 2 | 9 | +---+-----+ SELECT x, SUM(DISTINCT x) OVER (PARTITION BY MOD(x, 3)) AS sum FROM UNNEST([1, 2, 3, 4, 5, 4, 3, 2, 1]) AS x; +---+-----+ | x | sum | +---+-----+ | 3 | 3 | | 3 | 3 | | 1 | 5 | | 4 | 5 | | 4 | 5 | | 1 | 5 | | 2 | 7 | | 5 | 7 | | 2 | 7 | +---+-----+ |
Statistical Aggregate Functions
Kochava Query supports the following statistical aggregate functions.
CORR
1 |
CORR(X1, X2) [OVER (...)] |
Description
Returns the Pearson coefficient of correlation of a set of number pairs. For each number pair, the first number is the dependent variable and the second number is the independent variable. The return result is between -1 and 1. A result of 0 indicates no correlation.
This function ignores any input pairs that contain one or more NULL values. If there are fewer than two input pairs without NULL values, this function returns NULL.
Supported Input Types
FLOAT64
Optional Clause
OVER: Specifies a window. See Analytic Functions.
Return Data Type
FLOAT64
COVAR_POP
1 |
COVAR_POP(X1, X2) [OVER (...)] |
Description
Returns the population covariance of a set of number pairs. The first number is the dependent variable; the second number is the independent variable. The return result is between -Inf and +Inf.
This function ignores any input pairs that contain one or more NULL values. If there is no input pair without NULL values, this function returns NULL. If there is exactly one input pair without NULL values, this function returns 0.
Supported Input Types
FLOAT64
Optional Clause
OVER : Specifies a window. See Analytic Functions.
Return Data Type
FLOAT64
COVAR_SAMP
1 |
COVAR_SAMP(X1, X2) [OVER (...)] |
Description
Returns the sample covariance of a set of number pairs. The first number is the dependent variable; the second number is the independent variable. The return result is between -Inf and +Inf.
This function ignores any input pairs that contain one or more NULL values. If there are fewer than two input pairs without NULL values, this function returns NULL.
Supported Input Types
FLOAT64
Optional Clause
OVER: Specifies a window. See Analytic Functions.
Return Data Type
FLOAT64
STDDEV_POP
1 |
STDDEV_POP([DISTINCT] expression) [OVER (...)] |
Description
Returns the population (biased) standard deviation of the values. The return result is between 0 and +Inf.
This function ignores any NULL inputs. If all inputs are ignored, this function returns NULL.
If this function receives a single non-NULL input, it returns 0.
Supported Input Types
FLOAT64
Optional Clauses
The clauses are applied in the following order:
- OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within STDDEV_POP().
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
Return Data Type
FLOAT64
STDDEV_SAMP
1 |
STDDEV_SAMP([DISTINCT] expression) [OVER (...)] |
Description
Returns the sample (unbiased) standard deviation of the values. The return result is between 0 and +Inf.
This function ignores any NULL inputs. If there are fewer than two non-NULL inputs, this function returns NULL.
Supported Input Types
FLOAT64
Optional Clauses
The clauses are applied in the following order:
- OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within STDDEV_SAMP().
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
Return Data Type
FLOAT64
STDDEV
1 |
STDDEV([DISTINCT] expression) [OVER (...)] |
Description
An alias of STDDEV_SAMP.
VAR_POP
1 |
VAR_POP([DISTINCT] expression) [OVER (...)] |
Description
Returns the population (biased) variance of the values. The return result is between 0 and +Inf.
This function ignores any NULL inputs. If all inputs are ignored, this function returns NULL.
If this function receives a single non-NULL input, it returns 0.
Supported Input Types
FLOAT64
Optional Clauses
The clauses are applied in the following order:
- OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within VAR_POP().
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
Return Data Type
FLOAT64
VAR_SAMP
1 |
VAR_SAMP([DISTINCT] expression) [OVER (...)] |
Description
Returns the sample (unbiased) variance of the values. The return result is between 0 and +Inf.
This function ignores any NULL inputs. If there are fewer than two non-NULL inputs, this function returns NULL.
Supported Input Types
FLOAT64
Optional Clauses
The clauses are applied in the following order:
- OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within VAR_SAMP().
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
Return Data Type
FLOAT64
VARIANCE
1 |
VARIANCE([DISTINCT] expression) [OVER (...)] |
Description
An alias of VAR_SAMP.
Approximate Aggregate Functions
Approximate aggregate functions are scalable in terms of memory usage and time, but produce approximate results instead of exact results. For more background, see Approximate Aggregation.
APPROX_COUNT_DISTINCT
1 |
APPROX_COUNT_DISTINCT(expression) |
Description
Returns the approximate result for COUNT(DISTINCT expression). The value returned is a statistical estimate—not necessarily the actual value.
This function is less accurate than COUNT(DISTINCT expression), but performs better on huge input.
Supported Argument Types
Any data type except: ARRAY STRUCT
Returned Data Types
INT64
Examples
1 2 3 4 5 6 7 8 |
SELECT APPROX_COUNT_DISTINCT(x) as approx_distinct FROM UNNEST([0, 1, 1, 2, 3, 5]) as x; +-----------------+ | approx_distinct | +-----------------+ | 5 | +-----------------+ |
APPROX_QUANTILES
1 |
APPROX_QUANTILES([DISTINCT] expression, number [{IGNORE|RESPECT} NULLS]) |
Description
Returns the approximate boundaries for a group of expression values, where number represents the number of quantiles to create. This function returns an array of number + 1 elements, where the first element is the approximate minimum and the last element is the approximate maximum.
Supported Argument Types
expression can be any supported data type except: ARRAY STRUCT number must be INT64.
Optional Clauses
The clauses are applied in the following order:
- DISTINCT: Each distinct value of expression is aggregated only once into the result.
- IGNORE NULLS or RESPECT NULLS: If IGNORE NULLS is specified or if neither is specified, the NULL values are excluded from the result. If RESPECT NULLS is specified, the NULL values are included in the result (an error is raised, however, if an array in the final query result contains a NULL element).
Returned Data Types
An ARRAY of the type specified by the expression parameter.
Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
SELECT APPROX_QUANTILES(x, 2) AS approx_quantiles FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x; +------------------+ | approx_quantiles | +------------------+ | [1, 5, 10] | +------------------+ SELECT APPROX_QUANTILES(x, 100)[OFFSET(90)] AS percentile_90 FROM UNNEST([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) AS x; +---------------+ | percentile_90 | +---------------+ | 9 | +---------------+ SELECT APPROX_QUANTILES(DISTINCT x, 2) AS approx_quantiles FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x; +------------------+ | approx_quantiles | +------------------+ | [1, 6, 10] | +------------------+ SELECT FORMAT("%T", APPROX_QUANTILES(x, 2 RESPECT NULLS)) AS approx_quantiles FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x; +------------------+ | approx_quantiles | +------------------+ | [NULL, 4, 10] | +------------------+ SELECT FORMAT("%T", APPROX_QUANTILES(DISTINCT x, 2 RESPECT NULLS)) AS approx_quantiles FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x; +------------------+ | approx_quantiles | +------------------+ | [NULL, 6, 10] | +------------------+ |
APPROX_TOP_COUNT
1 |
APPROX_TOP_COUNT(expression, number) |
Description
Returns the approximate top elements of expression. The number parameter specifies the number of elements returned.
Supported Argument Types
expression can be of any data type that the GROUP BY clause supports. number must be INT64.
Returned Data Types
An ARRAY of type STRUCT. The STRUCT contains two fields. The first field (named value) contains an input value. The second field (named count) contains an INT64 specifying the number of times the value was returned.
Returns NULL if there are zero input rows.
Examples
1 2 3 4 5 6 7 8 |
SELECT APPROX_TOP_COUNT(x, 2) as approx_top_count FROM UNNEST(["apple", "apple", "pear", "pear", "pear", "banana"]) as x; +-------------------------+ | approx_top_count | +-------------------------+ | [{pear, 3}, {apple, 2}] | +-------------------------+ |
NULL Handling
APPROX_TOP_COUNT does not ignore NULLs in the input. For example:
1 2 3 4 5 6 7 8 |
SELECT APPROX_TOP_COUNT(x, 2) as approx_top_count FROM UNNEST([NULL, "pear", "pear", "pear", "apple", NULL]) as x; +------------------------+ | approx_top_count | +------------------------+ | [{pear, 3}, {NULL, 2}] | +------------------------+ |
APPROX_TOP_SUM
1 |
APPROX_TOP_SUM(expression, weight, number) |
Description
Returns the approximate top elements of expression, based on the sum of an assigned weight. The number parameter specifies the number of elements returned.
If the weight input is negative or NaN, this function returns an error.
Supported Argument Types
expression can be of any data type that the GROUP BY clause supports. weight must be one of the following:- INT64
- FLOAT64
Returned Data Types
An ARRAY of type STRUCT. The STRUCT contains two fields: value and sum. The value field contains the value of the input expression. The sum field is the same type as weight, and is the approximate sum of the input weight associated with the value field.
Returns NULL if there are zero input rows.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM UNNEST([ STRUCT("apple" AS x, 3 AS weight), ("pear", 2), ("apple", 0), ("banana", 5), ("pear", 4) ]); +--------------------------+ | approx_top_sum | +--------------------------+ | [{pear, 6}, {banana, 5}] | +--------------------------+ |
NULL Handling
APPROX_TOP_SUM does not ignore NULL values for the expression and weight parameters.
1 2 3 4 5 6 7 8 |
SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM UNNEST([STRUCT("apple" AS x, NULL AS weight), ("pear", 0), ("pear", NULL)]); +----------------------------+ | approx_top_sum | +----------------------------+ | [{pear, 0}, {apple, NULL}] | +----------------------------+ |
1 2 3 4 5 6 7 8 |
SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM UNNEST([STRUCT("apple" AS x, 0 AS weight), (NULL, 2)]); +-------------------------+ | approx_top_sum | +-------------------------+ | [{NULL, 2}, {apple, 0}] | +-------------------------+ |
1 2 3 4 5 6 7 8 |
SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM UNNEST([STRUCT("apple" AS x, 0 AS weight), (NULL, NULL)]); +----------------------------+ | approx_top_sum | +----------------------------+ | [{apple, 0}, {NULL, NULL}] | +----------------------------+ |
HyperLogLog++ Functions
Kochava Query supports the following approximate aggregate functions using the HyperLogLog++ algorithm. For an explanation of how approximate aggregate functions work, see Approximate Aggregation.
HLL_COUNT.INT
1 |
HLL_COUNT.INIT(input [, precision]) |
Description
A scalar function that takes one or more input values and aggregates them into a HyperLogLog++ sketch. Each sketch is represented using the BYTES data type. You can then merge sketches using HLL_COUNT.MERGE or HLL_COUNT.MERGE_PARTIAL . If no merging is needed, you can extract the final count of distinct values from the sketch using HLL_COUNT.EXTRACT .
An input can be one of the following:
- INT64
- STRING
- BYTES
This function supports an optional parameter, precision. This parameter defines the accuracy of the estimate at the cost of additional memory required to process the sketches or store them on disk. The following table shows the allowed precision values, the maximum sketch size per group, and confidence interval (CI) of typical precisions:
Precision | Max. Sketch Size (MB) | 65% CI | 95% CI | 99% CI |
---|---|---|---|---|
10 | 1 | ±1.63% | ±3.25% | ±6.50% |
11 | 2 | ±1.15% | ±2.30% | ±4.60% |
12 | 4 | ±0.81% | ±1.63% | ±3.25% |
13 | 8 | ±0.57% | ±1.15% | ±1.72% |
14 | 16 | ±0.41% | ±0.81% | ±1.22% |
15(default) | 32 | ±0.29% | ±0.57% | ±0.86% |
16 | 64 | ±0.20% | ±0.41% | ±0.61% |
17 | 128 | ±0.14% | ±0.29% | ±0.43% |
18 | 256 | ±0.10% | ±0.20% | ±0.41% |
19 | 512 | ±0.07% | ±0.14% | ±0.29% |
20 | 1024 | ±0.05% | ±0.10% | ±0.20% |
21 | 2048 | ±0.04% | ±0.07% | ±0.14% |
22 | 4096 | ±0.03% | ±0.05% | ±0.10% |
23 | 8192 | ±0.02% | ±0.04% | ±0.07% |
24 | 16384 | ±0.01% | ±0.03% | ±0.05% |
If the input is NULL, this function returns NULL.
For more information, see HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm.
Supported Input Type
BYTES
Return Type
BYTES
Example
1 2 3 4 5 6 7 8 9 10 |
SELECT HLL_COUNT.INIT(respondent) AS respondents_hll, flavor, country FROM UNNEST([ STRUCT(1 AS respondent, "Vanilla" AS flavor, "CH" AS country), (1, "Chocolate", "CH"), (2, "Chocolate", "US"), (2, "Strawberry", "US")]) GROUP BY flavor, country; |
HLL_COUNT.MERGE
1 |
HLL_COUNT.MERGE(sketch) |
Description
An aggregate function that returns the cardinality of several HyperLogLog++ set sketches by computing their union.
Each sketch must have the same precision and be initialized on the same type. Attempts to merge sketches with different precisions or for different types results in an error. For example, you cannot merge a sketch initialized from INT64 data with one initialized from STRING data.
This function ignores NULL values when merging sketches. If the merge happens over zero rows or only over NULL values, the function returns 0.
Supported Input Types
BYTES
Return Type
INT64
Example
1 2 3 4 5 6 7 8 9 10 11 12 13 |
SELECT HLL_COUNT.MERGE(respondents_hll) AS num_respondents, flavor FROM ( SELECT HLL_COUNT.INIT(respondent) AS respondents_hll, flavor, country FROM UNNEST([ STRUCT(1 AS respondent, "Vanilla" AS flavor, "CH" AS country), (1, "Chocolate", "CH"), (2, "Chocolate", "US"), (2, "Strawberry", "US")]) GROUP BY flavor, country) GROUP BY flavor; |
HLL_COUNT.MERGE_PARTIAL
1 |
HLL_COUNT.MERGE_PARTIAL(sketch) |
Description
An aggregate function that takes one or more HyperLogLog++ sketch inputs and merges them into a new sketch.
This function returns NULL if there is no input or all inputs are NULL.
Supported Input Types
BYTES
Return Types
BYTES
Example
1 2 3 4 5 6 7 8 9 10 11 12 13 |
SELECT HLL_COUNT.MERGE_PARTIAL(respondents_hll) AS num_respondents, flavor FROM ( SELECT HLL_COUNT.INIT(respondent) AS respondents_hll, flavor, country FROM UNNEST([ STRUCT(1 AS respondent, "Vanilla" AS flavor, "CH" AS country), (1, "Chocolate", "CH"), (2, "Chocolate", "US"), (2, "Strawberry", "US")]) GROUP BY flavor, country) GROUP BY flavor; |
HLL_COUNT.EXTRACT
1 |
HLL_COUNT.EXTRACT(sketch) |
Description
A scalar function that extracts an cardinality estimate of a single HyperLogLog++ sketch.
If sketch is NULL, this function returns a cardinality estimate of 0.
Supported Input Types
BYTES
Return Type
INT64
Example
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
SELECT flavor, country, HLL_COUNT.EXTRACT(respondents_hll) AS num_respondents FROM ( SELECT HLL_COUNT.INIT(respondent) AS respondents_hll, flavor, country FROM UNNEST([ STRUCT(1 AS respondent, "Vanilla" AS flavor, "CH" AS country), (1, "Chocolate", "CH"), (2, "Chocolate", "US"), (2, "Strawberry", "US")]) GROUP BY flavor, country); +------------+---------+-----------------+ | flavor | country | num_respondents | +------------+---------+-----------------+ | Vanilla | CH | 1 | | Chocolate | CH | 1 | | Chocolate | US | 1 | | Strawberry | US | 1 | +------------+---------+-----------------+ |
Numbering Functions
The following sections describe the numbering functions that Kochava Query supports. Numbering functions are a subset of analytic functions. For an explanation of how analytic functions work, see Analytic Function Concepts. For a description of how numbering functions work, see the Numbering Function Concepts.
OVER clause requirements:- PARTITION BY : Optional.
- ORDER BY : Required, except for ROW_NUMBER().
- window_frame_clause: Disallowed.
RANK
Description
Returns the ordinal (1-based) rank of each row within the ordered partition. All peer rows receive the same rank value. The next row or set of peer rows receives a rank value which increments by the number of peers with the previous rank value, instead of DENSE_RANK, which always increments by 1.
Supported Argument Types
INT64
DENSE_RANK
Description
Returns the ordinal (1-based) rank of each row within the window partition. All peer rows receive the same rank value, and the subsequent rank value is incremented by one.
Supported Argument Types
INT64
PERCENT_RANK
Description
Return the percentile rank of a row defined as (RK-1)/(NR-1), where RK is the RANK of the row and NR is the number of rows in the partition. Returns 0 if NR=1.
Supported Argument Types
FLOAT64
CUME_DIST
Description
Return the relative rank of a row defined as NP/NR. NP is defined to be the number of rows that either precede or are peers with the current row. NR is the number of rows in the partition.
Supported Argument Types
FLOAT64
NTILE
1 |
NTILE(constant_integer_expression) |
Description
This function divides the rows into constant_integer_expression buckets based on row ordering and returns the 1-based bucket number that is assigned to each row. The number of rows in the buckets can differ by at most 1. The remainder values (the remainder of number of rows divided by buckets) are distributed one for each bucket, starting with bucket 1. If constant_integer_expression evaluates to NULL, 0 or negative, an error is provided.
Supported Argument Types
INT64
ROW_NUMBER
Description
Does not require the ORDER BY clause. Returns the sequential row ordinal (1-based) of each row for each ordered partition. If the ORDER BY clause is unspecified then the result is non-deterministic.
Supported Argument Types
INT64
Bit Functions
Kochava Query supports the following bit functions.
BIT_COUNT
1 |
BIT_COUNT(expression) |
Description
The input, expression , must be an integer or BYTES.
Returns the number of bits that are set in the input expression. For signed integers, this is the number of bits in two’s complement form.
Return Data Type
INT64
Example
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
SELECT a, BIT_COUNT(a) AS a_bits, FORMAT("%T", b) as b, BIT_COUNT(b) AS b_bits FROM UNNEST([ STRUCT(0 AS a, b'' AS b), (0, b'\x00'), (5, b'\x05'), (8, b'\x00\x08'), (0xFFFF, b'\xFF\xFF'), (-2, b'\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFE'), (-1, b'\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF'), (NULL, b'\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF') ]) AS x; +-------+--------+---------------------------------------------+--------+ | a | a_bits | b | b_bits | +-------+--------+---------------------------------------------+--------+ | 0 | 0 | b"" | 0 | | 0 | 0 | b"\x00" | 0 | | 5 | 2 | b"\x05" | 2 | | 8 | 1 | b"\x00\x08" | 1 | | 65535 | 16 | b"\xff\xff" | 16 | | -2 | 63 | b"\xff\xff\xff\xff\xff\xff\xff\xfe" | 63 | | -1 | 64 | b"\xff\xff\xff\xff\xff\xff\xff\xff" | 64 | | NULL | NULL | b"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" | 80 | +-------+--------+---------------------------------------------+--------+ |
Mathematical Functions
All mathematical functions have the following behaviors:
- They return NULL if any of the input parameters is NULL.
- They return NaN if any of the arguments is NaN.
ABS
1 |
ABS(X) |
Description
Computes absolute value. Returns an error if the argument is an integer and the output value cannot be represented as the same type; this happens only for the largest negative input value, which has no positive representation. Returns +inf for a +/-inf argument.
SIGN
1 |
SIGN(X) |
Description
Returns -1, 0, or +1 for negative, zero and positive arguments respectively. For floating point arguments, this function does not distinguish between positive and negative zero. Returns NaN for a NaN argument.
IS_INF
1 |
IS_INF(X) |
Description
Returns TRUE if the value is positive or negative infinity. Returns NULL for NULL inputs.
IS_NAN
1 |
IS_NAN(X) |
Description
Returns TRUE if the value is a NaN value. Returns NULL for NULL inputs.
IEEE_DIVIDE
1 |
IEEE_DIVIDE(X, Y) |
Description
Divides X by Y; this function never fails. Returns FLOAT64. Unlike the division operator (/), this function does not generate errors for division by zero or overflow.
Special cases:
- If the result overflows, returns +/-inf.
- If Y=0 and X=0, returns NaN.
- If Y=0 and X!=0, returns +/-inf.
- If X = +/-inf and Y = +/-inf , returns NaN.
The behavior of IEEE_DIVIDE is further illustrated in the table below.
Special Cases for IEEE_DIVIDE
The following table lists special cases for IEEE_DIVIDE.
Numerator Data Type (X) | Denominator Data Type (Y) | Result Value |
---|---|---|
Anything except 0 | 0 | +/-inf |
0 | 0 | NaN |
0 | NaN | NaN |
NaN | 0 | NaN |
+/-inf | +/-inf | NaN |
RAND
1 |
RAND() |
Description
Generates a pseudo-random value of type FLOAT64 in the range of [0, 1), inclusive of 0 and exclusive of 1.
SQRT
SQRT(X)
Description
Computes the square root of X. Generates an error if X is less than 0. Returns +inf if X is +inf.
POW
1 |
POW(X, Y) |
Description
Returns the value of X raised to the power of Y. If the result underflows and is not representable, then the function returns a value of zero. Returns an error if one of the following is true:
- X is a finite value less than 0 and Y is a noninteger
- X is 0 and Y is a finite value less than 0
The behavior of POW() is further illustrated in the table below.
POWER
1 |
POWER(X, Y) |
Description
Synonym of POW().
Special cases for POW(X, Y) and POWER(X, Y)
The following are special cases for POW(X, Y) and POWER(X, Y).
X | Y | POW(X, Y) or POWER(X, Y) |
---|---|---|
1.0 | Any value including NaN | 1.0 |
any including NaN | 0 | 1.0 |
-1.0 | +/-inf | 1.0 |
ABS(X) < 1 | -inf | +inf |
ABS(X) < 1 | -inf | 0 |
ABS(X) < 1 | +inf | 0 |
ABS(X) < 1 | +inf | +inf |
-inf | Y < 0 | 0 |
-inf | Y < 0 | -inf if Y is an odd integer, +inf otherwise |
+inf | Y < 0 | 0 |
+inf | Y < 0 | +inf |
EXP
1 |
EXP(X) |
Description
Computes e to the power of X, also called the natural exponential function. If the result underflows, this function returns a zero. Generates an error if the result overflows. If X is +/-inf, then this function returns +inf or 0.
LN
1 |
LN(X) |
Description
Computes the natural logarithm of X. Generates an error if X is less than or equal to zero. If X is +inf, then this function returns +inf.
LOG
1 |
LOG(X [, Y]) |
Description
If only X is present, LOG is a synonym of LN. If Y is also present, LOG computes the logarithm of X to base Y. Generates an error in these cases:
- X is less than or equal to zero
- Y is 1.0
- Y is less than or equal to zero.
The behavior of LOG(X, Y) is further illustrated in the table below.
Special Cases for LOG(X, Y)
X | Y | LOG(X, Y) |
---|---|---|
-inf | Any Value | NaN |
Any Value | +inf | NaN |
+inf | 0.0 Y < 1.0 | -inf |
+inf | Y > 1.0 | +inf |
LOG10
1 |
LOG10(X) |
Description
Similar to LOG, but computes logarithm to base 10.
GREATEST
1 |
GREATEST(X1,...,XN) |
Description
Returns NULL if any of the inputs is NULL . Otherwise, returns NaN if any of the inputs is NaN . Otherwise, returns the largest value among X1,…,XN according to the < comparison.
LEAST
1 |
LEAST(X1,...,XN) |
Description
Returns NULL if any of the inputs is NULL . Returns NaN if any of the inputs is NaN . Otherwise, returns the smallest value among X1,…,XN according to the > comparison.
DIV
1 |
DIV(X, Y) |
Description
Returns the result of integer division of X by Y. Division by zero returns an error. Division by -1 may overflow. See the table below for possible result types.
SAFE_DIVIDE
1 |
SAFE_DIVIDE(X, Y) |
Description
Equivalent to the division operator (/), but returns NULL if an error occurs, such as a division by zero error.
MOD
1 |
MOD(X, Y) |
Description
Modulo function: returns the remainder of the division of X by Y. Returned value has the same sign as X. An error is generated if Y is 0. See the table below for possible result types.
ROUND
1 |
ROUND(X [, N]) |
Description
If only X is present, ROUND rounds X to the nearest integer. If N is present, ROUND rounds X to N decimal places after the decimal point. If N is negative, ROUND will round off digits to the left of the decimal point. Rounds halfway cases away from zero. Generates an error if overflow occurs.
TRUNC
1 |
TRUNC(X [, N]) |
Description
If only X is present, TRUNC rounds X to the nearest integer whose absolute value is not greater than the absolute value of X. If N is also present, TRUNC behaves like ROUND(X, N) , but always rounds towards zero and never overflows.
CEIL
1 |
CEIL(X) |
Description
Returns the smallest integral value (with FLOAT64 type) that is not less than X.
CEILING
1 |
CEILING(X) |
Description
Synonym of CEIL(X)
FLOOR
1 |
FLOOR(X) |
Description
Returns the largest integral value (with FLOAT64 type) that is not greater than X.
Example Rounding Function Behavior
Example behavior of Kochava Query rounding functions:
Input “X” | ROUND(X) | TRUNC(X) | CEIL(X) | FLOOR(X) |
---|---|---|---|---|
2.0 | 2.0 | 2.0 | 2.0 | 2.0 |
2.3 | 2.0 | 2.0 | 3.0 | 2.0 |
2.8 | 3.0 | 2.0 | 3.0 | 2.0 |
2.5 | 3.0 | 2.0 | 3.0 | 2.0 |
-2.3 | -2.0 | -2.0 | -2.0 | -3.0 |
-2.8 | -3.0 | -2.0 | -2.0 | -3.0 |
-2.5 | -3.0 | -2.0 | -2.0 | -3.0 |
0 | 0 | 0 | 0 | 0 |
+/-inf | +/-inf | +/-inf | +/-inf | +/-inf |
NaN | NaN | NaN | NaN | NaN |
COS
1 |
COS(X) |
Description
Computes cosine of X. Never fails.
COSH
1 |
COSH(X) |
Description
Computes the hyperbolic cosine of X. Generates an error if an overflow occurs.
ACOS
1 |
ACOS(X) |
Description
Computes the principal value of the arc cosine of X. The return value is in the range [0,]. Generates an error if X is a finite value outside of range [-1, 1].
ACOSH
1 |
ACOSH(X) |
Description
Computes the inverse hyperbolic cosine of X. Generates an error if X is a finite value less than 1.
SIN
1 |
SIN(X) |
Description
Computes the sine of X. Never fails.
SINH
1 |
SINH(X) |
Description
Computes the hyperbolic sine of X. Generates an error if an overflow occurs.
ASIN
1 |
ASIN(X) |
Description
Computes the principal value of the arc sine of X. The return value is in the range [-π/2,π/2]. Generates an error if X is a finite value outside of range [-1, 1].
ASINH
1 |
ASINH(X) |
Description
Computes the inverse hyperbolic sine of X. Does not fail.
TAN
1 |
TAN(X) |
Description
Computes tangent of X. Generates an error if an overflow occurs.
TANH
1 |
TANH(X) |
Description
Computes hyperbolic tangent of X. Does not fail.
ATAN
1 |
ATAN(X) |
Description
Computes the principal value of the arc tangent of X. The return value is in the range [-π/2,π/2]. Does not fail.
ATANH
1 |
ATANH(X) |
Description
Computes the inverse hyperbolic tangent of X. Generates an error if the absolute value of X is greater or equal 1.
ATAN2
1 |
ATAN2(Y, X) |
Description
Calculates the principal value of the arc tangent of Y/X using the signs of the two arguments to determine the quadrant. The return value is in the range [-π,π]. The behavior of this function is further illustrated in the table below.
Special cases for ATAN2()
Y | X | ATAN2(Y, X) |
---|---|---|
NaN | Any Value | NaN |
Any Value | NaN | NaN |
0 | 0 | 0, π or -π depending on the sign of X and Y |
Finite value | -inf | π or -π depending on the sign of Y |
Finite value | +inf | 0 |
+/-inf | Finite value | π/2 or π/2 depending on the sign of Y |
+/-inf | -inf | ¾π or -¾π depending on the sign of Y |
+/-inf | +inf | π/4 or -π/4 depending on the sign of Y |
Special Cases for Trigonometric and Hyperbolic Rounding Functions
X | COS(X) | COSH(X) | ACOS(X) | ACOSH(X) | SIN(X) | SINH(X) | ASIN(X) | ASINH(X) | TAN(X) | TANH(X) | ATAN(X) | ATANH(X) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
+/-inf | NaN | =+inf | NaN | =+inf | NaN | =+inf | NaN | =+inf | NaN | =+1.0 | π/2 | NaN |
-inf | NaN | =+inf | NaN | NaN | NaN | -inf | NaN | -inf | NaN | -1.0 | -π/2 | NaN |
NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
Navigation Functions
The following sections describe the navigation functions that Kochava Query supports. Navigation functions are a subset of analytic functions. For an explanation of how analytic functions work, see Analytic Function Concepts. For an explanation of how navigation functions work, see Navigation Function Concepts.
FIRST_VALUE
1 |
FIRST_VALUE (value_expression [ |