Support Home > Advanced Tools > Custom Query > Standard SQL Functions and Operators

Standard SQL Functions and Operators

This page explains Kochava Query expressions, including functions and operators.

Function Call Rules

The following rules apply to all functions unless explicitly indicated otherwise in the function description:

  • For functions that accept numeric types, if one operand is a floating point operand and the other operand is another numeric type, both operands are converted to FLOAT64 before the function is evaluated.
  • If an operand is NULL , the result is NULL , with the exception of the IS operator.
  • For functions that are time zone sensitive (as indicated in the function description), the default time zone, UTC, is used if a time zone is not specified.

SAFE. prefix

Syntax:

 

Desctiption

If you begin a function with the SAFE. prefix, it will return NULL instead of an error. The SAFE. prefix only prevents errors from the prefixed function itself: it does not prevent errors that occur while evaluating argument expressions. The SAFE. prefix only prevents errors that occur because of the value of the function inputs, such as “value out of range” errors; other errors, such as internal or system errors, may still occur. If the function does not return an error, SAFE. has no effect on the output. If the function never returns an error, like RAND , then SAFE. has no effect.

Operators, such as + and = , do not support the SAFE. prefix. To prevent errors from a division operation, use SAFE_DIVIDE. Some operators, such as IN , ARRAY , and UNNEST , resemble functions, but do not support the SAFE. prefix. The CAST and EXTRACT functions also do not support the SAFE. prefix. To prevent errors from casting, use SAFE_CAST.

 

Example:

In the following example, the first use of the SUBSTR function would normally return an error, because the function does not support length arguments with negative values. However, the SAFE. prefix causes the function to return NULL instead. The second use of the SUBSTR function provides the expected output: the SAFE. prefix has no effect.

 

Supported Functions

Kochava Query supports the use of the SAFE. prefix with most scalar functions that can raise errors, including STRING functions, math functions, DATE functions, DATETIME functions, and TIMESTAMP functions. Kochava Query does not support the use of the SAFE. prefix with aggregate, analytic, or user-defined functions.


Conversion Rules

“Conversion” includes, but is not limited to, casting and coercion.

  • Casting is explicit conversion and uses the CAST() function.
  • Coercion is implicit conversion, which Kochava Query performs automatically under the conditions described below.
  • There is a third group of conversion functions that have their own function names, such as UNIX_DATE() .

The table below summarizes all possible CAST and coercion possibilities for Kochava Query data types. “Coercion To” applies to all expressions of a given data type (e.g., a column) , but literals and parameters can also be coerced. See Literal Coercion and Parameter Coercion for details.

From Type CAST to Coercion To
INT64 BOOL
INT64
NUMERIC
FLOAT64
STRING
FLOAT64
NUMERIC
NUMERIC INT64
NUMERIC
FLOAT64
STRING
FLOAT64
FLOAT64 INT64
NUMERIC
FLOAT64
STRING
BOOL BOOL
INT64
STRING
STRING BOOL
INT64
NUMERIC
FLOAT64
STRING
BYTES
DATE
DATETIME
TIME
TIMESTAMP
BYTES BYTES
STRING
DATE DATE
DATETIME
STRING
TIMESTAMP
DATETIME DATE
DATETIME
STRING
TIME
TIMESTAMP
TIME STRING
TIME
TIMESTAMP DATE
DATETIME
STRING
TIME
TIMESTAMP
ARRAY ARRAY
STRUCT STRUCT

 

Casting

Syntax:

 

Cast syntax is used in a query to indicate that the result type of an expression should be converted to some other type.

 

Example:

 

This results in "true" if x is 1 , "false" for any other non-NULL value, and NULL if x is NULL .

Casts between supported types that do not successfully map from the original value to the target domain produce runtime errors. For example, casting BYTES to STRING where the byte sequence is not valid UTF-8 results in a runtime error.

When casting an expression x of the following types, these rules apply:

 

From To Rule(s) when casting x
INT64 FLOAT64 Returns a close but potentially not exact FLOAT64 value.
INT64 BOOL Returns FALSE if x is 0, TRUE otherwise.
NUMERIC Floating Point NUMERIC will convert to the closest floating point number with a possible loss of precision.
FLOAT64 INT64 Returns the closest INT64 value.
Halfway cases such as 1.5 or -0.5 round away from zero.
FLOAT64 STRING Returns an approximate string representation.
FLOAT64 NUMERIC If the floating point number has more than nine digits after the decimal point, it will round half away from zero. Casting a NaN, +inf or -inf will return an error. Casting a value outside the range of NUMERIC will return an overflow error.
BOOL INT64 Returns 1 if x is TRUE, 0 otherwise.
BOOL STRING Returns “true” if x is TRUE, “false” otherwise.
STRING FLOAT64 Returns x as a FLOAT64 value, interpreting it as having the same form as a valid FLOAT64 literal.
Also supports casts from “inf“, “+inf“, “-inf“, and “nan“.
Conversions are case-insensitive.
STRING NUMERIC The numeric literal contained in the STRING must not exceed the maximum precision or range of the NUMERIC type, or an error will occur. If the number of digits after the decimal point exceeds nine, then the resulting NUMERIC value will round half away from zero to have nine digits after the decimal point.
STRING BOOL Returns TRUE if x is “true” and FALSE if x is “false
All other values of x are invalid and throw an error instead of casting to BOOL.
STRINGs are case-insensitive when converting to BOOL.
STRING BYTES STRINGs are cast to BYTES using UTF-8 encoding. For example, the STRING “©”, when cast to BYTES, would become a 2-byte sequence with the hex values C2 and A9.
BYTES STRING Returns x interpreted as a UTF-8 STRING.
For example, the BYTES literal b’\xc2\xa9′, when cast to STRING, is interpreted as UTF-8 and becomes the unicode character “©”.
An error occurs if x is not valid UTF-8.
ARRAY ARRAY Must be the exact same ARRAY type.
STRUCT STRUCT Allowed if the following conditions are met:

  1. The two STRUCTs have the same number of fields.
  2. The original STRUCT field types can be explicitly cast to the corresponding target STRUCT field types (as defined by field order, not field name).

 

Safe Casting

When using CAST , a query can fail if Kochava Query is unable to perform the cast. For example, the following query generates an error:

 

If you want to protect your queries from these types of errors, you can use SAFE_CAST . SAFE_CAST is identical to CAST , except it returns NULL instead of raising an error.

 

If you are casting from bytes to strings, you can also use the function, SAFE_CONVERT_BYTES_TO_STRING . Any invalid UTF-8 characters are replaced with the unicode replacement character, U+FFFD . See SAFE_CONVERT_BYTES_TO_STRING for more information.

 

Casting Hex Strings to Integers

If you are working with hex strings (0x123), you can cast those strings as integers:

 

Casting Date Types

Kochava Query supports casting date types to/from strings as follows:

 

Casting from a date type to a string is independent of time zone and is of the form YYYY-MM-DD . When casting from string to date, the string must conform to the supported date literal format, and is independent of time zone. If the string expression is invalid or represents a date that is outside of the supported min/max range, then an error is produced.

 

Casting Timestamp Types

Kochava Query supports casting timestamp types to/from strings as follows:

 

When casting from timestamp types to string, the timestamp is interpreted using the default time zone, UTC. The number of subsecond digits produced depends on the number of trailing zeroes in the subsecond part: the CAST function will truncate zero, three, or six digits.

When casting from string to a timestamp, string_expression must conform to the supported timestamp literal formats, or else a runtime error occurs. The string_expression may itself contain a time_zone —see time zones. If there is a time zone in the string_expression , that time zone is used for conversion, otherwise the default time zone, UTC, is used. If the string has fewer than six digits, then it is implicitly widened.

An error is produced if the string_expression is invalid, has more than six subsecond digits (i.e. precision greater than microseconds), or represents a time outside of the supported timestamp range.

 

Casting Between Date and Timestamp Types

Kochava Query supports casting between date and timestamp types as follows:

 

Casting from a date to a timestamp interprets date_expression as of midnight (start of the day) in the default time zone, UTC. Casting from a timestamp to date effectively truncates the timestamp as of the default time zone.


Coercion

Kochava
Query coerces the result type of an expression to another type if needed to match function signatures. For example, if function func() is defined to take a single argument of type INT64 and an expression is used as an argument that has a result type of FLOAT64, then the result of the expression will be coerced to INT64 type before func() is computed.

 

Literal Coercion

Kochava Query supports the following literal coercions:

Input Data Type Result Data Type Notes
STRING literal DATE
TIMESTAMP

Literal coercion is needed when the actual literal type is different from the type expected by the function in question. For example, if function func() takes a DATE argument, then the expression func(“2014-09-27”) is valid because the STRING literal “2014-09-27” is coerced to DATE.

Literal conversion is evaluated at analysis time, and gives an error if the input literal cannot be converted successfully to the target type.

NOTE: String literals do not coerce to numeric types.


Parameter Coercion

Kochava Query supports the following parameter coercions:

Input Data Type Result Data Type
STRING parameter

If the parameter value cannot be coerced successfully to the target type, an error is provided.


Additional Conversion Functions

Kochava Query provides the following additional conversion functions:

  • DATE functions
  • DATETIME functions
  • TIME functions
  • TIMESTAMP functions

Aggregate Functions

An aggregate function is a function that performs a calculation on a set of values. COUNT, MIN and MAX are examples of aggregate functions.

The following sections describe the aggregate functions that Kochava Query supports.


ANY_VALUE

 

Description

Returns any value from the input or NULL if there are zero input rows. The value returned is non-deterministic, which means you might receive a different result each time you use this function.

 

Supported Argument Types

Any

 

Optional Clause

OVER : Specifies a window. See Analytic Functions.

 

Returned Data Types

Matches the input data type.

 

Examples


ARRAY_AGG

 

Description

Returns an ARRAY of expression values.

 

Supported Argument Types

All data types except ARRAY.

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER : Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within ARRAY_AGG() .
  2. DISTINCT : Each distinct value of expression is aggregated only once into the result.
  3. IGNORE NULLS or RESPECT NULLS : If IGNORE NULLS is specified, the NULL values are excluded from the result. If RESPECT NULLS is specified or if neither is specified, the NULL values are included in the result (an error is raised, however, if an array in the final query result contains a NULL element).
  4. ORDER BY : Specifies the order of the values.
    • For each sort key, the default sort direction is ASC.
    • NULLs: In the context of the ORDER BY clause, NULLs are the minimum possible value; that is, NULLs appear first in ASC sorts and last in DESC sorts.
    • Floating point data types: see Floating Point Semantics on ordering and grouping.
    • If DISTINCT is also specified, then the sort key must be the same as expression .
    • If ORDER BY is not specified, the order of the elements in the output array is non-deterministic, which means you might receive a different result each time you use this function.
  5. LIMIT: Specifies the maximum number of expression inputs in the result. The limit n must be a constant INT64.

 

Returned Data Types

ARRAY

If there are zero input rows, this function returns NULL.

Examples


ARRAY_CONCAT_AGG

 

Description

Concatenates elements from expression of type ARRAY, returning a single ARRAY as a result. This function ignores NULL input arrays, but respects the NULL elements in non-NULL input arrays (an error is raised, however, if an array in the final query result contains a NULL element).

 

Supported Argument Types

ARRAY

 

Optional Clauses

The clauses are applied in the following order:

  1. ORDER BY: Specifies the order of the values.
    • For each sort key, the default sort direction is ASC.
    • Array ordering is not supported, and thus the sort key cannot be the same as expression.
    • NULLs: In the context of the ORDER BY clause, NULLs are the minimum possible value; that is, NULLs appear first in ASC sorts and last in DESC sorts.
    • Floating point data types: see Floating Point Semantics on ordering and grouping.
    • If ORDER BY is not specified, the order of the elements in the output array is non-deterministic, which means you might receive a different result each time you use this function.
  2. LIMIT: Specifies the maximum number of expression inputs in the result. The limit applies to the number of input arrays, not the number of elements in the arrays. An empty array counts as 1. A NULL array is not counted. The limit n must be a constant INT64.

 

Returned Data Types

ARRAY

Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

Examples


AVG

Description

Returns the average of non- NULL input values, or NaN if the input contains a NaN.

 

Supported Argument Types

Any numeric input type, such as INT64. Note that, for floating point input types, the return result is non-deterministic, which means you might receive a different result each time you use this function.

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within AVG().
  2. DISTINCT: Each distinct value of expression is aggregated only once into the result.

 

Returned Data Types

  • NUMERIC if the input type is NUMERIC.
  • FLOAT64

Examples


BIT_AND

 

Description

Performs a bitwise AND operation on expression and returns the result.

 

Supported Argument Types

  • INT64

 

Returned Data Types

INT64

Examples


BIT_OR

 

Description

Performs a bitwise OR operation on expression and returns the result.

 

Supported Argument Types

  • INT64

 

Returned Data Types

INT64

Examples


BIT_XOR

 

Description

Performs a bitwise XOR operation on expression and returns the result.

 

Supported Argument Types

  • INT64

 

Optional Clause

DISTINCT: Each distinct value of expression is aggregated only once into the result.

 

Returned Data Types

INT64

Examples


COUNT

  1. COUNT(*)  [OVER (...)]
  2. COUNT([DISTINCT] expression)  [OVER (...)]

 

Description

  1. Returns the number of rows in the input.
  2. Returns the number of rows with expression evaluated to any value other than NULL.

 

Supported Argument Types

expression can be any data type.

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER: Specifies a window. See Analytic Functions.
  2. DISTINCT: Each distinct value of expression is aggregated only once into the result.

 

Return Data Types

INT64

Examples


COUNTIF

 

Description

Returns the count of TRUE values for expression. Returns if there are zero input rows or expression evaluates to FALSE for all rows.

 

Supported Argument Types

BOOL

 

Optional Clause

OVER: Specifies a window. See Analytic Functions.

 

Return Data Types

INT64

Examples


LOGICAL_AND

 

Description

Returns the logical AND of all non- NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

 

Supported Argument Types

BOOL

 

Return Data Types

BOOL

Examples


LOGICAL_OR

 

Description

Returns the logical OR of all non- NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

 

Supported Argument Types

BOOL

 

Return Data Types

BOOL

Examples


MAX

 

Description

Returns the maximum value of non- NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows. Returns NaN if the input contains a NaN .

 

Supported Argument Types

Any data type except: ARRAY STRUCT

 

Optional Clause

OVER : Specifies a window. See Analytic Functions.

 

Return Data Types

Same as the data type used as the input values.

Examples


MIN

 

Description

Returns the minimum value of non- NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows. Returns NaN if the input contains a NaN.

 

Supported Argument Types

Any data type except: ARRAY STRUCT

 

Optional Clause

OVER: Specifies a window. See Analytic Functions.

 

Return Data Types

Same as the data type used as the input values.

Examples


STRING_AGG

 

Description

Returns a value (either STRING or BYTES) obtained by concatenating non-null values.

If a delimiter is specified, concatenated values are separated by that delimiter; otherwise, a comma is used as a delimiter.

 

Supported Argument Types

STRING BYTES

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within STRING_AGG().
  2. DISTINCT: Each distinct value of expression is aggregated only once into the result.
  3. ORDER BY: Specifies the order of the values.
    • For each sort key, the default sort direction is ASC.
    • NULLs: In the context of the ORDER BY clause, NULLs are the minimum possible value; that is, NULLs appear first in ASC sorts and last in DESC sorts.
    • Floating point data types: see Floating Point Semantics on ordering and grouping.
    • If DISTINCT is also specified, then the sort key must be the same as expression.
    • If ORDER BY is not specified, the order of the elements in the output array is non-deterministic, which means you might receive a different result each time you use this function.
  4. LIMIT: Specifies the maximum number of expression inputs in the result. The limit applies to the number of input strings, not the number of characters or bytes in the inputs. An empty string counts as 1. A NULL string is not counted. The limit n must be a constant INT64.

 

Return Data Types

STRING BYTES

Examples


SUM

 

Description

Returns the sum of non-null values.

If the expression is a floating point value, the sum is non-deterministic, which means you might receive a different result each time you use this function.

 

Supported Argument Types

Any supported numeric data types.

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER: Specifies a window. See Analytic Functions.
  2. DISTINCT: Each distinct value of expression is aggregated only once into the result.

 

Return Data Types

  • Returns INT64 if the input is an integer.
  • Returns NUMERIC if the input type is NUMERIC.
  • Returns FLOAT64 if the input is a floating point value.

Returns NULL if the input contains only NULLs.

Returns Inf if the input contains Inf.

Returns -Inf if the input contains -Inf.

Returns NaN if the input contains a NaN.

Returns NaN if the input contains a combination of Inf and -Inf.

Examples


Statistical Aggregate Functions

Kochava Query supports the following statistical aggregate functions.


CORR

 

Description

Returns the Pearson coefficient of correlation of a set of number pairs. For each number pair, the first number is the dependent variable and the second number is the independent variable. The return result is between -1 and 1. A result of 0 indicates no correlation.

This function ignores any input pairs that contain one or more NULL values. If there are fewer than two input pairs without NULL values, this function returns NULL.

 

Supported Input Types

FLOAT64

 

Optional Clause

OVER: Specifies a window. See Analytic Functions.

 

Return Data Type

FLOAT64


COVAR_POP

 

Description

Returns the population covariance of a set of number pairs. The first number is the dependent variable; the second number is the independent variable. The return result is between -Inf and +Inf.

This function ignores any input pairs that contain one or more NULL values. If there is no input pair without NULL values, this function returns NULL. If there is exactly one input pair without NULL values, this function returns 0.

 

Supported Input Types

FLOAT64

 

Optional Clause

OVER : Specifies a window. See Analytic Functions.

 

Return Data Type

FLOAT64


COVAR_SAMP

 

Description

Returns the sample covariance of a set of number pairs. The first number is the dependent variable; the second number is the independent variable. The return result is between -Inf and +Inf.

This function ignores any input pairs that contain one or more NULL values. If there are fewer than two input pairs without NULL values, this function returns NULL.

 

Supported Input Types

FLOAT64

 

Optional Clause

OVER: Specifies a window. See Analytic Functions.

 

Return Data Type

FLOAT64


STDDEV_POP

 

Description

Returns the population (biased) standard deviation of the values. The return result is between 0 and +Inf.

This function ignores any NULL inputs. If all inputs are ignored, this function returns NULL.

If this function receives a single non-NULL input, it returns 0.

 

Supported Input Types

FLOAT64

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within STDDEV_POP().
  2. DISTINCT: Each distinct value of expression is aggregated only once into the result.

 

Return Data Type

FLOAT64


STDDEV_SAMP

 

Description

Returns the sample (unbiased) standard deviation of the values. The return result is between 0 and +Inf.

This function ignores any NULL inputs. If there are fewer than two non-NULL inputs, this function returns NULL.

 

Supported Input Types

FLOAT64

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within STDDEV_SAMP().
  2. DISTINCT: Each distinct value of expression is aggregated only once into the result.

 

Return Data Type

FLOAT64


STDDEV

 

Description

An alias of STDDEV_SAMP.


VAR_POP

 

Description

Returns the population (biased) variance of the values. The return result is between 0 and +Inf.

This function ignores any NULL inputs. If all inputs are ignored, this function returns NULL.

If this function receives a single non-NULL input, it returns 0.

 

Supported Input Types

FLOAT64

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within VAR_POP().
  2. DISTINCT: Each distinct value of expression is aggregated only once into the result.

 

Return Data Type

FLOAT64


VAR_SAMP

 

Description

Returns the sample (unbiased) variance of the values. The return result is between 0 and +Inf.

This function ignores any NULL inputs. If there are fewer than two non-NULL inputs, this function returns NULL.

 

Supported Input Types

FLOAT64

 

Optional Clauses

The clauses are applied in the following order:

  1. OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within VAR_SAMP().
  2. DISTINCT: Each distinct value of expression is aggregated only once into the result.

 

Return Data Type

FLOAT64


VARIANCE

 

Description

An alias of VAR_SAMP.


Approximate Aggregate Functions

Approximate aggregate functions are scalable in terms of memory usage and time, but produce approximate results instead of exact results. For more background, see Approximate Aggregation.


APPROX_COUNT_DISTINCT

 

Description

Returns the approximate result for COUNT(DISTINCT expression). The value returned is a statistical estimate—not necessarily the actual value.

This function is less accurate than COUNT(DISTINCT expression), but performs better on huge input.

 

Supported Argument Types

Any data type except: ARRAY STRUCT

 

Returned Data Types

INT64

Examples


APPROX_QUANTILES

 

Description

Returns the approximate boundaries for a group of expression values, where number represents the number of quantiles to create. This function returns an array of number + 1 elements, where the first element is the approximate minimum and the last element is the approximate maximum.

 

Supported Argument Types

expression can be any supported data type except: ARRAY STRUCT

number must be INT64.

 

Optional Clauses

The clauses are applied in the following order:

  1. DISTINCT: Each distinct value of expression is aggregated only once into the result.
  2. IGNORE NULLS or RESPECT NULLS: If IGNORE NULLS is specified or if neither is specified, the NULL values are excluded from the result. If RESPECT NULLS is specified, the NULL values are included in the result (an error is raised, however, if an array in the final query result contains a NULL element).

 

Returned Data Types

An ARRAY of the type specified by the expression parameter.

Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

Examples


APPROX_TOP_COUNT

 

Description

Returns the approximate top elements of expression. The number parameter specifies the number of elements returned.

 

Supported Argument Types

expression can be of any data type that the GROUP BY clause supports.

number must be INT64.

 

Returned Data Types

An ARRAY of type STRUCT. The STRUCT contains two fields. The first field (named value) contains an input value. The second field (named count) contains an INT64 specifying the number of times the value was returned.

Returns NULL if there are zero input rows.

Examples

 

NULL Handling

APPROX_TOP_COUNT does not ignore NULLs in the input. For example:


APPROX_TOP_SUM

 

Description

Returns the approximate top elements of expression, based on the sum of an assigned weight. The number parameter specifies the number of elements returned.

If the weight input is negative or NaN, this function returns an error.

 

Supported Argument Types

expression can be of any data type that the GROUP BY clause supports.

weight must be one of the following:

  • INT64
  • FLOAT64
number must be INT64.

 

Returned Data Types

An ARRAY of type STRUCT. The STRUCT contains two fields: value and sum. The value field contains the value of the input expression. The sum field is the same type as weight, and is the approximate sum of the input weight associated with the value field.

Returns NULL if there are zero input rows.

Examples

 

NULL Handling

APPROX_TOP_SUM does not ignore NULL values for the expression and weight parameters.


HyperLogLog++ Functions

Kochava Query supports the following approximate aggregate functions using the HyperLogLog++ algorithm. For an explanation of how approximate aggregate functions work, see Approximate Aggregation.


HLL_COUNT.INT

 

Description

A scalar function that takes one or more input values and aggregates them into a HyperLogLog++ sketch. Each sketch is represented using the BYTES data type. You can then merge sketches using HLL_COUNT.MERGE or HLL_COUNT.MERGE_PARTIAL . If no merging is needed, you can extract the final count of distinct values from the sketch using HLL_COUNT.EXTRACT .

An input can be one of the following:

  • INT64
  • STRING
  • BYTES

This function supports an optional parameter, precision. This parameter defines the accuracy of the estimate at the cost of additional memory required to process the sketches or store them on disk. The following table shows the allowed precision values, the maximum sketch size per group, and confidence interval (CI) of typical precisions:

Precision Max. Sketch Size (MB) 65% CI 95% CI 99% CI
10 1 ±1.63% ±3.25% ±6.50%
11 2 ±1.15% ±2.30% ±4.60%
12 4 ±0.81% ±1.63% ±3.25%
13 8 ±0.57% ±1.15% ±1.72%
14 16 ±0.41% ±0.81% ±1.22%
15(default) 32 ±0.29% ±0.57% ±0.86%
16 64 ±0.20% ±0.41% ±0.61%
17 128 ±0.14% ±0.29% ±0.43%
18 256 ±0.10% ±0.20% ±0.41%
19 512 ±0.07% ±0.14% ±0.29%
20 1024 ±0.05% ±0.10% ±0.20%
21 2048 ±0.04% ±0.07% ±0.14%
22 4096 ±0.03% ±0.05% ±0.10%
23 8192 ±0.02% ±0.04% ±0.07%
24 16384 ±0.01% ±0.03% ±0.05%

If the input is NULL, this function returns NULL.

For more information, see HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm.

 

Supported Input Type

BYTES

 

Return Type

BYTES

Example


HLL_COUNT.MERGE

 

Description

An aggregate function that returns the cardinality of several HyperLogLog++ set sketches by computing their union.

Each sketch must have the same precision and be initialized on the same type. Attempts to merge sketches with different precisions or for different types results in an error. For example, you cannot merge a sketch initialized from INT64 data with one initialized from STRING data.

This function ignores NULL values when merging sketches. If the merge happens over zero rows or only over NULL values, the function returns 0.

 

Supported Input Types

BYTES

 

Return Type

INT64

Example


HLL_COUNT.MERGE_PARTIAL

 

Description

An aggregate function that takes one or more HyperLogLog++ sketch inputs and merges them into a new sketch.

This function returns NULL if there is no input or all inputs are NULL.

 

Supported Input Types

BYTES

 

Return Types

BYTES

Example


HLL_COUNT.EXTRACT

 

Description

A scalar function that extracts an cardinality estimate of a single HyperLogLog++ sketch.

If sketch is NULL, this function returns a cardinality estimate of 0.

 

Supported Input Types

BYTES

 

Return Type

INT64

Example


Numbering Functions

The following sections describe the numbering functions that Kochava Query supports. Numbering functions are a subset of analytic functions. For an explanation of how analytic functions work, see Analytic Function Concepts. For a description of how numbering functions work, see the Numbering Function Concepts.

OVER clause requirements:

  • PARTITION BY : Optional.
  • ORDER BY : Required, except for ROW_NUMBER().
  • window_frame_clause: Disallowed.

RANK

Description

Returns the ordinal (1-based) rank of each row within the ordered partition. All peer rows receive the same rank value. The next row or set of peer rows receives a rank value which increments by the number of peers with the previous rank value, instead of DENSE_RANK, which always increments by 1.

 

Supported Argument Types

INT64


DENSE_RANK

Description

Returns the ordinal (1-based) rank of each row within the window partition. All peer rows receive the same rank value, and the subsequent rank value is incremented by one.

 

Supported Argument Types

INT64


PERCENT_RANK

Description

Return the percentile rank of a row defined as (RK-1)/(NR-1), where RK is the RANK of the row and NR is the number of rows in the partition. Returns 0 if NR=1.

 

Supported Argument Types

FLOAT64


CUME_DIST

Description

Return the relative rank of a row defined as NP/NR. NP is defined to be the number of rows that either precede or are peers with the current row. NR is the number of rows in the partition.

 

Supported Argument Types

FLOAT64


NTILE

 

Description

This function divides the rows into constant_integer_expression buckets based on row ordering and returns the 1-based bucket number that is assigned to each row. The number of rows in the buckets can differ by at most 1. The remainder values (the remainder of number of rows divided by buckets) are distributed one for each bucket, starting with bucket 1. If constant_integer_expression evaluates to NULL, 0 or negative, an error is provided.

 

Supported Argument Types

INT64


ROW_NUMBER

Description

Does not require the ORDER BY clause. Returns the sequential row ordinal (1-based) of each row for each ordered partition. If the ORDER BY clause is unspecified then the result is non-deterministic.

 

Supported Argument Types

INT64


Bit Functions

Kochava Query supports the following bit functions.


BIT_COUNT

 

Description

The input, expression , must be an integer or BYTES.

Returns the number of bits that are set in the input expression. For signed integers, this is the number of bits in two’s complement form.

 

Return Data Type

INT64

Example


Mathematical Functions

All mathematical functions have the following behaviors:

  • They return NULL if any of the input parameters is NULL.
  • They return NaN if any of the arguments is NaN.

ABS

 

Description

Computes absolute value. Returns an error if the argument is an integer and the output value cannot be represented as the same type; this happens only for the largest negative input value, which has no positive representation. Returns +inf for a +/-inf argument.


SIGN

 

Description

Returns -1, 0, or +1 for negative, zero and positive arguments respectively. For floating point arguments, this function does not distinguish between positive and negative zero. Returns NaN for a NaN argument.


IS_INF

 

Description

Returns TRUE if the value is positive or negative infinity. Returns NULL for NULL inputs.


IS_NAN

 

Description

Returns TRUE if the value is a NaN value. Returns NULL for NULL inputs.


IEEE_DIVIDE

 

Description

Divides X by Y; this function never fails. Returns FLOAT64. Unlike the division operator (/), this function does not generate errors for division by zero or overflow.

Special cases:

  • If the result overflows, returns +/-inf.
  • If Y=0 and X=0, returns NaN.
  • If Y=0 and X!=0, returns +/-inf.
  • If X = +/-inf and Y = +/-inf , returns NaN.

The behavior of IEEE_DIVIDE is further illustrated in the table below.

 

Special Cases for IEEE_DIVIDE

The following table lists special cases for IEEE_DIVIDE.

Numerator Data Type (X) Denominator Data Type (Y) Result Value
Anything except 0 0 +/-inf
0 0 NaN
0 NaN NaN
NaN 0 NaN
+/-inf +/-inf NaN

RAND

 

Description

Generates a pseudo-random value of type FLOAT64 in the range of [0, 1), inclusive of 0 and exclusive of 1.


SQRT

SQRT(X)

 

Description

Computes the square root of X. Generates an error if X is less than 0. Returns +inf if X is +inf.


POW

 

Description

Returns the value of X raised to the power of Y. If the result underflows and is not representable, then the function returns a value of zero. Returns an error if one of the following is true:

  • X is a finite value less than 0 and Y is a noninteger
  • X is 0 and Y is a finite value less than 0

The behavior of POW() is further illustrated in the table below.


POWER

 

Description

Synonym of POW().

 

Special cases for POW(X, Y) and POWER(X, Y)

The following are special cases for POW(X, Y) and POWER(X, Y).

X Y POW(X, Y) or POWER(X, Y)
1.0 Any value including NaN 1.0
any including NaN 0 1.0
-1.0 +/-inf 1.0
ABS(X) < 1 -inf +inf
ABS(X) < 1 -inf 0
ABS(X) < 1 +inf 0
ABS(X) < 1 +inf +inf
-inf Y < 0 0
-inf Y < 0 -inf if Y is an odd integer, +inf otherwise
+inf Y < 0 0
+inf Y < 0 +inf

EXP

 

Description

Computes e to the power of X, also called the natural exponential function. If the result underflows, this function returns a zero. Generates an error if the result overflows. If X is +/-inf, then this function returns +inf or 0.


LN

 

Description

Computes the natural logarithm of X. Generates an error if X is less than or equal to zero. If X is +inf, then this function returns +inf.


LOG

 

Description

If only X is present, LOG is a synonym of LN. If Y is also present, LOG computes the logarithm of X to base Y. Generates an error in these cases:

  • X is less than or equal to zero
  • Y is 1.0
  • Y is less than or equal to zero.

The behavior of LOG(X, Y) is further illustrated in the table below.

 

Special Cases for LOG(X, Y)

X Y LOG(X, Y)
-inf Any Value NaN
Any Value +inf NaN
+inf 0.0 Y < 1.0 -inf
+inf Y > 1.0 +inf

LOG10

 

Description

Similar to LOG, but computes logarithm to base 10.


GREATEST

 

Description

Returns NULL if any of the inputs is NULL . Otherwise, returns NaN if any of the inputs is NaN . Otherwise, returns the largest value among X1,…,XN according to the < comparison.


LEAST

 

Description

Returns NULL if any of the inputs is NULL . Returns NaN if any of the inputs is NaN . Otherwise, returns the smallest value among X1,…,XN according to the > comparison.


DIV

 

Description

Returns the result of integer division of X by Y. Division by zero returns an error. Division by -1 may overflow. See the table below for possible result types.


SAFE_DIVIDE

 

Description

Equivalent to the division operator (/), but returns NULL if an error occurs, such as a division by zero error.


MOD

 

Description

Modulo function: returns the remainder of the division of X by Y. Returned value has the same sign as X. An error is generated if Y is 0. See the table below for possible result types.


ROUND

 

Description

If only X is present, ROUND rounds X to the nearest integer. If N is present, ROUND rounds X to N decimal places after the decimal point. If N is negative, ROUND will round off digits to the left of the decimal point. Rounds halfway cases away from zero. Generates an error if overflow occurs.


TRUNC

 

Description

If only X is present, TRUNC rounds X to the nearest integer whose absolute value is not greater than the absolute value of X. If N is also present, TRUNC behaves like ROUND(X, N) , but always rounds towards zero and never overflows.


CEIL

 

Description

Returns the smallest integral value (with FLOAT64 type) that is not less than X.


CEILING

 

Description

Synonym of CEIL(X)


FLOOR

 

Description

Returns the largest integral value (with FLOAT64 type) that is not greater than X.

 

Example Rounding Function Behavior

Example behavior of Kochava Query rounding functions:

Input “X” ROUND(X) TRUNC(X) CEIL(X) FLOOR(X)
2.0 2.0 2.0 2.0 2.0
2.3 2.0 2.0 3.0 2.0
2.8 3.0 2.0 3.0 2.0
2.5 3.0 2.0 3.0 2.0
-2.3 -2.0 -2.0 -2.0 -3.0
-2.8 -3.0 -2.0 -2.0 -3.0
-2.5 -3.0 -2.0 -2.0 -3.0
0 0 0 0 0
+/-inf +/-inf +/-inf +/-inf +/-inf
NaN NaN NaN NaN NaN

COS

 

Description

Computes cosine of X. Never fails.


COSH

 

Description

Computes the hyperbolic cosine of X. Generates an error if an overflow occurs.


ACOS

 

Description

Computes the principal value of the arc cosine of X. The return value is in the range [0,]. Generates an error if X is a finite value outside of range [-1, 1].


ACOSH

 

Description

Computes the inverse hyperbolic cosine of X. Generates an error if X is a finite value less than 1.


SIN

 

Description

Computes the sine of X. Never fails.


SINH

 

Description

Computes the hyperbolic sine of X. Generates an error if an overflow occurs.


ASIN

 

Description

Computes the principal value of the arc sine of X. The return value is in the range [-π/2,π/2]. Generates an error if X is a finite value outside of range [-1, 1].


ASINH

 

Description

Computes the inverse hyperbolic sine of X. Does not fail.


TAN

 

Description

Computes tangent of X. Generates an error if an overflow occurs.


TANH

 

Description

Computes hyperbolic tangent of X. Does not fail.


ATAN

 

Description

Computes the principal value of the arc tangent of X. The return value is in the range [-π/2,π/2]. Does not fail.


ATANH

 

Description

Computes the inverse hyperbolic tangent of X. Generates an error if the absolute value of X is greater or equal 1.


ATAN2

 

Description

Calculates the principal value of the arc tangent of Y/X using the signs of the two arguments to determine the quadrant. The return value is in the range [-π,π]. The behavior of this function is further illustrated in the table below.

 

Special cases for ATAN2()

Y X ATAN2(Y, X)
NaN Any Value NaN
Any Value NaN NaN
0 0 0, π or -π depending on the sign of X and Y
Finite value -inf π or -π depending on the sign of Y
Finite value +inf 0
+/-inf Finite value π/2 or π/2 depending on the sign of Y
+/-inf -inf ¾π or -¾π depending on the sign of Y
+/-inf +inf π/4 or -π/4 depending on the sign of Y

 

Special Cases for Trigonometric and Hyperbolic Rounding Functions

X COS(X) COSH(X) ACOS(X) ACOSH(X) SIN(X) SINH(X) ASIN(X) ASINH(X) TAN(X) TANH(X) ATAN(X) ATANH(X)
+/-inf NaN =+inf NaN =+inf NaN =+inf NaN =+inf NaN =+1.0 π/2 NaN
-inf NaN =+inf NaN NaN NaN -inf NaN -inf NaN -1.0 -π/2 NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

Navigation Functions

The following sections describe the navigation functions that Kochava Query supports. Navigation functions are a subset of analytic functions. For an explanation of how analytic functions work, see Analytic Function Concepts. For an explanation of how navigation functions work, see Navigation Function Concepts.


FIRST_VALUE