SQL Functions
Built-in SQL function reference for Apache Phoenix — aggregate, string, numeric, array, date/time, and general functions.
Functions (Aggregate)
AVG
The average (mean) value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.
Example
AVG(X)COUNT
The count of all row, or of the non-null values. This method returns a long. When DISTINCT is used, it counts only distinct values. If no rows are selected, the result is 0. Aggregates are only allowed in select statements.
Example
COUNT(*)APPROX_COUNT_DISTINCT
The approximate distinct count of all row, or of the non-null values. The relative error of approximation by default is less than 0.00405 This method returns a long. If no rows are selected, the result is 0. Aggregates are only allowed in select statements.
Example
APPROX_COUNT_DISTINCT(*)MAX
The highest value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.
Example
MAX(NAME)MIN
The lowest value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.
Example
MIN(NAME)SUM
The sum of all values. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.
Example
SUM(X)PERCENTILE_CONT
The nth percentile of values in the column. The percentile value can be between 0 and 1 inclusive. Aggregates are only allowed in select statements. The returned value is of decimal data type.
Example
PERCENTILE_CONT( 0.9 ) WITHIN GROUP (ORDER BY X ASC)PERCENTILE_DISC
PERCENTILE_DISC is an inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set. Nulls are ignored in the calculation.
Example
PERCENTILE_DISC( 0.9 ) WITHIN GROUP (ORDER BY X DESC)PERCENT_RANK
The percentile rank for a hypothetical value, if inserted into the column. Aggregates are only allowed in select statements. The returned value is of decimal data type.
Example
PERCENT_RANK( 100 ) WITHIN GROUP (ORDER BY X ASC)FIRST_VALUE
The first value in each distinct group ordered according to the ORDER BY specification.
Example
FIRST_VALUE( name ) WITHIN GROUP (ORDER BY salary DESC)LAST_VALUE
The last value in each distinct group ordered according to the ORDER BY specification.
Example
LAST_VALUE( name ) WITHIN GROUP (ORDER BY salary DESC)FIRST_VALUES
Returns an array of at most the given numeric size of the first values in each distinct group ordered according to the ORDER BY specification.
Example
FIRST_VALUES( name, 3 ) WITHIN GROUP (ORDER BY salary DESC)LAST_VALUES
Returns an array of at most the given numeric size of the last values in each distinct group ordered according to the ORDER BY specification.
Example
LAST_VALUES( name, 3 ) WITHIN GROUP (ORDER BY salary DESC)NTH_VALUE
The nth value in each distinct group ordered according to the ORDER BY specification.
Example
NTH_VALUE( name, 2 ) WITHIN GROUP (ORDER BY salary DESC)STDDEV_POP
The population standard deviation of all values. Aggregates are only allowed in select statements. The returned value is of decimal data type.
Example
STDDEV_POP( X )STDDEV_SAMP
The sample standard deviation of all values. Aggregates are only allowed in select statements. The returned value is of decimal data type.
Example
STDDEV_SAMP( X )Functions (Numeric)
ROUND
Rounds the numeric or timestamp expression to the nearest scale or time unit specified. If the expression is a numeric type, then the second argument is the scale to be used for rounding off the number, defaulting to zero. If the expression is a date/time type, then the second argument may be one of the time units listed to determine the remaining precision of the date/time. A default of MILLISECONDS is used if not present. The multiplier is only applicable for a date/time type and is used to round to a multiple of a time unit (i.e. 10 minute) and defaults to 1 if not specified. This method returns the same type as its first argument.
Example
ROUND(number)
ROUND(number, 2)
ROUND(timestamp)
ROUND(time, 'HOUR')
ROUND(date, 'MINUTE', 30)CEIL
Same as ROUND, except it rounds any fractional value up to the next even multiple.
Example
CEIL(number, 3)
CEIL(2.34)
CEIL(timestamp, 'SECOND', 30)
CEIL(date, 'DAY', 7)FLOOR
Same as ROUND, except it rounds any fractional value down to the previous even multiple.
Example
FLOOR(timestamp)
FLOOR(date, 'DAY', 7)TRUNC
Same as FLOOR
Example
TRUNC(timestamp, 'SECOND', 30)
TRUNC(date, 'DAY', 7)TO_NUMBER
Formats a string or date/time type as a number, optionally accepting a format string. For details on the format, see java.text.DecimalFormat. For date, time, and timeStamp terms, the result is the time in milliseconds since the epoch. This method returns a decimal number.
Example
TO_NUMBER('$123.33', '\u00A4###.##')RAND
Function that produces a random, uniformly distributed double value between 0.0 (inclusive) and 1.0 (exclusive). If a seed is provided, then the the returned value is identical across each invocation for the same row. If a seed is not provided, then the returned value is different for each invocation. The seed must be a constant.
Example
RAND()
RAND(5)Functions (String)
SUBSTR
Returns a substring of a string starting at the one-based position. If zero is used, the position is zero-based. If the start index is negative, then the start index is relative to the end of the string. The length is optional and if not supplied, the rest of the string will be returned.
Example
SUBSTR('[Hello]', 2, 5)
SUBSTR('Hello World', -5)INSTR
Returns the one-based position of the initial occurrence of the second argument in the first argument. If the second argument is not contained in the first argument, then zero is returned.
Example
INSTR('Hello World', 'World')
INSTR('Simon says', 'mon')
INSTR('Peace on earth', 'war')TRIM
Removes leading and trailing spaces from the input string.
Example
TRIM(' Hello ')LTRIM
Removes leading spaces from the input string.
Example
LTRIM(' Hello')RTRIM
Removes trailing spaces from the input string.
Example
RTRIM('Hello ')LPAD
Pads the string expression with the specific pad character (space by default) up to the length argument.
Example
LPAD('John',30)LENGTH
Returns the length of the string in characters.
Example
LENGTH('Hello')REGEXP_SUBSTR
Returns a substring of a string by applying a regular expression start from the offset of a one-based position. Just like with SUBSTR, if the start index is negative, then it is relative to the end of the string. If not specified, the start index defaults to 1.
Example
REGEXP_SUBSTR('na1-appsrv35-sj35', '[^-]+') evaluates to 'na1'REGEXP_REPLACE
Returns a string by applying a regular expression and replacing the matches with the replacement string. If the replacement string is not specified, it defaults to an empty string.
Example
REGEXP_REPLACE('abc123ABC', '[0-9]+', '#') evaluates to 'abc#ABC'REGEXP_SPLIT
Splits a string into a VARCHAR ARRAY using a regular expression. If characters that have a special meaning in regular expressions are to be used as a regular delimiter in the pattern string, they must be escaped with backslashes.
Example
REGEXP_SPLIT('ONE,TWO,THREE', ',') evaluates to ARRAY['ONE', 'TWO', 'THREE']
REGEXP_SPLIT('ONE!#TWO#,!THREE', '[,!#]+') evaluates to ARRAY['ONE', 'TWO', 'THREE']UPPER
Returns upper case string of the string argument. If localeString (available in Phoenix 4.14) is provided, it identifies the locale whose rules are used for the conversion. If localeString is not provided, the default locale is used. The localeString must be of the form returned by the Java 6 implementation of java.util.Locale.toString() e.g. 'zh_TW_STROKE' or 'en_US' or 'fr_FR'.
Example
UPPER('Hello')
UPPER('Hello', 'tr_TR')LOWER
Returns lower case string of the string argument. If localeString (available in Phoenix 4.14) is provided, it identifies the locale whose rules are used for the conversion. If localeString is not provided, the default locale is used. The localeString must be of the form returned by the Java 6 implementation of java.util.Locale.toString() e.g. 'zh_TW_STROKE' or 'en_US' or 'fr_FR'.
Example
LOWER('HELLO')
LOWER('HELLO', 'en_US')REVERSE
Returns reversed string of the string argument.
Example
REVERSE('Hello')TO_CHAR
Formats a date, time, timestamp, or number as a string. The default date format is "yyyy-MM-dd HH:mm:ss" and the default number format is "#,##0.###". For details, see java.text.SimpleDateFormat for date/time values and java.text.DecimalFormat for numbers. This method returns a string.
Example
TO_CHAR(myDate, '2001-02-03 04:05:06')
TO_CHAR(myDecimal, '#,##0.###')COLLATION_KEY
Calculates a collation key that can be used to sort strings in a natural-language-aware way. The localeString must be of the form returned by the Java 6 implementation of java.util.Locale.toString() e.g. 'zh_TW_STROKE' or 'en_US' or 'fr_FR'. The third, fourth and fifth arguments are optional and determine respectively whether to use a special upper-case collator, the strength value of the collator, and the decomposition value of the collator. (See java.text.Collator to learn about strength and decomposition).
Example
SELECT NAME FROM EMPLOYEE ORDER BY COLLATION_KEY(NAME, 'zh_TW')Functions (Array)
ARRAY_ELEM
Alternative to using array subscript notation to access an array element. Returns the element in the array at the given position. The position is one-based.
Example
ARRAY_ELEM(my_array_col, 5)
ARRAY_ELEM(ARRAY[1,2,3], 1)ARRAY_LENGTH
Returns the current length of the array.
Example
ARRAY_LENGTH(my_array_col)
ARRAY_LENGTH(ARRAY[1,2,3])ARRAY_APPEND
Appends the given element to the end of the array.
Example
ARRAY_APPEND(my_array_col, my_element_col)
ARRAY_APPEND(ARRAY[1,2,3], 4) evaluates to ARRAY[1,2,3,4]ARRAY_PREPEND
Appends the given element to the beginning of the array.
Example
ARRAY_PREPEND(my_element_col, my_array_col)
ARRAY_PREPEND(0, ARRAY[1,2,3]) evaluates to ARRAY[0,1,2,3]ARRAY_CAT
Concatenates the input arrays and returns the result.
Example
ARRAY_CAT(my_array_col1, my_array_col2)
ARRAY_CAT(ARRAY[1,2], ARRAY[3,4]) evaluates to ARRAY[1,2,3,4]ARRAY_FILL
Returns an array initialized with supplied value and length.
Example
ARRAY_FILL(my_element_col, my_length_col)
ARRAY_FILL(1, 3) evaluates to ARRAY[1,1,1]ARRAY_TO_STRING
Concatenates array elements using supplied delimiter and optional null string and returns the resulting string. If the nullString parameter is omitted or NULL, any null elements in the array are simply skipped and not represented in the output string.
Example
ARRAY_TO_STRING(my_array_col, my_delimiter_col, my_null_string_col)
ARRAY_TO_STRING(ARRAY['a','b','c'], ',') evaluates to 'a,b,c'
ARRAY_TO_STRING(ARRAY['a','b',null,'c'], ',') evaluates to 'a,b,c'
ARRAY_TO_STRING(ARRAY['a','b',null,'c'], ',', 'NULL') evaluates to 'a,b,NULL,c'ANY
Used on the right-hand side of a comparison expression to test that any array element satisfies the comparison expression against the left-hand side.
Example
1 = ANY(my_array)
10 > ANY(my_array)ALL
Used on the right-hand side of a comparison expression to test that all array elements satisfy the comparison expression against the left-hand side. of the array.
Example
1 = ALL(my_array)
10 > ALL(my_array)Functions (General)
MD5
Computes the MD5 hash of the argument, returning the result as a BINARY(16).
Example
MD5(my_column)INVERT
Inverts the bits of the argument. The return type will be the same as the argument.
Example
INVERT(my_column)ENCODE
Encodes the expression according to the encoding format provided and returns the resulting string. For 'BASE62', converts the given base 10 number to a base 62 number and returns a string representing the number.
Example
ENCODE(myNumber, 'BASE62')DECODE
Decodes the expression according to the encoding format provided and returns the resulting value as a VARBINARY. For 'HEX', converts the hex string expression to its binary representation, providing a mechanism for inputting binary data through the console.
Example
DECODE('000000008512af277ffffff8', 'HEX')COALESCE
Returns the value of the first argument if not null and the second argument otherwise. Useful to guarantee that a column in an UPSERT SELECT command will evaluate to a non null value.
Example
COALESCE(last_update_date, CURRENT_DATE())GET_BIT
Retrieves the bit at the given index in the given binary value.
Example
GET_BIT(CAST('FFFF' as BINARY), 1)GET_BYTE
Retrieves the byte at the given index in the given binary value.
Example
GET_BYTE(CAST('FFFF' as BINARY), 1)OCTET_LENGTH
Returns the number of bytes in a binary value.
Example
OCTET_LENGTH(NAME)SET_BIT
Replaces the bit at the given index in the binary value with the provided newValue.
Example
SET_BIT(CAST('FFFF' as BINARY), 1, 61)SET_BYTE
Replaces the byte at the given index in the binary value with the provided newValue.
Example
SET_BYTE(CAST('FFFF' as BINARY), 1, 61)Functions (Time and Date)
TO_DATE
Parses a string and returns a date. Note that the returned date is internally represented as the number of milliseconds since the java epoch. The most important format characters are: y year, M month, d day, H hour, m minute, s second. The default format string is "yyyy-MM-dd HH:mm:ss". For details of the format, see java.text.SimpleDateFormat. By default, GMT will be used as the time zone when parsing the date. However, a time zone id can also be supplied. This is a time zone id such as 'GMT+1'. If 'local' is provided as the time zone id, the local time zone will be used for parsing. The configuration setting "phoenix.query.dateFormatTimeZone" can also be set to a time zone id, which will cause the default of GMT to be overridden with the configured time zone id. Please see the Data Type reference guide about how Apache Phoenix presently defines the DATE datatype. Additionally, Phoenix supports the ANSI SQL "date" literal which acts similarly to the single-argument "TO_DATE" function.
Example
TO_DATE('Sat, 3 Feb 2001 03:05:06 GMT', 'EEE, d MMM yyyy HH:mm:ss z')
TO_DATE('1970-01-01', 'yyyy-MM-dd', 'GMT+1')
date '1970-01-01 12:30:00'CURRENT_DATE
Returns the current server-side date, bound at the start of the execution of a query based on the current time on the region server owning the metadata of the table being queried. Please see the Data Type reference guide about how Apache Phoenix presently defines the DATE datatype.
Example
CURRENT_DATE()TO_TIME
Converts the given string into a TIME instance. When a date format is not provided it defaults to "yyyy-MM-dd HH:mm:ss.SSS" or whatever is defined by the configuration property phoenix.query.dateFormat. The configuration setting phoenix.query.dateFormatTimeZone can also be set to a time zone id, which will cause the default of GMT to be overridden with the configured time zone id. Additionally, Phoenix supports the ANSI SQL "time" literal which acts similarly to the single-argument "TO_TIME" function.
Example
TO_TIME('2005-10-01 14:03:22.559')
TO_TIME('1970-01-01', 'yyyy-MM-dd', 'GMT+1')
time '2005-10-01 14:03:22.559'TO_TIMESTAMP
Converts the given string into a TIMESTAMP instance. When a date format is not provided it defaults to "yyyy-MM-dd HH:mm:ss.SSS" or whatever is defined by the configuration property phoenix.query.dateFormat. The configuration setting phoenix.query.dateFormatTimeZone can also be set to a time zone id, which will cause the default of GMT to be overridden with the configured time zone id. Additionally, Phoenix supports the ANSI SQL "timestamp" literal which acts similarly to the single-argument "TO_TIMESTAMP" function.
Example
TO_TIMESTAMP('2005-10-01 14:03:22.559')
TO_TIMESTAMP('1970-01-01', 'yyyy-MM-dd', 'GMT+1')
timestamp '2005-10-01 14:03:22.559'CURRENT_TIME
Same as CURRENT_DATE(), except returns a value of type TIME. In either case, the underlying representation is the epoch time as a long value. Please see the Data Type reference guide about how Apache Phoenix presently defines the TIME datatype.
Example
CURRENT_TIME()CONVERT_TZ
Converts date/time from one time zone to another returning the shifted date/time value.
Example
CONVERT_TZ(myDate, 'UTC', 'Europe/Prague')TIMEZONE_OFFSET
Returns offset (shift in minutes) of a time zone at particular date/time in minutes.
Example
TIMEZONE_OFFSET('Indian/Cocos', myDate)NOW
Returns the current date, bound at the start of the execution of a query based on the current time on the region server owning the metadata of the table being queried.
Example
NOW()YEAR
Returns the year of the specified date.
Example
YEAR(TO_DATE('2015-6-05'))MONTH
Returns the month of the specified date.
Example
MONTH(TO_TIMESTAMP('2015-6-05'))WEEK
Returns the week of the specified date.
Example
WEEK(TO_TIME('2010-6-15'))DAYOFYEAR
Returns the day of the year of the specified date.
Example
DAYOFYEAR(TO_DATE('2004-01-18 10:00:10'))DAYOFMONTH
Returns the day of the month of the specified date.
Example
DAYOFMONTH(TO_DATE('2004-01-18 10:00:10'))DAYOFWEEK
Returns the day of the week of the specified date.
Example
DAYOFWEEK(TO_DATE('2004-01-18 10:00:10'))HOUR
Returns the hour of the specified date.
Example
HOUR(TO_TIMESTAMP('2015-6-05'))MINUTE
Returns the minute of the specified date.
Example
MINUTE(TO_TIME('2015-6-05'))SECOND
Returns the second of the specified date.
Example
SECOND(TO_DATE('2015-6-05'))Functions (Math)
SIGN
Returns the signum function of the given numeric expression as an INTEGER. The return value is -1 if the given numeric expression is negative; 0 if the given numeric expression is zero; and 1 if the given numeric expression is positive.
Example
SIGN(number)
SIGN(1.1)
SIGN(-1)ABS
Returns the absolute value of the given numeric expression maintaining the same type.
Example
ABS(number)
ABS(1.1)
ABS(-1)SQRT
Returns the correctly rounded square root of the given non-negative numeric expression as a DOUBLE.
Example
SQRT(number)
SQRT(1.1)CBRT
Returns the cube root of the given numeric expression as a DOUBLE.
Example
CBRT(number)
CBRT(1.1)
CBRT(-1)EXP
Returns Euler's number e raised to the power of the given numeric value as a DOUBLE.
Example
EXP(number)
EXP(1.1)
EXP(-1)POWER
Returns the value of the first argument raised to the power of the second argument as a DOUBLE.
Example
POWER(number, number)
POWER(3, 2)
POWER(2, 3)LN
Returns the natural logarithm (base e) of the given positive expression as a DOUBLE.
Example
LN(number)
LN(3)
LN(2)LOG
Returns the logarithm of the first argument computed at the base of the second argument as a DOUBLE. If omitted, a base of 10 will be used for the second argument.
Example
LOG(3, 2)
LOG(2, 3)
LOG(2)