Apache Phoenix

Configuration

Phoenix configuration properties and tuning defaults.

Phoenix provides many knobs and dials for configuring and tuning the system to run optimally on your cluster. Configuration is done through a set of Phoenix-specific properties specified in both the client-side and server-side hbase-site.xml files. In addition to these properties, all of the standard HBase configuration properties apply as well; the most important ones are documented here.
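As an illustrative sketch, a Phoenix property is set the same way as any HBase property, via a property element in hbase-site.xml. The property name below is taken from the table that follows; the value 1800000 is just an example, not a recommended setting:

```xml
<!-- hbase-site.xml (client side): raise the client query timeout to 30 minutes.
     Example value only; see phoenix.query.timeoutMs in the table below. -->
<configuration>
  <property>
    <name>phoenix.query.timeoutMs</name>
    <value>1800000</value>
  </property>
</configuration>
```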

The following outlines the full set of Phoenix-specific configuration properties and their defaults.

data.tx.snapshot.dir
    Server-side property specifying the HDFS directory used to store snapshots of the transaction state. Default: none.
data.tx.timeout
    Server-side property specifying the timeout, in seconds, for a transaction to complete. Default: 30.
phoenix.query.timeoutMs
    Client-side property specifying the number of milliseconds after which a query times out on the client. Default: 600000 (10 min).
phoenix.query.keepAliveMs
    Maximum time, in milliseconds, that excess idle threads in the client-side thread pool executor will wait for new tasks before terminating, when the number of threads exceeds the core pool size. Default: 60000 (60 sec).
phoenix.query.threadPoolSize
    Number of threads in the client-side thread pool executor. As the number of machines/cores in the cluster grows, this value should be increased. Default: 128.
phoenix.query.queueSize
    Maximum queue depth of the bounded round-robin queue backing the client-side thread pool executor, beyond which attempts to queue additional work are rejected. If zero, a SynchronousQueue is used instead of the bounded queue. Default: 5000.
phoenix.stats.guidepost.width
    Server-side parameter specifying the number of bytes between guideposts. A smaller value increases parallelization but also increases the number of chunks that must be merged on the client side. Default: 104857600 (100 MB).
phoenix.stats.guidepost.per.region
    Server-side parameter specifying the number of guideposts per region. If set to a value greater than zero, the guidepost width is computed as the table's MAX_FILE_SIZE divided by phoenix.stats.guidepost.per.region; otherwise the phoenix.stats.guidepost.width parameter is used. Default: none.
phoenix.stats.updateFrequency
    Server-side parameter determining the frequency, in milliseconds, at which statistics are refreshed from the statistics table and subsequently used by the client. Default: 900000 (15 min).
phoenix.stats.minUpdateFrequency
    Client-side parameter determining the minimum amount of time, in milliseconds, that must pass before statistics may again be manually collected through another UPDATE STATISTICS call. Default: 450000 (phoenix.stats.updateFrequency / 2).
phoenix.stats.useCurrentTime
    Server-side parameter that, if true, causes the current server-side time to be used as the timestamp of rows in the statistics table when background tasks such as compactions or splits occur. If false, the maximum timestamp found while traversing the table over which statistics are being collected is used instead. Unless your client controls the timestamps while reading and writing data, this parameter should be left alone. Default: true.
phoenix.query.spoolThresholdBytes
    Threshold size, in bytes, beyond which the results of a parallelly executed query are spooled to disk. Default: 20971520 (20 MB).
phoenix.query.maxSpoolToDiskBytes
    Maximum size, in bytes, of query results that may be spooled to disk; beyond this size the query fails. Default: 1024000000 (~1 GB).
phoenix.query.maxGlobalMemoryPercentage
    Percentage of total heap memory (i.e. Runtime.getRuntime().maxMemory()) that all threads may use. Only coarse-grained memory usage is tracked, mainly accounting for memory used by the intermediate map built during GROUP BY aggregation. When this limit is reached, clients block while attempting to get more memory, essentially throttling memory usage. Default: 15.
phoenix.query.maxGlobalMemorySize
    Maximum size, in bytes, of total tracked memory usage. Not specified by default; if present, the lower of this parameter and phoenix.query.maxGlobalMemoryPercentage is used. Default: none.
phoenix.query.maxGlobalMemoryWaitMs
    Maximum amount of time a client will block while waiting for more memory to become available, after which an InsufficientMemoryException is thrown. Default: 10000 (10 sec).
phoenix.query.maxTenantMemoryPercentage
    Maximum percentage of phoenix.query.maxGlobalMemoryPercentage that any one tenant may consume, beyond which an InsufficientMemoryException is thrown. Default: 100.
phoenix.query.dateFormat
    Default pattern used for conversion of a date to/from a string, whether through the TO_CHAR(<date>) or TO_DATE(<date-string>) functions, or through resultSet.getString(<date-column>). Default: yyyy-MM-dd HH:mm:ss.SSS.
phoenix.query.dateFormatTimeZone
    Time zone id specifying the default time zone in which date, time, and timestamp literals are interpreted when parsing string literals or using the TO_DATE function. A time zone id can be an abbreviation such as "PST", a full name such as "America/Los_Angeles", or a custom offset such as "GMT-9:00". The id "LOCAL" interprets all date, time, and timestamp literals as being in the client's current time zone. Default: GMT.
phoenix.query.timeFormat
    Default pattern used for conversion of a TIME to/from a string, whether through the TO_CHAR(<time>) or TO_TIME(<time-string>) functions, or through resultSet.getString(<time-column>). Default: yyyy-MM-dd HH:mm:ss.SSS.
phoenix.query.timestampFormat
    Default pattern used for conversion of a TIMESTAMP to/from a string, whether through the TO_CHAR(<timestamp>) or TO_TIMESTAMP(<timestamp-string>) functions, or through resultSet.getString(<timestamp-column>). Default: yyyy-MM-dd HH:mm:ss.SSS.
phoenix.query.numberFormat
    Default pattern used for conversion of a decimal number to/from a string, whether through the TO_CHAR(<decimal-number>) or TO_NUMBER(<decimal-string>) functions, or through resultSet.getString(<decimal-column>). Default: #,##0.###.
phoenix.mutate.maxSize
    Maximum number of rows that may be batched on the client before a commit or rollback must be called. Default: 500000.
phoenix.mutate.batchSize
    Number of rows that are batched together and automatically committed during the execution of an UPSERT SELECT or DELETE statement. This property may be overridden at connection time by specifying the UpsertBatchSize connection property. Note that the connection property value does not affect the batch size used by the coprocessor when these statements are executed entirely on the server side. Default: 1000.
phoenix.query.maxServerCacheBytes
    Maximum size, in bytes, of a single sub-query result (usually the filtered result of a table) before compression and conversion to a hash map. Attempting to hash an intermediate sub-query result larger than this results in a MaxServerCacheSizeExceededException. Default: 104857600 (100 MB).
phoenix.coprocessor.maxServerCacheTimeToLiveMs
    Maximum lifetime, in milliseconds, of server cache entries; an entry expires after this amount of time has passed since its last access. Consider increasing this parameter when a server-side IOException("Could not find hash cache for joinId") occurs; warnings like "Earlier hash cache(s) might have expired on servers" may also indicate that this value should be increased. Default: 30000.
phoenix.query.useIndexes
    Client-side property determining whether indexes are considered by the optimizer to satisfy a query. Default: true.
phoenix.index.failure.handling.rebuild
    Server-side property determining whether a mutable index is rebuilt in the background in the event of a commit failure. Only applicable to indexes on mutable, non-transactional tables. Default: true.
phoenix.index.failure.block.write
    Server-side property determining whether writes to the data table are disallowed in the event of a commit failure until the index has caught up with the data table. Requires phoenix.index.failure.handling.rebuild to be true as well. Only applicable to indexes on mutable, non-transactional tables. Default: false.
phoenix.index.failure.handling.rebuild.interval
    Server-side property controlling the frequency, in milliseconds, at which the server checks whether a mutable index needs to be partially rebuilt to catch up with updates to the data table. Only applicable to indexes on mutable, non-transactional tables. Default: 10000 (10 sec).
phoenix.index.failure.handling.rebuild.overlap.time
    Server-side property controlling how many milliseconds before the timestamp at which the failure occurred a partial rebuild starts. Only applicable to indexes on mutable, non-transactional tables. Default: 1.
phoenix.index.mutableBatchSizeThreshold
    Number of mutations in a batch beyond which index metadata is sent as a separate RPC to each region server rather than inline with each mutation. Default: 5.
phoenix.schema.dropMetaData
    Determines whether the underlying HBase table is dropped when the Phoenix table is dropped. Default: true.
phoenix.groupby.spillable
    Determines whether a GROUP BY over a large number of distinct values is allowed to spill to disk on the region server. If false, an InsufficientMemoryException is thrown instead. Default: true.
phoenix.groupby.spillFiles
    Number of memory-mapped spill files to use when spilling GROUP BY distinct values to disk. Default: 2.
phoenix.groupby.maxCacheSize
    Size, in bytes, of pages cached during GROUP BY spilling. Default: 102400000 (~100 MB).
phoenix.groupby.estimatedDistinctValues
    Estimated number of distinct values when a GROUP BY is performed. Used for initial sizing, with 1.5x growth each time reallocation is required. Default: 1000.
phoenix.distinct.value.compress.threshold
    Size, in bytes, beyond which aggregate operations that track distinct value counts (such as COUNT DISTINCT) use Snappy compression. Default: 1024000 (~1 MB).
phoenix.index.maxDataFileSizePerc
    Percentage used to determine the MAX_FILESIZE of the shared index table for views, relative to the data table's MAX_FILESIZE. The percentage should be estimated based on the anticipated average size of a view index row versus a data row. Default: 50.
phoenix.coprocessor.maxMetaDataCacheTimeToLiveMs
    Time, in milliseconds, after which the server-side metadata cache for a tenant expires if not accessed. Default: 180000 (3 min).
phoenix.coprocessor.maxMetaDataCacheSize
    Maximum size, in bytes, of the total server-side metadata cache, after which evictions begin based on least recent access time. Default: 20480000 (~20 MB).
phoenix.client.maxMetaDataCacheSize
    Maximum size, in bytes, of the total client-side metadata cache, after which evictions begin based on least recent access time. Default: 10240000 (~10 MB).
phoenix.sequence.cacheSize
    Number of sequence values to reserve from the server and cache on the client when the next sequence value is allocated. Only used if not defined by the sequence itself. Default: 100.
phoenix.clock.skew.interval
    Delay interval, in milliseconds, when opening SYSTEM.CATALOG, to compensate for possible clock skew when SYSTEM.CATALOG moves among region servers. Default: 2000.
phoenix.index.failure.handling.rebuild
    Boolean flag that turns on/off automatic rebuilding of a failed index when some updates failed to be written to the index. Default: true.
phoenix.index.failure.handling.rebuild.interval
    Time interval, in milliseconds, at which the index rebuild background job checks whether there is an index to be rebuilt. Default: 10000.
phoenix.index.failure.handling.rebuild.overlap.time
    The index rebuild job rebuilds an index starting from the point of failure minus this time interval, in milliseconds, creating an overlap that prevents missed updates when clock skew exists. Default: 300000.
phoenix.query.force.rowkeyorder
    Whether a non-aggregate query returns rows in row key order for salted tables. For versions prior to 4.4, use phoenix.query.rowKeyOrderSaltedTable instead. Default: true.
phoenix.connection.autoCommit
    Whether a new connection has auto-commit enabled when it is created. Default: false.
phoenix.table.default.store.nulls
    Default value of the STORE_NULLS flag used at table creation, which determines whether null values are explicitly stored in HBase. Client-side parameter. Available starting from Phoenix 4.3. Default: false.
phoenix.table.istransactional.default
    Default value of the TRANSACTIONAL flag used at table creation, which determines whether a table is transactional. Client-side parameter. Available starting from Phoenix 4.7. Default: false.
phoenix.transactions.enabled
    Determines whether transactions are enabled in Phoenix. A table may not be declared transactional if transactions are disabled. Client-side parameter. Available starting from Phoenix 4.7. Default: false.
phoenix.mapreduce.split.by.stats
    Determines whether the splits derived from statistics are used as MapReduce input splits. Set to false to restore the behavior of previous versions. Server-side parameter. Available starting from Phoenix 4.10. Default: true.
phoenix.log.level
    Client-side property enabling query logging (SELECT statements only). Logs are written to the SYSTEM.LOG table, which requires the user to have write access on SYSTEM.LOG. Possible values: OFF (no logging), INFO (enables query logging), DEBUG (more query details, such as the explain plan and HBase scan details), TRACE (also logs query bind parameters). Available starting from Phoenix 4.14. WARNING: enabling this feature may leak sensitive information to anyone who can read the SYSTEM.LOG table. Default: OFF.
phoenix.log.sample.rate
    Client-side property controlling the probability that a query is logged to the query log. Set to a value between 0.0 (no queries) and 1.0 (all queries). Available starting from Phoenix 4.14. Default: 1.0.
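The interaction between phoenix.stats.guidepost.per.region and phoenix.stats.guidepost.width can be sketched as follows. This is an illustrative reimplementation of the selection rule described above, not Phoenix's actual code:

```python
DEFAULT_GUIDEPOST_WIDTH = 104857600  # phoenix.stats.guidepost.width default (100 MB)

def effective_guidepost_width(max_file_size, guideposts_per_region=None,
                              guidepost_width=DEFAULT_GUIDEPOST_WIDTH):
    """If guideposts-per-region is set and greater than zero, the width is
    the table's MAX_FILE_SIZE divided by that count; otherwise the
    configured guidepost width is used."""
    if guideposts_per_region is not None and guideposts_per_region > 0:
        return max_file_size // guideposts_per_region
    return guidepost_width

# A 10 GB MAX_FILE_SIZE with 20 guideposts per region gives 512 MB guideposts.
print(effective_guidepost_width(10 * 1024**3, guideposts_per_region=20))  # 536870912
print(effective_guidepost_width(10 * 1024**3))                            # 104857600
```

Smaller guideposts mean more, finer-grained parallel scans per query, at the cost of more client-side merging.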
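Similarly, the way the global and per-tenant memory limits combine (phoenix.query.maxGlobalMemoryPercentage, phoenix.query.maxGlobalMemorySize, and phoenix.query.maxTenantMemoryPercentage) can be sketched as below. The helper is hypothetical, written only to illustrate how the three properties relate:

```python
def tracked_memory_limits(heap_bytes, global_pct=15, global_size=None, tenant_pct=100):
    """Illustrative combination of the memory-tracking properties:
    - the global limit is global_pct% of the heap, further capped by
      phoenix.query.maxGlobalMemorySize when that is set (the lower wins);
    - each tenant may use at most tenant_pct% of the global limit."""
    global_limit = heap_bytes * global_pct // 100
    if global_size is not None:
        global_limit = min(global_limit, global_size)
    tenant_limit = global_limit * tenant_pct // 100
    return global_limit, tenant_limit

# With a 4 GB heap and the defaults, roughly 614 MB of usage is tracked,
# and a single tenant may consume all of it.
global_limit, tenant_limit = tracked_memory_limits(4 * 1024**3)
```

When the tracked usage exceeds the tenant or global limit and no memory frees up within phoenix.query.maxGlobalMemoryWaitMs, an InsufficientMemoryException is thrown.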