Error messages and remedies for Kubernetes

This page describes common error messages and their remedies when running the Keycloak Benchmark suite on Kubernetes.

Keycloak message ERROR: Failed to obtain JDBC connection

Context

This error message can appear when running Keycloak with a relational database like PostgreSQL or CockroachDB.

Other similar error messages:

  • “Unable to acquire JDBC Connection”

  • “Sorry, acquisition timeout!”

This might happen during startup, in which case the startup fails. It might also happen during a load test when Keycloak creates new database connections.

Cause

The database is either not started, or the number of database connections is exhausted in the current setup.

Remedy
  • Ensure that the database is running.

  • Ensure that the database didn’t restart, for example, due to an out-of-memory problem.

  • Ensure that the total number of DB connections doesn’t exceed the maximum number of connections the database allows. See the Keycloak deployment configuration options KC_DB_* for details.

  • Ensure that Keycloak doesn’t try to use more connections than the configured maximum number of connections; see the configuration sketch after this list.
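
As an illustration of the last two points, here is a minimal sketch of a Keycloak CR for the Keycloak Operator that caps the connection pool per Pod. The option names db-pool-max-size and db-pool-min-size are standard Keycloak options; the concrete numbers (2 instances, 40 connections each, against a database limit of 100) are assumptions that need to be adapted to the actual setup, and other required CR fields are omitted.

    # Sketch only: with 2 instances and 40 connections each, the total of 80
    # stays below a database configured with max_connections=100.
    apiVersion: k8s.keycloak.org/v2alpha1
    kind: Keycloak
    metadata:
      name: keycloak
    spec:
      instances: 2
      additionalOptions:
        - name: db-pool-max-size   # maps to KC_DB_POOL_MAX_SIZE / --db-pool-max-size
          value: "40"
        - name: db-pool-min-size
          value: "10"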

Caution
  • Under high load, the number of database connections is usually the limiting constraint of the system. Having Keycloak run into a “Sorry, acquisition timeout” and return an HTTP 5xx code to the caller is a sensible load-shedding mechanism. See Keycloak under load for details.

Keycloak message RETRY_SERIALIZABLE

Full message

org.postgresql.util.PSQLException, ERROR: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict: intent on key /Table/137/…​ See: https://www.cockroachlabs.com/docs/v22.1/transaction-retry-error-reference.html#retry_serializable

Context

This error message can appear when running Keycloak with CockroachDB, both as a single node and as a multi-node cluster deployed with the operator.

This might happen during the load test while Keycloak processes requests.

Cause

Some transactions cannot be serialized because data has been modified in parallel transactions.

Effect

The database rolls back the transaction and asks the caller to repeat the request. Some users see error messages.

Remedy
  • Analyze the request/URL where this happens by looking at the log, and discuss this with engineers.

  • Use the following SQL to determine the table that is causing the problems, which would be table ID 137 in the example given above:

    SELECT DISTINCT ti.descriptor_name as table_name, us.table_id
      FROM  crdb_internal.index_usage_statistics us, crdb_internal.table_indexes ti
      WHERE us.table_id = ti.descriptor_id ORDER BY us.table_id ASC;
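
For example, assuming the table ID 137 from the message above, the lookup could be narrowed down to that single ID. This is a sketch based on the same crdb_internal tables as the query above:

    -- Sketch: resolve only the table ID taken from the error message (137 here).
    SELECT DISTINCT ti.descriptor_name AS table_name
      FROM crdb_internal.table_indexes ti
      WHERE ti.descriptor_id = 137;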

Keycloak message prepared transactions are disabled

Full message

org.postgresql.util.PSQLException: ERROR: prepared transactions are disabled. Hint: Set max_prepared_transactions to a nonzero value.

Context

This happens when the transaction manager of Quarkus handles more than one transaction in a request, and one of the transactions is handled by a PostgreSQL database.

Cause

Once the transaction manager of Quarkus bundles two transactions, it sends a PREPARE TRANSACTION command to the database and sends the commit only after all transactions have been prepared. For this to work, the database needs to have the max_prepared_transactions parameter set to a nonzero value.
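
On the SQL level this corresponds to PostgreSQL’s two-phase commit commands. The following hand-written sketch shows the sequence the transaction manager issues; the transaction identifier 'tx1' is an arbitrary example, the real identifiers are generated by the transaction manager.

    -- Sketch of the two-phase commit sequence driven by the transaction manager.
    BEGIN;
    -- ... SQL statements of one branch of the JTA transaction ...
    PREPARE TRANSACTION 'tx1';   -- fails with the error above when max_prepared_transactions is 0
    -- after all participating transactions have been prepared:
    COMMIT PREPARED 'tx1';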

Effect

No transactions against the PostgreSQL database complete.

Remedy

Pass the parameter -c max_prepared_transactions=xxx to the database. For the containerized database in the Kubernetes setup, this has been configured in postgres-deployment.yaml.
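
For a self-managed setup, the flag would typically be added to the container arguments of the PostgreSQL Deployment. The following is a sketch of the relevant fragment; the value 100 and the image tag are examples and the rest of the Deployment is omitted.

    # Sketch of a fragment of postgres-deployment.yaml; the value 100 must cover
    # all transactions that may be prepared concurrently.
    spec:
      template:
        spec:
          containers:
            - name: postgres
              image: postgres:15   # image tag is an example
              args:
                - -c
                - max_prepared_transactions=100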

Keycloak message ARJUNA012225: cannot access root of object store

Full message

ARJUNA012225: FileSystemStore::setupStore - cannot access root of object store: ObjectStore/ShadowNoFileLockStore/defaultStore/

Context

This happens when the transaction manager of Quarkus handles more than one transaction in a request and attempts to persist the state of the transaction locally.

Cause

The working directory of Keycloak is not writable inside the Keycloak container, so writing the state fails.

Effect

No transactions against any store participating in the JTA transaction complete; therefore, Keycloak will not start.

Remedy

Pass in a writable folder via the environment variable QUARKUS_TRANSACTION_MANAGER_OBJECT_STORE_DIRECTORY. This has been fixed upstream for Keycloak 21.1 in keycloak#19384.
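
For older versions, a sketch of such a workaround on Kubernetes could look as follows; the folder /tmp/tx-object-store, the emptyDir volume, and the volume name are assumptions, any writable folder works.

    # Sketch: point the transaction manager's object store to a writable emptyDir
    # volume; folder and volume names are examples.
    spec:
      template:
        spec:
          containers:
            - name: keycloak
              env:
                - name: QUARKUS_TRANSACTION_MANAGER_OBJECT_STORE_DIRECTORY
                  value: /tmp/tx-object-store
              volumeMounts:
                - name: tx-object-store
                  mountPath: /tmp/tx-object-store
          volumes:
            - name: tx-object-store
              emptyDir: {}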

Keycloak message org.jgroups.util.ThreadPool: thread pool is full

Full message

org.jgroups.util.ThreadPool: thread pool is full (max=xx, active=xx); thread dump (dumped once, until thread_dump is reset)

Context

This happens when the thread pool in JGroups runs out of threads.

Cause

Each Keycloak Pod has executor threads that execute the RESTEasy requests. While those can read data from the local caches directly (either the primary or the backup entries), they need to call out to remote caches via JGroups. While some of these requests might get bundled, each Keycloak request that calls out remotely might need to acquire one JGroups thread.

Effect

When JGroups runs out of threads in its thread pool, remote calls that would need to be enqueued are rejected and discarded. This leads to longer processing times and eventually to timeouts in Infinispan, where JGroups is the underlying transport.

Remedy

The number of executor threads in all Keycloak Pods in a cluster combined should be equal to the maximum size of the JGroups thread pool.

Due to the bug ISPN-14780, before Keycloak 22.0.2 and 23.0.0 there was no known way to configure a size different from the JGroups default of 100 max threads. Assuming a Keycloak cluster with 4 Pods, each Pod therefore shouldn’t have more than 25 worker threads. Use the Quarkus configuration option quarkus.thread-pool.max-threads to configure the maximum number of worker threads, as in the sketch below.
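
The following sketch shows such a configuration for a 4-Pod cluster using the standard environment-variable form of the Quarkus option; the value 25 follows the 100/4 calculation above, and depending on the Keycloak version the option may need to be set in conf/quarkus.properties instead of an environment variable.

    # Sketch for a 4-Pod cluster: 4 x 25 = 100 worker threads in total, matching
    # the default maximum of the JGroups thread pool.
    spec:
      template:
        spec:
          containers:
            - name: keycloak
              env:
                - name: QUARKUS_THREAD_POOL_MAX_THREADS   # quarkus.thread-pool.max-threads
                  value: "25"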