jOOQ tip: don’t convert JSONB to a String

jOOQ tip: don’t convert JSONB to a String

A few weeks ago, while investigating possible performance improvements for Kestra‘s JDBC backend, I noticed that a method we were using to map an entity to be persisted in the database into its JSONB representation was taking up a lot of time in our CPU profiles.

In the following flame graph, we can see that the JdbcQueue.map() method accounts for more than 21% of the samples and the Repository.map() method for 3.2% of the samples of a CPU profile obtained using async-profiler.

These two methods map a JSONB column into the entity’s target type. To do this, we use a Jackson object mapper. The code can be simplified to :

MAPPER.readValue(record.get("value", String.class), MyEntity.class);

Instinctively, I thought I’d better read the record in JSONB rather than in String, so I made the following change:

MAPPER.readValue(record.get("value", JSONB.class).data(), MyEntity.class);

And the results was immediate: a clear improvement in performance!

JdbcQueue.map() drops from over 21% of samples to a mere 4.6% and the Repository.map() method from 3.2% of samples to 2.7%. In my test, the Kestra queue is heavily used, which explains why the impact is much greater on this component.

Pleased with the improvement, I opened a PR which I merged quickly: https: //github.com/kestra-io/kestra/pull/4899/files.

But, after thought, I’d like to know why?

So I open my favorite IDE and browse the code:

  • Record.get(String, Class) is implemented by AbstractRecord.get(String, Class).
  • AbstractRecord.get(String, Class) lead to AbstractRecord.get(int, Class) where a method converterOrFail() is called. Oh!, a conversion can explain the impact on performance!
  • Conversion takes place via a ConvertedProvider which provides a Converter.
  • After a quick search, converters use the utility class Convert for type conversion, we’re getting closer!
  • Convert.from(), in line 716, converts any type to String by calling its toString() method.
// All types can be converted into String
else if (toClass == String.class) {
    if (from instanceof EnumType e)
        return (U) e.getLiteral();
    }
    return (U) from.toString();
}

This leads to the JSONB.toString() method, which uses JSONValue.toJSONString(parsed()) to normalize the JSON representation and enable equality between two JSONs of different structure, but with the same attributes.

This method is clearly documented as one to be avoided in a performance-sensitive context and it’s clearly specified in the JavaDoc.

uses a normalised representation of the JSON content, meaning that two equivalent JSON documents are considered equal. This impacts both behaviour and performance!

So now I know the why!

As a matter of conscience, I’ve looked at our MySQL implementation, which uses the JSON type, which doesn’t perform any normalization and returns the subjasent JSON directly via String.valueOf(data), so it doesn’t suffer from the same performance issue.

That’s the whole story, I hope this dive into the jOOQ code has interested you as much as it has me 😉

Update : after publication of this article on social networks, jOOQ a répondu pointing to two GitHub issues on this subject:

  1. Revert JSONB::toString to produce JSONB::data
  2. Implement faster JSONB normalisation

The second issue is worth reading, as it explains the issue of standardization, suggests avenues for improvement and discusses the limitations of the current JSON parser. A problem that might seem simple at first glance is often much more complicated than it appears.

To take part in the discussion, I’ve opened an issue Switch JSONB record convertion for String to use the data() method which proposes to fix this specific case of conversion without having to wait to fix the more general problem of the performance of the toString() method.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.