On performance of JDBC drivers.
Back when the first version of the MariaDB Java Client was released, someone asked in the comments about the performance characteristics of the driver compared to ConnectorJ. I answered with hand-waving, saying that nobody does anything stupid, the performance of the drivers would be roughly the same, but I promised to measure it and tell the world one day. And now that day has come. The day where three MySQL JDBC drivers (ConnectorJ, MariaDB JDBC, and Drizzle JDBC) are compared against each other. Unlike the server, which gets benchmarking attention all the time, there is no standard benchmark for connectors, so I needed to improvise, while trying to keep the overhead of the server minimal. So I did something very primitive to start. I used my two favorite queries:
DO 1
— this one does not retrieve a result set, and thus can be seen as a small “update”.SELECT 1
— the minimal SELECT query.
The test program runs a query N times, and if the query was a select, it retrieves all values from the result set, using ResultSet.getObject(i)
, and calculates the queries-per-second value. (The best thing is that the test program is single-threaded, and how often does one get to run single-threaded tests? 🙂 the test was run on my own workstation, which runs Windows Server 2008 R2, and I have useConfigs=maxPerformance
in the URL for ConnectorJ.
Results (Queries per second, unprepared)
ConnectorJ-5.1.24 | MariaDB-JDBC-1.1.2 | Drizzle-JDBC-1.3-SNAPSHOT | |
---|---|---|---|
DO 1 | 19543 | 22104 | 15288 |
SELECT 1 | 17004 | 19305 | 13410 |
MariaDB JDBC appears to be a little faster (~10%) than ConnectorJ, and much faster (~30%) than Drizzle JDBC.
Can ConnectorJ do better? I bet it can. Looking into profiler output – CPU profiling, instrumentation mode in NetBeans – for a test that executes “SELECT 1” in a loop, shows com.mysql.jdbc.StatementImpl.findStartOfStatement()
taking 7.5% of runtime. Ok, instrumentation results should be taken with a grain of salt, however the single reason string search is used, is because – if an update (DML) statement is executed inside ResultSet.executeQuery()
, it is rejected with an exception. This can be done differenty, I believe. If absolutely necessary, throwing an exception can be delayed, until the client finds out that the server sent an OK packet instead of a result set.
Even more interesting is the case with Drizzle JDBC. In theory, since the MariaDB driver has a Drizzle JDBC heritage, the performance characteristics should be similar, but they are not, so there must be a bug somewhere. It appears very easy to find, as according to profiler, 50.2% CPU time (take that number with a big grain of salt) is spent in a function that constructs a hexdump from a byte buffer. Looking at the source code, we find following line that is unconditionally executed:
log.finest("Sending : " + MySQLProtocol.hexdump(byteHeader, 0));
While the result of the hexdump is never used (unless logging level is FINEST), the dump string is still created, using relatively expensive Formatter
routines, concatenated with the String “Sending:”, and then thrown away… In Markus’ defense, hexdump()
is not his fault, it was contributed 3 years ago. But it remained undetected for 3 years. This bug is now filed https://github.com/krummas/DrizzleJDBC/issues/21 [UPDATE: this bug was resolved within hours after reporting]
So, let’s check how much we can gain by putting the offending code into an if (log.getLevel() == java.util.logging.Level.FINEST)
condition.
The QPS from “DO 1” raises from 15288 to 19968 (30%), and for “SELECT 1” we have increase from 13410 to respectable 16824 (25%). Not bad for a single line fix.
While the one-liner makes the Drizzle JDBC faster, with slightly better numbers than ConnectorJ, it is still not as fast as MariaDB.
In the MariaDB JDBC connector, there were a couple of improvements to performance which were made since forking. One of the early improvements was to avoid copying data unnecessarily when sending, and to decrease the number of byte buffers. Another improvement came recently, after profiling and finding that parsing Field packets is expensive (mostly due to the construction of Strings for column name, aliases, and etc…). The improvement was lazy parsing, delaying string construction, and avoiding it entirely in most cases. For example, if column names are not used, and rows are accessed using integer indexes in ResultSet.getXXX(int i)
, the metadata won’t be fully parsed. Also, perhaps there were some other fixes that I do not remember anymore. 🙂
Can we further increase the QPS?
We can try. First, statements can be prepared. MariaDB and Drizzle so far only provide client-side prepared statements (ConnectorJ can do both client and server-side prepared statements) but using them saves having to convert the query to bytes, and JDBC escapes preprocessing. From now on I’ll stay just with “DO 1” which proved to be the fastest query. Trying it on MariaDB driver shows some minimal QPS increase 22104 (not prepared) vs 22183 (prepared), or 0.3%. Slightly more on ConnectorJ (19543 vs 20096, or 2.9%). Nothing revolutionary so far.
But, We still have not used all of the options in this (admittedly silly) quest for maximizing the performance of “DO 1”. Recall that ConnectorJ can support named pipes on Windows, which are allegedly much faster than TCP connections. Restart server with named pipe, set JDBC URL to “jdbc:mysql:///?socketFactory=com.mysql.jdbc.NamedPipeSocketFactory&namedPipePath=\\.\Pipe\MySQL&user=root&useConfigs=maxPerformance”, and rerun the test with 1000000 prepared queries. Now the QPS grew to 29542! That is strong, and is a 33% improvement compared to the best result seen so far. Yet, unfortunately, still no cigar, since JVM dumps a stack trace when the named pipe connection is closed. This is a “Won’t fix” (chalked off as a JVM problem) MySQL bug Bug#62518, which renders named pipe support almost useless – though maybe there is a trick to shut up th JVM somehow in this case, but I do not know of such a trick.
How fast is C client library in comparison?
Out of curiosity, I also tested how the native client compares to JDBC. With the TCP protocol, it does slightly better than the fastest JDBC (MariaDB, prepared), but it is not a huge margin – 24063 QPS vs 22183 (8.5% difference), and I believe Java drivers could improve further.
With named pipe, QPS is 33122, which is ~12% better than what ConnectorJ could do, if pipes worked properly there.
Accessing benchmark program
I put the benchmark program on Launchpad, together with the drivers. If you’re on Windows, and if you have a server running on port 3306, and the ‘root’ user doesn’t have a password, you can just branch the repository and run bench_all.bat. Those of you who are using other operating systems, I trust you to be able to quickly rewrite the batch files as shell scripts.
Given that Oracle owns MySQL and Java you think they could prioritize that bug fix with the named pipes 🙂
Also, it was great meeting you at PLMCE. Thanks for letting me show you Shard-Query.
Justin, you must have met someone else at PLMCE . Its been a while I attended a conference (last one was 2011, still OReilly MySQL conf 🙂 But thanks anyway!
After some thinking, I believe ConnectorJ bug might be solved without contacting JVM folks – since ConnectorJ derives NamedPipeSocket from a Socket, they could just l override the culprit method Socket.shutdownInput() to do nothing.
Oh, sorry, I guess I got you confused with another Vlad 🙂 oops. Anyway, great idea with respect to overriding the method.
Thanks for sharing! Couple of remarks:
“If absolutely necessary, throwing an exception can be delayed, until the client finds out that the server sent an OK packet instead of a result set.”
Well, the problem is that in this case the server would already executed the UPDATE statement, right? Even if the underlying storage engine would allow a rollback, you really don’t want to go near that position. Or maybe I’m missing something.
Regarding the benchmark itself: Personally I’m not sure the slightly better performance of the mariadb driver is noticeable in a more realistic scenario (i.e. real queries). My assumption is that in almost all cases database io, and in case of resultsets, network will sooner be a limiting factor than raw java performance.
However, it’d be really interesting if my assumption could be proved false. Have you done such tests? If not, that’s ok, maybe you’ll find time to do so in the future. If you have done such tests, then it’d be nice if you could share the result. If it turns out it doesn’t make much of a difference then it’s not necessary to post detailed data; I would be happy to take your word for it.
Despite these reservations, I think this does show that doing benchmarks is still useful from a standpoint of quality, since it helped identify and fix a drizzle bug. Drizzle has traditionally refrained from publishing benchmarks on the grounds that they are unrealistic and aren’t trusted anyway unless executed by an independent party; this exercise shows that such benchmarks still have value since they can identify bugs or poorly optimized code.
Another interesting observation is the comparison of C client lib versus JDBC driver. This is yet another confirmation that the myth that Java is slow should be buried.
So, thanks again! Enjoyable read.
About ConnectorJ DML detection – The standard says it this way
“If the SQL string being executed does not return a ResultSet object, the method executeQuery throws an SQLException”. It does not say – driver should attempt to parse SQL query in order to find out whether it possibly would be an update.
About realistic scenario : I’m not sure whether good performance is noticeable in realistic scenario either. I had to improvise, and I choose those queries – I was curious myself whether I could dig something useful , or learn something new . I did and I shared some tidbits.
I still think every milli-and-microsecond saved on client is good, user does not care where it is spent.
Yes, I had ran some more tests with larger result sets, and with bigger rows, and connect/disconnect, however I did not have time to really understand the results in detail, and I did not want to publish numbers without explanation. I also think my “large result set” tests sucked because they were retrieving from information schema, it has prohibitive overhead. They were also least conclusive.
However one thing that I noticed is that (already fixed) Drizzle bug was noticable, also on much slow queries with ~60 qps, and even with ~20 qps. IIRC, ConnectoJ driver was slightly better that MariaDB on blobs , but bad on connect/disconnect, even with “useConfigs=maxPerformance”. Drizzle JDBC had stellar connect/disconnect (it does nothing upon establishing connection), very closely followed by MariaDB with “fastConnect=true”. One day, when I have profiled those cases, it could make sense to write about it again.
Vlad, thanks for the extra info. Much appreciated. I agree that all performance gain is good, and surely as a developer I would care about these differences.
I don’t agree the user doesn’t care where the time was spent though; I mean, if they experience performance problems that can be explained mainly in terms of waiting for the network or database io, then they simply won’t be able to solve that by switching drivers. If they aren’t aware that switching drivers doesn’t make the difference they need or expect, they will just be disappointed spending work and time on something that didn’t solve their problems.
(Of course, I’m not saying it is any driver developer’s fault that the user doesn’t understand what problem they have on their hands 🙂
Roland, when I say “user doers not care where time is spent”, I mean the end-user, who wait for web page to load, and who does not know there is java and database behind. Of couse, one can argue that my own users (users of that driver) are DBAs, programmers, consultants etc, people who use JDBC to write web apps. Yes, these people hopefully do care and measure etc.
Good work vlad! 🙂 Would be interesting to see the difference in handling a more real scenario (where I assume the difference will be more or less zero) and big result sets, I had lot of issues in the past with different connectors… Also would be good during your tests to see the CPU usage…
no clue abotu real scenario, I hope there are people better qualified to know life than me:) With large result sets, the problems often is that drivers reads the full huge result sets into memory, and programs die of OOM.
Reading full result set into memory – this mode is default in JDBC connectors too ( Drizzle JDBC does not know any other mode)
ConnectorJ and MariaDB JDBC however have an option to read big results row-by-row (“streaming”)
It is activated with
stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);
If streaming is used, problem that often “in real life” happens is that client does not read off the result set fast enough, and server blocks in socket writ()e, and if it waits long enough (net_write_timeout), and after that it just closes client’s connection. ConnectorJ increases net_write_timeout before executing streaming query. MariaDB JDBC does not do that but of course people can set their timeouts higher themselves, e.g in the connection URL’a “sessionVariables=net_write_timeout=1000”
You’re making a new connection for each query in this benchmark, correct? Would the performance difference between different drivers go away if using connection pool?
Nope, it is one-time connection and thousands of exactly the same queries. The speed of the connect was not measured at all, only of the queries. Connect/Disconnect is another story. I might write about this next time, this is where pooling would matter.
Would the differences be more prominent with MySQL’s query cache?