In my last article Query vs Execute vs Prepare, I tried to give you a peek into how Golang tries to juggle between a normal query or a prepared statement based on how you choose to invoke a query or execute or prepare. We emphasized on one of the key aspects which is the network calls you would incur in choosing one over the other so that you can make an informed decision while working with the databse/sql package.
In this article, I’ll try to shed some light on the how the database connection is used by this package.
In order to run a query on a database, one must connect to the database first. I know that’s stating the obvious, but:
- Do you know that while opening a connection, Golang allows you to choose whether you want to work with a single connection — sql.Conn or with a database object sql.DB ?
- Why would one need to choose between the two?
- What are other considerations while choosing one over the other?
We’ll try to answer these three questions in this article.
Connection vs Connection Abstraction
Golang advocates to work with sql.DB which is an abstraction over an actual DB connection. This helps the consumer in the following ways:
1. sql.DB under the hood works with a connection pool.
2. For every query or execute or prepare fired on sql.DB, it gets a connection from the pool to fulfil that query and returns the connection back to the pool.
3. It offers automatic retries. So, if a query fails because of a bad connection, sql.DB would internally choose another connection from the pool and retry up to 10 times.
As part of the connection pooling in
database/sql, handling failed connections is built-in. If you execute a query or other statement and the underlying connection has a failure, Go will reopen a new connection (or just get another from the connection pool) and retry, up to 10 times. (Refer)
Hence, as you can see, working with sql.DB abstracts the connection and hence the pain of connection management from the application.
One thing you must observe is, that in order to reap the benefits offered by sql.DB, you are losing out on connection affinity i.e. the ability to do multiple queries on the same connection. A natural question would be: Why would one even need to have a connection affinity?
Need of Connection Affinity
Suppose you are tasked to design the database for TVA (The MCU TVA!) where records for all the timeline breaches as well as TVA origins are kept in a single table. I agree not the best of the designs, but please bear with me. A Loki variant who has access to the system may try to access the classified files about TVA origins. You want to restrict Loki to access only those records specific to his cases, and nothing else.
While you can try doing this using a where clause, a lesser invasive option would be to enforce a Row Level Security (read).
Row-Level Security (RLS) simplifies the design and coding of security in your application. RLS helps you implement restrictions on data row access. For example, you can ensure that workers access only those data rows that are pertinent to their department. Another example is to restrict customers’ data access to only the data relevant to their company. (Refer)
Enforcing RLS would need you to set a run time config on the DB before executing the actual query. This implies you will have to use the same connection for setting the run time config as well as running the actual query.
With sql.DB , you are not guaranteed that two successive queries will reuse the same connection, as it works with a connection pool.
How to achieve Connection Affinity?
There are two ways in which you can achieve this connection affinity.
1. Forego the goodness of sql.DB and use sql.Conn directly. This is the continuous single database connection on which you can run all the queries you need before letting go off the connection.
2. Work with sql.Tx, a transaction which internally has a connection affinity. When you prepare a transaction and then execute it, both these actions must happen on the same connection. If for some reason, the connection is not available at the time of execution (may taken up by a parallel request), a new connection would be picked from the pool, on which prepare and execute will be done afresh.
Now that we know the two ways in which we can get the connection affinity, we must also spare a thought about how to be as resilient without the inherent retries that we could get when working with sql.DB connection pool.
When working with a database, be it on premise or in the cloud, where network partition can never be ruled out, retrying a database query in case of an error is one of the simplest resiliency strategies.
Retry can again be thought of at two levels:
- Retrying the query again on the same connection: Useful in certain scenarios like a deadlock situation wherein the deadlock is resolved by the database by failing your query. You strategy here should be to get the connection and try the query in a loop with a configurable number of attempts.
- Retrying the query again with a new connection: Most of the time you may need this when you run into any connection errors. When working with a transaction, where say out of 4 queries A, B, C and D, you failed on D. In this case, you want to rollback and retry starting a new transaction followed by A, B, C, D again on a new connection in a loop with a configurable number of attempts.
- Golang exposes — sql.DB, sql.Conn and sql.Tx. Application must choose wisely depending on the need for connection affinity.
- With sql.DB you get access to connection pool and automatic retries but lose on connection affinity.
- With sql.Conn or sql.Tx, you get the connection affinity, but you must also take control of query retries (often with a new connection) from the application.