Thursday, March 10, 2011

concurrency concepts in databases and hibernate

I've read about this multiple times but often forget the details as it is not often that I need to work on it in my applications. So, here I am taking quick notes for faster recap of it later.

First thing to know is what are the issues that can happen by multiple concurrent transactions if no preventive measures are taken.

lost update : I have found different meanings to this term. The hibernate book describes this as follow. One transaction commits its modifications and another one rolls back resulting undoing of the updates done by the other transaction.
However at many other places lost-update has the meaning that of second-lost-update described later. And, ANSI SQL-92 standard does not seem to talk about it.

dirty read : One transaction fetches a record and modifies it. Another transaction fetches the same record and gets the modifications made by first transaction even if they are not committed.

unrepeatable read : One transaction reads some record, another transaction commits some modifications to same record, first transaction reads the same record again and finds it different from what was read previously.
One special case of unrepeatable read is the second lost updates problem. Let say, One transaction reads some record and modifies it, another transaction reads same record and modifies it. First one commits its modifications and then the second commits and modifications made by first transaction are lost.

phantom read : One transaction reads a list of records by some filter criteria. Another transaction starts, inserts some records that fulfill the filter criteria used by first transaction and commits. First transaction reads same list again by same filter criteria but ends up getting a different set of records this time.

Now, to overcome above issues ANSI standard defines following transaction isolation levels.

read uncommitted : this permits all 3.. dirty read, unrepeatable read and phantom reads.

read committed : this does not permit dirty reads but allows unrepeatable as well as phantom reads.

repeatable read : this does not permit dirty and unrepeatable reads but allows phantom reads.

serializable : this is the strictest transaction isolation which simply does not seem to allow concurrent transactions. this again is usually not used in typical applications due to being too restrictive and badly performing. This does not allow any of dirty read, unrepeatable read or phantom reads.

exact implementation of above isolation levels varies significantly among the vendors. One has to consult the docs of particular db to understand the impact of each isolation level to performance and scalability.

In Hibernate, by default, every JDBC connection to a database is in the default isolation level of the DBMS, usually read-committed or repeatable-read. You can change this by explicitly setting hibernate.connection.isolation property. Then hibernate will set the mentioned isolation level to each new JDBC connection(note that it will not change the default isolation level of your db).

Optimistic concurrency control:
An optimistic approach assumes that everything will be OK and any conflicts are detected only in the end when data is written/flushed to the db.
Multi-user applications usually default to optimistic concurrency control and database connections with a read-committed isolation level.
Let say the isolation level is set to read-committed. Transaction A and B start at the same time and both read and modify the same record(they can not see each other's changes as dirty read is not permitted in read-committed isolation level). Transaction A commits and then B does. Then one of the following 3 things can happen.

last commit wins : both transactions commit successfully and B overrides any modifications done by A and no error raised.

first commit wins : when second transaction(B) is committed, conflict is detected and error is raised.

merge conflicting updates : conflicts are detected and one interactive dialogue helps resolving those conflicts

With hibernate, you get last-commit-wins by default. You can enable first-commit-wins strategy by enabling optimistic concurrency control. This needs versioning enabled for entities .


In the end, if you need more fine grained control on locking then you can use variety of "SELECT ... FOR UPDATE ..." statements by using LockMode.UPGRADE et al. This is called explicit pessimistic locking.


Reference:
Chapter-9: Transactions and Concurrency in the book "Java Persistence with Hibernate".
ANSI SQL-92 Standard Spec

No comments:

Post a Comment