Steven Devijver lives in Antwerp, Belgium. He works as a Java
contractor focusing on financial services. As a former Perl addict he
now gets his kicks from Groovy. He has a keen interest in remote
GUIs and persistence strategies.
If you are creating or maintaining servlet-based web applications, you should take the time to read the servlet specifications if you haven't already. They contain a great deal of useful information on how servlet containers work, which developers of web-based applications should understand.
One of the more interesting areas of the specifications is how servlet containers handle incoming requests. The servlet container creates a single instance of each servlet declared in the web.xml configuration file. Requests may, and probably will, occur concurrently, which means multiple threads will be running your code simultaneously. Thread-safety is therefore an important issue in web applications. Developers should be aware of this issue and should make sure their code works in a thread-safe way.
Let's take a moment to think about what thread-safety means and how you can achieve it. There are many resources available on the subject of thread-safety and I really like the article on Wikipedia. Basically the article says a piece of code is thread-safe if it is reentrant or protected from multiple simultaneous execution by some form of mutual exclusion.
Most of you probably know or have heard or read about the synchronized keyword in Java. Java offers thread support natively - without additional libraries - and the synchronized keyword is seen by many as the cornerstone of thread-safety in Java. Synchronization offers mutual exclusion in Java. Synchronizing a block of code or an entire method guarantees that no more than one thread holding the same lock will execute the code at any given time, effectively guaranteeing thread-safe execution. One of the side effects of synchronization is congestion. Think of a single receptionist at the front desk of a big corporation or a lawyer's office who has to take all incoming phone calls, handle the snail mail, maybe even handle the e-mail of multiple executives, and welcome visitors. On busy days this probably leads to a lot of stress and it certainly leads to congestion.
Congestion is a big problem in web-based applications. Having even a single synchronized block of code executed by every incoming request may bring your application to its knees with as few as five simultaneous users, because only one thread can execute the synchronized block while the other threads remain suspended until they can obtain the lock. Congestion is not the only problem created by mutual exclusion; you may also be faced with deadlocks. A deadlock is usually non-recoverable and occurs when thread A holds a lock on an artifact in the application and waits for an artifact locked by thread B, while thread B in turn waits for thread A to release its lock. Often deadlocks are much more complex, involving many threads that create a cascade of locks which together result in a deadlock.
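To make the mutual exclusion described above concrete, here is a minimal sketch of a counter guarded by synchronized methods. The class name SynchronizedCounter and the helper runWorkers are made up for this illustration, and the lambda syntax assumes Java 8 or later. Because every increment has to acquire the same lock, the final count is always exact, but the worker threads effectively take turns, which is the congestion discussed above.

```java
public class SynchronizedCounter {
    private int count = 0;

    // Only one thread at a time may run a synchronized method on a given instance.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    // Hypothetical helper: starts `threads` workers that each call
    // increment() `perThread` times, then returns the final count.
    public static int runWorkers(int threads, final int perThread) {
        final SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10000));
    }
}
```

Removing the synchronized keyword from increment() would make the result unpredictable, since count++ is a read followed by a write and two threads can interleave between the two steps.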
Besides the synchronized keyword, Java also offers a number of synchronized classes that are widely used and may lead to the same problems: java.util.Hashtable and java.util.Vector. Their methods are synchronized, so only use them if you really have to. Otherwise use java.util.HashMap and java.util.ArrayList. The wrappers returned by the synchronizedXxx() methods in java.util.Collections (synchronizedList(), synchronizedMap() and so on) offer the same synchronized behavior.
Reentrance introduces a different set of problems, yet they are easier to manage. Reentrant code avoids sharing data across threads. Let's put this in a Java perspective. First you need to understand - or maybe you understand already - that local variables in Java methods are inherently thread-safe. Consider the trivial method below:
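As a small illustration of the Collections wrappers mentioned above, the sketch below wraps an ArrayList with Collections.synchronizedList() and lets several threads append to it. The class name SynchronizedListDemo and the helper fill are made up for this example, and the lambda syntax assumes Java 8 or later. Every call on the wrapper is serialized on a single lock, much as the methods of Vector are, so the same congestion concerns apply.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedListDemo {

    // Hypothetical helper: `threads` workers each append `perThread` elements
    // to one shared list; returns the list size once all workers are done.
    public static int fill(int threads, final int perThread) {
        // Each method call on the wrapper acquires the same lock.
        final List<Integer> shared =
                Collections.synchronizedList(new ArrayList<Integer>());
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    shared.add(j);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 1000));
    }
}
```

With a plain ArrayList shared between the workers the size could come out wrong or an exception could be thrown, because ArrayList makes no promises about concurrent modification.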
public Double pi() {
    int a = 22;
    int b = 7;
    // Cast before dividing: plain int division would truncate 22 / 7 to 3.
    return new Double((double) a / b);
}
No matter how many threads enter this method simultaneously, it will never be unsafe. Every thread has its own stack that is local to the thread and not shared with other threads. When a thread enters a method, the variables created within the scope of that method are stored on the stack of that thread (this also goes for static methods). So when threads A and B enter this method simultaneously, each creates its own variables a and b. The example is thread-safe because no data is shared. Note: the quotient of 22 divided by 7 comes close to pi and will do for most use cases, but it's not exactly pi.
Consider the following example that optimizes performance by doing the calculation once and reusing the result afterwards.
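private Double pi = null;

public Double pi() {
    if (pi == null) {                // not thread-safe: two threads can both see null here
        pi = new Double(22.0 / 7.0); // floating-point literals; 22 / 7 would be int division
    }
    return pi;
}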
While this example may improve performance, it is not thread-safe. Consider the case where pi is null and threads A and B enter the method and simultaneously execute the null check. Both tests return true. Next, suppose thread A continues: it creates a Double holding the result of 22 divided by 7, stores the reference in the field, returns that reference and exits the method, while thread B is suspended by the VM for some reason. When thread B resumes, it creates a second Double in the assignment and overwrites the field, so the reference already returned to thread A's caller no longer matches the one held in the field. This is potentially very dangerous and could lead to nasty bugs.
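Real races are hard to reproduce on demand, so the sketch below replays the interleaving described above step by step on a single thread. It is only a simulation under stated assumptions: the class LazyInitRace and the nested Result class (a stand-in for the cached Double that makes object identity easy to see) are made up for this illustration.

```java
public class LazyInitRace {

    // Hypothetical stand-in for the cached Double.
    static final class Result {
        final double value;

        Result(double value) {
            this.value = value;
        }
    }

    private Result pi = null; // shared field, no synchronization

    // Replays the problematic schedule by hand.
    // Returns true when callers A and B end up holding different objects.
    public boolean replayInterleaving() {
        boolean aSeesNull = (pi == null); // thread A runs the null check
        boolean bSeesNull = (pi == null); // thread B runs it before A has assigned

        if (aSeesNull) {
            pi = new Result(22.0 / 7.0);  // A assigns the field ...
        }
        Result returnedToA = pi;          // ... and returns the reference

        if (bSeesNull) {
            pi = new Result(22.0 / 7.0);  // B resumes and overwrites the field
        }
        Result returnedToB = pi;

        return returnedToA != returnedToB;
    }

    public static void main(String[] args) {
        System.out.println(new LazyInitRace().replayInterleaving());
    }
}
```

For an immutable value such as a Double the two objects are at least equal; with a mutable or expensive object, two callers silently working on different instances is exactly the kind of bug that is hard to track down.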
Consider this example which uses ThreadLocal to make the method pi() again thread-safe while still offering performance gains:
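private static ThreadLocal pi = new ThreadLocal();

public Double pi() {
    // get() returns null until the current thread has called set().
    if (pi.get() == null) {
        // Floating-point literals; 22 / 7 would be int division and yield 3.
        pi.set(new Double(22.0 / 7.0));
    }
    return (Double) pi.get();
}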
The ThreadLocal class wraps any object and binds it to the current thread thus making objects local to the thread. When a thread executes the pi() method for the first time there will be no object bound to the thread by ThreadLocal instance pi so the get() method will return null. The set() method will bind an object to the thread that is not shared by other threads. If the method pi() is called often per thread this approach may still offer a considerable performance gain while guaranteeing thread-safety.
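The per-thread behavior can be checked with a small experiment. The sketch below is a variation on the method above, assuming a Java 8 or later runtime (generics and lambdas); the class name ThreadLocalPi, the counter COMPUTATIONS and the helper computationsFor are made up for this illustration. It counts how often the value is actually computed when several threads each call pi() repeatedly.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalPi {
    private static final ThreadLocal<Double> PI = new ThreadLocal<Double>();

    // Counts computations across all threads (for the experiment only).
    private static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    public static Double pi() {
        if (PI.get() == null) {
            COMPUTATIONS.incrementAndGet();
            PI.set(Double.valueOf(22.0 / 7.0));
        }
        return PI.get();
    }

    // Hypothetical helper: starts `threads` fresh threads that each call pi()
    // `callsPerThread` times; returns the number of computations they caused.
    public static int computationsFor(int threads, final int callsPerThread) {
        int before = COMPUTATIONS.get();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    pi();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return COMPUTATIONS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println(computationsFor(3, 5));
    }
}
```

Each fresh thread computes the value exactly once and reuses it afterwards, so the number of computations equals the number of threads rather than the number of calls.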
There's a lot of skepticism about the use of ThreadLocal. While it's true that the performance of ThreadLocal was very poor prior to Java 1.4, this issue has since been resolved. It's also true that there's a lot of misunderstanding about ThreadLocal, resulting in a lot of misuse. However, the usage of ThreadLocal in the example above is perfectly safe. As you can see, the behavior of the method has stayed the same after introducing ThreadLocal; we only made the method thread-safe.
Writing thread-safe code through reentrance requires you to be careful when using instance variables or static variables, especially when you are modifying objects that may be used by other threads. Synchronization may offer a way out in some use cases. However, identifying the performance bottlenecks that synchronization may introduce is only possible by using profiling tools and testing under load.
After this crash course in thread-safety, let's consider how this could affect web applications by examining a typical use case. We create a number of pages that allow the manipulation of data in a database through a well-defined process. This process is implemented partially in the workflow of the web tier and partially in business logic. We use Hibernate to persist our domain model to the database. The web tier could be Tapestry, Wicket, Struts, Webwork, JSF, Spring MVC or any other framework designed to run in the servlet container.
How the web tier is implemented is outside the scope of this article. Instead we will focus on how to manage database connectivity which is usually the biggest source of problems related to thread-safety in web applications. Database connectivity objects like connections, result sets, statements or Hibernate sessions are stateful objects. They are not thread-safe by design and should not be shared by multiple threads simultaneously. We've decided upfront we want to avoid the use of synchronization in our code - either through the use of the synchronized keyword or synchronized classes like Hashtable or Vector - because we do not want to deal with congestion or deadlocks. Instead we want to achieve thread-safety through reentrance.
Still, implementing thread-safe database access through reentrance remains a tedious task. Some clever people have come up with a solution by adding a filter to the servlet container configuration. Such a filter would, amongst other things, create a JDBC connection or Hibernate session at the start of the request, before the web tier is invoked, and bind it to the current thread by means of ThreadLocal for use in the business logic. While this approach would allow us to achieve thread-safe data access, there still is the issue of managing transactions and database errors, in addition to the required use of a lot of boilerplate code in the business logic. Boilerplate code is particularly bad, not only because we have to copy and paste it, which makes our application hard to maintain, but also because it's potentially buggy.
Some of you may have heard of the data access abstraction offered by Spring; some may have used it. Spring's data access abstraction is known for declarative transaction demarcation and reuse of data access best practices through templates. One area that is less covered when reviewing Spring is the thread-safe way in which data access is achieved. This is available in Spring for JDBC, Hibernate, JDO, iBatis and TopLink. Let's have a look at how this works in our typical use case.
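The filter approach can be sketched without the servlet API. In the plain-Java sketch below, handleRequest() plays the role a filter's doFilter() method would play, and Resource is a made-up stand-in for a JDBC connection or Hibernate session; none of these names come from a real library. The point is only the bind, run, clean-up sequence around each request.

```java
public class RequestScopedResource {

    // Hypothetical stand-in for a JDBC connection or Hibernate session.
    public static final class Resource {
        private boolean open = true;

        public boolean isOpen() {
            return open;
        }

        void close() {
            open = false;
        }
    }

    private static final ThreadLocal<Resource> CURRENT = new ThreadLocal<Resource>();

    // Business logic calls this instead of receiving the resource as a parameter.
    // Returns null when no request is being handled on the current thread.
    public static Resource current() {
        return CURRENT.get();
    }

    // Stands in for a filter: bind the resource, run the request, always clean up.
    public static void handleRequest(Runnable request) {
        Resource resource = new Resource();
        CURRENT.set(resource);
        try {
            request.run();
        } finally {
            // Unbind first so a pooled thread never carries the resource
            // into its next request, then release the resource itself.
            CURRENT.remove();
            resource.close();
        }
    }

    public static void main(String[] args) {
        handleRequest(() -> System.out.println("bound: " + (current() != null)));
        System.out.println("bound after request: " + (current() != null));
    }
}
```

Note what the sketch leaves out: transactions, error handling and the boilerplate in the business logic, which are the remaining issues mentioned above.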
We start by defining a data source and session factory for Hibernate.
<bean id="propertyConfigurer" class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
    <property name="locations">
        <list>
            <value>WEB-INF/jdbc.properties</value>
        </list>
    </property>
</bean>
<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
    <property name="driverClassName"><value>${jdbc.driverClassName}</value></property>
    <property name="url"><value>${jdbc.url}</value></property>
    <property name="username"><value>${jdbc.username}</value></property>
    <property name="password"><value>${jdbc.password}</value></property>
</bean>
<bean id="sessionFactory" class="org.springframework.orm.hibernate.LocalSessionFactoryBean">
    <property name="dataSource">
        <ref bean="dataSource"/>
    </property>
    <property name="mappingDirectoryLocations">
        <list>
            <value>classpath:</value>
        </list>
    </property>
    <property name="hibernateProperties">
        <props>
            <prop key="hibernate.dialect">net.sf.hibernate.dialect.HSQLDialect</prop>
            <prop key="hibernate.show_sql">true</prop>
        </props>
    </property>
</bean>
This is a basic setup for a typical use case using Hibernate. We define a data source to connect to the database and a local session factory that creates an instance of a Hibernate session factory. Our next goal is to add a business object that accesses the database through Hibernate and a transaction manager that manages local transactions through a Hibernate session. The business object exposes a method that creates an object in the database while the transaction manager wraps this method in a transaction. First let's have a look at the business object and its interface.
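// CustomerDAO.java
public interface CustomerDAO {
    public void createCustomer(Customer customer);
}

// HibernateCustomerDAO.java
import net.sf.hibernate.SessionFactory;
import org.springframework.orm.hibernate.HibernateTemplate;

public class HibernateCustomerDAO implements CustomerDAO {
    private HibernateTemplate hibernateTemplate = null;

    public void setSessionFactory(SessionFactory sessionFactory) {
        // false: never open a new session; fail if none is bound to the current thread
        this.hibernateTemplate = new HibernateTemplate(sessionFactory, false);
    }

    public void createCustomer(Customer customer) {
        this.hibernateTemplate.save(customer);
    }
}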
As you can see from the example above, we use HibernateTemplate, which is offered by Spring. This template class is implemented following best practice guidelines and abstracts boilerplate code. It also converts the checked exceptions thrown by the data access technology to unchecked exceptions and does a great job at throwing specific exceptions for specific error conditions. Similar template classes are offered by Spring for JDBC, iBatis SqlMap, JDO and TopLink. All of these template classes and their instance variables are thread-safe through reentrance and can be shared safely by multiple threads concurrently. Using these templates does more than leverage code reuse and best practices: there's something else going on that offers us thread-safe data access. Let's have a look at how we set up our business object and transaction management in Spring.
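To illustrate the template idea in isolation, here is a minimal sketch of a template method that owns the try/catch boilerplate and translates a checked exception into an unchecked one. It is not Spring's implementation: TemplateSketch, LowLevelException, DataAccessFailure and Callback are all made-up names, and the lambda in main() assumes Java 8 or later.

```java
public class TemplateSketch {

    // Hypothetical checked exception, standing in for what a data access API might throw.
    public static class LowLevelException extends Exception {
        public LowLevelException(String message) {
            super(message);
        }
    }

    // Hypothetical unchecked counterpart that callers are not forced to catch.
    public static class DataAccessFailure extends RuntimeException {
        public DataAccessFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // The only part the caller has to write.
    public interface Callback<T> {
        T doWork() throws LowLevelException;
    }

    // The template owns the boilerplate; it keeps no state of its own,
    // so one instance (here: one static method) can serve many threads.
    public static <T> T execute(Callback<T> callback) {
        try {
            return callback.doWork();
        } catch (LowLevelException e) {
            // Translate to unchecked, keeping the original as the cause.
            throw new DataAccessFailure("data access failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        String result = execute(() -> "saved");
        System.out.println(result);
    }
}
```

The business code shrinks to the callback body, and error handling lives in one place instead of being copied and pasted around.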
<bean id="customerDAOTarget" class="test.usecase.HibernateCustomerDAO">
    <property name="sessionFactory"><ref bean="sessionFactory"/></property>
</bean>
<bean id="transactionManager" class="org.springframework.orm.hibernate.HibernateTransactionManager">
    <property name="sessionFactory"><ref bean="sessionFactory"/></property>
</bean>
<bean id="customerDAO" class="org.springframework.transaction.interceptor.TransactionProxyFactoryBean">
    <property name="transactionManager"><ref bean="transactionManager"/></property>
    <property name="target"><ref bean="customerDAOTarget"/></property>
    <property name="transactionAttributes">
        <props>
            <prop key="create*">PROPAGATION_REQUIRED</prop>
            <prop key="*">PROPAGATION_REQUIRED</prop>
        </props>
    </property>
</bean>
For those who are not familiar with transaction management in Spring, let's go through the configuration. First we set up our business object, HibernateCustomerDAO, which we wire with the Hibernate session factory instance. Note that all beans in Spring are singletons by default, and so is our business object. This means multiple threads may execute the createCustomer() method simultaneously.
Next we configure the Hibernate transaction manager which is wired with the same Hibernate session factory instance. This transaction manager will do a number of things every time it is called upon. First of all the transaction manager will check if a Hibernate session is bound to the current thread and will use it if one is available. If no Hibernate session is bound to the current thread the transaction manager will ask the Hibernate session factory for a new Hibernate session and will bind that session to the current thread. Next - as defined in our setup - the transaction manager will start a new transaction through the Hibernate session if no transaction is currently active, otherwise the current active transaction is joined.
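The decision logic described above (reuse the session bound to the thread or create and bind one, then start a transaction or join the active one) can be sketched in plain Java. This is a deliberately simplified illustration and not Spring's implementation: ThreadBoundSessions, Session, beginOrJoin() and the other names are made up, and the lambda assumes Java 8 or later.

```java
public class ThreadBoundSessions {

    // Hypothetical stand-in for a Hibernate session.
    public static final class Session {
        private boolean transactionActive = false;

        public boolean isTransactionActive() {
            return transactionActive;
        }
    }

    private static final ThreadLocal<Session> BOUND = new ThreadLocal<Session>();

    // Reuse the session bound to the current thread, or create and bind one.
    public static Session currentSession() {
        Session session = BOUND.get();
        if (session == null) {
            session = new Session();
            BOUND.set(session);
        }
        return session;
    }

    // Returns true if this call started a transaction, false if it joined one.
    public static boolean beginOrJoin() {
        Session session = currentSession();
        if (session.transactionActive) {
            return false; // join the transaction that is already active
        }
        session.transactionActive = true;
        return true;
    }

    // Removes the binding for the current thread.
    public static void unbind() {
        BOUND.remove();
    }

    // Hypothetical helper: returns the session that a brand-new thread obtains.
    public static Session sessionOnNewThread() {
        final Session[] holder = new Session[1];
        Thread other = new Thread(() -> holder[0] = currentSession());
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder[0];
    }
}
```

Within one thread every caller sees the same session and the same transaction, while another thread gets a session of its own, so nothing stateful is shared across threads.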
This behavior can be specified declaratively through the TransactionProxyFactoryBean, which is also bundled with Spring. TransactionProxyFactoryBean creates a proxy object for our business object that manages transactions through a transaction manager. Every time the createCustomer() method is called through this proxy object, the transaction manager will manage transactions as defined by the transaction attributes. Besides the HibernateTransactionManager, Spring currently offers a number of other transaction manager implementations, including transaction managers for JDBC data sources, JDO and TopLink.
Now, let's go back to our business object. When we call the createCustomer() method, HibernateTemplate will look for a Hibernate session bound to the current thread. Because we passed false as the second parameter to the HibernateTemplate constructor, an unchecked exception will be thrown if no Hibernate session is found, which offers another safety net in case we did not configure transaction management properly for the createCustomer() method. Remember what we said about the transaction manager: if transaction management is configured properly, a Hibernate session will be bound to the current thread and a transaction will have been started. Note, however, that HibernateTemplate will not check if a transaction is currently active, nor will it explicitly start or end transactions. Also note that the current active transaction will be rolled back if an unchecked exception is thrown in the scope of a demarcated method. Through transaction attributes we can modify this behavior declaratively, although this goes beyond the scope of this article.
Let's briefly sum up what we've learned about thread-safe data access offered by Spring. By using transaction management and leveraging the power of ThreadLocal, Spring binds a database connectivity artifact - either a JDBC connection, a Hibernate session or a JDO persistence manager - to the current thread, where it is used by the data access templates. If you put this into perspective based on what we talked about in the first part of this article, you can see that database connections are not shared between threads concurrently. Spring not only offers declarative transaction management, abstraction of boilerplate code and best practices, it also offers thread-safety. Achieving thread-safety in your application through reentrance is straightforward when using Spring for database connectivity.
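The proxy idea can be illustrated with a JDK dynamic proxy. The sketch below is not how TransactionProxyFactoryBean is implemented; it only shows the general technique of wrapping a business object so that every call is surrounded by transaction handling. The names TransactionProxySketch, CustomerDao, wrap(), runSuccessfulCall() and runFailingCall() are made up, a list of strings stands in for a real transaction manager, and the lambdas assume Java 8 or later.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class TransactionProxySketch {

    // Hypothetical business interface.
    public interface CustomerDao {
        void createCustomer(String name);
    }

    // Wraps `target` so each call is bracketed by begin and commit or rollback.
    // `events` records what a real transaction manager would be asked to do.
    public static CustomerDao wrap(final CustomerDao target, final List<String> events) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                events.add("begin");
                try {
                    Object result = method.invoke(target, args);
                    events.add("commit");
                    return result;
                } catch (InvocationTargetException e) {
                    // The target threw. CustomerDao declares no checked exceptions,
                    // so this is an unchecked exception: roll back and rethrow it.
                    events.add("rollback");
                    throw e.getCause();
                }
            }
        };
        return (CustomerDao) Proxy.newProxyInstance(
                CustomerDao.class.getClassLoader(),
                new Class<?>[] { CustomerDao.class },
                handler);
    }

    public static List<String> runSuccessfulCall() {
        final List<String> events = new ArrayList<String>();
        CustomerDao dao = wrap(name -> events.add("save " + name), events);
        dao.createCustomer("Ada");
        return events;
    }

    public static List<String> runFailingCall() {
        final List<String> events = new ArrayList<String>();
        CustomerDao dao = wrap(name -> {
            throw new IllegalStateException("boom");
        }, events);
        try {
            dao.createCustomer("Ada");
        } catch (IllegalStateException expected) {
            events.add("caller saw " + expected.getMessage());
        }
        return events;
    }
}
```

The business object itself contains no transaction code at all; callers that hold the proxy get the transactional behavior, and callers that hold the bare target do not.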