Friday, September 23, 2011

JAXB and Java 5 Enums

In my previous post about Java 5 Enums, I wrote about how Java 5 lets you add behavior (via member fields and methods) within the enum class itself. This can be leveraged to provide switch-like behavior on the enumeration constants: instead of a switch statement in client code, we can define an abstract method in the enum and override that method in each of the enum constants. However, there are cases where it is not possible to change the enum class to define such an abstract method, for example when the enum class has been generated, such as by the JAXB compiler. In that case, the only way to provide custom behavior is to program a switch construct in a class that uses the enum.

Digressing a little from the main point, I would also like to point out how JAXB treats different XSD enum definitions. Consider a file, example.xsd, containing two schema definitions, viz. CountryCurrencyEnum and CountryEnum. The currency enum is an example of an inline (anonymous) definition, while the country enum is an example of an external (top-level) definition. When you run the explicit, top-level enum definition through the XJC compiler, it produces a separate enum class. However, when XJC sees the inline enum definition, it does not. So if we are interested in generating Java 5 style enums via JAXB, we are better off creating a separate, named definition in the XSD file.
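For illustration, suppose XJC generated a CountryEnum that we cannot edit; the custom behavior then has to live in a client-side switch. This is only a sketch - the enum constants here are hypothetical stand-ins for generated code:

```java
public class StatusHandler {
    // stand-in for an XJC-generated enum that we cannot modify
    enum CountryEnum { US, UK, JP }

    // since we cannot add an abstract method to the generated enum,
    // the per-constant behavior is a switch in client code
    static String currencyFor(CountryEnum country) {
        switch (country) {
            case US: return "USD";
            case UK: return "GBP";
            case JP: return "YEN";
            default: throw new IllegalArgumentException("Unknown: " + country);
        }
    }

    public static void main(String[] args) {
        System.out.println(currencyFor(CountryEnum.US)); // US
    }
}
```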
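A sketch of what example.xsd might contain; the element names and enumeration values are illustrative, following the names used in the post:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- external (top-level) enum definition: XJC generates a separate
       Java 5 enum class named after the simpleType -->
  <xs:simpleType name="CountryEnum">
    <xs:restriction base="xs:string">
      <xs:enumeration value="US"/>
      <xs:enumeration value="UK"/>
      <xs:enumeration value="JP"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- inline (anonymous) enum definition: by default XJC does not
       produce a dedicated enum class for this -->
  <xs:element name="countryCurrency">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="USD"/>
        <xs:enumeration value="GBP"/>
        <xs:enumeration value="YEN"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```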

Saturday, August 20, 2011

Customizing Maven's lifecycle

Every project has a lifecycle, from project initiation all the way through regular production releases. A typical lifecycle involves at least the basic steps of gathering all the resources, compiling source code, running unit tests and integration tests, and packaging into the desired format. Maven formalizes this concept into a “default” lifecycle that every project is expected to have. In Maven, a lifecycle is made up of a sequence of phases, and each phase has zero or more plugin goals bound to it. Some of the core plugin goals are bound to the core phases by default, e.g. the compiler plugin’s compile goal is bound to the compile phase; a clearer second example is the Surefire plugin’s test goal, which is bound to the test phase. However, some phases are not necessarily part of the core of the default lifecycle; they are activated only when a plugin goal that binds to them by default is configured. At the same time, some plugins and their goals are free-form, and it is up to the developer to bind them to a particular phase.

The Maven site lists all of the following phases as part of the default lifecycle for any project, regardless of the packaging type. Here are a few examples of which aspects of the project build we can customize by leveraging which plugins.

1. validate - Validates that the project is correct and all necessary information is available. The enforcer plugin can be used here to enforce a few environment-specific constraints on the project.
2. initialize - Initialize build state, e.g. set properties or create directories. An example would be to use a programmatic plugin like antrun or groovy to generate custom properties; for instance, a ‘buildtime’ property can be created at build time and used inside the manifest file to mark the time the jar/war/ear was created.
3. generate-sources - Generate any source code for inclusion in compilation. Projects that use web services or any of the Java-XML bindings, like JAXB2, XMLBeans or Castor, can use the corresponding plugins (e.g. axistools:wsdl2java, jaxb2:xjc, xmlbeans:xmlbeans, castor:castor) to generate source code in this phase. These plugin goals bind to this lifecycle phase by default, so you do not have to mention the phase explicitly.
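The ‘buildtime’ property from the initialize step could be produced with a sketch like the following. The plugin version is an assumption, and the exportAntProperties flag needs maven-antrun-plugin 1.7 or later:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <version>1.7</version>
  <executions>
    <execution>
      <id>set-buildtime</id>
      <phase>initialize</phase>
      <goals>
        <goal>run</goal>
      </goals>
      <configuration>
        <!-- make the Ant-created property visible to the Maven build -->
        <exportAntProperties>true</exportAntProperties>
        <target>
          <tstamp>
            <format property="buildtime" pattern="yyyy-MM-dd HH:mm:ss"/>
          </tstamp>
        </target>
      </configuration>
    </execution>
  </executions>
</plugin>
```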
4. process-sources - Process the source code, for example to filter any values from within the source code. More on filtering in a later blog.
5. generate-resources - Generate resources for inclusion in the package like WSDLs in a web-services project (e.g axistools:java2wsdl), or Hibernate Configuration XML or hbm.xmls (hibernate3:hbm2cfgxml)
6. process-resources - Copy and process the resources into the destination directory, ready for packaging. This is also a great place to filter any values from within the resources file. More on filtering in a later blog.
7. compile - This is the most obvious phase. Although the compiler plugin is part of the core lifecycle and need not be explicitly attached to any phase, there is often a need to customize the JDK version to compile against via the plugin’s configuration.
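For instance, pinning the compiler to the Java 5 language level could look like this sketch (the version values are illustrative):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <!-- compile against the Java 5 language level regardless of the
         JDK running Maven -->
    <source>1.5</source>
    <target>1.5</target>
  </configuration>
</plugin>
```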
8. process-classes - This phase is used to post-process the class files generated by compilation, for example to do bytecode enhancement on Java classes.
9. generate-test-sources - Conceptually the same as the generate-sources phase above, except that here we generate test source code.
10. process-test-sources - This phase is used to process test source files, e.g to filter any values, make program variables point to a test DB location instead of production DB location etc.
11. generate-test-resources - This phase is used to create resources for testing. E.g. say your project's tests depend on a companydata.dat file which is part of companycore.jar. Since your project is interested only in this data file for the purposes of testing, you could use the maven-dependency-plugin to unpack companycore.jar, extracting only that file into your test-resources directory.
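A sketch of that unpack configuration; the coordinates, version and output path are hypothetical:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>unpack-test-data</id>
      <phase>generate-test-resources</phase>
      <goals>
        <goal>unpack</goal>
      </goals>
      <configuration>
        <artifactItems>
          <artifactItem>
            <!-- hypothetical coordinates for companycore.jar -->
            <groupId>com.example</groupId>
            <artifactId>companycore</artifactId>
            <version>1.0</version>
            <!-- extract only the data file the tests need -->
            <includes>**/companydata.dat</includes>
            <outputDirectory>${project.build.directory}/test-resources</outputDirectory>
          </artifactItem>
        </artifactItems>
      </configuration>
    </execution>
  </executions>
</plugin>
```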
12. process-test-resources - This phase is used to copy and process the resources into the test destination directory.
13. test-compile - The compiler plugin compiles the test source code into the test destination directory.
14. process-test-classes - Conceptually the same as process-classes, to do any bytecode enhancement.
15. test - The Surefire plugin runs tests using a suitable unit-testing framework. You may need to configure the plugin to skip tests or exclude particular test classes that are causing the build to fail; this may be useful during an active development phase.
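A sketch of excluding one troublesome test class via the Surefire plugin (the class name is hypothetical):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <excludes>
      <!-- keep the build green while this test is being reworked -->
      <exclude>**/FlakyDatabaseTest.java</exclude>
    </excludes>
  </configuration>
</plugin>
```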
16. package - Take the compiled code and package it in its distributable format, such as a JAR. You can request Maven to package source files and test source files into respective jars via maven-jar-plugin and maven-source-plugin in addition to the distributable class file jar that it creates.
17. verify - This phase can be used to run any checks to verify that the package is valid and meets quality criteria. It is a great place to enforce source-code formatting styles.
So the above list gives a fair idea of how you can leverage Maven plugins to customize your build lifecycle. These are just brief samples; any ideas beyond the above usage of plugins in the various build phases are more than welcome.

Saturday, July 30, 2011

Maven's assumptions

Maven both makes assumptions about, and enforces, a lot of standards and good practices for project programming and management at an enterprise level. It also provides features, such as resource filtering, that are not enabled by default but suggest good programming practice. I wish to list them as they come to mind, from the very obvious to the subtle ones below:

  1. Projects would have a standard directory layout relative to project's home directory to place source code, target binaries, test source code, test binaries, resources etc as described in the Introduction to Standard Directory Layout.
  2. Projects would adopt unit-testing as a standard practice by including the test-compile and test phases as part of the build lifecycle. Maven even goes a step further by failing a build if the test source does not compile or the unit tests fail in its default build lifecycle.
  3. Projects provide a clean directory/folder separation between application source code and the resources required (such as Spring applicationContext files, Hibernate configuration files, log4j properties file etc) and not put everything together in project home folder.
  4. Projects need not hardcode application properties inside the configuration resources or source code and instead make use of resource filtering - i.e. a property like jdbc.driverName could be externalized in the /src/main/filters and its value will be substituted wherever it is referenced. Thus any changes to it are localized to one file under /src/main/filters although it can be referenced in several places in configuration and code.
  5. Modern-day applications (represented by an aggregate project model / POM in Maven) will be divided into separate modules - app-domain-model, app-DAO, app-business-logic, app-web-modules, app-webservices-modules, app-utils etc - instead of old-school monolithic applications. Maven multi-module projects and the Reactor plugin help achieve this.
  6. Projects would need to be built for various platforms (Windows, Unix, Linux etc.) and deployment environments (dev, staging, test, production), and builds do differ based on the platform and environment needed, e.g. a different database server per environment and thereby a different jdbc.url, jdbc.username and jdbc.password for each environment. Maven profiles are a great feature that, if leveraged well, can make the build process seamless across environments.
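As a sketch of the resource-filtering setup from point 4, assuming a filter file at src/main/filters/dev.properties that defines jdbc.driverName (the file name is illustrative):

```xml
<build>
  <filters>
    <!-- properties defined here replace ${...} tokens in resources -->
    <filter>src/main/filters/dev.properties</filter>
  </filters>
  <resources>
    <resource>
      <directory>src/main/resources</directory>
      <!-- filtering is off by default; it must be switched on -->
      <filtering>true</filtering>
    </resource>
  </resources>
</build>
```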

Maven references

Recently I have been learning Maven, as it is widely deployed and used at an organizational level in a very clever and effective manner. However, the Maven documentation, although quite extensive for an open-source project, does not flow logically enough to educate a newbie. After scouring through most of the available online information and trying to connect the pieces together, I came to the conclusion that the following would be the ideal order in which to comprehend Maven.

1. Online reference books made available by Sonatype.

2. All about Maven settings.

3. The core of Maven - the Project Object Model and all its intricacies: multi-module builds, profiles, reactors etc.

Tuesday, July 12, 2011

CVS and (no) Atomic Commits

Every project that I have worked on so far has used a different version control system. My history of using version control systems has been thus: Rational ClearCase in my very first job/project, then Visual SourceSafe, then Subversion for a very brief period of time (about 3 months), then Perforce for the longest time yet (a little over 4 years), and now I am currently using CVS.

Using Perforce was a pleasant experience, and its atomic commits combined with its changelist/changeset feature are something that I miss while using CVS. In CVS, each file committed to the repository has its own independent version number and history, which certainly is a limitation. I do not remember much about Rational ClearCase, except that its config specs allowed much more efficient branching and merging, in a manner that could emulate atomic commits. So this blog entry is about atomic commits and changesets, and how they reduce the frequency of build breaks, which seem to happen every once in a while under a high-traffic team workload.

A typical reason why a build breaks is that a developer fails to commit all of the files that go into a task or a bug fix. As a consequence, one or more of the checked-in files try to access constructs (classes, methods, constants etc.) newly introduced in a file that was never checked in. And that's the build breaking right in your face!

In my opinion, the kind of SCM being used as the source code repository (read CVS, Visual SourceSafe) can also be a contributing factor. Ideally, when a developer commits files to the repository, the files should be grouped together as a single atomic change for the bug fix, new feature or task. Developers then even start to mentally group all the files together, thereby reducing the probability of a build break. Also, in the event of a network failure, atomic commits save the build by ensuring that either all or none of the files are submitted to the repository.

Perforce takes one more step beyond these atomic commits - if a developer modifies any of the local files that are mapped to the repository, it automatically puts them in a pool called 'changelist'. That way there is no chance that a developer could have forgotten to check-in any file.
Changelists serve two purposes:
- to organize your work into logical units by grouping related changes to files together
- to guarantee the integrity of your work by ensuring that related changes to files are checked in together.

Now different version control systems record this atomic commit differently in their history. For Subversion, a changeset/changelist is just a collection of changes with a unique name. The commit will create a new revision number which can forever be used as a "name" for the change. As per Subversion documentation:
    "In Subversion, a global revision number N names a tree in the repository: it's the way the repository looked after the Nth commit. It's also the name of an implicit changeset: if you compare tree N with tree N-1, you can derive the exact patch that was committed. For this reason, it's easy to think of “revision N” as not just a tree, but a changeset as well. If you use an issue tracker to manage bugs, you can use the revision numbers to refer to particular patches that fix bugs—for example, “this issue was fixed by revision 9238”. Somebody can then run svn log -r9238 to read about the exact changeset which fixed the bug, and run svn diff -r9237:9238 to see the patch itself."

Perforce keeps track of each file's independent revision history as well as changelist numbers. Again, you can associate changelist numbers with a bug/task-tracking database and go back and forth between the SCM and that database.

As I searched through the web, I did find some solutions and forums that discussed ways to work around this limitation; exploring them will be the next task on my agenda.

Wednesday, June 29, 2011

Java 5 Enums

Enums, or enumerated types, are types that can be defined to have a certain fixed set of values as per the problem domain. Historically, enums (via an enum keyword and any associated semantics) were missing from the feature set provided by Java versions up to 1.4.
However, developers tried to achieve the "same enum effect" via something like this:
Example 1:
public class Currency {
  public static final int USD = 1;
  public static final int EUR = 2;
  public static final int GBP = 3;
  public static final int YEN = 4;
}

public class CurrencyConverter {
  public void convertCurrency(int fromCurrency, int toCurrency) { ... }

  public static void main(String[] args) {
    CurrencyConverter cc = new CurrencyConverter();
    cc.convertCurrency(Currency.USD, Currency.YEN);
  }
}
Additional currencies could be added to the Currency class by defining new constants. However, the convertCurrency(int, int) method lacks type safety, since the method signature indicates it can accept any int, while the only acceptable range of ints is 1 through 4. If we call the method with values outside the range of agreed-upon constants, e.g. convertCurrency(8, 10), the program breaks at runtime.

The above can be avoided if we accept that Enumerations should be treated as a separate type. Implementing them as a sequence of integers is not helpful. To define enumerations as their own type, you do the following:

1. Replace the primitive ints above with 'static final' object references to the same class defining the enumerated constants.
2. Disallow any object creation of the class via a private constructor.

Example 2:
public final class Currency {
  public static final Currency USD = new Currency(1);
  public static final Currency EUR = new Currency(2);
  public static final Currency GBP = new Currency(3);
  public static final Currency YEN = new Currency(4);

  int value;

  private Currency(int value){
    this.value = value;
  }
}

public class CurrencyConverter {
   public void convertCurrency(Currency fromCurrency, Currency toCurrency) { ... }

   public static void main(String[] args) {
     CurrencyConverter cc = new CurrencyConverter();
     cc.convertCurrency(Currency.USD, Currency.YEN);
   }
}
The convertCurrency(Currency, Currency) method now takes the Currency type instead of an int.

Secondly, the acceptable values of Currency can only be defined inside the class, due to the private constructor. This, along with the fact that Currency is a final class, ensures that Currency.USD, Currency.EUR, Currency.GBP and Currency.YEN are the only instances of the Currency class.

It also means that we can use the identity comparison (==) operator instead of the equals() method when comparing enum values. Identity comparison is faster than equals(), since we are only comparing object references in the former as opposed to object values in the latter.

However the above typesafety approach comes with the following disadvantages:
1. The above implementation is not Serializable or Comparable by default, which can cause issues when using it in the context of RMI and EJBs. If we do make the class Serializable, deserialization creates a new instance of the same Currency, bypassing its private constructor completely, rather than retrieving the instance that was serialized. This means == comparison fails between a deserialized and a non-deserialized Currency, and we no longer have a unique single instance of each currency. We have to write more boiler-plate code, such as implementing the readResolve method as suggested in http://www.javaworld.com/javaworld/javatips/jw-javatip122.html?page=2.
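A minimal, self-contained sketch of that readResolve boiler-plate, using a trimmed-down Currency (only two constants, and modern try-with-resources syntax for brevity):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

public final class Currency implements Serializable {
    public static final Currency USD = new Currency(1);
    public static final Currency EUR = new Currency(2);

    private static final Currency[] VALUES = { USD, EUR };

    private final int value;

    private Currency(int value) {
        this.value = value;
    }

    // Deserialization bypasses the private constructor and would otherwise
    // manufacture a brand-new instance; readResolve swaps it back for the
    // canonical constant so that == comparisons keep working.
    private Object readResolve() throws ObjectStreamException {
        return VALUES[value - 1];
    }

    // round-trips USD through serialization to show == still holds
    public static boolean roundTripIsSame() throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(USD);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            Currency copy = (Currency) in.readObject();
            return copy == USD;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTripIsSame()); // true
    }
}
```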

2. We cannot switch over the above enum values (remember it is easier to switch over ints) to get any business logic done. If we need to switch, it can be facilitated by providing a getValue() method that returns the int value.

Example 3:
Inside the Currency class,
public class Currency {
    ....

    public int getValue() {
        return value;
    }
}

public class CurrencyConverter {
    public void convertCurrency(Currency fromCurrency, Currency toCurrency) { ... }
}
Java 5 enums are a typesafe feature and overcome all the problems listed above with the enumerated-constants pattern. In addition to providing a way to list a set of constant values, they also provide features such as:

1. All defined enums implicitly extend from java.lang.Enum just as all objects implicitly extend from java.lang.Object.

2. The above feature takes care of default implementations of the toString(), equals() and hashCode() methods.

3. They are Serializable and Comparable by default, without generating duplicate instances during deserialization.

4. They can be used in switch-case statements.

5. They can have behavior (via member variables, methods, constructors, interface implementations etc.) in addition to just specifying the constants.

The simplest example of a Java 5 enum class with no additional behavior:

Example 4:

public enum Currency { USD, GBP, EUR, YEN }

public class CurrencyConverter {
    public void printCurrencies() {
        for (Currency currency : Currency.values()) {
            System.out.println(currency);
            System.out.println(currency.ordinal());
        }
    }
}
Currency is an enum type, and all the above enum values, viz USD, GBP, EUR, YEN are of type Currency.

We can iterate through all the instances of Currency via the static values() method and take advantage of the toString() method in the print statement.

Example 5:
We can also add behavior via member fields and methods.

public enum Currency {
    USD("United States"),
    GBP("United Kingdom"),
    EUR("Europe"),
    YEN("Japan");   // note the semicolon that must end the constant list

    private final String country;

    // enum constructors cannot be public; private or package-private only
    Currency(String country) {
        this.country = country;
    }

    // valueOf() looks constants up by name, so finding the currency for a
    // country has to be done by searching the constants instead
    public static Currency getCurrencyForCountry(String country) {
        for (Currency currency : values()) {
            if (currency.country.equals(country)) {
                return currency;
            }
        }
        throw new IllegalArgumentException("No currency for " + country);
    }
}

When you need to provide custom behavior based on the enum values, there are two ways of doing it: either you switch on the enum values in the application code, or, better, you move the custom logic inside the enum class as follows:

Example 6:
public class Client {
    enum HttpStatusCode {
        HTTP200("HTTP 200") {
            @Override
            void printMessage() {
                System.out.println("Successful Transaction");
            }
        },
        HTTP401("HTTP 401 Error") {
            @Override
            void printMessage() {
                System.out.println("Authentication Failure");
            }
        },
        HTTP404("HTTP 404 Error") {
            @Override
            void printMessage() {
                System.out.println("Requested resource not found at specified location");
            }
        },
        HTTP500("HTTP 500 Error") {
            @Override
            void printMessage() {
                System.out.println("An error occurred on the server side, please have patience.");
            }
        };

        String statusString;

        HttpStatusCode(String statusString) {
            this.statusString = statusString;
        }

        abstract void printMessage();
    }

    // stub so the example compiles; a real client would derive the
    // status from an actual HTTP response
    private static HttpStatusCode connectToServer() {
        return HttpStatusCode.HTTP200;
    }

    public static void main(String[] args) {
        HttpStatusCode status = connectToServer();
        status.printMessage();
    }
}

Switching via case statements can be used when you have no option of modifying the enum class code. This can happen in cases where the enum class is generated, e.g. by JAXB from an XSD. More on this in a later blog.

So this covers the basics of Java 5 Enums. Java 5 also provides 2 data structures - java.util.EnumSet and java.util.EnumMap. More on this again will be in yet another blog.


Monday, January 5, 2009

Tomcat 6 and class loading

In continuation of the previous blog entry, I would also like to write about Tomcat 6 in particular and the class loading pattern it adopts. Java allows the creation of custom class loaders by extending java.lang.ClassLoader. Tomcat 6 creates the following class loaders on startup. They share a parent-child relationship too, but NOTE that the delegation pattern is a bit different, as explained below.

Although invisible in a default installation of Tomcat 6, additional shared and server class loaders are also available; they fall below the common class loader in the hierarchy. Each class loader is responsible for loading classes from certain specific areas, noted below:

1. Bootstrap + Extension class loader - It loads the Java run-time classes in the JDK as well as any classes from the jars in the Extensions folder.  

2. System class loader - As noted in the previous blog, the system class loader is responsible for loading the classes and jars present on the CLASSPATH. But an important NOTE here: Tomcat clears the user-set CLASSPATH entry in its startup.bat or startup.sh file. Instead, it sets the CLASSPATH to the following: $CATALINA_HOME/bin/bootstrap.jar and $CATALINA_HOME/bin/tomcat-juli.jar.

3. Common class loader - This is a class loader provided by Tomcat 6. It loads the classes present in the $CATALINA_HOME/lib folder. These classes are available to Tomcat as well as to all the web applications hosted on this instance of Tomcat. Although developers can reference the APIs from the jars inside the $CATALINA_HOME/lib directory, they shouldn't place their own custom classes and/or jars there. If certain custom classes and/or jars need to be shared by all web applications, they should be placed where the shared class loader can see them. Note that as of Tomcat 6.0.14, the $CATALINA_HOME/shared/lib directory does not exist by default, so this can be done in Tomcat 6 as follows:
  • Create your own $CATALINA_HOME/shared/lib directory.
  • Modify $CATALINA_HOME/conf/catalina.properties by changing the line: shared.loader = ${catalina.home}/shared/lib
However, the above does not apply to certain third-party libraries, such as database drivers, where Tomcat itself needs to set up data sources. Such jars have to be placed in the $CATALINA_HOME/lib folder for the common class loader to see. One can also add more jars for the common class loader without placing them under $CATALINA_HOME/lib, by modifying the common.loader property in $CATALINA_HOME/conf/catalina.properties in the same way.

4. WebappX class loaders - Tomcat creates a class loader for every webapp deployed in its instance. This class loader loads classes under the WEB-INF/classes and WEB-INF/lib folders. It is for these class loaders that the delegation model deviates, thanks to the Servlet Specification, which states: "It is recommended also that the [web] application class loader be implemented so that classes and resources packaged within the WAR are loaded in preference to classes and resources residing in container-wide library JARs."
However, the above specification cannot override the Java standard delegation model for the bootstrap and system class loaders. It only overrides the parent-child relationships introduced by Tomcat itself, i.e. the common, shared and WebappX class loaders. So when an application requests a class, the lookup order is as follows:
  1. The bootstrap class loader looks in the core Java classes.
  2. The system class loader looks in $CATALINA_HOME/bin/bootstrap.jar and $CATALINA_HOME/bin/tomcat-juli.jar.
  3. The WebappX class loader looks in WEB-INF/classes and then WEB-INF/lib.
  4. The common class loader looks in the $CATALINA_HOME/lib folder.
  5. The shared class loader looks in $CATALINA_HOME/shared/classes and $CATALINA_HOME/shared/lib, if the shared.loader property is set in conf/catalina.properties.
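The parent chain described above can be inspected from any class. A minimal, general-purpose sketch; run outside Tomcat it only prints the JDK-provided loaders, while inside a Tomcat 6 webapp it would start at the WebappX loader:

```java
public class ClassLoaderDemo {
    // prints and counts the loaders from a class's own loader up to
    // (but excluding) the bootstrap loader, which Java represents as null
    static int chainDepth(Class<?> clazz) {
        int depth = 0;
        for (ClassLoader cl = clazz.getClassLoader(); cl != null; cl = cl.getParent()) {
            System.out.println(cl);
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        chainDepth(ClassLoaderDemo.class);
        System.out.println("<bootstrap class loader, represented as null>");
    }
}
```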