Monday, August 27, 2012

Ant Profiling

In my new job at Assurity Consulting I've been looking at some large builds, some of which take 6 hours or more to run on the Continuous Integration server. There's a fairly random mix of Ant, Maven and good old shell. Over the last few days I spent some time diving into the details of a fairly large Ant build.

I've been using a combination of three techniques to analyze where the time is being spent in this build:

  1. Analysis by inspection.
  2. Analysis of "ant -d" output.
  3. Ant Profiler.

There are a number of files and many tasks and targets in this build, and I haven't been able to walk through the entire thing yet. But by browsing through "interesting" parts, I have been able to find out some significant things about the build.

For example, the build makes extensive use of <antcall>. Sometimes people use <antcall> when they don't understand the correct usage of "depends" on Ant targets. In this build, many of the <antcall> usages would be more appropriately done with <macrodef>. <antcall> is inefficient for targets within the same build file, because each call reparses the build file and copies properties into a new project. Please learn to use <macrodef> with Ant. It is your friend.

To extract some data about the Ant build as a whole, I ran it with "ant -d". On this particular build, it generates 600,000 lines of output. Again, a lot of detail.

With some shell piping we can see potential areas of duplication, or work being repeated:

ant -d >ant.debug.out 
sort ant.debug.out | uniq -c | sort -nr | head

For example, we can see whether certain Java source files are compiled more than once during the build:

egrep "^ *\\[javac\\]" ant.debug.out | sort | uniq -c | sort -nr | less

In this case I found that some classes are compiled two, three, four or even five times. Unfortunately, not enough classes to make a large difference to the build time, but still ...
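
To cut that listing down to just the repeats, the pipeline can be wrapped in a small function that drops every line seen only once. This is a minimal sketch, not part of the original analysis: the function name is made up, and it assumes the compiled files appear in the debug log as "[javac]" lines ending in ".java", which is worth checking against your own log.

```shell
# Hypothetical helper: list [javac] source-file lines that occur more than once.
# Assumption: compiled files show up as "[javac] /path/to/File.java" lines.
# Usage: repeated_javac_lines ant.debug.out
repeated_javac_lines() {
  grep -E '^ *\[javac\] .*\.java$' "$1" \
    | sort \
    | uniq -c \
    | awk '$1 > 1' \
    | sort -nr
}
```

Each output line is a count followed by the repeated log line, with the most frequently compiled files first.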

Finally, I found a great Ant profiler called antro. This profiler is really neat. It is completely non-invasive. It uses the Ant listeners feature, and can be run like this to analyze a build:

ant -listener ru.jkff.antro.ProfileListener -lib ~/antro/antro.jar -f build.xml

This generates a JSON file, which can then be loaded into the profiler's GUI. The GUI is run simply with:

java -jar antro.jar

What could be simpler?

It provides a tree view of the build times, where you can drill down into any node and explore the detail. Using this tool it's easy to get a bird's eye view of the build time breakdown, as well as drill into areas of interest. For example, I was able to assess the overhead of <antcall> by looking at some of those nodes. Unfortunately, there seems to be no way to see the total count or total time for a single task type such as <antcall> across the whole build. I was able to get the total number of <antcall>s (though not their times) from the "ant -d" output. Multiplying that count by the typical per-call overhead seen in the profiler allowed me to estimate the total time overhead of <antcall> for this build.
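
That estimate is just a count multiplied by a per-call cost, so it is easy to script. The following is a rough sketch under two assumptions: that each <antcall> appears in the debug log as a line starting with "[antcall] calling target" (the exact message may differ between Ant versions, so check your own log), and that you supply an average per-call overhead in milliseconds, read off the profiler by hand. The function name is made up for illustration.

```shell
# Hypothetical helper: estimate total <antcall> overhead for a build.
# $1: file containing "ant -d" output
# $2: assumed average overhead per <antcall>, in milliseconds
# Assumption: each call logs a line like "[antcall] calling target(s) ...".
# Usage: estimate_antcall_overhead ant.debug.out 200
estimate_antcall_overhead() {
  calls=$(grep -c -E '^ *\[antcall\] calling target' "$1")
  echo "antcalls=$calls estimated_overhead_ms=$((calls * $2))"
}
```

The result is only as good as the per-call figure you feed in, so treat it as an order-of-magnitude guide rather than a measurement.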

Tuesday, August 21, 2012

Git and SVN on OSX 10.8 - is this the best way?

Don't get me started on Apple. I bought my first Mac a few weeks ago, and I'm still trying to figure out why. They are pretty, but they are also pretty frustrating. Other than the hardware compatibility problems with Linux, which may actually have driven me crazy, Ubuntu is a much better-designed system.

For one thing, Apple seems determined to gradually eliminate standard UNIX software from this platform. Which kind of defeats the whole reason I bought the thing -- I wanted a reliable UNIX-based system. I realize that Apple could not give a toss about UNIX geeks, having a very lucrative mass market of schmucks to serve, but still that doesn't console me much. One day I will post a long list of frustrations I have found since starting with OSX.

Today I just want to post how I got something working, in case I need it later. In Ubuntu, if I want to use Git and SVN, I have to type something like this:
sudo apt-get install subversion
sudo apt-get install git-core
Something like that. Also maybe
sudo apt-get install git-svn
I don't remember. Actually, I don't have to remember, or type it, because bash on Ubuntu has completions. There's nothing to it.

In OSX, there is no standard package management system. (Let alone bash completions.) There are a variety of independent efforts (MacPorts, Homebrew, Fink), and they all hate each other. You pick one of those and hope for the best.

Except Apple keeps trying to break them.

The worst thing I have done with OSX so far is upgrading to 10.8. This was a stupid mistake, because 10.8 does not contain any useful features, but instead breaks things that I had come to depend on. I did it because I thought it was supposed to fix fullscreen mode with multiple monitors, which was already broken. It is still broken.

Prior to 10.8 I was able to get Git and SVN working using Homebrew. After 10.8 I get compilation errors.

So instead I installed Git using the brain-dead Windows method of click-till-you-die installation from a binary DMG file from git-scm.com. And I installed SVN similarly by clicking and clicking and clicking with a binary PKG file from WANDisco. This was fine for independent Git and SVN operation. But git-svn did not work. I got

Can't locate SVN/Core.pm in @INC

because the Git Perl code is in one place, and the SVN Perl code is somewhere completely different.

Only the wise and benevolent creator can tell why these things need to talk to each other using Perl. It probably comes from Git being implemented using every UNIX hack tool under the sun.

I got them to work, apparently, by doing this:
sudo ln -s /opt/subversion/lib/svn-perl/auto /usr/local/git/lib/perl5/site_perl/auto
sudo ln -s /opt/subversion/lib/svn-perl/SVN /usr/local/git/lib/perl5/site_perl/SVN
What a hack! And it only works because there was no auto/ directory there already! This is gonna break, I know it! Hopefully the Homebrew version will be working again by then.
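
Since the trick depends on nothing already being at the destination, it is worth making that check explicit before linking. Here is a minimal sketch, with a made-up function name, that refuses to create the link if the destination already exists:

```shell
# Hypothetical helper: create a symlink only if the destination is free.
# $1: existing source path, $2: link path to create
link_if_absent() {
  src=$1
  dst=$2
  # -e misses dangling symlinks, so test -L as well.
  if [ -e "$dst" ] || [ -L "$dst" ]; then
    echo "skipping $dst: already exists" >&2
    return 1
  fi
  ln -s "$src" "$dst"
}
```

Because sudo cannot call a shell function directly, this would have to live in a script that is itself run with sudo, calling link_if_absent once for the auto directory and once for the SVN directory.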

Give me Ubuntu for this sort of thing any day. Shit, even give me an RPM-based system, I'll be happy with that.

Tuesday, May 29, 2012

Groovy AST transform gotcha

I'm a big fan of Groovy's AST transforms. Lately I've been using @Lazy in a lot of my code, because I love a declarative or functional style of programming, and @Lazy lets me do that really efficiently and concisely with Groovy. I hope that my colleagues agree!

We got caught by an interesting trap using a couple of other transforms recently: @EqualsAndHashCode and @Immutable. We like to use @EqualsAndHashCode to generate equals() and hashCode() for entity classes in our domain model. We use the 'includes' attribute to include only the primary key properties in the equals() comparison, like this:
@EqualsAndHashCode(includes = "id")
class Foo {
  String id
  String description
}
For this class, and its corresponding database table, the 'id' property is the key. Equality is determined not by Java object identity, or by full value equality, but by comparing the entities' primary keys:
assert new Foo(id: "1", description: "cat") == 
       new Foo(id: "1", description: "dog")
This has important effects when storing these kinds of objects in collections.

Be careful mixing AST transform annotations though! We added @Immutable to some of our domain classes, like this:
@Immutable
@EqualsAndHashCode(includes = "id")
class Foo {
  String id
  String description
}
and got a surprising result:
assert new Foo(id: "1", description: "cat") != 
       new Foo(id: "1", description: "dog")
The reason is the order of the annotations. If we reverse them, it works correctly:
@EqualsAndHashCode(includes = "id")
@Immutable
class Foo {
  String id
  String description
}

assert new Foo(id: "1", description: "cat") == 
       new Foo(id: "1", description: "dog")
So what happens? The doc for @Immutable says:
"The @Immutable annotation instructs the compiler to execute an AST transformation which adds the necessary getters, constructors, equals, hashCode and other helper methods that are typically written when creating immutable classes with the defined properties."
Later on, in more detail:
"Default equals, hashCode and toString methods are provided based on the property values. Though not normally required, you may write your own implementations of these methods. For equals and hashCode, if you do write your own method, it is up to you to obey the general contract for equals methods and supply a corresponding matching hashCode method. [...]"
So, I'm guessing that if we put @Immutable first, then @EqualsAndHashCode hasn't yet had a chance to do its magic, and @Immutable adds its default equals() etc, not the ones we want. But if we put @EqualsAndHashCode first, then its equals() etc methods are there for @Immutable to see, and we get the behavior we want.

Thus, problem solved, for now. But it does make one wonder a little about the interactions of all of these transforms and other annotations. We've been using Groovy AST transform annotations together with JPA annotations with no known problems to date, and I hope it continues that way.