Monday, May 28, 2012

Groovy AST transform gotcha

I'm a big fan of Groovy's AST transforms. Lately I've been using @Lazy in a lot of my code, because I love a declarative or functional style of programming, and @Lazy lets me do that really efficiently and concisely with Groovy. I hope that my colleagues agree!

We got caught by an interesting trap using a couple of other transforms recently: @EqualsAndHashCode and @Immutable. We like to use @EqualsAndHashCode to generate equals() and hashCode() for entity classes in our domain model. We use the 'includes' attribute to include only the primary key properties in the equals() comparison, like this:
@EqualsAndHashCode(includes = "id")
class Foo {
  String id
  String description
}
For this class, and its corresponding database table, the 'id' property is the key. Equality is not done by Java object equality, or by full value equality, but by comparing the entities' primary keys:
assert new Foo(id: "1", description: "cat") == 
       new Foo(id: "1", description: "dog")
This has important effects when storing these kinds of objects in collections.
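To make that concrete, here is a minimal sketch of how a Set and a Map treat two Foo objects that share an id. The variable names (cat, dog, set, map) are only illustrative:

```groovy
import groovy.transform.EqualsAndHashCode

@EqualsAndHashCode(includes = "id")
class Foo {
  String id
  String description
}

def cat = new Foo(id: "1", description: "cat")
def dog = new Foo(id: "1", description: "dog")

// A Set sees the two objects as the same element, so the second add is ignored
def set = [] as Set
set << cat
set << dog
assert set.size() == 1
assert set*.description == ["cat"]

// A Map keyed by Foo has a single entry; the second put replaces the value
def map = [:]
map[cat] = "first"
map[dog] = "second"
assert map.size() == 1
assert map[cat] == "second"
```

In other words, a collection that relies on equals() and hashCode() silently keeps only one entity per primary key, whatever the other properties say.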

Be careful mixing AST transform annotations though! We added @Immutable to some of our domain classes, like this:
@Immutable
@EqualsAndHashCode(includes = "id")
class Foo {
  String id
  String description
}
and got a surprising result:
assert new Foo(id: "1", description: "cat") != 
       new Foo(id: "1", description: "dog")
The reason is the order of the annotations. If we reverse them, it works correctly:
@EqualsAndHashCode(includes = "id")
@Immutable
class Foo {
  String id
  String description
}

assert new Foo(id: "1", description: "cat") == 
       new Foo(id: "1", description: "dog")
So what happens? The doc for @Immutable says:
The @Immutable annotation instructs the compiler to execute an AST transformation which adds the necessary getters, constructors, equals, hashCode and other helper methods that are typically written when creating immutable classes with the defined properties.
Later on, in more detail:
Default equals, hashCode and toString methods are provided based on the property values. Though not normally required, you may write your own implementations of these methods. For equals and hashCode, if you do write your own method, it is up to you to obey the general contract for equals methods and supply a corresponding matching hashCode method. [...]
So, I'm guessing that if we put @Immutable first, then @EqualsAndHashCode hasn't yet had a chance to do its magic, and @Immutable adds its default equals() etc, not the ones we want. But if we put @EqualsAndHashCode first, then its equals() etc methods are there for @Immutable to see, and we get the behavior we want.
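If the quoted documentation is taken at its word, another way to avoid depending on annotation order is to write equals() and hashCode() by hand in the @Immutable class. The following is only a sketch under that assumption, not something the documentation spells out for this exact case:

```groovy
import groovy.transform.Immutable

@Immutable
class Foo {
  String id
  String description

  // Hand-written key-based equality; assumes @Immutable leaves existing methods alone
  boolean equals(Object other) {
    other instanceof Foo && id == other.id
  }

  // Must stay consistent with equals(): only the key takes part
  int hashCode() {
    Objects.hashCode(id)
  }
}

assert new Foo(id: "1", description: "cat") ==
       new Foo(id: "1", description: "dog")
```

This is more verbose than the annotation, but the intent no longer depends on which transform runs first.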

Thus, problem solved, for now. But it does make one wonder a little about the interactions of all of these transforms and other annotations. We've been using Groovy AST transform annotations together with JPA annotations with no known problems to date, and I hope it continues that way.

Tuesday, December 27, 2011

Checking Foreign Key attribute consistency

While evolving a schema during development, it's sometimes hard to make sure that column attributes remain consistent across different tables.

Some databases provide domain types. So for example, with Firebird/InterBase you can define a domain type or alias for your invoice_number column as, say, VARCHAR(10). Then you can use the domain type in defining any tables containing invoice_number, and the column attributes will always be consistent.

Oracle doesn't have a natural way to do this. One trick I've used is preprocessing SQL DDL files with Ant and using "macros" for column types. So I might have a types.properties file like this:

...
INVOICE_NUMBER_TYPE = VARCHAR2(10)
...


Then in create_table_invoice.sql I might have:

CREATE TABLE invoice (
invoice_number @INVOICE_NUMBER_TYPE@ NOT NULL,
...
);


And also use the macro elsewhere for any foreign key column.

But, this turned out to be a bit of a pain. For one thing, it makes the build more convoluted because of the preprocessing required. More importantly, it made the SQL DDL files "invalid". We couldn't just run one in sqlplus, without the preprocessing step first. We couldn't send one to a DBA. We didn't get good IDE support, because the IDE doesn't understand the type macros.

As a result, I've reverted to putting hardcoded column attributes in the DDL files. But this takes me back to the problem of keeping the foreign keys parent/child attributes consistent.

Another approach to that problem is to use a view like this:

CREATE OR REPLACE VIEW chk_foreign_key_type AS
SELECT
  ac.table_name     child_table,
  acc.column_name   child_column,
  atc.data_type     child_data_type,
  atc.data_length   child_data_length,
  atc.data_scale    child_data_scale,
  ac2.table_name    parent_table,
  acc2.column_name  parent_column,
  atc2.data_type    parent_data_type,
  atc2.data_length  parent_data_length,
  atc2.data_scale   parent_data_scale
FROM all_constraints ac
JOIN all_cons_columns acc ON acc.owner = ac.owner
  AND acc.constraint_name = ac.constraint_name
JOIN all_tab_columns atc ON atc.owner = ac.owner
  AND atc.table_name = acc.table_name
  AND atc.column_name = acc.column_name
JOIN all_constraints ac2 ON ac2.owner = ac.owner
  AND ac2.constraint_name = ac.r_constraint_name
JOIN all_cons_columns acc2 ON acc2.owner = ac2.owner
  AND acc2.constraint_name = ac2.constraint_name
  AND acc2.position = acc.position
JOIN all_tab_columns atc2 ON atc2.owner = acc2.owner
  AND atc2.table_name = acc2.table_name
  AND atc2.column_name = acc2.column_name
-- schema names are stored in upper case unless they were created quoted
WHERE ac.owner = 'YOUR_SCHEMA'
  AND ac.constraint_type = 'R'
  AND (atc2.data_type <> atc.data_type
    OR atc2.data_length <> atc.data_length
    OR NVL(atc2.data_scale, -1) <> NVL(atc.data_scale, -1))
ORDER BY 1, 2
/

COMMENT ON TABLE chk_foreign_key_type IS
'Foreign key column(s) different from parent type/length/scale'
/


This view will return any foreign key where a data type, length or scale of a column in the child table does not match the corresponding column in the parent table.

Tuesday, February 8, 2011

Ant Target Dependency Graph

We can get a pretty good diagram of our Ant target dependencies very easily using an embedded Groovy script and GraphViz.

Add this to build.xml:



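<target name="target-graph">
  <taskdef name="groovy" classname="org.codehaus.groovy.ant.Groovy"/>
  <groovy>
    def u(x) {x.toString().replace("-", "_").replace(".", "_")}
    new File("build.dot").text = """
    digraph ant {
    ${project.targets.values().collect {target ->
      target.dependencies.collect {dep ->
        u(dep) + " -> " + u(target)
      }.join("\n")
    }.join("\n")}
    }
    """
  </groovy>
</target>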


You will need to have the embeddable groovy-all.jar in Ant's classpath, e.g. in ~/.ant/lib/.

Then run "ant target-graph". It writes a build.dot file in the current directory.
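To give an idea of what ends up in build.dot, here is a standalone sketch of the same edge-building logic. The targets map is hypothetical and stands in for Ant's project.targets; nothing here touches Ant itself:

```groovy
// Hypothetical targets and their dependencies (stand-in for project.targets)
def targets = [
  "compile"  : ["init"],
  "unit-test": ["compile"],
  "dist.jar" : ["compile", "unit-test"]
]

// Same idea as u() in the build script: replace "-" and "." so names are plain DOT identifiers
def u = { x -> x.toString().replace("-", "_").replace(".", "_") }

// One "dependency -> target" edge per declared dependency
def edges = targets.collectMany { target, deps ->
  deps.collect { dep -> u(dep) + " -> " + u(target) }
}

def dot = "digraph ant {\n" + edges.join("\n") + "\n}\n"
print dot
```

Each edge points from a dependency to the target that needs it, so the rendered graph reads in execution order.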

Convert this into a picture using the GraphViz dot command:
dot -Tsvg -O build.dot

The results are surprisingly good.

Here's an example from the Apache commons-dbcp project:



I'm not going to show you the diagram I got for our system at work, the reason I wrote this script! The diagram is so big and complex, I was shocked. After being happy with Ant all these years, I think it's time to be getting serious about Gradle.

Saturday, January 22, 2011

SlickEdit 2011 Wish List

Well, we're nearly through January 2011, and the SlickEdit 2011 beta is due any day now.

I've been using SlickEdit since 1996, and I still get kind of excited around this time of year when a new version comes out. Sometimes the team surprises me.

Here is my wish list for features in SlickEdit 2011:


The first three are hot JVM languages that I'd love to see support for. I don't really expect to see them, but you never know. Last year, SlickEdit added support for Erlang, Haskell and F#, so they aren't completely in the dark about hot languages.

The next three items are popular distributed version control systems. Again, I don't really expect much from SlickEdit on that yet. Forum posts on the topic have met with disappointing responses from SlickEdit staff -- it doesn't look like DVCS has "clicked" with them yet. It took them an awfully long time to move on from CVS to Subversion themselves, and even now the Subversion support is miserable compared to any other tool I've used. Hence my wish: modernise the Subversion support.

SlickEdit has a lot of advanced features for C/C++ programmers. C/C++ programmers probably make up a very large chunk, if not the majority, of SlickEdit users. And as far as I can tell, SlickEdit is actually one of the best "IDE"s available for C/C++. I don't do C/C++ any more myself though; I mostly work in JVM-based languages. And with Java, SlickEdit also tries to be an uber-IDE, with Project Types, JUnit, Ant support and more. But here it falls far, far short of industry standards. Java programmers are really spoiled by the superb IDEs available for them, and two of the best ones are even free. Anyway, I don't wish for SlickEdit to improve its Java IDE features. I'm happy to use a Java IDE for that kind of work. The point I'd like to make is that supporting current version control systems, and supporting them really well, would benefit all SlickEdit users. I can't imagine many SlickEdit users are not using a VCS, and many of them probably use a modern one, such as Git. It's really about time SlickEdit caught up with the VCS game.

For a couple of examples of excellent VCS integration, look at:


SlickEdit 2010 included some rather dubious new features. My favorite "non useful feature" was Subword Navigation. You can move the cursor through camel-cased words such as AbstractBeanFactory. But when would anyone want to do that? Far more useful would be file or class completion/loading using smart "camel typing", as introduced by IntelliJ IDEA and copied by other tools. With IDEA, I can press Ctrl+N to open a class, then type "ABF" or "AbBeFa" to open the AbstractBeanFactory class. This is really useful, and would be something SlickEdit could really benefit from.
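For the curious, the matching behind "camel typing" can be approximated in a few lines. This is a deliberately simplified sketch with a hypothetical helper name (camelMatches), not how IDEA actually implements it: each typed chunk must be a prefix of the corresponding hump, in order, starting from the first hump.

```groovy
// Simplified camel-hump matcher: "AbBeFa" -> chunks ["Ab", "Be", "Fa"]
boolean camelMatches(String typed, String className) {
  def chunks = typed.findAll(/[A-Z][a-z0-9]*/)
  if (!chunks) {
    return false
  }
  // Each chunk may be followed by the rest of its hump; trailing humps are allowed
  def regex = chunks.collect { it + "[a-z0-9]*" }.join("") + ".*"
  className ==~ regex
}

assert camelMatches("ABF", "AbstractBeanFactory")
assert camelMatches("AbBeFa", "AbstractBeanFactory")
assert !camelMatches("ABX", "AbstractBeanFactory")
```

A real implementation would also rank candidates and allow skipped humps, which this sketch does not attempt.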

Anyway, I'm sure SlickEdit 2011 will contain a few pleasant surprises, as well as a few new annoying bugs. As always, it will be interesting to see whether the feature-to-bug ratio improves or not. I'm looking forward to the beta.

Thursday, January 6, 2011

Groovy DSL/Builders: ZIP Output Streams

Let's follow up last week's post with another example of a very similar, very simple builder.

This one is for outputting ZIPped data to a stream. Let's take the standard example of using Java's ZIP support to zip up a folder of files.

Because the JDK does not include methods to traverse the filesystem, we need to define a method to be called recursively for subdirectories:


private void zipDirectory(File dir, ZipOutputStream zos) throws IOException {
    for (File file : dir.listFiles()) {
        if (file.isDirectory()) {
            // Recurse into subdirectories
            zipDirectory(file, zos);
        }
        else {
            ZipEntry entry = new ZipEntry(file.getPath());
            entry.setSize(file.length());
            entry.setTime(file.lastModified());
            zos.putNextEntry(entry);
            // Close each input file once its bytes have been copied
            try (InputStream in = new FileInputStream(file)) {
                IOUtils.copy(in, zos);
            }
        }
    }
}


We cheated a little here by using the Apache Commons IO IOUtils class to actually copy the file bytes to the ZIP file. Also, we don't do anything here with IOExceptions.

With this method in place, we can create a ZIP file from a folder using:


ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(zipFile));
zipDirectory(new File(dir), zos);
zos.close();


Groovy's JDK IO extensions, and filesystem traversal methods, make this job quite a bit easier. Here's the Groovy code to do the same thing:


new ZipOutputStream(new FileOutputStream(zipFile)).withStream {zos ->
  new File(dir).traverse(type: FileType.FILES) {File file ->
    def entry = new ZipEntry(file.path)
    entry.size = file.length()
    entry.time = file.lastModified()
    zos.putNextEntry(entry)
    zos << file.bytes
  }
}


This code is still a bit awkward in how it interacts with Java's ZIP API, in particular the creation of the ZipEntry object.

Using a simple builder, we can rewrite this as follows:


new ZipBuilder(new FileOutputStream(zipFile)).zip {
  new File(dir).traverse(type: FileType.FILES) {File file ->
    entry(file.path, size: file.length(), time: file.lastModified()) {it << file.bytes}
  }
}

The ZipBuilder provides two methods:

  • zip(): creates and manages the ZipOutputStream

  • entry() (nested): creates and adds a ZipEntry to the enclosing zip stream


As with other builders, this builder promotes readable code that reflects the structure of the object to be created.

Here's the code for the builder itself:


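import groovy.transform.InheritConstructors
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream

class ZipBuilder {

  // Decorator with a no-op close(), so writing an entry cannot close the ZIP stream
  @InheritConstructors
  static class NonClosingOutputStream extends FilterOutputStream {
    void close() {
      // do nothing
    }
  }

  ZipOutputStream zos

  ZipBuilder(OutputStream os) {
    zos = new ZipOutputStream(os)
  }

  // Runs the closure against this builder, then closes the ZIP stream
  void zip(Closure closure) {
    closure.delegate = this
    closure.call()
    zos.close()
  }

  // Named arguments arrive in props and are set as ZipEntry properties
  void entry(Map props, String name, Closure closure) {
    def entry = new ZipEntry(name)
    props.each {k, v -> entry[k] = v}
    zos.putNextEntry(entry)
    NonClosingOutputStream ncos = new NonClosingOutputStream(zos)
    closure.call(ncos)
  }

  void entry(String name, Closure closure) {
    entry([:], name, closure)
  }
}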


This builder uses the same style with Closures as the HSSFWorkbookBuilder described earlier.

There are a few other Groovy (and Java) features to note:

  • Java's ZIP library requires clients to write to the ZipOutputStream for each entry created. We need to make sure that no entry closes the ZipOutputStream -- it must be closed only when the zip stream is finished. (Many of Groovy's output methods close streams automatically.) For this reason, we wrap the output stream in a NonClosingOutputStream before passing it to an entry. This class is simply defined as a FilterOutputStream (OutputStream decorator) with a no-op close() method.

  • We use Groovy's @InheritConstructors to save repeating the trivial constructor.

  • The entry() method creates a new ZipEntry with its mandatory name property. It then populates additional optional properties from a Map, using Groovy's support for setting Java Beans properties as Map keys. These properties are intended to be provided as named arguments to the method, as shown in the example earlier. This makes for a very concise and intuitive way to set the properties.

  • The main overload of entry() is declared to take its arguments in this order: Map props, String name, Closure closure. When called, entry() is (typically) given arguments in a different order: String name, Map props, Closure closure. This is due to Groovy's convention for passing named arguments to a method, described in the "Named Arguments" section of the Groovy documentation.
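The named-argument convention in the last point is easy to see in isolation. In this small sketch the Recorder class and its return value are hypothetical, made up purely to show where the arguments land:

```groovy
class Recorder {
  // Declared order: the Map of named arguments first, then the positional parameters
  Map entry(Map props, String name, Closure body) {
    [name: name, props: props, body: body.call()]
  }
}

// Call order: positional name first, named arguments after it, closure last
def result = new Recorder().entry("a.txt", size: 3, time: 0L) { "content" }

assert result.name == "a.txt"
assert result.props == [size: 3, time: 0L]
assert result.body == "content"
```

Groovy gathers size: and time: into a single Map and passes it as the first argument, wherever the named arguments appear in the call.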


One final note about this builder -- it doesn't just work with files. Because the constructor takes an OutputStream, it can write to any stream. So it could be used to write directly to a servlet response, for example. Similarly, the entries are populated as streams, so they can be filled by anything that can write to a stream.
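As a quick illustration of the "any stream" point, the sketch below uses the underlying java.util.zip classes directly (not the ZipBuilder itself) to write a ZIP into memory and read it back. A ByteArrayOutputStream stands in here for something like a servlet response stream:

```groovy
import java.util.zip.ZipEntry
import java.util.zip.ZipInputStream
import java.util.zip.ZipOutputStream

// Write a one-entry ZIP into an in-memory buffer instead of a file
def buffer = new ByteArrayOutputStream()
def zos = new ZipOutputStream(buffer)
zos.putNextEntry(new ZipEntry("hello.txt"))
zos << "hello".bytes
zos.closeEntry()
zos.close()

// Read it back to check the entry name and content
def zis = new ZipInputStream(new ByteArrayInputStream(buffer.toByteArray()))
def entry = zis.nextEntry
assert entry.name == "hello.txt"
def content = zis.text
assert content == "hello"
```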

Tuesday, December 28, 2010

Groovy DSL/Builders: POI Spreadsheets

It's well-known that Groovy is very rich for creating DSLs and fluent builder APIs.

I work a lot with the Apache POI library to generate Excel workbooks from data. We can use Groovy very easily to support a fluent and readable API for creating workbooks.

Here's a very simple example. Suppose we want to populate a workbook with two sheets with some data. Using the raw POI API, we could code something like this:

def workbook = new HSSFWorkbook()
def sheet1 = workbook.createSheet("Data")
def row10 = sheet1.createRow(0)
row10.createCell(0).setCellValue(new HSSFRichTextString("Invoice Number"))
row10.createCell(1).setCellValue(new HSSFRichTextString("Invoice Date"))
row10.createCell(2).setCellValue(new HSSFRichTextString("Amount"))
def row11 = sheet1.createRow(1)
row11.createCell(0).setCellValue(new HSSFRichTextString("100"))
row11.createCell(1).setCellValue(Date.parse("yyyy-MM-dd", "2010-10-18"))
row11.createCell(2).setCellValue(123.45)
def row12 = sheet1.createRow(2)
row12.createCell(0).setCellValue(new HSSFRichTextString("600"))
row12.createCell(1).setCellValue(Date.parse("yyyy-MM-dd", "2010-11-17"))
row12.createCell(2).setCellValue(132.54)
def sheet2 = workbook.createSheet("Summary")
def row20 = sheet2.createRow(0)
row20.createCell(0).setCellValue(new HSSFRichTextString("Sheet: Summary"))
def row21 = sheet2.createRow(1)
row21.createCell(0).setCellValue(new HSSFRichTextString("Total"))
row21.createCell(1).setCellValue(123.45 + 132.54)


This is not very readable. Even if we extract routines such as a common method to generate the cells in a row, the structure of our code does not follow closely the structure of what we want to create.
One of the big advantages of builders is that the structure of the code can match closely the structure of the generated result.

Here's the same workbook, created with a simple builder API:


def workbook = new HSSFWorkbookBuilder().workbook {
  sheet("Data") { // sheet1
    row(["Invoice Number", "Invoice Date", "Amount"])
    row(["100", Date.parse("yyyy-MM-dd", "2010-10-18"), 123.45])
    row(["600", Date.parse("yyyy-MM-dd", "2010-11-17"), 132.54])
  }
  sheet("Summary") { // sheet2
    row(["Sheet: Summary"])
    row(["Total", 123.45 + 132.54])
  }
}


The HSSFWorkbookBuilder class required to do this is very straightforward:


import org.apache.poi.hssf.usermodel.HSSFRichTextString
import org.apache.poi.hssf.usermodel.HSSFWorkbook
import org.apache.poi.ss.usermodel.Cell
import org.apache.poi.ss.usermodel.Row
import org.apache.poi.ss.usermodel.Sheet
import org.apache.poi.ss.usermodel.Workbook

class HSSFWorkbookBuilder {

  private Workbook workbook = new HSSFWorkbook()
  private Sheet sheet
  private int rows

  Workbook workbook(Closure closure) {
    closure.delegate = this
    closure.call()
    workbook
  }

  void sheet(String name, Closure closure) {
    sheet = workbook.createSheet(name)
    rows = 0
    closure.delegate = this
    closure.call()
  }

  void row(values) {
    Row row = sheet.createRow(rows++ as int)
    values.eachWithIndex {value, col ->
      Cell cell = row.createCell(col)
      switch (value) {
        case Date: cell.setCellValue((Date) value); break
        case Double: cell.setCellValue((Double) value); break
        case BigDecimal: cell.setCellValue(((BigDecimal) value).doubleValue()); break
        default: cell.setCellValue(new HSSFRichTextString("" + value)); break
      }
    }
  }

}


The magic is in the handling of the nested closures, and setting the delegate for each to the builder so that methods are resolved against the builder.
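Stripped of the POI details, the delegate mechanism looks like this. ListBuilder is a hypothetical toy class used only to show the technique:

```groovy
class ListBuilder {
  private List items = []

  List build(Closure closure) {
    // Unresolved method calls inside the closure fall through to this builder
    closure.delegate = this
    closure.call()
    items
  }

  void item(String name) {
    items << name
  }
}

def result = new ListBuilder().build {
  item("first")   // resolved against the ListBuilder via the delegate
  item("second")
}

assert result == ["first", "second"]
```

With the default resolve strategy the closure's owner is consulted before the delegate, so a method of the same name in the calling script or class would win. Setting closure.resolveStrategy = Closure.DELEGATE_FIRST is one way to change that if it becomes a problem.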

Here's another example of using such a builder. This one takes an SQL query and creates a workbook with two sheets. The first sheet contains the result of running the query, and the second sheet contains the query text.


def workbook = new HSSFWorkbookBuilder().workbook {
  sheet("Data") {
    db.eachRow(
      sql,
      {meta -> row(meta*.columnName)}, // header row with column names from ResultSetMetaData
      {rs -> row(rs.toRowResult().values())} // data row for each ResultSet row
    )
  }
  sheet("SQL") {
    sql.eachLine {line ->
      row([line])
    }
  }
}

Thursday, February 4, 2010

Release builds with TeamCity: Selecting the branch

We've long had TeamCity doing regular "CI style" checkin builds for our Java/Ant projects. We recently added nightly builds for extra reports, and for longer-running performance tests. This was straightforward.

We finally got TeamCity doing our release builds. There were a couple of tricky points, which I thought would be worth writing up:

  • Selecting the branch
  • Manipulating the repository
  • Ensuring correct (release) versions of dependencies


Selecting the branch


This was the trickiest thing. Checkin and nightly builds always run against trunk. Well, you could set up a checkin build for a long-running development branch too, but that's not difficult. The release build should be done from the release branch, and that can be different for each release. Or it can be the same, if you have to do a fix release on an existing one!

We found a pretty good way to do this with TeamCity, using Build Configuration Templates and Configuration Parameters.

The idea is that you set up a template that contains all of the settings for the release build, except for the branch name. The branch name is specified by a configuration parameter. Then, the template is instantiated for each branch as desired. Each time the template is instantiated, the branch name configuration parameter is given for that instance.

Here are some details, using Subversion VCS and a hypothetical project named "xxx":

Create the Release Build Template


  1. Edit project's existing checkin build config.
  2. Click "Extract Template" to create a new template.
  3. Enter "release-build-template" for Name.
  4. Back in the checkin build config, click "Detach from Template".
  5. Click OK.

We've created a template, with no configurations attached.

Set up the Release Build Template

Edit the template. Change settings as given in the following sections.

Version Control Settings


  1. Create a new VCS root named xxx-branches.
  2. Specify Subversion as the Type of VCS.
  3. Enter Svn repo URL + "/xxx/branches" as the URL.
  4. Test the Connection.
  5. Save the VCS root.
  6. Attach the template to the xxx-branches VCS root.
  7. Detach the template from the xxx-trunk VCS root.
  8. Add checkout rule for VCS root: "%release.branch%=>.". This tells TeamCity to checkout the specific release branch into the working directory.
  9. Save Version Control Settings.

Runner Settings


  1. Change Target to "release-build", or whatever you want to call your release build target.
  2. Save Runner Settings.

Your build script must have a target with this name. The target must build the release and publish it somewhere. For example, it could copy the release to a staging area on a server, or publish it to your enterprise repository.

The Ant target might look something like this:



Build Triggering Settings


  1. Delete/disable all triggering (VCS and Dependencies).
  2. Save Build Triggering Settings.
We've modified the template so that it will checkout the source from a release branch, with the specific branch given by a configuration parameter ("release.branch"). It will then build and publish the release.

Create a Release Branch Build Config for Release xx.yy

This procedure creates a release build config for a particular branch.

  1. Edit the release-build template's build configuration.
  2. Click Create Build Configuration From Template.
  3. Enter "release-xx.yy" for Name, where "xx.yy" is the name of your release branch.
  4. Enter the name of the branch for the release.branch parameter. For example, "RB-01.05".
We've created a configuration for running a release build on the branch.

To create a release:

  1. Ensure all changes for the release are checked into trunk.
  2. Create the branch. For example:
    svn copy $SVN/xxx/trunk $SVN/xxx/branches/RB-xx.yy

  3. Click Run on the release build configuration in TeamCity.



You can keep the release build configuration around for a particular branch as long as you like. If you are finished with a branch, you can delete the build configuration. If you need it again, it's easy to recreate it from the template.

That's it. I hope to write up some notes on the other points (manipulating the repository and ensuring the correct release versions of dependencies) soon.