Example Hadoop Job that reads a cache file loaded from S3

I had all sorts of problems getting my head around how cache files work with Hadoop. Finally, I stumbled across the answer – when you add a cache file (see HadoopMain#48), it’s available to read as a local file inside the mapper (MyMapper#36).

When running in Elastic MapReduce, the file URI can be an S3 file, using either s3://bucket/path or s3n://bucket/path. Those S3 URIs may or may not work in other Hadoop implementations, but the general approach should still apply.

See the gist at https://gist.github.com/twasink/8813628

Markdown in WordPress?? Yay!

Not sure how I missed this, but… WordPress.com blogs can now support Markdown. Which is going to make it so much easier to include embedded code snippets in a properly monospaced font.

How-To: Grails, GORM and SimpleDB

I went to build a new Grails-based app today, and I wanted to use SimpleDB as a backend (the app is an internal-use administration app, to configure a suite of AWS-deployed apps). So I went looking on how to use GORM with SimpleDB. This turned out to be a non-trivial task, so I thought I’d share the process with everyone.

Continue reading “How-To: Grails, GORM and SimpleDB”

Building Dependent Maven Projects in Bamboo

For the last year or so, I’ve been using Atlassian’s Bamboo (in the OnDemand variant) for our team’s build server. And, mostly, it’s an awesome tool. Some parts, however, are a little rough around the edges. Building dependent projects is one of them.

Continue reading “Building Dependent Maven Projects in Bamboo”

Mavericks Upgrade Experience – Apps keep freezing

So I, like a zillion other Apple fanbois, updated to Mavericks yesterday. Yay for free upgrades.

Overall, I’m impressed. It feels faster, I love the proper support for multiple monitors, iBooks on the desktop is a win, and the iCloud keychain works as advertised, with syncing to my phone.

But there’s one big negative – I’ve had multiple episodes of “applications randomly freezing”. When this occurs, the application _that I’m currently using_ just stops. It doesn’t crash, it just pauses. For a few seconds, or a few minutes, then it resumes. I can go use other apps, and they’ll work – or they might pause as well. It’s only been a day, but I’ve had 4 of these episodes so far – once last night shortly after upgrading, which I kind of shrugged off, and then 3 this afternoon, in about a 2 hour period. (Worked fine all morning though).

If I had to guess, I’d say it’s the new memory compression feature – but I had it occur just after a reboot this afternoon, with no other apps open. Then this evening it’s working fine again.

The other likely explanation is a clash with 3rd-party apps – I don’t run that many, but you never know.

In any case, I’m going to be doing a clean install to an external disk to see if it reproduces there – if it doesn’t happen there, then it will be a clean install for the laptop as well this weekend.

If you’ve had issues like that, feel free to leave a comment.

Update: Well, it just happened again. The only observation I can make is that it was while doing a build for the project I’m looking at – it’s quite possible that it was trying to compact memory or move stuff to swap. I wonder if the swap partition is corrupted?

Update the second: As it turns out, my hard disk had started reporting SMART errors the day before I downloaded Mavericks. Bad timing. The freezes were being caused by I/O errors – sure enough, when reading from the swap file (not partition).

Lessons learnt from a bug

This is a rant about a bug report I raised with ExtJS a few weeks ago. That said, I’m using the bug more as a teachable moment than anything else; I’m certainly not trying to bag ExtJS (which I quite like, despite some of its quirks). But this bug does highlight a number of “things done wrong”, which I want to learn from so that I don’t commit the same errors.

(No knowledge of ExtJS is required, and whilst I will describe the details of the bug in depth, the technical issues involved aren’t meant to be the takeaway points)

Continue reading “Lessons learnt from a bug”

Giving the ‘hasOne’ association some love

One of the really nice features of ExtJS, to my mind anyway, is the rich model architecture, and how models can be associated with each other. However, the quality can be a bit erratic – certainly, it appears that the HasOne association (which allows a one-to-one relationship) could use some loving, as it isn’t as well developed as the more commonly-used HasMany.

Continue reading “Giving the ‘hasOne’ association some love”

Reading Associative Arrays with ExtJS Models

Wow, it’s been a while since I posted something…

I’ve been working a lot with ExtJS recently, as the basis for a web application which talks to a lot of JSON-based web services. And I’ve got to say that I am enjoying it – it’s a nice, powerful framework that makes working with JavaScript quite bearable.

ExtJS includes a sub-framework for turning JSON (or XML) data into ‘models’, including nested data. It does this by providing ‘reader’ classes that understand JSON (or XML). However, the JSON reader only understands nested arrays. Sometimes what you have is a nested object – e.g. when you serialize a HashMap from Java into JSON. Fortunately, it’s possible to extend ExtJS and provide a new Reader – one that understands nested objects (aka ‘maps’, or ‘hashes’, or ‘associative arrays’).


/**
 * A variant of the JSON reader. Instead of reading arrays, where each record in the array field
 * has an 'id' property, it reads objects – aka associative arrays. The key of each entry becomes
 * the id of the record.
 *
 * So where the JSON reader would like data like this:
 * [ { id: '1', property: 'foo' }, { id: '2', property: 'bar' } ]
 *
 * the associative reader likes data like this:
 * { '1': { property: 'foo' }, '2': { property: 'bar' } }
 */
Ext.define('Twasink.data.AssociativeReader', {
    extend: 'Ext.data.reader.Json',
    alias: 'reader.associative',
    readRecords: function(data) {
        // convert the associative array into a normal array.
        var idProperty = 'id'; // should be a config value?
        var arrayData = [];
        Ext.Object.each(data, function(key, value) {
            // copy the entry, then store its key under the id property
            var arrayEntry = {};
            Ext.Object.merge(arrayEntry, value);
            arrayEntry[idProperty] = key;
            arrayData.push(arrayEntry);
        });
        // hand the converted array to the standard JSON reader
        return this.callParent([ arrayData ]);
    }
});


Ext.define('Twasink.model.Bar', {
    extend: 'Ext.data.Model',
    idProperty: 'id',
    fields: [ 'baz', 'bux' ]
});



[
    { "id": "foo_1", "baz": "baz_1", "bux": "bux_1", "bar": {
            "bar_1": { "baz": "bar_baz_1", "bux": "bar_bux_1" },
            "bar_2": { "baz": "bar_baz_2", "bux": "bar_bux_2" }
        }
    },
    { "id": "foo_2", "baz": "baz_2", "bux": "bux_2", "bar": {
            "bar_1": { "baz": "bar_baz_3", "bux": "bar_bux_3" },
            "bar_2": { "baz": "bar_baz_4", "bux": "bar_bux_4" }
        }
    }
]


Ext.define('Twasink.model.Foo', {
    extend: 'Ext.data.Model',
    requires: [ 'Twasink.data.AssociativeReader', 'Twasink.model.Bar' ],
    idProperty: 'id',
    fields: [ 'baz', 'bux' ],
    hasMany: [ { model: 'Twasink.model.Bar', name: 'bars', associationKey: 'bar', reader: 'associative' } ]
});
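
For anyone who wants to see the conversion step on its own, here is a minimal sketch in plain JavaScript, with no ExtJS involved. The function name `associativeToArray` and its `idProperty` parameter are my own hypothetical names, not part of ExtJS, and it uses a shallow copy (`Object.assign`) where the reader above uses `Ext.Object.merge`, so treat it as an illustration of the idea rather than a drop-in replacement.

```javascript
// Plain-JavaScript sketch of the "associative array to array" conversion.
// `associativeToArray` and `idProperty` are hypothetical names for illustration.
function associativeToArray(data, idProperty = 'id') {
  return Object.keys(data).map(function (key) {
    // Shallow-copy the entry so the input object is not mutated,
    // then store the entry's key under the id property.
    const entry = Object.assign({}, data[key]);
    entry[idProperty] = key;
    return entry;
  });
}

// The nested "bar" object from the sample data above.
const bars = associativeToArray({
  bar_1: { baz: 'bar_baz_1', bux: 'bar_bux_1' },
  bar_2: { baz: 'bar_baz_2', bux: 'bar_bux_2' }
});

console.log(bars);
```

Each element of `bars` carries the original fields plus an `id` taken from its key, which is the array-of-records shape the standard JSON reader expects.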


If you’re using ExtJS, I hope you find this useful.

Code samples not enough anymore

It’s becoming quite common for employers to ask to see code samples from prospective developers. This doesn’t really go far enough.

The next step up is to see the VCS history as well. A small sample – say, a couple of hours of work – can reveal a lot about how a person works – more than the code itself. Do they write tests first, or do they backfill later? Do they refactor their code to promote readability? Do they commit regularly, with meaningful comments, or do they just push bits in randomly?

With good free VCS hosting – like GitHub and BitBucket – anyone can easily create sample code and put it online for your potential employer to see.

So the next time an employer asks for a code sample, take it up a notch and give them the entire history as well. (And yes, I practice what I preach.)

AiL – JBehave and Spring

Having succeeded in getting a simple JBehave story running, my next challenge is to scale it up a bit. In particular, I want to get a JBehave story that integrates with Spring to do something more fully-featured: save an entry in a database.

Continue reading “AiL – JBehave and Spring”