Dec 26, 2011

Heroku: Parsing RSS Feed XML

I want a scheduled job to run on Heroku that pulls in data from an external RSS feed. I started out with libxmljs. Things looked promising until I deployed to Heroku. To build my slug, libxmljs needs access to SCons, which isn't available during the build:

-----> Installing dependencies with npm 1.0.94

> libxmljs@0.4.3 preinstall /tmp/build_16f6d4f95rt18/node_modules/easyrss/node_modules/libxmljs
> make node

make: scons: Command not found
make: *** [node] Error 127
npm ERR! error installing libxmljs@0.4.3 Error: libxmljs@0.4.3 preinstall: `make node`
npm ERR! error installing libxmljs@0.4.3 `sh "-c" "make node"` failed with 2
Shame!

Easyrss looked good too but also had the same libxmljs dependency.

In the end I settled on xml2js which works just fine. Here's my code snippet:
var request = require('request'),
    xml2js = require('xml2js'),
    _ = require('underscore');

var parser = new xml2js.Parser();
request('http://url.to.my.rss.feed', function(error, response, body) {
  if (!error && response.statusCode == 200) {
    parser.parseString(body, function (err, result) {
      _(result.item).each( function(item) {
        console.log(item.title);
      });
    });
  }
});
I can exercise the script successfully via heroku run bash.
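One thing to watch with the snippet above: depending on the xml2js version and parser options (I'm assuming here, not describing this exact setup), a child element can come back as a single object when the feed has one item and as an array when it has several, and text nodes can be wrapped in arrays too. A minimal sketch of a normalizing helper follows; asArray and itemTitles are hypothetical names, not part of xml2js.

```javascript
// Hypothetical helper: turn "missing", "single value" or "array" into an array.
function asArray(value) {
  if (value === undefined || value === null) {
    return [];
  }
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: collect item titles from an already-parsed feed object.
// Assumes the items live at result.item, as in the snippet above.
function itemTitles(result) {
  return asArray(result && result.item).map(function (item) {
    // The title itself may be wrapped in an array, depending on parser options.
    return asArray(item.title)[0];
  });
}

// Usage with hand-written objects standing in for parser output.
console.log(itemTitles({ item: { title: 'Only post' } }));
console.log(itemTitles({ item: [{ title: 'First' }, { title: ['Second'] }] }));
```

Inside the parseString callback, itemTitles(result) could then replace the direct loop over result.item.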

Dec 21, 2011

Learn You a Gradle Fu For Great Good

It's not too late to sign up for the Gradle training course that's coming to PDX in January*! The training came up at last night's PJUG meeting (although Gradle isn't just for Java; it builds projects for JVM languages in general). Here is my on-list response:

"That [500 line] pom.xml from last night is a great reminder of what it was like to use Maven. Maven is great at dependency management, but terrible at everything else. Gradle build scripts are readable, concise, only go into detail where it matters, and are configurable at a higher level of abstraction than Ant. Perhaps most importantly, Gradle is well documented.

All that said, there are conventions, idioms, and a certain way of thinking that will heighten your Gradle-fu. You will learn these on your own over time. However, attending the training course is going to be much more time efficient."


If you don't know what this is all about, then you should definitely be taking a look at Gradle.

* Howard Lewis Ship recommends Gradle too.

Dec 20, 2011

1. Heroku + Gradle + Graffiti + AI 2. ? 3. Profit!

Very nice. James Ward came to evangelize Heroku at our local user group. Thanks, James! Armed with a better understanding of Heroku, I set out to build a simple web service on Heroku during the post-talk bar visit. So, here it is:

GenderService! Okay. It's a dubious name, but it does something useful. You can hit the web service with a name and it will infer the gender for you. You can even make up a name that doesn't exist and it will do a reasonable job of determining the gender. Give it a shot!

Here are some that I tried: (if your browser behaves strangely, it may be detecting that it's JSON and trying to be helpful)

In case you're curious, I'm using n-grams, Laplace smoothing, and a smallish labelled dataset to estimate male/female probabilities for a name. If that's interesting to you, and you're local to PDX, consider joining pdxai on Google Groups.

Edit: I've added a confidence to the response. Clearly it's certain that I'm female: {"name":"merlyn","male":false,"confidence":0.995}
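For the curious, here is a rough sketch of how n-grams plus Laplace smoothing can produce a male/female estimate with a confidence. This is not the service's actual code. It assumes character bigrams, a naive-Bayes-style combination of the counts, and a tiny made-up training list; the names ngrams, train, classify, infer and EXAMPLES are all hypothetical.

```javascript
// Character n-grams of a name, padded so that first and last letters
// become distinct features (e.g. "a$" means "ends in a").
function ngrams(name, n) {
  const padded = '^' + name.toLowerCase() + '$';
  const grams = [];
  for (let i = 0; i + n <= padded.length; i++) {
    grams.push(padded.slice(i, i + n));
  }
  return grams;
}

// Count n-grams per label ("male" / "female") over the labelled examples.
function train(examples, n) {
  const model = { n: n, vocab: new Set(), classes: {} };
  for (const { name, label } of examples) {
    if (!model.classes[label]) {
      model.classes[label] = { names: 0, total: 0, counts: new Map() };
    }
    const cls = model.classes[label];
    cls.names += 1;
    for (const g of ngrams(name, n)) {
      cls.counts.set(g, (cls.counts.get(g) || 0) + 1);
      cls.total += 1;
      model.vocab.add(g);
    }
  }
  return model;
}

// Return a probability per label for the given name.
function classify(model, name) {
  const labels = Object.keys(model.classes);
  const allNames = labels.reduce((s, l) => s + model.classes[l].names, 0);
  const vocabSize = model.vocab.size;
  const logScores = labels.map((label) => {
    const cls = model.classes[label];
    let score = Math.log(cls.names / allNames); // class prior
    for (const g of ngrams(name, model.n)) {
      // Laplace (add-one) smoothing: unseen n-grams get a small nonzero probability.
      score += Math.log(((cls.counts.get(g) || 0) + 1) / (cls.total + vocabSize));
    }
    return score;
  });
  // Normalize the log scores into probabilities that sum to 1.
  const max = Math.max(...logScores);
  const exps = logScores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  const probs = {};
  labels.forEach((label, i) => { probs[label] = exps[i] / sum; });
  return probs;
}

// Shape the answer like the JSON response shown above.
function infer(model, name) {
  const probs = classify(model, name);
  const male = probs.male >= probs.female;
  return { name: name, male: male, confidence: male ? probs.male : probs.female };
}

// Made-up toy data, far smaller than any usable dataset.
const EXAMPLES = [
  { name: 'james', label: 'male' }, { name: 'john', label: 'male' },
  { name: 'robert', label: 'male' }, { name: 'david', label: 'male' },
  { name: 'maria', label: 'female' }, { name: 'anna', label: 'female' },
  { name: 'julia', label: 'female' }, { name: 'sophia', label: 'female' }
];

const model = train(EXAMPLES, 2);
console.log(JSON.stringify(infer(model, 'olivia')));
```

With so little data the confidence is only as good as the handful of bigrams it has seen, which is roughly how a made-up name can still get a plausible answer and a real one can get a confidently wrong one.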