headless JavaScript unit testing with Jasmine and PhantomJS

September 23, 2012 by Rob Friesel

Yesterday, I gave a talk (slides are here, and/or rate my talk here) at the fourth Vermont Code Camp on how to run fast, reliable, headless unit tests for JavaScript using the Jasmine BDD test framework and PhantomJS. What follows here is a summary of “the meat” of the talk: specifically, how to execute your Jasmine-based test suites using PhantomJS as the runtime environment. ¹

Photo credit to Josh Sled, 2012. (source)

Let’s assume that you’re a front-end developer, and let’s further assume that you’ve already looked into this subject at least a little bit. We’ll skip the primer on unit testing, because you already know what that is, and what its benefits are. We’ll further assume that you’ve read Ben Cherry’s blog post, “Writing Testable JavaScript”, which, while not necessarily an exhaustive post on the subject, is an easy-to-digest post with just the right amount of information, and just the right tone. So, all that being said, here are our basic assumptions for the rest of this post:

You know what unit testing is;
You’re writing testable JavaScript;
You’re already writing some unit tests for your JavaScript;
You’re frustrated by “playing the browser refresh game”; and
You want to automate your tests, possibly even as part of your CI ² strategy.

With these assumptions in place, let’s ask the obvious question: How would we go about setting up some automated headless unit tests?

Why Jasmine?

To start on this journey of automating your unit tests with PhantomJS, you’ve got to have a suite of tests. Though you could roll your own assertion framework, there are plenty of existing JavaScript testing frameworks out there already. ³ In other words: this is a solved problem. The testing frameworks that are out there run the spectrum with respect to features, maturity, and integrations with specific libraries or other frameworks. It’s a blog post in and of itself just to do a side-by-side comparison of even just a handful of them. At some point you need to just pick one that works for you and go with it.

I chose Jasmine.

You may be asking: Why Jasmine?

I first became exposed to Jasmine through Rebecca Murphey’s screencast where she talks about using it to test jQuery-based code. I read through the docs; I read through some of the source code; and I read a number of blog posts, articles, and forum threads, all discussing Jasmine in one capacity or another. From this research, I decided that it was mature enough, that it had the features I was looking for, ⁴ and the BDD style that it used just “clicked” with me. So when it came time for me to “just pick one”, that was the one that I picked.

Jasmine itself has the usual suspects for features: an “expect” test method; various matchers for equivalence, equality, defined (and undefined), negations, and so forth; “before” and “after” style setup and teardown functions; spies; async support; and more. Equally attractive to me, Jasmine seems to have developed a great community of users, including some intrepid folks who have built some marvelous extensions. (More on this later.) Lastly, it was sufficiently “library agnostic” for my tastes. ⁵
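To make those features a bit more concrete, here is a minimal sketch of a spec in the Jasmine style. The `Cart` object is hypothetical and exists only to give the spec something to test; the spec itself uses `describe`, `beforeEach`, `it`, `expect`, a couple of matchers, a negation, and a spy created with `jasmine.createSpy`. The guard around the spec is there only so the file is harmless when loaded somewhere Jasmine isn’t.

```javascript
// Hypothetical unit under test: a tiny cart that notifies a listener.
function Cart(onChange) {
  this.items = [];
  this.onChange = onChange || function () {};
}

Cart.prototype.add = function (name, price) {
  this.items.push({ name: name, price: price });
  this.onChange(this.total());
};

Cart.prototype.total = function () {
  var sum = 0;
  for (var i = 0; i < this.items.length; i++) {
    sum += this.items[i].price;
  }
  return sum;
};

// Register the spec only when Jasmine's globals are present.
if (typeof describe === 'function' && typeof jasmine !== 'undefined') {
  describe('Cart', function () {
    var cart, listener;

    // "before" style setup: a fresh cart and a fresh spy for every spec
    beforeEach(function () {
      listener = jasmine.createSpy('onChange');
      cart = new Cart(listener);
    });

    it('starts empty', function () {
      expect(cart.items.length).toBe(0);
      expect(cart.total()).toEqual(0);
    });

    it('notifies the listener with the new total', function () {
      cart.add('hops', 4);
      cart.add('malt', 6);
      expect(listener).toHaveBeenCalledWith(10);
      expect(cart.total()).not.toBe(4);
    });
  });
}
```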

It wasn’t that choosing Jasmine was a no-brainer, but it didn’t take me long to prefer it over others. As such, it provides a foundational component for our later discussion around how to automate these tests using PhantomJS.

What is PhantomJS?

PhantomJS is a headless WebKit with a JavaScript API that allows you to work with that browser. If you’re asking, “‘Headless’ WebKit? WTF?” then try to think of it this way: it’s like Chrome with no window, no browser chrome, and an invisible viewport. ⁶ It’s a command-line utility into which you can feed JavaScript-based programs for performing a variety of tasks that browsers and JavaScript engines are otherwise well-suited to. Observe this totally contrived example:
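A script along these lines would do it. Treat it as a sketch under a few assumptions: the URL is a placeholder, the `h2` selector is a guess at where headlines live, and the `printHeadlines` name is made up for illustration. The PhantomJS pieces (`require('webpage')`, `page.open`, `page.evaluate`, `phantom.exit`) are passed in as arguments so the logic isn’t welded to the PhantomJS globals.

```javascript
// webpage: PhantomJS's 'webpage' module; phantom: the PhantomJS global;
// log: a function that writes one line of output.
function printHeadlines(webpage, phantom, url, log) {
  var page = webpage.create();                 // create a WebPage object
  page.open(url, function (status) {           // load the URL
    if (status !== 'success') {
      log('Unable to load ' + url);
      phantom.exit(1);
      return;
    }
    // This function runs inside the loaded page; only plain, serializable
    // values make it back out to the PhantomJS side.
    var headlines = page.evaluate(function () {
      var nodes = document.querySelectorAll('h2');   // selector is an assumption
      return Array.prototype.map.call(nodes, function (node) {
        return node.innerText;
      });
    });
    headlines.forEach(function (headline) {    // output each headline
      log(headline);
    });
    phantom.exit(0);                           // exit from PhantomJS
  });
}

// Only runs under PhantomJS, where `phantom` exists as a global.
if (typeof phantom !== 'undefined') {
  printHeadlines(require('webpage'), phantom, 'http://example.com/', function (line) {
    console.log(line);
  });
}
```

Saved as `headlines.js` (a hypothetical filename), it would be run with `phantomjs headlines.js`.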

In that script we:

Create one of PhantomJS’s WebPage objects;
Load the main URL for this blog;
Grab all the headlines after the landing page loads;
Output the text of those headlines to the console; and
Exit from PhantomJS.

The above demonstrates just a few of the basics of the PhantomJS API. You can let your imagination run wild with this; it’s possible to use it for SVG rendering, site crawling and scraping, network monitoring, and–of course–headless testing.

Headless Testing

Because PhantomJS is “just a browser”, the tests that you already have written can (theoretically) be executed in that environment as-is. You just need to write a test runner to do the following things:

Load the page that contains your tests;
Wait for the tests to finish executing;
Parse the HTML to identify the number of successes and failures;
(Optionally) report those successes and failures, writing them to the console (and/or to a file); and
(Ideally) exit the PhantomJS runtime with an exit status of 0 or 1, to make it scriptable like a good UNIX-y citizen.
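The steps above can be sketched as a bare-bones runner. This is a rough illustration, not a finished tool: the `.finished-at` and `.description` selectors and the “N specs, M failures” summary wording are assumptions about what the HTML reporter writes into the page, and the helper names (`parseSummary`, `exitStatus`, `waitFor`) are made up for this example.

```javascript
// Pull the counts out of summary text such as "12 specs, 1 failure in 0.034s".
// The wording is an assumption about the reporter's output.
function parseSummary(text) {
  var match = /(\d+)\s+specs?,\s+(\d+)\s+failures?/.exec(text || '');
  if (!match) { return null; }
  return { specs: parseInt(match[1], 10), failures: parseInt(match[2], 10) };
}

// UNIX-style exit status: 0 only when the summary parsed and nothing failed.
function exitStatus(summary) {
  return summary && summary.failures === 0 ? 0 : 1;
}

// Poll isDone() every intervalMs; call onDone() once it returns true,
// or onTimeout() once timeoutMs has elapsed.
function waitFor(isDone, onDone, onTimeout, timeoutMs, intervalMs) {
  var start = Date.now();
  var timer = setInterval(function () {
    if (isDone()) {
      clearInterval(timer);
      onDone();
    } else if (Date.now() - start > timeoutMs) {
      clearInterval(timer);
      onTimeout();
    }
  }, intervalMs);
}

// Only runs under PhantomJS, where `phantom` exists as a global.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  var url = require('system').args[1];   // the page that hosts the specs

  page.open(url, function (status) {
    if (status !== 'success') {
      console.log('Unable to load ' + url);
      phantom.exit(1);
      return;
    }
    waitFor(
      function () {   // have the tests finished? (selector is an assumption)
        return page.evaluate(function () {
          return !!document.querySelector('.finished-at');
        });
      },
      function () {   // parse the summary, report it, exit accordingly
        var text = page.evaluate(function () {
          return document.querySelector('.description').innerText;
        });
        console.log(text);
        phantom.exit(exitStatus(parseSummary(text)));
      },
      function () {
        console.log('Timed out waiting for the specs to finish');
        phantom.exit(1);
      },
      10000, 100);
  });
}
```

With hypothetical filenames, that would be invoked as `phantomjs run-specs.js SpecRunner.html`, and the shell (or your CI server) can then branch on the exit status.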

If you’ve followed along with the example in the Jasmine documentation, you used the jasmine.HtmlReporter (like so) to get an attractive test report in your browser window. While this reporter is much more pleasing to the eye, it arguably means you’re taking on a lot more overhead in parsing the report. ⁷ Jasmine core ships with a jasmine.TrivialReporter as well, which has simpler output, but isn’t as attractive to look at.

Again–theoretically–we have enough to get started without changing a thing about our existing tests. But we’ll probably want more from our test runner than “just” the results. And wouldn’t it be nice to have “stripped” output in the console (i.e., in PhantomJS) and “pretty” output in every other browser?

Fortunately, we don’t need to do all this heavy lifting ourselves.

larrymyers/jasmine-reporters

Larry Myers recognized a lot of these ~~problems~~ opportunities long before I came along dreaming of doing something like this. Luckily, Mr. Myers created a series of Jasmine reporters (and runners!) to solve this problem, and open-sourced them on GitHub.

Of particular interest to us are the jasmine.ConsoleReporter and the phantomjs-testrunner.js test runner script.

In a nutshell, what Mr. Myers has given us with the jasmine.ConsoleReporter is a Jasmine reporter that was designed explicitly for console output–not something that we’re jury-rigging along the way. Furthermore, he provides a sister reporter in the jasmine.JUnitXmlReporter, which provides a mechanism for writing JUnit-style XML reports to the filesystem, directly from the PhantomJS runtime. (Can you see where this is headed?)

Putting It All Together

Along with those reporters, Mr. Myers’ project offers us the aforementioned phantomjs-testrunner.js. Somewhat ironically, the test runner depends on the jasmine.TrivialReporter from Jasmine core, but for best results, you’ll want to add the jasmine.ConsoleReporter. Used in combination, the test runner handles loading the target test page into the PhantomJS runtime, and adds the function necessary to write the XML reports to the filesystem; meanwhile, the jasmine.TrivialReporter provides the DOM output that’s parsed for the test results, and the jasmine.ConsoleReporter outputs results to the console. And if we’re feeling adventurous, we can add in the jasmine.JUnitXmlReporter to get a test report written to the filesystem. ⁸

On top of that, we can have the best of both worlds–an easily consumable report in PhantomJS and the console, and the “pretty” report for every other browser. We just need to do a little sniffing around in our runtime and decide which reporters to stand up. (Here’s what I did for the VT Code Camp demo.)
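As a rough sketch of that sniffing (not the code from the demo): check the user-agent string for “PhantomJS” and stand up reporters accordingly. The `reporterNamesFor` helper is made up for illustration, and passing the output directory as the first argument to `jasmine.JUnitXmlReporter` is my assumption about its constructor, so check it against the version of jasmine-reporters you’re using.

```javascript
// Decide which reporters to stand up for a given user-agent string.
// PhantomJS gets the parseable/console/XML set; every other browser
// gets the "pretty" HTML report.
function reporterNamesFor(userAgent) {
  var headless = /PhantomJS/.test(userAgent || '');
  return headless
    ? ['TrivialReporter', 'ConsoleReporter', 'JUnitXmlReporter']
    : ['HtmlReporter'];
}

// Only runs in a page where Jasmine (plus jasmine-reporters) is loaded.
if (typeof jasmine !== 'undefined' && typeof navigator !== 'undefined') {
  var env = jasmine.getEnv();
  reporterNamesFor(navigator.userAgent).forEach(function (name) {
    // Assumption: the XML reporter takes its output directory as the
    // first constructor argument.
    var reporter = name === 'JUnitXmlReporter'
      ? new jasmine.JUnitXmlReporter('target/test-reports/')
      : new jasmine[name]();
    env.addReporter(reporter);
  });
  env.execute();
}
```

Keeping the decision in a small function that only looks at a string means the choice of reporters can itself be unit tested, without needing either runtime.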

It works well, and it’s a pretty extensible solution, providing a lot of the right hooks to do things like fail builds from Jenkins, block git commits, ⁹ or watch a directory of files and show Growl alerts when tests start failing. I myself have only started to scratch the surface of what’s possible here, but I’m pretty excited about what I’ve seen so far.

The Catch

This is the part where someone out there is ready to shout me down, to tell me that I’m giving everyone bad advice. This is the part where I admit that this is not the “best” solution for this kind of test automation.

But “best” is relative. If by “best” you mean “fast”, then I’m going to say that this has a chance of winning the prize for “best” automated and/or headless testing solution. However, if by “best” you mean “comprehensive”, then no, this is not the best solution to test automation. A comprehensive test automation solution is going to involve:

A server (e.g., Selenium) to serve as a remote “driver” for “slaved” clients;
A pool of such clients in a “browser lab” with a representative sample of the browsers and devices you’re targeting and/or supporting (virtualized and/or on dedicated hardware); ¹⁰ and
Everything else that you’ll need no matter what other strategy you choose (e.g., the actual test suites, the test runners, the CI server).

When all the apples and oranges are sliced, the solutions aren’t all that different–one just happens to have much broader coverage across browsers and devices. But that “much broader coverage” isn’t without its own costs. For starters, it’s more expensive to stand up a browser lab like this; not to mention the expense of maintaining such an apparatus. ¹¹ And then you have to have someone who knows how to set up such a thing and/or is willing to learn all the intricacies involved. Even assuming that you’re prepared to cope with these expenses and challenges, even assuming that you believe that the broader coverage is worth it ¹²–you’re still likely to be bound by a couple of things: first, that such a browser lab is probably a one-of-a-kind item in your shop, and as such, getting time in the lab is competitive and contentious; and second, if you’ve “gone all the way” and introduced functional/acceptance/end-to-end tests, then you’ll quickly see that the reality with this setup is that those kinds of tests take a long time to run, and when they run, they can tie up your testing environment for a good long while.

Having breadth of coverage is important, but so is getting immediate feedback about your test results. The solution proposed in this post won’t solve the former problem, but it goes a long way toward solving the latter problem.

Summary and Errata

Perhaps the “tl;dr” belongs at the top, but I felt that it made sense to state “the short version” of all this once more at the end, just to drive it home: If you have a suite of JavaScript unit tests written in Jasmine, you can use a selection of reporters from Larry Myers’ jasmine-reporters, along with his test runner script, to automate the headless execution of those tests using PhantomJS. I don’t know about you, but I’m pretty excited about the possibilities opened up with such an arrangement.

I’d also like to share a couple of interesting items that came out of the discussion I had with the audience at VT Code Camp.

On test coverage. When we started talking about Jasmine and BDD, one member of the audience asked: “How much coverage is enough coverage? How do you know if you’re testing enough?” Naturally, I gave the naïve developer’s response and said “coverage reports”, and went on to say that I hadn’t explored code coverage tools well enough to speak to that. However, the gentleman surprised me and explained that he did not mean code coverage. We went back and forth a little bit, but we agreed that “enough” test coverage isn’t a function of code paths, but is a level of satisfaction that the business owners have. Knowing that every line of code is tested might satisfy a developer, but stakeholders are thinking about how the software should behave and what it should do. In retrospect, his point was an almost perfect illustration as to why I prefer the BDD approach, and why I mentioned that the developers should be writing the test cases/specs with the domain experts; then the question becomes not “are we testing enough things?” but “are we testing the right things?”
Crowdsourced test reports. One person in the audience floated something of a blasphemous idea: (paraphrased) “What about just delivering the test payload to the end-users, executing the tests client-side, and posting the results back to you with Ajax?” I can think of a bunch of reasons not to do this, but for a moment there, everyone in the room just kind of looked around as if to say: That’s just crazy enough to work! Who needs a browser lab when your users can be your browser lab? (Just kidding.)
“It’s still just WebKit.” More/less hand-in-hand with the section above (“The Catch”) was the astute observation by Mike Fowler that PhantomJS is “still just WebKit” and that by and large, that’s not the browser where we encounter the defects in our code. Mike cited versions of Internet Explorer that don’t support indexOf as a “gotcha”; and though I didn’t say so at the time, I immediately thought of trim. This was an excellent point, and I’m glad that he raised it. That being said (and if I recall correctly, I did say this), having tests, and being able to run them quickly using a tool like PhantomJS, isn’t going to bail you out of every problem. You should still be linting. You should still be doing code reviews. You should still be testing across different browser runtimes. You should still have top-notch QA engineers to keep you honest. (Hopefully the above technique just means that there’s less for those QA engineers to “keep you honest” about.)

All in all, I had a great audience at Vermont Code Camp this year, and I’m grateful for their attention, and even more grateful for their fantastic questions. Thanks go out to Julie Lerman and Rob Hale for organizing the event; ¹³ it was a lot of fun, and I hope to do it again next year.

Resources

UPDATE: (July 2014) This talk and blog post became the basis for my first book: The PhantomJS Cookbook!

¹ My talk included a bunch of discussion around testing in general, unit testing in JavaScript specifically, and the landscape of JavaScript testing frameworks. (Spoiler alert: there are a lot of them.)
² “CI” of course being “continuous integration”. If I assumed too much, and you don’t know what that is, check out Wikipedia’s entry on CI. (I’ll still be here when you get back.)
³ At the time of this writing, the Wikipedia article for “Unit testing frameworks” lists 35 different unit testing frameworks for JavaScript–and that’s not even an exhaustive list.
⁴ Although, to be fair: the “mature” testing frameworks pretty much all have the same features.
⁵ A lot of other testing frameworks seem more/less “attached” to some other library or framework. This is not to say that you can only test YUI-based JavaScript with YUI Test, nor that you can only test Dojo-based JavaScript with the Dojo Object Harness–but I really was looking for something that didn’t have one of these associations, even by name. (After all: names are important.)
⁶ Faith-based browsing?
⁷ Admittedly: that’s kind of a weak argument.
⁸ Don’t forget to tell the jasmine.JUnitXmlReporter where to write those reports. (I recommend target/test-reports–but that’s just showing my background and biases.)
⁹ The code examples from my VT Code Camp talk include this example pre-commit hook.
¹⁰ …and if you’re going to all the trouble to set up a browser lab like this, you may as well go all-out and also set up some hard-core functional/acceptance/end-to-end tests with a tool like Geb. But let’s not get crazy.
¹¹ Not to mention fraught with its own tangled questions: how do you keep browsers up to date? when do you let older versions drop off? how do you keep Chrome from updating itself? what about all those Android devices?
¹² And don’t get me wrong: it’s almost definitely worth it, if you can afford the up-front costs involved.
¹³ And many thanks to Apprenda for sponsoring the speaker/volunteer party!

About Rob Friesel

Software engineer by day. Science fiction writer by night. Weekend homebrewer, beer educator at Black Flannel, and Certified Cicerone. Author of The PhantomJS Cookbook and a short story in Please Do Not Remove.
