Introduction to PhantomJS

Posted on: March 14th, 2017 by Olu

Hi folks,

In this post I briefly go through how to navigate pages using PhantomJS, for example when writing automated UI tests or scraping web pages. To do this we will use the page_events.js example supplied by PhantomJS itself [1].

Let’s look at the first chunk of the code:

"use strict";
var sys = require("system"),
    page = require("webpage").create(),
    logResources = false,
    step1url = "http://en.wikipedia.org/wiki/DOM_events",
    step2url = "http://en.wikipedia.org/wiki/DOM_events#Event_flow";

if (sys.args.length > 1 && sys.args[1] === "-v") {
    logResources = true;
}

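After the "use strict" directive, the first declaration loads the system module into sys, which we use later on to inspect the command-line arguments: if the first argument after the script name is -v, logResources is set to true. That’s useful if your script is going to accept arguments. The important bit, though, is the page = require("webpage").create() line, which creates the page object that the rest of the script drives.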
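
As a minimal sketch of the argument check, the test can be pulled into a small function. The name wantsResourceLogging and the sample argument lists are made up for illustration; the only PhantomJS behaviour assumed is that system.args[0] holds the script name, so user-supplied flags start at index 1.

```javascript
"use strict";

// Hypothetical helper (not part of page_events.js): returns true when the
// first user-supplied argument is "-v".
// args[0] is assumed to be the script name, as in PhantomJS's system.args.
function wantsResourceLogging(args) {
    return args.length > 1 && args[1] === "-v";
}

// Example argument lists, shaped like system.args for
// "phantomjs page_events.js -v" and "phantomjs page_events.js".
var verboseArgs = ["page_events.js", "-v"];
var quietArgs = ["page_events.js"];

console.log(wantsResourceLogging(verboseArgs)); // true
console.log(wantsResourceLogging(quietArgs));   // false
```

Inside a real PhantomJS script you would pass require("system").args instead of the sample arrays.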

 

Next, to open a page, we use code like this:

setTimeout(function() {
    console.log("");
    console.log("### STEP 1: Load '" + step1url + "'");
    page.open(step1url);
}, 0);

 

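That is, we call the open() method on the page object. Notice that the call is wrapped in setTimeout with a delay of 0; the delay is given in milliseconds, not seconds, so the page is opened as soon as the script's top-level code has finished running. The later steps are scheduled with progressively longer delays (10000 ms and 20000 ms in the snippets that follow), so that each step has time to finish before the next one starts.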
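
The staggered-timer pattern can be sketched in isolation. The helper scheduleSteps, the 5000 ms spacing, and the fake timer are all assumptions made for this illustration and do not appear in page_events.js; the timer is injectable only so that the delays can be inspected without actually waiting.

```javascript
"use strict";

// Hypothetical helper: run each step `spacingMs` milliseconds after the
// previous one. `timer` defaults to the global setTimeout and is only
// injectable so the resulting delays can be inspected without waiting.
function scheduleSteps(steps, spacingMs, timer) {
    var schedule = timer || setTimeout;
    return steps.map(function (step, index) {
        var delayMs = index * spacingMs;
        schedule(step, delayMs);
        return delayMs;
    });
}

// A fake timer that records the requested delays instead of waiting.
var recorded = [];
function fakeTimer(callback, delayMs) {
    recorded.push(delayMs);
}

var delays = scheduleSteps(
    [function () {}, function () {}, function () {}],
    5000,
    fakeTimer
);
console.log(delays.join(", ")); // 0, 5000, 10000
```

Fixed delays are only a guess at how long each step takes. PhantomJS also lets you pass a callback to page.open(), or set page.onLoadFinished, if you would rather react to the page actually finishing its load.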

 

Next, to click a link on a page, we can use code like this:

setTimeout(function() {
    console.log("");
    console.log("### STEP 3: Click on page internal link (aka FRAGMENT)");
    page.evaluate(function() {
        var ev = document.createEvent("MouseEvents");
        ev.initEvent("click", true, true);
        document.querySelector("a[href='#Event_object']").dispatchEvent(ev);
    });
}, 10000);

 

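Notice that the function passed to page.evaluate() runs inside the loaded page, which is why it can use document. There we create a mouse event with document.createEvent("MouseEvents"), initialise it as a bubbling, cancelable "click" by calling initEvent(), and then dispatch it on the element returned by document.querySelector(), here the link whose href is #Event_object.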
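
One weakness of the snippet is that querySelector() returns null when nothing matches, so dispatchEvent would throw. A minimal sketch of a guarded version follows; clickBySelector is a hypothetical name, and the doc parameter stands in for the page's document so the function can be exercised outside a browser.

```javascript
"use strict";

// Hypothetical helper: dispatch a synthetic click on the first element that
// matches `selector`. Returns false instead of throwing when nothing matches.
// `doc` is any object that offers the DOM's querySelector and createEvent.
function clickBySelector(doc, selector) {
    var target = doc.querySelector(selector);
    if (!target) {
        return false; // nothing to click
    }
    var ev = doc.createEvent("MouseEvents");
    ev.initEvent("click", true, true); // type, bubbles, cancelable
    target.dispatchEvent(ev);
    return true;
}
```

In a PhantomJS script this logic would have to sit inside the callback given to page.evaluate(), with document passed as doc, because that callback runs in the page and cannot see variables from the outer script.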

 

Finally, a PhantomJS script should close the page and exit PhantomJS explicitly; without a call to phantom.exit() the process keeps running. We do that using the code shown below:

setTimeout(function() {
    console.log("");
    console.log("### STEP 5: Close page and shutdown (with a delay)");
    page.close();
    setTimeout(function(){
        phantom.exit();
    }, 100);
}, 20000);
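
The close-then-exit ordering in the snippet above can be sketched as a small function. closeThenExit and its parameters are hypothetical; the sketch assumes only that the page object has a close() method and that the runtime object has an exit() method, which is the shape of PhantomJS's page and its global phantom object.

```javascript
"use strict";

// Hypothetical helper: close the page first, then exit after `delayMs`.
// `page` needs a close() method and `runtime` an exit() method.
// `timer` defaults to setTimeout and is injectable for testing.
function closeThenExit(page, runtime, delayMs, timer) {
    var schedule = timer || setTimeout;
    page.close();
    schedule(function () {
        runtime.exit();
    }, delayMs);
}
```

In a PhantomJS script the call would look like closeThenExit(page, phantom, 100).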

 

For a more complete example, check out Amir Duran’s excellent walkthrough of using PhantomJS to log in to Amazon [2]. The PhantomJS website [3] also has many examples that demonstrate its usage. That’s all for now. Happy coding.

References

1. Page Events example. https://raw.githubusercontent.com/ariya/phantomjs/master/examples/page_events.js

2. How to login Amazon using PhantomJS – Working example | Code Epicenter. http://code-epicenter.com/how-to-login-amazon-using-phantomjs-working-example/

3. Examples | PhantomJS. http://phantomjs.org/examples/