Summary
This article describes how to build client-side passive monitors using Engine for Web Applications.
Passive monitoring is the practice of recording metrics in response to outside stimuli. Conversely, active monitoring is the practice of generating stimuli and then recording the resulting metrics. Both active and passive Web site monitoring can take place at any point where the network traffic, application transactions, or user events generate data.
Building Client-Side Passive Monitors with Engine for Web Applications
Prerequisites
The following prerequisites are needed to implement the examples provided in this article.
Overview
Introduction
The release of Google Analytics has brought renewed attention to a methodology that has existed for at least ten years: passively monitoring real users on Web pages. Passive monitoring is the practice of recording metrics in response to outside stimuli. Conversely, active monitoring is the practice of generating stimuli and then recording the resulting metrics. Both active and passive Web site monitoring can take place at any point where the network traffic, application transactions, or user events generate data.
There are many types of passive monitors, varying in price and implementation. Some are hosted in third-party datacenters, and others can be hosted locally alongside the monitored Web site. Different sets of metrics can be collected, and different reports can be generated. Some monitors focus on the client-side by using a watermark, or beacon, and other monitors focus on the server-side by instrumenting the Web server or tapping a network span port. There are advantages and disadvantages to each.
The advantages of client-side passive monitors are that they can:
- Monitor requests that may be cached by the user agent or a proxy.
- Monitor client-side events and activities.
- Span application boundaries without instrumenting application servers.
- Monitor page-to-page latency as seen by the user.
- Monitor render time, which can only be surmised by server-side monitoring products.
The disadvantages of client-side passive monitors are that they:
- Cannot monitor detailed network latency.
- Have no insight into server errors.
Common problems that exist for both server-side and client-side solutions:
- Accuracy is affected when some user agents do not respond to client-side monitors, and when server-side monitors fail to see hits that are served from a cache.
This article describes how to build client-side passive monitors using the Engine for Web Applications Monitor Service.
Monitor Metrics
No matter the type of Web monitor, be it a log shredder or a Web Analytics monitor with all the bells and whistles, there exists a set of common metrics that can be collected. Most operations and business-level reporting is created from those common metrics. Additional metrics include Web pages dog-eared with marketing identifiers for tracking campaign and advertisement response levels. Data is abundant with passive monitoring, and the real value comes from how that data is aggregated and summarized to answer operations and business-level questions. Once the buzzwords and the marketing are boiled down, the common metrics derived from passive monitoring of Web pages are as follows.
Common Metrics
The following metrics are the baseline for most Web monitoring products, and are also found in Web server logs.
- Date / Time
- Location
- Referer
- Server GMT Offset
- Client IP; Geo & ISP Data (eg: http://ip2location.com):
  - ISP
  - Country
  - Region
  - City
  - Latitude and Longitude
- Client Host
- User Agent String (from Request header / can be spoofed), including:
  - Operating System (eg: Windows)
  - Operating System Version (eg: XP)
  - Client Name (eg: Firefox)
  - Client Version (eg: 1.5)
- Session ID
- Response Code (some passive monitors don't/can't collect this)
Client-Side Metrics
In addition to the common metrics, the following metrics are the performance and experience baseline for client-side monitoring products:
- Page title
- Window Name (useful for pages with Framesets and IFrames)
- Client GMT Offset
- User Agent Properties (sampled via script): could be anything, but here are popular ones:
  - Screen dimensions
  - Color depth
  - Plugin support
  - Script supported
- User Agent, Content, and User Events
- Script Errors
- Image Errors
- Content Assertions (eg: does a specific expression match the page contents)
- User Agent Calculations (things you can calculate on the UA with the monitor)
- Pseudo-Throughput / kbps (if you don't do this from the server, it can only be pseudo)
- Session-Length cookies supported
- Longterm-Length cookies supported
- DOM Support (eg: getElementById, or XMLHTTP support; aka AJAX Support)
- Page-views per session count
- Returning visit (unique session) count
- Script Support (is script enabled)
- Script Performance (eg: Function Monitor )
- Content Layout (eg: Page Scope; very expensive)
- User Behavior (eg: IMNMotion Behavior Monitor )
- Variable Assertions (eg: what is the value of a script variable, or what is the value of a specified expression)
- Cookie Assertions (eg: what is the value of a cookie key)
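As a rough illustration of the pseudo-throughput calculation listed above, the following sketch divides the number of bits transferred by the elapsed time measured on the client. The function name and the sample figures are assumptions made for illustration, and the byte count would have to be supplied by the page or the server because script cannot reliably measure it.

```javascript
// Hypothetical helper (not part of the Engine): estimates throughput in
// kilobits per second from a byte count and a client-side elapsed time.
function pseudoThroughputKbps(byteCount, elapsedMs) {
  if (!(elapsedMs > 0)) {
    return null; // no usable timing sample
  }
  // Bits per millisecond is numerically equal to kilobits per second.
  return (byteCount * 8) / elapsedMs;
}

// Example: a 50,000-byte page that took 400 ms to arrive.
var sampleKbps = pseudoThroughputKbps(50000, 400);
```

The value is only an approximation because the client-side timer includes render and script time as well as network transfer time.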
Marketing Metrics
In addition to client-side and common metrics, there are marketing metrics. These are unique page quantifiers that are specific to a monitoring product, and that are used in tandem with summarizers and product-specific reports to generate market response charts. This type of information is similar to client-side variable assertions, with the added benefit of being specifically tied to a particular product feature.
Existing Monitoring Industry
Cutting Through Industry Speak
There are many common terms used to describe passive monitor product features, all of which can be boiled down to sensible descriptions.
- Click stream / Path analysis: this is the path of URLs that a specific person follows over time during a specific session.
- Conversion rate: the number of visitors who register, click an advertisement, start a shopping cart, or check-out, versus the total number of visitors to the Web site.
- Drop-off / Abandonment rate: the number of visitors who start a process (eg: start a shopping cart), and then leave before finishing the process.
- Return rate / Returning visitors: the number of visitors who have been to a site before, typically in a different session, and who have not yet bought anything.
- Customers: someone who bought something, or has otherwise registered with the site.
- Optimized or Compressed JavaScript: this means the script file has been obfuscated to reduce whitespace, remove code comments and line returns, and replace long member names with shorter ones.
- Geo Location / Origin: this is mapping the client IP address to a table of IP blocks assigned to ISPs, and the locations of the ISPs.
- Real-user monitor: another way of describing passive monitoring.
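Several of the terms above reduce to simple arithmetic over recorded page hits. The following sketch is one possible way to derive a click path, a conversion rate, and an abandonment rate. The record shape (sessionId, time, location), the function names, and the choice of /cart and /checkout as the start and finish pages are all assumptions made for illustration.

```javascript
// Hypothetical hit records, as a summarizer might read them from storage.
var hits = [
  { sessionId: "s1", time: 3, location: "/checkout" },
  { sessionId: "s2", time: 1, location: "/home" },
  { sessionId: "s1", time: 1, location: "/home" },
  { sessionId: "s3", time: 2, location: "/home" },
  { sessionId: "s1", time: 2, location: "/cart" },
  { sessionId: "s2", time: 4, location: "/cart" },
  { sessionId: "s4", time: 5, location: "/cart" }
];

// Click stream: the locations one session visited, ordered by time.
function clickPath(allHits, sessionId) {
  var sorted = allHits.slice().sort(function (a, b) { return a.time - b.time; });
  var path = [];
  for (var i = 0; i < sorted.length; i++) {
    if (sorted[i].sessionId === sessionId) {
      path.push(sorted[i].location);
    }
  }
  return path;
}

// Conversion rate: sessions that reached finishPage versus all sessions.
// Abandonment rate: sessions that reached startPage but never finishPage,
// versus all sessions that reached startPage.
function funnelRates(paths, startPage, finishPage) {
  var started = 0, finished = 0, abandoned = 0;
  for (var i = 0; i < paths.length; i++) {
    var didStart = paths[i].indexOf(startPage) >= 0;
    var didFinish = paths[i].indexOf(finishPage) >= 0;
    if (didStart) { started++; }
    if (didFinish) { finished++; }
    if (didStart && !didFinish) { abandoned++; }
  }
  return {
    conversionRate: paths.length ? finished / paths.length : 0,
    abandonmentRate: started ? abandoned / started : 0
  };
}

var sessionPaths = [
  clickPath(hits, "s1"), clickPath(hits, "s2"),
  clickPath(hits, "s3"), clickPath(hits, "s4")
];
var rates = funnelRates(sessionPaths, "/cart", "/checkout");
```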
Service Breakdown
As the name implies, client-side passive monitors collect data on Web pages, and send that data to a Web server for aggregation. For monitoring secure sites, and for maximum privacy and data protection, it is recommended to use a locally hosted solution, or one where the data is collected and stored by the same domain that serves the monitored content. No matter where the data is stored, the key areas of collecting, storing, and reporting on the data are broken down as follows:
- Monitor: collects the data and sends it to a Web server as a GET or POST request. GET requests are most typically generated using the new Image() constructor and tacking the monitored data onto the query string. POST requests are starting to become more common with XMLHTTP requests. The monitor can be as simple as a static image request, or as complex as an embedded script file that works alone or interoperates with a server-side filter or application.
- Data Spool / Aggregator: this can simply be a web server log file, or a Web application to receive and process the request, extract the collected metrics, and spool the metrics for later storage.
- Data Storage: This can be a flat file or database where the data is stored in a logical, and if possible, normalized format.
- Summarizer: A summarizer is useful for building out summaries of the data ahead of any requests, and is particularly useful for building out session paths for click stream analysis.
- Reporting: a reporting service for generating reports based on the raw or summarized data.
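To make the Data Spool / Aggregator step more concrete, the following sketch extracts the collected metrics from a spooled beacon request, such as a request line taken from a Web server log. The function names and the sample request are assumptions made for illustration, and a real aggregator would also validate and normalize the values before storing them.

```javascript
// Decode one query-string component. decodeURIComponent throws on the
// %uXXXX sequences that escape() can produce, so fall back to unescape().
function safeDecode(text) {
  try {
    return decodeURIComponent(text);
  } catch (e) {
    return unescape(text);
  }
}

// Hypothetical aggregator step: turn a beacon request URL into a record
// of metric names and string values.
function parseBeaconRequest(requestUrl) {
  var record = {};
  var queryStart = requestUrl.indexOf("?");
  if (queryStart < 0) {
    return record; // scriptless beacon with no query string
  }
  var pairs = requestUrl.substring(queryStart + 1).split("&");
  for (var i = 0; i < pairs.length; i++) {
    if (!pairs[i]) { continue; }
    var eq = pairs[i].indexOf("=");
    var key = eq < 0 ? pairs[i] : pairs[i].substring(0, eq);
    var value = eq < 0 ? "" : pairs[i].substring(eq + 1);
    record[safeDecode(key)] = safeDecode(value);
  }
  return record;
}

var spooled = parseBeaconRequest(
  "/monitor/spool.gif?location=http%3A%2F%2Flocalhost%2Fhome&title=Home%20Page&gmt=300"
);
```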
Designing A Passive Web Page Monitor
A well-designed passive Web Page Monitor that uses any kind of JavaScript should be able to handle three implementation states:
- script is disabled,
- script is enabled but the browser does not support the script necessary for the monitor to collect and send data,
- and the browser supports the monitor.
The monitor might also consider the following possible loading states:
- The page loads completely (window.onload fires).
- The user navigates away or otherwise leaves the page before it loads completely.
- The user hits the stop button before the page loads completely.
Checking whether script is supported to the level required by the monitor matters because, without the check, no data would be collected for users whose browsers fall short. The loading states matter because some monitors do not send data until the window.onload event fires, and that event may never fire if the user leaves the page before it loads, such as when a page is redirected. In addition, data sent through an asynchronous operation, such as an image request or an asynchronous XML request, may be lost if the request is issued while the page is unloading.
The following are a few examples of simple passive monitors, where the data is spooled in the Web server log by making an image request and the web URL is //localhost/monitor/spool.gif. Note that the protocol is not specified so that the request inherits the protocol used to load the page.
<!-- Scriptless monitor / Simple Web beacon -->
<img src = "//localhost/monitor/spool.gif" height="1" width="1" alt = "" />
<!-- Scripted monitor / Web beacon -->
<script type = "text/javascript">
var oMonitor = new Image();
oMonitor.src = "//localhost/monitor/spool.gif?script_enabled=true";
</script>
<!-- Handle script-disabled -->
<noscript>
<img src = "//localhost/monitor/spool.gif?script_disabled=true" height="1" width="1" alt = "" />
</noscript>
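The examples above cover the script-disabled and script-enabled states. The remaining state, where script runs but the browser lacks a feature the monitor needs, can be approached with a feature check before the beacon is created. The following is a minimal sketch under that assumption. The function names are hypothetical, and the window object is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Hypothetical feature check: the only feature this beacon needs is the
// Image constructor. Truthiness is tested rather than typeof, because
// some older browsers report host constructors as "object".
function getMonitorSupport(win) {
  if (!win || !win.Image) {
    return "unsupported";
  }
  return "supported";
}

// Creates the image beacon only when the required feature is present.
// Returns true if a request was started, false otherwise.
function sendImageBeacon(win, url) {
  if (getMonitorSupport(win) !== "supported") {
    return false;
  }
  var beacon = new win.Image();
  beacon.src = url;
  return true;
}

// In a page this might be called as:
//   sendImageBeacon(window, "//localhost/monitor/spool.gif?script_enabled=true");
```

When the function returns false, the page has no scripted way to report, which is one reason the scriptless image beacon remains a useful baseline.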
The Engine for Web Applications Monitor Service
The Engine for Web Applications Monitor Service provides instrumentation and contextual information to easily create a robust passive monitor. The Monitor Service can be used to register a script object as a monitor, delegate a number of events to that object, and provide easy access to common information such as the session id, unique id, and various location hashes.
Event Delegation
The Monitor Service delegates the following events to virtual handlers defined on the monitor.
- (window) contextmenu
- (window) beforeunload
- (window) blur
- (window) error
- (window) focus
- (window) keydown 1
- (window) load
- (window) resize
- (window) unload
- (document) scroll
- (form) submit 2
- (input) blur 2
- (input) focus 2
- (document) mouseclick 1
- (document) mousemove 1
- (select) change 2
1. Bubbles from the source elements to the specified element.
2. Specified for any object matching the element name.
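To illustrate the idea of delegating events to virtual handlers, the following sketch shows one way a dispatcher could look up and invoke a handler by name. It is not the Monitor Service implementation. The handle_window_load name follows the handler names used in the examples later in this article, while the general handle_source_event pattern and the delegateEvent function are assumptions.

```javascript
// Hypothetical dispatcher: calls handle_<source>_<event> on every
// registered monitor that defines it, and reports how many were called.
function delegateEvent(monitors, sourceName, eventName, eventObject) {
  var handlerName = "handle_" + sourceName + "_" + eventName;
  var handled = 0;
  for (var i = 0; i < monitors.length; i++) {
    var monitor = monitors[i];
    if (typeof monitor[handlerName] === "function") {
      // Invoke as a method so that "this" refers to the monitor.
      monitor[handlerName](eventObject);
      handled++;
    }
  }
  return handled;
}

// A monitor only needs to define the handlers it cares about.
var loadMonitor = {
  loaded: false,
  handle_window_load: function () { this.loaded = true; }
};
var handledCount = delegateEvent([loadMonitor, {}], "window", "load", {});
```

Because a handler is optional, a monitor that defines no handler for an event is simply skipped.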
Context Properties
The Monitor Service exposes the following contextual property methods.
- getApplicationId - returns the value of a global APPLICATION_ID, or the static string "Global". The application identifier is used to establish context within a specific web application.
- getContextId - returns a light hash of the location and a few random numbers. This should only be used for correlation within a specific session.
- getContextObject - returns the context object for which monitoring is applied. The default is document, and this may be overridden by specifying a global CONTEXT_OBJECT variable.
- getDatasetId - returns the value of a global DATASET_ID, or the static string "public". The dataset id is used to filter data to an organizational level. This is useful in reseller or hosted contexts where data from multiple sources may be stored in the same location.
- getSessionId - returns the session id. By default, the Monitor Service uses the cookie MONITOR_SESSION_ID to store the session id. The SDK is required to change the cookie name to use an application specified session id, such as ASPSESSIONID or JSESSIONID.
- getWindowState - returns the loading state of the window, as interpreted by the Monitor Service.
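The context properties above follow two simple patterns: reading a global override with a static default, and hashing the location together with a random component. The following sketch illustrates both patterns under stated assumptions. The function names, the hash, and the identifier format are hypothetical and do not reproduce the Monitor Service's own algorithm.

```javascript
// Returns scope[name] when it is defined, otherwise the fallback. This
// mirrors the "global value or static string" behavior described above.
function readGlobal(scope, name, fallback) {
  return (typeof scope[name] !== "undefined") ? scope[name] : fallback;
}

// A light, non-cryptographic string hash, returned as hexadecimal.
function lightHash(text) {
  var hash = 0;
  for (var i = 0; i < text.length; i++) {
    // Keep the running value within 32 bits.
    hash = ((hash * 31) + text.charCodeAt(i)) % 4294967296;
  }
  return hash.toString(16);
}

// Combines the location hash with a random number. The random source is
// a parameter so the result can be made repeatable in a test.
function makeContextId(location, random) {
  return lightHash(location) + "-" + Math.floor(random() * 1000000);
}

var applicationId = readGlobal({}, "APPLICATION_ID", "Global");
var datasetId = readGlobal({ DATASET_ID: "reseller-a" }, "DATASET_ID", "public");
var contextId = makeContextId("http://localhost/home", Math.random);
```

An identifier built this way is only suitable for correlating records within a session, which matches the caution given for getContextId.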
Examples
Example #1
The following is an example of creating a monitor using a new Engine object and the Monitor Service.
<!-- Scripted monitor / Web beacon -->
<script type = "text/javascript">
  // Create a new object to be the monitor, and add it to the Engine framework.
  var oMonitor = org.cote.js.newObject("MyMonitor","1.0",true);
  oMonitor.initializeMonitor=function(){
    // must return true for monitoring to be enabled;
    return 1;
  };
  org.cote.js.monitor.MonitorService.addMonitor(oMonitor);
</script>
Alternately, the following is an example of creating a monitor from an existing object.
var oMonitor = {
  handle_window_load:function(){
  },
  // Required by MonitorService
  initializeMonitor:function(){
    // must return true for monitoring to be enabled;
    return 1;
  }
};
// important to prepare object so 1) it can be registered, and 2) add it to the ObjectRegistry
org.cote.js.prepareObject("MyMonitor","1.0",true,oMonitor);
// add the monitor object
org.cote.js.monitor.MonitorService.addMonitor(oMonitor);
The missing piece in the previous examples is collecting and sending data. The following example shows the collection of a few metrics, and includes a noscript statement to compensate for browsers that do not support script or that have scripting disabled.
Example #2
<noscript>
<img src = "http://localhost/monitor/spool.gif?script_disabled=true" height="1" width="1" alt = "" />
</noscript>
<script type = "text/javascript">
  var oMonitor = {
    handle_window_load : function(){
      var oBeacon = new Image();
      oBeacon.src = "http://localhost/monitor/spool.gif?location=" + escape(document.URL)
        // send the referring page, not the cookie string
        + "&refer=" + escape(document.referrer)
        + "&session_id=" + this.getMonitorService().getSessionId()
        + "&context_id=" + this.getMonitorService().getContextId()
        + "&title=" + escape(document.title)
        + "&gmt=" + (new Date()).getTimezoneOffset()
      ;
    },
    // Required by MonitorService
    initializeMonitor : function(){
      // must return true for monitoring to be enabled;
      return 1;
    }
  };
  // important to prepare object so 1) it can be registered, and 2) add it to the ObjectRegistry
  org.cote.js.prepareObject("MyMonitor","1.0",true,oMonitor);
  // add the monitor object
  org.cote.js.monitor.MonitorService.addMonitor(oMonitor);
</script>
Here are a few considerations about the previous example:
- Some information can be culled from the Web server or the Web server logs, such as the date and time, the IP address, and the User-Agent string.
- The time from the monitored user should only be used for measuring clock skew, and not for marking the time on the page. Use the time from the server because that will provide consistent and more accurate values, whereas a monitored user could have changed their local clock.
- This example does not address the condition that the script is not supported, and relies on the window.onload event firing. What happens if the user leaves the page before the onload event fires? Conversely, by not waiting for the onload event, not all elements may be instrumented for monitoring, and if the user navigates away from the page before the beacon to send data is created, then the data is still lost.
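The clock-skew point above can be illustrated with a small calculation. The following sketch assumes the server records its own receipt time for a beacon that carries a client timestamp. Both function names are hypothetical, and network latency is ignored, so the result is an estimate only.

```javascript
// Skew is how far the client clock runs ahead of (positive) or behind
// (negative) the server clock, in milliseconds.
function clockSkewMs(clientTimeMs, serverTimeMs) {
  return clientTimeMs - serverTimeMs;
}

// Translates a later client-side timestamp into server time by removing
// the previously measured skew.
function toServerTimeMs(clientEventMs, skewMs) {
  return clientEventMs - skewMs;
}

// Example: the client clock runs 500 ms ahead of the server clock.
var skewMs = clockSkewMs(1000500, 1000000);
var eventOnServerClock = toServerTimeMs(1002500, skewMs);
```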
Example #3
To compensate for early navigation, a cookie can be used (if cookies are enabled). If cookies are not enabled, a synchronous XML request can be used. Otherwise, this would be a point of failure for data collection. The following example shows how cookies might be used to compensate for an early navigation event.
var oMonitor = {
  getData : function(){
    return "http://localhost/monitor/spool.gif?location=" + escape(document.URL)
      // send the referring page, not the cookie string
      + "&refer=" + escape(document.referrer)
      + "&session_id=" + this.getMonitorService().getSessionId()
      + "&context_id=" + this.getMonitorService().getContextId()
      + "&title=" + escape(document.title)
      + "&gmt=" + (new Date()).getTimezoneOffset()
    ;
  },
  handle_window_beforeunload : function(){
    // If the page is unloading before it loaded, save the data to a cookie
    if(!this.getDocumentRendered()){
      document.cookie = "BackupData=" + escape(this.getData());
      // make a note that the data was stored
      this.getStatus().data_sent = 1;
    }
  },
  handle_window_load : function(){
    // If there was backup data, send that data.
    if(this.getCookie("BackupData")){
      var oBackupBeacon = new Image();
      oBackupBeacon.src = this.getCookie("BackupData");
      this.clearCookie("BackupData");
    }
    // If data wasn't sent, send it and mark it as having been sent
    if(!this.getStatus().data_sent){
      var oBeacon = new Image();
      // getData is a member of the monitor, so it must be called through this
      oBeacon.src = this.getData();
      this.getStatus().data_sent = 1;
    }
  },
  // Required by MonitorService
  initializeMonitor : function(){
    // must return true for monitoring to be enabled;
    return 1;
  }
};
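The example above relies on getCookie and clearCookie members of the monitor object. For readers who want to see what such helpers involve, the following is a standalone sketch of reading a named value out of a cookie string and of building the string that expires it. The function names are hypothetical stand-ins, not the members used above, and the cookie string is passed in as a parameter so the logic can run outside a browser.

```javascript
// Finds a named value in a "name=value; name2=value2" cookie string.
// unescape() is used because the example stores the value with escape().
function readCookie(cookieString, name) {
  var parts = cookieString ? cookieString.split(";") : [];
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i].replace(/^\s+/, "");
    if (part.indexOf(name + "=") === 0) {
      return unescape(part.substring(name.length + 1));
    }
  }
  return null;
}

// Builds the string that, when assigned to document.cookie, removes the
// named cookie by giving it an expiry date in the past.
function expiredCookie(name) {
  return name + "=; expires=Thu, 01 Jan 1970 00:00:00 GMT";
}

// In a page these might be used as:
//   var backup = readCookie(document.cookie, "BackupData");
//   document.cookie = expiredCookie("BackupData");
```

Browsers limit the size of each cookie, so a long beacon URL may not survive this round trip intact.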
Appendix A
Web Page Passive Monitor Products