URL Normalization

Introduction

The URL formats used by sites often include dynamic or variable path components, such as identifiers that change from one user or session to the next. These components are unfavorable for analytics because the same logical page is reported under many different URLs. The Web SDK's URL Normalization feature can remove or replace the dynamic components. To use the feature, define a custom function in the configuration. The Web SDK invokes this function with the URL, path, or hash fragment value, along with the JSON message type for which the normalization is being performed, and uses the function's return value when logging the message.

Some normalization techniques include:

  • removing components
  • masking component values
  • replacing targeted (dynamic) text
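
As a rough illustration of the three techniques listed above, the following sketch defines one small helper per technique. The helper names and the URL shapes they match (a session segment, a numeric user id, an ISO-style date) are hypothetical and not part of the Web SDK; adapt the patterns to your own application's URLs.

```javascript
// Hypothetical helpers, one per technique. Names and URL shapes are illustrative only.

// Removing a component: drop a session segment such as "/session-8f3a2c".
function removeSessionSegment(path) {
    return path.replace(/\/session-[0-9a-f]+/, "");
}

// Masking a component value: hide the numeric id that follows "/users/".
function maskUserId(path) {
    return path.replace(/\/users\/\d+/, "/users/XXX");
}

// Replacing targeted dynamic text: collapse dates such as "2024-05-17" into a fixed token.
function replaceDate(path) {
    return path.replace(/\d{4}-\d{2}-\d{2}/g, "DATE");
}
```

Any of these helpers could be called from inside the configured normalization function, alone or chained together.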

Configuration

The URL Normalization function is specified in the core configuration. The default implementation returns the same value that is passed to it, so you should provide an implementation that is specific to your application. In most cases the normalization should be performed regardless of where the normalized value is set in the Web SDK JSON message. Sometimes, however, a normalization rule applies only to a particular message type. In such cases, make the replacement rule conditional on the messageType value. An undefined messageType indicates that the normalization function is being invoked for the JSON wrapper (e.g. the page URL in the webEnvironment section) instead of a specific message type.

🚧

Perform adequate testing of the normalization function against all expected URL, path, and fragment inputs to ensure the expected value is being returned in all cases.

core: {
    normalization: {
        /**
         * User-defined URL normalization function which accepts a URL, path, or fragment
         * and returns the normalized value.
         * @param {String} urlOrPath URL, path or fragment which needs to be normalized.
         * @param {Integer} [messageType] Indicates the message type for which the normalization
         * is being performed, undefined otherwise.
         * @returns {String} The normalized URL/path/fragment value.
         */
        urlFunction: function (urlOrPath, messageType) {
            // Normalize the input URL or path here.
            return urlOrPath;
        }
    }
}

Sample implementations

urlFunction: function (url) {
    // Replace "abcd/pqrs123" with "abcd/pqrs"
    return url.replace(/pqrs\d\d\d/, "pqrs");
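}

urlFunction: function (url, messageType) {
    // Apply the rule only to messages of type 2; all other values pass through unchanged.
    var retValue = url;
    if (messageType === 2) {
        // Replace "param123" with "paramXXX"
        retValue = url.replace(/param\d\d\d/, "paramXXX");
    }
    return retValue;
}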
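
The Configuration section notes that an undefined messageType means the function was invoked for the JSON wrapper rather than for a specific message. A minimal sketch of handling that case is shown below. The function name normalizeUrl, the "/orders/<id>" URL shape, and the trailing-slash rule are assumptions chosen for illustration, not behavior defined by the Web SDK.

```javascript
// Sketch only: the name, URL shape, and rules below are hypothetical.
function normalizeUrl(urlOrPath, messageType) {
    // Rule applied to every value: mask numeric order ids.
    var result = urlOrPath.replace(/\/orders\/\d+/, "/orders/XXX");

    if (messageType === undefined) {
        // Wrapper-only rule: drop a trailing slash from the page URL.
        result = result.replace(/\/$/, "");
    }
    return result;
}
```

The body of normalizeUrl could be used as the urlFunction implementation. Keeping the shared rule outside the conditional means that wrapper values and message values are masked consistently.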

The following example shows how to convert a query string into path segments so that its contents are preserved as part of the URL path.

📘

Note:

Overstat and Replay strip any query string from the URL. To preserve the query string contents, replace its separators with a slash, underscore, or other delimiter.

urlFunction: function (url) {
    // abc.com?user=john&age=30  ->  abc.com/user/john/age/30
    // The global regex replaces every "&" and "=", not just the first match.
    return url.replace("?", "/").replace(/[&=]/g, "/");
}
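
The testing advice in the Configuration section can be followed with a small table-driven check run outside the Web SDK. The sketch below is one possible approach: findFailures and the sample cases are hypothetical, and normalize stands in for whatever function you intend to assign to urlFunction.

```javascript
// Candidate normalization function under test (same shape as urlFunction).
function normalize(url) {
    return url.replace("?", "/").replace(/[&=]/g, "/");
}

// Runs each [input, expected] pair through fn and collects any mismatches.
function findFailures(fn, cases) {
    var failures = [];
    cases.forEach(function (testCase) {
        var actual = fn(testCase[0]);
        if (actual !== testCase[1]) {
            failures.push({ input: testCase[0], expected: testCase[1], actual: actual });
        }
    });
    return failures;
}

// Illustrative inputs; replace these with the URLs, paths, and fragments your site produces.
var failures = findFailures(normalize, [
    ["abc.com?user=john&age=30", "abc.com/user/john/age/30"],
    ["abc.com?a=1&b=2&c=3", "abc.com/a/1/b/2/c/3"],
    ["abc.com/plain/path", "abc.com/plain/path"]
]);

console.log(failures.length === 0 ? "all cases passed" : failures);
```

Listing inputs that should pass through unchanged alongside those that should be rewritten helps catch rules that match more than intended.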