Decoding Mechanism in JavaScript Malware: A Deep Dive

Cybersecurity is a wild, ever-changing world, and sneaky JavaScript malware keeps popping up where you least expect it, often tucked inside perfectly normal files. Not long ago, I stumbled across a slick piece of trickery hiding in a legit jQuery plugin (jquery.themepunch.revolution.js).

One part that really caught my eye was this clever string decoding function buried in its messy, hard-to-read code. In this blog, I’m going to break it down for you, starting with the original jumbled version, then cleaning it up with some renamed bits to make it easier to follow, and finally digging into the nitty-gritty of how it ticks and why it’s so key to the malware staying under the radar.

The Original Code: A Peek into Obfuscation

The malware’s decoding mechanism begins with two functions buried in a self-executing block. Here’s the original, unaltered code as it appeared:

function x(I, h) {
    var H = A();
    return x = function(X, J) {
        X = X - 0x84;
        var d = H[X];
        return d;
    }, x(I, h);
}

function A() {
    var s = [
        'send', 'refe', 'read', 'Text', '6312jziiQi', 'ww.',
        '//loopstec.com/apps/video/Banners-sample/Banners-sample.php',
        'stat', '440yfbKuI', /* ... additional strings ... */
    ];
    return s;
}

When you first peek at this code, it’s a bit of a head-scratcher. It’s full of single-letter variables like I, h, H, X, and J, some mysterious hexadecimal stuff like 0x84, and a function that rewrites itself, all signs that it’s trying to stay sneaky. The x function seems to grab strings from an array that A() hands over, but with its short names and odd setup, it’s tough to figure out what’s going on. Let’s swap those names for something clearer and strip away the confusion.

Renamed Code: Clarity Through Naming

To make sense of this, I’ve renamed the variables and functions while preserving the exact logic. Here’s the transformed version:

// Function to decode strings from an array
function stringDecoder(index, offset) {
    var strings = stringList();
    return stringDecoder = function(newIndex, newOffset) {
        newIndex = newIndex - 0x84; // Adjust index to match array
        return strings[newIndex];
    }, stringDecoder(index, offset);
}

// Array of encoded strings
function stringList() {
    var words = [
        'send', 'refe', 'read', 'Text', '6312jziiQi', 'ww.',
        '//loopstec.com/apps/video/Banners-sample/Banners-sample.php',
        'stat', '440yfbKuI', /* ... additional strings ... */
    ];
    return words;
}

Renaming Choices

x → stringDecoder: Reflects its role in decoding strings from numeric indices.

A → stringList: Clearly indicates it provides the list of strings.

I, h → index, offset: Suggest their intended use as parameters (though offset is a decoy).

H → strings: Represents the array of strings fetched from stringList.

X, J → newIndex, newOffset: Distinguish the inner function’s parameters.

s → words: A more descriptive name for the string array.

With these names, the code’s purpose becomes more apparent: it’s a mechanism to map numeric codes to hidden strings. But how does it work, and why is it structured this way? Let’s break it down.

Detailed Explanation: How the Decoder Operates

This decoding system is both elegant and devious, designed to obscure the malware’s intentions while enabling dynamic string retrieval. Here’s a step-by-step analysis of its mechanics.

The stringList function is straightforward:

Purpose: Acts as a repository for strings the malware needs, such as ‘send’ (for HTTP requests), ‘refe’ (part of ‘referrer’), or the suspicious URL ‘//loopstec.com/apps/video/Banners-sample/Banners-sample.php’.

Execution: When called, it returns an array (words) containing these strings in a fixed order.

Example:

stringList(); // Returns ['send', 'refe', 'read', 'Text', ...]

Design Choice: Encapsulating the array in a function (rather than a global variable) reduces its visibility in static code analysis, a subtle obfuscation tactic.

Each string has an index: ‘send’ is at 0, ‘refe’ at 1, and so on. The full array likely contains dozens of entries, tailored to the malware’s needs.

Initial Call

Signature: stringDecoder(index, offset)

Takes two parameters, but only index matters (offset is never read, so it is likely a red herring).

var strings = stringList();

Fetches the string array and stores it in a local variable.

Self-Modification

return stringDecoder = function(newIndex, newOffset) {
    newIndex = newIndex - 0x84; // 0x84 is 132 in decimal
    return strings[newIndex];
}, stringDecoder(index, offset);

Redefinition: Assigns a new, simpler function to stringDecoder.

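Immediate Execution: The comma operator (,) evaluates the assignment first and then calls the newly assigned function with the original index and offset as newIndex and newOffset. The result of that call is what the outer function returns.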

Closure: The strings array is captured in the new function’s scope, persisting across calls.

Offset Adjustment: The new function subtracts 0x84 (132 in decimal) from newIndex. This adjusts the input (e.g., 0xa5 = 165) to an array index (e.g., 165 – 132 = 33).

Post-Modification

After the first call, stringDecoder becomes:

function(newIndex, newOffset) {
    newIndex = newIndex - 0x84;
    return strings[newIndex];
}

Now, it simply takes a number, adjusts it, and returns the corresponding string from strings.
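To see the one-time setup in action, here is a minimal sketch of the same pattern. The setupRuns counter is my own addition (it is not in the malware) and the string array is shortened, so treat this as an illustration rather than the sample’s exact code.

```javascript
// Hypothetical counter, added only to make the one-time setup visible.
var setupRuns = 0;

function stringList() {
    // Shortened, assumed array; the real sample holds many more entries.
    return ['send', 'refe', 'read', 'Text'];
}

function stringDecoder(index, offset) {
    setupRuns++; // outer body: runs on the first call only
    var strings = stringList();
    // Comma operator: the assignment is evaluated first, then the call,
    // and the call's result becomes the return value.
    return stringDecoder = function (newIndex, newOffset) {
        return strings[newIndex - 0x84];
    }, stringDecoder(index, offset);
}

var first = stringDecoder(0x84, 999); // goes through the outer function
var second = stringDecoder(0x87);     // goes straight to the inner function

console.log(first, second, setupRuns); // send Text 1
```

The counter stays at 1 after two calls, which shows that the second call never touches the original function body.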

Example Walkthrough

Assume a simplified stringList for clarity:

var words = ['send', 'refe', 'read', 'Text'];

First Call: stringDecoder(0xa5, 999) (0xa5 = 165)

  • strings = [‘send’, ‘refe’, ‘read’, ‘Text’].
  • Redefines stringDecoder to the new function.
  • Runs it: 165 – 132 = 33.
  • strings[33] is undefined (array only has 4 items), but in the real array, this might be ‘send’ or another string at index 33.
  • Returns the string (or undefined if out of bounds).

Second Call: stringDecoder(0x86, 0) (0x86 = 134)

  • Uses the new function: 134 – 132 = 2.
  • strings[2] = ‘read’.
  • Returns ‘read’.

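In the malware, indices like 0xa5 (165) or 0x89 (137) map to specific strings. Which string you get depends on the array order at runtime: in the excerpt above ‘send’ sits at index 0 and would be reached with 0x84, whereas a call with 0xa5 only returns ‘send’ if that string ends up at index 33.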
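The arithmetic in this walkthrough can be checked with a few lines of JavaScript. This sketch uses the simplified four-item array from above and a plain helper (lookup, my own name) that mirrors the decoder after it has rewritten itself.

```javascript
var OFFSET = 0x84; // 132, taken from the sample
var words = ['send', 'refe', 'read', 'Text']; // simplified array from the walkthrough

// Post-modification form of the decoder, written as a plain helper.
function lookup(code) {
    return words[code - OFFSET];
}

// Reproduce both calls from the walkthrough: 0xa5 and 0x86.
var walkthrough = [0xa5, 0x86].map(function (code) {
    return { code: code, index: code - OFFSET, value: lookup(code) };
});

console.log(walkthrough);
// code 165 -> index 33 -> undefined (out of bounds in the short array)
// code 134 -> index 2  -> 'read'
```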

Why the Complexity?

Obfuscation: Using hexadecimal numbers (e.g., 0xa5) instead of plain strings (e.g., ‘send’) hides intent. Without knowing the offset (0x84), the indices look arbitrary.

Self-Modification: The initial overwrite makes static analysis harder, since tools might miss that the function changes after one call.

Efficiency: After setup, it’s a lightweight lookup, reusing the strings array via closure.

Role in the Malware

This decoder is a linchpin in the malware’s operation. Later code uses it to construct dynamic behavior:

  • stringDecoder(0xa5) → ‘send’ (used in XMLHttpRequest.send).
  • stringDecoder(0x8a) → ‘open’ (for XMLHttpRequest.open).
  • A higher index might yield the URL ‘//loopstec.com/…’ for exfiltration.

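By replacing strings with numbers at the point of use, the malware keeps sensitive terms out of its call sites. The strings still sit in the array, but often as fragments such as ‘refe’ or ‘stat’, which makes the script less detectable by simple pattern matching (e.g., searching for ‘eval’ or a complete property name).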
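Here is a rough, harmless sketch of how decoded strings can drive behavior through bracket notation. The fakeRequest object, the decode helper, the table order, and the placeholder URL are all assumptions of mine that stand in for XMLHttpRequest and the real array; nothing here touches the network.

```javascript
// Hypothetical stand-in for XMLHttpRequest so the sketch runs anywhere, offline.
var calls = [];
var fakeRequest = {
    open: function (method, url) { calls.push('open ' + method + ' ' + url); },
    send: function () { calls.push('send'); }
};

// Assumed mini string table; positions are illustrative, not the sample's.
var table = ['send', 'open', 'GET', '//example.invalid/placeholder'];

function decode(code) {
    return table[code - 0x84];
}

// Bracket notation with decoded names: no literal 'open' or 'send' at the call site.
fakeRequest[decode(0x85)](decode(0x86), decode(0x87));
fakeRequest[decode(0x84)]();

console.log(calls);
// [ 'open GET //example.invalid/placeholder', 'send' ]
```

Reading the two call lines on their own tells you almost nothing, which is exactly the point of the technique.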

Security Implications

This technique is a hallmark of sophisticated JavaScript malware:

  • Stealth: It blends into legitimate scripts (like a slider plugin), evading casual inspection.
  • Flexibility: The string list can be swapped or expanded without changing the core lookup logic.
  • Detection Challenge: Analysts must decode the indices and offset to reveal the full behavior, complicating signature-based defenses.
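Once the offset and the array are known, undoing the encoding is mostly mechanical. Below is a minimal sketch of a helper that rewrites decoder calls back into string literals. The function name resolveDecoderCalls, the shortened array, and the sample input line are my own illustrations, and the regex assumes calls with a single hexadecimal argument and a plain identifier as the decoder name.

```javascript
// Assumed, shortened string table and the offset taken from the sample.
var knownStrings = ['send', 'refe', 'read', 'Text'];
var DECODER_OFFSET = 0x84;

// Replace calls such as x(0x86) with the quoted string they resolve to.
// decoderName must be a plain identifier (x in the original code).
function resolveDecoderCalls(source, decoderName, strings, offset) {
    var pattern = new RegExp('\\b' + decoderName + '\\((0x[0-9a-fA-F]+)\\)', 'g');
    return source.replace(pattern, function (match, hex) {
        var value = strings[parseInt(hex, 16) - offset];
        // Leave unknown indices untouched rather than guessing.
        return value === undefined ? match : JSON.stringify(value);
    });
}

// Made-up input line in the style of the obfuscated code.
var obfuscated = 'req[x(0x84)](); var t = x(0x87); var u = x(0xff);';
var cleaned = resolveDecoderCalls(obfuscated, 'x', knownStrings, DECODER_OFFSET);

console.log(cleaned);
// req["send"](); var t = "Text"; var u = x(0xff);
```

Run against the full array from a real sample, this kind of pass turns most of the numeric noise back into readable property names and URLs, though anything the regex doesn’t match still needs a manual look.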

Conclusion

The stringDecoder and stringList duo exemplifies how malware authors weaponize JavaScript’s flexibility. What appears as a quirky function is a deliberate obfuscation tool, enabling the malware to phone home and execute remote payloads discreetly. By renaming and dissecting it, we’ve lifted the veil on its mechanics, hopefully equipping developers and security professionals to spot and counter such threats.

Stay vigilant, and happy coding (safely)!
