Supercharging Malware Analysis in Binary Ninja: Automated String IOC Lookup with VirusTotal

Malware analysis is often a race against time. Analysts meticulously dissect malicious code, searching for clues, indicators of compromise (IOCs), and a deeper understanding of the threat’s capabilities. One common, yet sometimes tedious, task is sifting through strings extracted from a binary and checking them against threat intelligence sources. What if this process could be streamlined directly within your favorite reverse engineering platform?

Enter a powerful Binary Ninja plugin designed to do just that! This plugin enhances the string analysis workflow by integrating directly with VirusTotal, allowing analysts to quickly identify known malicious strings and gather crucial context without ever leaving their Binary Ninja workspace.

What Does This Plugin Do?

At its core, this Binary Ninja plugin automates the process of:

  1. Comprehensive String Collection: It gathers all identified strings from the binary currently being analyzed.
  2. Organized Tabular Display: Instead of just a flat list, it presents these strings in a new, sortable table within the Binary Ninja interface. This table includes:
    • Address: The memory address where the string is located.
    • String Value: The actual content of the string.
    • Type: The string's encoding as reported by Binary Ninja (e.g., AsciiString, Utf8String, Utf16String).
    • Length: The length of the string, with the table initially sorted to show the longest strings first (often more unique and interesting).
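  3. Integrated VirusTotal Lookup: This is where the magic happens. From the table's right-click menu, analysts can check a single string or all strings against VirusTotal. The results fill three additional columns: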
    • VirusTotal Info: A general summary, indicating if the string was found in a file categorized as “Malicious” or “Suspicious” by VirusTotal, along with potential threat labels or associated file names.
    • AV Detections: A comma-separated list of antivirus engine detections for the first malicious file found containing the string (e.g., “Kaspersky (Trojan.Generic), Malwarebytes (Agent.Tesla)”). This provides quick insight into how different vendors classify the threat.
    • Sample Hashes: Displays up to three SHA256 hashes of different malicious/suspicious files that VirusTotal found containing the searched string. This allows analysts to pivot to these known samples for further investigation.
  4. User-Friendly Features:
    • Text Wrapping: Long AV detection strings and sample hashes are wrapped for better readability.
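    • Minimum String Length Filter: Avoids noisy lookups for very short, common strings (by default, strings shorter than four characters are skipped).
    • API Key Configuration: Users supply their own VirusTotal API key through the VT_API_KEY variable at the top of the script.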
    • Dependency Checks: Warns if the required vt-py library is missing.
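
The way search hits are condensed into the three VirusTotal columns can be sketched on plain dictionaries. This is a simplified illustration, not the plugin's code: the function name summarize_hits and the dictionary-based records are assumptions made so the example runs without vt-py, and the file-name fallback used by the script is omitted. The field names (last_analysis_stats, last_analysis_results, popular_threat_classification) are the ones the script reads.

```python
from typing import List, Tuple

MAX_SAMPLE_HASHES = 3  # mirrors the "up to three hashes" behaviour described above


def summarize_hits(files: List[dict]) -> Tuple[str, str, str]:
    """Return (info, av_detections, sample_hashes) for a list of file records."""
    info = "Not found in malicious files"
    detections = ""
    hashes: List[str] = []
    first_seen = False

    for record in files:
        stats = record.get("last_analysis_stats", {})
        malicious = stats.get("malicious", 0) > 0
        suspicious = stats.get("suspicious", 0) > 0
        if not (malicious or suspicious):
            continue  # clean files contribute nothing to the table

        sha256 = record.get("id")
        if sha256 and sha256 not in hashes and len(hashes) < MAX_SAMPLE_HASHES:
            hashes.append(sha256)

        if not first_seen:
            # Only the first flagged file feeds the summary and detection columns.
            verdict = "Malicious" if malicious else "Suspicious"
            classification = record.get("popular_threat_classification") or {}
            label = classification.get("suggested_threat_label")
            name = f"Label: {label}" if label else f"File ID: {sha256 or 'Unknown'}"
            info = f"{verdict} ({name})"
            detections = ", ".join(
                f"{engine} ({res['result']})"
                for engine, res in record.get("last_analysis_results", {}).items()
                if res.get("category") == "malicious" and res.get("result")
            )
            first_seen = True

    return info, detections, ", ".join(hashes)
```

Only the first flagged file drives the summary and detection text, while hashes keep accumulating from later hits, which is why the hash column can list samples that the other two columns do not describe.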

The Benefits for Malware Analysts: Speed, Context, and Efficiency

This plugin offers several significant advantages for malware analysts:

  1. Accelerated IOC Discovery: Manually copying strings and searching them on VirusTotal is time-consuming. Automating this directly within Binary Ninja drastically speeds up the process of identifying potential IOCs. A string that might be a C2 server, a unique malware artifact, or a known malicious command can be flagged almost instantly.
  2. Immediate Contextualization: Seeing VirusTotal results alongside the string, its address, and other binary information provides immediate context. Is this string part of a known malware family? What do AV engines call it? This helps analysts quickly prioritize which strings and code sections warrant deeper investigation.
  3. Reduced Context Switching: By keeping the analysis workflow within Binary Ninja, the plugin minimizes the need to switch between different tools and browser tabs. This helps maintain focus and improves overall efficiency.
  4. Hint Generation and Pivoting:
    • The “AV Detections” column can provide immediate hints about the type of malware or its capabilities (e.g., “Downloader,” “Spyware,” “Ransomware”).
    • The “Sample Hashes” column offers direct pivot points. Analysts can take these hashes and look them up on VirusTotal or other platforms to find related samples, C2 infrastructure, or detailed analysis reports. This is invaluable for campaign tracking and understanding threat evolution.
  5. Prioritization of Interesting Strings: While the table initially sorts by length, the VirusTotal information helps analysts quickly identify strings that are actually interesting from a threat perspective, regardless of their length or perceived uniqueness at first glance.
  6. Streamlined Reporting: Having relevant IOCs and their associated threat intelligence readily available can simplify the process of compiling analysis reports.

Getting More Hints, Faster

Imagine analyzing a new, unknown binary. You open the strings view, and with a couple of clicks, the plugin starts enriching this data. Suddenly, a seemingly innocuous string lights up with “Malicious (Label: Emotet)” and a list of AV detections. Another string provides hashes of known Cobalt Strike beacons. These are not just strings anymore; they are actionable intelligence points, guiding your next steps in the reverse engineering process.

This plugin transforms the static list of strings into a dynamic source of leads. It helps analysts to:

  • Quickly identify if the malware uses known malicious infrastructure or commands.
  • Link the current sample to known threat actors or malware families.
  • Discover related malicious files for broader campaign analysis.
  • Focus reverse engineering efforts on code sections associated with high-confidence IOCs.
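
Pivoting from the "Sample Hashes" column could look roughly like the sketch below. It is not part of the plugin: describe_samples and FakeClient are hypothetical names, and the fake client exists only so the example runs offline. The one vt-py call it relies on is Client.get_object with a /files/<hash> path, which returns an object exposing the report fields as attributes.

```python
from types import SimpleNamespace
from typing import Iterable, List


def describe_samples(client, sha256_hashes: Iterable[str]) -> List[str]:
    """Fetch each file report and return one summary line per hash."""
    lines = []
    for sha256 in sha256_hashes:
        # With a real vt.Client this performs one API request per hash.
        report = client.get_object(f"/files/{sha256}")
        stats = getattr(report, "last_analysis_stats", None) or {}
        name = getattr(report, "meaningful_name", None) or "unnamed"
        lines.append(f"{sha256}: {name}, {stats.get('malicious', 0)} malicious detections")
    return lines


class FakeClient:
    """Stand-in for vt.Client so the sketch runs without network access."""

    def get_object(self, path):
        return SimpleNamespace(
            meaningful_name="sample.exe",
            last_analysis_stats={"malicious": 42},
        )


if __name__ == "__main__":
    for line in describe_samples(FakeClient(), ["abc123"]):
        print(line)
```

In real use, the fake would be replaced by a vt.Client created with your API key and closed when done. Each hash costs one request against your quota.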

Conclusion

In the fast-paced world of malware analysis, efficiency and speed are paramount. The “Strings Viewer with VirusTotal Lookup” plugin for Binary Ninja is a testament to how targeted automation can significantly enhance an analyst’s capabilities. By bringing crucial threat intelligence directly into the reverse engineering environment, it empowers analysts to uncover more hints, gain context faster, and ultimately, stay ahead of evolving threats.

If you’re a Binary Ninja user, integrating this plugin (or one with similar functionality) into your toolkit is a highly recommended step towards a more efficient and insightful malware analysis workflow.

To use this plugin, you’ll typically need:

  • Binary Ninja.
  • The plugin script itself.
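  • A VirusTotal API key (ensure you adhere to their terms of service). The script queries the /intelligence/search endpoint with a content: search, which generally requires a VirusTotal Intelligence (premium) key rather than a free public one.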
  • The vt-py Python library installed in your Binary Ninja Python environment.
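
Before loading the plugin, the requirements above can be verified with a small preflight check such as the following sketch. The published script hard-codes the key in its VT_API_KEY variable; reading it from an environment variable of the same name is an assumption made here for illustration, as is the preflight function itself.

```python
import importlib.util
import os
from typing import List, Mapping, Optional

PLACEHOLDER_KEY = "YOUR_VIRUSTOTAL_API_KEY"


def preflight(env: Optional[Mapping[str, str]] = None, module_name: str = "vt") -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        problems.append(f"Python module '{module_name}' is missing")
    key = env.get("VT_API_KEY", "")  # hypothetical environment variable name
    if not key or key == PLACEHOLDER_KEY:
        problems.append("VT_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("Problem:", problem)
```

Run it with the same Python interpreter that Binary Ninja uses, since vt-py installed into a different environment will not be visible to the plugin.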

The full plugin script follows.

from binaryninja import *
from binaryninja import interaction # Corrected import for MessageBox
from binaryninjaui import UIContext
from PySide6 import QtWidgets, QtCore, QtGui # QtGui for QCursor
import json # For parsing JSON responses and debug printing
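try:
    import vt # For official VirusTotal API client
except ImportError:
    vt = None # Keep the plugin loadable; the launcher reports the missing dependency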
import re # For escaping special characters

# --- Configuration ---
# IMPORTANT: Replace "YOUR_VIRUSTOTAL_API_KEY" with your actual VirusTotal API key
VT_API_KEY = "YOUR_VIRUSTOTAL_API_KEY"
MIN_STRING_LENGTH_FOR_VT_LOOKUP = 4 # Minimum length of string to query on VT
CHECK_ALL_WARNING_THRESHOLD = 20 # Warn user if checking more than this many strings at once
MAX_SAMPLE_HASHES_TO_DISPLAY = 3

# --- Custom TableWidgetItem for Numerical Sorting ---
class NumericTableWidgetItem(QtWidgets.QTableWidgetItem):
    def __init__(self, numeric_value):
        super().__init__(str(numeric_value))
        self.numeric_value = numeric_value

    def __lt__(self, other_item):
        if isinstance(other_item, NumericTableWidgetItem):
            return self.numeric_value < other_item.numeric_value
        return super().__lt__(other_item)

# --- Worker Thread for VirusTotal Lookup ---
class VTLookupThread(QtCore.QThread):
    # Signal: row_index, general_threat_info, av_detections_string, sample_hashes_string, original_string_value
    result_ready = QtCore.Signal(int, str, str, str, str)

    def __init__(self, api_key, search_string, row_index, parent=None):
        super().__init__(parent)
        self.api_key = api_key
        self.search_string = search_string
        self.row_index = row_index

    def _escape_vt_search_string(self, value):
        """Escapes special characters for VirusTotal content search queries."""
        value = value.replace('\\', '\\\\') 
        value = value.replace('"', '\\"')   
        special_chars = r'[+\-&|!(){}\[\]^~*?:/]'
        def escape_match(match):
            return '\\' + match.group(0)
        escaped_value = re.sub(special_chars, escape_match, value)
        return escaped_value

    def run(self):
        if not self.api_key or self.api_key == "YOUR_VIRUSTOTAL_API_KEY":
            self.result_ready.emit(self.row_index, "VT API Key Missing", "", "", self.search_string)
            return

        if len(self.search_string) < MIN_STRING_LENGTH_FOR_VT_LOOKUP:
            self.result_ready.emit(self.row_index, f"String too short (min {MIN_STRING_LENGTH_FOR_VT_LOOKUP})", "", "", self.search_string)
            return
        
        escaped_search_string = self._escape_vt_search_string(self.search_string)
        client = None
        query = "" 
        try:
            client = vt.Client(self.api_key)
            query = f'content:"{escaped_search_string}"' 
            it = client.iterator("/intelligence/search", params={"query": query}, limit=10) # Fetch a bit more to find 3 malicious samples
            
            threat_info_final = "Not found in malicious files" 
            av_detections_final_string = ""
            sample_hashes_list = []
            first_malicious_file_processed = False # To get general info from the first hit
            
            for file_obj in it: 
                is_malicious = False
                is_suspicious = False
                if hasattr(file_obj, 'last_analysis_stats'):
                    is_malicious = file_obj.last_analysis_stats.get('malicious', 0) > 0
                    is_suspicious = file_obj.last_analysis_stats.get('suspicious', 0) > 0
                
                if is_malicious or is_suspicious:
                    # Collect sample hash if we need more
                    if len(sample_hashes_list) < MAX_SAMPLE_HASHES_TO_DISPLAY and hasattr(file_obj, 'id'):
                        if file_obj.id not in sample_hashes_list: # Avoid duplicates
                             sample_hashes_list.append(file_obj.id)

                    # Process details from the *first* malicious/suspicious file encountered
                    if not first_malicious_file_processed:
                        prefix = "Malicious" if is_malicious else "Suspicious"
                        extracted_general_names = []
                        av_detections_list = []

                        if hasattr(file_obj, 'last_analysis_results'):
                            for engine, result_data in file_obj.last_analysis_results.items():
                                if result_data.get("category") == "malicious" and result_data.get("result"):
                                    av_detections_list.append(f"{engine} ({result_data['result']})")
                        if av_detections_list:
                            av_detections_final_string = ", ".join(av_detections_list)

                        if hasattr(file_obj, 'popular_threat_classification') and \
                           file_obj.popular_threat_classification and \
                           file_obj.popular_threat_classification.get('suggested_threat_label'):
                            extracted_general_names.append(f"Label: {file_obj.popular_threat_classification['suggested_threat_label']}")
                        
                        if not extracted_general_names: # Fallback for general name
                            file_name_to_use = None
                            if hasattr(file_obj, 'meaningful_name') and file_obj.meaningful_name:
                                file_name_to_use = file_obj.meaningful_name
                            elif hasattr(file_obj, 'names') and file_obj.names:
                                file_name_to_use = file_obj.names[0]
                            
                            if file_name_to_use:
                                 extracted_general_names.append(f"File: {file_name_to_use}")
                            elif hasattr(file_obj, 'id'): 
                                extracted_general_names.append(f"File ID: {file_obj.id}")
                            else:
                                extracted_general_names.append("File ID: Unknown")
                        
                        threat_info_final = f"{prefix} ({'; '.join(extracted_general_names)})"
                        first_malicious_file_processed = True
                    
                    # If we have enough hashes and processed the first malicious file, we can stop early
                    if first_malicious_file_processed and len(sample_hashes_list) >= MAX_SAMPLE_HASHES_TO_DISPLAY:
                        break
            
            sample_hashes_final_string = ", ".join(sample_hashes_list)
            self.result_ready.emit(self.row_index, threat_info_final, av_detections_final_string, sample_hashes_final_string, self.search_string)

        except vt.error.APIError as e:
            error_message = f"VT API Error: {e.code}"
            if e.code == "AuthenticationError": error_message = "VT Auth Error"
            elif e.code == "QuotaExceededError": error_message = "VT Quota Exceeded"
            elif e.code == "PermissionError": error_message = "VT Permission Denied"
            elif e.code == "InvalidArgumentError": error_message = "VT Invalid Argument"
            log_error(f"VirusTotal API Error for string '{self.search_string}': {e.message} (Code: {e.code}) Query: {query}")
            self.result_ready.emit(self.row_index, error_message, "", "", self.search_string)
        except Exception as e:
            log_error(f"Unexpected error during VT lookup for '{self.search_string}': {e}")
            self.result_ready.emit(self.row_index, f"Error: {type(e).__name__}", "", "", self.search_string)
        finally:
            if client:
                client.close()

# --- Main Plugin Logic ---
persistent_widgets = {}
active_threads = {} 

class StringsTableView:
    COL_ADDRESS = 0; COL_STRING = 1; COL_TYPE = 2; COL_LENGTH = 3
    COL_VT_INFO_GENERAL = 4; COL_VT_AV_DETECTIONS = 5; COL_VT_SAMPLE_HASHES = 6
    TOTAL_COLUMNS = 7

    def __init__(self, bv):
        self.bv = bv
        self.table_widget = None
        self.vt_api_key = VT_API_KEY 
        self._setup_ui()

    def _setup_ui(self):
        strings_data = []
        for s in self.bv.strings:
            strings_data.append((hex(s.start), s.value, s.type.name, len(s.value), "", "", "")) # Trailing empty strings are placeholders for the three VT columns

        if not strings_data:
            interaction.show_message_box("No Strings Found", "No strings were identified in the current binary.")
            return

        self.table_widget = QtWidgets.QTableWidget(len(strings_data), self.TOTAL_COLUMNS) 
        self.table_widget.setHorizontalHeaderLabels([
            "Address", "String", "Type", "Length", 
            "VirusTotal Info", "AV Detections", "Sample Hashes"
        ])
        
        for r, (addr, s_val, s_type, s_len, vt_gen, vt_av, vt_hash) in enumerate(strings_data):
            self.table_widget.setItem(r, self.COL_ADDRESS, QtWidgets.QTableWidgetItem(addr))
            self.table_widget.setItem(r, self.COL_STRING, QtWidgets.QTableWidgetItem(s_val))
            self.table_widget.setItem(r, self.COL_TYPE, QtWidgets.QTableWidgetItem(s_type))
            self.table_widget.setItem(r, self.COL_LENGTH, NumericTableWidgetItem(s_len))
            self.table_widget.setItem(r, self.COL_VT_INFO_GENERAL, QtWidgets.QTableWidgetItem(vt_gen))
            self.table_widget.setItem(r, self.COL_VT_AV_DETECTIONS, QtWidgets.QTableWidgetItem(vt_av))
            self.table_widget.setItem(r, self.COL_VT_SAMPLE_HASHES, QtWidgets.QTableWidgetItem(vt_hash))


        self.table_widget.setWordWrap(True)
        self.table_widget.resizeColumnsToContents()
        self.table_widget.setColumnWidth(self.COL_STRING, 200) 
        self.table_widget.setColumnWidth(self.COL_VT_INFO_GENERAL, 180) 
        self.table_widget.setColumnWidth(self.COL_VT_AV_DETECTIONS, 250) 
        self.table_widget.setColumnWidth(self.COL_VT_SAMPLE_HASHES, 280)
        self.table_widget.resizeRowsToContents() 

        self.table_widget.setEditTriggers(QtWidgets.QAbstractItemView.EditTrigger.NoEditTriggers)
        self.table_widget.setSortingEnabled(True)
        self.table_widget.sortByColumn(self.COL_LENGTH, QtCore.Qt.SortOrder.DescendingOrder)
        self.table_widget.setSelectionBehavior(QtWidgets.QAbstractItemView.SelectionBehavior.SelectRows)
        self.table_widget.setContextMenuPolicy(QtCore.Qt.ContextMenuPolicy.CustomContextMenu)
        self.table_widget.customContextMenuRequested.connect(self._show_context_menu)

        container_widget = QtWidgets.QWidget()
        layout = QtWidgets.QVBoxLayout(container_widget); layout.addWidget(self.table_widget)
        
        ui_context = UIContext.activeContext()
        if ui_context:
            widget_id_key = str(id(container_widget))
            persistent_widgets[widget_id_key] = container_widget
            container_widget.destroyed.connect(lambda obj=None, key=widget_id_key: persistent_widgets.pop(key, None))
            ui_context.createTabForWidget("Strings Table (VT)", container_widget)
        else: 
            self.standalone_window = container_widget 
            container_widget.setWindowTitle("Strings Table (VT Lookup)"); container_widget.show()

    def _show_context_menu(self, position):
        if not self.table_widget: return
        menu = QtWidgets.QMenu()
        
        item_under_cursor = self.table_widget.itemAt(position)
        if item_under_cursor:
            row = item_under_cursor.row()
            string_item = self.table_widget.item(row, self.COL_STRING)
            if string_item:
                selected_string = string_item.text()
                display_s = selected_string[:30] + "..." if len(selected_string) > 30 else selected_string
                vt_action_single = menu.addAction(f"Check \"{display_s}\" on VirusTotal")
                vt_action_single.triggered.connect(lambda checked=False, r=row, s=selected_string: self._check_string_on_vt(r, s))
        
        if self.table_widget.rowCount() > 0:
            if menu.actions(): menu.addSeparator()
            check_all_action = menu.addAction("Check All Strings on VirusTotal")
            check_all_action.triggered.connect(self._check_all_strings_on_vt)
            
        if menu.actions(): menu.exec(self.table_widget.mapToGlobal(position))

    def _check_all_strings_on_vt(self):
        if not self.table_widget: return
        num_rows = self.table_widget.rowCount()
        if num_rows == 0:
            interaction.show_message_box("Check All Strings", "No strings in the table to check.")
            return

        if num_rows > CHECK_ALL_WARNING_THRESHOLD:
            confirm = interaction.show_message_box(
                "Confirm Check All Strings",
                f"You are about to check {num_rows} strings on VirusTotal. This may hit API rate limits.",
                buttons=interaction.MessageBoxButtonSet.YesNoButtonSet, 
                icon=interaction.MessageBoxIcon.WarningIcon 
            )
            if confirm != interaction.MessageBoxButtonResult.YesButton: return
        
        log_info(f"Starting 'Check All Strings' for {num_rows} strings.")
        for row_idx in range(num_rows):
            string_item = self.table_widget.item(row_idx, self.COL_STRING)
            if string_item:
                string_to_check = string_item.text()
                self.table_widget.setItem(row_idx, self.COL_VT_INFO_GENERAL, QtWidgets.QTableWidgetItem("Queued..."))
                self.table_widget.setItem(row_idx, self.COL_VT_AV_DETECTIONS, QtWidgets.QTableWidgetItem(""))
                self.table_widget.setItem(row_idx, self.COL_VT_SAMPLE_HASHES, QtWidgets.QTableWidgetItem(""))
                self._check_string_on_vt(row_idx, string_to_check, is_bulk_check=True)

    def _check_string_on_vt(self, row_index, string_to_check, is_bulk_check=False):
        if not self.table_widget: return

        current_status_item = self.table_widget.item(row_index, self.COL_VT_INFO_GENERAL)
        # Only update to "Checking..." if not already being processed or queued by another action
        if not current_status_item or current_status_item.text() not in ["Checking...", "Queued..."] or is_bulk_check:
            self.table_widget.setItem(row_index, self.COL_VT_INFO_GENERAL, QtWidgets.QTableWidgetItem("Checking..."))
            if not is_bulk_check: # For single checks, also clear/set other VT columns
                 self.table_widget.setItem(row_index, self.COL_VT_AV_DETECTIONS, QtWidgets.QTableWidgetItem("Checking..."))
                 self.table_widget.setItem(row_index, self.COL_VT_SAMPLE_HASHES, QtWidgets.QTableWidgetItem("Checking..."))

        thread_id = f"vt_thread_{row_index}_{QtCore.QTime.currentTime().toString('hhmmsszzz')}"
        thread = VTLookupThread(self.vt_api_key, string_to_check, row_index)
        thread.result_ready.connect(self._on_vt_result)
        thread.finished.connect(lambda tid=thread_id: self._thread_finished(tid)) 
        
        active_threads[thread_id] = thread 
        thread.start()

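    def _on_vt_result(self, row_index, general_threat_info, av_detections_string, sample_hashes_string, original_string):
        if not self.table_widget: return
        # The user may have re-sorted the table while the lookup ran, so confirm that
        # row_index still holds the queried string and otherwise search for its row.
        item = self.table_widget.item(row_index, self.COL_STRING) if row_index < self.table_widget.rowCount() else None
        if item is None or item.text() != original_string:
            matches = [i for i in self.table_widget.findItems(original_string, QtCore.Qt.MatchFlag.MatchExactly) if i.column() == self.COL_STRING]
            if not matches:
                log_warn(f"No row found for '{original_string}' when trying to update VT result.")
                return
            row_index = matches[0].row()
        # Pause sorting so that writing into a sorted column cannot move the row mid-update
        self.table_widget.setSortingEnabled(False)
        self.table_widget.setItem(row_index, self.COL_VT_INFO_GENERAL, QtWidgets.QTableWidgetItem(general_threat_info))
        self.table_widget.setItem(row_index, self.COL_VT_AV_DETECTIONS, QtWidgets.QTableWidgetItem(av_detections_string))
        self.table_widget.setItem(row_index, self.COL_VT_SAMPLE_HASHES, QtWidgets.QTableWidgetItem(sample_hashes_string))
        self.table_widget.resizeRowToContents(row_index)
        self.table_widget.setSortingEnabled(True)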

    def _thread_finished(self, thread_id):
        if thread_id in active_threads:
            del active_threads[thread_id]

# --- Plugin Registration ---
current_strings_view_instance = None
def launch_strings_viewer_with_vt(bv):
    global current_strings_view_instance
    if VT_API_KEY == "YOUR_VIRUSTOTAL_API_KEY":
        interaction.show_message_box("VirusTotal API Key Needed", 
                         "Please set your VirusTotal API key in the plugin script (VT_API_KEY variable).")
    try: import vt
    except ImportError:
        interaction.show_message_box("Dependency Missing", 
                         "The 'vt-py' library is required. Install via pip: pip install vt-py")
        return 
    current_strings_view_instance = StringsTableView(bv)

PluginCommand.register(
    "Step 6. View All Strings in Table (with VirusTotal Lookup)",
    "Collects strings, displays them, and allows VirusTotal lookup (uses vt-py).",
    launch_strings_viewer_with_vt
)
log_info("Plugin 'View All Strings (VT Lookup)' loaded. Set VT_API_KEY & install 'vt-py'.")
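
The "Check All Strings" action in the script starts one lookup thread per table row, which is why it warns about API rate limits. One possible refinement, which is not part of the published plugin, is to skip short strings, look up each distinct string only once, and cap the number of concurrent requests. The sketch below shows the idea with the standard library only; check_strings and its lookup parameter are hypothetical, and inside Binary Ninja the same approach would likely be built on Qt's own thread pool instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable

MIN_LENGTH = 4  # same threshold as MIN_STRING_LENGTH_FOR_VT_LOOKUP in the script


def check_strings(
    strings: Iterable[str],
    lookup: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run lookup() once per unique, long-enough string with a bounded pool."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(s for s in strings if len(s) >= MIN_LENGTH))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lookup, unique))
    return dict(zip(unique, results))


if __name__ == "__main__":
    # A stand-in lookup; a real one would query VirusTotal.
    demo = check_strings(["cmd.exe", "ab", "cmd.exe"], lambda s: f"checked {s}")
    print(demo)
```

Because results are keyed by string rather than by row, every row holding the same string can be updated from a single request.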
