[total_replay] Logic fails to distinguish between Git LFS pointers and actual log files #1133

@KNGP14

Description

The total_replay.py script and its associated helper utility total_replay/utility/utility_helper.py are designed to download attack data automatically if the required log files are not present locally. However, the current implementation fails when the repository has been cloned without a subsequent git lfs pull.

Root Cause

The helper currently checks for the existence of log files using a simple file existence check:

attack_datasets_full_path = os.path.join(os.path.expanduser(self.read_config_settings('attack_data_dir_path')), datasets_path)
if os.path.isfile(attack_datasets_full_path):
    ColorPrint.print_info_fg(f"[+][. INFO]: ... Attack data at: {attack_datasets_full_path} already exists. Download skipped.")
    return (attack_datasets_full_path, datasets_path)

When the repository is cloned without fetching LFS content, Git checks out "placeholder" files for LFS-managed assets. These files exist on the file system (so os.path.isfile() returns True), but they contain only LFS pointer metadata rather than the actual attack data.

Example of a pointer file content:

version https://git-lfs.github.com/spec/v1
oid sha256:01064c2f854d52d4b7ad86d5d71f9beb758319cb2f5c0648b5054376df15386a
size 3226
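To make the mismatch concrete, here is a minimal sketch that parses the pointer shown above. The function name parse_lfs_pointer is hypothetical and not part of total_replay; it only assumes that each pointer line has the form "key value", as in the example.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split the 'key value' lines of an LFS pointer into a dict.

    Hypothetical helper for illustration; not part of total_replay.
    """
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key and value:
            fields[key] = value
    return fields


# The pointer content from the example above.
pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:01064c2f854d52d4b7ad86d5d71f9beb758319cb2f5c0648b5054376df15386a\n"
    "size 3226\n"
)

fields = parse_lfs_pointer(pointer_text)
declared_size = int(fields["size"])              # size of the real log, per the pointer
pointer_size = len(pointer_text.encode("utf-8"))  # size of the placeholder on disk
```

The size field describes the real object that LFS would fetch, while the placeholder on disk holds only these three lines, so the two sizes differ whenever the real content has not been pulled.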

Impact

Because the file exists, the helper skips the download step. total_replay then reads the short LFS pointer text and sends it to Splunk instead of the actual raw logs. The replay reports success but contains no usable security events.

Suggested Fix

The file check should also verify whether the existing file is an LFS pointer rather than real attack data. This can be done by:

  1. Checking if the file size is unusually small (LFS pointers are typically < 200 bytes). On its own, this can misclassify a legitimately small log file.
  2. Or, more reliably, checking if the first line of the file starts with version https://git-lfs.github.com/spec/v1.
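The second option could be sketched roughly as follows. The names is_lfs_pointer and has_real_attack_data are hypothetical and do not exist in utility_helper.py; the sketch assumes the pointer header is the very first bytes of the file, as in the example above.

```python
import os

# Header that opens every pointer file in the example above.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: str) -> bool:
    """Return True if the file at `path` begins with the LFS pointer header.

    Hypothetical helper. Reads only the first few bytes, in binary mode,
    so large or non-text log files are handled cheaply.
    """
    try:
        with open(path, "rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
    except OSError:
        # Missing or unreadable file: not a pointer we can confirm.
        return False
    return head == LFS_POINTER_PREFIX


def has_real_attack_data(path: str) -> bool:
    """True only when the file exists and is not an LFS placeholder."""
    return os.path.isfile(path) and not is_lfs_pointer(path)
```

Under these assumptions, the helper could call has_real_attack_data(attack_datasets_full_path) in place of the bare os.path.isfile() check, so that a placeholder falls through to the download step instead of being skipped.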
