Description
The total_replay.py script and its associated helper utility total_replay/utility/utility_helper.py are designed to download attack data automatically if the required log files are not present locally. However, the current implementation fails when the repository has been cloned without a subsequent git lfs pull.
Root Cause
The helper currently decides whether the log files are already present with a plain os.path.isfile() check:
attack_data/total_replay/utility/utility_helper.py
Lines 293 to 296 in de713ff
attack_datasets_full_path = os.path.join(os.path.expanduser(self.read_config_settings('attack_data_dir_path')), datasets_path)
if os.path.isfile(attack_datasets_full_path):
    ColorPrint.print_info_fg(f"[+][. INFO]: ... Attack data at: {attack_datasets_full_path} already exists. Download skipped.")
    return (attack_datasets_full_path, datasets_path)
When the repository is cloned, Git creates "placeholder" files for LFS-managed assets. These files exist on the file system (making os.path.isfile() return True), but they only contain LFS pointer metadata rather than the actual attack data.
Example of a pointer file content:
version https://git-lfs.github.com/spec/v1
oid sha256:01064c2f854d52d4b7ad86d5d71f9beb758319cb2f5c0648b5054376df15386a
size 3226
Impact
Because the helper sees that a file exists, it skips the download step. Consequently, total_replay reads the small LFS pointer string and sends it to Splunk instead of the actual raw logs. This results in successful "replays" that contain no usable security events.
Suggested Fix
The file check should also verify that the file is not an LFS pointer. This can be done by:
- Checking whether the file size is unusually small (LFS pointers are typically < 200 bytes). On its own this is unreliable, because genuine log files can also be small: the pointer shown above refers to a real file of only 3226 bytes.
- Or, more reliably, checking whether the first line of the file starts with version https://git-lfs.github.com/spec/v1.
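A minimal sketch of how the two checks could be combined is shown below. The names is_lfs_pointer, has_real_attack_data, and LFS_POINTER_MAX_SIZE are hypothetical and not part of utility_helper.py, and the 1024-byte cap is an assumed conservative upper bound used only to avoid opening large files.

```python
import os

# First line of every Git LFS pointer file, as in the example above.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

# Assumption: a conservative size cap. Pointer files are tiny, so anything
# at or above this size is treated as real data without being opened.
LFS_POINTER_MAX_SIZE = 1024


def is_lfs_pointer(path):
    """Return True if `path` looks like a Git LFS pointer, not real data."""
    try:
        if os.path.getsize(path) >= LFS_POINTER_MAX_SIZE:
            return False
        with open(path, "rb") as fh:
            # Compare only the leading bytes against the pointer signature.
            return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX
    except OSError:
        # Missing or unreadable file: it is not a pointer we can identify.
        return False


def has_real_attack_data(path):
    """Hypothetical replacement for the bare os.path.isfile() check."""
    return os.path.isfile(path) and not is_lfs_pointer(path)
```

In the helper, the condition `if os.path.isfile(attack_datasets_full_path):` could then become `if has_real_attack_data(attack_datasets_full_path):`, so that a pointer file falls through to the existing download path instead of being replayed.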