Investigate gh-pages impact on full clone size and time #2197

@rwgk

Description

A controlled local file:// clone experiment suggests that the gh-pages branch substantially inflates the cost of a normal full clone of NVIDIA/cuda-python.

Measured clone times:

  • With gh-pages: 47.44 seconds
  • Without gh-pages: 1.79 seconds

Measured size estimates from the same experiment:

  • With gh-pages: .git size about 1.2G
  • Without gh-pages: .git size about 38M
  • Estimated reduction: about 1.19 GB by filesystem size, or about 1.17 GB by Git object disk-usage accounting
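
To put the local clone times in proportion, the arithmetic can be sketched as below. It uses only the two timings listed above and assumes the entire difference between the two runs is due to gh-pages, which is what the controlled experiment is meant to isolate. The variable names are illustrative.

```python
# Clone times from the local file:// experiment, in seconds.
with_gh_pages_s = 47.44
without_gh_pages_s = 1.79

# How many times slower the full clone is when gh-pages is included.
slowdown = with_gh_pages_s / without_gh_pages_s

# Share of the full-clone time that disappears when gh-pages is excluded.
extra_fraction = (with_gh_pages_s - without_gh_pages_s) / with_gh_pages_s

print(f"slowdown with gh-pages: {slowdown:.1f}x")            # 26.5x
print(f"share of time tied to gh-pages: {extra_fraction:.1%}")  # 96.2%
```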

Full measurement details and evidence are in:

For initial solution exploration, start from:


Additional real-world measurements from two NVIDIA-internal wired-network machines (see comments below for details):

| Machine | Clone mode | Elapsed time | Received | .git size | Git pack size | Packed objects |
|---|---|---|---|---|---|---|
| smc120-0009 | normal full clone | 122.85 s | 1.25 GiB | 1.3G | 1.27 GiB | 692,924 |
| smc120-0009 | --single-branch --branch main | 5.77 s | 36.11 MiB | 38M | 36.66 MiB | 20,389 |
| smc120-0009 | --depth 1 --branch main | 2.46 s | 2.66 MiB | 3.0M | 2.68 MiB | 903 |
| Dell Precision 7875 | normal full clone | 92.60 s | 1.25 GiB | 1.3G | 1.27 GiB | 692,924 |
| Dell Precision 7875 | --single-branch --branch main | 4.55 s | 36.17 MiB | 38M | 36.72 MiB | 20,389 |
| Dell Precision 7875 | --depth 1 --branch main | 2.11 s | 2.66 MiB | 3.0M | 2.68 MiB | 903 |

On both machines, the normal full clone received about 1.25 GiB and 693k packed objects, while the single-branch clone of main received about 36 MiB and 20k objects.
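
For anyone who wants to repeat the comparison on another machine, a minimal timing harness might look like the sketch below. It is not the script used for the measurements above. The remote URL is an assumption (the issue only names the repository as NVIDIA/cuda-python), the helper names are hypothetical, and the `.git` size is computed by summing file sizes, which will not exactly match `du` output.

```python
import shutil
import subprocess
import tempfile
import time
from pathlib import Path

# Assumed remote URL; the issue only gives the repository name.
REPO_URL = "https://github.com/NVIDIA/cuda-python.git"

# The three clone modes from the table, as extra `git clone` arguments.
CLONE_MODES = {
    "normal full clone": [],
    "--single-branch --branch main": ["--single-branch", "--branch", "main"],
    "--depth 1 --branch main": ["--depth", "1", "--branch", "main"],
}


def clone_command(url, dest, extra_args):
    """Build the argv list for one `git clone` invocation."""
    return ["git", "clone", "--quiet", *extra_args, url, str(dest)]


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def time_clone(url, extra_args):
    """Clone into a fresh temp directory; return (seconds, .git size in bytes)."""
    workdir = Path(tempfile.mkdtemp(prefix="clone-bench-"))
    dest = workdir / "repo"
    try:
        start = time.perf_counter()
        subprocess.run(clone_command(url, dest, extra_args), check=True)
        elapsed = time.perf_counter() - start
        return elapsed, dir_size_bytes(dest / ".git")
    finally:
        # Always remove the clone so repeated runs start from scratch.
        shutil.rmtree(workdir, ignore_errors=True)


def main():
    """Time each clone mode once and print one line per mode."""
    for label, extra_args in CLONE_MODES.items():
        elapsed, git_bytes = time_clone(REPO_URL, extra_args)
        print(f"{label}: {elapsed:.2f} s, .git {git_bytes / 2**20:.2f} MiB")
```

Calling `main()` runs all three clones over the network, so the full clone alone can take a minute or more. A single run per mode is noisy; repeating each mode a few times and comparing the spread would give a fairer picture.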

Metadata

    Labels

    P1 (Medium priority - Should do)
