Add golang.py to mirror Go releases from go.dev / dl.google.com#201

Open
yaoge123 wants to merge 2 commits into tuna:master from yaoge123:add-golang-py

Conversation

@yaoge123 yaoge123 commented May 24, 2026

Summary

Add golang.py to mirror Go releases directly from upstream.

Why

The previously common golang upstream rsync://rsync.mirrors.ustc.edu.cn/golang/ no longer exposes the rsync module. This script gives tunasync a self-contained way to mirror Go releases without depending on another mirror.

How it works

  1. Fetch the official release index from https://go.dev/dl/?mode=json&include=all
  2. Download missing files from https://dl.google.com/go/<filename> with curl
  3. Atomic write via .tmp + os.replace
  4. Skip files whose local size already matches Content-Length

No rsync, no apt-mirror; just curl + JSON.
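
Step 1 above can be sketched roughly as follows: a minimal, offline example of turning the release index into a filename-to-URL map. It assumes the index is a JSON list of releases, each with a `files` array whose entries carry `filename` and an optional `sha256`. `build_file_map` and the sample data are hypothetical names and values used only for illustration, not the script's actual code.

```python
import json

BASE_URL = "https://dl.google.com/go/"  # download host named in the description


def build_file_map(index_json):
    """Map filename -> {"url", "sha256"} from the release index text."""
    files = {}
    for release in json.loads(index_json):
        for entry in release.get("files", []):
            filename = entry["filename"]
            files[filename] = {
                "url": BASE_URL + filename,
                # Assumption: older entries may lack a hash, so default to "".
                "sha256": entry.get("sha256", ""),
            }
    return files


# Hand-written sample in the index's shape; not real release data.
sample = json.dumps([
    {"version": "go1.0.0-example", "files": [
        {"filename": "go1.0.0-example.linux-amd64.tar.gz", "sha256": "abc123"},
        {"filename": "go1.0.0-example.src.tar.gz"},
    ]},
])
files = build_file_map(sample)
```

Keeping the parsing separate from the network call makes it possible to exercise this step without touching go.dev.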

Copilot AI review requested due to automatic review settings May 24, 2026 09:19

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a Python script to sync a local Golang mirror by fetching the file list from go.dev and downloading artifacts from dl.google.com, with cleanup of stale local files.

Changes:

  • Fetch Go release file metadata from the go.dev download API.
  • Download missing/out-of-date artifacts into a configured working directory.
  • Remove local files not present in the current upstream file list.

Comment thread golang.py
Comment on lines +7 to +11
import json
import os
import sys
import subprocess
import hashlib

Contributor Author

hashlib is now used to verify each downloaded file (and to skip up-to-date local files when an expected sha is available). Streaming hash so memory stays bounded.
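
A minimal sketch of what a streaming hash with a skip check might look like, assuming a fixed chunk size and a lowercase hex digest from the index. `sha256_file` and `is_up_to_date` are hypothetical helper names, not necessarily those used in golang.py.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so memory use stays bounded."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_up_to_date(path, expected_sha256):
    """True only when an expected hash exists and the local file matches it."""
    if not expected_sha256 or not os.path.isfile(path):
        return False
    return sha256_file(path) == expected_sha256.lower()


# Demo on a throwaway file; a tiny chunk size forces several read() calls.
payload = b"hello" * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
digest = sha256_file(tmp.name, chunk_size=64)
fresh = is_up_to_date(tmp.name, hashlib.sha256(payload).hexdigest())
missing_hash = is_up_to_date(tmp.name, "")
os.remove(tmp.name)
```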

Comment thread golang.py Outdated
Comment on lines +23 to +25
["curl", "-s", "-m", "60", API_URL],
capture_output=True, text=True, timeout=90
)

Contributor Author

Now using curl --fail -L plus an explicit returncode != 0 check, with stderr surfaced when fetch_versions fails.
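
As a rough illustration of the pattern described here, the sketch below builds a curl command with `-f` (fail on HTTP errors), `-sS` (quiet, but still print errors) and `-L` (follow redirects), then raises with stderr attached when the exit code is non-zero. `curl_fetch_cmd` and `run_checked` are hypothetical names, and the demo substitutes the Python interpreter for curl so it runs without network access.

```python
import subprocess
import sys


def curl_fetch_cmd(url, max_time=60):
    """Build a curl command line that treats HTTP 4xx/5xx as failures."""
    return ["curl", "-fsSL", "-m", str(max_time), url]


def run_checked(cmd, timeout=90):
    """Run cmd and return stdout; raise with stderr when it exits non-zero."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Stand-in commands (assumption: the interpreter replaces curl for the demo).
ok = run_checked([sys.executable, "-c", "print('[]')"])
try:
    run_checked([sys.executable, "-c",
                 "import sys; sys.stderr.write('boom'); sys.exit(22)"])
    failed = None
except RuntimeError as exc:
    failed = str(exc)
```

In real use the call would be `run_checked(curl_fetch_cmd(API_URL))`, with the caller deciding whether a failure aborts the sync.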

Comment thread golang.py Outdated
Comment on lines +43 to +53
def download_file(filename, url, filepath):
    """Download a file if not present or size mismatch."""
    tmpfile = filepath + ".tmp"
    try:
        result = subprocess.run(
            ["curl", "-s", "-L", "-m", "600", "-o", tmpfile, url],
            timeout=630
        )
        if result.returncode == 0:
            os.rename(tmpfile, filepath)
            return True

Contributor Author

Switched to curl -fsSL for both download_file and fetch_versions so HTTP 4xx/5xx are treated as failures instead of producing empty bodies.

Comment thread golang.py Outdated
Comment on lines +68 to +73
        ["curl", "-sI", "-m", "10", url],
        capture_output=True, text=True, timeout=15
    )
    for line in result.stdout.splitlines():
        if line.lower().startswith("content-length:"):
            return int(line.split(":", 1)[1].strip())

Contributor Author

get_remote_size now uses curl -sIL so we follow redirects and only count the final response's Content-Length.
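
With `-L`, a HEAD request prints one header block per hop, so the parser has to ignore the Content-Length of any redirect responses. A minimal sketch of that idea, assuming header blocks are separated by blank lines; `final_content_length` is a hypothetical name and the sample headers are made up.

```python
def final_content_length(header_text):
    """Return Content-Length from the last header block, or None if absent."""
    blocks = [[]]
    for line in header_text.splitlines():
        if line.strip():
            blocks[-1].append(line)
        elif blocks[-1]:
            blocks.append([])  # a blank line ends the current response
    non_empty = [block for block in blocks if block]
    if not non_empty:
        return None
    for line in non_empty[-1]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-length":
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


# Made-up output in the shape of a redirect followed by the final response.
sample = (
    "HTTP/1.1 302 Found\r\n"
    "Location: https://example.invalid/final\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 12345\r\n"
    "\r\n"
)
size = final_content_length(sample)
```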

Comment thread golang.py
Comment on lines +34 to +39
filename = f["filename"]
sha256 = f.get("sha256", "")
files[filename] = {
    "url": BASE_URL + filename,
    "sha256": sha256,
}

Contributor Author

After downloading we verify the file against the upstream sha256 from the go.dev API and remove the file on mismatch (so the next run will retry).
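
The verify-then-remove behaviour could look roughly like this. It is a sketch under the assumption that a file without an upstream hash is kept as-is; `verify_or_remove` is a hypothetical name, and the demo uses throwaway files rather than real downloads.

```python
import hashlib
import os
import tempfile


def verify_or_remove(path, expected_sha256):
    """Return True if path matches expected_sha256; otherwise delete it."""
    if not expected_sha256:
        # Assumption: with no upstream hash there is nothing to compare.
        return True
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    if digest.hexdigest() == expected_sha256.lower():
        return True
    os.remove(path)  # a missing file is picked up again on the next run
    return False


with tempfile.TemporaryDirectory() as workdir:
    good = os.path.join(workdir, "good.bin")
    bad = os.path.join(workdir, "bad.bin")
    for p in (good, bad):
        with open(p, "wb") as fh:
            fh.write(b"payload")
    good_ok = verify_or_remove(good, hashlib.sha256(b"payload").hexdigest())
    bad_ok = verify_or_remove(bad, "0" * 64)
    good_exists = os.path.exists(good)
    bad_exists = os.path.exists(bad)
```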

Comment thread golang.py
Comment on lines +115 to +119
for fname in os.listdir(WORKDIR):
    fpath = os.path.join(WORKDIR, fname)
    if os.path.isfile(fpath) and fname not in expected:
        print(f"Removing stale file: {fname}")
        os.remove(fpath)

Contributor Author

Restricted stale cleanup to filenames matching Go's release naming convention (go1.*, getgo*) so unrelated files in WORKDIR are no longer touched.
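
One way to express that restriction is a prefix filter applied before the "not in expected" check. The sketch below only lists candidates instead of deleting them; `stale_files` and `RELEASE_PREFIXES` are hypothetical names, and the prefixes are taken from the `go1.*` and `getgo*` patterns mentioned above.

```python
import os
import tempfile

RELEASE_PREFIXES = ("go1.", "getgo")  # patterns named in the comment above


def stale_files(workdir, expected):
    """List release-like files in workdir that upstream no longer lists."""
    stale = []
    for fname in sorted(os.listdir(workdir)):
        fpath = os.path.join(workdir, fname)
        if (os.path.isfile(fpath)
                and fname.startswith(RELEASE_PREFIXES)
                and fname not in expected):
            stale.append(fname)
    return stale


# Demo with made-up filenames in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    for name in ("go1.0.0-example.src.tar.gz", "go1.0.0-old.src.tar.gz",
                 "getgo-example.exe", "README.txt"):
        with open(os.path.join(workdir, name), "w") as fh:
            fh.write("x")
    candidates = stale_files(workdir, {"go1.0.0-example.src.tar.gz"})
```

Note that a leftover `go1.*.tmp` file would also match the prefix and be reported here; whether that is desirable depends on how interrupted downloads are meant to be cleaned up.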

Comment thread golang.py Outdated
return files


def download_file(filename, url, filepath):

Contributor Author

Removed; download_file no longer takes filename.

Mirror Go release tarballs and installers from go.dev. The script uses
the official go.dev/dl/?mode=json&include=all listing as the index and
downloads files from dl.google.com/go via curl with atomic .tmp + rename.

Compared to the previously common rsync upstream rsync.mirrors.ustc.edu.cn::golang/
this is a direct from-origin sync that does not depend on another mirror.

Used by NJU mirror's golang job (~646G, 8118 files).

Address review feedback:
- Use curl --fail -L so HTTP errors and redirects to error pages are
  treated as failures instead of producing empty/partial output.
- Check curl returncode for fetch_versions and download_file.
- Verify freshly-downloaded files against the expected sha256 from the
  go.dev API; reuse that sha to skip up-to-date local files.
- Stream sha256 over the file so memory stays bounded.
- Restrict stale cleanup to filenames matching Go's release naming
  convention (go1.* and getgo*), avoiding removal of unrelated files
  that happen to live in WORKDIR.
- Drop the unused filename parameter from download_file.