Skip to content

chore: add HF_TOKEN for authentication#1202

Merged
jakelorocco merged 2 commits into
generative-computing:mainfrom
jakelorocco:chore/update-hf-token
Jun 5, 2026
Merged

chore: add HF_TOKEN for authentication#1202
jakelorocco merged 2 commits into
generative-computing:mainfrom
jakelorocco:chore/update-hf-token

Conversation

@jakelorocco
Copy link
Copy Markdown
Contributor

@jakelorocco jakelorocco commented Jun 4, 2026

Pull Request

Issue

Fixes N/A

Description

Adds an export of a HF_TOKEN to the cicd pipeline so that we are less likely to hit API request limits. (Goes from 500 per 5 minutes with an anonymous account to 1000 per 5 minutes with a free account).

Opted against an environment since it doesn't offer any additional protections here. It will be added as a repo level secret if approved. I've specifically named it HF_TOKEN_READ_PUBLIC_ONLY so that someone doesn't put a write token in there. It doesn't get shown anywhere and only runs on

For debugging, there's also a step added that checks if the token is valid / being used:
output:

HF_TOKEN is set; verifying with the Hugging Face Hub API...
Notice: HF_TOKEN is valid (user: jake-lo).

or

Warning: HF_TOKEN is NOT set — Hugging Face Hub calls will be anonymous.

This means that we can remove the HF token at any time and our tests will still run.

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

Attribution

  • AI coding assistants used

Adding a new component, requirement, sampling strategy, or tool?

If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.

  • Component
  • Requirement
  • Sampling Strategy
  • Tool

NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.

Signed-off-by: Jake LoRocco <jake.lorocco@ibm.com>
Assisted-by: CLAUDE:OPUS
@jakelorocco jakelorocco marked this pull request as ready for review June 4, 2026 13:24
@jakelorocco jakelorocco requested a review from a team as a code owner June 4, 2026 13:24
@jakelorocco
Copy link
Copy Markdown
Contributor Author

I'm not certain if all of our action runners are being flagged as the same anonymous account right now. As a result, enabling the key might make things worse. I am going to convert back to draft for now.

@jakelorocco jakelorocco marked this pull request as draft June 4, 2026 13:26
Copy link
Copy Markdown
Contributor

@planetf1 planetf1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this — adding authenticated HF access is a nice improvement to CI reliability.

Two inline comments below, plus one question I couldn't anchor to a line:

publish-release.yml: was it intentional to leave that workflow out of this change? It also calls quality.yml (line 62) but without a secrets: block, so release CI will still run with anonymous HuggingFace access. Since required: false nothing breaks, but the rate-limit benefit won't apply there. Happy to note it as a follow-up if leaving it anonymous is deliberate.

Comment thread .github/workflows/quality.yml Outdated
# repositories: same-repo PR runs have access to this secret, so a write-scope
# token could be exfiltrated by a malicious workflow change. We expose it as the
# env var HF_TOKEN because that is the name huggingface_hub picks up automatically.
# Rotate via repo Settings -> Secrets and variables.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The security reasoning in the comment above is spot-on, but the right move is to apply what the comment describes rather than suppress zizmor. Setting HF_TOKEN at job scope means it's present in every step's environment — including the pre-commit hooks and the curl | sh Ollama install — when only two steps actually need it.

Moving it to step-level env: scopes the exposure correctly and removes the need for the ignore entirely:

# Remove the job-level env block, then add per step:

      - name: Check HF_TOKEN
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN_READ_PUBLIC_ONLY }}
        run: |
          ...

      - name: Run Tests
        id: tests
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN_READ_PUBLIC_ONLY }}
        run: uv run -m pytest -v --junit-xml=/tmp/pytest-results.xml test

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also worth noting. Using fine-grained and limiting the access to repos we need would be a good way to make the token useless to bandits. Probably recommended... but pragmatically I'm guessing our prefered method is a simpler token and we just rotate it periodically or whenever it looks overused.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed that would be good practice.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still need the ignore comment but I've moved it to the step level as well.

exit 0
fi
echo "HF_TOKEN is set; verifying with the Hugging Face Hub API..."
uv run python - <<'PY'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small thing: if uv run itself fails to start (corrupt venv, transient error), the step exits non-zero under bash -eo pipefail and blocks CI before tests even run — worse than the anonymous state this step is guarding against. One || makes it truly advisory:

Suggested change
uv run python - <<'PY'
uv run python - <<'PY' || echo "::warning::HF_TOKEN verification step could not run."

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I had to modify the syntax to use an enclosure but have added it. I also added a flag to the step as well.

@planetf1 planetf1 dismissed their stale review June 4, 2026 15:17

Downgrading from request-changes to comment — findings are suggestions/warnings, not blockers.

Copy link
Copy Markdown
Contributor

@markstur markstur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunate that we need this, but looks appropriate with clearly readonly token

see inline comments

print("::warning::huggingface_hub not installed in this env; skipping HF_TOKEN verification.")
sys.exit(0)
try:
info = HfApi().whoami(token=os.environ["HF_TOKEN"])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you probably know this, but just sharing that whoami does not eat into the rate limit

Comment thread .github/workflows/quality.yml Outdated
# repositories: same-repo PR runs have access to this secret, so a write-scope
# token could be exfiltrated by a malicious workflow change. We expose it as the
# env var HF_TOKEN because that is the name huggingface_hub picks up automatically.
# Rotate via repo Settings -> Secrets and variables.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also worth noting. Using fine-grained and limiting the access to repos we need would be a good way to make the token useless to bandits. Probably recommended... but pragmatically I'm guessing our prefered method is a simpler token and we just rotate it periodically or whenever it looks overused.

Signed-off-by: Jake LoRocco <jake.lorocco@ibm.com>
Assisted-by: CLAUDE:OPUS
Copy link
Copy Markdown
Contributor Author

@jakelorocco jakelorocco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

publish-release.yml: was it intentional to leave that workflow out of this change? It also calls quality.yml (line 62) but without a secrets: block, so release CI will still run with anonymous HuggingFace access. Since required: false nothing breaks, but the rate-limit benefit won't apply there. Happy to note it as a follow-up if leaving it anonymous is deliberate.

Thank you, added the secret there as well.

Comment thread .github/workflows/quality.yml Outdated
# repositories: same-repo PR runs have access to this secret, so a write-scope
# token could be exfiltrated by a malicious workflow change. We expose it as the
# env var HF_TOKEN because that is the name huggingface_hub picks up automatically.
# Rotate via repo Settings -> Secrets and variables.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still need the ignore comment but I've moved it to the step level as well.

exit 0
fi
echo "HF_TOKEN is set; verifying with the Hugging Face Hub API..."
uv run python - <<'PY'
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I had to modify the syntax to use an enclosure but have added it. I also added a flag to the step as well.

@jakelorocco jakelorocco marked this pull request as ready for review June 4, 2026 22:19
@jakelorocco
Copy link
Copy Markdown
Contributor Author

@planetf1 @markstur, I tested to make sure the updated actions work. I believe I've addressed all of the comments. Can you please re-review?

@jakelorocco jakelorocco enabled auto-merge June 5, 2026 12:58
@jakelorocco jakelorocco requested review from markstur and planetf1 June 5, 2026 12:58
@psschwei psschwei added the do-not-merge/hold Block merging this PR label Jun 5, 2026
Copy link
Copy Markdown
Member

@psschwei psschwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

(I put the hold flag on so that the requested reviewers can have a chance to take a look, but feel free to remove and merge as you see fit)

Copy link
Copy Markdown
Contributor

@planetf1 planetf1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — all three threads addressed. Step-level scoping is correct, the fallback is in place, and is covered too.

Copy link
Copy Markdown
Contributor

@planetf1 planetf1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — all three threads addressed. Step-level scoping is correct, the fallback for the verification step is in place, and publish-release.yml is covered too.

@jakelorocco jakelorocco removed the do-not-merge/hold Block merging this PR label Jun 5, 2026
@jakelorocco jakelorocco added this pull request to the merge queue Jun 5, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 5, 2026
@jakelorocco jakelorocco added this pull request to the merge queue Jun 5, 2026
Merged via the queue into generative-computing:main with commit 6046f96 Jun 5, 2026
8 of 9 checks passed
@jakelorocco jakelorocco deleted the chore/update-hf-token branch June 5, 2026 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants