Skip to content

CSPL-4621: add newcomer-friendly integration test writing guide#1835

Open
kubabuczak wants to merge 7 commits intodevelopfrom
CSPL-4621-integration-test-guide
Open

CSPL-4621: add newcomer-friendly integration test writing guide#1835
kubabuczak wants to merge 7 commits intodevelopfrom
CSPL-4621-integration-test-guide

Conversation

@kubabuczak
Copy link
Copy Markdown
Collaborator

@kubabuczak kubabuczak commented Apr 9, 2026

Description

Adds a comprehensive integration test writing guide for newcomers, improves test execution UX, and fixes a Makefile bug.

Key Changes

  • New integration test guide covering test structure, writing, execution, and debugging
  • Restructured setup instructions into sequential, copy-pasteable steps
  • Fixed ginkgo command syntax (custom flags must come after --)
  • Fixed docker-buildx Makefile target (broken builder teardown, cache preservation)
  • Fixed operator log debugging example to point at cluster-wide operator namespace
  • Added tools/cleanup.sh and make undeploy references to cleanup docs

Related Issues

PR Checklist

  • Code changes adhere to the project's coding standards
  • Documentation has been updated accordingly
  • If test framework files were changed (test/testenv/, test/run-tests.sh, test/env.sh), docs/IntegrationTesting.md has been updated
  • The PR description follows the project's guidelines

Add docs/IntegrationTesting.md covering what tests exist, how to write
new tests, how to execute them locally and in CI, and how to debug
failures. Replace test/README.md with a pointer to the canonical doc.
Add doc-update reminders in the PR template and AGENTS.md so the guide
stays accurate when the test framework changes.

Made-with: Cursor
…ures)

Clarifies what should and shouldn't be tested in integration tests,
with a practical checklist for new test authors.

Made-with: Cursor
- Remove --cluster-wide flag references (feature to be removed from code)
- Use docker-buildx with PLATFORMS variable for single-platform builds
- Add SPLUNK_GENERAL_TERMS placeholder with link to README for legal info
- Use generic It label placeholders (<mysuite>, <mytag>, <topology>)
- Add critical block around test logic in sequence diagram
… namespace prefix

Restructure the "How to Execute Tests" section into a sequential setup
guide, fix broken ginkgo flag syntax (custom flags must come after --),
add $REGISTRY/$OPERATOR_IMG/$SPLUNK_IMG variables for copy-pasteable
commands, document cloud storage limitations (CSPL-4694), add debugging
and cleanup guidance (stuck finalizers, leftover namespaces), and prefix
all test namespaces with sok-test- for easy discovery.
Remove the broken `- docker buildx rm` line that was inside a shell
continuation (causing `bash: -: command not found`) and was also
destroying the buildx builder cache after every build. The builder
now persists across runs so subsequent builds use cached layers.
Also fix ENVIRONMENT default assignment and make `rm` in manifests
target non-fatal with `-f`.
@vivekr-splunk
Copy link
Copy Markdown
Collaborator

Thanks for putting this guide together. The documentation direction looks useful. I did notice one blocking issue in the code change mixed into this PR:

Changing TestEnv / TestCaseEnv namespaces to sok-test-<name> looks like it breaks existing paths that still treat GetName() as the namespace. For example:

  • test/testenv/testcaseenv.go now sets namespace: "sok-test-" + name
  • but GetOperatorPodName() and DeleteOperatorPod() still use testcaseEnvInst.GetName() as the namespace in non-cluster-wide mode (test/testenv/util.go)
  • and existing tests do the same, e.g. test/smoke/smoke_test.go passes testcaseEnvInst.GetName() into VerifyServiceAccountConfiguredOnPod(), which shells out with kubectl get pods -n <ns> in test/testenv/verificationutils.go

Before this PR, name == namespace, so those call sites worked. With the new prefix, they seem to point at the wrong namespace. Could we either add/use a real GetNamespace() accessor and update the callers, or keep the externally used namespace semantics unchanged in this PR?

@vivekr-splunk
Copy link
Copy Markdown
Collaborator

One small docs follow-up as well: in docs/IntegrationTesting.md, the debugging example currently says to inspect operator logs with kubectl logs -n <test-namespace> deployment/splunk-op-<test-namespace>. Earlier in the same guide, though, the assumed/default flow is a pre-deployed cluster-wide operator in the splunk-operator namespace. In that default case, the operator logs would come from splunk-operator-controller-manager in splunk-operator, not from each test namespace. Could we tweak that example so it matches the default execution model and avoids sending people to the wrong place while debugging?

@kubabuczak
Copy link
Copy Markdown
Collaborator Author

Thanks for putting this guide together. The documentation direction looks useful. I did notice one blocking issue in the code change mixed into this PR:

Changing TestEnv / TestCaseEnv namespaces to sok-test-<name> looks like it breaks existing paths that still treat GetName() as the namespace. For example:

  • test/testenv/testcaseenv.go now sets namespace: "sok-test-" + name
  • but GetOperatorPodName() and DeleteOperatorPod() still use testcaseEnvInst.GetName() as the namespace in non-cluster-wide mode (test/testenv/util.go)
  • and existing tests do the same, e.g. test/smoke/smoke_test.go passes testcaseEnvInst.GetName() into VerifyServiceAccountConfiguredOnPod(), which shells out with kubectl get pods -n <ns> in test/testenv/verificationutils.go

Before this PR, name == namespace, so those call sites worked. With the new prefix, they seem to point at the wrong namespace. Could we either add/use a real GetNamespace() accessor and update the callers, or keep the externally used namespace semantics unchanged in this PR?

Oh, I did it to make kubectl commands easier - let's revert it. I'll address this issue (using GetName() instead of GetNamespace()) in separate PR

Revert the sok-test- namespace prefix that broke GetName()-based
callers. Update debugging docs to point at the cluster-wide operator
namespace. Add tools/cleanup.sh and make undeploy references.

Made-with: Cursor
Reframe the "gray zone" section: lightweight Splunk-side status checks
are explicitly acceptable as secondary checks, but must always be paired
with a primary operator-level assertion.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants