RANGER-4676, RANGER-5615: Add OpenSearch Dispatcher to Ranger Audit Server #986

Open
paras200 wants to merge 1 commit into apache:master from paras200:RANGER-5615

Conversation

@paras200 paras200 commented May 28, 2026

Adds a new OpenSearch dispatcher module to the Ranger Audit Server that consumes audit events from Kafka and bulk-indexes them into OpenSearch, providing an alternative to the Solr-based audit store.

Core dispatcher module (audit-server/audit-dispatcher/dispatcher-opensearch):

  • OpenSearchDispatcherManager — lifecycle manager with retry-based initialization (exponential backoff, max 5 attempts) and graceful shutdown
  • AuditOpenSearchDispatcher — Kafka consumer that batches audit events and writes them to OpenSearch via the _bulk API using the low-level RestClient
  • Supports basic auth and Kerberos/SPNEGO authentication for OpenSearch connections
  • Document ID deduplication — uses audit.eventId as _id in bulk metadata, falls back to UUID when absent
  • Error handling with partition seek-back and retry sleep on batch failures
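
To illustrate the document ID deduplication described above, here is a minimal Java sketch of building a _bulk request body. It is not the PR's code: the class BulkBodySketch, its method names, and the choice of the "index" bulk action are assumptions made for illustration. Only the use of the event ID as _id with a UUID fallback comes from the description.

```java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical sketch; names are illustrative and not taken from the PR.
final class BulkBodySketch {
    private BulkBodySketch() {}

    /** Returns the event ID if present, otherwise a random UUID, for use as the document _id. */
    static String resolveDocId(String eventId) {
        return (eventId == null || eventId.isEmpty()) ? UUID.randomUUID().toString() : eventId;
    }

    /** Minimal JSON string escaping for quotes, backslashes, and control characters. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Builds an NDJSON bulk body: one action line and one source line per event.
     * Each entry is {eventId (may be null), JSON source}; the source is assumed
     * to be a valid single-line JSON object.
     */
    static String buildBulkBody(String index, List<String[]> events) {
        StringBuilder body = new StringBuilder();
        for (String[] event : events) {
            String id = resolveDocId(event[0]);
            body.append("{\"index\":{\"_index\":\"").append(escape(index))
                .append("\",\"_id\":\"").append(escape(id)).append("\"}}\n");
            body.append(event[1]).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        List<String[]> events = Arrays.asList(new String[][] {
            {"evt-1", "{\"access\":\"read\"}"},
            {null, "{\"access\":\"write\"}"} // no event ID: falls back to a UUID
        });
        System.out.print(buildBulkBody("ranger_audits", events));
    }
}
```

With the "index" action, a redelivered Kafka record that carries the same event ID overwrites the existing document instead of creating a duplicate, which is the property the deduplication relies on.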

Shared mapping (audit-server/audit-dispatcher/dispatcher-common):

  • AuditEventDocMapper — canonical 27-field event-to-document mapper, reusable across dispatcher destinations

Configuration & packaging (distro):

  • Per-dispatcher logback support (logback-opensearch.xml) in start-audit-dispatcher.sh
  • Assembly descriptor updated to package the opensearch dispatcher module

Docker & E2E infrastructure (dev-support/ranger-docker):

  • docker-compose.ranger-audit-dispatcher-opensearch.yml for the dispatcher container
  • KDC healthcheck + ZK depends_on: service_healthy to fix keytab provisioning race condition
  • e2e-audit-opensearch.sh — single-command end-to-end test script (start → validate → teardown)
  • Helper scripts: create-ranger-audit-topic.sh, create-ranger-audit-index.sh

Bug fix:

  • Fix ElasticSearchMgr.connect() to return the client on first connection (missing me = client assignment)

How was this patch tested?

Unit tests:

  • TestAuditOpenSearchDispatcher (6 tests) — validates bulk request formatting, document field mapping, HTTP error handling, item-level error detection, UUID generation for missing event IDs
  • TestOpenSearchDispatcherManager (5 tests) — validates dispatcher type filtering, disabled destination handling, fail-fast when dispatcher class cannot be instantiated
  • TestAuditEventDocMapper (4 tests) — validates all 27 fields are correctly mapped from AuthzAuditEvent to document

End-to-end test (./scripts/audit/e2e-audit-opensearch.sh):

  • Full Docker stack: KDC → ZK → Kafka → Ranger Admin → Audit Ingestor → OpenSearch → OpenSearch Dispatcher
  • Posts a SPNEGO-authenticated audit event through the ingestor REST API
  • Verifies the document is indexed in OpenSearch with the correct _id (marker-based assertion)
  • Validates all service health endpoints and container states
  • Automated teardown on exit (or --no-teardown for debugging)

Pipeline validated: Plugin → Ingestor → Kafka → Dispatcher → OpenSearch

RestClientBuilder restClientBuilder = getRestClientBuilder(urls, protocol, user, password, port);

client = new RestHighLevelClient(restClientBuilder);
me = client;
Contributor

Please correct the description.

Author

Fixed — updated the PR description. The actual change is a bug fix: connect() was not assigning the newly created RestHighLevelClient to the return variable me, so the method could return null on first invocation.
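
The class of bug described here can be sketched as follows. This is a hypothetical stand-in, not the actual ElasticSearchMgr code: LazyClientHolder, its generic client type, and the double-checked locking structure are assumptions for illustration. The point is the assignment to the return variable after the client is created.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical lazily connecting holder; not the real ElasticSearchMgr.
final class LazyClientHolder<T> {
    private final Supplier<T> factory;
    private volatile T client;

    LazyClientHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Creates the client on first use and returns the same instance afterwards. */
    T connect() {
        T me = client;
        if (me == null) {
            synchronized (this) {
                me = client;
                if (me == null) {
                    client = factory.get();
                    me = client; // without this assignment the first call would return null
                }
            }
        }
        return me;
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        LazyClientHolder<String> holder =
            new LazyClientHolder<>(() -> "client-" + created.incrementAndGet());
        System.out.println(holder.connect()); // first call already returns the new client
        System.out.println(holder.connect()); // later calls reuse it
    }
}
```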

audit_elasticsearch_password=NONE
audit_elasticsearch_index=ranger_audits
audit_elasticsearch_bootstrap_enabled=true

Contributor

How do we differentiate whether the backend is OpenSearch or Elasticsearch?

Author

From Ranger Admin's perspective, OpenSearch and Elasticsearch use the same REST API (wire-compatible) — there is no separate audit_store=opensearch value. The Admin connects using audit_store=elasticsearch and the audit_elasticsearch_* properties regardless of whether the backend is Elasticsearch or OpenSearch. The differentiation happens at the dispatcher level (separate dispatcher-opensearch module with its own class), not at the Admin level. This properties file is a Docker dev profile where the Admin queries OpenSearch directly for the audit UI — the connection is API-compatible.

<value>ranger_audits</value>
</property>

<property>
Contributor

The description correctly states that unauthenticated OpenSearch is allowed. Please cross-link to production hardening guidance (require user/password when xasecure.audit.destination.elasticsearch=true).

Author

Done. Updated the user/password property descriptions to explicitly note "(dev only)" for empty values, and added "Production: configure user/password or Kerberos keytab" guidance.

private static final String TYPE_OPENSEARCH =
"opensearch";
/** Property controlling OpenSearch destination. */
private static final String ES_DEST_PROP =
Contributor

Module is dispatcher-opensearch but config uses xasecure.audit.destination.elasticsearch.*. Intentional for backward compatibility — please document in README/site XML so operators do not search for opensearch.urls.

Author

Done — added an XML comment block in the site XML explaining this:

<!--
    OPENSEARCH DESTINATION CONFIGURATION
    NOTE: OpenSearch is wire-compatible with Elasticsearch. Config uses the
    xasecure.audit.destination.elasticsearch.* namespace for interoperability
    with Ranger Admin audit queries (which read from the same index).
-->

This way operators understand why they won't find opensearch.urls and should use the elasticsearch.* keys instead.

if (clsName != null && clsName.contains(
"AuditOpenSearchDispatcher")) {
isEnabled = true;
props.setProperty(ES_DEST_PROP, "true");
Contributor

When dispatcher class name contains AuditOpenSearchDispatcher, code sets props.setProperty(ES_DEST_PROP, "true"). Side effect on the shared config object may surprise other components reading the same Properties. Prefer local flag instead of mutating props.

Author

Good catch — removed props.setProperty(ES_DEST_PROP, "true"). The enablement decision now stays in the local isEnabled boolean without mutating the shared Properties object.
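
A minimal sketch of the resulting pattern, under stated assumptions: DispatcherEnablementSketch and both property keys below are hypothetical placeholders, since the real constants are not fully visible in the diff excerpt. The enablement decision is derived from the configuration and kept in a local variable, and the shared Properties object is only read.

```java
import java.util.Properties;

// Hypothetical sketch; the property keys are placeholders, not Ranger's real keys.
final class DispatcherEnablementSketch {
    static final String DISPATCHER_CLASS_PROP = "example.dispatcher.class";
    static final String DEST_ENABLED_PROP = "example.destination.enabled";

    private DispatcherEnablementSketch() {}

    /** Decides enablement without writing anything back into the shared Properties. */
    static boolean isOpenSearchEnabled(Properties props) {
        boolean isEnabled = Boolean.parseBoolean(props.getProperty(DEST_ENABLED_PROP, "false"));
        String clsName = props.getProperty(DISPATCHER_CLASS_PROP);
        if (clsName != null && clsName.contains("AuditOpenSearchDispatcher")) {
            isEnabled = true; // local flag only; props is left untouched
        }
        return isEnabled;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(DISPATCHER_CLASS_PROP, "org.example.AuditOpenSearchDispatcher");
        System.out.println(isOpenSearchEnabled(props));
        System.out.println(props.containsKey(DEST_ENABLED_PROP));
    }
}
```

Other components that read the same Properties instance then see exactly what the operator configured, regardless of which dispatcher was initialized first.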

"type": "long"
}
}
}
Contributor

AuditEventDocMapper indexes additionalInfo (lines 93–94), but this schema has no additionalInfo property. Relying on dynamic mapping may cause type conflicts with security-admin/contrib/elasticsearch_for_audit_setup/conf/ranger_es_schema.json. Please align the Docker schema with the contrib schema so that Admin audit UI search works.

Author

@paras200 paras200 May 28, 2026

Fixed — added additionalInfo as a text field to the Docker schema. Note that the contrib schema (security-admin/contrib/elasticsearch_for_audit_setup/conf/ranger_es_schema.json) has the same gap — both were relying on dynamic mapping for this field. I've aligned the Docker schema; the contrib schema fix can be addressed in a follow-up. I can raise a JIRA related to this.

…ted dispatcher module

Enhances the Ranger Audit Server with a new OpenSearch dispatcher that
consumes audit events from Kafka and bulk-indexes them into OpenSearch,
eliminating the Zookeeper dependency required by Solr.

- New dispatcher-opensearch module with OpenSearchDispatcherManager
  and AuditOpenSearchDispatcher
- AuditEventDocMapper in dispatcher-common for event-to-document mapping
- Configurable thread pool, offset commit strategy, and basic auth
- Error handling with partition seek-back and backoff on batch failures
- Docker Compose setup and e2e test scripts for local validation
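
The PR description mentions exponential backoff with a maximum of 5 attempts for dispatcher initialization, and the commit message mentions backoff on batch failures. A generic sketch of capped exponential backoff might look like the following. BackoffRetrySketch, its signature, and the delay values are assumptions for illustration and are not taken from the PR.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of retry with exponential backoff; not the PR's implementation.
final class BackoffRetrySketch {
    private BackoffRetrySketch() {}

    /**
     * Calls the action up to maxAttempts times, doubling the sleep between
     * failed attempts, and rethrows the last failure if every attempt fails.
     */
    static <T> T retry(Callable<T> action, int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs); // no sleep after the final attempt
                    delayMs *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("not ready yet");
            }
            return "initialized";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```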