Commit 5cd64e2

docs: reframe eventkit as a kit, not a PyPI library

- Update overview: 'kit' not 'library'
- Change installation: clone & customize, not pip install
- Update roadmap: focus on kit evolution
- Add future vision: extract focused libraries as patterns stabilize

Closes #9 - leaning into the kit model, where teams clone and customize rather than consuming it as a black-box library.

1 parent 0639079 commit 5cd64e2

File tree: 1 file changed, +42 −87 lines

README.md (42 additions, 87 deletions)
@@ -1,10 +1,12 @@
 # eventkit
 
-Event ingestion and processing primitives for Python.
+Event ingestion and processing kit for Python.
 
 ## Overview
 
-`eventkit` is a high-performance, type-safe library for building event collection pipelines. It provides the core infrastructure for customer data platforms, product analytics, and event-driven architectures.
+`eventkit` is a production-ready **kit** for building event collection pipelines. Clone it, customize it, make it yours.
+
+**Philosophy**: Provide a solid starting point with battle-tested patterns, then get out of your way. Customize for your specific needs.
 
 ### Key Features
 
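As an aside for readers of this diff: the README's "composable validators (required fields, types, timestamps)" pattern, named again in the roadmap below, can be pictured as small callables chained into a pipeline. A minimal sketch under that assumption; names like `require_fields` and `run_pipeline` are illustrative, not eventkit's actual API:

```python
from typing import Callable

# A validator inspects an event dict and returns a list of error messages.
Validator = Callable[[dict], list[str]]

def require_fields(*fields: str) -> Validator:
    """Build a validator that flags missing top-level fields."""
    def check(event: dict) -> list[str]:
        return [f"missing required field: {f}" for f in fields if f not in event]
    return check

def run_pipeline(event: dict, validators: list[Validator]) -> list[str]:
    """Apply each validator in order, collecting every error."""
    errors: list[str] = []
    for validator in validators:
        errors.extend(validator(event))
    return errors

pipeline = [require_fields("event", "timestamp")]
print(run_pipeline({"event": "page_view"}, pipeline))
# One error: the timestamp field is missing
```

Because validators are plain callables, customizing rules (step 4 of Getting Started) means adding or removing functions from a list rather than subclassing.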

@@ -27,13 +29,15 @@ Event ingestion and processing primitives for Python.
 
 ## Quick Start
 
-Install from PyPI:
+Clone and customize:
 
 ```bash
-pip install eventkit
+git clone https://github.com/prosdevlab/eventkit.git my-event-pipeline
+cd my-event-pipeline
+uv sync
 ```
 
-Add to your FastAPI application:
+Customize for your needs:
 
 ```python
 from fastapi import FastAPI
@@ -181,25 +185,31 @@ Inspired by open-source CDP architectures:
 - [PostHog](https://github.com/PostHog/posthog) - Modern Python stack (FastAPI, async)
 - [Snowplow](https://github.com/snowplow/snowplow) - Schema-first validation (optional)
 
-## Installation
+## Getting Started
 
-**Basic:**
-```bash
-pip install eventkit
-```
+**EventKit is a kit**, not a library. Clone and make it your own:
 
-**With ClickHouse support:**
 ```bash
-pip install eventkit[clickhouse]
-```
+# 1. Clone the repo
+git clone https://github.com/prosdevlab/eventkit.git my-event-pipeline
+cd my-event-pipeline
 
-**Development:**
-```bash
-git clone https://github.com/prosdev/eventkit.git
-cd eventkit
-pip install -e ".[dev]"
+# 2. Install dependencies
+uv sync
+
+# 3. Start local dev
+docker-compose up -d  # GCS + PubSub emulators
+uv run uvicorn eventkit.api.app:app --reload
+
+# 4. Customize for your needs
+# - Modify validation rules in src/eventkit/adapters/
+# - Add custom storage backends in src/eventkit/stores/
+# - Adjust queue behavior in src/eventkit/queues/
+# - Make it yours!
 ```
 
+See [LOCAL_DEV.md](LOCAL_DEV.md) for detailed setup.
+
 ## API Endpoints
 
 ### Collection Endpoints
@@ -340,62 +350,6 @@ python -m scripts.run_bigquery_loader
 
 See `scripts/bigquery/README.md` and `specs/gcs-bigquery-storage/` for full details.
 
-### Error Store (Dead Letter Queue)
-
-All failed events are stored in a GCS-based dead letter queue for debugging and retry:
-
-**Two Error Types:**
-- **Validation Errors**: Missing required fields, invalid schema
-- **Processing Errors**: Storage failures, unexpected exceptions
-
-**Storage Structure:**
-```
-gs://bucket/errors/
-  date=2026-01-15/
-    error_type=validation/
-      error-20260115-100000-abc123.parquet
-    error_type=processing/
-      error-20260115-100500-def456.parquet
-```
-
-**Create BigQuery Errors Table:**
-```bash
-cd scripts/bigquery
-export PROJECT_ID=my-project DATASET=events
-cat create_errors_table.sql | sed "s/{PROJECT_ID}/$PROJECT_ID/g" | sed "s/{DATASET}/$DATASET/g" | bq query --use_legacy_sql=false
-```
-
-**Query Errors:**
-```sql
--- Find validation errors in last 24 hours
-SELECT
-  error_message,
-  stream,
-  COUNT(*) as count
-FROM `project.dataset.errors`
-WHERE date >= CURRENT_DATE() - 1
-  AND error_type = 'validation_error'
-GROUP BY error_message, stream
-ORDER BY count DESC;
-
--- Get processing errors with stack traces
-SELECT
-  timestamp,
-  error_message,
-  JSON_EXTRACT_SCALAR(error_details, '$.exception_type') as exception,
-  JSON_EXTRACT_SCALAR(error_details, '$.stack_trace') as stack_trace
-FROM `project.dataset.errors`
-WHERE error_type = 'processing_error'
-ORDER BY timestamp DESC
-LIMIT 10;
-```
-
-**Key Features:**
-- Never loses events - all failures stored for debugging
-- Automatic 30-day retention (GCS lifecycle rules)
-- Full event context (payload, error, timestamp, stream)
-- Queryable via BigQuery for pattern analysis
-
 ### Custom Storage
 
 Implement the `EventStore` protocol for any backend:
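The diff's context window ends before the protocol's actual definition, so for orientation here is a hedged sketch of what a custom backend could look like; the method name `write` and the batch shape are assumptions, not eventkit's real signature:

```python
from typing import Protocol

class EventStore(Protocol):
    """Assumed shape of the storage protocol: persist a batch of events per stream."""
    def write(self, stream: str, events: list[dict]) -> None: ...

class InMemoryStore:
    """Toy backend useful in tests; real backends might target GCS, S3, or ClickHouse."""
    def __init__(self) -> None:
        self.streams: dict[str, list[dict]] = {}

    def write(self, stream: str, events: list[dict]) -> None:
        # Append the batch to the per-stream list, creating it on first write.
        self.streams.setdefault(stream, []).extend(events)

backend = InMemoryStore()
store: EventStore = backend  # structural typing: no inheritance required
store.write("page_views", [{"event": "page_view", "user_id": "u1"}])
print(backend.streams["page_views"])
```

Because `Protocol` uses structural typing, any class with a matching method satisfies `EventStore` without importing or subclassing anything from the kit.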
@@ -531,7 +485,7 @@ uv run ruff format src/
 
 ## Roadmap
 
-### Core (v0.x)
+### Core Kit (v0.x)
 - [x] Composable validators (required fields, types, timestamps)
 - [x] Segment-compatible adapter with ValidationPipeline
 - [x] Collection API with stream routing
@@ -542,24 +496,25 @@ uv run ruff format src/
 - [x] Prometheus metrics
 - [x] EventSubscriptionCoordinator (dual-path architecture)
 - [x] Hash-based sequencer for consistent ordering
-- [x] Error store with dead letter queue (GCS-based)
-- [ ] Performance benchmarks (10k+ events/sec)
+- [x] Performance benchmarks (10k+ events/sec validated)
+- [ ] Error handling and dead letter queue (ErrorStore protocol exists, needs implementation)
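The "hash-based sequencer" checked off above can be illustrated with stable key hashing: events that share a key always map to the same partition, so their relative order is preserved even with multiple workers. A minimal sketch; the partition count and `hashlib.md5` choice are assumptions (Python's built-in `hash()` is salted per process, so it is unsuitable for routing that must be stable across runs):

```python
import hashlib

def partition_for(key: str, num_partitions: int = 8) -> int:
    """Map a key to a stable partition index via MD5 (consistent across processes)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Events for the same user always route to the same partition,
# so per-user ordering holds no matter how many workers consume.
assert partition_for("user-123") == partition_for("user-123")
assert all(0 <= partition_for(f"user-{i}") < 8 for i in range(100))
```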

-### v1.0
-- [ ] OpenAPI spec and generated clients
-- [ ] Comprehensive examples and documentation
+### v1.0 - Production Ready
+- [ ] Comprehensive examples and use cases
 - [ ] Production deployment guides (Cloud Run, GKE, ECS)
 - [ ] S3 + Snowflake/Redshift storage adapters
+- [ ] Nextra documentation site
+
+### Future: Extract Focused Libraries
 
-### Future Ecosystem
+As patterns stabilize, we may extract reusable components:
 
-These capabilities are intentionally scoped as separate packages to keep the core focused:
+- **eventkit-ring-buffer** - SQLite WAL durability layer (could be used standalone)
+- **eventkit-queues** - Queue abstractions (AsyncQueue, PubSub patterns)
+- **eventkit-validators** - Composable validation framework
+- **eventkit-storage** - Storage backend protocols and implementations
 
-- **eventkit-profiles** - Profile building and field-level merge strategies
-- **eventkit-identity** - Graph-based identity resolution across devices
-- **eventkit-enrichment** - IP geolocation, user agent parsing, company enrichment
-- **eventkit-destinations** - Activate data to marketing and analytics tools
-- **eventkit-privacy** - GDPR/CCPA compliance utilities (deletion, anonymization)
+These would be pip-installable libraries while the kit remains a starting point.
 
 ## Contributing
