brjen/README.md

Hey, I'm Brian

I build and operate AI infrastructure on bare metal — and I run a production business on top of it.


Gridline Productions

A fully automated commercial AI render operation, built from scratch in roughly a month with no prior rendering or pipeline experience.

The system handles the entire lifecycle: prompt generation, rendering across multiple models, multi-tier quality enrichment, automated scoring and selection, metadata processing, and commercial listing. A single click starts all of it.

By the numbers:

  • 30,000+ renders produced
  • Fully automated end-to-end: one-click startup, no manual intervention required
  • Self-correcting and self-restarting — if something fails, the system recovers on its own
  • Self-improving — the platform learns from its own operations, surfacing what works and deprioritizing what doesn't
  • Dual control plane with machine-specific fallback agents across a multi-machine estate
  • Custom render engine replacing off-the-shelf tooling entirely
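
As a rough illustration of the self-restarting behavior listed above, here is a minimal sketch of a restart loop with exponential backoff. The function name `run_with_restarts`, its parameters, and the backoff policy are all hypothetical and chosen for illustration; the README does not describe how Gridline's recovery is actually implemented.

```python
import time


def run_with_restarts(task, max_restarts=3, base_delay=1.0, sleep=time.sleep):
    """Call task(); on failure, wait with exponential backoff and retry.

    Hypothetical helper, not taken from the Gridline codebase.
    Returns (result, restarts_used). Re-raises the last error once the
    restart budget is exhausted.
    """
    restarts = 0
    while True:
        try:
            return task(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            # Delay doubles on each restart: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** restarts))
            restarts += 1
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. A production setup on this stack might instead lean on systemd's own restart handling, but that is an assumption as well.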

Everything runs on hardware I own. No cloud dependencies for production workloads.


Infrastructure

I design and operate a self-hosted multi-machine estate for AI workloads, enrichment pipelines, observability, and automation.

  • Multi-host GPU compute: AMD RX 7800 XT, NVIDIA GTX 1080 Ti, Apple Silicon
  • Networking: Unbound recursive DNS, AdGuard filtering, VLAN-segmented UniFi
  • Virtualization & storage: Proxmox VE, automated backup
  • Observability: Real-time dashboards, system-wide queryability, automated anomaly detection with alerting
  • Operations: Structured ticketing system with 200+ closed tickets, startup/shutdown verification routines, full audit trail from day one

Philosophy

I own the stack. Compute, networking, DNS, storage, monitoring, and deployment all run on hardware I control.

If something breaks, I fix it at the root cause. The PyTorch memory fix (pytorch-memory-fix, listed under Open Source) came from weeks of instrumented profiling of a real production workload, which traced the RSS growth to glibc allocator behavior rather than to an application-level bug.

I build systems that are observable, rebuildable, and free of vendor lock-in. Every decision, every fix, and every lesson learned is documented and searchable. The system's institutional memory goes back to day one.


Stack

Python, Node.js, Svelte, FastAPI, SQLite, Linux, Proxmox VE, systemd, Git, SSH, Fish shell


Open Source

  • pytorch-memory-fix — Two environment variables that eliminate PyTorch RSS creep during repeated model load/unload cycles on Linux. Zero code changes. Zero performance cost.
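
To make the idea concrete, here is a minimal sketch of how allocator settings can be applied through the environment and how RSS can be read back on Linux. This profile does not name the two variables, so `MALLOC_ARENA_MAX` and `MALLOC_TRIM_THRESHOLD_` are used purely as assumed examples: they are real glibc malloc tunables, but they may not be the ones the repository uses, and the values are placeholders. The helper names are hypothetical.

```python
import os
import subprocess
import sys

# ASSUMPTION: illustrative glibc malloc tunables with placeholder values.
# These are not confirmed to be the two variables used by pytorch-memory-fix.
ASSUMED_ALLOCATOR_ENV = {
    "MALLOC_ARENA_MAX": "2",
    "MALLOC_TRIM_THRESHOLD_": "131072",
}


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def build_env(overrides, base=None):
    """Copy the base environment and apply the overrides on top."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def run_with_env(cmd, overrides):
    """Run cmd in a child process so glibc sees the variables at startup."""
    return subprocess.run(
        cmd, env=build_env(overrides), capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # The child only echoes one variable to show that it arrived.
    child = [sys.executable, "-c", "import os; print(os.environ['MALLOC_ARENA_MAX'])"]
    print(run_with_env(child, ASSUMED_ALLOCATOR_ENV).stdout.strip())
```

glibc reads these settings when a process starts, which is why the sketch launches a child process rather than setting the variables inside an already running interpreter. To watch for creep, `parse_vm_rss_kb` can be fed the contents of `/proc/self/status` after each model load/unload cycle.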
