Merged
8 changes: 7 additions & 1 deletion SYSTEM_ANALYSIS.md
@@ -102,6 +102,12 @@ The brace initialization has wrong types and the layout doesn't match the struct
The `#ifdef MAVLINK_AVAILABLE` guard means the real implementation is gated. The manual stub (`pack_set_position_target_local_ned_manual`) sends malformed packets (CRC always 0). On real hardware, the flight controller would reject these packets.

### 10. `drone_main.cpp` simulation path won't work
**FIXED** — flow is now exported via the s_axilite `FLOW` bundle (first 1024
events as packed UQ4.12 positions + Q8.8 flows, `flow_count` + seqlock
`flow_seq`); `read_flow_vectors()` in `arm/fpga_interface.h` decodes it with
torn-read retry, and the sim register file covers the full window. See
`fpga/README.md` § FLOW bundle. Original finding:

The `FpgaInterface` class uses `new volatile uint32_t[128]()` for simulated registers, but:
- The register offsets (e.g., `REG_FLOW_PRED_BASE`) assume specific memory layout
- The `read_flow_vectors()` method expects flow data at specific register offsets
@@ -225,7 +231,7 @@ Most `.h` and `.cpp` files lack copyright/license headers despite the repo havin
| 🔴 CRITICAL | Encoder weights never initialized | `fpga/top_level.cpp:91` | Zero flow output on FPGA |
| 🔴 CRITICAL | Testbench uses wrong struct types | `fpga/testbench.cpp:424` | Compilation or silent data corruption |
| 🟠 MODERATE | MAVLink stub sends bad CRC | `arm/mavlink_bridge.h:455` | PX4 rejects packets |
| 🟠 MODERATE | ARM reads flow from non-existent AXI regs | `arm/drone_main.cpp:111-128` | ARM gets zero flow vectors |
| 🟠 MODERATE | ~~ARM reads flow from non-existent AXI regs~~ **FIXED**: s_axilite `FLOW` bundle + seqlock decode | `fpga/top_level.cpp` / `arm/fpga_interface.h` | ARM reads real per-event flow |
| 🟠 MODERATE | `ap_axiu<48,0,0,0>` invalid template | `fpga/top_level.cpp:27` | AXI interface may be malformed |
| 🟠 MODERATE | Stream is LIFO in simulation | `fpga/hls_compat.h:213-214` | C-sim differs from RTL sim |
| 🟠 MODERATE | `>> i` shift on ap_fixed not implemented | `fpga/encoder_systolic.h:79-80` | Won't compile in sim |
132 changes: 2 additions & 130 deletions arm/drone_main.cpp
@@ -6,7 +6,7 @@
//
// Architecture:
// while(1):
// 1. Read flow vectors from FPGA encoder output (AXI4-Stream DMA)
// 1. Read flow vectors from FPGA encoder output (AXI4-Lite FLOW bundle)
// 2. Convert to EventFlow structs
// 3. Run CollisionPredictor::assess() → ThreatAssessment
// 4. Run EvasionController::compute_command() → EvasionCommand
@@ -25,6 +25,7 @@

#include "collision_predictor.h"
#include "evasion_controller.h"
#include "fpga_interface.h"
#include "kalman_tracker.h"

// FreeRTOS or bare-metal alternatives
@@ -39,26 +40,6 @@
#define SLEEP_MS(ms) std::this_thread::sleep_for(std::chrono::milliseconds(ms))
#endif

// ---------------------------------------------------------------------------
// FPGA AXI Register Map (AXI4-Lite base addresses)
// These map to the control_regs_t struct in fpga/top_level.cpp
// ---------------------------------------------------------------------------
#define FPGA_BASE_ADDR 0x43C00000 // Example AXI address
#define REG_ENABLE (FPGA_BASE_ADDR + 0x00)
#define REG_ENABLE_MOTORS (FPGA_BASE_ADDR + 0x04)
#define REG_INFERENCE_PERIOD (FPGA_BASE_ADDR + 0x08)
#define REG_MANUAL_VX (FPGA_BASE_ADDR + 0x0C)
#define REG_MANUAL_VY (FPGA_BASE_ADDR + 0x10)
#define REG_MANUAL_VZ (FPGA_BASE_ADDR + 0x14)
#define REG_MANUAL_YAW (FPGA_BASE_ADDR + 0x18)
#define REG_MANUAL_MODE (FPGA_BASE_ADDR + 0x1C)
#define REG_EVENT_COUNT (FPGA_BASE_ADDR + 0x20)
#define REG_FLOW_PRED_BASE (FPGA_BASE_ADDR + 0x100) // Flow data from encoder

// DMA configuration for AXI4-Stream from encoder
#define DMA_RX_BASE 0x40400000
#define DMA_MAX_PACKET_SIZE (4096 * 8) // 4096 events × 8 bytes per flow vec

using namespace drone;

// ---------------------------------------------------------------------------
@@ -67,115 +48,6 @@ using namespace drone;
static std::atomic<bool> g_running{true};
static std::atomic<bool> g_motors_armed{false};

// ---------------------------------------------------------------------------
// Simulated FPGA register access (replace with actual MMIO for hardware)
// On actual Zynq: mmap /dev/mem → volatile pointer to FPGA AXI region
// ---------------------------------------------------------------------------
class FpgaInterface {
public:
FpgaInterface() {
// On real hardware: mmap FPGA AXI region
// void* ptr = mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd, FPGA_BASE_ADDR);
// registers_ = reinterpret_cast<volatile uint32_t*>(ptr);

// Simulation: allocate local memory for testing
registers_ = new volatile uint32_t[128]();
}

~FpgaInterface() { delete[] registers_; }

void write_register(uint32_t offset, uint32_t value) {
registers_[(offset - FPGA_BASE_ADDR) / 4] = value;
}

uint32_t read_register(uint32_t offset) { return registers_[(offset - FPGA_BASE_ADDR) / 4]; }

// -----------------------------------------------------------------
// Read flow vectors from FPGA encoder output (AXI4-Stream)
// Returns event_count flow vectors
//
// NOTE: In the current FPGA top_level.cpp, flow_pred data is stored
// in internal static BRAM arrays and only debug_flow[2] is exposed
// via AXI4-Lite. A future enhancement should map flow_pred[] to an
// AXI-readable address range (e.g., s_axilite bundle=MEM_FLOW) so
// the ARM can read per-event flow vectors for collision prediction.
// Until then, this function returns dummy data.
// -----------------------------------------------------------------
int read_flow_vectors(std::vector<EventFlow>& events, int max_events) {
events.clear();

uint32_t event_count = read_register(REG_EVENT_COUNT);
if (event_count == 0 || event_count > static_cast<uint32_t>(max_events)) {
return 0;
}

events.reserve(event_count);

// Read packed flow + position data from FPGA BRAM
for (uint32_t i = 0; i < event_count; ++i) {
// Each event: 2 × 32bit for flow (vx, vy as float) + position data
uint32_t base = (REG_FLOW_PRED_BASE - FPGA_BASE_ADDR) / 4 + i * 4;

EventFlow ev;
// Reconstruct float from fixed-point (INT16.Q8 → float)
uint32_t vx_raw = registers_[base + 0];
uint32_t vy_raw = registers_[base + 1];
uint32_t x_raw = registers_[base + 2];
uint32_t y_raw = registers_[base + 3];
std::memcpy(&ev.vx, &vx_raw, sizeof(float));
std::memcpy(&ev.vy, &vy_raw, sizeof(float));
std::memcpy(&ev.x, &x_raw, sizeof(float));
std::memcpy(&ev.y, &y_raw, sizeof(float));
ev.t = i; // Sequential within this batch

events.push_back(ev);
}

return event_count;
}

// -----------------------------------------------------------------
// Write velocity command to FPGA PWM module
// -----------------------------------------------------------------
void write_velocity_command(const EvasionCommand& cmd) {
// Convert float velocities to fixed-point for FPGA
// ap_fixed<16,4>: [-8.0, 8.0) range, Q4.12
int32_t vx_fp = static_cast<int32_t>(cmd.velocity_x * 4096.0f); // 2^12
int32_t vy_fp = static_cast<int32_t>(cmd.velocity_y * 4096.0f);
int32_t vz_fp = static_cast<int32_t>(cmd.velocity_z * 4096.0f);
int32_t yaw_fp = static_cast<int32_t>(cmd.yaw_rate * 4096.0f);

// Clamp to INT16 range
auto clamp_int16 = [](int32_t v) -> uint32_t {
if (v > 32767) v = 32767;
if (v < -32768) v = -32768;
return static_cast<uint32_t>(v & 0xFFFF);
};

write_register(REG_MANUAL_VX, clamp_int16(vx_fp));
write_register(REG_MANUAL_VY, clamp_int16(vy_fp));
write_register(REG_MANUAL_VZ, clamp_int16(vz_fp));
write_register(REG_MANUAL_YAW, clamp_int16(yaw_fp));

// Set manual mode to feed computed commands to PWM
write_register(REG_MANUAL_MODE, 1);
}

void enable(bool motors) {
write_register(REG_ENABLE, 1);
write_register(REG_ENABLE_MOTORS, motors ? 1 : 0);
write_register(REG_INFERENCE_PERIOD, 100000); // 1kHz inference trigger
}

void disable() {
write_register(REG_ENABLE, 0);
write_register(REG_ENABLE_MOTORS, 0);
}

private:
volatile uint32_t* registers_;
};

// ---------------------------------------------------------------------------
// Telemetry / logging
// ---------------------------------------------------------------------------
175 changes: 175 additions & 0 deletions arm/fpga_interface.h
@@ -0,0 +1,175 @@
// arm/fpga_interface.h — ARM-side AXI4-Lite interface to the FPGA pipeline
//
// Register map mirrors fpga/top_level.cpp:
// CTRL bundle @ FPGA_BASE_ADDR : control_regs_t fields
// FLOW bundle @ FPGA_FLOW_BASE : per-event (pos, flow) export
//
// NOTE: base addresses and FLOW offsets are design-time placeholders. The
// authoritative offsets come from the Vitis-generated
// xcollision_avoidance_top_hw.h after synthesis — confirm before flight.
#pragma once

#include <cstdint>
#include <vector>

#include "collision_predictor.h"
#include "evasion_controller.h"

// ---------------------------------------------------------------------------
// FPGA AXI Register Map (AXI4-Lite base addresses)
// These map to the control_regs_t struct in fpga/top_level.cpp
// ---------------------------------------------------------------------------
#define FPGA_BASE_ADDR 0x43C00000 // Example AXI address
#define REG_ENABLE (FPGA_BASE_ADDR + 0x00)
#define REG_ENABLE_MOTORS (FPGA_BASE_ADDR + 0x04)
#define REG_INFERENCE_PERIOD (FPGA_BASE_ADDR + 0x08)
#define REG_MANUAL_VX (FPGA_BASE_ADDR + 0x0C)
#define REG_MANUAL_VY (FPGA_BASE_ADDR + 0x10)
#define REG_MANUAL_VZ (FPGA_BASE_ADDR + 0x14)
#define REG_MANUAL_YAW (FPGA_BASE_ADDR + 0x18)
#define REG_MANUAL_MODE (FPGA_BASE_ADDR + 0x1C)
#define REG_EVENT_COUNT (FPGA_BASE_ADDR + 0x20)

// FLOW bundle: per-event flow export (see fpga/top_level.cpp FLOW_EXPORT).
// 2 words per entry i:
// REG_FLOW_DATA + 8*i bits[15:0]=x (UQ4.12) bits[31:16]=y (UQ4.12)
// REG_FLOW_DATA + 8*i + 4 bits[15:0]=vx (Q8.8) bits[31:16]=vy (Q8.8)
#define FPGA_FLOW_BASE 0x43C10000
#define REG_FLOW_COUNT (FPGA_FLOW_BASE + 0x10)
#define REG_FLOW_SEQ (FPGA_FLOW_BASE + 0x18)
#define REG_FLOW_DATA (FPGA_FLOW_BASE + 0x2000)
#define FLOW_MAX_OUT 1024 // mirrors fpga/top_level.cpp — keep in sync

// Simulated register file spans CTRL base through the end of the FLOW window
#define REG_SPACE_WORDS ((FPGA_FLOW_BASE + 0x2000 + 8 * FLOW_MAX_OUT - FPGA_BASE_ADDR) / 4)

// ---------------------------------------------------------------------------
// Simulated FPGA register access (replace with actual MMIO for hardware)
// On actual Zynq: mmap /dev/mem → volatile pointer to FPGA AXI region
// ---------------------------------------------------------------------------
class FpgaInterface {
public:
FpgaInterface() {
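// On real hardware: mmap the FPGA AXI region. The mapping must span CTRL
// through the end of the FLOW window (REG_SPACE_WORDS * 4 bytes); a 64 KiB
// mapping would stop short of REG_FLOW_DATA.
// void* ptr = mmap(NULL, REG_SPACE_WORDS * 4, PROT_READ|PROT_WRITE, MAP_SHARED, fd, FPGA_BASE_ADDR);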
// registers_ = reinterpret_cast<volatile uint32_t*>(ptr);

// Simulation: allocate local memory for testing
registers_ = new volatile uint32_t[REG_SPACE_WORDS]();
}

~FpgaInterface() { delete[] registers_; }

void write_register(uint32_t offset, uint32_t value) {
registers_[(offset - FPGA_BASE_ADDR) / 4] = value;
}

uint32_t read_register(uint32_t offset) { return registers_[(offset - FPGA_BASE_ADDR) / 4]; }

// -----------------------------------------------------------------
// Fixed-point packing — single source of truth for the FLOW word
// format, shared by decode below and the register round-trip tests.
// -----------------------------------------------------------------
static uint32_t pack_position(float x, float y) {
return (static_cast<uint32_t>(to_uq4_12(y)) << 16) | to_uq4_12(x);
}

static uint32_t pack_flow(float vx, float vy) {
return (static_cast<uint32_t>(to_q8_8(vy)) << 16) | to_q8_8(vx);
}

// -----------------------------------------------------------------
// Read per-event flow vectors from the FPGA FLOW bundle (AXI4-Lite).
// Seqlock protocol: snapshot flow_seq, read count + data, reread
// flow_seq — a mismatch means the FPGA exported mid-read; retry.
// -----------------------------------------------------------------
int read_flow_vectors(std::vector<drone::EventFlow>& events, int max_events) {
events.clear();

for (int attempt = 0; attempt < 3; ++attempt) {
uint32_t seq_before = read_register(REG_FLOW_SEQ);

uint32_t count = read_register(REG_FLOW_COUNT);
if (count == 0) return 0;
if (count > FLOW_MAX_OUT) count = FLOW_MAX_OUT;
if (count > static_cast<uint32_t>(max_events)) count = max_events;

events.reserve(count);
for (uint32_t i = 0; i < count; ++i) {
uint32_t pos = read_register(REG_FLOW_DATA + 8 * i);
uint32_t flw = read_register(REG_FLOW_DATA + 8 * i + 4);

drone::EventFlow ev;
ev.x = static_cast<uint16_t>(pos & 0xFFFF) / 4096.0f; // UQ4.12
ev.y = static_cast<uint16_t>(pos >> 16) / 4096.0f;
ev.vx = static_cast<int16_t>(flw & 0xFFFF) / 256.0f; // Q8.8
ev.vy = static_cast<int16_t>(flw >> 16) / 256.0f;
ev.t = i; // Sequential within this batch
events.push_back(ev);
}

if (read_register(REG_FLOW_SEQ) == seq_before) {
return static_cast<int>(count);
}
events.clear(); // torn read — retry
}
return 0;
}

// -----------------------------------------------------------------
// Write velocity command to FPGA PWM module
// -----------------------------------------------------------------
void write_velocity_command(const drone::EvasionCommand& cmd) {
// Convert float velocities to fixed-point for FPGA
// ap_fixed<16,4>: [-8.0, 8.0) range, Q4.12
int32_t vx_fp = static_cast<int32_t>(cmd.velocity_x * 4096.0f); // 2^12
int32_t vy_fp = static_cast<int32_t>(cmd.velocity_y * 4096.0f);
int32_t vz_fp = static_cast<int32_t>(cmd.velocity_z * 4096.0f);
int32_t yaw_fp = static_cast<int32_t>(cmd.yaw_rate * 4096.0f);

// Clamp to INT16 range
auto clamp_int16 = [](int32_t v) -> uint32_t {
if (v > 32767) v = 32767;
if (v < -32768) v = -32768;
return static_cast<uint32_t>(v & 0xFFFF);
};

write_register(REG_MANUAL_VX, clamp_int16(vx_fp));
write_register(REG_MANUAL_VY, clamp_int16(vy_fp));
write_register(REG_MANUAL_VZ, clamp_int16(vz_fp));
write_register(REG_MANUAL_YAW, clamp_int16(yaw_fp));

// Set manual mode to feed computed commands to PWM
write_register(REG_MANUAL_MODE, 1);
}

void enable(bool motors) {
write_register(REG_ENABLE, 1);
write_register(REG_ENABLE_MOTORS, motors ? 1 : 0);
write_register(REG_INFERENCE_PERIOD, 100000); // 1kHz inference trigger
}

void disable() {
write_register(REG_ENABLE, 0);
write_register(REG_ENABLE_MOTORS, 0);
}

private:
// UQ4.12 with round-to-nearest, saturating at the 16-bit rails
static uint16_t to_uq4_12(float v) {
float s = v * 4096.0f + 0.5f;
if (s < 0.0f) s = 0.0f;
if (s > 65535.0f) s = 65535.0f;
return static_cast<uint16_t>(s);
}

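// Q8.8 two's-complement with round-to-nearest, saturating.
// Saturate in float before converting: casting an out-of-range float to
// int32_t is undefined behavior, so the clamp must come first.
static uint16_t to_q8_8(float v) {
float s = v * 256.0f + (v >= 0 ? 0.5f : -0.5f);
if (s > 32767.0f) s = 32767.0f;
if (s < -32768.0f) s = -32768.0f;
return static_cast<uint16_t>(static_cast<int32_t>(s) & 0xFFFF);
}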

volatile uint32_t* registers_;
};
21 changes: 20 additions & 1 deletion fpga/README.md
@@ -89,7 +89,26 @@ Base address: `0x43C00000` (AXI4-Lite slave)
| 0x18 | `manual_yaw` | R/W | Manual yaw rate |
| 0x1C | `manual_mode` | R/W | 1=manual, 0=auto |
| 0x20 | `event_count` | RO | Events in ring buffer |
| 0x100+ | `flow_pred` | RO | Flow vector data (4096×8B) |

### FLOW bundle — per-event flow export

Base address: `0x43C10000` (second AXI4-Lite slave, bundle `FLOW`). Offsets
below follow standard Vitis s_axilite allocation but are **design-time
placeholders** — the generated `xcollision_avoidance_top_hw.h` is
authoritative after synthesis.

| Offset | Register | Access | Description |
|--------|----------|--------|-------------|
| 0x10 | `flow_count` | RO | Valid entries (0..1024) |
| 0x18 | `flow_seq` | RO | Export generation counter (seqlock) |
| 0x2000 + 8i | `flow_out[2i]` | RO | bits[15:0]=x (UQ4.12), bits[31:16]=y (UQ4.12) |
| 0x2004 + 8i | `flow_out[2i+1]` | RO | bits[15:0]=vx (Q8.8), bits[31:16]=vy (Q8.8) |
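
The two-word entry layout in the table can be unpacked as in the sketch below. It is a minimal illustration, not the project's decoder: `FlowSample` and `decode_flow_entry` are hypothetical names introduced here, and the real decode lives in `read_flow_vectors()` in `arm/fpga_interface.h`. It assumes the usual two's-complement reinterpretation when narrowing to `int16_t`.

```cpp
#include <cstdint>

// Hypothetical plain struct for illustration; the real code fills drone::EventFlow.
struct FlowSample {
    float x, y;    // position, decoded from UQ4.12 (unsigned, 12 fractional bits)
    float vx, vy;  // flow, decoded from Q8.8 (signed, 8 fractional bits)
};

// Decode one FLOW entry from its two 32-bit words:
//   pos_word  = flow_out[2i]   : bits[15:0]=x,  bits[31:16]=y
//   flow_word = flow_out[2i+1] : bits[15:0]=vx, bits[31:16]=vy
inline FlowSample decode_flow_entry(uint32_t pos_word, uint32_t flow_word) {
    FlowSample s;
    s.x = static_cast<uint16_t>(pos_word & 0xFFFF) / 4096.0f;  // divide by 2^12
    s.y = static_cast<uint16_t>(pos_word >> 16) / 4096.0f;
    s.vx = static_cast<int16_t>(flow_word & 0xFFFF) / 256.0f;  // divide by 2^8, sign kept
    s.vy = static_cast<int16_t>(flow_word >> 16) / 256.0f;
    return s;
}
```

For example, `pos_word = 0x20001800` decodes to `x = 1.5`, `y = 2.0`, and `flow_word = 0xFF800140` decodes to `vx = 1.25`, `vy = -0.5`.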

Read protocol (torn-read safe): read `flow_seq`, then `flow_count` and the
data words, then `flow_seq` again — if it changed, the FPGA exported a new
batch mid-read; retry. Export is bounded to the first 1024 events of a batch
(8KB window); the ARM clusterer needs only 30–50 events per object. See
`arm/fpga_interface.h` for the matching decode.
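
The read protocol above can be sketched as a small retry loop. This is an illustration under stated assumptions rather than the shipped code: `read_flow_words` and the `read_reg` callable (taking a byte offset relative to the FLOW base and returning a 32-bit word) are hypothetical, and the offsets are the placeholder values from the table. On real hardware the reads would go through a `volatile` MMIO pointer, and whether extra memory barriers are needed depends on how the region is mapped.

```cpp
#include <cstdint>
#include <vector>

// Placeholder offsets relative to the FLOW bundle base (see table above).
constexpr uint32_t kFlowCount = 0x10;
constexpr uint32_t kFlowSeq = 0x18;
constexpr uint32_t kFlowData = 0x2000;
constexpr uint32_t kFlowMaxOut = 1024;

// Copies the raw FLOW words (2 per entry) into `words`.
// Returns true when flow_seq was unchanged across the read (consistent snapshot),
// false when every attempt was torn; `words` is left empty in that case.
template <typename ReadReg>
bool read_flow_words(ReadReg read_reg, std::vector<uint32_t>& words, int max_attempts = 3) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        words.clear();
        const uint32_t seq_before = read_reg(kFlowSeq);

        uint32_t count = read_reg(kFlowCount);
        if (count > kFlowMaxOut) count = kFlowMaxOut;  // never read past the window

        for (uint32_t i = 0; i < 2 * count; ++i) {
            words.push_back(read_reg(kFlowData + 4 * i));
        }

        // If the FPGA exported a new batch mid-read, flow_seq has advanced.
        if (read_reg(kFlowSeq) == seq_before) return true;
    }
    words.clear();
    return false;
}
```

Bounding the retries keeps the control loop from stalling if the FPGA exports faster than the ARM can drain the window; the caller can simply skip that cycle.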

## Hardware Integration

3 changes: 3 additions & 0 deletions fpga/build.tcl
@@ -45,6 +45,9 @@ create_clock -period 10 -name clk
set_directive_interface -mode s_axilite -bundle CTRL "collision_avoidance_top"
set_directive_interface -mode s_axilite -bundle CTRL "collision_avoidance_top" ctrl_regs
set_directive_interface -mode s_axilite -bundle DEBUG "collision_avoidance_top" debug_flow
set_directive_interface -mode s_axilite -bundle FLOW "collision_avoidance_top" flow_out
set_directive_interface -mode s_axilite -bundle FLOW "collision_avoidance_top" flow_count
set_directive_interface -mode s_axilite -bundle FLOW "collision_avoidance_top" flow_seq

# Data interfaces (BRAM/stream for high-throughput paths)
set_directive_interface -mode bram "collision_avoidance_top" events