Overview
This project implements a production-grade AXI4 interconnect in SystemVerilog from first principles. The design is a 2×2 non-blocking crossbar — two AXI4 masters connecting to two synchronous SRAM slaves — built to demonstrate professional RTL engineering: protocol correctness, burst handling, multiple outstanding transactions, and system-level arbitration.
The implementation is fully synthesizable and parameterized, and it closes timing at 250 MHz on Xilinx UltraScale+ FPGAs.
Repository: akshay-b-prasad/amba-axi4-interconnect
Architecture
The crossbar connects two masters to two slaves, each backed by 64 KB of synchronous SRAM. When the masters target different slaves, both transactions proceed in parallel and neither waits on the other. This non-blocking property is fundamental to the design.
| Slave | Base Address | Capacity |
|---|---|---|
| S0 | 0x0000_0000 | 64 KB |
| S1 | 0x0001_0000 | 64 KB |
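The address map above reduces to a single-bit decode, since each window is 64 KB (2^16 bytes). The following is a minimal sketch of such a decoder, not the repository's code: the module name, port names, and the `dec_err` output are hypothetical, and it assumes the address is wider than 17 bits.

```systemverilog
// Hypothetical address decoder for the two-slave map above.
// Names are illustrative and not taken from the repository.
module addr_decode_sketch #(
  parameter int ADDR_W = 32          // assumed default; must be > 17
) (
  input  logic [ADDR_W-1:0] addr,
  output logic              sel_s1,  // 0 selects S0, 1 selects S1
  output logic              dec_err  // address lies outside both windows
);
  // 64 KB = 2^16 bytes: bits [15:0] are the offset inside a slave,
  // bit 16 picks the slave.
  localparam int WIN_BITS = 16;

  always_comb begin
    sel_s1  = addr[WIN_BITS];
    // Any set bit above bit 16 means the address is past 0x0001_FFFF.
    dec_err = |addr[ADDR_W-1:WIN_BITS+1];
  end
endmodule
```

With this decode, 0x0000_FFFC selects S0 and 0x0001_0000 selects S1. How the actual crossbar responds to an unmapped address is not stated here, so `dec_err` is only a placeholder for that case.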
Key Design Decisions
Round-robin arbitration. When both masters contend for the same slave, a round-robin arbiter resolves priority without starvation. The arbiter state advances after each granted transaction, not each beat, preserving burst atomicity.
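A minimal sketch of the arbitration policy described above, for one slave port. It is an illustration under assumptions rather than the repository's arbiter: the module name, the `txn_done` input (imagined as a pulse when the granted transaction completes), and the one idle cycle between grants are all hypothetical.

```systemverilog
// Hypothetical two-requester round-robin arbiter.
// The grant is held until txn_done, so priority rotates per
// transaction and never in the middle of a burst.
module rr_arb2_sketch (
  input  logic       clk,
  input  logic       rst_n,
  input  logic [1:0] req,       // req[i]: master i wants this slave
  input  logic       txn_done,  // assumed: pulses when the transaction ends
  output logic [1:0] grant      // one-hot, stable for the whole transaction
);
  logic       last;  // master served most recently
  logic [1:0] pick;  // next winner if the slave port is idle

  // The master that was not served last gets first choice.
  always_comb begin
    if (last) pick = req[0] ? 2'b01 : (req[1] ? 2'b10 : 2'b00);
    else      pick = req[1] ? 2'b10 : (req[0] ? 2'b01 : 2'b00);
  end

  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      grant <= 2'b00;
      last  <= 1'b1;            // so master 0 wins the first tie
    end else if (grant == 2'b00) begin
      grant <= pick;            // idle: accept a new winner, if any
    end else if (txn_done) begin
      last  <= grant[1];        // remember who was just served
      grant <= 2'b00;           // release the port
    end
  end
endmodule
```

If both masters keep requesting, the grants alternate M0, M1, M0, and so on, which is what rules out starvation.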
ID extension. The crossbar prepends a 1-bit master index to each master’s AXI ID before forwarding a request to a slave. The slave tracks and returns this extended ID, and the crossbar strips the prefix before handing the response back, following the ARM CoreLink pattern. The prefix lets the crossbar route every read and write response to the correct master unambiguously.
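The ID handling can be sketched as pure combinational logic. This is a hedged illustration: the module name, port names, and the 4-bit default ID width are assumptions, not values from the repository.

```systemverilog
// Hypothetical ID extend/strip logic for one request and one response path.
module id_extend_sketch #(
  parameter int ID_W = 4               // master-side ID width (assumed)
) (
  input  logic            master_idx,  // 0 for M0, 1 for M1
  input  logic [ID_W-1:0] m_axid,      // ID issued by the master
  output logic [ID_W:0]   s_axid,      // extended ID sent to the slave
  input  logic [ID_W:0]   s_rspid,     // BID or RID returned by the slave
  output logic            rsp_master,  // master that owns the response
  output logic [ID_W-1:0] m_rspid      // original ID handed back to it
);
  always_comb begin
    s_axid     = {master_idx, m_axid};  // prepend the index as the MSB
    rsp_master = s_rspid[ID_W];         // MSB routes the response
    m_rspid    = s_rspid[ID_W-1:0];     // strip the prefix again
  end
endmodule
```

Because the slave-side ID is one bit wider than the master-side ID, two masters can reuse the same ID value without their responses colliding.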
Per-master W-route FIFOs. AXI4 decouples the AW (write address) and W (write data) channels, so write data can arrive before the slave is selected. A FIFO per master records which slave claimed each AW transaction, ensuring write beats are delivered to the correct destination even under back-pressure.
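One way to picture a W-route FIFO is as a small queue of slave indices: pushed on each AW handshake, read at the head to steer W beats, and popped on the final beat of each burst. The sketch below is written under assumptions; the names, the depth of 4, and the push/pop conditions are hypothetical and not taken from the repository.

```systemverilog
// Hypothetical per-master W-route FIFO storing one slave index per AW.
module w_route_fifo_sketch #(
  parameter int DEPTH = 4     // assumed; must be a power of two >= 2
) (
  input  logic clk,
  input  logic rst_n,
  input  logic push,          // AW handshake accepted for this master
  input  logic push_slave,    // decoded target: 0 = S0, 1 = S1
  input  logic pop,           // W handshake with WLAST for this master
  output logic head_slave,    // destination of the current W beats
  output logic empty,         // no AW recorded yet: W beats must wait
  output logic full           // stop accepting AW from this master
);
  localparam int PTR_W = $clog2(DEPTH);

  logic           mem [0:DEPTH-1];
  logic [PTR_W:0] wr_ptr, rd_ptr;  // extra MSB tells full from empty

  assign empty      = (wr_ptr == rd_ptr);
  assign full       = (wr_ptr[PTR_W] != rd_ptr[PTR_W]) &&
                      (wr_ptr[PTR_W-1:0] == rd_ptr[PTR_W-1:0]);
  assign head_slave = mem[rd_ptr[PTR_W-1:0]];

  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      wr_ptr <= '0;
      rd_ptr <= '0;
    end else begin
      if (push && !full) begin
        mem[wr_ptr[PTR_W-1:0]] <= push_slave;
        wr_ptr <= wr_ptr + 1'b1;
      end
      if (pop && !empty) rd_ptr <= rd_ptr + 1'b1;
    end
  end
endmodule
```

AXI4 has no WID, so each master's write data follows the order of its own AW transactions. That ordering is why a simple first-in, first-out queue is enough to steer the beats.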
Read serialization. The current implementation processes one read burst per slave at a time, which keeps the protocol behavior easy to follow and to verify. The read data path is a documented extension point for pipelining.
Protocol Compliance
The interconnect handles the full AXI4 feature set required by AMBA IHI0022H:
- Burst lengths up to 256 beats (`AWLEN`/`ARLEN` = 8-bit)
- Burst types: FIXED, INCR, WRAP (all address calculation modes)
- Byte strobes (`WSTRB`) for byte-granular write masking across all paths
- Outstanding transactions: up to 4 concurrent in-flight writes and 4 reads per slave
- Correct `VALID`/`READY` handshaking on all five AXI channels (AW, W, B, AR, R)
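The three burst types differ only in how the address of the next beat is derived. The function below is a sketch of that calculation as the AXI4 specification describes it, not the repository's implementation. The package name, function name, and fixed 32-bit address width are assumptions. Note that in AXI4 only INCR bursts may reach 256 beats, and a legal WRAP burst has 2, 4, 8, or 16 beats with a size-aligned start address, which is what makes the mask-based boundary calculation valid.

```systemverilog
// Hypothetical helper package; names and the 32-bit width are assumptions.
package axi_burst_sketch_pkg;
  localparam logic [1:0] BURST_FIXED = 2'b00;
  localparam logic [1:0] BURST_INCR  = 2'b01;
  localparam logic [1:0] BURST_WRAP  = 2'b10;

  // Returns the address of the beat that follows `addr`.
  function automatic logic [31:0] next_addr(
    input logic [31:0] addr,
    input logic [7:0]  len,    // AxLEN: number of beats minus one
    input logic [2:0]  size,   // AxSIZE: log2(bytes per beat)
    input logic [1:0]  burst   // AxBURST
  );
    logic [31:0] bytes, aligned, incr, span, base;
    bytes   = 32'd1 << size;                   // bytes per beat
    aligned = addr & ~(bytes - 32'd1);         // align down to the beat size
    incr    = aligned + bytes;                 // plain incrementing address
    span    = bytes * ({24'd0, len} + 32'd1);  // total bytes in the burst
    base    = addr & ~(span - 32'd1);          // lower wrap boundary
    case (burst)
      BURST_FIXED: next_addr = addr;           // same address every beat
      BURST_WRAP:  next_addr = (incr == base + span) ? base : incr;
      default:     next_addr = incr;           // INCR
    endcase
  endfunction
endpackage
```

For a 4-beat WRAP burst of 4-byte beats starting at 0x34, this yields the sequence 0x34, 0x38, 0x3C, 0x30: the address wraps back to the 16-byte boundary at 0x30 once it reaches 0x40.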
RTL Quality
The design is written to modern SystemVerilog coding standards:
- `always_ff` for all sequential logic; `always_comb` for combinational
- `$clog2`-parameterized widths; no magic numbers or architecture-specific constants
- Module-level parameters for data width, address width, ID width, and burst depth
- No latches; all outputs registered or explicitly driven combinationally
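As an illustration of these conventions, the following sketch shows a burst beat counter written in the same style. It is a hypothetical module, not one from the repository; the names and the `MAX_BEATS` parameter are assumptions.

```systemverilog
// Hypothetical beat counter illustrating the coding style listed above.
module beat_counter_sketch #(
  parameter int MAX_BEATS = 256                 // assumed burst depth
) (
  input  logic                         clk,
  input  logic                         rst_n,
  input  logic                         start,  // load a new burst length
  input  logic [$clog2(MAX_BEATS)-1:0] len,    // beats minus one
  input  logic                         beat,   // one data handshake occurred
  output logic                         last    // current beat is the final one
);
  localparam int CNT_W = $clog2(MAX_BEATS);    // width derived, not hard-coded

  logic [CNT_W-1:0] remaining, remaining_d;

  // Combinational next-state logic; the default assignment on the
  // first line means no path can infer a latch.
  always_comb begin
    remaining_d = remaining;
    if (start)              remaining_d = len;
    else if (beat && !last) remaining_d = remaining - 1'b1;
  end

  // Meaningful only while a burst is active.
  assign last = (remaining == '0);

  // Sequential logic holds nothing but the register update.
  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) remaining <= '0;
    else        remaining <= remaining_d;
  end
endmodule
```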
Verification
An 8-scenario self-checking testbench covers the full feature surface:
| # | Scenario |
|---|---|
| 1 | Single-beat read and write |
| 2 | Multi-beat INCR bursts |
| 3 | Address WRAP burst with boundary crossing |
| 4 | Byte-strobe selective write |
| 5 | Back-pressure on READY de-assertion |
| 6 | Cross-slave concurrency (M0→S0, M1→S1 simultaneously) |
| 7 | Arbiter contention (both masters targeting same slave) |
| 8 | Maximum-length bursts (256 beats) |
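"Self-checking" here means the testbench compares observed values against expected ones and reports pass or fail on its own. A minimal sketch of that pattern follows; the module name, task name, and messages are hypothetical and are not taken from `axi4_tb.sv`.

```systemverilog
// Hypothetical self-checking skeleton; not the repository's testbench.
module tb_check_sketch;
  int errors = 0;

  // Compare one observed word against its expected value.
  // The !== operator also flags X or Z bits as mismatches.
  task automatic check(
    input string       name,
    input logic [31:0] got,
    input logic [31:0] want
  );
    if (got !== want) begin
      errors = errors + 1;
      $display("FAIL %s: got %h, expected %h", name, got, want);
    end
  endtask

  initial begin
    // A real scenario would drive the DUT and pass its read data here.
    check("single-beat readback", 32'hDEAD_BEEF, 32'hDEAD_BEEF);

    if (errors == 0) $display("ALL CHECKS PASSED");
    else             $display("%0d check(s) failed", errors);
    $finish;
  end
endmodule
```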
The testbench supports four simulators via a single Makefile:
make SIM=icarus # Icarus Verilog (default)
make SIM=modelsim # ModelSim / QuestaSim
make SIM=xrun # Cadence Xcelium
make SIM=vivado # Vivado Simulator
FPGA Results
Targeting Xilinx UltraScale+ at 250 MHz with default parameters (32-bit data, 2×64 KB SRAM):
| Resource | Utilization |
|---|---|
| LUTs | ~800 |
| Flip-Flops | ~600 |
| BRAM 18K | 4 blocks |
The SRAM is inferred as Block RAM automatically from its synchronous read/write coding pattern. Timing constraints are included in `constraints/` for out-of-the-box Vivado implementation.
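The coding pattern that synthesis tools typically recognize as Block RAM is a single clocked process with a registered read and per-byte write enables. The sketch below shows that general pattern under assumptions: the module name, ports, and read-first behavior are hypothetical and may differ from the SRAM model in `rtl/`.

```systemverilog
// Hypothetical synchronous SRAM with byte write enables.
module sram_sketch #(
  parameter int DATA_W = 32,
  parameter int DEPTH  = 16384      // 16384 words x 4 bytes = 64 KB
) (
  input  logic                     clk,
  input  logic                     en,
  input  logic [DATA_W/8-1:0]      we,     // one bit per byte lane (WSTRB)
  input  logic [$clog2(DEPTH)-1:0] addr,   // word address
  input  logic [DATA_W-1:0]        wdata,
  output logic [DATA_W-1:0]        rdata
);
  logic [DATA_W-1:0] mem [0:DEPTH-1];

  always_ff @(posedge clk) begin
    if (en) begin
      // Write only the byte lanes whose strobe is set.
      for (int i = 0; i < DATA_W/8; i++) begin
        if (we[i]) mem[addr][8*i +: 8] <= wdata[8*i +: 8];
      end
      // Registered read; on a simultaneous write it returns the old word.
      rdata <= mem[addr];
    end
  end
endmodule
```

There is no reset on the memory array itself, because Block RAM contents cannot be cleared by a reset signal and a reset term can prevent inference.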
File Structure
rtl/ - axi4_pkg.sv, interfaces, SRAM model, slave, master, crossbar, top
tb/ - axi4_tb.sv (self-checking, 8 scenarios)
sim/ - Makefile with multi-simulator support
constraints/ - Xilinx XDC for 250 MHz timing closure