Routing High-Speed NVLink-Like Traces: PCB Layout Best Practices for RISC-V SoCs with GPU Interconnects

Practical PCB rules and stackups for NVLink Fusion‑style GPU interconnects with SiFive RISC‑V SoCs — trace, via, PI, and SI checklists for 2026 designs.

If you are designing a board that connects a SiFive RISC‑V SoC to an external GPU using NVLink Fusion‑style links, you already know the pain: aggressive lane rates, strict channel budgets, and PCB rules that make or break system performance. This guide gives you field‑tested layout rules, stackup templates, and signal‑integrity checks you can apply today to get first‑spin boards working.

The context in 2026 — why this matters now

Late 2025 and early 2026 saw broader adoption of NVLink Fusion-style fabrics as vendors integrate GPU interconnects with heterogeneous CPU/GPU SoCs. SiFive’s partnership announcements and the move to higher PAM4 lane rates across the industry mean board designers must treat GPU interconnect lanes like high‑priority radio frequency traces. Expect channel requirements aligned with modern serializer/deserializer (SerDes) PHYs (think multi‑tens to low‑hundreds of Gbps aggregate), and plan accordingly.

Top takeaways up front

  • Design your stackup for tight impedance control and low loss — aim for a 100 Ω differential target with controlled microstrip/stripline layers.
  • Use back‑drilling, via optimization, and stitch planes to minimize inductance and stub effects.
  • Model the full channel with vendor S‑parameters and IBIS‑AMI models early; run insertion/return loss and eye simulations before tape‑out.
  • Prioritize power integrity — noisy PDNs destroy high‑speed eyes even when routing is perfect.

1. Channel assumptions and lane budgeting

Before a single trace is routed, establish the channel budget based on the PHY you’ll use. Typical modern GPU/SoC interconnect PHYs in 2026 use PAM4 at 56–112 Gbps per lane or NRZ at 25–56 Gbps. Work with the PHY vendor (SiFive/Nvidia or third‑party PHY IP) to get:

  • S‑parameter (touchstone) models for connectors, cables (if any), and packages
  • IBIS‑AMI behavioral models for the SerDes PHYs
  • Connector/channel insertion loss and return loss requirements at target baud rates

Set your system eye target and margin early — for PAM4 channels use tighter budgets for linearity and noise; for NRZ, equalization can help but still requires low insertion loss and controlled reflections.
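
To make the budget concrete, here is a minimal back-of-envelope sketch in Python. All numbers (lane count, rate, loss per inch, loss allocation) are assumed examples for illustration, not vendor specifications; replace them with your PHY datasheet values.

```python
# Back-of-envelope lane budgeting. All numbers are assumed examples,
# not vendor specs -- swap in your PHY's datasheet values.

def nyquist_ghz(lane_gbps: float, pam4: bool) -> float:
    """Nyquist frequency in GHz: baud/2. PAM4 carries 2 bits/symbol."""
    baud_gbd = lane_gbps / 2.0 if pam4 else lane_gbps
    return baud_gbd / 2.0

lanes, lane_gbps, pam4 = 8, 112.0, True
aggregate_gbps = lanes * lane_gbps     # raw aggregate throughput
f_nyq = nyquist_ghz(lane_gbps, pam4)   # 28 GHz for 112G PAM4 (56 GBd)

loss_db_per_inch = 1.1  # assumed laminate+copper loss at f_nyq (measure yours)
il_budget_db = 15.0     # example board-route insertion-loss allocation at Nyquist

print(f"{aggregate_gbps:.0f} Gb/s aggregate, Nyquist {f_nyq:.1f} GHz")
print(f"~{il_budget_db / loss_db_per_inch:.1f} in of route fits a "
      f"{il_budget_db} dB budget at {loss_db_per_inch} dB/in")
```

The point of the exercise: at 112G PAM4 the Nyquist frequency lands near 28 GHz, where laminate loss per inch is several times worse than at NRZ rates, so route-length budgets shrink fast.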

2. Stackup recommendations

Stackup choice is the single biggest PCB decision for these links. Below are practical examples that have been validated for similar high‑speed interconnects in production boards.

Key stackup principles

  • Dedicated reference planes: Keep a solid plane directly adjacent to high‑speed signal layers to minimize return path loop inductance and control impedance.
  • Thin dielectric near signals: Thinner dielectrics reduce trace width for a given impedance and reduce radiation/crosstalk.
  • Symmetry: Make the stackup symmetric when possible to reduce warpage and ensure predictable insertion loss.
  • Use low‑loss materials for the top high‑rate layers: Choose laminates with Df ≤ 0.010 at relevant frequencies (e.g., 10–30 GHz) — modern FR4 variants or Isola/Rogers blends are common.

Example 8‑layer stackup for single‑board GPU interconnects

This stackup gives controlled microstrip transmission for routing NVLink‑like lanes with 100 Ω differential targets and good manufacturability:

  1. Top (Signal) — Microstrip — 35 µm (1 oz) copper
  2. Prepreg — 0.12 mm (adjacent reference plane)
  3. Plane (GND) — 1 oz
  4. Signal (internal) — Stripline — for sensitive mid‑rate signals
  5. Core — 0.2 mm
  6. Plane (PWR) — 1 oz
  7. Prepreg — 0.12 mm
  8. Bottom (Signal)

With typical FR4 (Er ~ 4.2) this gives differential trace widths in the 4–6 mil (0.1–0.15 mm) range and spacing in a similar range to achieve ~100 Ω differential — but always verify with your PCB fab's stackup calculator and include copper roughness in the model. For ready visuals, embed your stackup drawings and editable templates in internal docs so the whole team can iterate on the layer plan.
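
As a sanity check before the fab's field-solver run, a rough closed-form estimate can flag geometry that is far off target. The sketch below uses the classic IPC‑2141 microstrip approximation plus the common edge-coupled correction term; the dimensions are illustrative, and the result is first-pass only (it ignores soldermask and copper roughness).

```python
import math

def microstrip_z0(w_mm: float, h_mm: float, t_mm: float, er: float) -> float:
    """IPC-2141 microstrip estimate (rough; ~0.1 < w/h < 2, er < 15)."""
    return (87.0 / math.sqrt(er + 1.41)) * math.log(5.98 * h_mm / (0.8 * w_mm + t_mm))

def diff_microstrip_z(w_mm, h_mm, t_mm, s_mm, er) -> float:
    """Edge-coupled pair: Zdiff ~ 2*Z0*(1 - 0.48*exp(-0.96*s/h))."""
    z0 = microstrip_z0(w_mm, h_mm, t_mm, er)
    return 2.0 * z0 * (1.0 - 0.48 * math.exp(-0.96 * s_mm / h_mm))

# Illustrative numbers echoing the stackup above: 0.13 mm trace, 0.12 mm prepreg,
# 35 um copper, 0.15 mm gap, FR4 Er ~ 4.2.
w, h, t, s, er = 0.13, 0.12, 0.035, 0.15, 4.2
print(f"Z0  ~ {microstrip_z0(w, h, t, er):5.1f} ohm")
print(f"Zdd ~ {diff_microstrip_z(w, h, t, s, er):5.1f} ohm (target 100 ohm +/-5%)")
```

With these example dimensions the estimate lands near 103 Ω differential — close enough to proceed to the fab's solver, which remains the authority.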

12‑layer stackup for large systems and stricter loss budgets

For long channels or racks, use additional plane pairs and lower dielectric thicknesses on the top/bottom layers. Move the highest‑speed lanes to the top microstrip layer with nearest GND, and use internal stripline layers for sensitive mid‑rate signals.

3. Target impedances and trace geometry

Industry practice for differential SerDes lanes is to target 90–100 Ω differential (common is 100 Ω). Single‑ended reference is 50 Ω for many PHYs. Key routing rules:

  • Keep differential traces symmetrical: equal length, equal curvature, equal vias.
  • Use constant spacing and width for the majority of the channel; avoid impedance discontinuities at splits or returns.
  • Keep differential pair skew within the picosecond budget (see Section 5).

Practical geometry checklist

  • Target differential impedance: 100 Ω ±5%.
  • Maintain pair separation (edge‑to‑edge) such that s/h ratio approximates your stackup calculator’s recommendation — common s/h ~ 0.3–0.6 for differential control.
  • Minimum trace width: 4–6 mil (0.1–0.15 mm) for standard fabrication; use wider traces on low‑loss laminates if copper roughness increases insertion loss.
  • Route with 45° turns or gentle arcs — avoid 90° corners.

4. Via strategy and transitions

Vias are one of the largest contributors to channel loss and reflection. Design via transitions carefully:

  • Avoid stubs: Back‑drill all vias used in high‑speed lanes to remove stubs; where back‑drilling is not possible, keep residual stub length < 0.5 mm (see the stub‑resonance sketch after this list).
  • Optimize via geometry: Minimize via barrel diameter consistent with manufacturability, tune antipad size with a 3‑D solver, and reduce the annular ring only if qualified by the fabricator.
  • Via‑in‑pad caveat: Via‑in‑pad reduces inductance but requires filled and plated‑over (capped) vias and care in assembly — use only with proven assembly houses.
  • Blind/buried vias: Consider blind or buried vias to keep the top microstrip clean; this implies higher cost and DFM constraints.
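
To see why residual stub length matters, the quarter-wave resonance of a stub can be estimated as f ≈ c / (4 · L · √εr). The sketch below (assumed εr, illustrative stub lengths) shows how quickly the notch drops into the band of a PAM4 channel.

```python
import math

def stub_notch_ghz(stub_mm: float, er: float = 4.2) -> float:
    """Quarter-wave resonance of a via stub: f = c / (4 * L * sqrt(er))."""
    c_mm_per_ns = 299.792458  # speed of light in mm/ns -> GHz for lengths in mm
    return c_mm_per_ns / (4.0 * stub_mm * math.sqrt(er))

for stub in (0.5, 1.0, 2.0):  # residual stub lengths in mm
    print(f"{stub:4.1f} mm stub -> notch near {stub_notch_ghz(stub):5.1f} GHz")
```

A 0.5 mm stub resonates around 73 GHz, comfortably out of band, while a 2 mm stub notches near 18 GHz — well inside a 56 GBd PAM4 channel (Nyquist ~28 GHz). That is why back-drilling is non-negotiable at these rates.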

5. Length matching, skew and timing budgets

Lane skew kills link margin. Define timing budgets in picoseconds and convert to length using the board propagation speed (~150 ps/in, or 59 ps/cm, in FR4 as a starting point); a quick converter follows the list below.

  • For multi‑lane links (NVLink‑style lane groups), target lane‑to‑lane skew < 10 ps for PAM4 at the highest data rates; a practical target is < 5–10 ps wherever possible.
  • Use serpentine tuning distributed across lanes rather than concentrated in one area to avoid local density and crosstalk.
  • Keep per‑lane serpentine lengths minimal and avoid tight meanders; maintain consistent coupling to prevent impedance perturbation.
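
A minimal converter between the picosecond budget and a physical mismatch limit, assuming the ~59 ps/cm FR4 figure above (swap in your stackup's extracted propagation delay):

```python
def skew_to_mismatch_mm(skew_ps: float, prop_ps_per_cm: float = 59.0) -> float:
    """Allowable length mismatch (mm) for a given skew budget in ps."""
    return skew_ps / prop_ps_per_cm * 10.0  # cm -> mm

for budget_ps in (5.0, 10.0):
    print(f"{budget_ps:4.1f} ps -> {skew_to_mismatch_mm(budget_ps):.2f} mm in FR4")
```

A 5 ps budget allows only ~0.85 mm of mismatch in FR4, which is why skew must be managed at the routing stage rather than patched afterwards.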

6. Power integrity (PI) — don’t treat it as optional

Even perfect routing fails if the PDN is noisy. GPUs and SoCs create simultaneous switching currents that modulate reference planes and create jitter.

  • Design plane pairs adjacent to signal layers for low loop inductance.
  • Place decoupling capacitors in a distributed pattern — bulk near regulators, smaller ceramics near power pins to achieve a wide frequency response.
  • Target a low PDN impedance: aim for single‑digit milliohms at the switching frequencies of interest, and validate with PI simulation (a quick target‑impedance sketch follows this list).
  • Use multiple through‑hole bulk caps for low frequencies and parallel MLCCs for high frequency decoupling.
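
A minimal sketch of the arithmetic behind those targets, with assumed rail numbers (swap in your own rail voltage, ripple allowance, and transient step); it also estimates where a mounted MLCC stops looking capacitive.

```python
import math

def pdn_target_mohm(vdd_v: float, ripple_pct: float, i_step_a: float) -> float:
    """Target PDN impedance = allowed ripple voltage / worst-case current step."""
    return (vdd_v * ripple_pct / 100.0) / i_step_a * 1e3  # milliohms

def mlcc_srf_mhz(c_f: float, esl_h: float) -> float:
    """Self-resonant frequency of a decoupling cap including mounted ESL."""
    return 1.0 / (2.0 * math.pi * math.sqrt(esl_h * c_f)) / 1e6

# Assumed example: 0.8 V core rail, 5% ripple allowance, 20 A load step.
print(f"Z_target ~ {pdn_target_mohm(0.8, 5.0, 20.0):.1f} mOhm")
# 100 nF 0402 with ~0.5 nH mounted inductance (pads and vias included):
print(f"MLCC SRF ~ {mlcc_srf_mhz(100e-9, 0.5e-9):.0f} MHz")
```

With these assumptions the target is ~2 mΩ, and a single 100 nF MLCC is effective only up to a few tens of MHz — hence the distributed mix of bulk, mid, and small ceramics.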

7. AC coupling, common-mode, and termination

Most high‑speed differential SerDes links are AC‑coupled between chips and require proper common‑mode handling and terminations:

  • Place AC coupling capacitors close to the PHY (within 5 mm) — typical values: 75–200 nF (commonly 100 nF) with X7R or C0G dielectric for high‑frequency stability (see the corner‑frequency sketch after this list).
  • Follow vendor termination recommendations: many PHYs expect inline 50 Ω single‑ended equivalents per leg or internal termination — verify with IBIS models.
  • Consider common‑mode choke placement for EMI control if the PHY supports it; check insertion loss impact.
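
For context on the capacitor values, here is a quick check of the resulting high-pass corner, assuming a ~50 Ω termination per leg. Scrambled or DC-balanced line codes keep spectral content far above these corners.

```python
import math

def ac_corner_khz(c_f: float, r_term_ohm: float = 50.0) -> float:
    """High-pass corner of an AC-coupled leg into its termination."""
    return 1.0 / (2.0 * math.pi * r_term_ohm * c_f) / 1e3  # kHz

for c in (75e-9, 100e-9, 200e-9):
    print(f"{c * 1e9:5.0f} nF -> corner ~ {ac_corner_khz(c):5.1f} kHz")
```

A 100 nF cap into 50 Ω gives a corner near 32 kHz, leaving ample margin below the low-frequency content of a scrambled high-rate stream.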

8. Crosstalk, isolation and component placement

Maintain isolation between high‑speed lanes and noisy components like switching regulators or Ethernet PHYs:

  • Keep high‑speed differential pairs grouped together and well clear of switching regulators; as a spacing rule of thumb, keep aggressor traces at least 3× the dielectric height to the reference plane (the "3H" rule) away from the pairs.
  • Use ground stitching vias every 3–5 mm along differential route transitions and near connectors to provide consistent return paths.
  • Separate differential groups by at least 3× the pair width or use grounded guard traces for extreme cases.

9. Modeling and verification workflow

Simulate early and iterate quickly. A practical verification flow:

  1. Get vendor S‑parameters and IBIS‑AMI models.
  2. Build a channel model in your SI tool (Keysight ADS, Ansys HFSS, Cadence Sigrity, Mentor HyperLynx).
  3. Run insertion loss, return loss, and crosstalk simulations. Verify the channel meets the PHY mask to the required frequency (e.g., up to Nyquist or higher for PAM4) — see the cascade sketch after this list.
  4. Simulate eyes with equalization settings. Adjust channel or PCB rules until the eye meets margin requirements.
  5. Post‑layout, create a test coupon with representative channels and panelize it. Request S‑parameter measurements from the manufacturer for correlation.
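
A minimal sketch of step 3 using scikit-rf, a common open-source tool for this kind of check. The file names are placeholders, and Touchstone port ordering varies by vendor, so confirm which ports are "in" versus "out" before cascading.

```python
# Channel-cascade sanity check with scikit-rf (pip install scikit-rf).
import numpy as np
import skrf as rf

pkg_tx = rf.Network("tx_package.s4p")    # vendor package model (placeholder name)
route  = rf.Network("board_route.s4p")   # extracted PCB channel (placeholder name)
pkg_rx = rf.Network("rx_package.s4p")

channel = pkg_tx ** route ** pkg_rx      # cascade 4-ports; port order must match
channel.se2gmm(p=2)                      # single-ended -> mixed-mode, diff ports first

sdd21_db = 20.0 * np.log10(np.abs(channel.s[:, 1, 0]))  # differential insertion loss
f_nyq_hz = 28e9                          # e.g., 56 GBd PAM4
i = int(np.argmin(np.abs(channel.f - f_nyq_hz)))
print(f"Differential IL at {channel.f[i] / 1e9:.1f} GHz: {sdd21_db[i]:.1f} dB")
```

This kind of script is useful as a fast regression gate between full 3‑D solver runs: re-extract the route, re-cascade, and confirm the loss at Nyquist has not drifted.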

10. Manufacturing, test coupons and DFM

Work with fabricators experienced in high‑speed boards. Key items to request:

  • Impedance control report with actual measured impedances across panels.
  • Soldermask definition: use consistent clearance to avoid microstrip changes.
  • Test coupons including paired serpentine traces, via stacks, and transitions for the exact layers used.
  • Specify surface finish: ENIG vs. immersion silver vs. OSP — each affects high‑frequency loss. ENIG is common but check skin effect and solderability trade‑offs.

11. Debug and lab checks — what to test on first prototypes

When first boards arrive:

  • Measure S‑parameters on the coupon to validate insertion/return loss vs. simulated values.
  • Run TDR to verify impedance along the routed lanes and catch unexpected discontinuities.
  • Flash firmware to enable PHY loopback and run BER testing; observe eye diagrams with a real‑time oscilloscope or BERT (a test‑time sketch follows this list).
  • Record PDN impedance sweep with a VNA and compare to target — noisy PDN often correlates with jitter and BER failures.
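
When planning the BER run, the standard zero-error confidence bound, N = -ln(1 - CL) / BER (about 3/BER at 95% confidence), tells you how many clean bits you need and therefore how long to test:

```python
import math

def ber_test(ber_target: float, line_rate_gbps: float, confidence: float = 0.95):
    """Error-free bits (and seconds) needed to claim BER < target at confidence."""
    n_bits = -math.log(1.0 - confidence) / ber_target   # ~3/BER at 95%
    return n_bits, n_bits / (line_rate_gbps * 1e9)

bits, secs = ber_test(1e-12, 112.0)
print(f"Need ~{bits:.1e} clean bits (~{secs:.0f} s per lane at 112 Gb/s)")
```

At 112 Gb/s that is roughly half a minute per lane for BER 1e-12 — cheap enough to run across temperature and voltage corners.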

12. Real‑world examples and rules of thumb (experience speaks)

“When we migrated a RISC‑V SoC board to support NVLink‑style lanes in Q3 2025, back‑drilling and plane stitching reduced our eye closure by ~25% and allowed us to relax EQ settings at the receiver.”

From field experience, here are pragmatic heuristics:

  • If insertion loss at Nyquist > 15 dB, expect heavy Rx equalization and limited margin; redesign to reduce loss or add retimers.
  • If early BER is sensitive to power sequencing, add supervisor logic or follow vendor ordering for power rails — GPU/SoC interconnects can be power‑sequence sensitive.
  • When using multi‑board solutions, keep the board‑to‑board connector as short as possible and model it as part of the channel.

Looking ahead

Expect continued densification of GPU interconnects — more lanes at higher per‑lane rates, adoption of co‑packaged optics for rack interconnects, and vendor standardization of PHY models for easier channel modeling. The SiFive + NVLink Fusion trend accelerates RISC‑V adoption in heterogeneous compute, increasing demand for certified PCB design flows and SI co‑engineering between chip and board teams. Also watch adjacent infrastructure trends (local‑first 5G, venue automation) as they influence rack‑ and board‑level interconnect choices.

Quick layout checklist (printable)

  1. Acquire S‑parameters and IBIS‑AMI models from SoC/GPU/PHY vendors.
  2. Choose stackup with adjacent reference plane and low‑loss laminate for top layers.
  3. Target 100 Ω differential impedance and verify with fab calculator.
  4. Place AC coupling caps within 5 mm of PHY; typical 75–200 nF (commonly 100 nF).
  5. Back‑drill via stubs; keep residual stub < 0.5 mm.
  6. Match lane skew to < 10 ps (target < 5–10 ps for PAM4 high‑rate lanes).
  7. Run insertion/return loss and eye simulations; iterate before tape‑out.
  8. Create and measure test coupons for impedance and S‑parameters on the first run.

Final thoughts

Routing NVLink Fusion‑style traces for RISC‑V SoCs paired with GPUs is a multi‑discipline problem: stackup engineering, SI/PI simulation, and manufacturing execution. Start early with vendor models, pick a conservative but manufacturable stackup, and run the channel simulations before you commit to a board. Small decisions — dielectric choice, via handling, and AC coupling placement — compound quickly at the speeds we’re seeing in 2026.

Actionable next steps

  • Contact your PHY/SoC vendor to request S‑parameters and IBIS‑AMI models now.
  • Use the checklist above to create a pre‑tapeout validation plan and budget for test coupons and back‑drilling.
  • Engage an SI consultant or use an SI tool (Keysight/Ansys/Cadence) to validate your stackup against the PHY mask. For teams that need repeatable simulation pipelines, consider integrating your SI flow into a CI/CD-style validation loop so channel checks run as part of your release process.

Call to action: Want a ready‑to‑use 8‑layer and 12‑layer stackup PDF with example trace widths and via specs tuned for NVLink‑like lanes? Download our PCB stackup templates and validation checklist, or contact our SI review team for a focused pre‑tapeout consult to reduce risk and speed up bring‑up.
