Wasga Compiler: SoC partitioning for multi-FPGA prototyping boards

Wasga Compiler is a multi-FPGA partitioning software flow for ASIC designers who use FPGA-based systems to verify their designs and validate software integration. It automatically partitions large designs across multiple FPGAs while meeting chip resource limits, connectivity constraints, and the clock frequency required to run software applications. Wasga Compiler performs significantly better than other tools, which often fail to achieve acceptable timing results even with manual intervention. It delivers results faster and handles the largest designs (above one billion gate equivalents). It maximizes the performance of the prototyped system, accelerates the prototyping process, and helps teams meet time-to-market challenges.

  Wasga Compiler is an advanced timing-driven partitioning platform with the following capabilities:
  • Includes a Graphical User Interface (GUI) to ease project setup
  • Allows automatic and/or manual placement
  • Allows automatic and/or manual routing
  • Integrates very high speed multiplexing IPs to increase inter-FPGA bandwidth
  • Supports SDC for timing constraints and budgeting
  • Provides system level static timing analysis
  • Automatically drives and runs the FPGA back-end flow
  • Meets the clock frequency required for running software applications
  • Facilitates iterative runs and verification

Design Partitioning Flow

Wasga Compiler maps RTL and/or gate-level designs onto multi-FPGA platforms. The inputs are the design in RTL and/or gate-level form, the board description, the timing constraints, and optional partitioning constraints. The outputs are the bitstreams for the individual FPGAs, plus reports. The flow is fully automatic, and can be run semi-automatically in step-by-step mode if the user prefers. The flow is also incremental: any step can be re-run, and tests can be made gradually. Once the user approves the quality of a result, the corresponding setup can be recorded in a script to reproduce the result quickly and predictably for future design revisions. During a recompile, only the changed modules are re-synthesized and all recorded setups are taken into account, so the timing behavior of the recompile is similar to that of the previous successful compile.

 

Global Placement

The advanced automatic partitioning algorithms search for the partition with the fewest inter-FPGA connections and the highest system performance. All user constraints and directions, including manual grouping, the target system interface, and the FPGA filling rate, are taken into account. Multiple partitioning objectives compete for the best partition on any particular design:

  • Design Analysis: Wasga Compiler manages the design hierarchy selectively. It analyzes the hierarchy and preserves it where appropriate, based on the quality of each instance's cut and its combinatorial depth. This selective preservation of hierarchical blocks considerably reduces the complexity of the problem while keeping a global view of the netlist during partitioning.

  • Timing-driven optimization: The timing engine optimizes critical paths and generates timing constraints for each individual FPGA to achieve the targeted prototyping performance. Clock domains are managed independently based on their criticality.
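
As a rough illustration of the "lowest inter-FPGA connections" objective, the sketch below counts how many nets are cut by a candidate partition. It is not Wasga Compiler's algorithm or API: the function `count_cut_nets`, the toy netlist, and the block names are all hypothetical, and a real partitioner would weigh this cut count against timing and resource objectives.

```python
from typing import Dict, List


def count_cut_nets(nets: Dict[str, List[str]], assignment: Dict[str, int]) -> int:
    """Count nets whose connected blocks sit on more than one FPGA.

    nets:       net name -> list of block names the net connects (hypothetical model)
    assignment: block name -> index of the FPGA the block is placed on
    """
    cut = 0
    for blocks in nets.values():
        fpgas_touched = {assignment[block] for block in blocks}
        if len(fpgas_touched) > 1:
            cut += 1  # this net needs at least one inter-FPGA connection
    return cut


# Hypothetical toy design: five blocks connected by four nets.
nets = {
    "n1": ["cpu", "cache"],
    "n2": ["cache", "ddr_ctrl"],
    "n3": ["cpu", "uart"],
    "n4": ["ddr_ctrl", "dma"],
}

# Two candidate two-FPGA partitions of the same design.
partition_a = {"cpu": 0, "cache": 0, "uart": 0, "ddr_ctrl": 1, "dma": 1}
partition_b = {"cpu": 0, "cache": 1, "uart": 1, "ddr_ctrl": 0, "dma": 1}

print(count_cut_nets(nets, partition_a))  # 1 cut net
print(count_cut_nets(nets, partition_b))  # 4 cut nets
```

Under this simple metric, `partition_a` is preferable because it keeps tightly connected blocks on the same FPGA.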

 

Global Routing

The routing algorithm routes inter-FPGA and I/O signals over board traces or through FPGAs, with the objective of minimizing signal delays.

  • Automatic Pin Multiplexing (APM): When the number of signals leaving a partition exceeds the available FPGA pins, the APM feature can be called to group multiple signals onto a single board trace. This technique relaxes the pin requirements of the FPGA partitions. The router may choose to route through an FPGA to avoid overcrowded traces. Once the signal groups are specified, Wasga Compiler inserts the wire-sharing logic, including Transmitter (TM), Receiver (RM) and Control (CM) modules, in every FPGA. The user can approve or modify the signal group files. Complex hard-IP integration, such as SerDes, LVDS mode, clock domain signal grouping and data-clock synchronization, is handled by the Wasga communication IPs.

  • Timing-driven routing: Critical paths that cross FPGAs are an important cause of performance degradation, and Wasga Compiler treats them as a key optimization objective. Thanks to its timing engine, the global placement absorbs combinatorial hops. However, automatic placement cannot avoid every combinatorial hop; the remaining ones may be caused by user constraints or by the nature of the design. All combinatorial hops through FPGAs are reported so that the router is informed of them. When pin multiplexing is necessary, Wasga Compiler is instructed to exclude critical signals (those with low slack values) from pin multiplexing in order to raise system performance.
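
The interaction between pin multiplexing and critical-signal exclusion can be sketched as follows. This is only an illustration under stated assumptions, not the tool's actual method: the function `plan_pin_multiplexing`, the slack threshold, and the signal names are hypothetical, and each critical signal is assumed to take one dedicated pin while the remaining signals share the leftover pins evenly.

```python
import math
from typing import Dict, List, Tuple


def plan_pin_multiplexing(
    slacks_ns: Dict[str, float],
    available_pins: int,
    critical_slack_ns: float,
) -> Tuple[List[str], List[str], int]:
    """Split signals into dedicated and multiplexed groups (hypothetical model).

    Signals with slack below critical_slack_ns keep a dedicated pin.
    The rest share the remaining pins; the returned ratio is the number
    of signals per shared pin (1 means no multiplexing is needed).
    """
    critical = sorted(s for s, slack in slacks_ns.items() if slack < critical_slack_ns)
    others = sorted(s for s in slacks_ns if s not in critical)

    if len(critical) > available_pins:
        raise ValueError("not enough pins for the critical signals alone")
    if not others:
        return critical, others, 1

    pins_left = available_pins - len(critical)
    if pins_left == 0:
        raise ValueError("no pins left for the non-critical signals")

    ratio = math.ceil(len(others) / pins_left)
    return critical, others, ratio


# Hypothetical example: 10 signals must cross on 4 pins; 2 signals are critical.
slacks = {f"s{i}": 5.0 for i in range(10)}
slacks["s0"] = 0.5
slacks["s1"] = 0.8

dedicated, shared, ratio = plan_pin_multiplexing(slacks, available_pins=4, critical_slack_ns=1.0)
print(dedicated)  # ['s0', 's1']
print(ratio)      # 4 (eight signals over the two remaining pins)
```

Reserving pins for critical signals raises the multiplexing ratio on the remaining pins, which is the trade-off the router has to balance.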

 

Time Budgeting

Wasga Compiler unifies two aspects of timing management. Inter-FPGA timing is achieved through timing-driven routing and multiplexing ratio optimization. Intra-FPGA timing is handled by constraining the FPGA PnR tools. To achieve the target prototyping performance, the timing engine of Wasga Compiler calculates a delay constraint for every timing path. The delay budget on each segment of a timing path is derived with a graph back-annotation technique and accounts for the delay through the board traces and the latency of pin multiplexing. The intra-FPGA delay is calculated in one of two ways:

  • Fast estimation: The delay on each segment of the timing path is estimated by the Wasga timing engine in proportion to the number of logic levels the segment covers.
  • High-accuracy estimation: This estimation is done by running the intra-FPGA compilation tools. The timing report (the delay on each timing path) is parsed by the Wasga tool, and the timing graph is annotated with the corresponding delays.
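
A minimal sketch of the fast estimation idea, assuming a simple model that is not taken from the tool itself: the fixed inter-FPGA delays (board trace plus multiplexing latency per hop) are subtracted from the clock period, and the remainder is shared among the intra-FPGA segments in proportion to their logic levels. The function `budget_segments` and all numbers are hypothetical.

```python
from typing import List


def budget_segments(
    period_ns: float,
    logic_levels: List[int],
    hop_delays_ns: List[float],
) -> List[float]:
    """Distribute a path's delay budget over its intra-FPGA segments.

    period_ns:     target clock period for the path
    logic_levels:  logic levels of each intra-FPGA segment, in path order
    hop_delays_ns: fixed delay of each inter-FPGA hop (trace + mux latency)
    """
    remaining = period_ns - sum(hop_delays_ns)
    if remaining <= 0:
        raise ValueError("inter-FPGA delays alone exceed the clock period")

    total_levels = sum(logic_levels)
    if total_levels == 0:
        raise ValueError("path has no logic levels to budget")

    # Each segment gets a share proportional to its logic depth.
    return [remaining * levels / total_levels for levels in logic_levels]


# Hypothetical path at 50 MHz (20 ns) crossing three FPGAs via two hops,
# each hop costing 1 ns of trace delay plus 3 ns of multiplexing latency.
budgets = budget_segments(
    period_ns=20.0,
    logic_levels=[6, 2, 4],
    hop_delays_ns=[1.0 + 3.0, 1.0 + 3.0],
)
print(budgets)  # [6.0, 2.0, 4.0]
```

Each resulting value would then serve as the delay constraint for the corresponding segment in that FPGA's PnR run.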

The resulting timing constraints are passed to the PnR tool of each FPGA. In this way, the Wasga timing engine meets the target frequency required by the user.