1. Tracks

  2. Introduction

    Design partitioning breaks a large, complex digital design into smaller, modular subsystems. It lets the designer manage complexity and optimize independent parts of a design separately so that they can be reused more effectively. HLS tools support this in different ways, but the most commonly available features for design partitioning are:

    • Hierarchies
    • Reusable Entities
    • Clustering
  3. Hierarchies:

    In High-Level Synthesis, hierarchy refers to the design structure that organizes components and subsystems into different levels. This organization helps manage complexity by allowing designers to work at various abstraction levels, ranging from high-level algorithmic representations to low-level hardware descriptions.

    Most designs that use hierarchy consist of blocks that must run in parallel to meet throughput constraints, and such designs can only be described using hierarchy. The first decision when implementing a design is therefore whether to use hierarchy and where to place the hierarchy boundaries. Some designs, however, are partitioned into blocks to improve either the tool runtime or the area of the design.

    Allowing Blocks to Run in Parallel:

    The primary reason to add hierarchy is to pipeline the design so the blocks can run independently. A common example is a chain of digital filters that implement decimation or interpolation. Writing these systems in a single block is possible, but you will find the system much easier to describe and test using hierarchy.
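
    As a rough illustration of the filter-chain case, the sketch below models a two-stage decimation chain in plain C++, with each stage written as its own function. The names `Channel`, `decimate_by_2`, `decimation_chain`, and `run_chain` are hypothetical, and `std::queue` is only a software stand-in for whatever streaming channel type a given HLS tool provides. In software the two stages run one after the other; in hardware, the hierarchy boundary is what would allow them to run concurrently.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a streaming channel between hierarchical blocks.
typedef std::queue<int> Channel;

// One block: average each pair of input samples and emit a single
// output sample, i.e. a very simple decimate-by-2 stage.
void decimate_by_2(Channel &in, Channel &out) {
  while (in.size() >= 2) {
    int s0 = in.front(); in.pop();
    int s1 = in.front(); in.pop();
    out.push((s0 + s1) / 2);
  }
}

// Top level: two blocks connected through an internal channel.
// Each call marks a candidate hierarchy boundary.
void decimation_chain(Channel &in, Channel &out) {
  Channel stage1_to_stage2;
  decimate_by_2(in, stage1_to_stage2);
  decimate_by_2(stage1_to_stage2, out);
}

// Test helper: push samples through the chain and collect the outputs.
std::vector<int> run_chain(const std::vector<int> &samples) {
  Channel in, out;
  for (std::size_t i = 0; i < samples.size(); ++i) in.push(samples[i]);
  decimation_chain(in, out);
  std::vector<int> result;
  while (!out.empty()) { result.push_back(out.front()); out.pop(); }
  return result;
}
```

    Because each stage only talks to its neighbour through a channel, each one can be tested on its own before the chain is assembled.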

    Reducing Synthesis Runtime:

    Improving Regularity:

    A block of code or a function may be reused multiple times in a design. In that case, the regularity of the design may be better maintained by partitioning that block of code or function into a hierarchical block or a reusable entity.

    Hierarchies can be described using a class or a function. Refer to the Hierarchy section of the HLS Academy website for more details.
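
    A class-based description might look like the following sketch. It is plain C++ without any tool-specific pragmas or channel types, and the names `MovingAverage4`, `StereoSmoother`, and `StereoOut` are invented for illustration. The point is only that a block with its own state can be written once as a class and instantiated several times, which keeps the design regular.

```cpp
#include <cassert>

// Hypothetical block: 4-tap moving-average filter that owns its state.
class MovingAverage4 {
  int taps_[4];

public:
  MovingAverage4() : taps_{0, 0, 0, 0} {}

  // Process one sample per call and return the average of the last four.
  int run(int sample) {
    for (int i = 3; i > 0; --i) taps_[i] = taps_[i - 1];  // shift the delay line
    taps_[0] = sample;
    int sum = 0;
    for (int i = 0; i < 4; ++i) sum += taps_[i];
    return sum / 4;
  }
};

struct StereoOut {
  int left;
  int right;
};

// Hypothetical top level: two instances of the same block, one per channel.
class StereoSmoother {
  MovingAverage4 left_;
  MovingAverage4 right_;

public:
  StereoOut run(int left, int right) {
    StereoOut out;
    out.left = left_.run(left);
    out.right = right_.run(right);
    return out;
  }
};
```

    Each `MovingAverage4` instance keeps its own delay line, so the two channels do not interfere with each other.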

  4. Reusable Entities:

    Reusable entities are a feature of high-level synthesis tools that creates a custom operator, synthesized from a single source-code function, that can be reused throughout the design. They help designers partition a design around the pieces of code that are used frequently. HLS tools synthesize and optimize them as separate entities and insert them into the schedule wherever the entity is called. In the C++ code they are written as functions, and the tool is made aware of them through pragmas. The reusable entity in the example here is called a “CCORE”; the name varies with the HLS tool used.

    // Data and Flag are assumed to be types defined elsewhere in the design.

    // Stateless CCORE: returns the larger and the smaller of two values.
    #pragma hls_design ccore
    void ccore_multi_instance(Data &a, Data &b, Data &max, Data &min)
    {
      Flag a_gt_b = a > b;
      max = a_gt_b ? a : b;
      min = a_gt_b ? b : a;
    }

    // CCORE with state: also flags when the result of a > b changes between calls.
    // The ID template parameter gives each instantiation its own static state.
    #pragma hls_design ccore
    template <int ID>
    void ccore_with_state(Data &a, Data &b, Data &max, Data &min, Flag &cross)
    {
      static Flag a_gt_b_prev = 0;
      Flag a_gt_b = a > b;
      max = a_gt_b ? a : b;
      min = a_gt_b ? b : a;
      cross = a_gt_b ^ a_gt_b_prev;
      a_gt_b_prev = a_gt_b;
    }

    // Top level: orders four inputs from largest (first) to smallest (fourth).
    #pragma hls_design top
    void top(Data &a, Data &b, Data &c, Data &d,
             Data &first, Data &second, Data &third, Data &fourth,
             Flag &cross_ab, Flag &cross_cd)
    {
      Data max_ab, min_ab, max_cd, min_cd;
      Data min_max_abcd, max_min_abcd;
      ccore_with_state<0>(a, b, max_ab, min_ab, cross_ab);
      ccore_with_state<1>(c, d, max_cd, min_cd, cross_cd);
      ccore_multi_instance(max_ab, max_cd, first, min_max_abcd);
      ccore_multi_instance(min_ab, min_cd, max_min_abcd, fourth);
      ccore_multi_instance(min_max_abcd, max_min_abcd, second, third);
    }

    • Here, the min and max computations are placed in separate functions.
    • HLS tools synthesize these functions as custom operators/components.
    • They are reused each time they are called in the source code.
    • Templates are supported as well.
    • These CCOREs can be combinational or sequential.
      • Sequential CCOREs are registered and pipelined with II=1, and the user has control over loop unrolling.
      • Combinational CCOREs have no registers, and their loops are automatically fully unrolled.
    • CCOREs can be created using a top-down or a bottom-up approach. In the top-down approach, the CCOREs are synthesized for one solution inside a project, stored in the cache, and then reused for other solutions.
    • With the bottom-up approach, sub-blocks are created independently and assembled as a final step
      • Enables a “Divide-and-Conquer” approach
      • Sub-blocks can be synthesized without requiring the full design to be available
      • Incremental design strategy
        1. No need to re-synthesize the entire design when a sub-block is changed
        2. Only the changed sub-block (block B, for example) and the top-level assembly need to be regenerated
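
    To see what the CCORE example computes, the sketch below compiles the same three functions as ordinary C++ and wraps them in a small helper. `Data` as `int`, `Flag` as `bool`, the `Sorted` struct, and `run_top` are assumptions made only for this illustration, since the source does not define the types. The tool-specific pragmas are shown as comments so that a standard compiler accepts the file without warnings.

```cpp
#include <cassert>

typedef int Data;   // assumption: the source does not define Data
typedef bool Flag;  // assumption: the source does not define Flag

// #pragma hls_design ccore
void ccore_multi_instance(Data &a, Data &b, Data &max, Data &min) {
  Flag a_gt_b = a > b;
  max = a_gt_b ? a : b;
  min = a_gt_b ? b : a;
}

// #pragma hls_design ccore
template <int ID>
void ccore_with_state(Data &a, Data &b, Data &max, Data &min, Flag &cross) {
  static Flag a_gt_b_prev = 0;  // one copy per template ID
  Flag a_gt_b = a > b;
  max = a_gt_b ? a : b;
  min = a_gt_b ? b : a;
  cross = a_gt_b ^ a_gt_b_prev;  // set when the comparison result flips
  a_gt_b_prev = a_gt_b;
}

// #pragma hls_design top
void top(Data &a, Data &b, Data &c, Data &d,
         Data &first, Data &second, Data &third, Data &fourth,
         Flag &cross_ab, Flag &cross_cd) {
  Data max_ab, min_ab, max_cd, min_cd;
  Data min_max_abcd, max_min_abcd;
  ccore_with_state<0>(a, b, max_ab, min_ab, cross_ab);
  ccore_with_state<1>(c, d, max_cd, min_cd, cross_cd);
  ccore_multi_instance(max_ab, max_cd, first, min_max_abcd);
  ccore_multi_instance(min_ab, min_cd, max_min_abcd, fourth);
  ccore_multi_instance(min_max_abcd, max_min_abcd, second, third);
}

// Hypothetical helper that gathers the outputs of one call to top().
struct Sorted {
  Data first, second, third, fourth;
  Flag cross_ab, cross_cd;
};

Sorted run_top(Data a, Data b, Data c, Data d) {
  Sorted r;
  top(a, b, c, d, r.first, r.second, r.third, r.fourth, r.cross_ab, r.cross_cd);
  return r;
}
```

    The five CCORE calls form a small sorting network: two stateful comparisons on the input pairs, then three stateless comparisons to merge the results. The `cross` flags depend on the previous call, because each stateful instance remembers its last comparison result.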

    Other HLS tools may handle this reuse feature differently from the example shown here.

  5. Clustering:

    A cluster is, loosely, a group of closely related things that can be placed in one bucket. In High-Level Synthesis, clustering refers to grouping common structures into clusters that form sub-components, in order to improve the overall Quality of Results.

    The common structures here are:

    1. Adder trees
    2. Multiplier chains
    3. Squaring operations
    4. Multiply-add
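
    The four structures listed above correspond to familiar C++ expressions. The sketch below shows one small example of each; the function names and bit widths are chosen only for illustration, and whether a particular tool actually groups these expressions into a cluster depends on the tool and its settings.

```cpp
#include <cassert>
#include <cstdint>

// Adder tree: eight inputs summed in three balanced levels.
int32_t adder_tree8(const int16_t x[8]) {
  int32_t l1_0 = x[0] + x[1];
  int32_t l1_1 = x[2] + x[3];
  int32_t l1_2 = x[4] + x[5];
  int32_t l1_3 = x[6] + x[7];
  int32_t l2_0 = l1_0 + l1_1;
  int32_t l2_1 = l1_2 + l1_3;
  return l2_0 + l2_1;
}

// Multiplier chain: each product feeds the next multiplier.
int64_t mult_chain(int16_t a, int16_t b, int16_t c) {
  return static_cast<int64_t>(a) * b * c;
}

// Squaring: both multiplier inputs are the same value.
int32_t square(int16_t x) {
  return static_cast<int32_t>(x) * x;
}

// Multiply-add: a product followed directly by an addition.
int32_t mul_add(int16_t a, int16_t b, int32_t c) {
  return static_cast<int32_t>(a) * b + c;
}
```

    Written as separate operators, each of these would be scheduled piece by piece; treating the whole expression as one unit is what the clustering flow aims for.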

    The operator clustering flow detects and optimizes groups of related datapath operators such as adder trees, multiply-adds, and squares. It aims to reduce many of the inefficiencies of fine-grained operator scheduling. This targeted approach can improve quality of results, with better area and timing correlation with RTL synthesis results. Clustering offers the following key benefits:

    • Better area and timing correlation with RTL synthesis
    • Smaller area due to coarse-grained sharing
    • Increased capacity due to reduced design complexity
    • Faster runtimes on complex designs with many synthesis iterations
    • Shorter design latency for lower power and smaller area

    Clustering is an important part of design partitioning: a designer who knows that any of these structures will be present in the design can use clusters to obtain these benefits.

    HLS tools identify clusters with the help of directives or pragmas specified by the user. The clusters are extracted and synthesized separately as components, where they are also optimized. The final cluster, optimized for area and delay, is then used in the schedule before the RTL is generated.

  6. Conclusion

    The names CCORE and cluster may vary between HLS tools, but the concept of design reuse, and of partitioning complex logic and/or frequently occurring components, is common to HLS tools. Design partitioning helps HLS designers modularize their designs with the help of hierarchy. The tools partition the modules based on operators and components to improve PPA (power, performance, area) metrics.