  3. Introduction

    RTL is modular in nature: a top-level module contains a large number of sub-modules, and the top module is synthesized to give the final gate-level logic. The sub-modules are defined separately and then instantiated throughout the top-level module. In other words, there is a hierarchy in which the top-level module is the first level and the sub-modules form the inner levels.

    Similarly, in High-Level Synthesis (HLS) written in C++, there is a top-level function which is synthesized to give the RTL logic. This function contains calls to sub-functions, which are defined separately. The top-level function is the first level of hierarchy, while the sub-functions form the inner hierarchies. The term “function” here can mean a C++ function or a class. More on this can be seen below.

    In SystemC, just as in RTL, modules are defined explicitly, using the keyword “SC_MODULE”. A “top” SC_MODULE forms the first level of hierarchy, and the sub-level SC_MODULEs form the inner hierarchies. In RTL, the hierarchies are connected with wires. In SystemC, this is done with Matchlib connections or sc_signal. In C++, ac_channel is used to stream data between hierarchies; an ac_channel behaves as a FIFO. Instead of an ac_channel, a shared memory can be used for inter-block communication. In that case there must be some form of arbitration between the blocks and the shared memory to make sure no data is lost or overwritten. Another alternative is a ping-pong memory, which allows better parallelism between hierarchical blocks.
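
    To make the FIFO behaviour concrete, here is a minimal sketch in plain C++. It is not the real ac_channel: `SimpleChannel`, `producer` and `consumer` are hypothetical names, and the channel is modelled with a `std::deque` purely for illustration. It only shows the idea that one block writes values in order and another block reads them back in the same order.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a streaming channel; NOT the real ac_channel API.
template <typename T>
class SimpleChannel {
  std::deque<T> fifo_;
public:
  // Producer side: append a value at the back of the FIFO.
  void write(const T &v) { fifo_.push_back(v); }

  // Consumer side: remove and return the oldest value.
  T read() {
    assert(!fifo_.empty() && "read from empty channel");
    T v = fifo_.front();
    fifo_.pop_front();
    return v;
  }

  // True when at least n values are waiting to be read.
  bool available(std::size_t n) const { return fifo_.size() >= n; }
};

// Producer block (assumed example): doubles its input and streams it out.
void producer(int in, SimpleChannel<int> &out) { out.write(2 * in); }

// Consumer block (assumed example): accumulates everything in the channel.
void consumer(SimpleChannel<int> &in, int &sum) {
  while (in.available(1)) sum += in.read();
}
```

    Because the two blocks only touch the channel, never each other's variables, each one can be treated as a separate level of hierarchy.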

    Hierarchy in C++:

    Before we get into hierarchy let us look at how a top module is defined in C++ using both function and class.

    How to define a top module in HLS?

    Function:

    • With a function, the arguments in the function definition form the interfaces of the hardware.
    • The HLS tool can infer the bit width and direction of each port.
    • The ports can be input, output or inout.
    • Outputs need to be passed by reference so that the value can be modified.
    • Different functions can be defined and called to form hierarchies.

    E.g.

    #pragma design top
    void top(int a, int b, int &c)
    {
      c = a + b;
    }

    The corresponding RTL module would be:

    module top (
      clk, rst, a, b, c
    );
      input clk;
      input rst;
      input [31:0] a;
      input [31:0] b;
      output [31:0] c;
      // ... (module body not shown)
    endmodule

    This is a simple example. With a ready/valid handshake interface, the corresponding a_rdy/b_rdy/c_rdy and a_vld/b_vld/c_vld signals would also be present.

    This single top function is the starting point which can be expanded to have multiple levels of hierarchies.
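
    As a rough illustration of that expansion, the sketch below has a top function that calls two sub-functions. The names `add_stage`, `scale_stage` and `top_hier` are hypothetical, and the pragma spelling simply follows the example above. A standard C++ compiler ignores the pragma, and whether an HLS tool keeps each sub-function as its own block or inlines it depends on the tool and its directives.

```cpp
// Sub-function: one inner block of the hierarchy (hypothetical name).
void add_stage(int a, int b, int &sum) { sum = a + b; }

// Sub-function: a second inner block (hypothetical name).
void scale_stage(int x, int &y) { y = x * 2; }

// Top-level function: its arguments become the hardware ports.
#pragma design top
void top_hier(int a, int b, int &c) {
  int tmp;                // internal connection between the two stages
  add_stage(a, b, tmp);   // first level below the top
  scale_stage(tmp, c);    // consumes the result of add_stage
}
```

    Calling `top_hier(3, 4, c)` from an ordinary C++ testbench exercises the same code that would be handed to the HLS tool.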

    Classes:

    • Some HLS tools support classes for creating modules, and thereby HLS designs.
    • Classes suit hierarchies well because of their ease of use and modularity.
    • A top-level pragma is placed before the class definition.
    • An interface pragma is placed on a public member function. The arguments of that member function form the interfaces in hardware. Only one public member function is allowed as an interface.
    • Each class can be defined separately, and an object of it can be instantiated to form hierarchies.
    • Multiple objects of the same class can also be used to form hierarchies.

    The same code written as a class would be:

    #pragma design top
    class add {
    public:
      add(){}

      #pragma design interface
      void top(int a, int b, int &c)
      {
        c = a + b;
      }
    };

    The corresponding RTL module for this is:

    module add_top (
      clk, rst, a, b, c
    );
      input clk;
      input rst;
      input [31:0] a;
      input [31:0] b;
      output [31:0] c;
      // ... (module body not shown)
    endmodule

    module add (
      clk, rst, a, b, c
    );
      input clk;
      input rst;
      input [31:0] a;
      input [31:0] b;
      output [31:0] c;

      add_top add_top_inst (
          .clk(clk),
          .rst(rst),
          .a(a),
          .b(b),
          .c(c)
        );
    endmodule

    The class name forms a wrapper around the function interface, which aids in interconnect declarations for component instantiations. The components are named <class_name>_<interface_function_name>, as seen above.

    The top function and interfaces can be defined as pragmas or in the GUI. It varies according to the HLS tool.
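
    The point about using multiple objects of the same class can be sketched as follows. `Adder` and `AddThree` are hypothetical names, and only the top class carries pragmas, spelled as in the example above. A standard C++ compiler ignores them. How the inner objects are kept as separate blocks, and how they must be connected when they are, is tool specific, so this sketch only shows the C++ structure.

```cpp
// Leaf class with a single public interface function (hypothetical name).
class Adder {
public:
  Adder() {}
  void run(int a, int b, int &c) { c = a + b; }
};

// Top class: instantiates two objects of the same leaf class.
#pragma design top
class AddThree {
  Adder add0;  // first instance
  Adder add1;  // second instance of the same class
public:
  AddThree() {}

  #pragma design interface
  void run(int a, int b, int c, int &sum) {
    int partial;                  // result passed from add0 to add1
    add0.run(a, b, partial);
    add1.run(partial, c, sum);
  }
};
```

    Each object has its own state, which is why instantiating the same class twice yields two independent pieces of hardware rather than one shared one.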

    Design Block:

    [Figure: Design Block flowchart]
    • ‘din’ and ‘dout’ are streamed using ac_channels in the above design. Other streaming interfaces such as AXI-Stream can be used as well.
    • ‘run’ is the name of the interface function, which is a clocked process.
    • SimpleOneBlock is the name of the class, which forms a single hierarchy.

    Concurrent Processes:

    • C++ designs containing “rolled” sequential loops that are not automatically merged will have lower performance, because the loops execute one after another.
    • Loop merging will not happen when there is:
      • Out-of-order array access between loops
      • Complex control
      • Non-deterministic data exchange between loops

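    E.g., BLOCK0 writes tmp in ascending index order while BLOCK1 reads it in descending order, so the two loops cannot be merged: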

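    // Copies din to dout in ascending index order.
    void BLOCK0(int din[3], int dout[3]){
      WRITE:for(int i=0;i<3;i++){
        dout[i] = din[i];
      }
    }

    // Copies din to dout in descending index order.
    void BLOCK1(int din[3], int dout[3]){
      READ:for(int i=2;i>=0;i--){
        dout[i] = din[i];
      }
    }

    void top(int din[3], int dout[3]){
      int tmp[3]; // intermediate array between the two blocks
      BLOCK0(din,tmp);
      BLOCK1(tmp,dout);
    }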

    Concurrent Hierarchies in Modular Design

    This single design block can be extended further to multiple hierarchies. The hierarchy blocks help in providing some form of concurrency as seen in the diagram below.

    Design Block:

    [Figure: TwoBlockHierOneMod design block flowchart]
    • Here block0 and block1 are both clocked processes. They are member functions in C++, both inside a single hierarchy called TwoBlockHierOneMod.
    • They are connected using some form of streaming interface or memory. More on this in the next section.
    • Class member functions mapped to design blocks can run concurrently. This is one level of hierarchy, and an example of how hierarchy can be used to improve the performance of designs in HLS.

    E.g.

    #include <ac_int.h>     // uint4, uint10, uint20
    #include <ac_channel.h> // ac_channel

    class TwoBlockHierOneMod{
      ac_channel<uint10> connect; // Interconnect channel
      uint10 acc0;
      uint20 acc1;

      #pragma hls_design
      void block0(ac_channel<uint4> &din){
        acc0 += din.read();
        connect.write(acc0);
      }

      #pragma hls_design
      void block1(ac_channel<uint20> &dout){ // width matches acc1 and the dout of run()
        acc1 += connect.read();
        dout.write(acc1);
      }

    public:
      TwoBlockHierOneMod(){
        acc0 = 0; // Initialize in constructor
        acc1 = 0; // Initialize in constructor
      }

      #pragma hls_design interface
      void run(ac_channel<uint4> &din, ac_channel<uint20> &dout){
        block0(din);
        block1(dout);
      }
    };

    Multi Block with Multi-Class (Another level of hierarchy)

    • The previous design has one level of hierarchy, with the blocks present as member functions within a single class and connected through an interconnect such as an ac_channel.
    • An additional level of hierarchy is added by defining the functions inside separate classes, as seen in the diagram below.
    • More functions can be added inside each of these classes, but only one can be an interface. The rest will be design blocks, as seen in the previous design.

    Functions and AC_Channel Interconnects

    Source code:

    #include <ac_int.h>     // uint4, uint10, uint20
    #include <ac_channel.h> // ac_channel

    class Block0{
      uint10 acc;
    public:
      Block0(){
        acc = 0; // Initialize in constructor
      }

      #pragma hls_design interface
      void run(ac_channel<uint4> &din, ac_channel<uint10> &dout){
        acc += din.read();
        dout.write(acc);
      }
    };

    class Block1{
      uint20 acc;
    public:
      Block1(){
        acc = 0; // Initialize in constructor
      }

      #pragma hls_design interface
      void run(ac_channel<uint10> &din, ac_channel<uint20> &dout){
        acc += din.read();
        dout.write(acc);
      }
    };

    class TwoBlockHierTwoMod{
      ac_channel<uint10> connect; // Interconnect channel; width matches Block0 output and Block1 input
      Block0 inst0; // Module instances
      Block1 inst1;
    public:
      TwoBlockHierTwoMod(){}

      #pragma hls_design interface
      void run(ac_channel<uint4> &din, ac_channel<uint20> &dout){
        inst0.run(din,connect);
        inst1.run(connect,dout);
      }
    };

    • ac_channel must be used to interconnect classes or class member functions mapped to design blocks.
    • Arrays mapped to memory must be passed through an ac_channel when a memory is shared between design blocks.
      • The required coding style must be followed.
      • Arrays mapped to memories on the top-level interface are allowed.
    • Design blocks and glue logic cannot be mixed.
      • The interconnect for classes or functions mapped to design blocks cannot be mixed with inlined C++ code.
      • Doing so results in a compilation error.

    • There are two ways to share a memory between classes/member functions mapped to design blocks:
      • Using an ac_channel and the required coding style
      • Explicitly coding a separate design block that contains a memory
    The diagram below explains them.
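
    The text does not spell out the required coding style, so the following is only a sketch under an assumption: that the array is wrapped in a struct and the whole struct is passed through the channel as a single item. `SimpleChannel`, `Frame`, `fill_block` and `reverse_block` are hypothetical names, and `SimpleChannel` is a plain C++ stand-in, not the real ac_channel.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a streaming channel; NOT the real ac_channel API.
template <typename T>
class SimpleChannel {
  std::deque<T> fifo_;
public:
  void write(const T &v) { fifo_.push_back(v); }
  T read() {
    assert(!fifo_.empty() && "read from empty channel");
    T v = fifo_.front();
    fifo_.pop_front();
    return v;
  }
  std::size_t size() const { return fifo_.size(); }
};

const int N = 4;

// Wrapping the array in a struct lets a whole buffer travel as one channel item.
struct Frame { int data[N]; };

// Producer block: fills a local buffer, then hands it over in one write.
void fill_block(const int in[N], SimpleChannel<Frame> &out) {
  Frame f;
  for (int i = 0; i < N; i++) f.data[i] = in[i];
  out.write(f);
}

// Consumer block: receives a full buffer and reads it in reverse order.
void reverse_block(SimpleChannel<Frame> &in, int out[N]) {
  Frame f = in.read();
  for (int i = 0; i < N; i++) out[i] = f.data[N - 1 - i];
}
```

    The consumer only sees a buffer after the producer has finished writing it, which is the property that keeps the two blocks from overwriting each other's data.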

    Multi-Level Hierarchical Design Enabling Concurrency

    Distributed Functional Blocks in a Multi-Hierarchy System

    Reasons to Use Hierarchy

    1. Allowing Blocks to Run in Parallel: The primary reason to add hierarchy is to pipeline the design so that the blocks can run independently. A common example is a chain of digital filters that implement decimation or interpolation.
    2. Reducing Synthesis Runtime on HLS tools
    3. Improving Regularity
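
    The filter-chain example from the first reason can be sketched in plain C++ as below. This is a toy model, not a real filter design: `Chan`, `decim2` and `DecimChain` are hypothetical names, the channel is a `std::deque`, and each stage simply averages pairs of samples to decimate by two.

```cpp
#include <deque>

// Stand-in for a streaming channel between blocks (assumption, not ac_channel).
typedef std::deque<int> Chan;

// One decimate-by-2 stage: consumes two samples, produces their average.
void decim2(Chan &in, Chan &out) {
  while (in.size() >= 2) {
    int a = in.front(); in.pop_front();
    int b = in.front(); in.pop_front();
    out.push_back((a + b) / 2);
  }
}

// Two stages chained through an internal channel: overall decimation by 4.
class DecimChain {
  Chan mid;  // interconnect between stage 0 and stage 1
public:
  void run(Chan &in, Chan &out) {
    decim2(in, mid);   // stage 0
    decim2(mid, out);  // stage 1 works only on what stage 0 produced
  }
};
```

    In this software model the stages run one after the other. In hardware, because each stage depends only on its input channel, the stages can be separate blocks that process different samples at the same time.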

    What can be designed with this methodology?

    1. Sub-systems with bus or chip interfaces
    2. Point-to-point communication.
    3. Dataflow in a single direction through a channel.

    Hierarchy in SystemC:

    SystemC hierarchies are created with SC_MODULE. Just as in C++, the modules need to be interconnected with some form of channel. This is done with Matchlib connections.

    A simple example of an sc_module is shown below. The DUT is designed by inheriting from the sc_module class.

    #pragma hls_design top
    class flop : public sc_module
    {
    public:
      sc_in<bool>      CCS_INIT_S1(clk);
      sc_in<bool>      CCS_INIT_S1(rst_bar);
      sc_in<uint32_t>  CCS_INIT_S1(in1);
      sc_out<uint32_t> CCS_INIT_S1(out1);

      SC_CTOR(flop) {
        SC_THREAD(process);
        sensitive << clk.pos();
        async_reset_signal_is(rst_bar, false);
      }

      void process() {
        // this is the reset state:
        out1 = 0;
        wait();                                 // WAIT
        // this is the non-reset state:
        while (1) {
          out1 = in1.read();
          wait();                               // WAIT
        }
      }
    };

    The corresponding RTL for this is:

    module flop_process (
      clk, rst_bar, in1, out1
    );
      input clk;
      input rst_bar;
      input [31:0] in1;
      output [31:0] out1;
      reg [31:0] out1;

      // Interconnect Declarations for Component Instantiations
      always @(posedge clk or negedge rst_bar) begin
        if ( ~ rst_bar ) begin
          out1 <= 32'b00000000000000000000000000000000;
        end
        else begin
          out1 <= in1;
        end
      end
    endmodule

    // ------------------------------------------------------------------
    //  Design Unit:    flop
    // ------------------------------------------------------------------

    module flop (
      clk, rst_bar, in1, out1
    );
      input clk;
      input rst_bar;
      input [31:0] in1;
      output [31:0] out1;

      // Interconnect Declarations for Component Instantiations
      flop_process flop_process_inst (
          .clk(clk),
          .rst_bar(rst_bar),
          .in1(in1),
          .out1(out1)
        );
    endmodule

    • The actual design is present in a module named <module_name>_<thread_name>. The top module is called <module_name>.
    • For parallel access, multiple threads can be used.
    • The threads are connected using Connections::Combinational channels or memories. They are synchronized using a SyncChannel placed between them.
    • Hierarchy is created using multiple sc_modules connected with the Connections library or sc_signal.
  4. Conclusion

    Hierarchy in HLS helps with parallel processing, better modularity and faster HLS tool runtime. More complex requirements can be handled more easily with the help of hierarchy. Hierarchy in C++ is achieved with functions and, most importantly, classes. In SystemC, threads and sc_modules are used. Channels, signals and memory models carry the communication between hierarchies and help in achieving the final design.