How to infer block RAM in Verilog

I removed the clear logic from your code. The result looks like this:

module RAM_param(clk, addr, read_write, clear, data_in, data_out);
parameter n = 4;
parameter w = 8;

input clk, read_write, clear;
input [n-1:0] addr;
input [w-1:0] data_in;
output reg [w-1:0] data_out;

// Start module here!
reg [w-1:0] reg_array [2**n-1:0];

integer i;
initial begin
    for( i = 0; i < 2**n; i = i + 1 ) begin
        reg_array[i] <= 0;
    end
end

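always @(negedge clk) begin
    if( read_write == 1 )
        reg_array[addr] <= data_in;
    // The clear loop is removed: resetting every word in a single cycle
    // cannot be mapped to a RAM, so it forces flip-flops.
    //if( clear == 1 ) begin
        //for( i = 0; i < 2**n; i = i + 1 ) begin
            //reg_array[i] <= 0;
        //end
    //end
    // Synchronous read; returns the old word when writing (read-first).
    data_out <= reg_array[addr];
end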
endmodule  

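Initializing to all zeros may not need any code. If you want to initialize the memory with specific contents, do it like this (replace mem with the name of your memory array, which is reg_array in the module above):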

initial
begin
    $readmemb("data.dat", mem);
end
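
To show where the initialization goes, here is a minimal sketch of the same single-port RAM loaded from a file. The module name RAM_init and the INIT_FILE parameter are made up for illustration. data.dat is assumed to hold one w-bit binary word per entry, separated by whitespace. Whether the file contents end up in the RAM initial values depends on the synthesis tool, so check the synthesis report.

```verilog
// Sketch only: RAM_init and INIT_FILE are hypothetical names.
module RAM_init(clk, addr, read_write, data_in, data_out);
    parameter n = 4;                    // address width
    parameter w = 8;                    // data width
    parameter INIT_FILE = "data.dat";   // assumed: binary words, whitespace separated

    input clk, read_write;
    input [n-1:0] addr;
    input [w-1:0] data_in;
    output reg [w-1:0] data_out;

    reg [w-1:0] reg_array [2**n-1:0];

    // Load the memory once at time zero. The explicit start and finish
    // addresses make the load order independent of how the array range
    // is declared.
    initial begin
        $readmemb(INIT_FILE, reg_array, 0, 2**n-1);
    end

    always @(negedge clk) begin
        if (read_write)
            reg_array[addr] <= data_in;
        data_out <= reg_array[addr];    // synchronous read
    end
endmodule
```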

This is the result I got from ISE 13.1:

Synthesizing (advanced) Unit <RAM_param>.
INFO:Xst:3231 - The small RAM <Mram_reg_array> will be implemented on LUTs in order to maximize performance and save block RAM resources. If you want to force its implementation on block, use option/constraint ram_style.

    -----------------------------------------------------------------------
    | ram_type           | Distributed                         |          |
    -----------------------------------------------------------------------
    | Port A                                                              |
    |     aspect ratio   | 16-word x 8-bit                     |          |
    |     clkA           | connected to signal <clk>           | fall     |
    |     weA            | connected to signal <read_write>    | high     |
    |     addrA          | connected to signal <addr>          |          |
    |     diA            | connected to signal <data_in>       |          |
    |     doA            | connected to internal node          |          |

Update: Many thanks to mcleod_ideafix. Sorry, I forgot that your question asks for block RAM, not distributed RAM. For a RAM this small you must force block RAM: Synthesize - XST -> Process Properties -> HDL Options -> RAM Style -> change from Auto to Block. The result is then:

Synthesizing (advanced) Unit <RAM_param>.
INFO:Xst:3226 - The RAM <Mram_reg_array> will be implemented as a BLOCK RAM, absorbing the following register(s): <data_out>
    -----------------------------------------------------------------------
    | ram_type           | Block                               |          |
    -----------------------------------------------------------------------
    | Port A                                                              |
    |     aspect ratio   | 16-word x 8-bit                     |          |
    |     mode           | read-first                          |          |
    |     clkA           | connected to signal <clk>           | fall     |
    |     weA            | connected to signal <read_write>    | high     |
    |     addrA          | connected to signal <addr>          |          |
    |     diA            | connected to signal <data_in>       |          |
    |     doA            | connected to signal <data_out>      |          |
    -----------------------------------------------------------------------
    | optimization       | speed                               |          |
    -----------------------------------------------------------------------
Unit <RAM_param> synthesized (advanced).

End of Update
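
Instead of changing the project-wide property, the ram_style constraint mentioned in the XST message can also be attached to the array in the source. The following is a sketch, assuming XST's Verilog-2001 attribute syntax. The module name RAM_param_block is made up, and the clear port is dropped for brevity.

```verilog
// Sketch only: RAM_param_block is a hypothetical name.
module RAM_param_block(clk, addr, read_write, data_in, data_out);
    parameter n = 4;
    parameter w = 8;

    input clk, read_write;
    input [n-1:0] addr;
    input [w-1:0] data_in;
    output reg [w-1:0] data_out;

    // Synthesis attribute: request block RAM for this array only,
    // leaving the global RAM Style setting on Auto.
    (* ram_style = "block" *)
    reg [w-1:0] reg_array [2**n-1:0];

    always @(negedge clk) begin
        if (read_write)
            reg_array[addr] <= data_in;
        // Registered read: block RAM needs a synchronous read port.
        data_out <= reg_array[addr];
    end
endmodule
```

The attribute applies only to the declaration that follows it, so other memories in the design can still be mapped automatically.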

I recommend reading the XST User Guide for RAM sample code, together with the device data sheet. For example, in some FPGAs the LUT RAM has no usable reset signal. If you try to reset the memory contents, extra reset logic has to be built into the module, and the synthesizer infers D flip-flops instead of RAM. The reset signal is then automatically connected to the system reset.
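
If the contents really must be cleared at run time, one common workaround is to clear one word per clock cycle through the normal write port, so the array still has a single write per cycle. This is a sketch, not taken from the original code: the module name RAM_seq_clear, the clr_busy and clr_addr registers, and the clearing output are all hypothetical, and whether a given tool maps it to RAM should be checked in the synthesis report.

```verilog
// Sketch only: all names other than the original ports are hypothetical.
module RAM_seq_clear(clk, addr, read_write, clear, data_in, data_out, clearing);
    parameter n = 4;
    parameter w = 8;

    input clk, read_write, clear;
    input [n-1:0] addr;
    input [w-1:0] data_in;
    output reg [w-1:0] data_out;
    output clearing;                 // high while the clear sweep runs

    reg [w-1:0] reg_array [2**n-1:0];
    reg [n-1:0] clr_addr = 0;        // next address to clear
    reg         clr_busy = 0;

    // During a sweep, the clear counter takes over the single RAM port.
    wire         we      = clr_busy | read_write;
    wire [n-1:0] ram_a   = clr_busy ? clr_addr : addr;
    wire [w-1:0] ram_d   = clr_busy ? {w{1'b0}} : data_in;

    assign clearing = clr_busy;

    // Plain single-port RAM: one write and one registered read per cycle.
    always @(negedge clk) begin
        if (we)
            reg_array[ram_a] <= ram_d;
        data_out <= reg_array[ram_a];
    end

    // Sweep control: 2**n cycles, one address per cycle.
    always @(negedge clk) begin
        if (clr_busy) begin
            clr_addr <= clr_addr + 1'b1;
            if (&clr_addr)           // last address is being written now
                clr_busy <= 1'b0;
        end else if (clear) begin
            clr_busy <= 1'b1;
            clr_addr <= 0;
        end
    end
endmodule
```

The cost is that a clear takes 2**n clock cycles, and normal accesses are ignored while clearing is high.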

For block RAM (as opposed to LUT RAM), I prefer to specify the depth and data width explicitly, use the core generator, or instantiate the RAM directly from the library. More source code for general use (ASIC/FPGA) can be found here: http://asic-world.com/examples/verilog/ram_dp_sr_sw.html

Author: stevendesu

Updated on June 11, 2022

Comments

  • stevendesu
    stevendesu almost 2 years

    I've got one very specific problem with a project that has been haunting me for days now. I have the following Verilog code for a RAM module:

    module RAM_param(clk, addr, read_write, clear, data_in, data_out);
        parameter n = 4;
        parameter w = 8;
    
        input clk, read_write, clear;
        input [n-1:0] addr;
        input [w-1:0] data_in;
        output reg [w-1:0] data_out;
    
        reg [w-1:0] reg_array [2**n-1:0];
    
        integer i;
        initial begin
            for( i = 0; i < 2**n; i = i + 1 ) begin
                reg_array[i] <= 0;
            end
        end
    
        always @(negedge(clk)) begin
            if( read_write == 1 )
                reg_array[addr] <= data_in;
            if( clear == 1 ) begin
                for( i = 0; i < 2**n; i = i + 1 ) begin
                    reg_array[i] <= 0;
                end
            end
            data_out = reg_array[addr];
        end
    endmodule
    

    It behaves exactly as expected; however, when I synthesize it I get the following:

    Synthesizing Unit <RAM_param_1>.
        Related source file is "C:\Users\stevendesu\---\RAM_param.v".
            n = 11
            w = 16
        Found 32768-bit register for signal <n2059[32767:0]>.
        Found 16-bit 2048-to-1 multiplexer for signal <data_out> created at line 19.
        Summary:
        inferred 32768 D-type flip-flop(s).
        inferred 2049 Multiplexer(s).
    Unit <RAM_param_1> synthesized.
    

    32768 flip-flops! Why doesn't it just infer a block RAM? This RAM module is so huge (and I have two of them - one for instruction memory, one for data memory) that it consumes the entire available area of the FPGA... times 2.4

    I've been trying everything to force it to infer a block RAM instead of 33k flip flops, but unless I can get it figured out soon I may have to greatly reduce the size of my memory just to fit on a chip.

  • Admin
    Admin over 10 years
    I just wanted to point out that you have replaced a hardware reset with a run-time load of the RAM. I think that's exactly the right thing to do since a generic RAM typically will not allow an asynchronous reset of all cells, and initial blocks are not generally synthesizable.
  • stevendesu
    stevendesu over 10 years
    This fixed everything! The clear bit was actually something in our professor's example code. I'll have to let him know it broke everything.
  • Greg
    Greg over 10 years
    If there is a "clear", it is usually implemented as if (clear==1'b1) begin ... end else if (read_write==1'b1) begin ... end else begin ... end. Also, negedge(clk) is normally written as negedge clk.
  • Khanh N. Dang
    Khanh N. Dang over 10 years
    @JoeHass: I know that initial is not synthesizable, but initializing to zero is not really part of RAM design; it is only needed for ROM initialization. I just wanted to make the code shorter :D. I tried adding a reset, but it was not synthesized as RAM (just an array of flip-flops). As I understand it, RAM uses the system reset, not a user-defined one, because we do not need to clear the RAM at run time. If the system needs that, I would define the memory as registers.
  • mcleod_ideafix
    mcleod_ideafix over 10 years
    Your result from ISE indicates that distributed RAM has been used, not block RAM!
  • Khanh N. Dang
    Khanh N. Dang over 10 years
    @mcleod_ideafix: Thank you for your comment; I had forgotten that. I have updated my answer now.