TUTORIAL

John  Bryan 
2004

Purpose

The purpose of this tutorial is to introduce you to the IEEE standard hardware description language, Verilog HDL, and to the tool chain.  The tool chain consists of the ModelSim simulation tool and the Synopsys synthesis tool.

Summary

The ultimate goal of chip-level design is to actually produce a chip.  There are many steps involved in doing this, and this tutorial focuses on two of them: simulation and synthesis.

A simulator accepts inputs that you specify and displays the outputs that the design is expected to produce.  In this tutorial, we will be doing RTL (register transfer level) simulation, in which the input to the simulator is the compiled Verilog HDL source code.  Verilog HDL is a language for defining the structural and behavioral description of digital circuits.  Since no delay information is present in RTL simulation, the simulation has no delays in it.  The simulator that you will be using is the ModelSim simulator, which is available in the Mentor Graphics toolset.

Synthesis is the process of taking human-readable input such as Verilog HDL source code files and creating a low-level description which describes the design in terms of simple gates.  The synthesis tool you will be using is Synopsys Design Compiler. 

The Tutorial

The tutorial consists of the following sections:
  1. Preparation: UNIX paths
  2. Verilog HDL Source Code
  3. Simulation: ModelSim
  4. Synthesis: Synopsys Design Compiler
  5. Four-bit adder using an always block
  6. A few additional adder examples
  7. State Machines
  8. Counter using a function
  9. Register file example
  10. Functional RTL
  11. Notes

1.   Preparation: UNIX paths

You can work in either your oce home directory or in your directory in the class directory.  For the instructions given below, we will assume that you are working in your oce home directory.  Set up the following recommended directory structure in your oce home directory:

2.   Verilog HDL Source Code

In order to enter HDL source code, you will need to use a text editor.  Popular choices are vi, vim, gvim, xemacs, emacs and pico. Some basic Verilog features are:
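
As a rough illustration of what such source code looks like, the following is a minimal sketch of a one-bit full adder and a hierarchical four-bit adder built from it.  It is not the content of the tutorial's full.adder.vl or h4ba.vl; the module name full_adder, the internal signal carry, and the port names are assumptions made for this sketch.

```verilog
// Hypothetical sketch; the tutorial's own full.adder.vl and h4ba.vl may differ.

// One-bit full adder described with continuous assignments.
module full_adder (
    input  wire a,
    input  wire b,
    input  wire cin,
    output wire sum,
    output wire cout
);
    assign sum  = a ^ b ^ cin;                      // XOR of the three inputs
    assign cout = (a & b) | (a & cin) | (b & cin);  // 1 when at least two inputs are 1
endmodule

// Hierarchical four-bit ripple-carry adder: four full_adder instances.
module h4ba (
    input  wire [3:0] addend_one,
    input  wire [3:0] addend_two,
    input  wire       carry_in,
    output wire [3:0] sum,
    output wire       carry_out
);
    wire [4:0] carry;   // carry[i] is the carry into bit position i

    assign carry[0]  = carry_in;
    assign carry_out = carry[4];

    // Named port connections: .port_name(signal_name)
    full_adder fa0 (.a(addend_one[0]), .b(addend_two[0]), .cin(carry[0]), .sum(sum[0]), .cout(carry[1]));
    full_adder fa1 (.a(addend_one[1]), .b(addend_two[1]), .cin(carry[1]), .sum(sum[1]), .cout(carry[2]));
    full_adder fa2 (.a(addend_one[2]), .b(addend_two[2]), .cin(carry[2]), .sum(sum[2]), .cout(carry[3]));
    full_adder fa3 (.a(addend_one[3]), .b(addend_two[3]), .cin(carry[3]), .sum(sum[3]), .cout(carry[4]));
endmodule
```

The sketch uses module declarations, input and output ports, vectors, wires, continuous assignments, and module instantiation, which are among the constructs used throughout the rest of the tutorial.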

3.   Simulation: ModelSim

To initialize ModelSim:

To compile your design into the work library (this needs to be done whenever you make a change to your source code), you can use the commands:
  • vlog full.adder.vl
  • vlog h4ba.vl
  • vlog tb.h4ba.vl
An option that is available to you in compiling files for ModelSim is to write a makefile.  A script to compile the three files listed above is h4ba.makefile.
To list the compiled modules in the work directory, you can use the command,
vdir

Macro .do files can be used in the simulation.  An example is tb_h4ba.do.  To invoke the simulator and load the top-level design unit, which in this case is the module tb_h4ba, and execute the .do file commands, you can use the command,
vsim  tb_h4ba  -do  tb_h4ba.do
Two windows will open and the simulation will run.  The Main window can be expanded to view the results from placing the strobe statement in the testbench.  Click in the wave window to obtain a marker to view the simulation results at various times.
To exit the simulator, the quit command can be used in the Main transcript window.
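
The strobe statement mentioned above can be sketched as follows.  This is a minimal, hypothetical testbench in the spirit of tb_h4ba, not the tutorial's own file: the stimulus values, the task name apply, and the behavioral stand-in for the adder are all assumptions, included only so that the example compiles on its own.

```verilog
`timescale 1ns / 1ns

// Behavioral stand-in for the adder under test (assumption: the real
// h4ba.vl is hierarchical; only the port names matter for this sketch).
module h4ba (
    input  wire [3:0] addend_one,
    input  wire [3:0] addend_two,
    input  wire       carry_in,
    output wire [3:0] sum,
    output wire       carry_out
);
    assign {carry_out, sum} = addend_one + addend_two + carry_in;
endmodule

module tb_h4ba;
    reg  [3:0] addend_one, addend_two;
    reg        carry_in;
    wire [3:0] sum;
    wire       carry_out;

    // Unit under test, instantiated with the instance name uut.
    h4ba uut (
        .addend_one (addend_one),
        .addend_two (addend_two),
        .carry_in   (carry_in),
        .sum        (sum),
        .carry_out  (carry_out)
    );

    // Apply one set of inputs, report the result, then wait 20 ns.
    // $strobe prints at the end of the current time step, after the
    // adder outputs have settled to the values for the new inputs.
    task apply(input [3:0] a, input [3:0] b, input c);
        begin
            addend_one = a;
            addend_two = b;
            carry_in   = c;
            $strobe("%0d ns: %d + %d + %b = carry %b, sum %d",
                    $time, addend_one, addend_two, carry_in, carry_out, sum);
            #20;
        end
    endtask

    initial begin
        apply(4'd0,  4'd0,  1'b0);
        apply(4'd3,  4'd4,  1'b0);
        apply(4'd9,  4'd8,  1'b1);
        apply(4'd15, 4'd15, 1'b1);
        $stop;   // return control to the ModelSim prompt
    end
endmodule
```

The $strobe output appears in the transcript area of the Main window, one line per call to apply.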

If you want to start ModelSim and load the module tb_h4ba without executing a .do file, you can use the command,
vsim tb_h4ba

The following are a few examples of command-line interface commands that can be entered at the ModelSim prompt in the ModelSim Main window.

  radix bin                        Set the radix to binary.
  radix unsigned                   Set the radix to unsigned.
  radix hex                        Set the radix to hexadecimal.
  radix decimal                    Set the radix to decimal.
  vdir                             List the compiled modules in the work directory.
  vdel reg                         Delete the reg module in the work directory.
  vlog reg.vl                      Compile the file, reg.vl.
  vsim reg                         Load the reg module.
  do macrofile.do                  Execute the macro file, macrofile.do.
  vsim tb_h4ba -do macrofile.do    Load the module tb_h4ba and execute the macro file, macrofile.do.
  run 80                           Run the simulation for 80 ns.
  run -all                         Run the simulation until a stop statement or breakpoint is reached.
  restart -f                       Restart the simulation.
  view source                      View the source window.
  view *                           View all the ModelSim windows.
  add wave uut/*                   Add all the signals in the uut component to the wave window.
  add wave uut/sum                 Add the sum signal in the uut component to the wave window.
  pwd                              Print the name of the directory that you are in.
  ls                               List the contents of the directory.
  quit                             Quit ModelSim.

A few common ModelSim questions:

In ModelSim,

  • To view the signals window, from the ModelSim Main window menu, choose View > Signals.

  • To view the wave window, from the ModelSim Main window menu, choose View > Wave.

  • To add the signals in the current region to the wave window, from the signals window menu, choose Add > Wave > Signals in Region.

  • To run the simulation to the end, enter
    run  -all
    at the ModelSim prompt in the ModelSim Main window.

  • To restart the run, from the ModelSim Main window menu, choose Run > Restart..., and when the Restart window appears, select the Restart button.

  • To delete the signals presently listed in the wave window, from the wave menu choose Edit > Select All.  Then from the wave menu choose Edit > Cut.

  • To view the structure window, from the ModelSim Main window menu, choose View > Structure.

  • In the structure window, select uut:h4ba.  The signals in the h4ba environment will be displayed in the signals window.

  • From the signals window menu, choose View > Wave > Signals in Region.

  • To run the simulation for 200 ns, enter 
    run  200 
    at the ModelSim prompt in the ModelSim Main window.

  • To quit ModelSim, from the ModelSim Main window menu, choose File > Quit.

Note: If you ever have a simulation that does not halt, a good way to stop it is to select the Break icon in the ModelSim Main window.  You may need to use this if you ever choose Run > Run -All from the ModelSim vsim menu or the run -all icon in the ModelSim Main window.  If you use either of these and the simulation time increases to a large value before you select the Break icon or you reach the stop statement in your testbench, you should type the command  ls -l vsim.wlf  in an xterm window in the directory in which you are running the simulator to check the size of the file vsim.wlf.  The file vsim.wlf is generated by vsim during the simulation, if you have the wave window up, to log the data displayed in the wave window.  The file size can reach your allotted disk quota if you use run -all and don't use the Break icon soon enough.  To view your disk usage and quota, type  quota -v  in an xterm window.  To remove the vsim.wlf file, use the rm vsim.wlf command in an xterm window.  The vsim.wlf file is overwritten each time you run a simulation, so as long as you don't use run -all or set the run simulation time to an unnecessarily large number, you should not have to worry about it.
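
One way to keep run -all from running indefinitely is to put a stop statement in the testbench.  The following is a minimal sketch under assumed names and values: the module name tb_clock_stop, the 10 ns clock period, and the 200 ns limit are hypothetical.  A free-running clock generator like the one shown never runs out of events, which is the typical reason a simulation does not halt on its own.

```verilog
`timescale 1ns / 1ns

module tb_clock_stop;
    reg       clk;
    reg [7:0] cycles;   // counts rising clock edges, just to have something to observe

    // Free-running clock with a 10 ns period.  On its own this keeps
    // the simulation alive forever under run -all.
    initial clk = 1'b0;
    always #5 clk = ~clk;

    initial cycles = 8'd0;
    always @(posedge clk)
        cycles <= cycles + 8'd1;

    // Upper bound on simulated time: $stop hands control back to the
    // simulator prompt, so run -all ends here.
    initial begin
        #200;
        $display("stopping at %0d ns after %0d rising clock edges", $time, cycles);
        $stop;
    end
endmodule
```

Because the stop statement bounds the simulated time, the size of the logged wave data is bounded as well.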

You may occasionally want to print out the simulation results that appear in the wave window.  You can use the zoom feature from the wave menu to obtain the proper zoom.  To print the wave, from the wave menu, select File > Print Postscript...  When the Write Postscript window opens, select File name:, type in a name for the file, e.g., wave1.ps, and select ok.  Only the portion of the wave that is presently viewable will be printed out.  If the waveform is long, you will have to use more than one page.  To set the laser printer destination, in an xterm window, use the command, setenv PRINTER printername, where printername is the name of the printer that you want to print to.  To view and print out a postscript file, you can use ghostview by typing the command, ghostview wave1.ps, in an xterm window, where wave1.ps is the name of your postscript file.  To print out the postscript file in ghostview, choose File > Print..., type the printer name in the popup window and select ok.

Note: Simulation can also be performed on the output of the synthesis tool.  This is called gate-level simulation, as opposed to the RTL simulation performed in this tutorial.  Gate-level simulation is more accurate, but takes longer than RTL simulation.  Gate-level simulation will not be performed in this tutorial.

4.   Synthesis: Synopsys Design Compiler

Your working directory should be ~/tutorial.

  1. Be sure you have copied full.adder.vl and h4ba.vl to the directory that you are working in.

  2. Copy the synthesis script file, h4ba.script to the directory that you are working in.  Read the contents of the synthesis script file, which contains commands and accompanying commented information.  Synopsys Design Compiler can be run in two different modes: dcsh and Tcl (Tool Command Language).  We will use Tcl mode.  Tcl is a language widely used in EDA (electronic design automation) tools including ModelSim.

  3. At the unix shell prompt, type,
    unlimit

  4. At the unix shell prompt, type,
    prep  synopsys

  5. At the unix shell prompt, type,
    nice   dc_shell-t

  6. To run the synthesis script, at the dc_shell-t prompt, type,
    source  h4ba.script
    Some files, as commented in the script file, will be generated, including area and timing reports for the hierarchical four-bit adder, which are contained in the files, h4ba.area.rpt and h4ba.timing.rpt.

Some Design Compiler commands:
  • analyze:   This command is used to read in Verilog source files.  Syntax errors will be reported when executed.  This command should be used on all modules that comprise your top-level module.  Some unsynthesizable Verilog constructs such as the initial statement, if they are used in a module, will generate a warning message at the output of this command.

  • elaborate:  This command is used with the name of your module.  Verilog code that cannot be translated to generic hardware will be reported when executed.  Near the end of the report generated from the elaborate command, you can check which signals in your design have been synthesized as flip-flops.  If a signal has been synthesized as flip-flops, you should verify that you used the nonblocking assignment operator, <=, to assign to that signal.  If a signal that has been assigned to is not synthesized as flip-flops, then you should use the blocking assignment operator, =, to assign to it.  Signals assigned to with the nonblocking operator are scheduled to change at the end of the current time step.  Signals assigned to with the blocking operator change immediately.  Signals assigned to in an always block that has the keyword posedge in the sensitivity list are synthesized as flip-flops unless optimized out by Synopsys.  Any latches (level-sensitive memory elements) will be reported.  There should be no latches.  One source of unwanted latches is a signal that is assigned to in one branch of an if or case statement in an unclocked always block but not in every branch.  Another is a signal that is assigned to in an if statement without an else branch in an unclocked always block.  Latches can make obtaining an accurate timing report more difficult.

  • report_timing   -loops:   This command reports combinational timing loops, which are closed loop feedback paths in your design that don't have a register in them.  These should be avoided.  They can make obtaining an accurate timing report more difficult and could be a source of a potential oscillation condition.

  • create_clock:   This command is used to set the clock period constraint Design Compiler should target.

  • compile:  This command is used to optimize and map the design to a standard cell library.  This command consumes the most CPU time.  Errors generated by this command are less frequent than those generated by the analyze and elaborate commands.  One example is the library lacking standard cells to which some of the generic hardware can be mapped.

  • report_area:   This command generates an area report where the area in units of equivalent two-input nand gates is denoted by the cell area listing.

  • report_timing:   This command generates a timing report.  The clock period in nanoseconds is the sum of the absolute value of the data arrival time in the path column and the absolute value of the library setup time in the Incr column.

  • report_reference:   This command generates a report listing the number of each standard cell used in the design.

  • create_schematic
    highlight_path   -critical_path
    plot   -output   filename.ps
    Used in sequence, these three commands generate a postscript schematic file, filename.ps, with the critical path highlighted.
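
The latch warning described under the elaborate command can be illustrated with a minimal sketch.  The module and signal names below are hypothetical and are not part of the tutorial's files.  The first module leaves q unassigned when enable is 0, so q has to hold its old value; the second module assigns q on every path through the block, so it describes purely combinational logic.

```verilog
// A latch is implied: when enable is 0, q is not assigned, so it must
// keep its previous value.
module latch_inferred (
    input  wire enable,
    input  wire d,
    output reg  q
);
    always @(enable or d)
        if (enable)
            q = d;
endmodule

// No latch: q is assigned on every path through the block.
module no_latch (
    input  wire enable,
    input  wire d,
    output reg  q
);
    always @(enable or d) begin
        q = 1'b0;       // default assignment covers the missing else branch
        if (enable)
            q = d;
    end
endmodule
```

A default assignment at the top of the block, as in no_latch, is one common way to guarantee that every branch assigns the signal; an explicit else branch or a default case item serves the same purpose.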

In Design Compiler,
  • To get help about a dc_shell-t command, enter the command, man   command_name, e.g., man   analyze.

  • To get help about an error message, enter the command, man   error_message_id, e.g., man   ELAB-366.

  • To obtain further information on an error encountered in an attempted synthesis run, you can use the command,
    error_info

  • The history command lists the commands used in a dc_shell-t session.  The !! command reruns the previous command.  The !n command reruns the command numbered n in the history list.

  • The exec command can be used to execute a unix shell command in dc_shell-t.  The syntax is, exec unix_shell_command, e.g., exec page filename.  The vi command does not work in combination with the exec command.

  • The exit command can be used to quit dc_shell-t.

5.   Four-bit Adder using an always block

Processes in Verilog are modeled with the always block.  In hardware description languages, processes are blocks of statements.  Processes are of two types: unclocked and clocked.  An unclocked process models combinational logic; a clocked process models synchronous logic.  A process is executed concurrently with respect to other processes and continuous signal assignment statements.  Processes offer a variety of powerful statements and constructs that make them very suitable for high level behavioral descriptions. 

The Verilog description of a 4-bit adder using a process with a sensitivity list is p4ba.vl.  The sensitivity list is to the right of always@.  For an unclocked always block (no clock in the sensitivity list), the primary input signals to the always block should be placed in the sensitivity list.  Primary input signals are signals that are read in the always block but generated outside of it.  Since the signals addend_one, addend_two, and carry_in are primary inputs to this unclocked always block, they are included in the sensitivity list.  Whenever one of the signals in the sensitivity list changes value, the always block is executed.

The assignment statements in the unclocked process use the blocking assignment operator, =.  Signals written to with the blocking assignment operator are updated immediately, and the statements in the block are executed sequentially.  Combinational logic (an unclocked always block) should be described using the blocking assignment operator.

The other type of Verilog assignment operator is the nonblocking assignment operator, <=.  When a signal is written to with the nonblocking assignment operator, it is scheduled to be assigned the new value at the end of the current time step.  Sequential logic (a clocked always block, with a clock in the sensitivity list) should be described using the nonblocking assignment operator.  In a clocked process using only the nonblocking assignment operator, every right-hand side is evaluated before any signal is updated, so the order of the assignments does not matter.  In a clocked process, only the clock and the reset (if the reset is asynchronous) should be included in the sensitivity list.
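
The effect of the nonblocking assignment operator in a clocked process can be seen in a minimal sketch that swaps two registers.  The module swap_regs and its ports are hypothetical and are used only for illustration.

```verilog
// Two 4-bit registers that exchange their contents on every rising
// clock edge unless load is asserted.
module swap_regs (
    input  wire       clk,
    input  wire       load,
    input  wire [3:0] a_in,
    input  wire [3:0] b_in,
    output reg  [3:0] a,
    output reg  [3:0] b
);
    always @(posedge clk) begin
        if (load) begin
            a <= a_in;
            b <= b_in;
        end else begin
            a <= b;   // both right-hand sides are sampled first,
            b <= a;   // then both registers are updated together
        end
    end
endmodule
```

Reversing the two assignments in the else branch gives the same result.  With the blocking operator, the second assignment would read the value just written by the first, and both registers would end up with the same value.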

In the Verilog module, sum, carry_out, and carry are declared to be of type reg because they are assigned to in the always block.  No registers will be synthesized, because no edge is specified in the event specification list (sensitivity list).

An integer type is used as the for loop index.  The integer type is a 32-bit signed number.  Use of the integer type should be reserved for testbenches and for use as a for loop index. 
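
The points above can be put together in a minimal sketch of a four-bit adder written with an unclocked always block.  This is not the content of p4ba.vl; the port names follow the signals named in the text, and the width and indexing of the internal carry signal are assumptions.

```verilog
// Hypothetical sketch of a 4-bit adder in an unclocked always block.
module p4ba (
    input  wire [3:0] addend_one,
    input  wire [3:0] addend_two,
    input  wire       carry_in,
    output reg  [3:0] sum,        // reg because it is assigned in an always block
    output reg        carry_out
);
    reg [4:0] carry;   // carry[i] is the carry into bit position i
    integer   i;       // for loop index

    // All primary inputs appear in the sensitivity list; no edge is
    // specified, so no flip-flops are synthesized.
    always @(addend_one or addend_two or carry_in) begin
        carry[0] = carry_in;
        for (i = 0; i < 4; i = i + 1) begin
            sum[i]     = addend_one[i] ^ addend_two[i] ^ carry[i];
            carry[i+1] = (addend_one[i] & addend_two[i])
                       | (addend_one[i] & carry[i])
                       | (addend_two[i] & carry[i]);
        end
        carry_out = carry[4];
    end
endmodule
```

The blocking assignments matter here: each loop iteration reads the carry value written by the previous iteration.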

The for loop construct should only be used for iteration over space, such as assigning to the individual bits in this example code.  The for loop construct should not be used to try to iterate over time.  The synthesis tool may not flag the use of a for loop to iterate over time as an error; it may just hang in the compile operation.  Hence, it's up to the designer to avoid this pitfall.  A preferable method to iterate over time is to use a loop state as one of the states in a state machine that increments a counter each clock cycle.  An if statement can be used in the state to synthesize a comparator that checks whether the count equals the loop upper limit.  Always try to think about what hardware will result from synthesizing your code.
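
The loop-state approach described above can be sketched as follows.  The module name loop_over_time, the state names, the start and done signals, and the LIMIT value are all hypothetical; this is one possible arrangement, not the tutorial's own code.

```verilog
// Iterating over time with a loop state and a counter instead of a for loop.
module loop_over_time (
    input  wire clk,
    input  wire reset_n,   // asynchronous, active-low reset (assumed)
    input  wire start,
    output reg  done
);
    parameter LIMIT  = 4'd9;   // hypothetical loop upper limit
    parameter IDLE   = 2'd0,
              LOOP   = 2'd1,
              FINISH = 2'd2;

    reg [1:0] state;
    reg [3:0] count;

    always @(posedge clk or negedge reset_n) begin
        if (!reset_n) begin
            state <= IDLE;
            count <= 4'd0;
            done  <= 1'b0;
        end else begin
            case (state)
                IDLE: begin
                    done  <= 1'b0;
                    count <= 4'd0;
                    if (start)
                        state <= LOOP;
                end
                LOOP: begin
                    // One loop iteration per clock cycle.
                    if (count == LIMIT)     // synthesizes a comparator
                        state <= FINISH;
                    else
                        count <= count + 4'd1;
                end
                FINISH: begin
                    done  <= 1'b1;          // asserted for one clock cycle
                    state <= IDLE;
                end
                default: state <= IDLE;
            endcase
        end
    end
endmodule
```

Any work to be repeated each iteration would go in the LOOP branch, where it is performed once per clock cycle.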

6.   A few additional adder examples

7.   State Machines

  • Moore state machine:  moore.vl.  The value of output Z depends only on the value of CURRENT_STATE.  The state machine is composed of two always blocks: one unclocked always block holds the combinational elements and one clocked always block holds the synchronous elements.  The unclocked always block uses a sensitivity list consisting of the signals read in the always block, X and CURRENT_STATE.  Signals assigned to in an unclocked always block are synthesized as wires if no memory is inferred or as latches if memory is inferred.  In this unclocked always block, Z and NEXT_STATE are synthesized as wires.  The clocked always block uses the keyword posedge to signify the rising edge of the clock, which enables synthesis of CURRENT_STATE as the state register.  When the value of the signal CLK changes from 0 to 1, CURRENT_STATE is scheduled to be assigned the value of NEXT_STATE.  Signals assigned to in a clocked always block are synthesized as registers.  This state machine uses state machine design style 1 from the list below.
    Some possible state machine design styles:
    1. One unclocked always block and one clocked always block.
    2. Two always blocks and a signal assignment statement.
      • One unclocked always block to assign next state.
      • One clocked always block to update state register.
      • One signal assignment statement to assign output.
      • moore2.vl
      • tb_moore2.vl
      • moore2.script
    3. Two clocked always blocks and one unclocked always block.
      • One clocked always block to update state register.
      • One unclocked always block to assign next state.
      • One clocked always block to register output.
      • moore3.vl
      • tb_moore3.vl
      • moore3.script
    4. One clocked always block.
    5. Two unclocked always blocks and one clocked always block.
      • One unclocked always block to assign output.
      • One unclocked always block to assign next state.
      • One clocked always block to update state register.
      • moore5.vl
      • tb_moore5.vl
      • moore5.script

  • Mealy state machine:  mealy.vl.  The output value Z depends on both the value of CURRENT_STATE and the value of the input X.
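
Design style 1 (one unclocked and one clocked always block) can be sketched as follows.  The signal names match those used in the text, but the behavior is an assumption chosen for illustration: this machine sets Z to 1 after X has been 1 on two or more consecutive rising clock edges.  The state names, the state encoding, and the active-low asynchronous reset are likewise assumptions, and the tutorial's moore.vl may differ.

```verilog
// Hypothetical Moore state machine in design style 1.
module moore (
    input  wire CLK,
    input  wire RESET_N,   // asynchronous, active-low reset (assumed)
    input  wire X,
    output reg  Z
);
    parameter S0 = 2'd0,   // no 1 seen yet
              S1 = 2'd1,   // one 1 seen
              S2 = 2'd2;   // two or more consecutive 1s seen

    reg [1:0] CURRENT_STATE, NEXT_STATE;

    // Unclocked always block: next-state and output logic.
    always @(X or CURRENT_STATE) begin
        NEXT_STATE = S0;   // defaults, so that neither signal infers a latch
        Z          = 1'b0;
        case (CURRENT_STATE)
            S0: if (X) NEXT_STATE = S1;
            S1: if (X) NEXT_STATE = S2;
            S2: begin
                Z = 1'b1;  // the output depends on the state only
                if (X) NEXT_STATE = S2;
            end
            default: NEXT_STATE = S0;
        endcase
    end

    // Clocked always block: state register.
    always @(posedge CLK or negedge RESET_N)
        if (!RESET_N)
            CURRENT_STATE <= S0;
        else
            CURRENT_STATE <= NEXT_STATE;
endmodule
```

In a Mealy version, the assignment to Z would also test X, so the output could change between clock edges whenever X changes.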

A sample testbench for the moore module is tb_moore.vl.
Suppose that you want to view the following six signals from the moore state machine on the wave timing diagram: RESET_N, X, CLK, Z, NEXT_STATE, and CURRENT_STATE.
One method to do this is to use a .do macro file.  Another method is to use the window menus.

  • To use a .do macro file for the moore state machine, the steps are:

    1. Copy the .do file, moore.do, to the directory in which you are running ModelSim.

    2. Load the tb_moore module into ModelSim using the vsim tb_moore command.

    3. In the ModelSim main window, type the command,
      do moore.do

  • To use the window menus for the moore state machine, the steps are:

    1. From the ModelSim Main window menu, choose View > Structure.

    2. From the ModelSim Main window menu, choose View > Signals.

    3. From the ModelSim Main window menu, choose View > Wave.

    4. In the Structure window, select uut:moore.

    5. The six signals of interest will be displayed in the Signals window.

    6. From the Signals window menu, choose Add > Wave > Signals In Region.

    7. The six signals of interest will be displayed in the wave timing diagram.

A Synopsys Design Compiler synthesis script file for the moore state machine is moore.script.

8.   Counter using a function

  • An example of the use of a function in Verilog is the counter source code, sync_reset_counter.vl. 

    The syntax for defining a function is:

    function [ range_or_type ] function_identifier ;
        function_item_declaration
        statement
    endfunction

    • The declaration begins with the keyword function.  The range declares the number of bits in the return value.
    • The range is followed by the name of the function and a semicolon.
    • The function_item_declaration is where the inputs are declared, in the order in which the arguments are passed when the function is called.
    • The return value is assigned to the function name in the statement section of the function.

  • A sample Verilog testbench for the counter is  tb_src.vl. 
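
The function syntax above can be illustrated with a minimal sketch of a counter with a synchronous reset.  This is not the content of sync_reset_counter.vl; the 4-bit width, the port names, the active-high reset polarity, and the function name increment are assumptions.

```verilog
// Hypothetical 4-bit counter whose next value is computed by a function.
module sync_reset_counter (
    input  wire       clk,
    input  wire       reset,   // synchronous, active-high (assumed polarity)
    output reg  [3:0] count
);
    // The range [3:0] gives the width of the return value.
    function [3:0] increment;
        input [3:0] value;               // function_item_declaration
        begin
            increment = value + 4'd1;    // return value is assigned to the function name
        end
    endfunction

    // Only the clock is in the sensitivity list, so the reset is synchronous.
    always @(posedge clk)
        if (reset)
            count <= 4'd0;
        else
            count <= increment(count);
endmodule
```

A function contains no timing controls and synthesizes to combinational logic, here the incrementer feeding the count register.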

9.   Register file example

An example of a register file with four 32-bit registers is reg4.vl.  To read a register, the nr_w bit is set to 0; to write to a register, the nr_w bit is set to 1.  The register to be read from or written to is determined by a 2-bit address.  The register file module in reg4.vl uses four instantiations of the 32-bit register module in the Verilog source, reg.vl.
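
The structure described above can be sketched as follows.  This is not the content of reg4.vl or reg.vl: the register module name reg32, the write_enable port, the data_in and data_out ports, and the choice to drive data_out with the addressed register at all times are assumptions made for this sketch.

```verilog
// Hypothetical 32-bit register with a write enable.
module reg32 (
    input  wire        clk,
    input  wire        write_enable,
    input  wire [31:0] d,
    output reg  [31:0] q
);
    always @(posedge clk)
        if (write_enable)
            q <= d;
endmodule

// Hypothetical register file built from four reg32 instances.
module reg4 (
    input  wire        clk,
    input  wire        nr_w,       // 0 = read, 1 = write
    input  wire [1:0]  address,
    input  wire [31:0] data_in,
    output reg  [31:0] data_out
);
    wire [31:0] q0, q1, q2, q3;

    // Address decode: only the addressed register sees a write enable.
    wire we0 = nr_w & (address == 2'd0);
    wire we1 = nr_w & (address == 2'd1);
    wire we2 = nr_w & (address == 2'd2);
    wire we3 = nr_w & (address == 2'd3);

    reg32 r0 (.clk(clk), .write_enable(we0), .d(data_in), .q(q0));
    reg32 r1 (.clk(clk), .write_enable(we1), .d(data_in), .q(q1));
    reg32 r2 (.clk(clk), .write_enable(we2), .d(data_in), .q(q2));
    reg32 r3 (.clk(clk), .write_enable(we3), .d(data_in), .q(q3));

    // Read multiplexer: selects the addressed register.
    always @(address or q0 or q1 or q2 or q3)
        case (address)
            2'd0:    data_out = q0;
            2'd1:    data_out = q1;
            2'd2:    data_out = q2;
            default: data_out = q3;
        endcase
endmodule
```

The default case item in the read multiplexer ensures that data_out is assigned for every address value, so no latch is inferred.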

10.   Functional RTL

  • One always block.
  • Two always blocks.
    • Some designers prefer to keep the sequential elements in a separate always block from the combinational elements.  An example is the module in frtl2.vl.  The default case in the combinational always block is needed to avoid latches being synthesized for the signals assigned to in the always block.  With this style it is more difficult than with the frtl1 style to avoid combinational loops and unwanted latches.
    • testbench: tb.frtl2.vl.
    • synthesis script: frtl2.script.
  • Three always blocks.
    • One clocked always block for state register.
    • One unclocked always block for fsm combinational logic.
    • One clocked always block for non-fsm registers and combinational logic.
    • module: frtl3.vl.
    • testbench: tb.frtl3.vl.
    • synthesis script: frtl3.script.
  • Four always blocks.
    • One clocked always block for state register.
    • One unclocked always block for fsm combinational logic.
    • One clocked always block for non-fsm registers.
    • One unclocked always block for non-fsm combinational logic.
    • module: frtl4.vl.
    • testbench: tb.frtl4.vl.
    • synthesis script: frtl4.script.
The four examples in this section synthesize to designs with equal functionality, area, and clock period.
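
The two-always-block style, with its default case, can be sketched on a small hypothetical design.  The module below is not frtl2.vl; the name frtl2_sketch, the op encoding, and the accumulator function are assumptions chosen to keep the example short.

```verilog
// Hypothetical accumulator in the two-always-block style.
module frtl2_sketch (
    input  wire       clk,
    input  wire       reset_n,   // asynchronous, active-low reset (assumed)
    input  wire [1:0] op,        // hypothetical opcode: 1 = load, 2 = add, otherwise hold
    input  wire [7:0] data_in,
    output reg  [7:0] acc
);
    reg [7:0] next_acc;

    // Combinational always block: computes the next register value.
    always @(op or data_in or acc)
        case (op)
            2'd1:    next_acc = data_in;         // load
            2'd2:    next_acc = acc + data_in;   // add
            default: next_acc = acc;             // hold; without this branch a latch is inferred
        endcase

    // Sequential always block: the register itself.
    always @(posedge clk or negedge reset_n)
        if (!reset_n)
            acc <= 8'd0;
        else
            acc <= next_acc;
endmodule
```

In the one-always-block style, the same case statement would sit inside the clocked block and assign acc directly with the nonblocking operator, and an unassigned branch would simply hold the register value.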

11.   Notes