An automatic assertion generation tool

Vision

GoldMine started as a tool for assertion generation in hardware (RTL) design back in 2009. It was the first published methodology/tool that applied machine learning for assertion generation. Assertion generation refers to the automation of the manual, tedious, ad-hoc and incomplete process of writing assertions. The principle of GoldMine is simple but powerful. Machine learning algorithms are guided by automatic algorithms that can statically analyze the state space of the system. The domain knowledge of the static analysis algorithms and formal verification could be leveraged as an oracular, iterative mechanism to direct the learning process. This interaction between a statistical inductive inference engine and a static analysis engine could be used repeatedly to generate assertions that surpass manual assertions in quality. GoldMine also has a ranking strategy that can rank assertions on the basis of coverage, importance, expectedness and complexity.

GoldMine has a Verilog RTL parser, a symbolic execution engine, a static analysis engine, a ranking engine and several machine learning algorithms implemented as part of the source code. These can be used outside of assertion generation such as:

Generate design insights
Identify coverage holes in the test suite
Debug a failing assertion or test.

Impact

GoldMine is a large and open source code base. It has been licensed by three leading EDA companies and forms the basis of a commercial product since 2013. Multiple graduate and undergraduate students have worked on it since 2009. It has been licensed and used by multiple semiconductor companies including Qualcomm, IBM, Oracle, Broadcom and Intel. It has won multiple research awards and has also been used by universities to provide a regression test for their Verilog design course projects.GoldMine has a Verilog RTL parser, a symbolic execution engine, a static analysis engine, a ranking engine and several machine learning algorithms implemented as part of the source code. These can be used outside of assertion generation to (i) generate design insights (ii) identify coverage holes in the test suite (iii) debug a failing assertion or test.

Contribution

The principle behind GoldMine has been used beyond design verification. It has been (or is being) used in software testing, autonomous vehicle verification, neural network architecture validation, reliability, and design interpretability.

The software invariant generation tool we developed is called PRECIS. It performed significantly better than its predecessors that did software invariant generation (DAIKON and other tools).

Features of GoldMine

GoldMine can automatically extract various design information via static analysis of the source code. The static analysis includes the following.
- Control-data flow graph analysis (CDFG).
- Variable dependency graph analysis (VDG).
- Use-def chain analysis.
- Both global and local (with respect to a given target variable) rank analysis of design variables.
- Bounded cone-of-influence (COI) analysis for a given target variable for n number of cycles.
GoldMine can generate static analysis information by traversing arbitrary hierarchy of a design. It saves the static analysis information for the top module and each of the instantiated modules.
GoldMine can generate both intra-modular and inter-modular assertions.
GoldMine can generate assertions for both combinational and sequential designs.
GoldMine can calculate various metrics to quantify the quality of each of the automatically generated assertions and can rank the assertions based on such metrics.

Sample static analysis outputs of GoldMine on a two-port arbiter

Figure 1 shows a control-data flow graph (CDFG) of a two-port arbiter. Each node of the CDFG is labeled with source code line number and the type of the statement that node refers to. For example, 17:IF refers to to the IF statement at line 17 of the two-port arbiter source code. The type of nodes that a CDFG may contain are i) AL (always block), ii) AS (assign block), iii) IF (if statement), iv) BL (blocking statements), v) NS (non-blocking statements), vi) CA (Case statement), and vii) CS/CX (Case/Casex condition). The edges of a CDFG is labeled with the condition variable (if the source node of the edge is an IF/CASE statement) or the statements (if the node is a blocking/non blocking/assign statements).
Figure 2 shows the complete CDFG of the two-port arbiter consisting of two always blocks, one at line number 10 (10:AL) and another at line number 16 (16:AL).
Figure 3 shows the variable dependency graph (VDG) of a two port arbiter. The VDG summarizes both the control dependency and the data dependency that exists among various pairs of design variables and stores them as edge weights
Figure 4 shows a cone-of-influence (COI) fir the target variable gnt2 when the design is unrolled for two clock cycles. The clock cycle is annotated as an integer within square brackets prefixed to the design variable. For example, [1]state represents the state variable at one cycle before the current cycle.