Advanced X-Propagation Sign-off Methodology for Network Security Chip

Savitha Raghunath, Sr. Principal Engineer, ASIC Design
Palo Alto Networks

I. Overview

Palo Alto Networks’ RTL designers incorporate a comprehensive static sign-off methodology during RTL design that includes linting, clock domain crossing, and X-propagation sign-off.

This paper details the advanced X-propagation methodology we’ve developed to identify the sources of X-initialization errors and fix them before the errors propagate.

II. Goal: Eliminate X-Propagation Problems

X-propagation issues are far less prevalent during design than the issues found during RTL linting and clock domain crossing analysis.

However, even one X-bug can lead to incorrect behavior under certain conditions and cause a chip failure; further, identifying that type of bug in silicon is extremely hard.

Our objective was to develop a more advanced sign-off methodology so our RTL designers could sign off, early in our design flow, that their design modules had no X-propagation errors.

III. Initial Methodology: Analysis during Simulation Only

Our earlier methodology only involved doing X-propagation analysis during simulation, using existing simulation stimulus.

Doing X-propagation analysis during simulation had value; however, we risked missing issues because simulation’s dynamic analysis approach inherently relies on the simulation test patterns. X-propagation can only be identified if there is a test pattern that enables catching it.

Simulation is not exhaustive. The tests are randomized and don’t cover all possible scenarios, so coverage is limited by the tests run. Potential X-optimism and X-pessimism issues during simulation are described below.

X-Optimism

X-optimism occurs when a select signal is X, so the D value it controls should also be X, but the simulator instead evaluates D as zero. This leaves the false conclusion that the flop was initialized to zero, which can mask real design errors.
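The effect can be illustrated with a minimal sketch, written in Python rather than RTL. It assumes the common case of an `if (sel)` statement, where a 4-state simulator treats an unknown condition as false and takes the else branch. The names `X`, `sim_if_else`, and `hardware_possible` are hypothetical and exist only for this illustration.

```python
X = "x"  # stand-in for an unknown (uninitialized) logic value

def sim_if_else(sel, a, b):
    """Mimic RTL `if (sel) d = a; else d = b;` in a 4-state simulator.

    An X select is not true, so the else branch is taken and d looks known.
    """
    return a if sel == 1 else b

def hardware_possible(sel, a, b):
    """Values real hardware could produce for the same logic."""
    if sel == X:
        return {a, b}  # in silicon the select settles to 0 or 1
    return {a if sel == 1 else b}

d_sim = sim_if_else(X, 1, 0)       # simulator reports a clean 0
d_hw = hardware_possible(X, 1, 0)  # silicon could give 0 or 1
print(d_sim, sorted(d_hw))
```

The simulator reports a single known value while the hardware could settle to either one, which is why the flop appears to have been initialized.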

X-Pessimism

X-pessimism occurs when a definite 0 or 1 value is pessimistically evaluated by the simulator to be X.

This leads to false failures being reported.
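A minimal Python sketch of X-pessimism, assuming a multiplexer built from AND, OR, and NOT gates and evaluated with three-valued logic. When both data inputs are 1, the output is 1 whatever the select is, yet gate-by-gate evaluation with an X select returns X. All function names here are hypothetical.

```python
X = "x"  # stand-in for an unknown logic value

def and3(p, q):
    if p == 0 or q == 0:
        return 0
    if p == 1 and q == 1:
        return 1
    return X

def or3(p, q):
    if p == 1 or q == 1:
        return 1
    if p == 0 and q == 0:
        return 0
    return X

def not3(p):
    return X if p == X else 1 - p

def gate_mux(sel, a, b):
    """Mux as (sel AND a) OR (NOT sel AND b), evaluated gate by gate."""
    return or3(and3(sel, a), and3(not3(sel), b))

def actual_mux(sel, a, b):
    """Evaluate the mux for every concrete value an X select could take."""
    sels = (0, 1) if sel == X else (sel,)
    outs = {gate_mux(s, a, b) for s in sels}
    return outs.pop() if len(outs) == 1 else X

print(gate_mux(X, 1, 1))    # gate-by-gate evaluation: x
print(actual_mux(X, 1, 1))  # every real select value gives 1
```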

IV. Methodology Advancement: Adding X-Propagation Static Sign-off

We decided to augment our methodology to include X-propagation static analysis and sign-off during the RTL design process. We use Real Intent’s Meridian RXV X-propagation sign-off tool.

Figure: X-Propagation Static Analysis

IV-1. Constraining the Design for X-Propagation Sign-Off

To properly constrain the static analysis, our designers input the clocks, the resets (including reset duration and the order in which the resets are released), and the module inputs list. This is straightforward and fast to set up for the majority of our blocks.
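As a rough illustration of what that setup information looks like when written down, here is a tool-agnostic Python sketch. It is not the constraint format of any sign-off tool; the class names, field names, and signal names (`core_clk`, `por_rst_n`, and so on) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ResetSpec:
    name: str
    hold_cycles: int    # how long the reset stays asserted
    release_order: int  # lower numbers are released first

@dataclass
class XPropSetup:
    clocks: list
    resets: list
    inputs: list

    def release_sequence(self):
        """Reset names in the order they are de-asserted."""
        ordered = sorted(self.resets, key=lambda r: r.release_order)
        return [r.name for r in ordered]

    def problems(self):
        """Basic sanity checks before starting an analysis run."""
        found = []
        if not self.clocks:
            found.append("no clocks declared")
        if not self.inputs:
            found.append("no module inputs listed")
        for r in self.resets:
            if r.hold_cycles <= 0:
                found.append(f"{r.name}: reset duration must be positive")
        return found

# Hypothetical module setup
setup = XPropSetup(
    clocks=["core_clk"],
    resets=[ResetSpec("soft_rst_n", 8, 2), ResetSpec("por_rst_n", 16, 1)],
    inputs=["pkt_valid", "pkt_data"],
)
print(setup.release_sequence())  # ['por_rst_n', 'soft_rst_n']
print(setup.problems())          # []
```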

Our goal is to minimize noise in the violations reported to make our review process as efficient as possible. So, for modules with unusual complexity, such as third-party IP with multiple resets and clock domains, we provide additional constraints to show design intent, which reduces noise in the violation report.

For example, a good design rarely reads from a FIFO unless it has been written to. To prevent false-positive violations reporting that the data coming out of the FIFO is all X’s, we add a static analysis constraint: if the read enable to the FIFO is asserted, assume the data stored in the FIFO is valid and not an X value.
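The effect of such a constraint can be sketched in Python by enumerating a small abstract state space. This is only an illustration of the idea, not the constraint syntax of any tool; the function names and the `0xA5` data value are hypothetical.

```python
from itertools import product

X = "x"  # stand-in for an unknown logic value

def fifo_read_data(read_en, entry_written):
    """Data seen by downstream logic for one abstract FIFO state."""
    if not read_en:
        return None  # nothing is consumed this cycle
    return 0xA5 if entry_written else X  # unwritten storage is unknown

def read_only_after_write(read_en, entry_written):
    """Hypothetical constraint: read enable asserted implies valid data."""
    return (not read_en) or entry_written

def x_violations(constraint=None):
    """List the (read_en, entry_written) states in which an X is consumed."""
    found = []
    for read_en, entry_written in product((False, True), repeat=2):
        if constraint and not constraint(read_en, entry_written):
            continue  # state excluded as contrary to design intent
        if fifo_read_data(read_en, entry_written) == X:
            found.append((read_en, entry_written))
    return found

print(x_violations())                       # [(True, False)]
print(x_violations(read_only_after_write))  # []
```

Without the constraint, the analysis considers a read from an unwritten entry and reports an X; with it, that state is ruled out and the report stays clean.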

This leads to a more precise list of likely errors to review. As our static analysis tool and methodology improve, we expect to need fewer constraints for our more complex blocks.

IV-2. Running Static Analysis Quickly During Design

Most of the X-propagation sign-off static analysis runs are very fast, even for our design modules, some of which have many millions of gates. (More complex modules can take a few hours to run.)

This speed is important as our RTL designers want to be able to run the analysis quickly during design.

Thus, we did not evaluate formal tools for testing X-propagation because formal tools tend to be more capacity-limited and time-consuming to run.

IV-3. Using Waveforms for Debug

Debug efficiency is also a key element of our methodology. When a violation is reported, our designers primarily use the static analysis tool’s integrated waveform viewer to view the X-optimism errors and do root cause analysis to trace the original X-source.

The order in which events happen is what matters. The very first uninitialized flop is the source of the problem; once something goes X, it has a ripple effect on the design. So, we look at the waveform to see where the X came from and where it propagated. When we want to debug a violation, we then right-click on the waveform to open the schematic.

Below is a representative design analysis to determine the root cause of an X-bug.

Figure: X from “flop_b1” propagates to “flop_b2”, causing an X-optimism failure
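The backward trace from a failing flop to the first uninitialized flop can be sketched in Python as a walk through the fan-in cone, following whichever driver went X earliest. This is a simplified illustration, not how any waveform viewer works internally. The flop names come from the example above, while the connectivity, the intermediate signals `mux_out` and `data_in`, and the time stamps are assumptions.

```python
def trace_x_source(signal, fanin, first_x_time):
    """Walk backward from `signal`, following drivers that carry an X.

    fanin maps a signal to the signals driving it; first_x_time maps each
    signal that ever went X to the time it first did so. Returns the path
    from the failing signal back to the earliest X source found.
    """
    path = [signal]
    current = signal
    while True:
        x_drivers = [d for d in fanin.get(current, [])
                     if d in first_x_time and d not in path]
        if not x_drivers:
            return path  # no X-carrying driver left: this is the source
        current = min(x_drivers, key=lambda d: first_x_time[d])
        path.append(current)

# Assumed connectivity: flop_b1 -> mux_out -> flop_b2
fanin = {
    "flop_b2": ["mux_out"],
    "mux_out": ["flop_b1", "data_in"],
    "flop_b1": [],
}
first_x_time = {"flop_b1": 0, "mux_out": 0, "flop_b2": 1}

path = trace_x_source("flop_b2", fanin, first_x_time)
print(path)  # ['flop_b2', 'mux_out', 'flop_b1']
```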

V. X-Propagation Sign-Off Result: Found X-Bug Missed by Simulation

Our new, advanced X-propagation static sign-off approach caught an X-optimism bug in a multimillion-gate proprietary in-house network design that was missed during simulation-based X-propagation analysis.

The X was being cleared in the course of the stimulus because of the order in which events occurred and the way the patterns were sent in the simulator. As a result, the bad state was flushed out before an input came in. Our initial simulation debug required months of effort.

Our static X-propagation sign-off methodology quickly identified all the X-propagation issues, including the missed X-bug. First, a low-effort functional analysis found all problem flip-flops. Second, a higher-effort analysis filtered out the harmless X sources (flip-flops with X’s that did not propagate). The debug process was rapid because we had prioritized views in our X-propagation reporting.
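The two-pass idea can be sketched in Python as follows. This is a crude structural stand-in that assumes simple forward reachability decides whether an X matters; an actual sign-off tool performs functional analysis, which this sketch does not attempt. All flop and signal names are hypothetical.

```python
def find_x_sources(flops):
    """Pass 1: every flop with no reset value is a potential X source."""
    return sorted(name for name, has_reset in flops.items() if not has_reset)

def reaches(source, fanout, observed):
    """Forward reachability from one source to any observed signal."""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node in observed:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(fanout.get(node, []))
    return False

def filter_harmful(sources, fanout, observed):
    """Pass 2: keep only the sources whose X can reach an observed signal."""
    return [s for s in sources if reaches(s, fanout, observed)]

# Hypothetical design: flop name -> has a reset value
flops = {"ctrl_state": False, "data_pipe": False, "cfg_reg": True}
fanout = {
    "ctrl_state": ["fsm_next"],
    "fsm_next": ["out_valid"],
    "data_pipe": ["dbg_unused"],
}
observed = {"out_valid"}

sources = find_x_sources(flops)
harmful = filter_harmful(sources, fanout, observed)
print(sources)  # ['ctrl_state', 'data_pipe']
print(harmful)  # ['ctrl_state']
```

Splitting the work this way keeps the first pass cheap and reserves the more expensive propagation check for the candidates it produces.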

Conclusion

We now have several design engineers running X-propagation static sign-off during the RTL design stage, before simulation, to catch X-related design bugs and eliminate the avalanche effect of X-propagation.

Our experience is that it’s best to have the design engineer with intimate knowledge of the specific module run the static analysis to find and debug any X-propagation issues. It would be very difficult for an engineer to do the analysis and debug without that depth of knowledge.

We expect X-propagation static analysis and sign-off to be a continuing requirement, given the inherent potential for X-bugs that comes from the nature of designs and partitioning. Different modules are coded differently or by different people. For example, a control signal may go from module A to module B, where module B doesn’t know that module A did not reset that signal.

This enhanced approach is becoming part of our sign-off process.