What can Exostiv Blade do for FPGA prototyping?

Classifying FPGA prototyping debug and analysis methodologies.

‘FPGA prototyping’, that is, using FPGA boards to prototype an ASIC or an SoC, can be done with a variety of systems. Using such a system requires additional tools to synthesize and partition the design and, importantly, to debug and analyze it. Most of these debug and analysis tools fall into one of the following categories:

External probes: external instruments such as oscilloscopes or protocol analyzers take measurements from outside the FPGA chips.
FPGA BRAM-based logic analyzers: typically AMD ILA or Intel Signal Tap. These instruments store traces from the FPGA in internal FPGA RAM, which is then read back through a JTAG interface.
Deep trace buffers: this approach extends the JTAG embedded logic analyzer approach and stores the traces in external DRAM.
Global State Visibility: this approach consists of advancing the system clock one tick at a time, stopping the execution of the FPGA after each tick so that the state of the flip-flops and memory can be read back at each cycle, usually through a JTAG interface.

The usual tools to debug and analyze FPGA prototypes

All these techniques have their merits and limitations. Each occupies a specific space, usually summed up with the chart above. We understand that this chart might not be perfect (for instance, why consider multiplexing as a specific type of tool?); however, it reflects a rather good consensus on how such tools are compared. (This chart was re-created after presentations given by some EDA companies.)

The vertical axis of this chart measures the overall capture size, how much of the working system can be observed, which can be summed up as ‘visibility’. As indicated on the axis label, this measure is actually two-dimensional:

‘Signal length’: the capture ‘depth’. How many ‘samples’ can be captured and analyzed; in other words, how many clock cycles, contiguous or not, does the debug and analysis tool let you observe?
‘Signal count’: the capture ‘width’ or ‘reach’. How many internal signals can be observed?

The horizontal axis of this chart shows the frequency at which the system has to run for the debug and analysis method to be usable.

Reading the chart

For instance, ‘Global State Visibility’ scores high vertically because this technique allows reaching all registers of the system. It scores very low horizontally because this approach requires gating the system clock and running long read operations at each clock cycle, so that the system is really used as if it were a succession of ‘static’ states. Conversely, using a JTAG embedded analyzer (‘FPGA BRAM’) provides only a very limited capture width and depth because it uses leftover BRAM resources in the FPGA, and hence scores low vertically. This method works at system speed, however, and can reach several hundred MHz, so it spans wide on the horizontal axis.
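The two extremes described above can be put into rough numbers. The following is a minimal sketch, not a model of any specific tool: the function names and every figure passed to them (free BRAM, state size, readback rate) are hypothetical values chosen only to illustrate the trade-off.

```python
def bram_capture_depth(free_bram_kbits: int, signal_count: int) -> int:
    """Samples an embedded analyzer can store when each sample holds
    one bit per observed signal (1 Kbit = 1024 bits assumed)."""
    return free_bram_kbits * 1024 // signal_count


def stepped_clock_rate_hz(state_bits: int, readback_bits_per_s: float) -> float:
    """Effective system clock rate when the whole state is read back
    after every single clock tick (readback time dominates)."""
    return readback_bits_per_s / state_bits


if __name__ == "__main__":
    # Hypothetical: 4096 Kbit of leftover BRAM, 256 observed signals.
    print("BRAM depth (samples):", bram_capture_depth(4096, 256))
    # Hypothetical: 1,000,000 state bits read back at 10 Mbit/s.
    print("Stepped clock rate (Hz):", stepped_clock_rate_hz(1_000_000, 10_000_000))
```

With these assumed inputs the embedded analyzer sees 256 signals for only 16,384 cycles at full speed, while full state readback sees everything but advances the design at about 10 cycles per second.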

The limitations of Deep Trace Buffer tools

Have you ever wondered why ‘Deep Trace Buffer’ does not score higher in visibility and does not reach system frequency?
Extending the capabilities of Deep Trace Buffers

Integrated FPGA Prototyping systems usually have the following limitations:

1) Storage size and bandwidth limitations.

FPGA Prototyping system vendors often claim that they have ‘deep trace buffers’, meaning DDR SDRAM memory connected to the FPGAs. However, this claim alone is insufficient to understand the real capabilities.

Will the quantity of memory available for debug and the maximum bandwidth to it be practical for your FPGA Prototyping scenario? Is this memory resource dedicated to trace buffers or does it have to be shared with the prototype functionality? It is sometimes wrongly assumed that an FPGA prototyping system does not have to run at speeds beyond 5 or 10 MHz and hence that the bandwidth to the trace buffer is relatively unimportant.

In many instances, however, FPGA prototyping should run at or close to the target speed of operation, at least partially. An example of this occurs when the prototype includes interface IPs that need interoperability testing so they have to be connected to real external devices that cannot be slowed down. With local system speeds at 500 MHz and more, a high bandwidth to memory turns out to be a requirement in the end.

Hence, what is sufficiently ‘deep’ for your use? 1GB? 8GB? More? How much do you need per FPGA? What will be the target prototype speed and how many signals do you want to observe? These questions are crucial to a successful debugging scenario and you should not just stop with a checkmark in front of ‘deep trace buffer’.
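These questions can be turned into a back-of-the-envelope calculation. The sketch below assumes the simplest capture model: every observed signal contributes one bit per clock cycle, with no compression or trigger-based filtering, and 1 GB is taken as 2**30 bytes. The function names and the example figures are illustrative assumptions, not measurements of any particular system.

```python
def trace_bandwidth_gbps(signal_count: int, sample_rate_mhz: float) -> float:
    """Raw trace bandwidth in Gbit/s when every signal is sampled each cycle."""
    return signal_count * sample_rate_mhz * 1e6 / 1e9


def capture_depth_samples(buffer_gbytes: int, signal_count: int) -> int:
    """Number of samples that fit in the trace buffer (1 GB = 2**30 bytes)."""
    buffer_bits = buffer_gbytes * 2**30 * 8
    return buffer_bits // signal_count


def capture_time_ms(buffer_gbytes: int, signal_count: int,
                    sample_rate_mhz: float) -> float:
    """Wall-clock duration covered by a full buffer, in milliseconds."""
    samples = capture_depth_samples(buffer_gbytes, signal_count)
    return samples / (sample_rate_mhz * 1e6) * 1e3


if __name__ == "__main__":
    # Hypothetical scenario: 1024 signals, 8 GB buffer, two prototype speeds.
    for mhz in (10.0, 500.0):
        print(f"{mhz:6.1f} MHz: "
              f"{trace_bandwidth_gbps(1024, mhz):7.2f} Gbit/s, "
              f"{capture_time_ms(8, 1024, mhz):8.1f} ms captured")
```

Under these assumptions, observing 1024 signals at 10 MHz needs about 10 Gbit/s and an 8 GB buffer covers several seconds, whereas the same signals at 500 MHz need 512 Gbit/s and the buffer fills in roughly 134 ms. This is why buffer size and bandwidth have to be considered together.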

From our clients’ experience, we can say that no matter the resources available on an FPGA Prototyping system, there is some level of frustration at some point. Usually, a single DDR memory trace buffer is attached to each of the available FPGAs, which is not very flexible. It will be adequate or far too much for some FPGAs and insufficient for others, with no way to reallocate memory that is physically connected in a fixed scheme (except at the cost of sending data across multiple FPGAs to reach more memory).

2) Architecture and raw performance of the sampling IP and the software.

How fast can the capture run?

Often buried deep within the specs of your FPGA Prototyping system is the maximum speed of the trace capture. Whether it is limited by the capture IP, the platform software or the physical interface with the buffer memory, you will find a limit that is usually below the maximum speed of the FPGA technology in use. Of course, in most cases, partitioning a large design onto multiple FPGAs will force you to reduce the overall frequency, because FPGA-to-FPGA connections are time-multiplexed to carry multiple separate interfaces. For this reason, FPGA Prototyping systems are rarely scaled for at-speed operation, and neither are the accompanying debug resources.

Exostiv Blade extends Deep Trace Buffer capabilities

Exostiv Blade extends FPGA Prototyping Deep Capture Buffers capabilities

The chart above shows how Exostiv Blade is positioned compared to the tools currently available in common FPGA Prototyping systems. It extends the ‘Deep Capture Buffer’ capabilities at various levels:

First, Exostiv Blade provides scalable DDR memory capacity and bandwidth. The user is no longer restricted to the resources available in the FPGA Prototyping system and is able to multiply the memory resources attached to any given FPGA and choose the available bandwidth. The cost in FPGA resources is primarily the allocation of transceiver quads. The granularity is a minimum of 16 GB of DDR memory per 100 Gbps of bandwidth, and this can scale up to 128 GB per 100 Gbps, enabling very deep capture.
Second, the Exostiv Blade IP used to sample data from inside FPGAs is able to run at up to 800 MHz, a figure currently limited by the FPGA technology. Any Exostiv Blade IP insertion must still reach timing closure together with the overall FPGA design, but does not introduce any arbitrary limitation itself.
Finally, Exostiv Blade uses a low-level protocol on the transceivers that can be scaled easily without limiting the available bandwidth. This protocol guarantees that over 90% of the maximum available bandwidth is used for trace data payload. For this reason, the structure of Exostiv Blade does not bring any arbitrary software-related limitation either.
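The figures quoted above (100 Gbps bandwidth granules, 16 to 128 GB of memory per granule, over 90% payload usage, sampling at up to 800 MHz) can be combined into a rough sizing estimate. The sketch below is an illustration under stated assumptions, not the vendor's sizing method: it takes 90% as a lower bound for payload efficiency, counts one bit per signal per sample with no compression, and takes 1 GB as 10**9 bytes. All function and constant names are hypothetical.

```python
LINK_BPS = 100 * 10**9      # one bandwidth granule from the text: 100 Gbps
PAYLOAD_PERCENT = 90        # "over 90%" taken as a conservative lower bound


def payload_bps(links: int = 1) -> int:
    """Trace payload bandwidth in bit/s for a number of 100 Gbps granules."""
    return links * LINK_BPS * PAYLOAD_PERCENT // 100


def max_signals(sample_rate_hz: int, links: int = 1) -> int:
    """Signals that can be streamed continuously at the given sample rate."""
    return payload_bps(links) // sample_rate_hz


def links_needed(signal_count: int, sample_rate_hz: int) -> int:
    """Granules required to stream signal_count signals continuously."""
    required_bps = signal_count * sample_rate_hz
    return -(-required_bps // payload_bps())  # ceiling division


def fill_time_s(memory_bytes: int, links: int = 1) -> float:
    """Seconds of continuous capture before the memory is full."""
    return memory_bytes * 8 / payload_bps(links)


if __name__ == "__main__":
    print("Signals per granule at 800 MHz:", max_signals(800_000_000))
    print("Granules for 1024 signals at 500 MHz:",
          links_needed(1024, 500_000_000))
    for gbytes in (16, 128):
        print(f"{gbytes} GB fills in {fill_time_s(gbytes * 10**9):.2f} s")
```

Under these assumptions, one granule streams about 112 signals continuously at 800 MHz, and the 16 GB and 128 GB configurations hold roughly 1.4 s and 11.4 s of continuous capture at full payload rate. Triggered or non-contiguous capture would stretch the same memory over a longer period.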

Thank you for reading.

– Frederic
