The FPGA problem we are trying to solve
Designing FPGA can be complex. Each step of the design flow brings its own challenges, problems and solutions. As engineers, this is what we do: we find solutions. We use our knowledge, we mobilize our skills and find the right tools to constantly build better solutions to the problems that we encounter.
Exostiv Labs aims at providing better tools for FPGA debug. However, ‘FPGA debug’ may be understood quite diversely. In this post, I’d like to go back to the basics and pinpoint the specific problems we are trying to solve.
‘First time right’ at board bring-up is a problem.
I am an engineer. Like any engineer, I like when things are done right technically, with the right steps taken at the right time.
However, as an engineer, I also have learned to manage budget and market constraints. We all know that between the ‘perfect design flow’ and the ‘actual design flow’, the distance is made of budget and/or market constraints. We all know that sometimes, it is chosen to skip the verification of one ‘detail’ in order to reach the announced release date. Sometimes it is expected from us to accept a small percentage of uncertainty (that is called ‘experience’ or ‘know-how’) because it is statistically cost-effective.
“And what if a bug appears later? Well, we’ll fix it and send an upgrade.
Isn’t it the beauty of programmable devices?”
Sometimes, being ‘first on the market’ or ‘first in budget’ beats ‘first time right’.
A typical in-lab situation
We all have been there…
The system starts up and suddenly, ‘something goes wrong’. It happens randomly, sometimes 2 minutes, sometimes 2 hours after power up.
– We have no clue about what caused the bug.
– We have no idea about why it happens.
– We do not know the time from cause to effect.
This is called ‘an emergent system with a ‘random bug’.
It is emergent because it is the result of complex interactions of individual pieces. Such system-level bugs are the most complex – and time-consuming – to solve because they involve the complex interactions of a whole system.
EXOSTIV solves the problem of finding the roots of bugs appearing in complex systems placed in their real environment.
Solving such a problem requires *a LOT of* visibility on the FPGA under test:
– It is hard to put a trigger condition on an unknown random bug. To spot it, you’d better have the largest capture window possible.
– You sometimes need to extend the capture far in time (stressing the impracticability of simulation as sole methodology).
– What happened before seeing the bug is very important to understand the ‘fatal’ sequence of events that has created the faulty condition.
Without a sufficient observability of the system under test, you’ll simply loose precious engineering hours hoping to -just- capture the bug. Then you’ll spend time again trying to capture the key events that happened before it – that ultimately would help you understand why the bug occurs.
Instrumenting a real design is the key to overcoming modelling mistake and see bugs occurring in the real environment… but only if the instrumented design provides sufficient visibility !
What can you do with 8 GB captured at FPGA speed of operation?
Please think to it. I’ll explore how to make the most of the *huge* resources offered by EXOSTIV in my next post.
Thank you for reading