Your tools work fine. The handoffs between them are the problem.

A six-layer way of seeing an embedded test workflow – from physical devices to test evidence – that makes handoff gaps visible and closable, without replacing any tool.

The thing everyone knows but nobody says

Embedded test teams have spent two decades optimizing individual tools. Every instrument driver, every sequence editor, every reporting module has been refined, versioned, and battle-tested in production. The tools themselves are rarely the bottleneck.

The bottleneck is what happens between them.

One aerospace lab we spoke with described it plainly: three days to set up a test campaign – cables, instrument configs, sequence files, report templates – and twenty minutes to run it. The test itself was fast. The handoffs ate the week.

So what? That ratio doesn’t just waste calendar time. It means the test campaign runs too rarely to catch regressions early, and it means the team burns engineering hours on clerical work that no tool vendor ever designed for.

Auditors feel the same pain from the other side. A quality auditor at a medical-device manufacturer told us they can verify that a test passed – the report says so – but they cannot trace which firmware version was loaded onto the DUT without opening three separate tools and hoping the timestamps align. The data exists. It sits in three different silos, each perfectly self-consistent, none talking to the others.

Most embedded test labs run at least five distinct software tools per workflow: an IDE or flash tool, an instrument control layer, a sequence orchestrator, a data logger, and a report generator. In the workflows we’ve mapped, not one has a single record that connects DUT identity, firmware hash, instrument calibration status, sequence version, raw data, and the final evidence pack. Five tools, zero single source of truth.

So what? When there is no traceable chain from DUT to evidence, every audit is a forensic exercise. Every handoff is a place where the story can break.

Then there is the bus-factor problem. At a German automotive supplier, the one senior engineer who knew the exact pin-mapping between the Vector box and the custom breakout board went on leave for two weeks. The test cell sat idle. Not because the tools didn’t work – they worked perfectly – but because the connection between the instrumentation layer and the stand configuration lived exclusively in one person’s notebook. Nobody else could recreate the setup without reverse-engineering it.

So what? Bus-factor risk isn’t a staffing problem. It’s a handoff problem. The knowledge existed; it just wasn’t captured in a form the next engineer could consume.

L5 Evidence: Report, traceability matrix, regulatory submission. (DIAdem, Excel, PDF)
    nobody owns this handoff
L4 Data Processing: Raw readings to pass/fail. MATLAB, Python, analysis scripts. (MATLAB, Python, Excel)
    nobody owns this handoff
L3 Orchestration: Flash firmware, wait for signal, ramp voltage, branch on measurement. (LabVIEW, TestStand, OpenTAP, pytest)
    nobody owns this handoff
L2 Stand Configuration: Wiring, pin mappings, calibration offsets, physical topology. (paper notebook, whiteboard photo)
    nobody owns this handoff
L1 Instrumentation: Oscilloscopes, Vector CAN, dSPACE HIL, JTAG, breakout boards. (Vector, dSPACE, NI, JTAG probes)
    nobody owns this handoff
L0 Device Under Test: Physical unit, firmware revision, hardware variant, serial number. (ECU, microcontroller, sensor)

The Six-Layer Test Stack – Tools Own Layers, Nobody Owns the Handoffs

Figure 1: The six layers every test workflow spans. Tools own vertical slices within individual layers – nobody owns the handoffs (the dashed red lines) between them.

Every test workflow spans six layers – whether you named them or not

Here is a descriptive model. It is not a product architecture, not a standard, not something you need to buy into. It is simply what we observe across the embedded test workflows we’ve mapped: every one of them spans six functional layers.

L0 – Device Under Test. The physical unit, its firmware revision, its hardware variant, its serial number. This layer is the subject of everything that follows.

L1 – Instrumentation. Oscilloscopes, power supplies, Vector CAN interfaces, dSPACE simulators, JTAG programmers, custom breakout boards. The things that electrically or logically connect to the DUT.

L2 – Stand Configuration. How L1 devices are wired together, pin mappings, calibration offsets, interface baud rates, the physical topology. This is the layer most often stored in a lab notebook or a whiteboard photo.

L3 – Orchestration. The sequence that decides what runs when: flash this firmware, wait for that signal, ramp this voltage, branch on that measurement. LabVIEW, TestStand, OpenTAP, pytest, proprietary scripts – the tool doesn’t matter for the model.

L4 – Data Processing. Raw readings become pass/fail decisions. Thresholds are applied, timestamps aligned, units converted. This is where measurement becomes meaning.

L5 – Evidence. The report, the traceability matrix, the export that lands in the quality system or the regulatory submission. The artifact that outlasts the test.

These six layers are technology-agnostic. They describe a workflow that runs on National Instruments hardware, on dSPACE HIL rigs, on homegrown Python stacks, or on a mix of all three. The layers don’t prescribe tooling – they describe function.

Here is the uncomfortable observation: most tools own one or two layers really well, and nothing owns the spaces between them. LabVIEW and TestStand dominate L3 orchestration for many labs, but they do not reach down into L1 device driver configuration or up into L5 evidence packaging. Vector tools excel at L1 CAN instrumentation but know nothing about L3 sequencing logic. Excel sits in L4 doing data-munging work it was never designed for.

So what? The gaps aren’t failures of any vendor. They are structural. The industry optimized layers; it never optimized the handoffs between layers.

Layer | What It Is | Vendor coverage (NI / LabVIEW, Vector, dSPACE, Keysight / OpenTAP)
L0 DUT | Physical unit, firmware, serial | -
L1 Instrumentation | Scopes, CAN, HIL, JTAG | Vector: CAN bus; dSPACE: HIL; Keysight / OpenTAP: Instruments
L2 Stand Config | Wiring, pin maps, calibration | -
L3 Orchestration | Sequences, flow control | NI / LabVIEW: TestStand; Keysight / OpenTAP: OpenTAP
L4 Data Processing | Raw to pass/fail | Built-in
L5 Evidence | Report, traceability | NI / LabVIEW: DIAdem
Handoffs | The arrows between rows | -

Table: Every major vendor owns 1-2 layers. The bottom row – the handoffs – is collectively unowned. This is not a product gap; it’s a category gap.

The three handoff gaps that kill repeatability

When we map real workflows onto the six-layer model, three gaps appear so consistently that they deserve names.

Gap 1 – The device-to-orchestration blind spot (L1 ↔ L3). The orchestrator says “measure voltage on channel 3.” It has no way of knowing whether the power supply connected to channel 3 was calibrated last Tuesday or last year. The instrument knows its calibration date; the orchestrator knows the test step. The connection between them – which instrument, which calibration, which configuration – is a manual lookup.

So what? A test that passes on a calibrated instrument can pass identically on an uncalibrated one, and the evidence trail won’t show the difference. Repeatability isn’t “the test ran” – it’s “the test ran under known conditions.”
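
A pre-check of this kind can be small. The sketch below is one way to do it, not a reference implementation: it assumes calibration due dates are available as plain data (for example, exported from the lab's asset or calibration system), and the `Instrument` schema, the serial numbers, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Instrument:
    """One L1 instrument as the pre-check sees it (hypothetical schema)."""
    serial: str
    channel: int
    cal_due: date  # date on which the current calibration expires


def calibration_precheck(instruments, today):
    """Return a list of problems; an empty list means every instrument is in cal."""
    problems = []
    for inst in instruments:
        if inst.cal_due < today:
            problems.append(
                f"channel {inst.channel}: {inst.serial} "
                f"calibration expired {inst.cal_due.isoformat()}"
            )
    return problems


# Illustrative inventory; real data would come from the lab's own records.
rack = [
    Instrument(serial="PSU-0042", channel=3, cal_due=date(2026, 3, 1)),
    Instrument(serial="DMM-0007", channel=1, cal_due=date(2027, 1, 15)),
]

problems = calibration_precheck(rack, today=date(2026, 6, 12))
for line in problems:
    print(line)
```

Run before the sequence starts, a non-empty result can block the run or simply be logged next to the verdict. Either way, the calibration state stops being a manual lookup.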

Gap 2 – The stand topology tribal knowledge (L2 ↔ L3). The orchestrator assumes a signal path exists. What it doesn’t capture is that the CAN channel on the Vector box is wired through a specific breakout harness, pinned to J12 on the DUT carrier board, with 120Ω termination enabled only when a second node isn’t present. That knowledge lives in the senior engineer’s head, or on a sticky note, or in a WhatsApp message from three years ago.

So what? When the harness is rebuilt or the engineer leaves, the test still runs. It might even still pass. But nobody can prove it’s the same test. In a regulated context, that’s indistinguishable from not having run it at all.
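
One way to make "the same test" provable is to write the topology down as data and fingerprint it. The sketch below assumes the stand description fits in a JSON-serialisable structure; the field names (`signal_paths`, `dut_connector`, and so on) and their values are illustrative, not a proposed schema.

```python
import hashlib
import json

# Hypothetical stand description; keys and values are illustrative only.
stand_config = {
    "stand_id": "cell-A",
    "signal_paths": [
        {
            "interface": "vector-can-ch1",
            "harness": "breakout-harness-rev-b",
            "dut_connector": "J12",
            "termination_ohm": 120,
            "termination_rule": "enabled only when no second node is present",
        }
    ],
}


def config_fingerprint(config):
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


fingerprint = config_fingerprint(stand_config)
print(fingerprint)
```

Because the encoding is canonical, the fingerprint does not depend on key order, and any change to the wiring description produces a different value. Stored with each run, it lets a later reader check whether two runs used the same described topology.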

Gap 3 – The evidence cliff (L3 ↔ L5). The orchestrator produces a pass/fail verdict. The evidence pack needs that verdict plus firmware version, instrument serial numbers, calibration dates, raw data pointers, and a timestamp chain a human can follow. In most workflows, the orchestrator hands off a single boolean – PASS – and a human reconstructs the rest by hand.

So what? The cliff between L3 and L5 is where traceability dies. Every minute a human spends copying firmware hashes from one tool into a report template is a minute the process depends on attention, not on automation. Attention fails. Auditors know this.
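
To make the contrast with a bare boolean concrete, here is a minimal sketch of a run record that travels with the verdict. Everything in it is an assumption for illustration: the `RunRecord` fields, the identifiers, and the placeholder firmware bytes. A real record would be filled from the flash tool, the instruments, and the sequence repository.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunRecord:
    """Context L5 needs about one run, emitted by L3 alongside the verdict."""
    run_id: str
    dut_serial: str
    firmware_sha256: str
    stand_config_sha256: str
    instrument_cal_due: dict  # instrument serial -> calibration due date (ISO 8601)
    sequence_version: str
    raw_data_path: str
    started_utc: str
    verdict: str  # "PASS" or "FAIL"


def sha256_hex(data: bytes) -> str:
    """Hex digest used to bind a binary artifact to the run."""
    return hashlib.sha256(data).hexdigest()


firmware_image = b"\x00example firmware image\xff"  # placeholder bytes, not a real image

record = RunRecord(
    run_id="run-0001",
    dut_serial="ECU-000123",
    firmware_sha256=sha256_hex(firmware_image),  # captured at flash time
    stand_config_sha256=sha256_hex(b'{"stand_id":"cell-A"}'),
    instrument_cal_due={"PSU-0042": "2027-03-01"},
    sequence_version="v1.4.2",
    raw_data_path="runs/run-0001/raw.csv",
    started_utc="2026-06-12T08:30:00Z",
    verdict="PASS",
)

evidence_json = json.dumps(asdict(record), indent=2, sort_keys=True)
print(evidence_json)
```

The verdict is still there, but it no longer travels alone: the report generator can consume the JSON directly, with nothing copied in by hand.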

Gap | Name | Symptom | Root Cause | Fixable With Existing Tools?
1 | Calibration Blindness (L1 to L3) | Test runs, nobody knows if instrument was in cal | Orchestrator has no API to query cal state | Yes: add a pre-check script
2 | Tribal Knowledge (L2 to L3) | Test only works when one engineer is present | Pin maps, terminations live in a person’s head | Partially: document once, syncing stays manual
3 | Evidence Reconstruction (L3 to L5) | Every audit triggers days of forensic work | Orchestrator emits PASS/FAIL; context reconstructed by hand | No: structural – requires a handoff layer

Table: The three gaps, their symptoms, root causes, and whether they are fixable within your current toolchain.

Self-diagnosis – map your own workflow in 20 minutes

The six-layer model is most useful as a diagnostic tool you can apply without buying anything, without installing anything, and without involving a vendor.

Here is the method. Pick one workflow – one test campaign, one validation sequence, one production-line check – and do four things.

First, list every tool involved, in order of appearance. Don’t abstract. If the firmware gets flashed with STM32CubeProgrammer and the engineer types the serial number into an Excel sheet by hand, write both.

Second, draw six boxes on a whiteboard: L0 through L5. Place each tool in the box (or boxes) where it operates. Some tools will sit in one box; some will straddle boundaries.

Third, circle every arrow between boxes where one tool hands off to another. These are your handoff points.

Fourth, for each circled arrow, ask one question: is this handoff machine-readable, or does it depend on a person doing the right thing at the right time? Machine-readable means a file, a structured log, an API call, a database write – something a script could consume without human interpretation.

This takes twenty minutes. It costs nothing. The output is a gap map: a visual answer to the question “where does our process depend on tribal knowledge, manual copying, or hope?”

Teams often discover that half their handoffs are person-dependent – and that they already knew this, but had never seen it drawn. The map doesn’t fix anything. It tells you where to look.

Step | Action | Time | Red Flag If…
1 | List every tool in execution order | 5 min | You need to ask a colleague what tools are used
2 | Draw six rows (L0-L5), place each tool | 5 min | A tool sits in a row you did not expect
3 | Circle every arrow crossing a row boundary | 5 min | More than 5 circled arrows
4 | Machine-readable or person-dependent? | 5 min | Any person-dependent arrow at an L2-to-L3 or L3-to-L5 boundary

Table: The four-step self-diagnosis. Twenty minutes, one whiteboard, zero cost. Output: a bus-factor risk map.
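
If you prefer the whiteboard result in a form you can keep under version control, the same four steps can be sketched in a few lines. The example handoffs below are invented for illustration, and the `Handoff` structure is an assumption, not a standard format. The two flagged boundaries are the ones from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handoff:
    """One circled arrow on the whiteboard."""
    source_layer: int       # 0..5
    target_layer: int       # 0..5
    description: str
    machine_readable: bool  # True if a script could consume it unaided


# Boundaries the table above singles out as red flags.
CRITICAL_BOUNDARIES = {(2, 3), (3, 5)}


def gap_map(handoffs):
    """Split out person-dependent handoffs and those on critical boundaries."""
    person_dependent = [h for h in handoffs if not h.machine_readable]
    critical = [
        h for h in person_dependent
        if (h.source_layer, h.target_layer) in CRITICAL_BOUNDARIES
    ]
    return person_dependent, critical


# Illustrative workflow; replace with the arrows from your own map.
handoffs = [
    Handoff(0, 3, "serial number typed into the sequence by hand", False),
    Handoff(2, 3, "pin map read from a whiteboard photo", False),
    Handoff(3, 4, "orchestrator writes a CSV log", True),
    Handoff(3, 5, "verdict copied into the report template", False),
]

person_dependent, critical = gap_map(handoffs)
print(f"{len(person_dependent)} of {len(handoffs)} handoffs are person-dependent")
for h in critical:
    print(f"red flag L{h.source_layer}->L{h.target_layer}: {h.description}")
```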

When self-diagnosis isn’t enough

Some gaps you find will be fixable with the tools you already own. If the L4 data processing step is a Python script that reads a CSV but an engineer retypes column headers by hand every run, a twenty-line script fixes that. Within-layer gaps respond well to local optimization.
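
As an illustration of how small such a fix can be, the sketch below renames the columns of a logger CSV using only the Python standard library. The column names in `HEADER_MAP` are made up; the real mapping depends on what your logger emits and what your analysis script expects.

```python
import csv
import io

# Hypothetical mapping from the logger's column names to the names the
# analysis script expects.
HEADER_MAP = {
    "Time [s]": "time_s",
    "V_out (V)": "vout_v",
    "I_out (mA)": "iout_ma",
}


def normalise_headers(src, dst, header_map):
    """Copy CSV rows from src to dst, renaming the header row via header_map."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    unknown = [name for name in header if name not in header_map]
    if unknown:
        # Fail loudly instead of silently passing an unexpected column through.
        raise ValueError(f"unmapped columns: {unknown}")
    writer.writerow([header_map[name] for name in header])
    writer.writerows(reader)


# In-memory example; with real files, open them with newline="".
raw = io.StringIO("Time [s],V_out (V),I_out (mA)\n0.0,3.30,12.5\n0.1,3.31,12.7\n")
clean = io.StringIO()
normalise_headers(raw, clean, HEADER_MAP)
print(clean.getvalue())
```

Raising on an unmapped column is a deliberate choice here: a logger format change then stops the run instead of producing a quietly wrong result.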

Cross-layer gaps are different. When the handoff between L2 stand configuration and L3 orchestration is entirely manual – when the orchestrator has no structured knowledge of which instrument is on which port, with which calibration – no single-tool improvement closes that gap. The gap is structural. It exists because the tools were designed to own layers, not handoffs.

The question at this point isn’t “should we replace our tools?” It’s “does one specific workflow justify an outside diagnostic?” Not a tool evaluation. A diagnostic: someone who has mapped dozens of these workflows, who can read yours against that pattern, and who can tell you what’s fixable with existing tooling and what requires something new between the tools.

An outside diagnostic produces three things: a workflow map (your six layers, drawn with your tools, your gaps), a gap assessment (which gaps are within-layer and which are cross-layer structural), and a scope boundary – a clear statement of what’s in and out of any follow-up work. It doesn’t produce a bill of materials. It doesn’t produce a migration plan. It produces clarity about where your repeatability actually breaks.

The point: you stop guessing whether the problem is people, process, or tools. You get a map that lets you act on the answer.

Dimension | Before (Manual Handoff) | After (Machine-Readable Handoff)
DUT identity | Engineer types serial number into LabVIEW and again into the report | Serial number read once at L0, propagated to L5 automatically
Firmware version | Retrieved by opening the flash tool separately during evidence assembly | Firmware hash captured at flash time, bound to test run ID
Calibration state | “I think we calibrated that rack last month” | Calibration status queried from instrument before sequence starts, logged with results
Wiring / pin map | Lives in a marked-up PDF on a shared drive | Configuration file version-controlled with the sequence, hash stored in evidence
Audit trail | Human opens 3 tools, reconstructs timeline | Single record links DUT to firmware to cal to sequence to data to evidence
Bus factor | 2-3 senior engineers | 0 – the system knows what the people used to know

Table: The target state for a single handoff. Not a new tool – making the arrow between tools carry machine-readable context instead of human memory.

Map one workflow. Find the gaps. No pitch.

If the self-diagnosis turns up cross-layer gaps you can’t close with existing tools, an outside diagnostic is the logical next step – not a tool purchase, not a platform migration.

The diagnostic runs one to two weeks. It covers one workflow, end to end. It produces a workflow map, a gap assessment, and a scope boundary. It costs between two and five thousand euros, depending on complexity and toolchain diversity. It does not involve replacing any existing tool. It makes no compliance guarantees. It is not a sales process dressed as consulting – if your gaps are fixable with what you already have, that’s the answer you’ll get.

Request a diagnostic call – 20 minutes, no deck.

Marcin June 12, 2026
