Universität zu Lübeck Thesis System

Hardware‑Backed MCDC Coverage Framework

IN COOPERATION WITH UNIVERSITY OF APPLIED SCIENCE IN BERGEN/NORWAY

Redirecting Clang’s MCDC Coverage Instrumentation to ARM CoreSight: A High‑Performance, Hardware‑Backed Coverage Collection Framework


1. Motivation & Background

1.1. Modified Condition/Decision Coverage (MCDC)

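MCDC is a stringent white‑box testing criterion required or highly recommended by safety‑critical standards such as DO‑178C (Level A), ISO 26262, and IEC 61508. It demands that each atomic condition in a decision be shown to independently affect the decision’s outcome: in its unique‑cause form, every condition needs a pair of test cases that differ only in that condition and yield different decision outcomes.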
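To make the independence requirement concrete, the following minimal Python sketch checks a small test set against the unique‑cause form of the criterion. The decision `a and (b or c)`, the four test vectors, and the helper name `independence_pairs` are illustrative assumptions and not part of any tool described in this proposal.

```python
from itertools import combinations


def independence_pairs(decision, vectors):
    """Map each condition index to the vector pairs that show its independent effect.

    A pair qualifies when the two vectors differ in exactly one condition
    and the decision outcome differs as well (unique-cause MC/DC).
    """
    pairs = {i: [] for i in range(len(vectors[0]))} if vectors else {}
    for u, v in combinations(vectors, 2):
        differing = [i for i, (x, y) in enumerate(zip(u, v)) if x != y]
        if len(differing) == 1 and decision(*u) != decision(*v):
            pairs[differing[0]].append((u, v))
    return pairs


def example_decision(a, b, c):
    # Hypothetical decision with three atomic conditions.
    return a and (b or c)


example_tests = [
    (True, True, False),   # outcome True
    (False, True, False),  # outcome False, differs from the first only in a
    (True, False, False),  # outcome False, differs from the first only in b
    (True, False, True),   # outcome True, differs from the third only in c
]

pairs = independence_pairs(example_decision, example_tests)
fully_covered = all(pairs[i] for i in pairs)
print(fully_covered)
```

With these four vectors (n + 1 for n = 3 conditions) every condition has one demonstrating pair; dropping the last vector leaves condition `c` without one.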

Historically, generating MCDC data on embedded targets is cumbersome:

  • Compiler‑based tools (e.g., GCOV with its lcov front end) instrument the program, write large coverage data files (.gcda) on the target, and then post‑process them on a host PC.
  • The instrumentation overhead can be high (extra branches, counters) and the I/O bandwidth needed to dump GCOV files can dominate execution time on low‑power micro‑controllers.

1.2. Clang’s New MCDC Instrumentation

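The LLVM/Clang project (January 2024) introduced built‑in support for MCDC as part of its source‑based code coverage, enabled with -fcoverage-mcdc in addition to -fprofile-instr-generate and -fcoverage-mapping. The compiler now inserts instrumentation points that record the evaluation of each condition and decision in profile counters and bitmaps. The profile runtime writes this data to a .profraw file, which llvm-profdata and llvm-cov then turn into an MCDC report, a file‑based workflow analogous to GCOV’s .gcda/.gcno files.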

1.3. ARM CoreSight

CoreSight is ARM’s on‑chip debug, trace and profiling infrastructure. It provides:

  • ETM/PTM trace for instruction flow. 
  • ITM/DWT for software‑ and event‑driven instrumentation (e.g., stimulus‑port writes, data‑write events), with CTI/CTM for cross‑triggering between components.
  • TPIU/ETB for high‑speed, low‑latency transport (UART, USB, high‑speed trace ports, SWO).

CoreSight can stream events directly from the target to a host debugger with negligible software overhead, making it ideal for collecting coverage data in real‑time while preserving deterministic timing.

1.4. Gap

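Clang’s MCDC instrumentation is tied to a file‑based workflow: coverage data is accumulated in target memory and dumped as a profile file for offline post‑processing, in the same way as classic GCOV. No existing solution routes the coverage events through CoreSight, which would give: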

  • Deterministic, low‑overhead collection (no file system writes). 
  • Immediate visibility of coverage for interactive debugging. 
  • Possibility of live‑feedback to a test harness (e.g., stopping a test when a condition is uncovered).

Goal: “Hijack” the Clang MCDC instrumentation so that every coverage point generates a CoreSight trace event, bypassing GCOV entirely.


2. Problem Statement

How can LLVM/Clang’s MCDC instrumentation be repurposed to emit coverage events via ARM CoreSight, while preserving correct semantics, keeping overhead below a defined threshold, and providing a usable end‑to‑end toolchain for safety‑critical embedded developers?

Key challenges:

  1. Mapping Instrumentation to CoreSight Events – The instrumentation currently writes to internal counters; we must replace this with writes to CoreSight‑observable registers or memory‑mapped trace ports.
  2. Portability & configurability – The solution should compile for any ARM Cortex‑M / Cortex‑A core equipped with CoreSight (or a simulated environment such as QEMU with a CoreSight model).
  3. Toolchain Integration – We need a workflow that starts from source, produces an ELF that streams coverage, and a host‑side collector that reconstructs the MCDC matrix.
  4. Performance Guarantees – Quantify the added latency, code‑size increase, and any impact on timing‑critical code.
  5. Verification of Correctness – Ensure that the event stream faithfully represents the same coverage information as the reference GCOV implementation.

3. Research Questions (RQs)

| # | Question |
|---|---|
| RQ‑1 | What is the minimal set of CoreSight primitives (e.g., ITM, ETM, CTI, TPIU) required to convey every MCDC instrumentation point without loss of information? |
| RQ‑2 | How does the CoreSight‑based coverage collection affect the execution time and binary size compared with the classic GCOV approach on representative safety‑critical benchmarks? |
| RQ‑3 | What software architecture (instrumentation library, runtime shim, host collector) provides the most robust and portable solution across different ARM families? |
| RQ‑4 | Can the generated CoreSight event trace be transformed back into the standard GCOV data format (or a new standardized format) so that existing coverage analysis tools can be reused? |
| RQ‑5 | How does the proposed framework integrate into a continuous‑integration (CI) pipeline for safety‑critical development, and what are the implications on certification (traceability, auditability)? |

4. Objectives & Scope

| Objective | Deliverable |
|---|---|
| O‑1 – Instrumentation Redirection – Modify Clang’s MCDC passes (or add a new LLVM pass) to replace the default __llvm_profile_* calls with CoreSight trace writes. | Patched Clang source + LLVM pass plugin. |
| O‑2 – Runtime Support Layer – Implement a tiny, header‑only CoreSight shim (mcdc_coreSight.h) that abstracts the low‑level registers (ITM/CTM) and provides macros used by the instrumented code. | Open‑source C/C++ library, unit‑tested on QEMU and real hardware. |
| O‑3 – Host‑Side Collector – Develop a Python/C++ utility (mcdc_collect) that reads the CoreSight stream via SWO, ETB, or a debugger API (e.g., pyOCD) and reconstructs the MCDC matrix. | CLI tool with documentation and test suite. |
| O‑4 – Performance Evaluation – Benchmark against GCOV on a set of open‑source safety‑critical kernels (e.g., AUTOSAR demo, DO‑178C flight‑control loop, Mini‑CANstack). | Empirical data set, statistical analysis. |
| O‑5 – Verification & Regression Suite – Produce a set of regression tests that compare the CoreSight‑derived coverage with the legacy GCOV output for identical inputs. | CI‑compatible test harness. |
| O‑6 – Documentation & Guidance – Write a Developer’s Guide describing the toolchain, required board configuration, and best‑practice recommendations for certification‑oriented projects. | PDF manual, sample project repository. |
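
As an illustration of what reconstructing the MCDC matrix in O‑3 could involve, the sketch below groups already‑decoded events into one test vector per decision evaluation. The event tuples (`"cond"`, `"dec"`), the `num_conditions` table, and the function name are hypothetical; the actual event format is only defined in the design phase. Conditions skipped by short‑circuit evaluation are recorded as `None`, and nested or re‑entrant decisions are deliberately not handled.

```python
from collections import defaultdict


def rebuild_vectors(events, num_conditions):
    """Group condition events into (vector, outcome) pairs per decision.

    events: iterable of ("cond", decision_id, condition_index, outcome)
            or ("dec", decision_id, outcome) tuples (assumed format).
    num_conditions: maps decision_id to its number of atomic conditions.
    """
    pending = {}                   # decision_id -> vector under construction
    observed = defaultdict(set)    # decision_id -> {(vector, outcome), ...}
    for event in events:
        kind, decision_id = event[0], event[1]
        if kind == "cond":
            _, _, index, outcome = event
            vector = pending.setdefault(
                decision_id, [None] * num_conditions[decision_id])
            vector[index] = outcome
        elif kind == "dec":
            # A decision event closes the vector collected so far.
            vector = pending.pop(
                decision_id, [None] * num_conditions[decision_id])
            observed[decision_id].add((tuple(vector), event[2]))
    return dict(observed)


# Two evaluations of a hypothetical decision 7: a and (b or c)
example_events = [
    ("cond", 7, 0, True), ("cond", 7, 1, True), ("dec", 7, True),
    ("cond", 7, 0, False), ("dec", 7, False),
]
print(rebuild_vectors(example_events, {7: 3}))
```

The resulting set of executed vectors per decision is the input from which independence pairs, and thus the MCDC verdict per condition, can be derived.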

Out‑of‑Scope

  • Development of a full CoreSight hardware emulator (we will reuse existing QEMU and commercial debuggers). 
  • Extending the approach to non‑ARM architectures (though the methodology could be adapted).

5. Methodology

5.1. Literature & Tool Survey (Weeks 1‑2)

  • Study Clang’s MCDC implementation (source files MCDCCoverage.cpp, CoverageInstrumentation.cpp).
  • Review ARM CoreSight technical reference manuals (ITM, ETM, CTI, TPIU, ETB). 
  • Examine existing open‑source CoreSight tooling (e.g., pyOCD, OpenOCD, the ITM functions in CMSIS‑Core).

5.2. Design Phase (Weeks 3‑4)

  • Mapping Table: For each instrumentation point (condition entry, decision evaluation, condition toggle) define a unique event ID and payload layout (e.g., 32‑bit word: [event‑type | condition‑id | outcome]). 
  • Instrumentation Hook: Decide whether to replace the default profiling calls (__llvm_profile_*) or to inject an additional pass that appends a CoreSight call after each existing instrumentation.
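
One possible payload layout for the mapping table can be sketched as follows. The field widths (4‑bit event type, 27‑bit identifier, 1‑bit outcome), the numeric event‑type values, and the function names are assumptions chosen for illustration; the design phase may settle on a different split.

```python
from enum import IntEnum


class EventType(IntEnum):
    # Hypothetical numbering of the three instrumentation point kinds.
    CONDITION_ENTRY = 1
    DECISION_EVALUATION = 2
    CONDITION_TOGGLE = 3


ID_BITS = 27                      # assumed width of the condition/decision id
ID_MASK = (1 << ID_BITS) - 1
TYPE_SHIFT = ID_BITS + 1          # event type occupies the top 4 bits


def pack_event(event_type, ident, outcome):
    """Pack one event into a 32-bit word: [type:4 | id:27 | outcome:1]."""
    if not 0 <= ident <= ID_MASK:
        raise ValueError("identifier does not fit into 27 bits")
    return (int(event_type) << TYPE_SHIFT) | (ident << 1) | int(bool(outcome))


def unpack_event(word):
    """Inverse of pack_event; returns (event_type, ident, outcome)."""
    event_type = EventType((word >> TYPE_SHIFT) & 0xF)
    return event_type, (word >> 1) & ID_MASK, bool(word & 1)


word = pack_event(EventType.CONDITION_ENTRY, 5, True)
print(hex(word))
```

Keeping the whole event in one 32‑bit word means a single stimulus‑port write per instrumentation point, at the cost of limiting the identifier space to 2^27 entries under this assumed layout.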

5.3. Implementation

| Phase | Activities | Expected Outcome |
|---|---|---|
| P1 – Clang Modification (Weeks 5‑7) | Fork LLVM, add an optional -fcoverage-mcdc-coresight flag, implement a new MCDCCoreSightPass that emits llvm::CallInst to __mcdc_coresight_emit(uint32_t id, uint8_t value). | Patched Clang that can be built with the standard LLVM build system. |
| P2 – Runtime Shim (Weeks 8‑10) | Implement __mcdc_coresight_emit in a header‑only library: on Cortex‑M use ITM_SendChar or direct word writes to an ITM stimulus port; on Cortex‑A use memory‑mapped CTI registers or ETM software‑triggered packets. Provide a fallback stub for non‑CoreSight targets. | Portable shim that compiles with -mcpu=cortex-m4 etc. |
| P3 – Host Collector (Weeks 11‑13) | Use pyOCD or OpenOCD to capture the SWO stream, parse the binary protocol, and rebuild the MCDC matrix (condition‑decision table). Export to gcov‑compatible .gcda files or a custom JSON format. | mcdc_collect CLI tool that produces human‑readable reports and can be fed to existing gcovr/lcov pipelines. |
| P4 – Integration & CI (Weeks 14‑15) | Create a Docker image containing the patched Clang, shim library, and collector; build CI jobs that compile a test suite, run on QEMU with CoreSight emulation, and verify coverage equality with GCOV. | Fully reproducible build pipeline. |
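
For the parsing step in P3, the following is a simplified sketch of how a host collector might extract software (stimulus‑port) packets from a raw SWO byte stream. It follows the ITM packet framing of the ARMv7‑M architecture as far as needed here: the two low header bits encode the payload size, bit 2 distinguishes hardware from software sources, and the upper five bits carry the stimulus port. Synchronisation, overflow, timestamp, and hardware‑source packets are skipped rather than interpreted, so this is not a complete decoder, and the function name is an assumption.

```python
def decode_itm(stream):
    """Return (port, value) for each software-source packet in raw ITM bytes."""
    events = []
    i, n = 0, len(stream)
    zero_run = 0
    while i < n:
        header = stream[i]
        i += 1
        if header == 0x00:
            zero_run += 1          # part of a synchronisation packet
            continue
        sync_end = header == 0x80 and zero_run >= 5
        zero_run = 0
        if sync_end:
            continue               # terminating byte of the sync packet
        size_code = header & 0x03
        if size_code == 0:
            # Protocol packet (overflow, timestamp, extension): skip payload
            # bytes, which carry a continuation flag in bit 7.
            if header & 0x80:
                while i < n and stream[i] & 0x80:
                    i += 1
                i += 1             # last payload byte has bit 7 cleared
            continue
        size = (1, 2, 4)[size_code - 1]
        if i + size > n:
            break                  # truncated packet at end of capture
        payload = stream[i:i + size]
        i += size
        if header & 0x04:
            continue               # hardware-source packet (e.g., DWT)
        events.append((header >> 3, int.from_bytes(payload, "little")))
    return events


# Sync packet, one 32-bit word on port 1, an overflow byte, a timestamp,
# and one byte on port 0.
capture = (bytes(5) + b"\x80" + b"\x0b" + (0x1000000B).to_bytes(4, "little")
           + b"\x70" + b"\xc0\x85\x01" + b"\x01\x41")
print(decode_itm(capture))
```

A production collector would additionally count overflow packets, since each one signals that events were dropped and the reconstructed coverage may be incomplete.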

5.4. Evaluation

| Metric | Method |
|---|---|
| Runtime Overhead | Measure execution time of each benchmark with (a) no instrumentation, (b) GCOV instrumentation, (c) CoreSight instrumentation. Use hardware cycle counters (DWT_CYCCNT). |
| Code‑size Overhead | Compare ELF size (text + data) across the three variants. |
| Trace Bandwidth | Record the SWO data rate (bytes/s) needed for a typical test run; verify it stays under typical UART‑SWO limits (e.g., 2 Mbit/s). |
| Correctness | For each benchmark, compute the symmetric difference between the MCDC coverage matrix derived from GCOV and from CoreSight. |
| Usability | Conduct a short user study (≈ 5 embedded engineers) to rate the ease of setup, readability of reports, and perceived usefulness. |
| Certification Impact | Produce a gap analysis showing how the generated trace can satisfy traceability requirements of DO‑178C/ISO 26262. |
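
The correctness metric can be sketched in a few lines. Here a coverage matrix is represented, as an assumption, by the set of (decision id, condition index) entries whose independent effect was demonstrated; the function name and the sample data are invented for illustration.

```python
def coverage_mismatch(reference, candidate):
    """Split the symmetric difference of two coverage sets by origin."""
    reference, candidate = set(reference), set(candidate)
    return {
        "only_reference": reference - candidate,
        "only_candidate": candidate - reference,
    }


# Invented example: the CoreSight-derived result misses condition 2 of decision 7.
gcov_covered = {(7, 0), (7, 1), (7, 2), (9, 0)}
coresight_covered = {(7, 0), (7, 1), (9, 0)}

mismatch = coverage_mismatch(gcov_covered, coresight_covered)
print(mismatch)
```

Both result sets being empty corresponds to an empty symmetric difference, which is the pass condition for a benchmark.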

Statistical analysis (paired t‑test or Wilcoxon) will be applied to the timing and size measurements.
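
As a minimal sketch of the paired comparison, the test statistic of a paired t‑test can be computed with the Python standard library alone. The cycle counts below are invented placeholders, not measurements, and the function name is an assumption; a full analysis would also report a p‑value and check the normality assumption before preferring the t‑test over Wilcoxon.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Placeholder cycle counts per benchmark (invented for illustration only).
gcov_cycles = [1200, 1350, 1100, 1500, 1280]
coresight_cycles = [1100, 1300, 1050, 1380, 1200]

t, df = paired_t_statistic(gcov_cycles, coresight_cycles)
print(round(t, 3), df)
```

The statistic is then compared against the critical value of the t distribution for the given degrees of freedom (2.776 for 4 degrees of freedom at a two‑sided 5 % level).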

5.5. Dissemination

  • Open‑Source Release – All patches, libraries and tools under the MIT licence on GitHub. 
  • Technical Report – A 12‑page white‑paper summarising the design decisions, performance numbers, and integration guide. 
  • Conference Submission – Target venues such as ICSE, ISISA, ESEC/FSE, or the International Workshop on Model‑Based Testing (MBT).

6. Expected Contributions

| Academic | Practical |
|---|---|
| C1 – A systematic method for redirecting compiler‑generated coverage data to a hardware trace infrastructure (CoreSight). | P1 – A ready‑to‑use Clang‑CoreSight toolchain that eliminates filesystem‑based coverage collection on ARM embedded targets. |
| C2 – Empirical evidence on performance, code‑size, and bandwidth trade‑offs of hardware‑backed MCDC vs. software‑only coverage. | P2 – Host‑side collector that produces standard .gcda‑compatible files, enabling existing coverage analysis pipelines to be reused. |
| C3 – Insight into certification‑friendly instrumentation: deterministic timing, traceability, and auditability. | P3 – Documentation and sample projects for DO‑178C / ISO 26262 development teams, shortening the onboarding curve. |
| C4 – Open‑source artifacts (LLVM patch, runtime shim, collector) that can be extended to other coverage criteria (e.g., branch or condition coverage) or other hardware trace engines (e.g., Intel PT, RISC‑V trace). | P4 – A CI‑ready Docker image that can be integrated into existing build pipelines for safety‑critical firmware. |

7. Work Plan & Timeline (38 weeks, ≈ 9 months)

| Week | Activity |
|---|---|
| 1‑2 | Literature survey, tool‑chain setup, board selection (Cortex‑M4 + SWO, optional Cortex‑A). |
| 3‑4 | Design mapping table; draft specification of CoreSight event format. |
| 5‑7 | Implement LLVM pass (-fcoverage-mcdc-coresight), compile patched Clang. |
| 8‑10 | Develop CoreSight shim library, test on QEMU and on real hardware (SWO). |
| 11‑13 | Build host collector (pyOCD/OpenOCD based), parse events, reconstruct coverage matrix. |
| 14‑15 | Create Docker/CI environment, integrate all components, run regression suite. |
| 16‑18 | Benchmark suite (runtime, size, bandwidth) on 4–5 representative applications. |
| 19‑20 | Statistical analysis, generate plots, write performance chapter. |
| 21‑22 | Conduct user study / interviews with embedded engineers, collect SUS scores. |
| 23‑24 | Write certification‑impact analysis (traceability, auditability). |
| 25‑27 | Draft thesis chapters (introduction, background, methodology, results). |
| 28‑29 | Polish documentation, prepare open‑source release, write user guide. |
| 30‑32 | Submit conference abstract / paper; incorporate reviewer feedback. |
| 33‑35 | Final thesis write‑up, include all appendices (code listings, test data). |
| 36‑38 | Proof‑reading, supervisor review, final submission, defence preparation. |

8. Resources Required

| Resource | Description |
|---|---|
| ARM Development Board | Cortex‑M4 or Cortex‑A board with CoreSight SWO/ETB (e.g., STM32F7 Discovery, NXP i.MX RT1060, or Xilinx Zynq‑7000). |
| Debug Probe | Segger J‑Link, ST‑Link, or open‑source pyOCD‑compatible probe for SWO capture. |
| Host PC | Linux workstation, 16 GB RAM, recent GCC/Clang. |
| Software | LLVM/Clang source (≥ 18), QEMU with CoreSight model, pyOCD/OpenOCD, Python 3.11, CMake, Docker. |
| Time | ~ 38 weeks (≈ 780 h in total, i.e., roughly 20 h per week). |
| Optional | Access to a safety‑critical code base (e.g., AUTOSAR demo) for realistic benchmarks. |

9. Risks & Mitigation

| Risk | Impact | Mitigation |
|---|---|---|
| R1 – CoreSight configuration complexity (different boards expose different trace ports). | Delay in obtaining reliable event stream. | Start with a board that has a well‑documented SWO (e.g., STM32). Build an abstraction layer that can be swapped for other ports. |
| R2 – LLVM build failures (patching a large code base). | Stalls development. | Use the LLVM monorepo Docker image for reproducible builds; keep changes isolated in a separate pass plugin rather than modifying core files when possible. |
| R3 – Event‑rate overflow on high‑frequency loops (too many condition evaluations). | Lost coverage data. | Implement sampling mode (emit only every N‑th evaluation) and evaluate its impact on coverage completeness. |
| R4 – Certification acceptance – regulators may question non‑standard coverage collection. | Reduced practical impact. | Produce a mapping document that shows a one‑to‑one correspondence between CoreSight events and the standard MCDC definition; include traceability artifacts. |
| R5 – Limited participant availability for user study. | Inconclusive usability results. | Use an online questionnaire with representative screenshots; complement with expert interviews. |
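
For risk R3, a back‑of‑the‑envelope estimate shows how the SWO link budget translates into a sampling factor N. The sketch assumes UART (NRZ) framing with 10 wire bits per byte, one header byte plus a 4‑byte payload per event, and no timestamp or synchronisation traffic; the function names and the example event rate are illustrative assumptions.

```python
from math import ceil


def max_events_per_second(swo_bits_per_second, payload_bytes=4):
    """Upper bound on events/s that fit on the SWO link."""
    # One ITM header byte plus payload, each byte framed as 10 wire bits
    # (start bit + 8 data bits + stop bit) in UART mode.
    bits_per_event = (1 + payload_bytes) * 10
    return swo_bits_per_second // bits_per_event


def required_sampling_factor(event_rate, swo_bits_per_second, payload_bytes=4):
    """Smallest N such that emitting every N-th event fits the link."""
    capacity = max_events_per_second(swo_bits_per_second, payload_bytes)
    return max(1, ceil(event_rate / capacity))


capacity = max_events_per_second(2_000_000)
factor = required_sampling_factor(100_000, 2_000_000)
print(capacity, factor)
```

At 2 Mbit/s this bound is 40,000 events per second, so a loop producing 100,000 events per second would need N = 3 under these assumptions. Because sampling drops condition outcomes, its effect on coverage completeness still has to be evaluated as stated in the mitigation.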

10. Evaluation Criteria (for the thesis grading)

  1. Technical Correctness – The modified Clang generates correct CoreSight events; the collector reproduces the same MCDC matrix as GCOV. 
  2. Performance Analysis – Quantitative, statistically sound comparison of overheads. 
  3. Toolchain Completeness – End‑to‑end workflow from source to coverage report, documented and reproducible. 
  4. Scientific Rigor – Clear problem formulation, research questions, and discussion of results. 
  5. Contribution to Practice – Usability guide, open‑source release, and relevance to safety‑critical development.

11. Conclusion

This thesis bridges two powerful technologies – Clang’s modern MCDC instrumentation and ARM CoreSight hardware tracing – to produce a low‑overhead, real‑time coverage collection framework tailored for safety‑critical embedded software. 

The student will gain deep expertise in: 

  • Compiler internals (LLVM/Clang passes). 
  • Embedded hardware debug infrastructure (CoreSight). 
  • Low‑level C/C++ instrumentation and real‑time trace handling. 
  • Empirical performance evaluation and statistical analysis.

The outcome is a practical, open‑source artifact that can be immediately adopted by companies developing DO‑178C/ISO 26262 compliant firmware, and a research contribution that can be presented at major software‑engineering or embedded‑systems conferences.
