The science behind RehabVision
An explainable AI platform for pervasive, personalised knee rehabilitation with real-time biofeedback from consumer video — grounded in peer-reviewed research by Martin Štufi, Ph.D. (Solutia) and Boris Bačić, Ph.D. (AUT).
Research team
Led by two principal investigators across Europe and New Zealand
Martin Štufi, Ph.D.
Principal Investigator · Solutia s.r.o.
Prague, Czech Republic
Project coordinator for RehabVision under the TWIST programme. Leads platform engineering, clinical integration, and deployment strategy. Responsible for the ICIST 2026 paper, the system architecture, and the end-to-end validation pipeline.
Prof. Boris Bačić, Ph.D.
Research Partner · Auckland University of Technology
Auckland, New Zealand
Key research partner and co-author. Leads the computer-vision and AI methodology at AUT, including the reference pose-estimation pipeline described in arXiv 2412.20733. Brings deep expertise in markerless motion analysis for sport and rehabilitation.
Research questions
Four questions spanning validation and clinical applicability
Kinematic accuracy
What measurement accuracy (MAE, RMSE, Pearson correlation) can the monocular computer-vision pipeline achieve for joint flexion/extension angles, compared with a goniometric reference, across front-view and side-view configurations?
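The three accuracy measures named in this question can be computed from paired angle readings. The following is a minimal sketch, assuming each video-derived angle has already been time-aligned with a goniometer reading; the function name `accuracy_metrics` and the sample values are illustrative and are not study data.

```python
import math

def accuracy_metrics(estimated, reference):
    """Return (MAE, RMSE, Pearson r) for two paired angle series in degrees."""
    if len(estimated) != len(reference) or len(estimated) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    n = len(estimated)
    errors = [e - r for e, r in zip(estimated, reference)]
    mae = sum(abs(d) for d in errors) / n
    rmse = math.sqrt(sum(d * d for d in errors) / n)

    # Pearson correlation from centred sums of products.
    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimated, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_e == 0 or var_r == 0:
        raise ValueError("Pearson r is undefined for a constant series")
    return mae, rmse, cov / math.sqrt(var_e * var_r)

# Made-up values for illustration only (degrees).
video_angles = [10.0, 45.0, 92.0, 118.0]
goniometer_angles = [12.0, 44.0, 90.0, 120.0]
mae, rmse, r = accuracy_metrics(video_angles, goniometer_angles)
```

MAE and RMSE describe the size of the error in degrees, while Pearson correlation only describes how well the two series move together, so a pipeline with a constant offset can score a high correlation and still have a large MAE.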
Real-time performance
What end-to-end latency and throughput (FPS) are achievable on consumer hardware to support real-time biofeedback within clinically acceptable thresholds (< 100 ms)?
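One way to check a per-frame processing budget is to time a processing callable over a batch of frames. This is a sketch under stated assumptions: `measure_latency` and `process_frame` are hypothetical names, the 95th percentile is an illustrative choice of summary statistic, and the timing covers processing only, not camera capture or display, so it is a lower bound on end-to-end latency.

```python
import math
import time

def measure_latency(process_frame, frames, budget_ms=100.0):
    """Time process_frame on each frame and summarise latency and throughput."""
    latencies_ms = []
    for frame in frames:
        start = time.perf_counter()
        process_frame(frame)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    if not latencies_ms:
        raise ValueError("no frames were processed")

    ordered = sorted(latencies_ms)
    # Nearest-rank 95th percentile.
    p95 = ordered[math.ceil(0.95 * len(ordered)) - 1]
    total_s = sum(latencies_ms) / 1000.0
    return {
        "frames": len(latencies_ms),
        "mean_ms": sum(latencies_ms) / len(latencies_ms),
        "p95_ms": p95,
        "fps": len(latencies_ms) / total_s if total_s > 0 else float("inf"),
        "within_budget": p95 < budget_ms,
    }

# Stand-in workload; a real run would pass the pose-estimation step instead.
stats = measure_latency(lambda frame: sum(range(1000)), range(30))
```

Reporting a high percentile alongside the mean matters for biofeedback, because occasional slow frames are what a user perceives as lag.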
Movement quality detection
How reliably can confidence-aware, view-specific rules detect clinically relevant issues (insufficient range of motion, ROM < 60°; valgus deviation > 15°; irregular cadence; knee-past-toe violations) compared with physiotherapist annotations?
Environmental robustness
Under home-use conditions (lighting 50–1000 lux, distance 1–3 m, partial occlusions), which factors most significantly affect tracking reliability?
Preliminary results
Validated across five subjects, 150 repetitions
Platform details
Platform architecture
RehabVision implements a container-based microservices architecture comprising five components: (1) Computer Vision Service processing monocular video using MediaPipe Pose or Google ML Kit; (2) AI Motion Analysis Service applying interpretable rule-based detectors; (3) Clinical Backend API orchestrating workflows; (4) Web/Mobile Frontend rendering real-time biofeedback; and (5) AR/VR Runtime for future immersive guidance.
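The page does not state how the Computer Vision Service turns pose landmarks into a knee angle. A common approach, sketched here as an assumption, is to take the included angle at the knee between the knee-to-hip and knee-to-ankle vectors. The function name `joint_angle_deg` and the 2D coordinates are illustrative.

```python
import math

def joint_angle_deg(a, b, c):
    """Included angle at point b, in degrees, for the 2D points a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("landmarks must not coincide with the joint")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))

# Hypothetical image-plane coordinates: hip, knee, ankle.
straight_leg = joint_angle_deg((0.0, 0.0), (0.0, 1.0), (0.0, 2.0))
bent_leg = joint_angle_deg((0.0, 0.0), (0.0, 1.0), (1.0, 1.0))
```

With this convention a straight leg reads 180° and the value falls as the knee bends; a flexion angle measured from full extension would be 180° minus this value. Which convention the platform uses is not specified in the text.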
Privacy by design
Raw video never leaves the patient's device without explicit informed consent. Processing produces a stick-figure overlay with no faces or identifiable features. Session output is a CSV time series (schema v0.1) plus the anonymised overlay MP4 — no PII in file names, metadata, or telemetry.
What we measure
Per session: range of motion (ROM = maxAngle − minAngle), repetition count, mean and maximum knee angle, time in target band, frame-level exercise correctness (F1 score > 0.83), and a confidence continuity check. Mean absolute error for knee angles is below 5°, with a latency of 12–18 ms per frame.
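Several of these per-session measures follow directly from the knee-angle time series. The sketch below assumes a list of per-frame angles in degrees and a known frame rate. The target band, the repetition thresholds, and the hysteresis-based counting method are illustrative assumptions, not the platform's documented algorithm.

```python
def session_metrics(angles, fps, band=(60.0, 90.0)):
    """Summarise a knee-angle series; band is a hypothetical target range."""
    if not angles or fps <= 0:
        raise ValueError("need at least one angle and a positive frame rate")
    low, high = band
    frames_in_band = sum(1 for a in angles if low <= a <= high)
    return {
        "rom": max(angles) - min(angles),  # ROM = maxAngle - minAngle
        "mean_angle": sum(angles) / len(angles),
        "max_angle": max(angles),
        "time_in_band_s": frames_in_band / fps,
    }

def count_repetitions(angles, high=70.0, low=30.0):
    """Count one repetition per rise above `high` followed by a fall below `low`."""
    reps = 0
    flexed = False
    for a in angles:
        if not flexed and a >= high:
            flexed = True
        elif flexed and a <= low:
            flexed = False
            reps += 1
    return reps

# Made-up series: two bend-and-return cycles sampled at 10 frames per second.
angles = [10, 40, 80, 95, 80, 40, 10, 40, 80, 95, 80, 40, 10]
metrics = session_metrics(angles, fps=10)
reps = count_repetitions(angles)
```

Using two thresholds rather than one keeps small jitter around a single threshold from being counted as extra repetitions.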
Current scope — knee PoC
Phase 1 (F1) covers a knee-focused proof of concept with four rule-based detectors: Insufficient ROM (< 60°), Irregular Cadence (coefficient of variation, CV > 0.4), Valgus Deviation (> 15°), and Knee-Past-Toe Guard. Extension to hip, ankle, and shoulder joints is planned for Phase 2.
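The four detectors can be pictured as simple threshold checks over per-session measurements. This is a minimal sketch rather than the platform's implementation. The thresholds come from the text above, while the function names, the input fields, the use of per-repetition durations with a population standard deviation for the cadence CV, and the knee-versus-toe position comparison are all assumptions. Confidence gating and view-specific handling are omitted.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed definition)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean

def detect_issues(rom_deg, rep_durations_s, valgus_deg, knee_x, toe_x):
    """Return the names of the rules that fire for one session."""
    issues = []
    if rom_deg < 60.0:
        issues.append("insufficient_rom")
    if len(rep_durations_s) >= 2 and coefficient_of_variation(rep_durations_s) > 0.4:
        issues.append("irregular_cadence")
    if valgus_deg > 15.0:
        issues.append("valgus_deviation")
    # Assumes a side view with x increasing in the direction the patient faces.
    if knee_x > toe_x:
        issues.append("knee_past_toe")
    return issues

# Made-up inputs for illustration only.
shallow = detect_issues(55.0, [2.0, 2.1, 1.9], 5.0, knee_x=0.40, toe_x=0.50)
uneven = detect_issues(90.0, [1.0, 3.0, 1.0, 3.0], 20.0, knee_x=0.60, toe_x=0.50)
```

Because each rule is an explicit comparison against a named threshold, a clinician can trace any flagged issue back to the measurement that triggered it, which is the practical sense in which rule-based detectors are interpretable.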