Technical Whitepaper

Deep-Check: Continuous Identity Verification and Document Forensics for the Synthetic Era

Deep-Check Research Team · February 2026

Abstract

We present Deep-Check, a multi-modal continuous identity verification platform that combines keystroke dynamics biometrics, facial liveness detection, and document forensics to detect fraud in remote sessions and document submission workflows. The system operates entirely client-side for biometric signal extraction, transmitting only derived feature vectors for server-side ML inference. We describe the architecture of three analytical modules — behavioral biometrics, anti-deepfake liveness, and image forensics — and report performance characteristics under controlled evaluation conditions. The platform is designed to comply with GDPR, the EU AI Act, and best practices for privacy-by-design in biometric systems.

1. Introduction

The proliferation of generative AI tools (large language models, image synthesis, voice cloning, and deepfake video generation) has fundamentally altered the threat landscape for remote identity verification. A remote candidate can now pass a technical interview using LLM-generated code, present a synthetic face via a virtual camera, and submit AI-generated supporting documents — all while appearing entirely legitimate to a human reviewer.

Existing point-in-time identity verification solutions (document scanning at login, facial recognition at session start) are insufficient because they verify identity once and assume continuity. Deep-Check addresses this by continuously verifying behavioural consistency throughout a session, not just at its inception.

This whitepaper describes the technical design of three integrated modules:

  • Module 1 — Keystroke Biometrics: Continuous typing pattern analysis for identity continuity and AI-assisted input detection
  • Module 2 — Facial Liveness: Real-time detection of deepfakes, virtual cameras, and pre-recorded video spoofs
  • Module 3 — Document Forensics: Analysis of submitted images for manipulation, AI generation, and metadata falsification

2. Module 1 — Keystroke Biometrics

2.1 Signal Capture

Keystroke dynamics are captured at the DOM event level via keydown and keyup listeners on a Monaco editor instance. Two primary timing signals are extracted:

  • Flight time (inter-key interval): Time in milliseconds between keyup of key N and keydown of key N+1. Human neuromotor minimum is approximately 15ms; values below this threshold indicate synthetic input.
  • Hold time (key duration): Time between keydown and keyup for a single key. Typically 40–120ms in natural typing.

In addition to single-key timing, bigram timing (digrams) is collected: the flight time for specific key-pair transitions (e.g., “t→h”, “i→o”). These transition times are highly stable within an individual and vary significantly across individuals, making them useful for identity matching beyond aggregate statistics.
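The timing extraction described above can be sketched as follows. This is an illustrative reconstruction, not Deep-Check's actual implementation: the event shape and helper names are assumptions.

```typescript
// Illustrative sketch: deriving hold times, flight times, and bigram
// (digraph) timings from an ordered stream of keydown/keyup events.
interface KeyEvent {
  key: string;
  type: "keydown" | "keyup";
  t: number; // timestamp in milliseconds
}

interface TimingSample {
  holdMs: number[];               // keydown → keyup per key
  flightMs: number[];             // keyup(N) → keydown(N+1)
  bigrams: Map<string, number[]>; // e.g. "t→h" → observed flight times
}

function extractTimings(events: KeyEvent[]): TimingSample {
  const out: TimingSample = { holdMs: [], flightMs: [], bigrams: new Map() };
  const downAt = new Map<string, number>();
  let lastUp: { key: string; t: number } | null = null;

  for (const ev of events) {
    if (ev.type === "keydown") {
      downAt.set(ev.key, ev.t);
      if (lastUp !== null) {
        const flight = ev.t - lastUp.t; // inter-key interval
        out.flightMs.push(flight);
        const pair = `${lastUp.key}→${ev.key}`;
        if (!out.bigrams.has(pair)) out.bigrams.set(pair, []);
        out.bigrams.get(pair)!.push(flight);
      }
    } else {
      const d = downAt.get(ev.key);
      if (d !== undefined) out.holdMs.push(ev.t - d); // key duration
      downAt.delete(ev.key);
      lastUp = { key: ev.key, t: ev.t };
    }
  }
  return out;
}
```

A sub-15ms flight time in `flightMs` would then trigger the synthetic-input check described above.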

2.2 Feature Extraction (high-dimensional vector)

A proprietary multi-dimensional feature vector is computed from a rolling window of keystroke events and submitted to the ML inference endpoint. The vector spans eight families of biometric signals:

  • Temporal dynamics · Flight & hold statistics: Mean, standard deviation, skewness, and kurtosis of inter-key intervals and key-hold durations. Captures the stochastic variability unique to human motor execution.
  • Entropic structure · Shannon entropy (multi-channel): Information-theoretic measure of distributional regularity applied independently to flight and hold histograms. Synthetic input exhibits characteristically low entropy.
  • Rhythmic periodicity · Spectral analysis (FFT): Dominant-frequency amplitude computed via Fast Fourier Transform over the keystroke time series. Automated tools produce detectable periodic patterns absent in human typing.
  • Temporal evolution · Velocity & fatigue signals: Linear trend and regression slope of typing speed over the session. Human typists exhibit measurable fatigue drift; programmatic input does not.
  • Micro-correction behaviour · Correction keystroke analysis: Statistical properties of correction keystrokes (timing, frequency, reaction latency) that reflect genuine cognitive load and error-correction cycles.
  • Bigram biometrics · Digraph pair consistency: Pair-wise inter-key interval variability across all observed character combinations. Each person exhibits a stable, unique bigram profile that is computationally expensive to replicate.
  • Burst injection detection · Sub-100ms key cluster rate: Rate of implausibly fast multi-key clusters per session volume. Paste injection, clipboard automation, and LLM-assisted input produce anomalous burst patterns.
  • Session throughput · Effective typing velocity: Derived words-per-minute with outlier sensitivity for both extremes of the human plausible range.
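As an illustration of the entropic-structure family, Shannon entropy over a flight-time histogram can be computed as below. The bin width is an assumption for illustration; the actual binning and weighting are proprietary.

```typescript
// Shannon entropy (in bits) of a flight-time histogram. Low entropy
// indicates suspiciously regular inter-key intervals, a hallmark of
// synthetic input. The 10ms bin width is illustrative only.
function shannonEntropy(values: number[], binWidthMs = 10): number {
  if (values.length === 0) return 0;
  const counts = new Map<number, number>();
  for (const v of values) {
    const bin = Math.floor(v / binWidthMs);
    counts.set(bin, (counts.get(bin) ?? 0) + 1);
  }
  let h = 0;
  for (const c of counts.values()) {
    const p = c / values.length;
    h -= p * Math.log2(p);
  }
  return h;
}
```

Perfectly periodic input (all intervals in one bin) yields zero entropy, while natural typing spreads mass across many bins.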

Exact feature definitions, internal identifiers, and weighting coefficients are proprietary and withheld to prevent adversarial calibration. The full specification is available to authorised partners under NDA.

2.3 ML Model

The classification layer uses a dual-model ensemble architecture combining a supervised gradient-boosted classifier with an unsupervised anomaly detection layer trained exclusively on genuine human sessions. The ensemble design requires an adversary to simultaneously fool two independent statistical models — one optimised for class separation, one for novelty detection — substantially raising the cost of evasion attacks compared to single-model systems.

Models are exported to ONNX format for server-side inference using onnxruntime-node, ensuring deterministic, version-controlled inference independent of client device capabilities. Model artifacts are stored outside the public HTTP path and are not directly accessible to clients.

Internal architecture details (tree count, feature weights, decision thresholds, and training data distributions) are withheld to prevent adversarial calibration.

2.4 Adaptive Baseline (Identity Matching)

When an enrollment profile exists for the session candidate, a Mahalanobis distance comparison is performed between the live session's feature distribution and the enrollment baseline. The identity match score is computed as:

match_score = 100 × exp(−λ × √Σ((xᵢ − μᵢ)² / σᵢ²))

The decay constant λ is calibrated from enrollment validation data and withheld.
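The match-score formula above translates directly to code. Since the real λ is withheld, the value used here is purely illustrative.

```typescript
// Identity match score per Section 2.4: a variance-normalised (diagonal)
// Mahalanobis distance mapped to [0, 100] via exponential decay.
// lambda = 0.5 is an illustrative placeholder, not the calibrated value.
function matchScore(
  live: number[],
  mu: number[],
  sigma: number[],
  lambda = 0.5
): number {
  let sum = 0;
  for (let i = 0; i < live.length; i++) {
    const z = (live[i] - mu[i]) / sigma[i];
    sum += z * z;
  }
  const distance = Math.sqrt(sum);
  return 100 * Math.exp(-lambda * distance);
}
```

A session identical to the enrollment baseline scores 100; the score decays smoothly as the live feature distribution drifts away from it.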

The Welford online algorithm is used to update the adaptive baseline during a session, allowing the system to account for fatigue and context-switching without being locked to initial typing conditions.
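Welford's algorithm itself is standard and compact; a minimal sketch of the per-feature update (field names are illustrative):

```typescript
// Welford's online algorithm: update a running mean and variance one
// sample at a time, without retaining the raw keystroke history. This
// is what allows the adaptive baseline to track fatigue drift in-session.
interface RunningStats {
  n: number;    // sample count
  mean: number; // running mean
  m2: number;   // sum of squared deviations from the mean
}

function welfordUpdate(s: RunningStats, x: number): RunningStats {
  const n = s.n + 1;
  const delta = x - s.mean;
  const mean = s.mean + delta / n;
  const m2 = s.m2 + delta * (x - mean);
  return { n, mean, m2 };
}

function sampleVariance(s: RunningStats): number {
  return s.n > 1 ? s.m2 / (s.n - 1) : 0;
}
```

The update is numerically stable and O(1) per keystroke, which matters for a rolling client-side computation.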

3. Module 2 — Facial Liveness Detection

3.1 Architecture

Liveness detection runs entirely client-side using face-api.js with the TinyFaceDetector (a Tiny YOLOv2-derived detector, ~190 KB quantised) and the 68-point FaceLandmark68Net models, both loaded from /public/models/. No video data is transmitted to the server.

3.2 Detection Signals

  • Blink detection · Eye Aspect Ratio (EAR) via landmarks 36–41, 42–47: Blink rate <2/min (photo) or >50/min (artifact). GAN deepfakes often blink at unnatural rates.
  • Micro-saccades · Variance of horizontal gaze ratio across a 60-frame history: Real eyes have micro-jerk movements; deepfake video is unnaturally smooth (score <10 = suspicious).
  • Lighting challenge · Screen flashes white (500ms); EAR measured during flash: Real pupils constrict and gaze changes; pre-recorded video shows no response.
  • Blink edge trajectory · Eyelid closure speed symmetry analysis: AI renderers often show an unnatural snap-close without the natural asymmetric trajectory.
  • Oculo-manual desync · Cross-correlation between cursor movement and gaze direction: Virtual camera: cursor active but gaze frozen on a fixed point.
  • Micro-movements · Nose tip position variance over time: Photo: zero variance. Deepfake: artificially periodic. Human: stochastic.
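The EAR computation referenced above follows the standard formulation (Soukupová & Čech); the six points are the per-eye landmarks of the 68-point scheme. Deep-Check's actual thresholds are not public, so none are asserted here.

```typescript
// Eye Aspect Ratio over one eye's six landmarks p1..p6 (e.g. points
// 36–41 of the 68-point model for the right eye):
//   EAR = (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|)
// EAR is roughly constant for an open eye and drops toward 0 on a blink.
type Pt = { x: number; y: number };

const dist = (a: Pt, b: Pt): number => Math.hypot(a.x - b.x, a.y - b.y);

function eyeAspectRatio(eye: Pt[]): number {
  const vertical = dist(eye[1], eye[5]) + dist(eye[2], eye[4]);
  const horizontal = dist(eye[0], eye[3]);
  return vertical / (2 * horizontal);
}
```

Counting EAR dips below a threshold per minute yields the blink-rate signal in the first row of the table.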

3.3 False Positive Mitigation

Conservative thresholds, multi-frame consensus requirements, and 60-second cooldowns between alerts prevent alert flooding from normal user behaviour (reading pauses, natural gaze variation, corrective blinking). All camera-based alerts require a 7-second startup grace period to account for camera initialisation artefacts.

4. Module 3 — Document Forensics

4.1 Error Level Analysis (ELA)

ELA exploits the lossy compression model of JPEG encoding. When a JPEG image is re-saved at a known quality level (Q=75), regions that have already been compressed at that quality show minimal change, while regions that were edited and re-saved at a different quality show larger discrepancies. This differential is amplified (×12) and rendered as a heatmap.

The ELA score is derived from the mean heatmap brightness (normalised to 255) and the fraction of 8×8 blocks with mean brightness above a 40-point threshold:

ela_score = 0.60 × (mean_diff / 255 × 100) + 0.40 × (suspicious_blocks / total_blocks × 100)

AI-generated images that have never been JPEG-compressed score anomalously low on ELA (absence of compression artefacts is itself a signal). This is captured by the noise module.
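The ELA score aggregation can be sketched as follows. How the per-pixel difference map is produced (canvas re-encode at Q=75, amplification) is omitted; this reconstruction assumes a single-channel brightness difference map.

```typescript
// ELA score per Section 4.1: 60% weight on mean heatmap brightness
// (normalised to 255), 40% on the fraction of 8×8 blocks whose mean
// brightness exceeds the 40-point threshold. `diff` is assumed to be a
// grayscale difference map between the original and the Q=75 re-save.
function elaScore(diff: Uint8ClampedArray, width: number, height: number): number {
  let sum = 0;
  for (let i = 0; i < diff.length; i++) sum += diff[i];
  const meanDiff = sum / diff.length;

  let suspicious = 0;
  let total = 0;
  for (let by = 0; by < height; by += 8) {
    for (let bx = 0; bx < width; bx += 8) {
      let blockSum = 0;
      let n = 0;
      for (let y = by; y < Math.min(by + 8, height); y++) {
        for (let x = bx; x < Math.min(bx + 8, width); x++) {
          blockSum += diff[y * width + x];
          n++;
        }
      }
      total++;
      if (blockSum / n > 40) suspicious++;
    }
  }
  return 0.6 * (meanDiff / 255) * 100 + 0.4 * (suspicious / total) * 100;
}
```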

4.2 EXIF Metadata Analysis

EXIF metadata is extracted using exifr (client-side, no server upload). Anomaly signals include:

  • Presence of known editing software strings (Photoshop, GIMP, Affinity, Canva, Stable Diffusion, Midjourney, DALL-E) in the Software or CreatorTool fields
  • Discrepancy between DateTimeOriginal and DateTime exceeding 60 seconds
  • Absence of Make and Model fields (camera metadata is typically present in genuine device captures)
  • Complete absence of EXIF data (common in screenshots and synthetic images)
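The anomaly signals above can be expressed as checks over a parsed metadata object (such as the one returned by exifr). Field names follow standard EXIF tags; the flag names and any scoring are illustrative, not part of the whitepaper.

```typescript
// Illustrative EXIF anomaly checks mirroring the bullet list above.
interface ExifData {
  Software?: string;
  CreatorTool?: string;
  DateTimeOriginal?: Date;
  DateTime?: Date;
  Make?: string;
  Model?: string;
}

const EDITOR_STRINGS = [
  "photoshop", "gimp", "affinity", "canva",
  "stable diffusion", "midjourney", "dall-e",
];

function exifAnomalies(exif: ExifData | null): string[] {
  // Complete absence of EXIF: common in screenshots and synthetic images
  if (exif === null) return ["no_exif"];
  const flags: string[] = [];

  const sw = `${exif.Software ?? ""} ${exif.CreatorTool ?? ""}`.toLowerCase();
  if (EDITOR_STRINGS.some((s) => sw.includes(s))) flags.push("editing_software");

  if (exif.DateTimeOriginal && exif.DateTime) {
    const deltaS = Math.abs(+exif.DateTime - +exif.DateTimeOriginal) / 1000;
    if (deltaS > 60) flags.push("timestamp_mismatch");
  }

  if (!exif.Make || !exif.Model) flags.push("missing_camera_fields");
  return flags;
}
```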

4.3 Noise Uniformity Analysis (AI Image Detection)

Images generated by diffusion models (Stable Diffusion, DALL-E, Midjourney) and GAN architectures exhibit characteristically uniform noise distributions. Real photographs contain heterogeneous noise from sensor shot noise, JPEG quantisation, and scene variation. Deep-Check computes:

  • Laplacian variance — Measures high-frequency content. Low values indicate unnaturally smooth images.
  • Block variance coefficient of variation — Standard deviation of per-16×16-block variance, divided by mean block variance. Real photos: high CV. AI images: low CV (uniform).

noise_score = 0.65 × uniformity_score + 0.35 × laplacian_flag
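The block-variance coefficient of variation can be sketched as below, over a grayscale buffer. The mapping from CV to uniformity_score, and the laplacian_flag computation, are not specified in this document and are omitted.

```typescript
// Coefficient of variation of per-16×16-block variance over a grayscale
// image (row-major). Real photos: high CV (heterogeneous sensor noise).
// AI-generated images: low CV (uniform noise distribution).
function blockVarianceCV(gray: number[], width: number, height: number): number {
  const variances: number[] = [];
  for (let by = 0; by + 16 <= height; by += 16) {
    for (let bx = 0; bx + 16 <= width; bx += 16) {
      let sum = 0;
      let sumSq = 0;
      for (let y = by; y < by + 16; y++) {
        for (let x = bx; x < bx + 16; x++) {
          const v = gray[y * width + x];
          sum += v;
          sumSq += v * v;
        }
      }
      const n = 256;
      const mean = sum / n;
      variances.push(sumSq / n - mean * mean); // per-block variance
    }
  }
  const m = variances.reduce((a, b) => a + b, 0) / variances.length;
  const sd = Math.sqrt(
    variances.reduce((a, v) => a + (v - m) ** 2, 0) / variances.length
  );
  return m > 0 ? sd / m : 0; // low CV → uniform noise → AI-generation signal
}
```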

4.4 Aggregate Risk Score

risk_score = 0.50 × ela_score + 0.30 × exif_score + 0.20 × noise_score

Risk levels: 0–29 = Clean · 30–59 = Suspicious · 60–100 = High Risk
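The aggregation and banding above transcribe directly:

```typescript
// Aggregate document risk per Section 4.4.
function riskScore(ela: number, exif: number, noise: number): number {
  return 0.5 * ela + 0.3 * exif + 0.2 * noise;
}

function riskLevel(score: number): "Clean" | "Suspicious" | "High Risk" {
  if (score < 30) return "Clean";
  if (score < 60) return "Suspicious";
  return "High Risk";
}
```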

5. Privacy Architecture

Deep-Check is built privacy-first:

  • All signal extraction runs client-side in the browser. Raw video, audio, and keystroke sequences never leave the user's device.
  • Only derived numerical feature vectors (18 floats for keystroke, statistical aggregates for enrollment) are transmitted over HTTPS.
  • Document forensic analysis (ELA, EXIF, noise) is fully client-side. Only the final scores and metadata are persisted — not the original image.
  • Enrollment profiles store only mean, standard deviation, and bigram statistics — not reconstructable biometric signals.
  • All biometric profiles expire after 90 days and are hard-deleted from the database.

6. Known Limitations

  • Training data: The keystroke ML model is trained on synthetic data. Performance on real-world diverse user populations has not been formally evaluated. A validation study with a representative sample is planned.
  • Device variation: Keystroke timing is affected by keyboard type (mechanical, membrane, touchscreen). The adaptive baseline partially compensates for this but does not fully normalise cross-device variation.
  • ELA limitations: ELA is ineffective on PNG files (lossless compression) and on images that have been upscaled, screenshotted, or processed through a lossless pipeline before JPEG compression.
  • AI image detection: As generative models evolve, their noise characteristics change. The current noise analysis is based on known model families as of early 2026.
  • No 100% guarantee: No biometric or forensic system achieves perfect accuracy. Deep-Check outputs are probabilistic and should always be combined with human review.

7. Roadmap

  • Independent algorithm audit by external cybersecurity firm · Q3 2026
  • Validation study on real-world diverse keystroke population · Q3 2026
  • DPIA (Data Protection Impact Assessment) completion · Q2 2026
  • EU AI Act technical documentation (Article 11) · Q3 2026
  • ISO 27001 certification process initiation · Q1 2027
  • ENS (Esquema Nacional de Seguridad) certification · Q2 2027
  • Publication of peer-reviewed technical paper · Q4 2026
  • Video deepfake detection via temporal consistency analysis · Q4 2026
  • PDF document forensics (embedded image ELA, font analysis) · Q3 2026

8. Contact & Citations

For technical questions, partnership inquiries, or to request a Data Processing Agreement:

Deep-Check Technical Whitepaper v1.0 · February 2026 · Deep-Check Inc.
This document is provided for informational purposes. Performance metrics are derived from internal evaluation and are subject to revision following independent audit.