GDPR Article 35
Data Protection Impact Assessments
This register contains DPIAs for high-risk processing activities involving AI, automated analysis, and third-party data sharing. Each assessment follows ICO DPIA guidance and EDPB recommendations.
Risk Rating Key
Residual risk = risk level after mitigations are applied. All ratings follow the ICO 5×5 risk matrix methodology.
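As an illustration of how a 5×5 matrix produces inherent and residual ratings, the sketch below multiplies a 1-5 likelihood by a 1-5 impact and maps the result to a band. The band cut-offs and the names `risk_score` and `risk_band` are assumptions made for this example; the register does not state its own thresholds.

```python
# Band cut-offs are illustrative assumptions, not values taken from this register.
BANDS = [(4, "Low"), (9, "Medium"), (16, "High"), (25, "Critical")]


def risk_score(likelihood: int, impact: int) -> int:
    """Multiply likelihood by impact, each rated on a 1-5 scale."""
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return likelihood * impact


def risk_band(score: int) -> str:
    """Map a 1-25 score to the first band whose upper bound covers it."""
    if not 1 <= score <= 25:
        raise ValueError(f"score must be between 1 and 25, got {score}")
    for upper, label in BANDS:
        if score <= upper:
            return label
    raise AssertionError("unreachable: BANDS covers 1-25")


# Inherent risk before mitigations, residual risk after likelihood is reduced.
inherent = risk_score(likelihood=4, impact=4)
residual = risk_score(likelihood=2, impact=4)
print(risk_band(inherent), "->", risk_band(residual))  # High -> Medium
```

In this sketch a mitigation lowers likelihood while impact stays the same, which is the usual way a residual rating ends up below the inherent one.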
AI-Powered Investigative Research Generation
Description of Processing
Use of large language models (LLMs) via Abacus AI to generate research summaries, source analysis, and investigation briefs from user-provided topics and context. This constitutes automated processing that produces content used in editorial decision-making.
Necessity & Proportionality
The processing is necessary to deliver the core service (investigative research acceleration). No less intrusive alternative achieves equivalent utility. Human review is mandated before any generated content is published or acted upon.
Data Processed
- Investigation topic descriptions and context
- Source URLs and reference materials
- User prompts containing investigation details
- Generated research summaries and analysis
Identified Risks & Mitigations
LLM hallucination producing factually incorrect research that, if unchecked, could lead to reputational harm to data subjects.
Mitigations:
- Human-in-the-loop policy: all AI outputs require editorial review (VIQ-POL-051)
- Source attribution and citation requirements in generated content
- Confidence scoring and uncertainty flagging in research outputs
- Responsible AI Policy (VIQ-POL-050) with editorial standards
Prompt injection or data leakage through adversarial inputs causing the model to reveal data from other investigations.
Mitigations:
- Organisation-scoped data isolation at database level
- Prompt sanitisation before LLM submission
- No model fine-tuning on user data — stateless inference only
- Topic-level RBAC prevents cross-investigation access
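The isolation mitigations above can be pictured as a filter applied before any context reaches the model. This is a minimal in-memory sketch: the `User` and `Document` types and the `visible_documents` function are hypothetical, and a real deployment would enforce the same conditions in the database query rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    org_id: str
    topic_ids: frozenset  # topics this user has been granted access to


@dataclass(frozen=True)
class Document:
    doc_id: str
    org_id: str
    topic_id: str


def visible_documents(user: User, documents: list) -> list:
    """Keep only documents in the user's organisation AND in a permitted topic."""
    return [
        doc
        for doc in documents
        if doc.org_id == user.org_id and doc.topic_id in user.topic_ids
    ]


docs = [
    Document("d1", "org-a", "topic-1"),
    Document("d2", "org-a", "topic-2"),  # same organisation, topic not granted
    Document("d3", "org-b", "topic-1"),  # different organisation
]
alice = User("alice", "org-a", frozenset({"topic-1"}))
print([d.doc_id for d in visible_documents(alice, docs)])  # ['d1']
```

Both conditions must hold, so a matching topic identifier in another organisation never exposes a document.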
International data transfer: investigation context sent to US-based LLM infrastructure for inference.
Mitigations:
- Standard Contractual Clauses (SCCs) with Abacus AI
- Data minimisation: only necessary context included in prompts
- No persistent storage of prompts by LLM provider
- Transfer Impact Assessment completed (VIQ-TIA-001)
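Data minimisation for prompts can be approximated with an allow-list, so that only named fields ever leave the platform. The field names and the `minimise_for_prompt` function below are assumptions for illustration and do not describe the platform's actual schema.

```python
# Hypothetical allow-list; any field not named here is never sent for inference.
ALLOWED_FIELDS = ("topic", "context", "source_urls")


def minimise_for_prompt(record: dict) -> dict:
    """Return a copy of the record containing only allow-listed fields."""
    return {key: record[key] for key in ALLOWED_FIELDS if key in record}


record = {
    "topic": "Procurement irregularities",
    "context": "Public tender awarded in 2023",
    "source_urls": ["https://example.org/tender"],
    "journalist_email": "reporter@example.org",  # not needed for inference
    "internal_notes": "Call source on Tuesday",  # not needed for inference
}
payload = minimise_for_prompt(record)
print(sorted(payload))  # ['context', 'source_urls', 'topic']
```

An allow-list fails safe: a field added to the record later is excluded from prompts until someone deliberately adds it.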
Consultation Notes
ICO guidance on AI and data protection consulted. EDPB Guidelines 06/2020 on AI transparency reviewed. Where applicable, processing relies on the journalism exemption in DPA 2018 Schedule 2, Part 5, paragraph 26 (special purposes as defined in s.174).
AI Deepfake Detection & Media Analysis
Description of Processing
Automated analysis of media content (images, video, audio) to detect potential deepfakes, manipulated media, and synthetic content. Results inform editorial decisions about source credibility.
Necessity & Proportionality
Essential for maintaining journalistic integrity in an era of synthetic media. Manual detection is insufficient given the volume and sophistication of manipulated content.
Data Processed
- Media files submitted for analysis
- Analysis results (manipulation confidence scores)
- Metadata extracted from media files
- EXIF data, steganographic signatures
Identified Risks & Mitigations
False positive deepfake detection incorrectly flagging authentic media, potentially discrediting legitimate sources.
Mitigations:
- Results reported as confidence scores with clearly stated thresholds, never as a binary yes/no verdict
- Human review required before any editorial action based on analysis
- Multiple detection methods cross-referenced
- Regular model evaluation and bias testing (VIQ-POL-053)
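One way to combine the first three mitigations is to report a banded score across several detectors and flag disagreement between them, instead of returning a verdict. The sketch below assumes each detector returns a manipulation score between 0 and 1. The detector names, band boundaries, and disagreement threshold are all hypothetical.

```python
from statistics import mean


def summarise_detections(scores: dict, disagreement_threshold: float = 0.3) -> dict:
    """Summarise per-detector manipulation scores without a binary verdict."""
    if not scores:
        raise ValueError("at least one detector score is required")
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name} must be in [0, 1], got {score}")

    average = mean(scores.values())
    spread = max(scores.values()) - min(scores.values())

    # Illustrative band boundaries; real thresholds would come from evaluation.
    if average >= 0.8:
        band = "likely manipulated"
    elif average >= 0.5:
        band = "possibly manipulated"
    elif average >= 0.2:
        band = "inconclusive"
    else:
        band = "no strong signs of manipulation"

    return {
        "mean_score": round(average, 2),
        "band": band,
        "detectors_disagree": spread > disagreement_threshold,
        "requires_human_review": True,  # always true: the score only informs review
    }


summary = summarise_detections({"frequency_analysis": 0.9, "face_warp": 0.5})
print(summary["band"], summary["detectors_disagree"])
```

Surfacing disagreement separately gives the reviewing editor a reason to look closer even when the average falls in a low band.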
Processing of biometric data (facial features) within media files.
Mitigations:
- Biometric data not stored separately — only aggregate analysis scores retained
- Processing limited to editorial assessment purposes
- Journalism exemption (DPA 2018 Schedule 2, Part 5, paragraph 26; special purposes defined in s.174) applies to editorial analysis
Consultation Notes
ICO guidance on biometric data consulted. EDPB Guidelines on facial recognition reviewed. Analysis classified as editorial processing under journalism exemption.
Script Generation & Publication Automation
Description of Processing
AI-assisted generation of investigation scripts, publication drafts, and editorial content from structured investigation data. Content is generated for journalist review and editing.
Necessity & Proportionality
Accelerates the editorial workflow. All generated content undergoes mandatory human review and editing before any publication.
Data Processed
- Investigation findings, entity relationships, timelines
- Source attributions and evidence chains
- Generated draft scripts and narratives
- Editor revision history
Identified Risks & Mitigations
Generated content containing unverified allegations about identifiable individuals.
Mitigations:
- Mandatory editorial review before any publication
- Source verification checklist required before publishing
- Legal review process for sensitive content
- Right-of-reply workflow for named individuals
Automated profiling of individuals through entity relationship mapping.
Mitigations:
- Entity data derived from public sources and investigation evidence only
- No automated decision-making with legal effects
- Organisation-scoped access controls
- Audit logging of all entity data access
Consultation Notes
IPSO guidance on automated content generation reviewed. All outputs treated as journalist work product requiring editorial judgment.
VirusTotal Malware Scanning
Description of Processing
Automated submission of uploaded document URLs to VirusTotal for malware scanning. File hashes and metadata are shared with VirusTotal’s cloud service to determine whether files contain malicious content.
Necessity & Proportionality
Necessary to protect platform users and infrastructure from malware. Manual scanning is impractical at scale. VirusTotal aggregates results from 70+ antivirus engines.
Data Processed
- File URLs (pre-signed, time-limited)
- File hashes (SHA-256, MD5)
- File size and type metadata
- Scan results and threat classifications
Identified Risks & Mitigations
Confidential investigation documents exposed to VirusTotal’s community scanning platform.
Mitigations:
- URL scanning used (not file upload) — reduces exposure of file contents
- Pre-signed URLs expire within 15 minutes
- Users informed of scanning in document upload flow
- Planned enhancement (not yet available): option to disable scanning for highly sensitive documents
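The file hashes and the 15-minute URL lifetime listed above can be sketched with the Python standard library alone. The function names here are hypothetical, and the example deliberately makes no calls to VirusTotal or to any storage provider, since the register does not specify those integrations.

```python
import hashlib
from datetime import datetime, timedelta, timezone

URL_TTL = timedelta(minutes=15)  # lifetime stated in the mitigations above


def file_digests(data: bytes) -> dict:
    """Compute the SHA-256 and MD5 digests of a file's bytes."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        # MD5 is used only as a lookup identifier, not for security.
        "md5": hashlib.md5(data, usedforsecurity=False).hexdigest(),
    }


def url_is_live(issued_at: datetime, now: datetime) -> bool:
    """Return True while a pre-signed URL is still inside its lifetime."""
    return now < issued_at + URL_TTL


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(url_is_live(issued, issued + timedelta(minutes=14)))  # True
print(url_is_live(issued, issued + timedelta(minutes=15)))  # False
print(file_digests(b"")["md5"])  # d41d8cd98f00b204e9800998ecf8427e
```

A hash identifies a file without revealing its contents, whereas anyone holding a live pre-signed URL can fetch the file itself until the URL expires.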
International transfer of file metadata to VirusTotal (US/EU infrastructure).
Mitigations:
- Standard Contractual Clauses with VirusTotal/Google
- Only metadata and URLs transferred, not full file content
- VirusTotal privacy policy reviewed and documented
Consultation Notes
VirusTotal Terms of Service reviewed. Lawful basis: legitimate interests (platform security) under GDPR Art. 6(1)(f).
Automated Security & Anomaly Monitoring
Description of Processing
Automated analysis of audit logs to detect suspicious activity patterns including mass data deletions, brute-force authentication attempts, unusual access times, and privilege escalation. Alerts sent to administrators.
Necessity & Proportionality
Required for ISO 27001 compliance (A.12.4.3) and timely incident response. Manual log review is insufficient for real-time threat detection.
Data Processed
- Audit log entries (user IDs, actions, timestamps, IP addresses)
- Authentication failure patterns
- Access frequency and timing patterns
- Aggregated anomaly scores
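Brute-force detection over audit log entries of this shape is commonly implemented as a sliding-window count of authentication failures. The sketch below is one such approach under stated assumptions: the tuple layout, the `auth_failure` action name, and the limits of 5 failures in 10 minutes are hypothetical and are not taken from the platform.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def brute_force_alerts(events, max_failures=5, window=timedelta(minutes=10)):
    """Return {ip: time of first alert} for IPs reaching max_failures in the window.

    events: iterable of (timestamp, user_id, ip, action), sorted by timestamp.
    """
    recent = defaultdict(deque)  # ip -> timestamps of recent failures
    alerts = {}
    for timestamp, _user_id, ip, action in events:
        if action != "auth_failure":
            continue
        failures = recent[ip]
        failures.append(timestamp)
        # Discard failures that have fallen out of the sliding window.
        while timestamp - failures[0] > window:
            failures.popleft()
        if len(failures) >= max_failures:
            alerts.setdefault(ip, timestamp)  # record only the first alert per IP
    return alerts


start = datetime(2024, 1, 1, 3, 0)
log = [(start + timedelta(minutes=i), "u1", "203.0.113.7", "auth_failure") for i in range(5)]
log.append((start + timedelta(minutes=6), "u2", "198.51.100.4", "login_success"))
print(brute_force_alerts(log))
```

Counting per source IP rather than per user keeps the check focused on the security event. The result is an alert for an administrator to review, not an automatic action.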
Identified Risks & Mitigations
Employee monitoring concerns: behavioural profiling through access pattern analysis.
Mitigations:
- Monitoring focused on security events, not productivity metrics
- Users informed of monitoring in privacy policy and employment terms
- Aggregated pattern detection, not individual tracking
- Alerts reviewed by human administrators before any action
Consultation Notes
ICO Employment Practices Code consulted on legitimate monitoring. Processing proportionate to security objectives.
Data Protection Officer
For questions about these DPIAs, to request the full assessment documentation, or to raise concerns about data processing, contact our Data Protection Lead: