Preparing for Rapid AI Change

AI in Investigations

Current Challenges, Practical Applications and Future Opportunities

Iren Irbe

Head of Coordination and Development Unit

Estonian Tax and Customs Board

PhD Researcher in Applied Informatics

Tallinn University


Preparing for Rapid AI Change

Kurzweil Curve - exponential technological progress
Ray Kurzweil's exponential growth curve.

Current Investigation Challenges

Investigations are becoming cognitively overwhelming.

1 TB+ per case

Volume

A single case can exceed a terabyte of data and digital evidence.

Many systems

Fragmentation

Spread across multiple systems and incompatible formats, with no single coherent view.

Deadlines

Time pressure

Investigators must connect facts and reach decisions under tight deadlines.

Undocumented

Knowledge loss

Experience, context and decision rationale stay in people's heads, and leave with them.

Growing

Complexity

More entities, relationships and information sources with every case.

Key Insight

Investigators spend too much time finding and organising information, and too little time analysing it.

Information Overload (Miller, 1956)

Human working memory holds roughly 7±2 items at once. When data volume exceeds that capacity, processing degrades. Modern investigations routinely exceed these limits, creating a cognitive bottleneck.


Where AI Can Help

From finding information to preserving knowledge and supporting analysis.

01

Find Information Faster

  • Search across multiple systems and information sources
  • Translate multilingual information
  • Access previous cases, documents and organisational knowledge
02

Understand Information Faster

  • Summarise reports and case materials
  • Extract key entities, events and relationships
  • Organise large volumes of information into a structured view
03

Preserve Knowledge

  • Support onboarding and knowledge transfer
  • Capture organisational knowledge and expertise
  • Reduce dependence on individual experts
04

Support Analysis and Forecasting

  • Identify patterns, relationships and anomalies
  • Generate and compare alternative hypotheses
  • Challenge assumptions and reduce confirmation bias
  • Generate risk indicators, forecasts and predictive insights based on available data
  • Support complex investigations with explainable analytical assistance
Example: Challenging a Hypothesis

Investigator: "I believe this company is involved in VAT fraud."

AI: "Possible. But based on the available evidence, what if this is a subcontracting arrangement, an accounting error, or a money laundering case instead?"

From Analysis to Foresight

Beyond explaining what happened, AI can flag risk indicators and project likely outcomes, always with the evidence it relied on shown alongside.


Shortcomings of Current AI Solutions

General-purpose AI is not built for investigative work.

Reliability

  • Hallucinations and factual inaccuracies
  • Overconfidence in generated outputs

Reasoning and Transparency

  • Limited understanding of investigative context
  • Lack of transparency and explainability
  • Tendency to reinforce existing assumptions

Security and Control

  • Data protection and confidentiality concerns
  • Dependence on external cloud services

Why It Matters

In investigations the cost of a confident wrong answer is high, so these limits decide where AI can be trusted and where a human must stay in control.

AI is useful, but not sufficient on its own.


What We Have Learned

Three lessons that shape how AI should fit investigative work.

The Information Problem

  • Information overload limits effective analysis
  • Context and relationships matter as much as the evidence itself

Knowledge at Risk

  • Expertise and reasoning are difficult to preserve
  • Knowledge is often lost when experienced employees leave

What AI Must Provide

  • Counter fixation on a single explanation
  • Support human judgement rather than replace it
  • Stay explainable, traceable and legally compliant

Human Judgement First

Investigators need support for understanding and reasoning, not just answers from an AI "oracle".

The goal is to strengthen human reasoning, not replace it.


Current AI Capability

Infrastructure

  • Established a secure on-premises AI environment for testing and evaluation
  • Deployed local language models, including Gemma and EuroLLM

Evaluation

  • Testing translation, summarisation, information retrieval and document analysis
  • Evaluating model performance, accuracy and suitability for operational use

Adoption and Governance

  • Exploring practical use cases with managers, analysts and investigators
  • Assessing security, legal and governance requirements for future deployment

On-Premises Models

Gemma and EuroLLM run inside the agency's own infrastructure, so sensitive operational data never leaves the controlled environment.

Learning where AI is trustworthy before we rely on it.


Next Step: AI-Supported Investigations

How a case turns from raw material into investigator judgement.

01

Documents

  • Reports: inspection reports, intelligence notes
  • Emails: communications between traders and brokers
  • Bank records: payments, transfers, account activity
  • Registry data: company ownership, vessel registries
02

Information

  • Entities: people, companies, vessels, accounts
  • Events: shipments, payments, inspections
  • Relationships: ownership, communications, transactions
  • Timelines: sequence of activities over time
03

Insights

  • Patterns: repeated trade routes, recurring counterparties
  • Anomalies: unusual payments, unexpected routing changes
  • Alternatives: legitimate business activity versus concealment
  • Risk indicators: shell companies, high-risk jurisdictions, hidden ownership
04

Investigator

  • Understanding: situational awareness and context
  • Decisions: assessing risks and setting priorities
  • Actions: initiating investigations and enforcement
From Storage to Insight

Current tools mostly store and retrieve. The next step is connecting evidence into a structure an investigator can read and question.

The Human is the Destination

Each step adds structure, but the chain ends with the investigator. AI prepares the material; the judgement, decisions and actions remain human.


Investigative Reasoning is Iterative

A conclusion is the end of a loop, not a straight line.

Rarely

Evidence → Conclusion

A straight path from evidence to conclusion is the exception.

AI can support this process by

  • Identifying missing information
  • Suggesting alternative explanations
  • Highlighting contradictory evidence
  • Preserving the reasoning process

Usually, a loop

Evidence and hypothesis alternate until they agree.

Legend: evidence · hypothesis

Initial evidence

Working hypothesis

Additional evidence

Refined hypothesis

Additional evidence

Refined hypothesis

Conclusion

Reached once evidence and explanation hold together

The Confirmation Trap

When the first hypothesis seems plausible, we tend to look only for evidence that confirms it. Yet each new piece of evidence can confirm, change or overturn the current understanding.

A Devil's Advocate, in Plain Language

AI does not have to give the answer. It can help ask the harder questions:

  • What else could explain this?
  • Which evidence argues against our hypothesis?
  • What information do we still need?
  • Are there similar past cases?

Example: AI-Assisted Investigation

From three raw emails to a structured picture of the case.

Input

Three related emails

The raw material an investigator would otherwise read and cross-reference by hand.

Analysis

AI-Supported

  • Entity extraction
  • Relationship mapping
  • Timeline creation
  • Identification of key actors and events
Output

Structured picture

  • Visual knowledge graph
  • Structured overview of connections
  • Faster understanding of investigative context
Knowledge graph for case NST-2024-0847: entities, evidence and hypotheses extracted from the email archive, shown alongside a timeline and case entities panel
Output — knowledge graph linking subjects, evidence and hypotheses extracted from the three emails (case NST-2024-0847).
Case entities extracted by AI: Subject A, NorthStar Trading, Unknown Sender and Cayman Holdings LLC, with connections and evidence tags
Entity extraction — actors and organisations identified from the source material.
Investigation timeline: initial hypothesis formed, evidence retrieved, preliminary analysis complete
Timeline creation — events and analysis steps placed in sequence.

Weighing the Hypotheses

Evidence timeline for case NST-2024-0847: three supporting and four challenging items, each tagged to hypothesis H1, H2 or H3 with a timestamp
Each piece of evidence is tagged to a hypothesis and marked as supporting or challenging, so contradicting evidence stays visible instead of being explained away.
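
The tagging described above can be sketched as a small data structure. This is a minimal illustration, not the prototype's code: the `EvidenceItem` class, the `tally_by_hypothesis` helper and the item ids are hypothetical, and the sample items are placeholders rather than the evidence from case NST-2024-0847.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceItem:
    item_id: str      # hypothetical identifier
    hypothesis: str   # "H1", "H2" or "H3"
    stance: str       # "supporting" or "challenging"


def tally_by_hypothesis(items):
    """Count supporting and challenging items per hypothesis."""
    counts = defaultdict(lambda: {"supporting": 0, "challenging": 0})
    for item in items:
        if item.stance not in ("supporting", "challenging"):
            raise ValueError(f"unknown stance: {item.stance!r}")
        counts[item.hypothesis][item.stance] += 1
    return dict(counts)


# Placeholder items for illustration only.
items = [
    EvidenceItem("E1", "H1", "supporting"),
    EvidenceItem("E2", "H1", "challenging"),
    EvidenceItem("E3", "H2", "supporting"),
    EvidenceItem("E4", "H2", "challenging"),
    EvidenceItem("E5", "H3", "challenging"),
]

for hypothesis, counts in sorted(tally_by_hypothesis(items).items()):
    print(hypothesis, counts)
```

Keeping the challenging count as its own field, rather than netting it against the supporting count, is what keeps contradicting evidence visible in the tally.
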
Decision-tree outcomes for the three hypotheses: blackmail confirmed (35%), fabrication narrative (60%) and collusion (5%), each with consequences and a scenario selector
Each hypothesis is carried through to its consequences, so the investigator can weigh what would follow from each outcome before acting.
Devil's Advocate panel raising counter-arguments, a confirmation bias warning, logical analysis and a recommendation for the same case
Devil's Advocate — counter-arguments, a confirmation-bias warning and a recommendation for further evidence.
Phase portrait simulation showing attractor basins, a cusp fold and trajectories between stable and creative basins
Simulation — modelling how the situation could shift between stable and unstable outcomes over time.
Input to Insight

The same three emails that take time to read by hand become a structured graph the investigator can grasp at a glance.


Live Demo: AI Summarisation Module

The AI ingests a stream of structured Knowledge Objects (KOs), representing different evidence sources, and synthesises them into a coherent, readable summary.

Input View (Knowledge Objects)
Token Usage: 842/8192
// INGESTED EVIDENCE STREAM (JSON-LD)
KO-001 (Incident Report):
"At 02:35, silent alarm at Central Data Facility. Rear door unsecured. Guard J. Kask found unconscious."
KO-002 (Surveillance Log):
"Camera 04 captures Blue Van (771-BKV) departing at 02:15. Driver unidentifiable. Logs 02:00–02:30 deleted."
KO-003 (Suspect Interview):
"Suspect A. Tamm (Owner 771-BKV) claims alibi: 'Night Market 22:00–03:00'. Status: UNVERIFIED."
KO-004 (Forensics Preliminary):
"USB Drive (Ev-001) recovered near rack 14. Contains encrypted partition. Traces of 'DarkSide' ransomware signature."
KO-005 (Toxicology Report):
"Guard J. Kask blood sample positive for Zolpidem (sedative). Dosage consistent with forced ingestion approx 01:30."
KO-006 (ANPR Hit):
"Vehicle 771-BKV detected by camera #442 (Pärnu Mnt) heading South at 02:45. Speed: 110km/h."
KO-007 (Witness Statement):
"Market vendor M. Tamm (no relation) states stall #42 was closed at 22:00. Contradicts Suspect A's alibi."
KO-008 (Financial Intel):
"Wallet 0x7a...f2 linked to A. Tamm received 2.5 BTC at 03:15. Sender wallet flagged as 'DarkSide Affiliate'."
KO-009 (Background Check):
"A. Tamm: Prior conviction (2021) for cyber-facilitated fraud. Known associate of 'The Broker' (Suspect B)."
KO-010 (Network Log):
"Firewall alert 02:10: Outbound SSH connection to IP 185.x.x.x (Moldova). 4.2GB data exfiltrated."
KO-011 (Physical Evidence):
"Latent print lifted from Server Rack 14 handle. Match: A. Tamm (99.9% confidence)."
KO-012 (Suspect B Sighting):
"Patrol unit reports individual matching description of 'The Broker' entering vehicle 771-BKV at 01:45."
KO-013 (Dark Web Chatter):
"Post on 'BreachForums' at 03:30: 'Fresh gov database for sale. Estonia origin.' User: 'SilentNight'."
KO-014 (Vehicle Search):
"Vehicle 771-BKV intercepted at 04:00. Laptop (Ev-002) found under passenger seat. Driver A. Tamm detained."
KO-015 (Laptop Forensics):
"Ev-002 contains SSH keys matching Central Data Facility server. Browser history shows access to 'BreachForums'."
KO-016 (Arrest Report):
"Suspect B ('The Broker') apprehended at safehouse. Confirms A. Tamm was hired for physical access."
Task: Synthesize KOs into Executive Briefing.
Simulation of multi-source evidence summarisation.
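
One step such a summary depends on is putting the timed KOs in order. The sketch below does that for the events in the stream above that carry a clock time. It assumes all of them fall on the same night after midnight. The `build_timeline` helper and the short labels are illustrative, not part of the demo.

```python
from datetime import time

# (KO id, time of event, short label) taken from the evidence stream above.
# Assumption: every event here happens on the same night, after midnight.
events = [
    ("KO-001", time(2, 35), "Silent alarm at Central Data Facility"),
    ("KO-002", time(2, 15), "Blue van 771-BKV departs (camera 04)"),
    ("KO-005", time(1, 30), "Estimated sedative ingestion (approximate)"),
    ("KO-006", time(2, 45), "ANPR hit on 771-BKV, heading south"),
    ("KO-008", time(3, 15), "Wallet linked to A. Tamm receives 2.5 BTC"),
    ("KO-010", time(2, 10), "Firewall alert: outbound SSH connection"),
    ("KO-012", time(1, 45), "Individual matching 'The Broker' enters 771-BKV"),
    ("KO-013", time(3, 30), "Database offered for sale on forum"),
    ("KO-014", time(4, 0), "Vehicle 771-BKV intercepted"),
]


def build_timeline(events):
    """Return the events ordered by time of day."""
    return sorted(events, key=lambda event: event[1])


for ko_id, when, label in build_timeline(events):
    print(f"{when:%H:%M}  {ko_id}  {label}")
```

Each line of the timeline keeps its KO id, so a reader can go from any entry back to the record it came from.
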
System Specs
● Online

Model: Mistral 7B (Ollama)

Input: JSON-LD Stream

Context: 8k Tokens

Mode: Air-gapped (Offline)

Why Summarise KOs?

When the AI summarises structured KOs rather than arbitrary free text, the risk of hallucination drops, because the summary must rely on facts already recorded in the knowledge graph.
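
A minimal sketch of how a prompt could be restricted to recorded KOs while respecting the 8k-token context shown in the System Specs. The `build_prompt` and `estimate_tokens` helpers and the instruction wording are hypothetical, and the four-characters-per-token estimate is a rough assumption, not the model's tokenizer.

```python
CONTEXT_TOKENS = 8192  # context size shown in the demo's System Specs


def estimate_tokens(text):
    """Rough estimate of about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def build_prompt(kos, task, budget=CONTEXT_TOKENS):
    """Assemble a prompt from (ko_id, text) pairs; refuse to exceed the budget."""
    lines = [
        "Use only the Knowledge Objects below. Cite a KO id for every statement.",
        "",
    ]
    for ko_id, text in kos:
        lines.append(f"[{ko_id}] {text}")
    lines += ["", f"Task: {task}"]
    prompt = "\n".join(lines)
    if estimate_tokens(prompt) > budget:
        raise ValueError("KO stream does not fit the context window")
    return prompt


kos = [
    ("KO-001", "At 02:35, silent alarm at Central Data Facility."),
    ("KO-014", "Vehicle 771-BKV intercepted at 04:00. Driver A. Tamm detained."),
]
print(build_prompt(kos, "Synthesize KOs into Executive Briefing."))
```

Raising an error when the stream does not fit, instead of silently truncating it, avoids a summary that omits evidence without saying so.
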


Interactive Demo: Mobile + Desktop

Collaborative hypothesis workflow — from field officer to analyst in real time.

Mobile App: Voice-based capture in the field
Desktop App: Investigation Platform (A4 format)

Field Capture

Officers use voice to capture observations during patrol, interviews, or inspections. The mobile interface prioritises speed and minimal cognitive load.

Deep Analysis

Analysts access the full knowledge graph, entity relationships, and reasoning chains through the desktop platform's multi-widget layout.


Assistant Desktop

Interactive prototype of the investigation workspace.

This prototype demonstrates the multi-widget dashboard. Use the sidebar to switch between views (Home, Dashboard, Documents, Analysis, Chat) and select a role from the dropdown to see role-specific configurations.


Interface

Interactive prototype of the voice-first assistant designed for high-stress environments.

Key Features

01

Voice-First Interaction

Prioritising voice lowers the cognitive barrier for articulating tacit knowledge, encouraging storytelling and in-the-moment narration.

02

Conversational Externalisation

The AI acts as a Socratic partner, using "intuition pumps" to elicit hidden assumptions during the conversation.

03

Groundedness (GraphRAG)

Every answer is anchored in the knowledge graph. The interface explicitly links generated insights back to their source KOs.

04

Context-Aware Adaptation

Adapts interface and suggestions based on the user's current role and location.

05

EASCI Integration

Bridges the gap between capturing raw experience and articulating it into structured knowledge.
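
The groundedness feature (03) implies a check that every generated statement points at a known KO. The sketch below is one way such a check might look. It assumes citations appear in the text as `[KO-nnn]`. That format and the `check_grounding` helper are illustrative, and the sample sentences are invented for the example.

```python
import re

KO_REF = re.compile(r"\[(KO-\d{3})\]")


def check_grounding(sentences, known_ko_ids):
    """Return (sentence, problem) pairs for uncited or wrongly cited sentences."""
    problems = []
    for sentence in sentences:
        refs = KO_REF.findall(sentence)
        if not refs:
            problems.append((sentence, "no source cited"))
            continue
        unknown = [ref for ref in refs if ref not in known_ko_ids]
        if unknown:
            problems.append((sentence, "unknown source: " + ", ".join(unknown)))
    return problems


known = {f"KO-{n:03d}" for n in range(1, 17)}  # KO-001 to KO-016
summary = [
    "Guard J. Kask tested positive for Zolpidem [KO-005].",
    "A. Tamm was detained at 04:00 [KO-014].",
    "The suspect acted alone.",
    "A second laptop was recovered [KO-099].",
]
for sentence, problem in check_grounding(summary, known):
    print(f"{problem}: {sentence}")
```

This only verifies that a citation exists and resolves to a real KO. Whether the cited KO actually supports the sentence still needs a human or a separate check.
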

Try it: Click the microphone icon in the prototype to simulate a voice capture session.

Mobile App

Field officers use voice input to quickly capture evidence and observations at the scene.

Desktop App

The analyst sees the knowledge graph updating in real time and can immediately work with new evidence.

Synchronisation

Information captured on mobile appears instantly on the desktop graph, so teams collaborate without delay.

Widget-Based UI

The dashboard adapts to the user's role. The investigator sees graphs and timelines; the analyst sees scenarios and hypotheses.

Roles

Select a role from the dropdown to see how the interface adapts: Investigator, Analyst, Supervisor, Prosecutor, Auditor, AML Specialist, Audio Secretary.

Cognitive Load Theory

Sweller (1988). Working memory is limited. In high-stress situations, the cognitive load of typing (visual-motor) competes with the task. Voice (auditory-verbal) uses a separate channel, reducing interference.

Socratic Method

The AI doesn't just record; it asks "Why?". "Why did you check the trunk first?" This forces the expert to make their implicit reasoning explicit.

Voice Efficiency

Speaking is nearly four times faster than typing (150 wpm versus 40 wpm). In high-stress environments, typing is a friction point that prevents knowledge capture.

Presenter Notes
  • Interactive Demo: This isn't a screenshot. It's the actual code running in an iframe.
  • Why Voice? It's not just convenience. It's about cognitive load. Police officers can't type while assessing a threat.
  • Socratic Partner: Emphasise that the AI is active, not passive. It probes for details.
  • EASCI Integration: This is the "E" (Experience) and "A" (Articulation) part of the loop happening in real-time.

Data Lifecycle and Traceability

How information moves through the system.

01

Data Sources

  • Devices and extracted data
  • Registries and databases
  • Bank information
  • Court decisions and documents
  • Procedures and legislation
02

Case File

  • Data is stored within a specific case file
  • Each case remains fully isolated
  • Access is role-based and logged
03

AI Analysis

  • Summarisation and information extraction
  • Relationship and pattern identification
  • Analytical support and visualisation
  • Full traceability of outputs
04

Reports and Archive

  • Reports can be exported
  • Retention periods are managed automatically
  • Data is deleted according to legal requirements

Key Principles

  • Data remains within agency-controlled infrastructure
  • No cross-searching between case files
  • AI does not train on operational case data
  • Every result can be traced back to its source
Isolation by Design

Each case file stands on its own. There is no cross-searching between cases, so information stays scoped to the investigation it belongs to.
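
A minimal sketch of case isolation with role-based, logged access. The `CaseFile` and `CaseStore` classes, the case ids and the documents are hypothetical. The point is that the store offers no operation that searches more than one case, and that every attempt is logged whether or not it is allowed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CaseFile:
    case_id: str
    allowed_roles: set
    documents: list = field(default_factory=list)


class CaseStore:
    """Searches are always scoped to one case; every attempt is logged."""

    def __init__(self):
        self._cases = {}
        self.access_log = []

    def add_case(self, case):
        self._cases[case.case_id] = case

    def search(self, case_id, user, role, term):
        case = self._cases[case_id]
        allowed = role in case.allowed_roles
        self.access_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user, "role": role, "case": case_id,
            "term": term, "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"role {role!r} may not read case {case_id}")
        return [doc for doc in case.documents if term.lower() in doc.lower()]


store = CaseStore()
store.add_case(CaseFile("CASE-A", {"Investigator"}, ["Invoice from Alpha Ltd"]))
store.add_case(CaseFile("CASE-B", {"Investigator"}, ["Alpha Ltd bank record"]))
print(store.search("CASE-A", "user1", "Investigator", "alpha"))
```

The search for "alpha" returns only the CASE-A document even though CASE-B also mentions the same company, because the case id is a required argument rather than an optional filter.
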

Traceable Outputs

Every AI result links back to the source it came from, which keeps analysis auditable and lets an investigator verify any claim.
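
Tracing a result back to its sources amounts to walking a derivation graph. The sketch below assumes each node records what it was derived from, in the spirit of the `wasDerivedFrom` relation in W3C PROV-O. The node ids and the `trace_sources` helper are hypothetical.

```python
# Each node maps to the nodes it was derived from (hypothetical ids).
DERIVED_FROM = {
    "conclusion-1": ["step-2"],
    "step-2": ["step-1", "email-3"],
    "step-1": ["email-1", "email-2"],
}


def trace_sources(node, derived_from):
    """Return the root sources (nodes with no parents) behind a node."""
    sources, seen, stack = set(), set(), [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against revisiting shared or cyclic nodes
        seen.add(current)
        parents = derived_from.get(current, [])
        if not parents:
            sources.add(current)
        stack.extend(parents)
    return sources


print(sorted(trace_sources("conclusion-1", DERIVED_FROM)))
# → ['email-1', 'email-2', 'email-3']
```

An investigator verifying a claim would read the returned sources directly. The intermediate steps remain in the graph for anyone who wants to review the reasoning between them.
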

Retention and Deletion

Data is kept only as long as legal, regulatory or operational obligations require, then deleted on schedule.
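
Scheduled deletion can be sketched as a date calculation. The retention periods below are placeholders chosen for illustration, since the real values come from the applicable legislation. The `expiry_date` and `due_for_deletion` helpers and the record layout are hypothetical.

```python
from datetime import date

# Placeholder retention periods in years; real values are set by law.
RETENTION_YEARS = {"case_file": 5, "ai_log": 1}


def expiry_date(closed_on, record_type):
    """Return the date on which a record's retention period ends."""
    years = RETENTION_YEARS[record_type]
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return closed_on.replace(year=closed_on.year + years, day=28)


def due_for_deletion(records, today):
    """Return ids of records whose retention period has ended."""
    return [
        record["id"]
        for record in records
        if expiry_date(record["closed_on"], record["type"]) <= today
    ]


records = [
    {"id": "r1", "type": "case_file", "closed_on": date(2018, 1, 10)},
    {"id": "r2", "type": "case_file", "closed_on": date(2023, 6, 1)},
]
print(due_for_deletion(records, date(2024, 1, 1)))
```
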


Legal and Governance Principles

Four commitments keep the system lawful and accountable.

Human Oversight

  • Investigators remain responsible for all decisions
  • AI supports, but does not replace, human judgement
  • AI outputs can be reviewed, corrected or ignored

Transparency and Explainability

  • AI-generated outputs are clearly identified
  • Results include explanations and source references
  • Facts and AI-generated analysis remain separated

Data Protection

  • Personal data remains within the case file
  • Access is controlled and fully logged
  • Retention and deletion follow legal requirements
  • Secure on-premises deployment, no external cloud

Compliance by Design

  • LED (Law Enforcement Directive)
  • GDPR (where applicable)
  • EU AI Act
  • National legal requirements
  • Auditability and accountability

Traceability in Practice

Provenance and reasoning graph linking an AI conclusion back to its source emails and reasoning steps
Provenance graph: every AI conclusion links back through its reasoning steps to the original evidence.
Devil's Advocate chat message raising a coercion concern about timestamps aligning with Subject A's login sessions, with three reasoning sub-steps shown
The conclusion in context: the Devil's Advocate message that raised the timestamp concern, with its reasoning sub-steps, before it was traced in the graph.
Highlight context popup showing data lineage, element type and synchronized source references for an AI output
Each output exposes its data lineage and source references, so a reader can verify or challenge it.
Export menu showing compliance artefacts: FRIA, FRIA PDF, EU database metadata and XML, integrity proof and attestation
One-click export of compliance artefacts: FRIA, EU database metadata, integrity proof and attestation.

Compliance Built into the Workflow

Reasoning graph with dedicated LED, GDPR, EU AI Act and ECHR scanning steps feeding a compliance agent
Dedicated LED, GDPR, EU AI Act, and ECHR scanning steps run as part of the reasoning graph, with a compliance agent in the loop.

Explainable, Auditable Reasoning

Full reasoning graph tracing every analytical step from user messages and AI responses through field evidence to each hypothesis
The complete reasoning graph: every step, evidence link and hypothesis branch is recorded, so an analysis can be replayed and reviewed end to end.
Reasoning graph legend documenting node types, edge types and agent roles, including a dedicated Compliance and ComplianceReporter agent
A documented legend makes the graph readable: each node, edge and agent role is defined, including a dedicated compliance reporter that produces the FRIA and Full Report exports shown above.

The Compliance Reporter at Work

ComplianceReporter agent generating a FRIA for a high-risk law-enforcement AI, checking EU AI Act, LED, GDPR, EUROPOL, Europol JIT, EU WTR, FATF, ISO 20022, SWIFT, NIST AI RMF, ISO 27001 and Estonian ISKE, with two items flagged pending
The compliance reporter generating the FRIA: each regulatory framework is checked article by article, most cleared and a few flagged pending, with the running steps shown alongside.

Provenance Graph and Legend

Provenance legend based on the W3C PROV-O model, defining node types, field observation types, hypothesis versioning, compliance types and every edge type used in the graph
The graph follows the W3C PROV-O standard: the legend defines each node, edge and compliance type, so the lineage is read the same way every time.
Full provenance graph for case NST-2024-0847: a vertical chain of agents, activities, entities and compliance frameworks linking every conclusion back to its sources and the hypotheses
The full provenance graph: every agent, activity, entity and compliance framework in the case, linked end to end so each conclusion can be traced back to its sources.

Compliance Report at a Glance

Compliance report showing an 85 percent compliance score, evidence chain integrity checks, eight items requiring review and recommended actions including generating a FRIA report and exporting the audit trail
The case compliance report: an overall score, evidence-chain integrity checks, the items flagged for review (one high, seven medium) and recommended next actions, including a FRIA report and an audit-trail export.
Compliance assessment summary band showing an 85 percent score with five FRIA reports, one attestation, seventeen framework checks and fourteen frameworks
The coverage behind the score: five FRIA reports, one attestation and seventeen checks across fourteen regulatory frameworks.
Items requiring review: eight flagged items, including a coercion concern, a flagged timestamp anomaly and several preliminary FRIA and field-evidence requirements
What the score asks a human to check: the items flagged for review, from a coercion concern and a flagged timestamp anomaly to preliminary FRIA and field-evidence requirements.

Legal Compliance

Requirement | Implemented Measures and Solutions | System Feature
Data Protection and Privacy
Estonian Constitution § 26, § 43 · ECHR Art. 8 · CFR Art. 7, 8 · FI PL § 10 - Privacy and protection of personal data | Case-based data isolation; role-based access control (RBAC); encrypted communication-data storage; access only through court-order ID binding; all queries logged; automatic data-expiry checks. | Case isolation; Court-order binding
LED Art. 4, Art. 8 · FI 1054/2018 § 4–6 - Processing principles and lawfulness | Lawfulness, fairness, purpose limitation and data minimisation: data is processed solely for law-enforcement purposes; each case file is strictly isolated; only abstract schemes are stored in the pattern database. | Purpose limitation
Estonian IKS § 20 · LED Art. 10 · FI 1054/2018 § 11 - Special categories of personal data | Special-category data (race, ethnicity, political views, religion, health, biometrics) processed only where strictly necessary and prescribed by law; automatic classification, restrictions and additional security measures. | Auto-classification
Estonian IKS § 43 · FI 1054/2018 § 31–32 - Security measures | Data encryption at rest (AES-256); RBAC-based role management; full audit logging; automatic backup. | Encryption + RBAC
GDPR Art. 5, 6 · FI Tietosuojalaki 1050/2018 § 4 (where applicable) | When processing administrative or non-criminal investigation data: data-subject consent or legitimate interest; mandatory data-subject notification. | Consent / basis check
Human Oversight and Automated Decision-Making
Estonian Constitution § 22 · FI PL § 21 - Presumption of innocence | AI does not make guilt-determining decisions; the system supports the investigator and does not replace court rulings. | No guilt scoring
LED Art. 11 · FI 1054/2018 § 13 - Automated decision-making | AI does not make automated decisions; all results are confirmed by the investigator; profile-based decisions without human intervention are prohibited. | Human-in-the-loop
EU AI Act Art. 14(1)–(4) · CFR Art. 41 · FI Hallintolaki § 6 - Human oversight and good administration | The investigator can override, correct or ignore AI output at any time; the system can be stopped with a “stop” button; the UI displays limitations and capabilities. Per Article 14(4)(b), prompts counter automation bias and over-reliance on AI output. | Override / stop; Automation-bias prompts
Estonian KrMS § 211 · ECHR Art. 6 · FI ETL 805/2011 4:1 - Objectivity; duty to weigh exculpatory and incriminating evidence | Devil's-advocate and Socratic-questioning prompts force consideration of alternative and exculpatory explanations, surface disconfirming evidence and challenge premature closure, supporting the duty to investigate objectively. | Devil's advocate
EU AI Act Art. 9 - Risk-management system | Continuous risk-management process across the system lifecycle; identified risks are logged, assessed and mitigated; results recorded in the technical file. | Risk log
EU AI Act Art. 27 - Fundamental Rights Impact Assessment | Completed Fundamental Rights Impact Assessment (FRIA); technical documentation per Article 11; risk-assessment log; conformity declaration. | FRIA export
LED Art. 27 · GDPR Art. 35 · FI 1054/2018 § 20 - Data protection impact assessment | A DPIA is conducted for high-risk processing, separate from and complementary to the AI Act FRIA. | DPIA export
Transparency and Right to Explanation
Estonian Constitution § 15, § 24 · ECHR Art. 6, 13 · CFR Art. 47 · FI PL § 21 - Effective proceedings, fair trial and remedy | AI reasoning chain exportable (PDF/JSON); the reasoning graph enables step-by-step challenge of the decision process; outputs are transparent, explainable and accessible to the defence. | Reasoning graph
Estonian Constitution § 44(3) · FI 1054/2018 § 23 - Right to access data | Built-in Data Subject Access Request (DSAR) export; personal-data query report is generated automatically. | DSAR export
LED Art. 16 · FI 1054/2018 § 25 - Rectification and erasure | Inaccurate police data can be rectified or erased on request; corrections propagate through the provenance graph. | Provenance graph
EU AI Act Art. 86 - Right to explanation | Each AI output carries a process-level explanation: the reasoning graph shows the inference steps; the provenance graph shows inputs and sources. Model-internal XAI is rarely feasible and is not relied on. | Reasoning graph; Provenance graph
EU AI Act Art. 13 - Transparency | The system user guide and UI explain AI capabilities, limitations and intended use cases. | User guide / UI
EU AI Act Art. 50 - User notification | Users are clearly informed they are interacting with AI; outputs are marked as AI-generated. | AI-output marking
FI Laki digitaalisten palvelujen tarjoamisesta 306/2019 § 7–§ 9 - Digital-service accessibility (Finland) | Public-facing service interfaces and AI-generated explanations meet statutory accessibility requirements (perceivable, understandable, operable); an accessibility statement is published. | Accessibility
Data Quality and Evidence
LED Art. 7 · FI 1054/2018 § 7–8 - Data quality | AI-based conclusions are clearly distinguished from facts; the investigator confirms data accuracy before further use. | Reasoning graph
LED Art. 6 · FI 1054/2018 § 8 - Categories of data subjects | Suspects, victims, witnesses and contacts are distinguished and labelled; AI outputs preserve these category tags. | Subject-category tags
EU AI Act Art. 10 · ECHR Art. 14 · CFR Art. 21 · FI Yhdenvertaisuuslaki 1325/2014 § 8 - Data governance and non-discrimination | Reference data is examined for bias and representativeness; data origin is traceable; safeguards against discriminatory outputs. | Bias checks; Provenance graph
EU AI Act Art. 15 - Accuracy, robustness and cybersecurity | Accuracy is declared and monitored; robustness and cybersecurity controls; low-confidence outputs trigger fallback and human confirmation. | Confidence + fallback
Estonian KrMS § 63 · FI ETL 805/2011 - Concept of evidence | AI is an investigative aid; documents prepared by the investigator based on AI analysis may qualify as evidence within the meaning of § 63 (other document). | Provenance graph
Estonian KrMS § 64 · FI Pakkokeinolaki 806/2011 - Conditions for evidence collection | Full traceability is ensured: each AI output references the original source and maintains data integrity. | Provenance graph
Estonian KrMS § 146 - Procedural action protocol | Documents prepared with AI assistance comply with protocol format requirements: date, author, criminal-case number, course and results of the action. | Protocol format
Estonian KrMS § 150 - Audio and video recording | A report based on AI analysis may rely on material recorded under KrMS § 150; recordings are unaltered and added to the case file. | Unaltered recording
Security and Logging
Estonian IKS § 36 · LED Art. 25(1) · FI 1054/2018 § 19 · 906/2019 § 17 - Logging | Collection, modification, reading/query, transmission/disclosure, combination and deletion are automatically logged; logs ensure traceability, help detect unauthorised access, and are retained for at least 3 years. | Tamper-evident log
E-ITS · FI Tiedonhallintalaki 906/2019 ch. 4 - Security measures | Compliance with the Estonian information security standard (E-ITS): security class is determined by data confidentiality, integrity and availability; catalogue measures are applied. | Security class
FI Tiedonhallintalaki 906/2019 § 18 - Security-classified documents / Katakri (Finland) | Where the system handles security-classified material, statutory classification marking and Katakri-level controls apply, beyond the general information-security standard. | Security classification
Risk, Documentation and Governance
EU AI Act Art. 11 + Annex IV - Technical documentation | Full technical documentation is maintained and produced via the one-click compliance export. | Compliance export
LED Art. 24 · FI 1054/2018 § 18 - Records of processing activities | A record of processing activities is maintained; the provenance graph provides per-output processing records. | Provenance graph
LED Art. 41–46 · FI 1054/2018 § 45 - Supervisory authority | Processing is subject to oversight by the Data Protection Inspectorate (AKI); cooperation and audit support are built in. | Audit export
LED Ch. V (Art. 35–40) · FI 1054/2018 § 41–44 - Transfers to third countries | Cross-border transfers only on a valid legal basis; on-premises deployment keeps data within jurisdiction by default. | On-prem default
Model provenance and licensing | Local models (Gemma, EuroLLM) run on-premises under their respective open-weights licences; model versions and licences are recorded. | Model registry
FI Tiedonhallintalaki 906/2019 §§ 28a–28g · Hallintolaki 434/2003 §§ 53e–53g - Automated-decision deployment governance (Finland) | A public deployment decision (käyttöönottopäätös) is issued; processing rules are documented with an explicit non-discrimination demonstration; responsible persons are named; five-year traceability is kept; each automated decision is marked as such. | Deployment decision
FI Tiedonhallintalaki 906/2019 § 10–§ 11 - Information Management Board (Finland) | The system is subject to assessment by the Information Management Board (Tiedonhallintalautakunta) against 906/2019, an institutional oversight distinct from the Data Protection Ombudsman. | Board assessment
Cross-Jurisdiction Safeguards (Estonia · Finland · EU)
Marquee obligations confirmed across all three jurisdictions. The full source-by-source mapping is in the Legal Jurisdiction Matrix appendix.
Estonian HMS § 56 · KrMS § 305¹ · FI Hallintolaki § 45 · PL § 21(2) - Reasoned and contestable decision | Every AI-assisted output that feeds an official act states the facts found, the evidence relied on and the provisions applied; a judgment may rest only on evidence the parties could examine. | Reasoned decision
Estonian KrMS § 126⁴, § 126¹³, § 126¹⁵ · FI Pakkokeinolaki 806/2011 10:60, 10:65 - Surveillance authorisation, notification and oversight | Covert measures require a prior scope-bound authorisation; the subject is notified once the authorisation expires; access is recorded for two-tier independent oversight. | Judicial authorisation; Post-hoc notification
Estonian KrMS § 61, § 63 lg 2, § 305¹ · FI OK 17:1, 17:25 - Admissibility; AI inference is not evidence | AI output is labelled as inference with no predetermined evidentiary weight; unlawfully obtained material can be excluded; a conviction cannot rest on an unexaminable score. | AI-inference label
Estonian KrMS § 15² · FI 616/2019 - Police-data law-enforcement regime | Police processing follows the dedicated law-enforcement regime, not general data-protection rules: closed-purpose secondary use, class-specific retention, and shielded tactical and source data. | LED-mode processing
Estonian KorS § 7 · KrMS § 9 · FI ETL 805/2011 4:4–4:6 - Proportionality and minimum intervention | Each measure is the least intrusive that achieves the aim, proportionate to offence gravity, and stops once the aim is met. | Minimum intervention
EU AI Act Art. 25, Art. 26, Art. 43, Art. 49(4) - Deployer and provider duties on self-hosting | On-premises build or modification can flip the agency from deployer to provider; conformity assessment runs by internal control and the system is registered in the secure non-public section of the EU database. | Deployer duties; EU DB (non-public)
Retention and Deletion
LED Art. 5 · FI 1054/2018 § 6(3) - Retention periods | Personal data is retained only as long as necessary for the purpose; the lifecycle graph tracks deadlines and notifies of expiry. | Lifecycle graph
EU AI Act Art. 12 - Log retention | AI-system logs are retained for at least 6 months; provenance and decision logs are exported to archive on the lifecycle schedule. | Lifecycle graph
Estonian ArhS § 13, § 7 · FI Arkistolaki 831/1994 § 7, § 13 - Records appraisal and archival rules | Documents subject to the archiving obligation are appraised and exported separately; retention/deletion schedules follow the archival rules and document type. | Lifecycle graph
FI Arkistolaki 831/1994 § 8, § 13 — National Archives appraisal (Finland) Permanent-preservation decisions rest with the National Archives (Kansallisarkisto); retention logic defers to its appraisal; destruction is secure and data-protection-compliant. Archives appraisal

Retention Periods and Deletion

Data Type | Retention Period | Legal Basis | Exception Possible?
Criminal Case File (evidence, protocols) | 10 yrs (general); 15 yrs (1st degree crimes); permanent (crimes against humanity) | KrMS § 209 lg 2; VVm § 6 lg 1–4 | Yes, if archival value; ArhS § 2 §§ 3–4, § 8; VVm § 6 § 5
Surveillance File (wiretaps, surveillance) | Up to 50 years | KrMS § 126¹² lg 3 | No; KrMS § 126¹² lg 3 (strict limit)
Court File (hearing protocols, decisions) | 10 yrs (after entry into force) | KrMS § 160¹ lg 6–7 | Yes, if archival value; ArhS § 2 §§ 3–4; § 8 § 2
DNA/Fingerprint Data | Until criminal record data is deleted | KrMS § 206; ECHR Art. 8 (S. and Marper v. UK) | Yes, on termination of proceedings (§ 200); KrMS § 206 (deletion from DNA / fingerprint registers)
AI reasoning logs (provenance, decisions) | Min 6 months; technical docs 10 yrs after market withdrawal | EU AI Act Art. 12(1), Art. 18(1), Art. 19(1) | Yes, extendable upon challenge; LED Art. 16(3)(a)(b); GDPR Art. 18(1)
Personal Data (in case file) | Until purpose is fulfilled | IKS § 17; LED Art. 4(1)(e), Art. 5 | No, except as prescribed by law; IKS § 17 lg 2; § 25 lg 3–4
Pattern Database Entries (anonymous modus operandi) | Indefinite (no personal data) | LED Art. 4(1)(c) | Not applicable; anonymous data, GDPR does not apply
Audit Log (who, when, what) | At least 3 years | IKS § 36; LED Art. 25(2) | Yes, extendable; IKS § 36 lg 5; E-ITS security class
Protocols and Recordings (interrogations, observations) | With case file (10–15 yrs) | KrMS § 146, § 148, § 150 lg 4 | Yes, with case file; VVm § 6 (follows main document)
Public-sector Documents (public information) | 5–50 years (document type) | ArhS § 13, § 7; AvTS § 40 (access restriction) | Yes, if archival value; ArhS § 7, § 8, § 13
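
The fixed-term rows of the retention table above lend themselves to a simple deadline check. The following is a minimal Python sketch under stated assumptions, not the system's actual lifecycle graph: the names `RETENTION_YEARS`, `add_years`, `deletion_due` and `is_expired` and the category keys are hypothetical, and only rows with a fixed number of years are covered. Rows such as "until purpose is fulfilled" or "until criminal record data is deleted" depend on case-specific events and are left out.

```python
from datetime import date

# Hypothetical keys; the year values are the fixed terms from the table above.
RETENTION_YEARS = {
    "criminal_case_file_general": 10,
    "criminal_case_file_first_degree": 15,
    "court_file": 10,
    "audit_log": 3,  # "at least 3 years", treated here as the minimum term
}


def add_years(start: date, years: int) -> date:
    """Same calendar day `years` later; 29 Feb falls back to 28 Feb."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def deletion_due(category: str, start: date) -> date:
    """Earliest date on which the retention term for `category` has run out."""
    return add_years(start, RETENTION_YEARS[category])


def is_expired(category: str, start: date, today: date) -> bool:
    """True once the retention term has elapsed and review or deletion is due."""
    return today >= deletion_due(category, start)


if __name__ == "__main__":
    # Court file: the term is counted from entry into force of the decision.
    print(deletion_due("court_file", date(2020, 3, 1)))
```

In a real deployment the "Exception Possible?" column would have to be checked before anything is deleted, for example by routing expired items with possible archival value to appraisal instead of destruction.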

Acronyms and Legal Sources

Acronym | Full name | In English / what it is
Estonian legislation
PS | Eesti Vabariigi põhiseadus | Constitution of the Republic of Estonia
IKS | Isikuandmete kaitse seadus | Personal Data Protection Act
KrMS | Kriminaalmenetluse seadustik | Code of Criminal Procedure
AvTS | Avaliku teabe seadus | Public Information Act
ArhS | Arhiiviseadus | Archives Act
VVm | Vabariigi Valitsuse määrus | Government of the Republic regulation
E-ITS | Eesti infoturbestandard | Estonian Information Security Standard (replaced ISKE in 2023)
AKI | Andmekaitse Inspektsioon | Estonian Data Protection Inspectorate (supervisory authority)
HMS | Haldusmenetluse seadus | Administrative Procedure Act (duty to give reasons, discretion)
KorS | Korrakaitseseadus | Law Enforcement Act (state supervision, proportionality)
KüTS | Küberturvalisuse seadus | Cybersecurity Act (statutory basis of E-ITS, 24 h incident reporting)
VõrdKS | Võrdse kohtlemise seadus | Equal Treatment Act (non-discrimination, shared burden of proof)
KarS | Karistusseadustik | Penal Code (data-misuse and secrecy offences)
JAS | Julgeolekuasutuste seadus | Security Authorities Act (intelligence-data separation, oversight)
- | Keeleseadus | Language Act (Estonian as the language of administration)
Finnish legislation
PL | Perustuslaki (731/1999) | Constitution of Finland
1054/2018 | Laki henkilötietojen käsittelystä rikosasioissa… | Act on the Processing of Personal Data in Criminal Matters (Finnish LED transposition)
ETL | Esitutkintalaki (805/2011) | Criminal Investigation Act (objectivity principle, ch. 4 § 1)
- | Tiedonhallintalaki (906/2019) | Information Management Act (security standard, ADM governance, board)
- | Hallintolaki (434/2003) | Administrative Procedure Act
- | Tietosuojalaki (1050/2018) | Data Protection Act (general; Tietosuojavaltuutettu = supervisory authority)
- | Arkistolaki (831/1994) | Archives Act (National Archives appraisal)
- | Pakkokeinolaki (806/2011) | Coercive Measures Act
- | Yhdenvertaisuuslaki (1325/2014) | Non-Discrimination Act
Julkri / Katakri | - | Finnish public-administration / national-security information-security criteria
OK | Oikeudenkäymiskaari (4/1734) | Code of Judicial Procedure (evidence in ch. 17, rewritten by 732/2015)
616/2019 | Laki henkilötietojen käsittelystä poliisitoimessa | Act on the Processing of Personal Data in Police Matters (police-data lex specialis)
- | Julkisuuslaki (621/1999) | Act on the Openness of Government Activities (secrecy and classification marks)
- | Tasa-arvolaki (609/1986) | Act on Equality between Women and Men
121/2019 | Laki tiedustelutoiminnan valvonnasta | Act on the Oversight of Intelligence Gathering (Intelligence Ombudsman)
EU and international
LED | Law Enforcement Directive, Directive (EU) 2016/680 | EU data-protection rules for police and criminal justice
GDPR | General Data Protection Regulation, Regulation (EU) 2016/679 | EU general data-protection law
EU AI Act | Regulation (EU) 2024/1689 | EU Artificial Intelligence Act
ECHR | European Convention on Human Rights | Council of Europe human-rights treaty (European Court of Human Rights)
CFR | Charter of Fundamental Rights of the European Union | EU fundamental-rights charter (assessed by the FRIA)
NIS2 | Directive (EU) 2022/2555 | EU network- and information-security directive (incident reporting)
Compliance and technical terms
FRIA | Fundamental Rights Impact Assessment | Deployer assessment under EU AI Act Art. 27
DPIA | Data Protection Impact Assessment | Risk assessment under LED Art. 27 / GDPR Art. 35
DSAR | Data Subject Access Request | An individual's request to access their personal data
RBAC | Role-Based Access Control | Access granted according to user role
XAI | Explainable AI | Model-internal interpretability (vs. process-level explanation)
AES | Advanced Encryption Standard | Symmetric encryption (AES-256, data at rest)
TLS | Transport Layer Security | Encrypted data transmission (TLS 1.3)

Human Stays Responsible

The investigator can override, correct or ignore any AI output. The system advises; the decision stays with a person.

Facts vs Analysis

AI-generated analysis is kept visibly separate from established facts, so a reader always knows which is which.
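
One way to keep the two visibly apart is to tag every statement at the data level and group by that tag when rendering. The following is a minimal Python sketch, assuming a simple in-memory model: the `Kind` enum, the `Statement` class, the `render` function and the example texts are all hypothetical illustrations and do not describe the actual system.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FACT = "FACT"
    AI_INFERENCE = "AI INFERENCE"


@dataclass(frozen=True)
class Statement:
    text: str
    kind: Kind
    source: str  # document, register or model that produced the statement


def render(statements):
    """Render statements grouped by kind so facts and AI analysis never mix."""
    lines = []
    for kind in Kind:  # Enum iteration follows definition order: facts first
        lines.append(f"== {kind.value} ==")
        for s in statements:
            if s.kind is kind:
                lines.append(f"[{kind.value}] {s.text} (source: {s.source})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative, made-up statements.
    items = [
        Statement("Payment pattern resembles earlier cases", Kind.AI_INFERENCE, "local model"),
        Statement("Invoice was paid on 3 May", Kind.FACT, "bank record"),
    ]
    print(render(items))
```

Because the label travels with each statement rather than being added at display time, any export or report built from the same data can carry the same distinction.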


Key Takeaway

The goal is not to automate investigations.

It is to help investigators:

Find information faster
Understand information better
Preserve organisational knowledge
Make more informed decisions

AI should augment investigators, not replace them.


AI Investigator Alternatives

Why a custom air-gapped solution? Evaluating AI Investigator against market alternatives.

Strong · Partial · Weak — favourability for a sovereign agency

Dimension | Palantir Gotham | IBM i2 Analyst | ChatGPT/Claude | AI Investigator
Cost & Licensing
Pricing Model | Per-user annual (€50k+ / analyst) | Perpetual + maint (€25k + 20%) | Metered API (Variable) | Tiered models (€36k infra)
Lock-in Risk | High (Proprietary fmt) | Medium (Some export) | High (Cloud-only) | Low (Open Standards)
Sovereignty & Security
Data Sovereignty | Configurable (On-prem costly) | Full control (On-prem) | None (US Cloud) | Air-gapped (100% Sovereign)
LED Compliance | Possible (Audit needed) | Possible (Manual) | Non-compliant (Data export) | Native (By Design)
Knowledge Management
Tacit Knowledge | No (Explicit only) | No (Visual only) | No (Stateless) | Core Feature (EASCI Framework)
Market Analysis
  • Lock-in: Proprietary formats hold data hostage. AI Investigator uses open W3C standards.
  • vs Palantir: Palantir is for explicit data fusion, not tacit reasoning. Cost-prohibitive for small agencies.
  • vs ChatGPT: Cloud-hosted public LLMs move case data outside the agency, which conflicts with data sovereignty and LED requirements.
Unique Value Proposition
  • Sovereign: Air-gapped & On-premises.
  • Specialized: Built for Tacit Knowledge.
  • Predictable: Fixed hardware cost.
  • Compliant: Automated "Smart Forgetting".

AI Support for Investigations

From technical foundations, to investigative functions, to impact.

01

AI Foundations

How it is technically enabled

  • Processes large volumes of text and audio (incl. transcription)
  • Combines data from multiple sources (documents, emails, interviews)
  • Detects patterns and links across data
  • Handles foreign languages
  • Runs fully inside the agency (secure, no external data sharing)
02

Investigation Functions

What it does for the user

  • Summarises large case files
  • Finds connections (people, events, transactions)
  • Supports hypothesis testing
  • Preserves investigation logic
03

Investigation Impact

The outcome for the case

  • Faster understanding of complex cases
  • Reduced cognitive load
  • Clear, traceable reasoning (audit / court-ready)