The bottom line — most PoCs don’t survive in production

Since 2017, industry has invested heavily in artificial intelligence. The announcements keep coming: predictive maintenance, vision for quality control, copilots for engineering, energy optimization. And yet the figures published by the analyst firms converge on the same troubling conclusion: the majority of AI projects never meet their business objectives. An 85% failure rate is routinely attributed to Gartner; McKinsey, in its annual State of AI report, shows that adoption is progressing but that scaling up remains the exception.

The ratio is even less favorable in industry. OT data is fragmented, controllers stay in production for 20 years, functional safety constraints prohibit closed ML → process loops, and the cost of a false positive on a production line is measured in hours of downtime.

Yet the gap does not come from an algorithm problem. The models exist, they are mature, they are accessible. Six structural challenges determine the transition from PoC to production. This article names them — without dressing them up.


Challenge 1 — Industrial data is rarely ready

The myth: “we already have 20 years of SCADA data, all we need is to train a model on it.”

The reality every data scientist hits on day one:

Friction | How it shows up
Scattered sources | 4 to 7 different historians per site (OSIsoft PI, AVEVA Historian, Wonderware, Ignition, ad hoc CSV exports, shared Excel files)
No labels | Very few annotated datasets (“this failure was of such-and-such type”). Incident reports are free text.
Destructive compression | Historian compression algorithms (PI Compression, AVEVA Swinging Door) discard points deemed “redundant.” For a vibration analysis, the information was lost before you even went looking for it.
Inconsistent sampling | One tag at 1 s, another at 5 min, a third event-driven. Without proper resampling, correlation is impossible.
Missing metadata | TIC_2301 has no meaning without the P&ID. Data scientists have to reconstruct the physical unit, the range, the criticality.
Drifting sensors | A 10-year-old sensor is already sending values biased by a few %. A model trained on this data reproduces the bias.
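
The sampling mismatch described above can be illustrated with a last-observation-carried-forward alignment onto a common time grid. This is a minimal sketch, not a reference implementation: timestamps are assumed to be in seconds, the two tags are synthetic, and `align_locf` and its `max_age_s` staleness limit are hypothetical names introduced for the example.

```python
from bisect import bisect_right

def align_locf(samples, grid, max_age_s):
    """Align irregular (timestamp, value) samples onto `grid` timestamps.

    Uses last observation carried forward; a value older than `max_age_s`
    is reported as None rather than silently reused.
    """
    times = [t for t, _ in samples]          # samples must be sorted by time
    aligned = []
    for g in grid:
        i = bisect_right(times, g) - 1       # index of last sample at or before g
        if i >= 0 and g - times[i] <= max_age_s:
            aligned.append(samples[i][1])
        else:
            aligned.append(None)
    return aligned

# Hypothetical tags: one sampled every second, one every 5 minutes.
fast_tag = [(t, 20.0 + 0.1 * t) for t in range(0, 601)]
slow_tag = [(0, 3.2), (300, 3.4), (600, 3.1)]

grid = list(range(0, 601, 60))               # common 1-minute grid
fast_on_grid = align_locf(fast_tag, grid, max_age_s=5)
slow_on_grid = align_locf(slow_tag, grid, max_age_s=120)
```

Reporting stale values as `None` instead of repeating them keeps the gaps visible, so the decision to interpolate, drop or impute stays explicit and auditable.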

Hence the widely cited rule of thumb from public lessons-learned reports: 60 to 80% of an industrial AI project’s time goes into preparing the data. Before any model, you need:

  • A unified data catalog (Microsoft Purview, Apache Atlas, Collibra)
  • A structured information model (OPC UA Information Models, IEC 61850 in energy, ISA-95 in manufacturing)
  • A modern historian or time-series database (InfluxDB, TimescaleDB, AVEVA PI System in uncompressed mode, or a cloud time-series store such as Azure Data Explorer)
  • Governance: who validates quality, who labels incidents, who removes outliers, who versions the datasets

Without these foundations, the best model drifts within three months.


Challenge 2 — Brownfield and legacy: two clocks running at different speeds

The life cycle of an industrial site: 20 to 30 years for controllers, 30 to 50 years for valves, pumps and structures. The AI cycle: 6 to 18 months before a model goes stale.

This dissonance creates three constraints that many PoCs ignore:

  1. AI does not control a sensitive process directly. A model that proposes a setpoint must go through the operator, then — if it has a safety impact — through an IEC 61511-validated safety instrumented function. No closed AI → SIS loop without a complete review of the SIL study. See the IEC 61511 reference page for details.

  2. PLCs in production do not natively talk to ML models. The bridge goes through OPC UA, MQTT Sparkplug or an edge gateway. Edge deployment becomes a sub-project in its own right (ONNX/TF Lite containerization, dependency management, remote updates, supervision).

  3. Vendor contracts limit modifications. Many OEM warranties (Siemens, Emerson, Honeywell, ABB) provide for loss of support if you connect “3rd party software” to their DCS without agreement. To be negotiated as early as the tender — often forgotten, a source of blockage at commissioning.
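
The first constraint, a model that only proposes a setpoint, can be sketched as a small advisory gate. This is an illustration under stated assumptions: `SetpointLimits`, `review_proposal` and the numeric limits are hypothetical, and nothing here is a safety function or a substitute for the SIS review described above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SetpointLimits:
    low: float        # engineering low limit for the setpoint
    high: float       # engineering high limit for the setpoint
    max_step: float   # largest change accepted in one proposal

def review_proposal(current, proposed, limits, operator_approved):
    """Return the setpoint to apply, or the current one if the proposal is rejected.

    The model output is advisory: it is clamped to the limits, rate-limited,
    and applied only when an operator has explicitly approved it.
    """
    if not operator_approved:
        return current
    clamped = min(max(proposed, limits.low), limits.high)
    step = max(-limits.max_step, min(limits.max_step, clamped - current))
    return current + step

limits = SetpointLimits(low=60.0, high=90.0, max_step=2.0)
rejected = review_proposal(75.0, 95.0, limits, operator_approved=False)
accepted = review_proposal(75.0, 95.0, limits, operator_approved=True)
```

In this sketch an out-of-range proposal of 95.0 is first clamped to 90.0, then limited to a 2.0 step from the current value, and ignored entirely without operator approval.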

For more on the architecture that lets you expose data to an ML model without breaking the cyber posture: Exchanging data between 2 OT controllers.


Challenge 3 — Physics, pure ML, hybrid: the choice that changes everything

The fundamental debate. And the default choice “let’s take a deep learning model” is rarely the right one.

Approach | Advantages | Limitations | When to use it
Physical model (first-principles) | Explainable, robust out-of-distribution, works with little data, auditable | Complex calibration, requires deep process expertise, slow to develop | Well-modeled processes (thermodynamics, heat transfer, fluid dynamics): distillation columns, heat exchangers, reactors
Pure machine learning | Discovers non-obvious patterns, scales with data volume, fast to prototype | Black box, drifts when operating conditions change, weak out-of-distribution, hard to audit | Vision (quality, defect detection), acoustic patterns (rotating equipment), industrial NLP (procedure extraction, spec generation), predictions on large historical streams
Hybrid physics-informed ML (PINN, neuro-symbolic) | Combines physical robustness + ML flexibility, requires less data, more explainable | More complex to design, few packaged tools, rare expertise | Processes that are well modeled physically but with non-linear drifts or poorly captured phenomena (fouling, catalyst aging, mechanical fatigue)

In 2026, hybrid physics-informed ML is emerging as the dominant pattern for continuous processes (fine chemicals, petrochemicals, energy). Pure ML models remain dominant for vision and industrial NLP — use cases where physics provides no analytical equation.
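
One simple flavor of the hybrid approach is residual modeling: keep the physical equation and let a small data-driven model learn only what the physics misses. The sketch below is not a PINN. It assumes a heat exchanger described by a plain energy balance, synthetic data, and a linear residual on hours since cleaning as a stand-in for fouling. All function names and numbers are illustrative.

```python
def physics_outlet_temp(t_in, duty_kw, flow_kg_s, cp_kj_kg_k=4.18):
    """First-principles energy balance: T_out = T_in + Q / (m * cp)."""
    return t_in + duty_kw / (flow_kg_s * cp_kj_kg_k)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (the data-driven part, kept tiny)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

# Synthetic history: the measured outlet temperature falls behind the
# physical prediction as fouling builds up (0.005 K per operating hour).
hours = [0, 200, 400, 600, 800]
measured = [physics_outlet_temp(20.0, 418.0, 2.0) - 0.005 * h for h in hours]
residuals = [m - physics_outlet_temp(20.0, 418.0, 2.0) for m in measured]
a, b = fit_line(hours, residuals)

def hybrid_outlet_temp(t_in, duty_kw, flow_kg_s, hours_since_cleaning):
    """Physical prediction plus the learned residual correction."""
    return physics_outlet_temp(t_in, duty_kw, flow_kg_s) + a + b * hours_since_cleaning
```

Because the residual model only has to capture the drift, it needs far fewer data points than a model that must rediscover the energy balance from scratch.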

The classic mistake: applying deep learning to a regulation problem when a well-tuned PID does better for 1/100th of the cost.
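
For a sense of how small the conventional baseline is, here is a minimal PI loop (the derivative term is omitted) simulated on a first-order process. The process model, time constant and gains are assumptions chosen for illustration, not tuning advice for a real loop.

```python
def simulate_pi(setpoint, kp, ki, tau_s=10.0, dt_s=1.0, steps=200):
    """Simulate a PI controller on a first-order process (tau * dy/dt = -y + u)."""
    y, integral = 0.0, 0.0
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt_s
        u = kp * error + ki * integral      # PI law (no derivative term here)
        y += dt_s / tau_s * (-y + u)        # explicit Euler step of the process
    return y

final_value = simulate_pi(setpoint=50.0, kp=2.0, ki=0.5)
```

With these assumed gains the simulated output settles on the setpoint, which is the benchmark any learned controller would have to beat before its extra cost is justified.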


Challenge 4 — Industrial MLOps: the Achilles’ heel

A model running in the lab is 10% of the way there. The remaining 90%: keeping it alive in production.

The industrial MLOps cycle:

   Raw data → Preparation → Training → Validation → Edge deployment

   Re-training ← Drift detection ← Monitoring ← Real-time inference
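
The monitoring and drift-detection leg of this cycle can be approximated with a Population Stability Index (PSI) computed between a reference window and live data. This is a minimal sketch of one common drift statistic, not what any particular monitoring product implements. The bin count, the `eps` floor and the often-quoted 0.25 alert level are conventions assumed here.

```python
import math

def psi(reference, live, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0        # fall back to 1.0 for constant data

    def histogram(values):
        counts = [0] * n_bins
        for v in values:
            i = int((v - lo) / width)
            counts[min(max(i, 0), n_bins - 1)] += 1   # clip out-of-range values
        return [max(c / len(values), eps) for c in counts]

    ref_p, live_p = histogram(reference), histogram(live)
    return sum((q - p) * math.log(q / p) for p, q in zip(ref_p, live_p))

# Synthetic example: the live signal is the reference shifted by +3 units.
reference = [i / 100 for i in range(1000)]
shifted = [x + 3.0 for x in reference]

psi_same = psi(reference, reference)
psi_shifted = psi(reference, shifted)
drift_alert = psi_shifted > 0.25             # assumed rule-of-thumb threshold
```

A statistic like this only looks at input distributions. It flags that conditions have changed, not whether the model's predictions have actually degraded.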

Specifically industrial points that most cloud platforms do not cover:

  • Continuous drift detection. Operating conditions change: new raw material, change of season, equipment aging, a modified setpoint. Without automatic drift detection, the model degrades silently. Common tools: Arize AI, Fiddler, Evidently AI, WhyLabs.
  • Edge deployment. Model compiled for a specific target (ONNX runtime, TensorFlow Lite, OpenVINO). CPU/GPU/TPU chosen according to the required latency: real-time vision = 10 to 50 ms per image, energy optimization = 1 to 15 min.
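  • Joint versioning: model + training data + preparation code, everything must be versioned together (DVC, MLflow, Git LFS). Being able to reproduce a model 18 months later is, in practice, what an audit under the EU AI Act’s technical documentation and record-keeping obligations (Articles 11 and 12) is likely to ask for.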
  • Automatic rollback. If a new model degrades the KPIs measured in production, revert to the previous one — without human intervention. Very few industrial platforms handle this as standard.
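
The automatic rollback point can be reduced to a decision rule evaluated on production KPIs. The sketch below assumes a KPI where higher is better, a 2% tolerance and a minimum of 30 samples before trusting the candidate. The function name and all thresholds are hypothetical.

```python
from statistics import mean

def choose_active_model(previous_kpi, candidate_kpi, tolerance=0.02, min_samples=30):
    """Return 'candidate' or 'previous' from KPI samples where higher is better."""
    if len(candidate_kpi) < min_samples:
        return "previous"                      # not enough evidence yet
    if mean(candidate_kpi) < mean(previous_kpi) * (1 - tolerance):
        return "previous"                      # degradation beyond tolerance: roll back
    return "candidate"

keep_new = choose_active_model([0.90] * 50, [0.91] * 50)
roll_back = choose_active_model([0.90] * 50, [0.80] * 50)
too_early = choose_active_model([0.90] * 50, [0.95] * 5)
```

A real deployment would add a statistical test and a record of every switch, but the principle is the same: the rule is written down and versioned before the new model goes live.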

Common industrial edge platforms:

  • Siemens Industrial Edge + Industrial Edge Management (orchestration)
  • NVIDIA EGX / IGX (vision and high-performance inference, up to real-time deep learning)
  • AWS Panorama (on-site vision, AWS integration)
  • Microsoft Azure Stack Edge (general-purpose, Azure ML integration)
  • Phoenix Contact PLCnext (Docker containers running on the controller itself)

Challenge 5 — The new cyber attack surface

An AI model is an asset. Like any asset deployed in production, it is attackable. Five specific vectors to know:

  1. Model poisoning. An attacker who can inject data into the training pipeline can bias the model. Theoretical example: poisoning vibration data so that a predictive maintenance model ignores drift on a critical piece of equipment. Countermeasure: cryptographically sign the datasets and trace the ingestions.

  2. Adversarial examples. Subtly modifying an input image or signal to fool the model. Eykholt et al. demonstrated in 2018 (paper “Robust Physical-World Attacks on Deep Learning Visual Classification”) that a sticker on a road sign is enough to make a vision system classify it as something else. Applicable to industrial vision systems.

  3. ML supply chain. Using a pre-trained open-source model containing a backdoor. Anthropic published a study in January 2024 (“Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training”) showing that a model trained to behave normally except on a specific trigger can keep that backdoor through standard safety training (supervised fine-tuning, reinforcement learning, adversarial training). Countermeasure: source models only from certified and signed registries.

  4. Model extraction / inversion. Querying a model exposed in production allows, with enough requests, reconstructing its behavior (cloning) or even elements of the training dataset (privacy attacks). Countermeasure: rate limiting, monitoring of query patterns, isolating the model in the industrial DMZ.

  5. Prompt injection (for industrial LLMs / copilots). Injecting instructions into a user prompt that modify the model’s behavior. Particularly risky for copilots that generate PLC code: a malicious supplier could inject instructions into documentation to produce vulnerable code. Countermeasure: isolate the user context, filter inputs, manually validate any generated code before deployment.
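
The dataset-integrity countermeasure from the first vector can be sketched with the standard library. Note the assumption: HMAC-SHA256 with a shared secret is used here as a simple stand-in for a real signature scheme. A production pipeline would more likely use asymmetric signatures and a managed key store. The key and data below are placeholders.

```python
import hashlib
import hmac

def sign_dataset(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for a dataset snapshot."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify_dataset(data: bytes, key: bytes, tag: str) -> bool:
    """Constant-time check that the snapshot still matches its tag."""
    return hmac.compare_digest(sign_dataset(data, key), tag)

key = b"demo-key-do-not-use"   # hypothetical; a real key belongs in an HSM or vault
snapshot = b"timestamp,vibration_rms\n0,1.02\n1,1.05\n"
tag = sign_dataset(snapshot, key)

# Simulated poisoning: one vibration value is altered after signing.
tampered = snapshot.replace(b"1.05", b"0.05")
```

Checking the tag at every ingestion step means a modified snapshot is rejected before it reaches training, and the list of verified tags doubles as a trace of what was ingested.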

Applicable IEC 62443 framework: these issues are not specifically addressed in the current 4-1 / 4-2, but the 2026 revision of the series is beginning to incorporate them. In the meantime, treat the ML model as a standard IACS component and apply SL-C(2) or SL-C(3) to it depending on the criticality of the process it informs.


Challenge 6 — An ROI that is hard to quantify

The classic trap: you deploy AI, but you no longer know how to measure what it delivers.

Three sources of measurement difficulty:

  1. Counterfactual comparison. You need to know what would have happened without AI. Across a fleet of equipment, you can do A/B testing (one subset equipped, another not) but this requires a long observation period, often 12 to 18 months.

  2. Attribution. A 2% availability gain may come from the AI, or from a change in operator procedure, or from a new sensor installed at the same time, or from a particular season. Isolating the AI contribution is rare and requires a statistical discipline that few industrial managements have in-house.

  3. Hidden costs. The model is the visible tip of the iceberg. MLOps, cloud infrastructure, updates, retraining, operator training, dataset maintenance, monitoring probe support: these line items often represent 3 to 5 times the initial cost of developing the model. Many business cases collapse the moment these costs appear in year 2.
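
The counterfactual and attribution problems above are often approached with a difference-in-differences estimate: compare the change on equipped lines with the change on non-equipped lines over the same period. This is a minimal sketch with made-up availability figures. It relies on the assumption that both groups would have followed parallel trends without the AI.

```python
from statistics import mean

def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Estimate the AI effect as (change in treated) minus (change in control)."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change

# Hypothetical monthly availability (%) for equipped and non-equipped lines.
effect = diff_in_diff(
    treated_before=[91.0, 92.0, 90.0],
    treated_after=[94.0, 95.0, 93.0],
    control_before=[90.0, 91.0, 89.0],
    control_after=[91.0, 92.0, 90.0],
)
```

In this example the equipped lines gain 3 points but the control lines gain 1 point on their own, so only 2 points are attributed to the AI.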

Business metrics to instrument before launch (6 to 12 months baseline):

  • Availability / OEE (Overall Equipment Effectiveness)
  • Quality scrap rate
  • Specific energy consumption (kWh / unit produced)
  • Maintenance cost and MTBF
  • Cycle time
  • Number of unplanned ESD events
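
As an example of instrumenting one of these metrics, OEE is conventionally computed as availability × performance × quality. The sketch below uses that standard decomposition with illustrative shift figures. The parameter names are assumptions, and a real site would agree on its own definitions of planned time and ideal cycle time.

```python
def oee(planned_time_min, downtime_min, ideal_cycle_time_min, total_count, good_count):
    """OEE = availability x performance x quality."""
    run_time = planned_time_min - downtime_min
    availability = run_time / planned_time_min
    performance = (ideal_cycle_time_min * total_count) / run_time
    quality = good_count / total_count
    return availability * performance * quality

# Hypothetical 8-hour shift: 60 min of downtime, 378 parts made, 359 good.
shift_oee = oee(planned_time_min=480, downtime_min=60, ideal_cycle_time_min=1.0,
                total_count=378, good_count=359)
```

Freezing this calculation, and the data sources behind each input, during the baseline period is what makes a later before/after comparison defensible.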

Without a rigorous baseline documented before the project, the ROI will always be contestable — either by the person who wants to prove success, or by the one who wants to prove failure.


EU AI Act — timeline and industrial applicability

Regulation (EU) 2024/1689 is the first comprehensive regulatory framework for AI. Adopted in June 2024, it entered into force on 1 August 2024, with progressive application over 36 months.

DeadlineWhat applies
2 February 2025Prohibited practices (cognitive manipulation, social scoring, etc.) — Article 5. Obligation of AI literacy for the relevant staff — Article 4.
2 August 2025Obligations for GPAI (General-Purpose AI) models, governance, national authorities, applicable penalties — Chapters V, VII, XII.
2 August 2026General application of the regulation. High-risk AI systems (Annex III): full obligations.
2 August 2027AI systems integrated as a safety component in products already subject to harmonization (Annex I: machinery, toys, medical devices, etc.).

Four risk categories:

  1. Unacceptable risk → prohibited. Social scoring, behavioral manipulation exploiting vulnerabilities, real-time remote biometric identification in publicly accessible spaces by law enforcement (with limited exceptions), etc.
  2. High risk → strong obligations: risk management (Article 9), data governance (Article 10), technical documentation (Article 11), record-keeping (Article 12), transparency (Article 13), human oversight (Article 14), accuracy/robustness/cybersecurity (Article 15).
  3. Limited risk → transparency obligations (inform the user that they are interacting with an AI).
  4. Minimal risk → no obligation (games, spam filters, etc.).

Industrial cases that fall under “high risk” (Annex III or Annex I):

  • AI as a safety component of a machine (Machinery Regulation (EU) 2023/1230 → AI Act Annex I)
  • AI used as a safety component in the management and operation of critical infrastructure (critical digital infrastructure, road traffic, supply of water, gas, heating and electricity: Annex III §2)
  • AI in medical devices (Regulation (EU) 2017/745 MDR → AI Act Annex I)
  • AI in the safety of products subject to the ATEX Directive, the Pressure Equipment Directive, etc.

Penalties (Article 99):

  • Prohibited practices: up to €35M or 7% of worldwide turnover
  • Breach of obligations by operators of high-risk systems: up to €15M or 3% of worldwide turnover
  • Incorrect information to authorities: up to €7.5M or 1% of worldwide turnover
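
To make the “fixed amount or share of turnover” ceilings concrete, here is a small arithmetic sketch. It assumes the “whichever is higher” reading that Article 99 applies to undertakings other than SMEs, and uses only the first two tiers listed above. The turnover figures are invented.

```python
def fine_ceiling_eur(fixed_cap_eur, turnover_share, worldwide_turnover_eur):
    """Upper bound of a fine: the higher of the fixed cap and the turnover share."""
    return max(fixed_cap_eur, turnover_share * worldwide_turnover_eur)

# Prohibited practice, hypothetical group with EUR 2 billion turnover.
large_group = fine_ceiling_eur(35e6, 0.07, 2e9)

# High-risk obligations breach, hypothetical company with EUR 100 million turnover.
mid_size = fine_ceiling_eur(15e6, 0.03, 100e6)
```

For the large group the turnover share dominates (about €140M), while for the mid-size company the fixed cap of €15M is the binding figure.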

The regulation applies to providers and to users (“deployers”) of AI systems. An industrial site that uses AI supplied by a third party bears part of the obligations — notably human oversight and log retention.


ISO/IEC 42001 and NIST AI RMF — the management frameworks

Two complementary frameworks established themselves in 2023-2024.

Aspect | ISO/IEC 42001:2023 | NIST AI RMF 1.0
Type | Management system standard, certifiable by a third party | Framework / guide, non-certifiable
Published | December 2023 | January 2023
Origin | ISO/IEC JTC 1/SC 42 | NIST (USA), Department of Commerce
Structure | Annex A: a set of controls structured by objective (A.2 to A.10) | 4 functions: Govern, Map, Measure, Manage
Compatibility with existing ISO | Strong: ISO High-Level Structure, integrable with ISO 9001 / 27001 / 14001 | Independent but designed to integrate with existing practices
EU AI Act compatibility | Very strong: candidate for harmonized standard status under the AI Act | Reference in the United States and for international providers
Typical use | Organizational certification, third-party audit (the ISO 27001 equivalent for AI) | Self-assessment, gap analysis, due diligence

In practice in 2026, an industrial organization that develops or integrates AI should aim for ISO/IEC 42001 + EU AI Act alignment. NIST AI RMF remains useful as an internal analysis grid and for transatlantic operations.


A few real (public) cases

So as not to invent anything, here are a few cases documented publicly by their authors.

Successes shared publicly

  • BMW — AIQX. Since 2019, BMW has deployed “AIQX” (Artificial Intelligence Quality Next) in several plants, notably Regensburg and Dingolfing. An automated vision system for the quality control of parts. Communicated and presented by BMW as a flagship industrial case.
  • Siemens Industrial Copilot. First announced with Microsoft in late 2023 and presented by Siemens as one of the first generative assistants for TIA Portal code generation. Public demonstrations in 2024-2025, field deployments still limited as of mid-2026; the published operational feedback is anecdotal.
  • Anheuser-Busch InBev. The brewer communicates publicly about the use of ML predictive maintenance on its bottling lines, in partnership with Microsoft. Quantified operational details are sparsely published.

High-profile public failures

  • GE Predix (2014-2020). An ambition for a global IIoT/AI platform, presented as “the operating system of industry.” GE announced several billion in write-downs on its GE Digital division over the period. Gradual dismantling: Predix was divested, GE Digital refocused, several activities sold off. Cause publicly acknowledged by GE: the difficulty of productizing a generic platform sellable beyond GE’s installed base.
  • IBM Watson Health. Several billion invested over a decade via acquisitions (Phytel, Explorys, Truven Health Analytics, Merge Healthcare). In January 2022, IBM announced the divestiture of its Watson Health assets to Francisco Partners. Cause publicly acknowledged: the gap between commercial expectations and the actual performance of the models on clinical use cases.

Cross-cutting lesson from the public failures: productizing a generic AI platform is much harder than the demonstrations suggest. Industrial successes are built on narrow, well-defined use cases, with measurable ROI, and an integration effort owned over 18-36 months.


The 7 classic pitfalls

  1. Starting with the technology instead of the business use case. “Let’s take Azure AI” before having defined the KPI to improve and the baseline to measure. Always reverse it: KPI → baseline → use case → data → model → tech.

  2. Underestimating data preparation. Allocate 60% of the project budget to data, not 60% to the model. Data engineering costs more than data science.

  3. No backtest on historical data before a live PoC. Before mobilizing operational teams, demonstrate that the model would have behaved correctly over 12-24 months of past data (backtesting). A model that fails the backtest is very unlikely to pass live.

  4. No quantitative go/no-go criteria defined at launch. Without objective thresholds (for example: no more than 5% false positives over 3 consecutive months), a PoC that doesn’t work is continued for political reasons and ends up exhausting the budget.

  5. Coupling AI and process without safety validation. An AI model does not control a safety instrumented function. Period. See IEC 61511-1, clause 11.2 (general SIS design requirements): anything that can negatively affect a safety instrumented function has to be treated as part of the SIS.

  6. No identified owner of the model after go-live. Who maintains it? Who retrains it? Who decides on a withdrawal? Who audits it under the EU AI Act? To be named at the design stage, not after deployment.

  7. Marketing communication before field feedback. Publicly announcing an AI success without 12 months of validation in production is a reputational risk. The successes that last are the ones that were tested quietly first.
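
Pitfalls 3 and 4 can be combined into a single scripted check: compute the false positive rate on backtest or pilot data, then apply the go/no-go rule agreed at launch. The sketch below assumes one boolean alert and one boolean failure flag per period, and reuses the illustrative threshold of 5% over 3 consecutive months. The function names are hypothetical.

```python
def false_positive_rate(alerts, failures):
    """FP / (FP + TN): share of healthy periods in which the model raised an alert."""
    healthy = [a for a, f in zip(alerts, failures) if not f]
    return sum(healthy) / len(healthy) if healthy else 0.0

def go_no_go(monthly_fpr, threshold=0.05, consecutive=3):
    """'go' only if the last `consecutive` months all stay at or below the threshold."""
    recent = monthly_fpr[-consecutive:]
    if len(recent) == consecutive and all(r <= threshold for r in recent):
        return "go"
    return "no-go"

# Hypothetical backtest: four periods, one real failure, one spurious alert.
fpr = false_positive_rate(alerts=[True, False, False, False],
                          failures=[False, False, False, True])

decision = go_no_go([0.08, 0.04, 0.03, 0.05])
```

Writing the rule as code before the PoC starts removes the temptation to renegotiate the threshold once the results are in.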


Going further

One last thing. AI is not an industrial revolution: it is a new layer of tools, deployed on top of existing layers (process, automation, functional safety, cybersecurity) that it does not replace. The sites that succeed with their AI are the ones that first master these underlying layers. The others end up in the 85% failure statistic, not because AI doesn’t work, but because the foundations weren’t ready.