The Vendor Landscape: Who Sells What, and What Is Real

📍 Where we are: Part VII · ML/AI in Industry Today — Chapter 25. The previous chapter weighed generative AI and LLMs and found copilots (AI that assists a human who stays in control) useful and agents (AI that acts autonomously) oversold. This chapter zooms all the way out to the market that sells every technique in this book, and asks the only question a buyer should ask: of all this, what is actually real?

Twenty-four chapters have walked the bioprocess spine through the learning lens, naming algorithms, datasets, and a few named deployments along the way. But a process scientist does not buy an algorithm; they buy a product — a license, a support contract, a validation package, a vendor on the other end of an audit. This chapter is the buyer's map. It lists who sells what across the biomanufacturing software market, sorts each offering by the same honest maturity and evidence tiers this book has used throughout, and draws the single most important line on the whole map: the line between what is demonstrated in routine GMP (Good Manufacturing Practice — the binding quality regulations all drug manufacturing runs under) production and what is agentic marketing running ahead of the demo.

The market is loud right now. Every vendor has an "AI ecosystem," every press release has a percentage, and "agentic" has become the word "cloud" was a decade ago. Underneath the noise, the production-grade reality is narrower and older than the marketing: multivariate monitoring, spectroscopic soft sensing, computer-vision inspection, and human-in-the-loop documentation — most of it built on math that predates the deep-learning era. The corrections this book has repeated matter most here, where they are easiest to get wrong on a slide.

The simple version

Buying biomanufacturing AI is like buying a self-driving car in 2026. The brochure shows a car that drives itself anywhere; the fine print says "driver-assist features, hands on the wheel." The honest buyer learns to read past the headline to the operational design domain — where, exactly, does this thing actually work unattended, and where is a human still steering? This chapter teaches that reading for the bioprocess software market: which products genuinely run in a GMP plant making release decisions, and which are a very good demo with a roadmap attached.

What this chapter covers

The four evidence tiers and three maturity markers, applied to a market instead of a paper
The MVDA/PAT (multivariate data analysis / process analytical technology) incumbents (Sartorius, AspenTech/Emerson) — the genuinely production-grade core
The mechanistic and hybrid specialists (Cytiva GoSilico, DataHow, Yokogawa/Insilico), and the three attributions everyone gets wrong
The GxP-AI SaaS (good-practice-regulated AI as software-as-a-service) and data-layer players (Aizon, TetraScience, Ganymede)
The automation and MES (manufacturing execution system) incumbents (Korber, Siemens, Rockwell, Emerson, AVEVA) onto which ML is layered, not native
The consumables-and-intelligence houses (MilliporeSigma/Merck, Thermo Fisher), the CDS/QC (chromatography-data-system / quality-control) tools (Waters), and the validation/quality stack (ValGenesis, Veeva)
The consolidation map — who bought whom — and why it changes the attribution on a slide
How "agentic" marketing outran demonstrated production, and what the Purolea warning letter did to that gap
The open-source example suite as the reproducible counterpoint to every proprietary box

How to read a vendor claim: tiers, maturity, and the headline number

Every claim in this chapter carries two labels, the same two the rest of the book uses. The evidence tier says how the claim was established: peer-reviewed-independent (a journal, authors with no commercial stake), peer-reviewed-self-authored (a journal, but the vendor or customer is a co-author), vendor-self-reported (a product page, white paper, or conference slide), or press-release-only (an announcement with no methods behind it). The maturity marker says how far the thing actually got: (production) deployed in GMP or commercial manufacturing, (pilot) demonstrated at scale but not in routine release, (research) academic or early-stage.

The two labels are independent, and the gap between them is where buyers lose money. A vendor can have a genuinely (production) product whose flagship number is vendor-self-reported — the deployment is real, the percentage is marketing. The discipline this chapter enforces is to never let a maturity marker borrow credibility from an evidence tier, or vice versa. When a vendor page says a model "cut experiments by 80%," that is a vendor-self-reported figure regardless of how production-grade the platform is [1]. When a peer-reviewed paper co-authored by the vendor and its customer reports a 33% accuracy gain, that is peer-reviewed-self-authored — stronger than a product page, weaker than an independent replication, and the figure you should quote in preference to the vendor page's own larger number [2].

There is a third, quieter test that catches more bad slides than either label: the claim's denominator and counterfactual. "80% faster validation" is faster than what baseline, measured over what scope, on whose process? A figure with no stated comparator is not a weak measurement — it is not a measurement at all, and it should be read as a marketing target rather than a result. The same goes for the conflation of capability with deployment: a vendor that "supports" continuous learning is not the same as a vendor that runs a continuously-learning model in a GMP-critical loop, and the draft EU/PIC/S GMP Annex 22 (a forthcoming European and international AI-in-manufacturing rule, detailed below) makes that distinction legally load-bearing. When you cannot find the denominator, the comparator, or the scope, treat the number the way a quality unit treats an undocumented result: as a hypothesis awaiting evidence, not as evidence.
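The two labels and the denominator test can be written down as a small record, which makes the independence rule hard to break by accident. The sketch below is illustrative only: the class names, the numeric tier ranking, and the two example claims (built from the figures quoted above, with a hypothetical maturity on the first) are assumptions of this example, not part of any vendor's or regulator's scheme.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EvidenceTier(Enum):
    # Higher value = stronger evidence. The ranking numbers are an assumption.
    PRESS_RELEASE_ONLY = 1
    VENDOR_SELF_REPORTED = 2
    PEER_REVIEWED_SELF_AUTHORED = 3
    PEER_REVIEWED_INDEPENDENT = 4


class Maturity(Enum):
    PRODUCTION = "production"
    PILOT = "pilot"
    RESEARCH = "research"


@dataclass(frozen=True)
class VendorClaim:
    text: str
    tier: EvidenceTier                 # how the claim was established
    maturity: Maturity                 # how far the thing actually got
    comparator: Optional[str] = None   # better than what baseline?
    scope: Optional[str] = None        # measured over what, on whose process?

    def is_measurement(self) -> bool:
        """The denominator test: no comparator or no scope means no measurement."""
        return bool(self.comparator) and bool(self.scope)


def prefer(a: VendorClaim, b: VendorClaim) -> VendorClaim:
    """Choose which figure to quote by evidence tier alone, never by maturity."""
    return a if a.tier.value >= b.tier.value else b


# Illustrative claims only; the maturity on the first one is hypothetical.
page_claim = VendorClaim("cut experiments by 80%",
                         EvidenceTier.VENDOR_SELF_REPORTED, Maturity.PRODUCTION)
paper_claim = VendorClaim("33% accuracy gain",
                          EvidenceTier.PEER_REVIEWED_SELF_AUTHORED, Maturity.PILOT,
                          comparator="black-box model",
                          scope="vendor and customer co-authored study")

quoted = prefer(page_claim, paper_claim)
```

Note that `prefer` never looks at `maturity`: a production-grade platform does not upgrade a product-page percentage, which is the borrowing of credibility the chapter warns against.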

Evidence

There is, as of mid-2026, almost no peer-reviewed-independent evidence for any commercial bioprocess-AI efficiency headline. The strongest independent signal is structural, not numerical. The 7th ISPE (International Society for Pharmaceutical Engineering) Pharma 4.0 survey (Pharma 4.0 being the industry's name for the digital, data-driven factory) found AI/ML to be the technology with the most pilot projects and the fewest scaled implementations. It trails big-data analytics, advanced analytics, robotic process automation, GxP cloud, and IIoT (the Industrial Internet of Things: networked sensors and equipment on the plant floor), all of which are already more mature in pharma [3]. Read every vendor headline in this chapter against that backdrop: the market is selling the end-state of a journey most plants have not finished.

The MVDA and PAT incumbents: the production-grade core

If you want to know what biomanufacturing AI looks like when it actually ships, look at the oldest products in the market. Multivariate statistical process monitoring is the one technique that is unambiguously (production) across the industry, and it has been for two decades. It runs PCA and PLS (the dimensionality-reduction and regression methods the analytics chapter builds) over batch trajectories, scores each batch by Hotelling's T-squared and squared prediction error (two distance measures that flag when a batch drifts away from the normal cluster), and diagnoses the drift with contribution plots (which input variables drove it). The book has already built its open-source core in Book 3's analytics chapter; the commercial incumbents are productized, validated wrappers around the same math. The reason MVDA (multivariate data analysis) is the market's only true production-wide AI is not that the algorithm is special but that the regulatory path is paved: multivariate models are static and deterministic once fitted, their outputs (a T-squared distance, a contribution plot) are inspectable and reproducible, and the EMA and FDA (the European and US drug regulators) already recognize the approach as a regulatory pathway for Real-Time Release Testing (RTRT: releasing a batch on in-process model predictions instead of waiting for end-of-line lab tests). (As the QC-and-release chapter is careful to note, that recognition is of the method and the path. Fully approved closed-loop RTRT, in which the model directly releases the batch with no human in the loop, remains in routine GMP a small-molecule reality, e.g. Janssen's Prezista (a conventional chemically-synthesized drug, simpler to characterize than a biologic). It has not yet been demonstrated for a biologic critical quality attribute, or CQA: a measurable property like purity or potency that must stay in spec.) That combination of old math, locked model, and recognized path is exactly the profile the draft Annex 22 will demand of any new AI that wants to touch a critical decision.
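To make the "old math" concrete, here is a minimal sketch of the monitoring statistics, in plain Python with no dependencies. It is not the open suite's mspc module and not any vendor's implementation. It assumes a one-component PCA fitted by power iteration, three made-up process variables, and fully synthetic data. The function names `fit_pca1` and `score` are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_pca1(X, iters=200):
    """Fit a one-component PCA on autoscaled data using power iteration."""
    n, p = len(X), len(X[0])
    mu = [mean(row[j] for row in X) for j in range(p)]
    sd = [stdev(row[j] for row in X) for j in range(p)]
    Z = [[(row[j] - mu[j]) / sd[j] for j in range(p)] for row in X]
    # Covariance of the autoscaled data (equals the correlation matrix).
    C = [[sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(p)] for a in range(p)]
    v = [1.0 / math.sqrt(p)] * p  # start vector; fine for positively correlated data
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    eigenvalue = sum(v[a] * C[a][b] * v[b] for a in range(p) for b in range(p))
    return {"mu": mu, "sd": sd, "loading": v, "eigenvalue": eigenvalue}


def score(model, x):
    """Return (T2, SPE, per-variable SPE contributions) for one observation."""
    z = [(xi - m) / s for xi, m, s in zip(x, model["mu"], model["sd"])]
    t = sum(zi * pi for zi, pi in zip(z, model["loading"]))
    t2 = t * t / model["eigenvalue"]              # distance inside the model plane
    resid = [zi - t * pi for zi, pi in zip(z, model["loading"])]
    contrib = [r * r for r in resid]              # what a contribution plot shows
    return t2, sum(contrib), contrib              # SPE = distance off the plane


def synthetic_point(i):
    """Synthetic 'normal' operation: three variables that move together."""
    base = (i - 9.5) / 9.5
    return [37.0 + 0.5 * (base + 0.05 * math.sin(7 * i)),
            7.0 + 0.1 * (base + 0.05 * math.cos(11 * i)),
            50.0 + 5.0 * (base + 0.05 * math.sin(13 * i + 1))]


train = [synthetic_point(i) for i in range(20)]
model = fit_pca1(train)                           # fitted once, then frozen
train_spe = [score(model, x)[1] for x in train]

# A fault that breaks the correlation: third variable high, the others nominal.
t2, spe, contrib = score(model, [37.0, 7.0, 55.0])
worst_variable = max(range(3), key=lambda j: contrib[j])
```

The fitted model is a handful of numbers that never change after fitting, and the same input always gives the same T-squared, SPE, and contributions. That is the static, deterministic, inspectable profile the paragraph describes. In this toy case the fault shows up in SPE rather than T-squared because it breaks the correlation structure instead of pushing the batch further along it.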

Sartorius, through its Umetrics line, is the market standard: SIMCA for offline multivariate modeling and golden-batch fingerprinting, SIMCA-online for real-time process monitoring, and MODDE for design of experiments [4]. These are genuinely (production) tools running in commercial GMP plants for Continued Process Verification (CPV — the ongoing monitoring that a validated process stays in control) and fault detection, validated for 21 CFR Part 11 environments (the US rule governing electronic records and signatures). Sartorius's broader "Umetrics Digital Twin AI Ecosystem" and "Biobrain" positioning is newer and vendor-self-reported — the MVDA core is proven; the AI-ecosystem branding is a layer of marketing on top of it. The cleanest production anchor in the whole market sits here: Amgen's Juncos site in Puerto Rico runs SIMCA OPLS (orthogonal PLS, a PLS variant) models predicting harvest titer (the antibody concentration at the end of the culture) and in-process attributes in commercial GMP, reporting the elimination of roughly six hours of harvest idle time and ten hours of column idle per batch — but that is a first-party, vendor-self-reported figure (Amgen engineers with Sartorius as a case-study sponsor), and the hour-savings are not externally verifiable [5]. Note what the Juncos case is and is not: it is a real model in real commercial GMP (the maturity is genuinely production), and the technique it deploys is OPLS — a multivariate regression, not a neural network. The strongest production AI in the market is the least fashionable math in the book.

The other production-grade MVDA house is AspenTech ProMV (formerly ProSensus/MacGregor), now inside Emerson after Emerson's roughly fifteen-billion-dollar move on AspenTech [6]. ProMV does the same fault-detection and contribution-plot diagnosis SIMCA does, and its "fallacy of the golden batch" critique is a useful corrective to the naive single-reference-trajectory idea — a reminder that even the production-grade core has a known failure mode (an over-fit reference batch) that the vendors themselves document. Emerson's own DeltaV PredictPro brings model-predictive and analytics features into the DCS (distributed control system — the automation layer that runs the equipment) layer. All of it is (production) for monitoring; none of it is autonomous control of a critical quality attribute (CQA). The honest summary of the entire MVDA tier is one sentence: it watches the process and flags deviations for a human to act on, and that is precisely why it is the part of the market with no agentic gap to close.

The mechanistic and hybrid specialists: where attribution goes wrong

This is the part of the map where slides get the ownership wrong, and getting it right is the whole point of an honest landscape. The three companies here have confusingly similar names and confusingly different model classes, and a single mislabel quietly distorts a build-versus-buy decision.

Cytiva (a Danaher company) sells GoSilico — the ChromX/DSPX mechanistic chromatography modeling suite it acquired in 2021 [7]. GoSilico is (production) in CMC downstream development, and it is the single most important attribution correction in the chapter: GoSilico is mechanistic, not machine learning. It solves the physics of chromatography — transport-dispersive mass-balance equations and steric-mass-action equilibrium — to predict elution behavior from a few calibration runs. It is the hybrid-models chapter's "white box" — a model whose internals are physically interpretable, as opposed to a "black box" whose internals are opaque fitted weights — not a learned model: it has parameters with physical meaning (binding constants, column porosity) rather than weights fit to minimize a loss, and it extrapolates the way physics does rather than the way a regression does. Shelving it under "AI" on a vendor slide is a category error this book refuses to make, and it is not a pedantic one — a mechanistic model and an ML model have different validation obligations, different failure modes, and a different posture under Annex 22 (a deterministic physics solver is exactly the kind of static model the draft permits for critical use). The time-savings GoSilico claims for in-silico process development are real in kind but vendor-self-reported in magnitude.

DataHow is the pure-play hybrid-modeling specialist, and the second attribution everyone gets wrong: DataHow is an independent company. It is an ETH Zurich spin-off, with a Series A led by Momenta and including Rockwell Automation and Zurich Kantonal Bank, and an Eppendorf collaboration announced in late 2024 — it is not owned by Sartorius, a confusion that recurs because both sell bioprocess analytics and both trace to the same European academic milieu [1]. Its DataHowLab and SpectraHow products do hybrid (mechanistic-plus-data) modeling and transfer learning — a Monod-style kinetic backbone (the textbook equation relating cell growth rate to nutrient concentration) with a learned residual for what the kinetics miss, the same mechanistic-plus-residual architecture the open suite's hybrid_model module builds in miniature (there the backbone is the IVCD-times-specific-productivity titer relation (integral of viable cell density over time, multiplied by per-cell output, gives total product), atop simulator state driven by Monod growth kinetics). The product pages claim 30 to 60 percent — up to 80 percent — fewer experiments; every one of those figures is vendor-self-reported, and the spread itself (30 to 80 percent) is a tell that the number depends heavily on which baseline and which process you pick [1]. The honest strong anchor for DataHow is not the product page but the peer-reviewed-self-authored companion to its flagship Bristol Myers Squibb case: a Biotechnology Journal (2024) paper, co-authored by DataHow and BMS, on 48 experiments at 5 L with 12 critical process parameters (CPPs — the knobs an operator sets, like temperature or feed rate) and 18 critical quality attributes (CQAs), headlining roughly 33% better prediction accuracy with about half the data versus a black-box model [2]. 
The vendor's own page cites the larger "22% / 3x" framing; prefer the peer-reviewed numbers, note that "prediction accuracy" is not "yield" and is not closed-loop control, and note the maturity is process-development (pilot), not GMP production.
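The mechanistic-plus-residual idea is small enough to sketch. The version below is an illustration under loud assumptions: it uses the IVCD-times-specific-productivity backbone described above, but swaps the learned residual for a one-feature least-squares line (not a neural network, and not DataHow's method), and every number in it is synthetic. All function names are hypothetical.

```python
def ivcd(times_h, vcd):
    """Integral of viable cell density over time, by the trapezoid rule."""
    return sum((t1 - t0) * (v0 + v1) / 2.0
               for t0, t1, v0, v1 in zip(times_h, times_h[1:], vcd, vcd[1:]))


def backbone_titer(times_h, vcd, qp):
    """Mechanistic part: titer = specific productivity (qP) times IVCD."""
    return qp * ivcd(times_h, vcd)


def fit_residual(features, residuals):
    """Data-driven part: a least-squares line for what the backbone misses."""
    n = len(features)
    f_bar, r_bar = sum(features) / n, sum(residuals) / n
    sxx = sum((f - f_bar) ** 2 for f in features)
    slope = sum((f - f_bar) * (r - r_bar) for f, r in zip(features, residuals)) / sxx
    return slope, r_bar - slope * f_bar


def hybrid_titer(times_h, vcd, qp, feature, residual_model):
    """Hybrid prediction = mechanistic backbone + learned correction."""
    slope, intercept = residual_model
    return backbone_titer(times_h, vcd, qp) + slope * feature + intercept


# Synthetic example: one growth curve, three runs at different temperatures.
times_h = [0.0, 24.0, 48.0, 72.0]
vcd = [1.0, 2.0, 4.0, 6.0]            # arbitrary units
qp = 0.01                              # arbitrary units
base = backbone_titer(times_h, vcd, qp)

temps = [35.0, 36.0, 37.0]
measured = [base - 0.3, base - 0.1, base + 0.1]   # made-up "lab" titers
residual_model = fit_residual(temps, [m - base for m in measured])

prediction = hybrid_titer(times_h, vcd, qp, 36.5, residual_model)
```

The design point is the division of labor: the backbone carries the part physics already explains, so the fitted piece only has to learn a small correction. That is the usual argument for why a hybrid needs fewer experiments than a black box.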

Yokogawa owns Insilico Biotechnology — the third attribution correction. Yokogawa acquired Insilico in November 2021; it was not Cytiva [8]. Insilico's approach is a metabolic-network model coupled to a data-driven (machine-learning) model — a hybrid digital twin (a live software model running alongside the real process) for soft-sensing (estimating a hard-to-measure quantity from cheap available signals) and model-predictive control (steering the process by simulating its near future) — and it sits at (production/pilot). (A full genome-scale metabolic model is rarely run directly inside a real-time twin; the deployed form is a reduced hybrid that keeps the physics tractable at line speed.) The two consolidation facts to keep straight, because they flip the company name on a slide, are nearly a mnemonic: GoSilico went to Cytiva, Insilico went to Yokogawa — similar names, different acquirers, different model classes (mechanistic versus hybrid). A slide that swaps either one has, in a single word, mis-stated the owner, the technology class, and the regulatory posture of the product.

The GxP-AI SaaS and data-layer players

A newer cohort sells AI as a regulated service, built for GxP (the family of "Good Practice" regulations — chiefly GMP — governing how medicines are made) from the start rather than retrofitted — and a parallel cohort sells the data plumbing that any AI needs before it can work at all.

Aizon is the clearest example of the first: a (production) GxP AI SaaS (Execute, Unify, Predict, plus an "Agentic Studio") whose flagship is a multi-site Grifols deployment. Aizon is also the rare vendor with a genuine peer-reviewed-self-authored anchor that is about the validation question itself — its study in the PDA Journal of Pharmaceutical Science and Technology on qualifying AI algorithms (demonstrated on an Isolation Forest) for regulated manufacturing is the notable peer-reviewed exception in a market of product pages [9]. That paper is worth more to a buyer than any outcome percentage, because it argues the how-to-validate, which is the part the rest of the market hand-waves. Its customer-outcome numbers (the generic "around 30% deviation reduction") are, by contrast, vendor-self-reported marketing, and its "Agentic AI" was pre-announced for early 2026 — announced, not demonstrated, which is exactly the gap this chapter is about. Aizon is the cohort's best illustration of the rule: judge it on its validation science, which is real and reviewed, not on its agentic roadmap, which is a press release.

TetraScience (Tetra OS / Tetra AI) and Ganymede (Lab-as-Code, since absorbed into Apprentice.io) sell the data layer that AI needs rather than the models themselves — the "AI-ready" replatforming of siloed lab and process data into FAIR (Findable, Accessible, Interoperable, Reusable), queryable form. This matters because, as the data-the-fuel chapter argued, data readiness is the field's number-one barrier, not algorithms; a perfect model on un-harmonized data is worth less than a mediocre model on clean, lineage-tracked data. TetraScience is (production) with vendor-self-reported deployment counts (claims of a dozen of the top-25 pharma; Takeda, Bayer, and others named) and outcome figures (QC turnaround "weeks to days"); Ganymede is (pilot), pre-general-availability at the time of its absorption. These products are real infrastructure, and the smartest near-term AI money in many plants goes here rather than into a flashier model — but the headline numbers are still marketing, and "AI-ready" is a readiness claim, not an outcome.

What "AI-ready" actually requires is the most under-specified phrase in the whole market, and reading it precisely is how a buyer separates a real data layer from a tidy database with a new label. The substantive version is semantic interoperability (different systems agreeing on what a value means, not just on a file format), which the data-management book shows resting on standards a vendor can be asked about by name: ISA-95 (the equipment-and-material hierarchy that fixes what a "batch," a "unit," and a "material lot" are), OPC UA (the modern industrial protocol that carries typed, modeled values off the plant floor rather than bare floats), and B2MML (the XML schema that moves a batch record between MES and ERP). A data layer that lands a historian reading as BR101.Temp.PV = 36.5 and nothing more has done syntactic work; one that lands it as a typed temperature in degrees Celsius, on the production phase of BATCH-2026-001, against its normal operating range, has done the semantic work an ML feature actually depends on. Ask a "data layer" vendor which of those it delivers, and the answer separates a queryable warehouse from genuinely model-ready data.
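The syntactic-versus-semantic distinction can be shown with two record types. This is a sketch, not an ISA-95, OPC UA, or B2MML binding: the class names and `to_feature` are hypothetical, and the normal operating range of 36.0 to 37.0 is an invented value for illustration. The tag, value, unit, phase, and batch id are the ones in the example above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SyntacticReading:
    """What a bare historian export gives you: a tag name and a float."""
    tag: str
    value: float


@dataclass(frozen=True)
class SemanticReading:
    """The same number, carrying the context an ML feature depends on."""
    tag: str
    value: float
    quantity: str                      # what is being measured
    unit: str                          # in which unit
    batch_id: str                      # on which batch
    phase: str                         # in which process phase
    normal_range: Tuple[float, float]  # hypothetical operating range

    def in_normal_range(self) -> bool:
        low, high = self.normal_range
        return low <= self.value <= high


def to_feature(reading) -> Optional[dict]:
    """Refuse to build a feature from a value that does not carry its meaning."""
    if not isinstance(reading, SemanticReading):
        return None
    return {
        "name": f"{reading.quantity}_{reading.unit}",
        "value": reading.value,
        "batch_id": reading.batch_id,
        "phase": reading.phase,
        "in_range": reading.in_normal_range(),
    }


raw = SyntacticReading("BR101.Temp.PV", 36.5)
typed = SemanticReading("BR101.Temp.PV", 36.5, "temperature", "degC",
                        "BATCH-2026-001", "production", (36.0, 37.0))

raw_feature = to_feature(raw)      # no unit, batch, or phase to build from
typed_feature = to_feature(typed)
```

The feature name is derived from the quantity and unit rather than from the tag, so renaming `BR101.Temp.PV` in the source system would not rename the feature.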

The next rung up — and the one the louder "AI-native" pitches gesture at without committing to — is an ontology / knowledge graph: a web of typed nodes and named edges, governed by an OWL/RDF model the way the ontology book builds for the same mAb campaign. The distinction the buyer must hold is the one that chapter draws: a vendor's "AI-native schema, taxonomy, or ontology" spans a wide rigor range, from a BFO-grounded OWL model that reuses public upper-level terms down to a private agreement about column names called an ontology because the marketing word is shared. Three things a real semantic layer buys an ML program are concrete enough to verify, not slideware. First, semantically-grounded features: a model that pulls "monomer purity" by its ontology IRI (bp:monomerPct) rather than by a fragile column name does not silently break when a source system renames a tag — the identifiers-and-units discipline is what makes a feature portable across the four systems that name one lot four ways. Second, SHACL-validated training data: the same closed-world release-gate shape that refuses a non-conformant lot also refuses a non-conformant training row, certifying that every required CQA is present, singular, typed, and in range before it becomes a feature — a model handed a lot whose HMW result silently never loaded will impute around the hole and report a confident wrong number, which is exactly the failure SHACL exists to forbid. Third, lineage as the grouping key: because the graph records bp:derivedFrom edges, a model can be split by batch with a grouped, leave-one-batch-out cross-validation rather than at random — the lineage that scopes a recall is, for free, the schema that keeps a learning curve honest, as the ontologies-and-AI chapter details. None of this is hypothetical: it is the same machinery the companion ontology book runs as passing tests, and it is the substance underneath a "data layer" sales claim. 
A graph is also what a GraphRAG copilot — a language model grounded against retrieved, typed facts rather than its training memory — must stand on to answer a lineage question truthfully instead of fluently inventing one. So when a vendor says "ontology," the buyer's move is the chapter's move: ask whether it is a formal, reusable OWL model or a private schema wearing the word, because only the first delivers the semantic features, the shape-validated data, and the lineage-grouped splits that make a model trustworthy.
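The lineage-as-grouping-key point is the easiest of the three to demonstrate. The sketch below assumes the lineage has already been resolved to one batch id per training row; a plain list stands in for the bp:derivedFrom edges. It then builds leave-one-batch-out folds by hand. The function name and the batch ids are hypothetical. scikit-learn's `LeaveOneGroupOut` offers the same behavior for those who already depend on it.

```python
from collections import defaultdict


def leave_one_batch_out(batch_ids):
    """Yield (held_out_batch, train_indices, test_indices), one fold per batch.

    Rows from the same batch never appear on both sides of a split, so the
    score cannot be inflated by within-batch similarity.
    """
    rows_by_batch = defaultdict(list)
    for row_index, batch in enumerate(batch_ids):
        rows_by_batch[batch].append(row_index)
    for held_out, test_idx in rows_by_batch.items():
        train_idx = [i for batch, rows in rows_by_batch.items()
                     if batch != held_out for i in rows]
        yield held_out, train_idx, test_idx


# Hypothetical lineage: six samples drawn from three batches.
batch_ids = ["B1", "B1", "B2", "B2", "B3", "B3"]
folds = list(leave_one_batch_out(batch_ids))
```

A random split of the same six rows would usually put one sample from a batch in training and its sibling in test, which is the leak that makes a learning curve look better than the model will perform on the next batch.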

The automation and MES incumbents: ML layered on, not native

The biggest installed base in the plant belongs to companies that have been there for decades, and their AI story is almost always a layer added to an existing control or execution backbone — which is both its strength (it is where the data and the GMP discipline already live) and its limit (the ML is rarely the product's core, and the analytics roadmap is rarely the reason anyone bought the platform).

Korber (Werum PAS-X MES) is the canonical case: a review-by-exception execution layer (where reviewers inspect only the records that flag a deviation, not every routine one) that is unambiguously (production) in commercial biomanufacturing, onto which ML and "Agentic AI" (K.AI, B.R.A.I.N.) are now being layered. PAS-X's "up to 98% right first time" and review-by-exception language is an exact quote of the vendor page — and it is vendor-self-reported, with "up to 98%" a best-case ceiling rather than a typical result, and "right first time" a property of the electronic-batch-record (eBR — the digital record of how a batch was made) workflow, not of any AI; the install-base figures (1000-plus installations, a large share of top-20 biotech) are real Korber claims but appear on different pages and are easy to conflate into a single inflated impression [10]. The honest reading is that PAS-X is production-grade as an MES — and the "right first time" win is mostly the discipline of a structured eBR, not a model — while the ML and agentic features on top are early.

Siemens (PCS 7/neo and SIPAT for PAT, Opcenter for MES, gPROMS for modeling), Rockwell (FactoryTalk PharmaSuite), Emerson (DeltaV with Syncade, plus AspenTech analytics), and AVEVA (PI data infrastructure and predictive analytics) round out the automation incumbents. All are (production) as control, historian, and execution platforms — AVEVA's PI System is the very historian (the time-series database that records every plant sensor reading) the data-management book builds its data shadow on, which is itself the point: these companies own the data source, which is a real and durable advantage when the field's binding constraint is data readiness. All also have a thinner, more (pilot) or aspirational ML story bolted on. Rockwell's ML in PharmaSuite is the thinnest of the group; Siemens (SIPAT plus gPROMS mechanistic modeling) and Emerson (AspenTech analytics plus DeltaV PredictPro) have the most credible analytics layers, and Siemens is notable for selling mechanistic modeling (gPROMS) rather than ML — another case where "model" on a slide does not mean "machine learning." The pattern holds across the tier: the backbone is real and validated; the intelligence on top is being layered in, and the buyer should price the backbone, not the brochure.

Consumables, CDS, and the validation stack

Three more clusters complete the map, and each repeats the chapter's central lesson — production-grade core, marketing-grade AI layer — in a different corner of the plant.

The consumables-and-intelligence houses sell manufacturing intelligence alongside their hardware. MilliporeSigma/Merck KGaA offers BioContinuum and Bio4C ProcessPad — manufacturing-intelligence and CPV platforms that are (production) and built mostly on multivariate and statistical methods, with limited genuine deep-learning content under the branding; "Bio4C" on a slide is, in practice, productized MVDA and CPV, not a neural network. Thermo Fisher is more ecosystem than product: OSDPredict plus a web of OpenAI, NVIDIA, and TetraScience partnerships, mostly at the (pilot) / announcement stage — a portfolio of partner logos is not a deployed model, and should be read as direction, not delivery.

The chromatography-data-system and QC corner belongs to Waters Empower, the dominant CDS, which has added ML-flavored anomaly detection on top of its deterministic ApexTrack peak integration. It is (production) as a CDS, with the ML as an additive feature rather than a reinvention. The deterministic integrator is what a QC lab validates and trusts; the anomaly detection is a convenience layer, and conflating the two would mis-state what the lab is actually relying on for a release result. Peak integration is itself a validated, locked part of the assay: its parameters are fixed in the method and policed by system suitability, and manual reintegration is a frequent source of OOS (out-of-specification) and data-integrity findings. That is exactly why the reportable result must come from the deterministic integrator and the ML anomaly layer must stay advisory. The reportable result here is the size-exclusion-chromatography monomer percentage, the fraction of intact, un-aggregated antibody that a lot must clear to be released for sale. Letting the anomaly layer feed that number would put a learning model in the data-integrity path of a release decision.

The validation and regulated-content stack is where AI meets the paperwork. ValGenesis (VLMS validation lifecycle management, plus a "Smart GxP" / VAL AI platform) is (production) for digital validation, with its "80% faster" figures vendor-self-reported — and, per the denominator test above, faster than an undefined paper-based baseline [11]. Veeva (Vault Quality/QMS, with quality AI agents on a 2026 roadmap) is (production) for regulated content management and (pilot) for its AI agents. This stack is where the most consequential near-term AI actually lands — drafting, review, and change control of GxP documents — and also where the regulator drew its first hard line, as the next section describes. It is worth holding both facts at once: the document-drafting use case is genuinely the most valuable near-term AI in biomanufacturing, and it is the exact use case the FDA has already cited a firm for doing without human review.

The vendor map by honest maturity, not marketing: a production-grade core of multivariate monitoring, mechanistic and hybrid modeling, and computer vision, with the agentic-AI offerings drawn as hollow pills crossing the dashed frontier into territory that is announced but not yet demonstrated in routine GMP. Original diagram by the authors, created with AI assistance.

Agentic marketing versus demonstrated production

The defining tension of the 2026 market is the word agentic. Aizon's Agentic Studio, Korber's K.AI and B.R.A.I.N., Veeva's quality agents, Ganymede/Apprentice's scientific agents — nearly every vendor now sells an autonomous-AI story. The reality, consistent across the ISPE survey, the BioPhorum maturity model (BioPhorum is a biopharma industry collaboration; its model rates how far adopters have progressed), and the regulatory record, is that demonstrated production AI clusters in four narrow places: multivariate monitoring, predictive maintenance, computer-vision inspection, and human-in-the-loop documentation. Autonomous control of a critical quality attribute is not on that list, and neither is an unsupervised agent generating GMP records [3]. The asymmetry to internalize is that the four production uses share a property the agentic pitch lacks: a human remains the decision-maker. MVDA flags and a human investigates; vision rejects and a human dispositions; an LLM drafts and a quality unit approves. The moment a vendor's story removes that human from a critical decision, it has crossed from the demonstrated zone into the marketed one.

Two anchors keep that distinction from being merely an opinion. First, the draft Annex 22 (joint EU/PIC/S consultation, July to October 2025) — the first manufacturing-specific AI rule — draws the line in regulation: for critical GMP applications it permits only static, deterministic models (frozen after validation, giving the same output for the same input every time) and excludes dynamic, continuously-learning, probabilistic, and generative AI/LLM models (which keep changing, or give varying outputs) from critical use, requiring locked models with a predetermined change-control plan and human oversight [12]. Read against the vendor map, that single sentence reclassifies the marketing: a continuously-learning "self-improving" model and a generative-AI record author are, by the draft's own terms, barred from the critical decisions where their value is pitched, and confined to non-critical, human-supervised uses. It is a draft, expected to finalize around mid-2026. The exclusion is provisional — but it tells a buyer exactly which "agentic" claims cannot legally touch a critical decision today, and why the production core is overwhelmingly static-deterministic MVDA and locked vision models. Second, the Purolea cGMP warning letter (2 April 2026) — the FDA's first AI-citing warning letter — turned the gap into concrete enforcement: a firm used AI agents to generate drug-product specifications, SOPs, and master production records without adequate quality-unit review, the agents omitted process-validation requirements, the quality unit did not catch the omission, and the FDA cited it under 21 CFR 211.22(c) — the rule making the quality unit responsible for approving production records [13][14]. The lesson for reading a vendor slide is blunt: an "agentic" platform that promises to author GMP records autonomously is selling something the regulator has already named non-compliant. The agent that drafts and a human approves is a product; the agent that decides is a liability.

Beyond those two anchors sits the FDA's broader risk lens. The honest synthesis of the FDA's posture is its 2023 discussion paper Artificial Intelligence in Drug Manufacturing under the FRAME initiative of CDER (the FDA's Center for Drug Evaluation and Research), together with its January 2025 draft framework for AI model credibility [15][16]. Both take a risk-based view, and the 2025 draft formalizes it as a seven-step "context of use" credibility assessment: the required scrutiny scales with how much the model influences a decision and how serious the consequence is. A buyer can use that same risk lens on a vendor: the more a product touches a critical decision, the more it must show locked models, validation evidence, and a human in the loop, and the less a press release should be allowed to substitute for any of it. The lens cuts both ways across the map: it explains why MVDA monitoring (high influence, but inspectable static models with a recognized path) ships in production, while autonomous CQA control (high influence, opaque dynamic models, severe consequence) does not.
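A buyer can turn that lens into a checklist. The sketch below is a reading aid, not the FDA's framework: the three-level scale, the multiplication, the thresholds, and the evidence sets are all assumptions of this example, chosen only to encode the idea that scrutiny rises with model influence and decision consequence. The function names are hypothetical.

```python
LEVELS = {"low": 1, "medium": 2, "high": 3}

# Evidence a buyer might ask for at each risk level (illustrative, not regulatory).
REQUIRED = {
    "low": set(),
    "medium": {"validation evidence"},
    "high": {"locked model", "validation evidence", "human in the loop"},
}


def model_risk(influence, consequence):
    """Combine model influence and decision consequence into one risk level."""
    score = LEVELS[influence] * LEVELS[consequence]
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


def evidence_gaps(influence, consequence, shown):
    """Return what the vendor still has to show for this context of use."""
    return REQUIRED[model_risk(influence, consequence)] - set(shown)


# A press release does not close any gap for a high-risk use.
gaps_press_release = evidence_gaps("high", "high", {"press release"})

# A product that shows all three leaves nothing outstanding.
gaps_documented = evidence_gaps(
    "high", "high",
    {"locked model", "validation evidence", "human in the loop"},
)
```

The useful property is that "press release" appears nowhere in `REQUIRED`, so no amount of announcement reduces the gap list.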

# examples/platform/ml/run_suite.py  (illustrative wrapper over the Book-5 suite;
# the real, committed harness is examples/platform/ml/run_all.py)
# The open-source counterpoint: every capability the vendors sell, runnable from
# the committed datasets, with no license and no service. Each row maps a market
# category to the suite module that implements its core method in the open.
from importlib import import_module

CAPABILITY_MAP = {
    # market category (what the vendors sell)      ->  open suite module + method
    "MVDA / golden-batch monitoring (SIMCA, ProMV)": ("mspc",            "PCA + T2/SPE, contribution plots"),
    "Raman soft sensing (SIMCA, BioPAT, Insilico)":  ("soft_sensor_pls", "PLS + VIP + AD gate (within-batch interpolation; cross-batch split lives in transfer/drift)"),
    "deep soft sensing (research-tier)":             ("soft_sensor_deep","1D-CNN, beaten by PLS in small data"),
    "hybrid digital twin (DataHow, Insilico)":       ("hybrid_model",    "IVCD x qP backbone + NN residual"),
    "computer-vision AVI (Stevanato, Syntegon)":     ("vision_avi",      "CNN reject classifier on fill events"),
    "predictive maintenance (Korber, Siemens)":      ("pdm",             "anomaly score on equipment signals"),
    "release / OOS prediction (Bio4C, iCPV)":        ("release_predict", "logistic classifier, nested 5x5 CV over a 120-batch cohort"),
    "drift / MLOps (the validation gap)":            ("drift",           "PSI + residual drift on the soft sensor"),
}

def survey():
    print(f"{'market category':48s}  module             open method")
    print("-" * 96)
    for category, (module, method) in CAPABILITY_MAP.items():
        import_module(module)                      # each module imports + runs standalone
        print(f"{category:48s}  {module:18s} {method}")
    print(f"\n{len(CAPABILITY_MAP)} vendor categories shown here map onto open modules; "
          f"the full run_all.py harness gates 21 of the suite's 33 model modules, 0 licenses, 0 services.")

if __name__ == "__main__":
    survey()

market category                                   module             open method
------------------------------------------------------------------------------------------------
MVDA / golden-batch monitoring (SIMCA, ProMV)     mspc               PCA + T2/SPE, contribution plots
Raman soft sensing (SIMCA, BioPAT, Insilico)      soft_sensor_pls    PLS + VIP + AD gate (within-batch interpolation; cross-batch split lives in transfer/drift)
deep soft sensing (research-tier)                 soft_sensor_deep   1D-CNN, beaten by PLS in small data
hybrid digital twin (DataHow, Insilico)           hybrid_model       IVCD x qP backbone + NN residual
computer-vision AVI (Stevanato, Syntegon)         vision_avi         CNN reject classifier on fill events
predictive maintenance (Korber, Siemens)          pdm                anomaly score on equipment signals
release / OOS prediction (Bio4C, iCPV)            release_predict    logistic classifier, nested 5x5 CV over a 120-batch cohort
drift / MLOps (the validation gap)                drift              PSI + residual drift on the soft sensor

8 vendor categories shown here map onto open modules; the full run_all.py harness gates 21 of the suite's 33 model modules, 0 licenses, 0 services.

The split-aware notes in that table are the validation discipline earlier chapters built: VIP (variable-importance-in-projection) ranks which inputs a PLS model leans on, an AD (applicability-domain) gate refuses to score samples that fall outside the training envelope, a within-batch versus cross-batch split prevents the data leakage that would otherwise flatter a soft sensor, and a nested 5x5 cross-validation (CV) keeps model selection from peeking at the test fold — all defined in the models-and-validation chapter. The three counts in the table's footer are not in tension: the eight capability-map rows above are a curated subset of the 21 model modules that run_all.py gates, which are themselves the runnable subset of the suite's 33 total modules.
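The cross-batch side of that discipline is easy to show in miniature. The sketch below assumes every sample carries a batch identifier; `leave_one_batch_out` and the toy batch labels are hypothetical, and the suite's own split code is not reproduced here.

```python
from collections import defaultdict


def leave_one_batch_out(batch_ids):
    """Yield (held_out_batch, train_idx, test_idx) with no batch on both sides."""
    by_batch = defaultdict(list)
    for i, batch in enumerate(batch_ids):
        by_batch[batch].append(i)
    for held_out, test_idx in by_batch.items():
        # Training indices come only from the other batches.
        train_idx = [i for batch, idx in by_batch.items() if batch != held_out for i in idx]
        yield held_out, train_idx, test_idx


# Toy data: eight spectra drawn from three batches.
batch_ids = ["B01"] * 3 + ["B02"] * 3 + ["B03"] * 2
folds = list(leave_one_batch_out(batch_ids))

for held_out, train_idx, test_idx in folds:
    train_batches = {batch_ids[i] for i in train_idx}
    test_batches = {batch_ids[i] for i in test_idx}
    assert not train_batches & test_batches  # the leakage check itself
    print(held_out, "train:", train_idx, "test:", test_idx)
```

A random row-level split of the same eight samples would almost always put neighbouring time points from one batch on both sides, which is the leakage the within-batch label in the table is warning about.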

The point of the mapping is not that the open suite replaces a validated commercial platform — it emphatically does not, as the MLOps chapter and every governance section in this book insist. The point is that the methods the vendors sell are, in almost every case, open and reproducible: PCA, PLS, gradient-boosted trees, a small CNN, a mechanistic-plus-residual hybrid. The real harness, examples/platform/ml/run_all.py, makes the deeper point: it runs 21 of the suite's 33 modules as subprocesses, captures whether each module's end-of-script assert over its acceptance criterion held, and emits a single ledger of which model cleared which pre-stated gate on which SHA-256-pinned dataset — the script-level analogue of a validation protocol's acceptance criteria and frozen-at-validation data pin. The committed run reports 21/21 models cleared their acceptance gate on the pinned datasets, a concrete, re-runnable counterpoint to any unverifiable vendor headline. One of those gated modules, batch_mvda.py, is the open analogue of AspenTech ProMV's golden-batch monitoring (DTW-aligned trajectories, batch-wise unfolding, then MPCA), and it makes ProMV's own "fallacy of the golden batch" critique concrete: online golden-batch MSPC needs trajectory alignment before unfolding, because misaligned trajectories are the classic false-alarm source. What you buy from a vendor is not usually a secret algorithm; it is the validated wrapper — the IQ/OQ/PQ package (installation, operational, and performance qualification — the documented proof the system is built, runs, and performs as specified), the audit trail, the change control, the accountable support contract, the locked model with its predetermined change-control plan — around methods you can otherwise run for free. Knowing that is how you tell a fair price from a marketing premium: you are paying for the wrapper and the accountability, not for the math.
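The harness pattern described here (subprocess per module, assert as the gate, hash-pinned data, one ledger) fits in a few lines. This is a minimal sketch of the pattern, not the committed run_all.py: `run_gate`, the ledger fields, and the two throwaway modules are hypothetical, and it assumes a failed `assert` surfaces as a non-zero exit code, which is standard Python behaviour for an uncaught exception.

```python
import hashlib
import subprocess
import sys
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of a file's bytes, used as the dataset pin."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_gate(module_path: Path, dataset_path: Path, pinned_sha: str) -> dict:
    """Run one module as a subprocess and record whether its gate held."""
    actual = sha256_of(dataset_path)
    if actual != pinned_sha:
        # Refuse to score against data that differs from the pinned version.
        return {"module": module_path.name, "status": "DATA_MISMATCH", "sha256": actual}
    proc = subprocess.run(
        [sys.executable, str(module_path), str(dataset_path)],
        capture_output=True, text=True,
    )
    status = "PASS" if proc.returncode == 0 else "FAIL"
    return {"module": module_path.name, "status": status, "sha256": actual}


with tempfile.TemporaryDirectory() as tmp_name:
    tmp = Path(tmp_name)
    data = tmp / "toy.csv"
    data.write_text("x\n1\n2\n3\n")
    pin = sha256_of(data)

    # A module whose end-of-script assert holds (mean of 1, 2, 3 is 2.0).
    good = tmp / "good_module.py"
    good.write_text(
        "import sys\n"
        "vals = [int(v) for v in open(sys.argv[1]).read().split()[1:]]\n"
        "assert sum(vals) / len(vals) >= 2.0\n"
    )
    # A module whose gate fails.
    bad = tmp / "bad_module.py"
    bad.write_text("assert False, 'acceptance gate missed'\n")

    ledger = [run_gate(m, data, pin) for m in (good, bad)]
    tampered = run_gate(good, data, "0" * 64)  # wrong pin on purpose

for row in ledger + [tampered]:
    print(row)
```

Checking the hash before running the module is the design choice that matters: a pass recorded against unpinned data is not evidence of anything.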

Anatomy of one vendor claim, fact-checked

The series signature is to unpack one record field by field. Here the record is a vendor claim — the unit of currency in this whole market — and the dissection is the fact-check that turns a slide bullet into something a quality unit can actually rely on.

One vendor claim, fully fact-checked: the claimant with its ownership corrected, the headline number shown in both its vendor-page and peer-reviewed forms, the evidence tier and maturity marker that bound how far the number can travel, and the verdict that turns a marketing figure into a defensible one. Original diagram by the authors, created with AI assistance.

Read the card top to bottom and the chapter's whole method is laid out as fields a buyer can fill in for any claim they meet.

Claimant. Who is making the claim, and is the company even named correctly? Half of misattributed claims start with the wrong company name — DataHow is independent, GoSilico is Cytiva, Insilico is Yokogawa — so the very first field carries an ownership correction. If the name is wrong, every field below it is, too.
Customer / process. Whose process produced the number? A figure measured on one customer's idiosyncratic, confidential process (here, Bristol Myers Squibb) is a data point, not an industry constant. The customer field is where "30 to 80 percent" silently becomes "30 percent on a process unlike yours."
Headline figure. Shown twice, in its vendor-page form (22% / 3x) and its peer-reviewed form (33% / about half the data), because the gap between them is the single most actionable thing on the card: prefer the journal's figure to the product page's, every time [2]. And always ask the denominator — better than what baseline, over what scope.
Evidence tier. Peer-reviewed-self-authored here: a real journal (Biotechnology Journal, 2024), but DataHow and BMS are the co-authors, so it is stronger than a product page and weaker than an independent replication that does not exist.
Maturity marker. Pilot, at process-development scale (5 L, 12 CPPs, 18 CQAs) — explicitly not GMP production. The number is real at development scale and says nothing about a commercial line.
Scope. The most common over-read, stopped cold: a prediction-accuracy gain against a black-box baseline is not a yield gain and is not closed-loop control. The model predicts better; it does not run the bioreactor.
Verdict. The deliverable — a sentence a quality unit can act on, with the number, its limits, and its source attached: real method, self-authored evidence, pilot maturity, quote the journal not the product page.

That seven-field pass is the difference between citing a market and being sold one, and it is portable: the same card, filled in for Korber's "98% right first time" or ValGenesis's "80% faster," exposes the missing denominator, the best-case ceiling, and the self-reported tier in the same three moves. Take the Korber claim: the claimant is correct, but the headline is "up to 98%" (a best-case ceiling, not a typical result), the scope is the eBR (electronic batch record) workflow rather than any AI, the tier is vendor-self-reported, and the denominator (right first time versus what prior rate?) is absent. Verdict: a real MES property mislabeled as an AI win.
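The card is regular enough to keep as a small record. The sketch below is one hypothetical encoding: `ClaimCard`, `red_flags`, and the separate `denominator` slot (split out of the headline field for convenience) are assumptions, and the values are the Korber example as described in this chapter, with unstated fields left empty.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Optional

# Evidence tiers, strongest first, as named in this chapter.
TIERS = (
    "peer-reviewed-independent",
    "peer-reviewed-self-authored",
    "vendor-self-reported",
    "press-release-only",
)


@dataclass
class ClaimCard:
    claimant: str
    customer_process: Optional[str]  # whose process produced the number
    headline: str
    denominator: Optional[str]       # baseline the headline is measured against
    evidence_tier: str
    maturity: str                    # "production", "pilot", or "research"
    scope: str
    verdict: str = ""

    def red_flags(self) -> list[str]:
        """The three checks the chapter applies to any headline number."""
        flags = []
        if self.denominator is None:
            flags.append("no stated denominator")
        if "up to" in self.headline.lower():
            flags.append("best-case ceiling, not a typical result")
        if self.evidence_tier in TIERS[2:]:
            flags.append("self-reported evidence")
        return flags


korber = ClaimCard(
    claimant="Korber (PAS-X)",
    customer_process=None,           # not stated in the claim
    headline="up to 98% right first time",
    denominator=None,                # no prior rate given
    evidence_tier="vendor-self-reported",
    maturity="production",
    scope="eBR workflow, not AI",
    verdict="a real MES property mislabeled as an AI win",
)

print(korber.red_flags())
```

Keeping the denominator as its own slot makes its absence visible as an empty field rather than something a reader has to notice is missing.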

The unsolved part: there is almost no independent evidence

The honest open problem of this whole chapter is that the market grades its own homework. Survey the evidence tiers across every vendor here and the pattern is stark: a thick band of vendor-self-reported product-page figures, a thin band of peer-reviewed-self-authored papers where the vendor or customer is a co-author, and, for commercial efficiency headlines, essentially no peer-reviewed-independent replication at all. The numbers that drive purchasing decisions (Korber's 98% right-first-time, DataHow's 80% fewer experiments, ValGenesis's 80% faster validation, Amgen's 95% auto-released, Aizon's 30% fewer deviations) were each generated by a party with a stake in the result, on its own process, without an outside replication [1][5][10][11][17].

This is not (mostly) dishonesty; it is structural. Biomanufacturing processes are confidential, expensive, and idiosyncratic, so a head-to-head independent benchmark — the kind that grounds claims in computer vision or natural-language processing, where ImageNet and GLUE let anyone re-run the leaderboard — is almost impossible to assemble on a real GMP line: the data cannot leave the company, the process cannot be shared, and no two cell lines are the same. The consequence is that a buyer cannot resolve "is this 30% real?" by reading the literature; they can only resolve "is this party credible, is this figure peer-reviewed even if self-authored, and does the maturity marker match the deployment I need?" The field's missing institution is an independent, neutral benchmark for bioprocess models — a shared, realistic, public dataset and an agreed scoring protocol against which any vendor's method could be measured by someone with no stake in the answer. The open suite in this book is a deliberately small gesture in that direction: a public dataset, a stated acceptance gate per module, and a reproducible harness that anyone can re-run — not a benchmark of vendors, but an existence proof that the methods can be scored transparently when the data is allowed to be public. Until a real shared benchmark exists, the most rigorous thing a buyer can do is exactly the anatomy above: demand the peer-reviewed figure over the product page, match maturity to use, ask for the denominator, and treat every headline percentage as a hypothesis, not a fact.

What this chapter adds to the model suite

This is a survey chapter, so it contributes no new modeling method; instead it reframes the entire examples/platform/ml/ suite as the open-source counterpoint to the proprietary market it maps. The illustrative run_suite.py wrapper above is the chapter's reading aid: a capability map that pins each vendor category (MVDA monitoring, Raman soft sensing, deep soft sensing, hybrid twins, vision AVI, predictive maintenance, release prediction, drift/MLOps) to the open module that implements its core method (mspc.py, soft_sensor_pls.py, soft_sensor_deep.py, hybrid_model.py, vision_avi.py, pdm.py, release_predict.py, drift.py). The committed run_all.py is the artifact that makes the counterpoint rigorous: it runs each of the 21 gated modules as a subprocess, checks each one's assert-encoded acceptance gate, and emits a SHA-256-pinned evidence ledger. All the suite's runnable modules read the same committed datasets and simulator-backed cohort the rest of the book uses, run standalone with no services, and demonstrate, module by module, that the methods behind the vendor catalog are open. What the suite deliberately does not provide, and what the vendors genuinely sell, is the validated GxP wrapper around those methods: the locked model, the IQ/OQ/PQ package, the audit trail, the predetermined change-control plan, and the accountable support contract. The suite is the method made transparent; the validated wrapper is the work, and the MLOps chapter is where that work is specified.

Why it matters

A buyer who cannot read this market overpays for marketing and underinvests in the boring infrastructure — data readiness, validation, change control — that actually determines whether any of it works. The map in this chapter is a defense against three specific, expensive mistakes. The first is attribution error: shelving GoSilico under "AI" (it is mechanistic), crediting Sartorius with DataHow's hybrid models (DataHow is independent), or naming the wrong acquirer for Insilico (Yokogawa, not Cytiva) — each of which quietly distorts a build-versus-buy decision, because a mechanistic tool, a hybrid model, and a black box carry different validation obligations and different regulatory postures. The second is tier confusion: letting a production-grade platform's marketing percentage masquerade as established fact, when it is a self-reported figure on the vendor's own process with no stated denominator. The third is the agentic over-buy: paying for autonomous-AI promises that the draft Annex 22 forbids in critical use and that the Purolea warning letter has already shown the regulator will cite. Get the map right and the spend follows the value: a proven MVDA or vision platform where the production evidence is real, an honest pilot where the technology is genuinely early, the data-readiness layer that every later model depends on, and human-in-the-loop guardrails everywhere a model touches a critical decision.

In the real world

The market is consolidating fast, and the consolidation map is itself a buyer's tool, because it changes the name on the contract and the support behind the product. The moves to keep straight: GoSilico went to Cytiva (2021); Insilico went to Yokogawa (2021); 908 Devices' bioprocessing analytics portfolio went to Repligen (2025); Ganymede folded into Apprentice.io; and Emerson absorbed AspenTech (a roughly fifteen-billion-dollar move) [6][7][8]. Each move re-attributes a product on a slide a year later, and the two name-collisions — GoSilico/Insilico, Cytiva/Yokogawa — are the ones that survive into a procurement document and quietly mislead it. The structural gap underneath all of it is the one the open suite exists to illustrate: vendor platforms are proprietary, and reproducible open-source GMP-grade bioprocess ML tooling remains immature, with DeepChem (an open-source deep-learning library for chemistry and life sciences) and academic libraries dominating the open side and no neutral benchmark anywhere. The strongest single production case in the whole market — deep-learning automated visual inspection (cameras and a vision model checking each filled syringe or vial for particles, cracks, and fill defects), with Amgen reporting roughly 95% of syringes and vials auto-released (cleared by the model with no human re-inspection) — took years and direct conversations with the FDA to qualify, and even that headline is trade-press and vendor-self-reported [17]. That is the market in one sentence: the genuinely production-grade wins are narrow, hard-won, and older than the hype; the agentic future is real as a direction and oversold as a product; and the buyer's edge is the discipline to tell the two apart.

Key terms

Evidence tier — how a claim was established: peer-reviewed-independent, peer-reviewed-self-authored, vendor-self-reported, or press-release-only. The market is dominated by the latter two.
Maturity marker — how far a product actually got: (production) in GMP/commercial, (pilot) demonstrated at scale, (research) academic/early. Independent of the evidence tier.
Denominator / counterfactual test — the third question every headline number must answer: faster, fewer, or better than what baseline, over what scope, on whose process. A figure with no stated comparator is a target, not a result.
MVDA / MSPC incumbents — Sartorius SIMCA/Umetrics and AspenTech ProMV (Emerson): productized PCA/PLS monitoring, the genuinely production-grade core of the market — and the part with no agentic gap, because it flags for a human rather than deciding.
Mechanistic vs hybrid vs ML — GoSilico (Cytiva) is mechanistic (physics, not learning); DataHow and Insilico (Yokogawa) are hybrid (physics plus a learned residual); pure ML is the rarest in production. Each class carries different validation obligations.
Attribution corrections — DataHow is independent (not Sartorius); GoSilico is Cytiva and mechanistic; Insilico is Yokogawa (not Cytiva).
GxP-AI SaaS / data layer — Aizon (regulated AI service, with a rare peer-reviewed validation paper), TetraScience and Ganymede (the AI-ready data layer that addresses the field's number-one barrier).
"AI-ready" / semantic interoperability — the under-specified core of a data-layer claim: systems agreeing on what a value means, grounded in named standards (ISA-95 hierarchy, OPC UA typed values, B2MML batch records). Ask which a vendor delivers — syntactic plumbing or genuinely semantic data — because a feature depends on the latter.
Ontology / knowledge graph (as ML ground truth) — a formal OWL/RDF model of typed nodes and named edges that buys an ML program three verifiable things: semantically-grounded features (pulled by IRI, not a fragile column name), SHACL-validated training data (the release-gate shape certifies completeness), and lineage as the grouping key (bp:derivedFrom enables honest leave-one-batch-out splits). A vendor's "ontology" may be this or a private schema wearing the word — only the first delivers these.
MES/automation backbone — Korber PAS-X, Siemens, Rockwell, Emerson, AVEVA: production-grade execution, control, and historian platforms onto which ML is layered, not native; they own the data source, which is the durable advantage.
Agentic frontier — the line between demonstrated production AI (monitoring, maintenance, vision, human-in-the-loop docs) and announced-but-undemonstrated autonomous agents; drawn in regulation by draft Annex 22 and in enforcement by the Purolea warning letter.
Consolidation map — who bought whom; changes the name on the contract (the full five-deal list is in "In the real world"). The two name-collisions that survive into a procurement document and mislead it are GoSilico/Insilico (different products) and Cytiva/Yokogawa (different acquirers).
Validated wrapper — the IQ/OQ/PQ, audit trail, change control, and locked model with a predetermined change-control plan that a vendor sells around methods that are otherwise open. The premium is the wrapper and the accountability, not the algorithm.

Where this leads

The map tells you who sells what; it does not tell you whether the named deployments behind the brochures actually held up. The next chapter, Case Studies: Named Deployments and Their Evidence, takes the specific companies and numbers — Amgen Juncos, the DataHow/BMS hybrid model, WuXi's autonomous lab, Sanofi's yield analytics, the visual-inspection deployments — and grades each against its actual evidence, applying the very anatomy this chapter built to the headline stories the market tells about itself.
