References
📍 Where we are: The evidence base for the whole book, gathered in one place.
Every inline citation marker in this book — written as a bracketed number like [1] — resolves here, grouped by chapter, with numbering that matches the markers in each chapter. Sources are real and verifiable; vendor or self-reported sources are marked as such.
Preface
- Minero, T. & Kuger, L. "The 7th ISPE Pharma 4.0™ Survey: Digital Transformation." Pharmaceutical Engineering (ISPE), September/October 2024, Online Exclusives. https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital — Survey of 418 respondents from 45 countries; reports that "AI and ML, despite being widely referenced, have yet to achieve significant large-scale implementation," with adoption concentrated in planning/pilot phases rather than systematic deployment. (Trade/industry-society publication; supports the claim that AI/ML has many pilots but few scaled GMP implementations. Note: ISPE's accompanying maturity grouping lists AI/ML among more rapidly progressing technologies, so the precise inline framing should be read against the survey's own nuance.)
- von Stosch, M., Oliveira, R., Peres, J. & Feyo de Azevedo, S. "Hybrid semi-parametric modeling in process systems engineering: Past, present and future." Computers & Chemical Engineering 60 (2014): 86–101. DOI: 10.1016/j.compchemeng.2013.08.008 — Foundational peer-reviewed review establishing that combining a mechanistic (parametric) backbone with a data-driven (nonparametric) component offers a broader knowledge base, better extrapolation, and more cost-effective model development than either pure approach, while cautioning that the advantage requires rational knowledge integration. See also von Stosch, M. et al., "Hybrid modeling for quality by design and PAT — benefits and challenges of applications in biopharmaceutical industry," Biotechnology Journal 9(6) (2014): 719–726, DOI: 10.1002/biot.201300385, for the biopharma-specific case. Quantitative support for hybrid beating pure data-driven PLS on extrapolation (scaled RMSE +5–10% vs +50–80%) is summarized in the digital-twins review by Smiatek et al./Sokolov et al., arXiv:2504.00286 (2025).
- Kapoor, S. & Narayanan, A. "Leakage and the reproducibility crisis in machine-learning-based science." Patterns 4(9) (2023): 100804, DOI: 10.1016/j.patter.2023.100804 (preprint arXiv:2207.07048) — documents how methodological pitfalls (e.g., data leakage, improper train/test handling) produce exaggerated, non-reproducible performance claims, and that non-replicable findings are often cited more than replicable ones. Reinforced for the pharmaceutical/drug-discovery setting by Gangwal, A. et al., "IMPACT Framework: Establishing Global Standards for Artificial Intelligence Implementation, Methodology, and Translation in Drug Discovery," WIREs Computational Molecular Science 16 (2026), DOI: 10.1002/wcms.70072, which argues that methods marketed as "revolutionary" often prove only marginally competitive under independent evaluation and that overstated claims undermine credibility absent independent validation. Together these peer-reviewed sources support the convention of separating maturity from evidence tier and treating headline efficiency numbers as hedged until independently verified.
The Learning Problem: Why Bioprocess Breaks the Data-Science Rulebook
- International Society for Pharmaceutical Engineering (ISPE). "The 7th ISPE Pharma 4.0™ Survey: Digital Transformation." Pharmaceutical Engineering, September/October 2024. The survey of pharmaceutical manufacturers tracks adoption maturity (not started / just starting / pilots / systematic ongoing actions) across digital technologies, and reports that AI/ML has among the most pilots yet the fewest large-scale implementations of any technology surveyed, with production deployments concentrated in monitoring, predictive maintenance, image/vision recognition, and human-in-the-loop documentation rather than autonomous control. https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital
- McKinsey & Company (QuantumBlack). "The State of AI in 2025: Agents, Innovation, and Transformation." Global Survey, 2025. Reports that roughly 88% of organizations use AI in at least one business function (up from 78% a year earlier), while enterprise-wide financial impact remains rare: about 39% of respondents attribute any EBIT impact to AI, most of those report under 5%, and nearly two-thirds of organizations are not yet scaling AI across the enterprise. Only about 6% report meaningful enterprise-wide impact, a widely cited figure. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
- Peng, J., Khuat, T.T., Musial, K., & Gabrys, B. (2025). "Machine Learning Methods for Small Data and Upstream Bioprocessing Applications: A Comprehensive Review." arXiv:2506.12322. This review frames data scarcity — the high cost and time per bioreactor run and the especially small number of offline ground-truth (label) measurements relative to abundant PAT/Raman spectra — as the binding constraint of upstream bioprocess ML, naming the under-researched cold-start problem for Raman soft sensors and identifying small sample sizes plus sub-optimal sampling/validation choices (data leakage) as drivers of over-estimated performance and poor reproducibility/transferability. Corroborated by the peer-reviewed review: Helleckes, L.M., Hemmerich, J., Wiechert, W., von Lieres, E., & Grünberger, A. (2023). "Machine learning in bioprocess development: from promise to practice." Trends in Biotechnology, 41(6), 817–835. DOI: 10.1016/j.tibtech.2022.10.010. https://arxiv.org/abs/2506.12322
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research (CDER). "Discussion Paper: Artificial Intelligence in Drug Manufacturing" (published March 1, 2023; comment period reopened Sept. 27, 2023). Federal Register Vol. 88, No. 40 (2023-04206). Sets out CDER/CBER areas for consideration on AI in pharmaceutical manufacturing, emphasizing risk-based model development/validation and lifecycle governance within the pharmaceutical quality system. Paired with the draft EU GMP Annex 22 — "Artificial Intelligence" (European Commission, draft for public consultation, July 2025; consultation closed October 2025), which would permit only static, deterministic models in critical GMP applications and explicitly excludes dynamic continuously-learning, probabilistic, and generative AI/LLMs from critical use. https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and ; https://health.ec.europa.eu/document/download/5f38a92d-bb8e-4264-8898-ea076e926db6_en
Data, the Fuel: Readiness, Features, and the Cold-Start Reality
- Zifo Technologies. "Data Readiness Survey" (2024-2025), polling scientists and informaticians across 30+ science-driven companies. Reported that 70% find accessing the data needed for AI projects difficult or somewhat difficult, and only 39% agree their organization has adopted standardized data formats and ontologies. Announced via PR Newswire, "Zifo's Global Survey Reveals Early Momentum for AI in Biopharma, But Data Readiness Remains Key Hurdle," 24 July 2025, https://www.prnewswire.com/news-releases/zifos-global-survey-reveals-early-momentum-for-ai-in-biopharma-but-data-readiness-remains-key-hurdle-302513000.html. (Vendor / self-reported, as flagged in the prose.)
- BioPhorum. "Managing data as a product for digital transformation in the pharmaceutical industry" (BioPhorum IT & Digital Plant workstream). Frames a dataset as a product with an owner, schema, quality contract, and consumer, governed by FAIR principles, as the remedy for siloed, fragmented, and underutilized biopharma data. https://www.biophorum.com/download/managing-data-as-a-product-for-digital-transformation-in-the-pharmaceutical-industry/. (Industry consensus / trade body.)
- Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., et al. "The FAIR Guiding Principles for scientific data management and stewardship." Scientific Data 3, 160018 (2016). DOI: 10.1038/sdata.2016.18. The original definition of the four FAIR principles: data should be Findable, Accessible, Interoperable, and Reusable. https://www.nature.com/articles/sdata201618
- Mowbray, M., Savage, T., Wu, C., Song, Z., Cho, B. A., Del Rio-Chanona, E. A., & Zhang, D. "Machine learning for biochemical engineering: A review." Biochemical Engineering Journal 172, 108054 (2021), DOI: 10.1016/j.bej.2021.108054; and Helleckes, L. M., Hemmerich, J., Wiechert, W., von Lieres, E., & Grünberger, A. "Machine learning in bioprocess development: from promise to practice." Trends in Biotechnology 41(6), 817-835 (2023), DOI: 10.1016/j.tibtech.2022.10.010. These reviews characterize the small-data / cold-start regime, sparse low-dimensional offline observation of living systems, run-to-run variability that compromises model transferability, and rapid model decay in production. (Peer-reviewed review / consensus.)
- MHRA (UK Medicines and Healthcare products Regulatory Agency). "GXP Data Integrity Guidance and Definitions," Revision 1, March 2018. Defines the ALCOA attributes (Attributable, Legible, Contemporaneous, Original, Accurate) and the '+' attributes (Complete, Consistent, Enduring, Available), harmonized with PIC/S, WHO, OECD, and EMA. https://assets.publishing.service.gov.uk/media/5aa2b9ede5274a3e391e37f3/MHRA_GxP_data_integrity_guide_March_edited_Final.pdf. See also WHO Technical Report Series No. 996, Annex 5, "Guidance on good data and record management practices" (2016).
- U.S. FDA / CDER. "Artificial Intelligence in Drug Manufacturing" — Discussion Paper and Request for Feedback (Docket FDA-2023-N-0487; 88 FR 12943), 1 March 2023, https://www.fda.gov/media/165743/download; and FDA Draft Guidance, "Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products" (Docket FDA-2024-D-4689), January 2025, which introduces a risk-based credibility-assessment framework for an AI model's context of use, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-artificial-intelligence-support-regulatory-decision-making-drug-and-biological. Both treat data integrity and data management as a precondition for credible models.
- European Commission, EudraLex Volume 4, Draft Annex 22 "Artificial Intelligence" (released for consultation 7 July 2025, drafted by the EMA GMDP Inspectors' Working Group with PIC/S), https://health.ec.europa.eu/latest-updates/public-consultation-revision-annex-11-and-new-annex-22-eu-gmp-guide-2025-07-07_en; and ISPE. "ISPE GAMP Guide: Artificial Intelligence" (published 23 July 2025), a risk-based lifecycle framework building on GAMP 5 and the GAMP Records and Data Integrity guides, https://ispe.org/publications/guidance-documents/gamp-guide-artificial-intelligence. Annex 22 covers static (locked) models with predetermined change control and treats data quality/integrity and human oversight as central.
- Berry, B., Moretto, J., Matthews, T., Smelko, J., & Wiltberger, K. "Cross-scale predictive modeling of CHO cell culture growth and metabolites using Raman spectroscopy and multivariate analysis," Biotechnology Progress 31(2), 566-577 (2015), DOI: 10.1002/btpr.2035; Berry, B. N., Dobrowsky, T. M., Timson, R. C., et al. "Quick generation of Raman spectroscopy based in-process glucose control to influence biopharmaceutical protein product quality during mammalian cell culture," Biotechnology Progress 32(1), 224-234 (2016), DOI: 10.1002/btpr.2205 (PMID 26587969); and Gibbons, L., Rafferty, C., Robinson, K., et al. "An assessment of the impact of Raman based glucose feedback control on CHO cell bioreactor process development," Biotechnology Progress 39(4), e3371 (2023), DOI: 10.1002/btpr.3371. These establish in-line Raman with SNV/Savitzky-Golay preprocessing and PLS chemometrics for glucose, lactate, and titer, up to closed-loop glucose control in CHO culture. (Peer-reviewed; production practice.)
Models and Validation: From PLS to Transformers, Under GxP
- Wang J, Chen J, Studts J, Wang G. "Simultaneous prediction of 16 quality attributes during protein A chromatography using machine learning based Raman spectroscopy models." Biotechnology and Bioengineering, 2024 (PMID 38419489). DOI: 10.1002/bit.28679. https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/10.1002/bit.28679 — The Boehringer Ingelheim (Late Stage Downstream Process Development) study that predicts 16 product quality attributes in-line during Protein A affinity chromatography from Raman spectra. Its calibration model is a k-nearest-neighbor (KNN) regressor, explicitly chosen over PLS and principal-component regression because it cut the high-molecular-weight MAE roughly three-fold versus PLS — i.e., the best model is classical KNN, not a neural network. [Evidence tier: peer-reviewed-self-authored (vendor-co-authored: Boehringer Ingelheim + Karlsruhe Institute of Technology). Maturity: pilot.]
- Two complementary sources support PLS/PCA chemometrics as the documented incumbent of commercial spectroscopic PAT and multivariate batch monitoring. (a) Peer-reviewed: Voss J-P, et al. "Monitoring of Nutrients, Metabolites, IgG Titer, and Cell Densities in 10 L Bioreactors Using Raman Spectroscopy and PLS Regression Models." Pharmaceutics, 2025 (PMID 40284468; PMC12030344). https://pmc.ncbi.nlm.nih.gov/articles/PMC12030344/ — in-line Raman + PLS soft sensors for glucose, lactate, and IgG titer at R-squared > 0.9, the canonical PAT use case (motivated explicitly by the FDA PAT initiative). See also Rafferty C, et al., "Analysis of chemometric models applied to Raman spectroscopy for monitoring key metabolites of cell culture," Biotechnology Progress, 2020 (PMID 32012476; DOI 10.1002/btpr.2977). (b) Vendor/self-reported (clearly marked): Sartorius SIMCA / SIMCA-online / SIMCA-control product documentation — productized PCA/PLS/OPLS with Hotelling's T-squared and DModX/SPE batch-monitoring control charts. https://www.sartorius.com/en/products/process-analytical-technology/data-analytics-software/mvda-software/simca (the analogous AspenTech ProMV is the comparable commercial MVDA suite). [Evidence tier: peer-reviewed-independent for (a); vendor-self-reported for (b). Maturity: production.]
- Gisperg F, Klausser R, Elshazly M, Kopp J, Přáda Brichtová E, Spadiut O. "Bayesian Optimization in Bioprocess Engineering—Where Do We Stand Today?" Biotechnology and Bioengineering, 2025, 122(6):1313-1325. DOI: 10.1002/bit.28960 (open access via PMC12067035). https://pmc.ncbi.nlm.nih.gov/articles/PMC12067035/ — peer-reviewed review establishing that Gaussian-process-surrogate Bayesian optimization reaches a competitive optimum in materially fewer experiments than fixed Design-of-Experiments grids across upstream and downstream bioprocess stages. Corroborated by Hashizume T, et al., "Accelerating cell culture media development using Bayesian optimization-based iterative experimental design," Nature Communications, 2025 (PMC12218302; https://www.nature.com/articles/s41467-025-61113-5), which reports 3-30x fewer experiments than standard DoE for media optimization. [Evidence tier: peer-reviewed-independent. Maturity: research.]
- Vogt S, et al. "Comparing machine learning methods on Raman spectra from eight different spectrometers." Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 2025. DOI: 10.1016/j.saa.2025.125807. https://www.sciencedirect.com/science/article/pii/S1386142525001672 — a head-to-head deep-learning (1D-CNN) versus PLS Raman benchmark on glucose/acetate/magnesium-sulfate calibration spectra; the paper notes a CNN learns transformations that play the same role as PLS's preprocessing plus dimensionality reduction, and that the deep model's advantage materializes mainly in large multi-instrument datasets rather than decisively beating PLS on a single small dataset. Broader context: Recent reviews of Raman spectral ML (e.g., MDPI Sensors, 2026, https://www.mdpi.com/1424-8220/26/1/341) explicitly conclude that when sample data is limited and relationships are largely linear, classical chemometrics (PLS) can match or outperform deep learning, and that model choice should follow available spectral volume rather than assuming DL superiority. [Evidence tier: peer-reviewed-independent. Maturity: pilot/research.]
- International Society for Pharmaceutical Engineering (ISPE). GAMP Guide: Artificial Intelligence, 1st edition, July 2025. https://ispe.org/publications/guidance-documents/gamp-guide-artificial-intelligence — the first comprehensive GAMP guidance focused exclusively on AI/ML-enabled computerized systems in GxP environments, extending GAMP 5 (and the GAMP 5 2nd-edition Appendix D11 on AI/ML) into a risk-based, lifecycle-oriented framework that requires ongoing performance evidence rather than a one-time test. Announced via BioProcess International (BPI): https://www.bioprocessintl.com/regulations/ispe-releases-gamp-guide-on-artificial-intelligence. [Evidence tier: industry standards body (ISPE) guidance; the trade-press confirmation is independent only in the editorial sense. Maturity: published guidance.]
- U.S. Food and Drug Administration. "Assessing the Credibility of Computational Modeling and Simulation in Medical Device Submissions: Guidance for Industry and FDA Staff." CDRH, final guidance, November 2023 (Docket FDA-2021-D-0980). https://www.fda.gov/regulatory-information/search-fda-guidance-documents/assessing-credibility-computational-modeling-and-simulation-medical-device-submissions — the FDA model-credibility framework referenced in the chapter, built on the FDA-recognized consensus standard ASME V&V 40 ("Assessing Credibility of Computational Modeling through Verification and Validation: Application to Medical Devices"). It establishes the central principle the chapter cites: a model is credible because risk-proportionate verification/validation evidence was produced and checked against a pre-stated context of use, not because it ran. Note: the chapter labels it a "7-step" framework; the FDA guidance actually lays out a risk-informed multi-step credibility-assessment process (commonly described as ~9 steps) with eight categories of credibility evidence — the count differs but the framework is this one. [Evidence tier: regulatory guidance (FDA). Maturity: finalized guidance.]
- European Commission, EudraLex Volume 4 (EU GMP), draft Annex 22 "Artificial Intelligence" — consultation guideline, published July 2025 (public consultation closed October 2025; companion to revised Annex 11 and Chapter 4). Official consultation document: https://health.ec.europa.eu/document/download/5f38a92d-bb8e-4264-8898-ea076e926db6_en — the draft states it applies only to static models (parameters fixed, not adapting during use) with deterministic output, and that a model which adapts its performance during use must not be used in critical GMP applications; self-learning, generative AI, and LLMs are excluded from GMP-critical uses, with a predetermined change-control approach required for any update — i.e., it codifies the locked-model-plus-PCCP pattern for any model touching product quality, patient safety, or data integrity. (Anticipated to be issued jointly as a PIC/S annex.) Trade/analysis coverage: European Pharmaceutical Review, "What Annex 22 spells for AI in GMP manufacturing," https://www.europeanpharmaceuticalreview.com/what-annex-22-spells-for-ai-in-gmp-manufacturing/2135686.article. [Evidence tier: regulatory draft (EU Commission). Maturity: draft / in consultation.]
Target and Concept: Learning Where a Molecule Should Start
- Ochoa D, Hercules A, Carmona M, Suveges D, Gonzalez-Uriarte A, Malangone C, et al. "Open Targets Platform: supporting systematic drug–target identification and prioritisation." Nucleic Acids Research 49, no. D1 (2021): D1302–D1310. DOI: 10.1093/nar/gkaa1027. The platform integrates genetic, genomic, transcriptomic, and chemical/drug evidence into scored target–disease associations and an explicit small-molecule/antibody tractability assessment — the canonical peer-reviewed public example of evidence-integration learning and modality-aware tractability at the head of the discovery pipeline.
- Li B, Luo S, Wang W, Xu J, Liu D, Shameem M, Mattila J, Franklin MC, Hawkins PG, Atwal GS. "PROPERMAB: an integrative framework for in silico prediction of antibody developability using machine learning." mAbs 17, no. 1 (2025): 2474521. DOI: 10.1080/19420862.2025.2474521 (PMID: 40042626). Predicts multiple antibody developability metrics (e.g., HIC retention time, high-concentration viscosity) from sequence- and structure-derived molecular features, with structure features predictable directly from sequence for repertoire-scale screening.
- Kalejaye LA, Chu J-M, Wu I-E, Amofah B, Lee A, Hutchinson M, et al. "Accelerating high-concentration monoclonal antibody development with large-scale viscosity data and ensemble deep learning." mAbs 17, no. 1 (2025): 2483944. DOI: 10.1080/19420862.2025.2483944 (PMID: 40170162). Describes DeepViscosity, an ensemble of 102 artificial neural networks trained on a large, diverse panel (N=229 mAbs) to classify high-concentration viscosity as low (<=20 cP) or high (>20 cP) — a large-data ensemble model for the hardest developability property, viscosity.
- Makowski EK, Chen H-T, Wang T, Wu L, Huang J, Mock M, Underhill P, Pelegri-O'Day E, Maglalang E, Winters D, Tessier PM. "Reduction of monoclonal antibody viscosity using interpretable machine learning." mAbs 16, no. 1 (2024): 2303781. DOI: 10.1080/19420862.2024.2303781 (PMID: 38475982). An interpretable sequence-based model that not only predicts low-viscosity IgG1 variants from Fv-region sequence but, because it is interpretable, enables the design of specific residue mutations that experimentally reduce viscosity — predicting and explaining which residues drive viscosity.
- Bailly M, Mieczkowski C, Juan V, Metwally E, Tomazela D, Baker J, et al. "Predicting Antibody Developability Profiles Through Early Stage Discovery Screening." mAbs 12, no. 1 (2020): 1743053. DOI: 10.1080/19420862.2020.1743053. Documents the discovery-to-CMC/manufacturing handoff problem: developability liabilities knowable at the discovery/concept stage are routinely re-discovered empirically and expensively downstream because discovery and manufacturing operate on different timelines, tools, and incentives — supporting the chapter's claim that the spine gap is a recognized, well-documented issue rather than a niche complaint. (See also Jain T, et al., "Biophysical properties of the clinical-stage antibody landscape," PNAS 114(5):944–949, 2017, DOI: 10.1073/pnas.1616408114, on the distribution of clinical-stage antibody developability properties against which candidates are flagged.)
- Rathore AS, Nikita S, Thakur G, Mishra S. "Artificial intelligence and machine learning applications in biopharmaceutical manufacturing." Trends in Biotechnology 41, no. 4 (2023): 497–510. DOI: 10.1016/j.tibtech.2022.08.007. Together with the small-data review by Tulsyan et al. and recent surveys (e.g., "Applications of Machine Learning in Biopharmaceutical Process Development and Manufacturing," arXiv:2310.09991, 2023; "Machine Learning Methods for Small Data and Upstream Bioprocessing," arXiv:2506.12322, 2025), these establish the cold-start / small-data regime as a fundamental, structural binding constraint for bioprocess ML: data is costly and slow to acquire, datasets have more measurements than independent observations, and labels arrive at the speed of drug development.
- Goldrick S, et al. "Next-generation cell line selection methodology leveraging data lakes, natural language generation and advanced data analytics." Frontiers in Bioengineering and Biotechnology 11 (2023): 1160223. DOI: 10.3389/fbioe.2023.1160223 (PMC10277482). Introduces the 'CLD 4.0' (CLD4) four-step methodology — digitalization into a structured data lake, a cell-line manufacturability index (MI_CL), ML risk assessment of process/CQA risks, and automated NLG reporting — the published line of work that points toward 'manufacturability index' data lakes scoring and ranking clones against accumulated manufacturing history, sitting closer to cell-line development than to concept.
- European Commission / EMA. EudraLex Volume 4 GMP, Draft Annex 22: Artificial Intelligence (public consultation released 7 July 2025). Available at https://health.ec.europa.eu/ (EU GMP guidelines). Establishes a risk-based framework for AI/ML in GMP and notably restricts 'critical' GMP applications (those that can directly affect product quality, patient safety, or GMP data integrity) to static, deterministic models producing consistent outputs for the same inputs, with requirements for defined intended use, independent test data, explainability, drift monitoring, and human oversight. Draft, not yet finalized as of 2026.
- U.S. Food and Drug Administration. "Considerations for the Use of Artificial Intelligence To Support Regulatory Decision-Making for Drug and Biological Products." Draft Guidance for Industry, January 2025 (Docket FDA-2024-D-4689; Federal Register 90 FR 1304, 7 January 2025). Available at https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-artificial-intelligence-support-regulatory-decision-making-drug-and-biological. Sets out a seven-step, risk-based credibility-assessment framework keyed to an AI model's context of use (COU); explicitly scopes nonclinical, clinical, post-marketing, and manufacturing phases and does not cover drug discovery — supporting the chapter's point that a concept-stage developability prediction sits before this framework unless its output later propagates into a submission or control strategy.
Molecule Discovery: Generative Design and Developability Prediction
- Raybould MIJ, Marks C, Krawczyk K, Taddese B, Nowak J, Lewis AP, Bujotzek A, Shi J, Deane CM. "Five computational developability guidelines for therapeutic antibody profiling." Proceedings of the National Academy of Sciences (PNAS) 116(10):4025-4030, 2019. DOI: 10.1073/pnas.1810576116. The Therapeutic Antibody Profiler (TAP): a structure-based screen computing CDR length, surface hydrophobicity (PSH), patches of positive/negative charge (PPC/PNC), and Fv charge-symmetry, flagging candidates outside the clinical-stage range — anchoring the aggregation/viscosity-from-hydrophobicity claim. Peer-reviewed, independent (academic; University of Oxford / Roche).
- Jain T, Sun T, Durand S, Hall A, Houston NR, Nett JH, Sharkey B, Bobrowicz B, Caffry I, Yu Y, Cao Y, Lynaugh H, Brown M, Baruah H, Gray LT, Krauland EM, Xu Y, Vasquez M, Wittrup KD. "Biophysical properties of the clinical-stage antibody landscape." Proceedings of the National Academy of Sciences (PNAS) 114(5):944-949, 2017. DOI: 10.1073/pnas.1616408114. The foundational survey of 137 clinical-stage monoclonal antibodies (incl. 48 approved) across a dozen biophysical developability assays; the field's most-cited empirical anchor for what 'normal' developability looks like. Peer-reviewed, independent (Adimab).
- Reynisson B, Alvarez B, Paul S, Peters B, Nielsen M. "NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data." Nucleic Acids Research 48(W1):W449-W454, 2020. DOI: 10.1093/nar/gkaa379. Canonical neural-network peptide-MHC binding/presentation predictors for class I (NetMHCpan-4.1) and class II (NetMHCIIpan-4.0) — the ML sub-field underpinning T-cell-epitope prediction and antibody deimmunization. Peer-reviewed (DTU Health Tech / LJI). [As a representative anchor for the peptide-MHC-binding-predictor field the prose references.]
- Kalejaye LA, Chu J-M, Wu I-E, et al. "Accelerating high-concentration monoclonal antibody development with large-scale viscosity data and ensemble deep learning" (DeepViscosity). mAbs 17(1):2483944, 2025. DOI: 10.1080/19420862.2025.2483944. Ensemble of 102 artificial neural networks classifying low- vs high-viscosity (>20 cP at 150 mg/mL) mAbs, trained on a 229-mAb measured-viscosity panel; the large-scale internal-data approach the chapter cites. Peer-reviewed; training data partly proprietary.
- Li B, et al. (Regeneron Pharmaceuticals). "PROPERMAB: an integrative framework for in silico prediction of antibody developability using machine learning." mAbs 17(1):2474521, 2025. DOI: 10.1080/19420862.2025.2474521 (bioRxiv preprint 10.1101/2024.10.10.616558, Oct 2024). Predicts antibody developability (e.g. HIC retention time, high-concentration viscosity) from sequence and structure-derived features plus protein-language-model embeddings, to triage candidates before wet-lab assays. Peer-reviewed; proprietary training data.
- "Monoclonal Antibody Developability from Early Assay Panels: Machine Learning for Formulation and Pharmacokinetic Risk" (ACeT, agnostic context-embedding transformer). bioRxiv preprint, DOI: 10.1101/2025.10.31.685722 (2025; orig. titled "From bench assays to bedside: a context-embedding transformer predicts monoclonal antibody viscosity, clearance, and regulatory success"). Interpretable transformer that forecasts expensive late-stage developability endpoints (high-concentration viscosity, in-vivo clearance, HIC retention, clinical progression) from cheap early assay panels. Preprint (not yet peer-reviewed).
- Makowski EK, et al. "Reduction of monoclonal antibody viscosity using interpretable machine learning." mAbs 16(1):2303781, 2024. DOI: 10.1080/19420862.2024.2303781. Interpretable ML that not only predicts high-concentration viscosity but identifies the responsible residues, turning prediction into a concrete design/mutation instruction — matching the chapter's 'point at the residues responsible' claim. Peer-reviewed. (Related low-data interpretable viscosity work: Rai BK, Apgar JR, Bennett EM, PfAbNet-viscosity, Scientific Reports 13:2917, 2023, DOI: 10.1038/s41598-023-28841-4.)
- Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, Guo D, Ott M, Zitnick CL, Ma J, Fergus R. "Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences" (ESM-1b). Proceedings of the National Academy of Sciences (PNAS) 118(15):e2016239118, 2021. DOI: 10.1073/pnas.2016239118. Foundational protein-language-model paper (ESM family); the canonical PLM whose pseudo-likelihoods and transferable embeddings the chapter describes. Peer-reviewed (Meta/Facebook AI Research). Successor: Lin Z, et al. "Evolutionary-scale prediction of atomic-level protein structure with a language model" (ESM-2/ESMFold), Science 379(6637):1123-1130, 2023, DOI: 10.1126/science.ade2574.
- Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, et al. "Highly accurate protein structure prediction with AlphaFold." Nature 596(7873):583-589, 2021. DOI: 10.1038/s41586-021-03819-2. AlphaFold2 — the deep-learning method that made high-quality protein structure prediction routine, enabling spatial developability metrics from predicted structure. Peer-reviewed (DeepMind).
- Generative antibody/protein design. (a) Shuai RW, Ruffolo JA, Gray JJ. "IgLM: Infilling language modeling for antibody sequence design." Cell Systems 14(11):979-989.e4, 2023. DOI: 10.1016/j.cels.2023.10.001 — autoregressive/infilling generative antibody language model trained on 558M Ig variable sequences, generating CDR libraries with improved in-silico developability. (b) Watson JL, Juergens D, Bennett NR, et al., Baker D. "De novo design of protein structure and function with RFdiffusion." Nature 620(7976):1089-1100, 2023. DOI: 10.1038/s41586-023-06415-8 — protein-design diffusion model. Both peer-reviewed (Johns Hopkins; University of Washington / Baker Lab).
- Dauparas J, Anishchenko I, Bennett N, Bai H, Ragotte RJ, Milles LF, Wicky BIM, Courbet A, de Haas RJ, Bethel N, Leung PJY, Huddy TF, Pellock S, Tischer D, Chan F, Koepnick B, Nguyen H, Kang A, Sankaran B, Bera AK, King NP, Baker D. "Robust deep learning-based protein sequence design using ProteinMPNN." Science 378(6615):49-56, 2022. DOI: 10.1126/science.add2187. Inverse-folding model that designs sequences to fit a desired backbone (52.4% native sequence recovery vs 32.9% for Rosetta). Peer-reviewed (University of Washington / Baker Lab).
- European Medicines Agency (EMA) / PIC/S Inspectors' Working Group. Draft EU GMP Guide Annex 22 "Artificial Intelligence," published for consultation 7 July 2025 (consultation 7 July - 7 October 2025), drafted jointly by the EMA IWG and PIC/S for global alignment. The first official EU GMP guidance on AI; establishes a risk-based framework for intended use, validation, lifecycle management, explainability, and human-in-the-loop oversight, and indicates dynamic/adaptive/probabilistic models (GenAI/LLMs) should not be used in critical GMP applications. Regulatory draft (EMA/PIC/S).
- U.S. Food and Drug Administration (CDER/CBER). "Considerations for the Use of Artificial Intelligence To Support Regulatory Decision-Making for Drug and Biological Products." Draft Guidance for Industry, January 2025 (Docket FDA-2024-D-4689; Federal Register 7 Jan 2025; comments due 7 Apr 2025). Establishes a 7-step risk-based credibility-assessment framework keyed to model context of use, covering nonclinical, clinical, post-marketing, and manufacturing phases. The FDA model-credibility framework the chapter references. Regulatory draft guidance (FDA), non-binding.
Cell-Line Development: Ranking Clones with Machine Learning
- Shi J, Ho A, Snyder CE, Chaney EJ, Sorrells JE, Alex A, Talaban R, Spillman DR Jr, Marjanovic M, Doan M, Finka G, Hood SR, Boppart SA. "Accelerating biopharmaceutical cell line selection with label-free multimodal nonlinear optical microscopy and machine learning." Communications Biology. 2025;8:157. DOI: 10.1038/s42003-025-07596-w. (Peer-reviewed research.) Uses simultaneous label-free autofluorescence multiharmonic (SLAM) microscopy with fluorescence-lifetime imaging (FLIM) plus an ML classifier to distinguish CHO monoclonal cell lines as early as passage 2 with balanced accuracies exceeding 96.8 percent, without stains or labels.
- Tao W, Ahmed W, Guo M, Mohsin A, Wu B, Li R. "Selection of high-producing clones by a relative titer predictive model using image analysis." Annals of Translational Medicine. 2021;9(14):1144. DOI: 10.21037/atm-21-2822. (Peer-reviewed research.) Builds a relative-titer (RT) prediction model from quantitative features (size, circularity, solidity, etc.) extracted from microscope images during cell-line development, ranking high-producing CHO clones from morphology before a titer assay; reported as the first such image-based relative-productivity model, with reduced accuracy when a different host cell is used.
- "Next-generation cell line selection methodology leveraging data lakes, natural language generation and advanced data analytics" (the 'CLD4' / CLD 4.0 methodology; authors incl. AstraZeneca and UCL teams). Frontiers in Bioengineering and Biotechnology. 2023;11:1160223. DOI: 10.3389/fbioe.2023.1160223. PMID: 37342509; PMCID: PMC10277482. (Peer-reviewed, describes a production-grade methodology.) Describes a four-step Industry-4.0 workflow: pulling raw CLD data into a data lake, computing a Cell Line Manufacturability Index (MICL) that ranks clones across productivity/growth/product-quality criteria rather than a single attribute, applying ML to flag process/CQA risks, and using natural-language generation for automated reporting; demonstrated on a recombinant CHO antibody-peptide fusion with a trisulfide-bond quality issue.
- Sietaram D. "Optimisation of CHO Cell Line Development Using Hybrid Modelling for Antibody Therapeutics." PhD thesis, University of Cambridge, 2024. DOI: 10.17863/CAM.116348 (advisor A. Lapkin; BBSRC/GlaxoSmithKline CASE Award). (Examined PhD thesis; peer-reviewed research.) Develops a data-driven hybrid model that integrates machine learning with first-principles mechanistic kinetics for CHO cell-line development: a multi-clone kinetic model (MCKM) describing ~140 CHO cell lines, with ML predicting each clone's kinetic parameters from early single-cell-cloning, well-plate and T25 screening data to forecast bioreactor titer/growth from fewer late-stage runs. The mechanistic-core companion paper is Sietaram D, Kotidis P, Finka G, Lapkin AA, "A Multi Clone Kinetic Model for characterizing Chinese hamster ovary cell line variability," Journal of Industrial Microbiology & Biotechnology, 2025;52:kuaf029, DOI: 10.1093/jimb/kuaf029.
- Xu Y, et al. "Innovating cell culture process development with deep learning-powered robotic experimentation using the first Industrial Smart Lab Framework." Biotechnology Progress. 2025;41(6):e70051. DOI: 10.1002/btpr.70051. PMID: 40542657. (Peer-reviewed but single-company / self-reported; pilot at 3-15 L process-development scale, not GMP.) WuXi Biologics' Industrial Smart Lab Framework for Cell Culture (ISLFCC) couples robotic bioreactor sampling, IoT, and decoder-only transformer deep-learning models that predict cell state and recommend feed/temperature actions; an AI-driven case study across three CHO clones reported an average titer increase of 26.8 percent and late-phase lactate kept below 1 g/L within a single batch versus traditional three-stage empirical development.
Process Development: Bayesian Optimization Beats the Factorial Grid
- Siska M, Pajak E, Rosenthal K, del Rio Chanona A, von Lieres E, Helleckes LM. "A Guide to Bayesian Optimization in Bioprocess Engineering." Biotechnology and Bioengineering (2025/2026). DOI: 10.1002/bit.70129. (Open-access preprint: arXiv:2508.10642, DOI 10.48550/arXiv.2508.10642.) Peer-reviewed practitioner-oriented tutorial review establishing that Bayesian optimization is the method for sequential, expensive, noisy black-box optimization of bioprocesses, that it reaches competitive optima in far fewer runs than classical DoE, and that Expected Improvement is the default acquisition function. See also the companion review Gisperg F, et al. "Bayesian Optimization in Bioprocess Engineering—Where Do We Stand Today?" Biotechnology and Bioengineering (2025), DOI: 10.1002/bit.28960. RESEARCH, peer-reviewed.
- Ndahiro N, Ma E, Bertalan T, Donohue M, Kevrekidis Y, Betenbaugh M. "Integration of Bayesian optimization and solution thermodynamics to optimize media design for mammalian biomanufacturing." iScience 28(8):112944 (2025). DOI: 10.1016/j.isci.2025.112944. (PMID: 40740490.) Thermodynamics-aware Bayesian optimization with solubility constraints, validated experimentally in the Ambr15 automated micro-bioreactor system for CHO cell-culture media design, compared head-to-head against a classical space-filling design. RESEARCH, peer-reviewed.
- Narayanan H, et al. "Accelerating cell culture media development using Bayesian optimization-based iterative experimental design." Nature Communications 16 (2025). DOI: 10.1038/s41467-025-61113-5. (PMC12218302; preprint bioRxiv 2024.10.29.620971.) Iterative BO framework for cell-culture media development that reached improved compositions using 3–30x fewer experiments than estimated for standard Design of Experiments. RESEARCH, peer-reviewed.
- Waibel I, Schneider TN, Fischer FJ, Dumnoenchanvanit P, Kulakova A, Nguyen TD, Egebjerg T, Lorenzen N, Bertelsen S, Arosio P. "Bayesian Optimization for Efficient Multiobjective Formulation Development of Biologics." Molecular Pharmaceutics 22(11):6636–6645 (2025). DOI: 10.1021/acs.molpharmaceut.5c00591. (PMID: 41002022; PMC12587402.) ETH Zurich + Novo Nordisk study using multi-objective BO (per-objective GPs plus NSGA-II front search, built on the open-source ProcessOptimizer) to develop a monoclonal-antibody formulation, identifying highly optimized conditions in 33 experiments and improving the diffusion-interaction parameter kD from 9.1 to 48.6 mL/g. RESEARCH, peer-reviewed.
- Walsh I, Shozui F, Sato Y, Park S, Cho S, Chia S, et al. "Toward Machine Learning-Guided CHO Bioprocess and Media Optimization for Improved Titer and Glycosylation." Biotechnology Journal 20(11):e70149 (2025). DOI: 10.1002/biot.70149. ML models predicting titer (R^2 ~0.93) and glycan metrics (R^2 ~0.79–0.95) from initial media/process inputs, with an ML-surrogate active-learning step proposing a composition that reduced mannosylation by 10% while increasing titer—explicitly targeting the titer-versus-quality trade-off. RESEARCH, peer-reviewed.
- ICH Harmonised Tripartite Guideline Q8(R2): Pharmaceutical Development (Step 4, August 2009; EMA reference EMA/CHMP/ICH/167068/2004; U.S. FDA, Guidance for Industry Q8(R2) Pharmaceutical Development, November 2009). Codifies Quality by Design (QbD) and defines the design space as "the multidimensional combination and interaction of input variables (e.g., material attributes) and process parameters that have been demonstrated to provide assurance of quality," within which movement is not considered a change requiring regulatory approval. REGULATORY (ICH/FDA/EMA).
- Peterson JJ. "A Bayesian approach to the ICH Q8 definition of design space." Journal of Biopharmaceutical Statistics 18(5):959–975 (2008). DOI: 10.1080/10543400802278197. Foundational method that reframes the ICH Q8 design space probabilistically—as the region where the Bayesian posterior predictive probability of simultaneously meeting all CQA specifications exceeds a chosen threshold (a Bayesian probabilistic design space). RESEARCH, peer-reviewed.
- Bano G, Facco P, Bezzo F, Barolo M. "Probabilistic design space determination in pharmaceutical product development: A Bayesian/latent variable approach." AIChE Journal 64(7):2438–2449 (2018). DOI: 10.1002/aic.16133. Extends Peterson's probabilistic design space using PLS latent-variable models and a Bayesian posterior-predictive criterion to quantify, at each operating point, the probability that all CQAs meet specification. RESEARCH, peer-reviewed.
- Gadiyar C, et al. "Self-Driving Development of Perfusion Processes for Monoclonal Antibody Production." Biotechnology and Bioengineering (2026). DOI: 10.1002/bit.70093. (Preprint: bioRxiv 2024.09.03.610922, posted Sept 2024.) DataHow, Sartorius, and Merck KGaA collaboration combining a Bayesian optimal-experimental-design algorithm with a cognitive digital twin (step-wise Gaussian-process models) operating 24 parallel Ambr250 mini-bioreactors over 27-day perfusion cultivations, with transfer learning between cell lines. RESEARCH, peer-reviewed (single autonomous-lab demonstration at PD scale).
- Sartorius. "Sartorius introduces BioPAT Spectro to enable Raman spectroscopy capability and QbD with its Ambr and Biostat STR platforms" (product news 424720, sartorius.com) and the BioPAT Spectro / Umetrics Suite (SIMCA, SIMCA-online, MODDE) product pages. Describes the retrofit of in-line/at-line Raman onto Ambr15 and Ambr250 with a standardized optical interface for model transfer to BIOSTAT STR, and the Umetrics MODDE (DoE) / SIMCA (multivariate modeling) stack. VENDOR/SELF-REPORTED (Sartorius); the integration and efficiency claims are manufacturer statements, not independent results.
- Xu Y, et al. "Innovating cell culture process development with deep learning-powered robotic experimentation using the first Industrial Smart Lab Framework." Biotechnology Progress 41(6):e70051 (2025). DOI: 10.1002/btpr.70051. WuXi Biologics' Industrial Smart Lab Framework for Cell Culture (ISLFCC), an autonomous deep-learning + robotics loop reporting an average titer increase of 26.8% across three CHO clones versus traditional three-stage empirical development, at 3 L and 15 L scale. PEER-REVIEWED but single-company, self-reported, PD-scale and unreplicated (not GMP).
- European Commission. Draft Annex 22: Artificial Intelligence to the EU GMP Guide (Volume 4), published for public consultation July 2025 (consultation closed October 2025; final adoption expected 2026); developed jointly with PIC/S. The draft applies to static, deterministic AI models in critical GMP applications and explicitly states that models that adapt their performance during use (continuously-learning/adaptive models), as well as generative AI and LLMs, are not covered and should not be used in critical GMP applications. REGULATORY (EU GMP / PIC/S, draft).
- Polak J, Huang Z, Sokolov M, von Stosch M, Butte A, Hodgman CE, Borys M, Khetan A. "An innovative hybrid modeling approach for simultaneous prediction of cell culture process dynamics and product quality." Biotechnology Journal 19(3):e2300473 (2024). DOI: 10.1002/biot.202300473. (PMID: 38528367.) Peer-reviewed DataHow + Bristol Myers Squibb study on a 48-run, 5 L upstream mAb dataset (12 CPPs, 18 CQAs) in which the combined hybrid model outperformed black-box models by ~33% on average for final product-quality prediction while requiring about half the training data. RESEARCH, peer-reviewed (DataHow vendor efficiency headlines such as 22%/3x are separately VENDOR/SELF-REPORTED via datahow.ch case study).
- Minero T, Kuger L. "The 7th ISPE Pharma 4.0 Survey: Digital Transformation." Pharmaceutical Engineering, September/October 2024 (ISPE; survey fielded 2023). The survey reports that AI and ML, despite being widely referenced, have yet to achieve significant large-scale implementation, and that the proportion of projects stuck in the pilot phase remains high and stagnant—consistent with AI/ML showing many pilots but few scaled deployments. TRADE/INDUSTRY survey (ISPE).
Analytical Methods: Chemometrics, Deep Spectroscopy, and Automated Chromatograms
- Berry, B. N., Dobrowsky, T. M., Timson, R. C., Kshirsagar, R., Ryll, T., & Wiltberger, K. (2016). Quick generation of Raman spectroscopy based in-process glucose control to influence biopharmaceutical protein product quality during mammalian cell culture. Biotechnology Progress, 32(1), 224-234. DOI: 10.1002/btpr.2205. (Peer-reviewed; in-line Raman + PLS feedback control of glucose in CHO fed-batch to influence product quality.)
- Gibbons, L., Rafferty, C., Robinson, K., Abad, M., Maslanka, F., Le, N., Mui, J., et al. (2023). An assessment of the impact of Raman based glucose feedback control on CHO cell bioreactor process development. Biotechnology Progress, 39(3), e3371. DOI: 10.1002/btpr.3371. (Peer-reviewed; Raman + PLS soft sensor driving closed-loop glucose control vs. manual bolus feeding in CHO bioreactors, showing reduced glycation and improved product quality.)
- Nomikos, P., & MacGregor, J. F. (1995). Multivariate SPC charts for monitoring batch processes. Technometrics, 37(1), 41-59. DOI: 10.1080/00401706.1995.10485888. (Foundational peer-reviewed method for multiway-PCA/PLS multivariate statistical process control of batch processes — the 'golden batch' monitoring with Hotelling's T-squared and SPE charts that underlies SIMCA-style MSPC.)
- Kourti, T., & MacGregor, J. F. (1995). Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemometrics and Intelligent Laboratory Systems, 28(1), 3-21. DOI: 10.1016/0169-7439(95)80036-9. (Peer-reviewed review establishing PCA/PLS-based MSPC for continued process monitoring and fault detection — the methodology commercialized in tools such as Sartorius SIMCA / SIMCA-online and AspenTech ProMV. Commercial product capabilities are vendor-reported: Sartorius umetrics.com/simca and AspenTech aspentech.com ProMV.)
- Wei, B., Woon, N., Dai, L., Fish, R., Tai, M., Handagama, W., Yin, A., et al. (2022). Multidimensional analytical characterization assisted by deep learning: Convolutional neural networks guided Raman spectroscopy as a PAT tool for monitoring and simultaneous prediction of monoclonal antibody charge variants. Pharmaceutical Research / Analytical Chemistry. Representative peer-reviewed paper: 'Convolutional Neural Networks Guided Raman Spectroscopy as a Process Analytical Technology (PAT) Tool for Monitoring and Simultaneous Prediction of Monoclonal Antibody Charge Variants,' Pharmaceutical Research, 41 (2024). DOI: 10.1007/s11095-024-03663-9. (Peer-reviewed; CNN-on-Raman benchmarked against chemometric/PLS approaches for on-line mAb charge-variant prediction during CEX — pilot tier.)
- Rashedi, M., Khodabandehlou, H., Wang, T., Demers, M., Tulsyan, A., Garvin, C., & Undey, C. (Amgen) (2024). Integration of just-in-time learning with variational autoencoder for cell culture process monitoring based on Raman spectroscopy. Biotechnology and Bioengineering, 121(7), 2205-2224. DOI: 10.1002/bit.28713. (Peer-reviewed; Amgen group benchmarks a VAE / just-in-time-learning deep approach against PLS and CNN for Raman-based CHO process monitoring — pilot/research tier.)
- Mishra, P., & Passos, D. (2025). A comparative analysis of deep learning and chemometric approaches for spectral data modeling. Analytica Chimica Acta, 1342, 343658. DOI: 10.1016/j.aca.2025.343658. (Peer-reviewed comparison finding that on small spectral datasets well-optimized PLS/iPLS remains competitive with or superior to CNNs, which only gain an edge with larger data — supports the chapter's 'PLS ties deep learning on small clean spectra' thesis. See also Bjerrum, Glahder & Skov, 'Data Augmentation of Spectral Data for CNN-Based Deep Chemometrics,' arXiv:1710.01927, 2017.)
- Wang, J., Chen, D., Wang, M., Krause, A., et al. (Boehringer Ingelheim) (2024). Simultaneous prediction of 16 quality attributes during protein A chromatography using machine learning based Raman spectroscopy models. Biotechnology and Bioengineering, 121(6), 1729-1738. DOI: 10.1002/bit.28679. (Peer-reviewed; the BI 16-attribute real-time Raman work. Best-performing model was k-nearest-neighbors (KNN) regression with Butterworth-filter preprocessing — NOT a deep network. Companion automated-calibration paper: Wang et al., Biotechnol. Bioeng. 120 (2023), DOI: 10.1002/bit.28514; and Esmonde-White-style in-line computational-Raman demonstration, PubMed 37288839.)
- Waters Corporation. Empower 3 Software ApexTrack Integration Algorithm — Theory and Application (white paper 720000494en) and Empower 3 Data Acquisition and Processing Theory Guide (715005481). Waters Corporation, Milford, MA. (Vendor / self-reported documentation. ApexTrack uses a deterministic second-derivative / apex-curvature peak-detection and threshold-based baseline algorithm; the determinism — identical chromatogram integrates identically — is the auditability feature noted in the prose.)
- Satwekar, A., Panda, A., Nandula, P., Sripada, S., Govindaraj, R., & Rossi, M. (Merck KGaA / Merck Serono) (2023). Digital by design approach to develop a universal deep learning AI architecture for automatic chromatographic peak integration. Biotechnology and Bioengineering, 120(7), 1822-1843. DOI: 10.1002/bit.28406. (Peer-reviewed; Merck KGaA group's universal convolutional-neural-network architecture for automatic chromatographic peak integration, with a proposed GxP human-in-the-loop framework — research/demonstration tier, not GMP-routine. Note: the chapter's 'Bosch' attribution is not borne out by this paper's author list; the verifiable Merck KGaA universal-CNN peak-integration paper is the one cited here.)
- European Commission / EMA & PIC/S (2025). EudraLex Volume 4, Annex 22: Artificial Intelligence (draft for consultation, published 7 July 2025). European Commission, Brussels; PIC/S document PI 062-1 (draft). (Regulatory draft. Scope is limited to static, deterministic AI models in critical GMP applications and explicitly excludes dynamic/adaptive models, generative AI, and LLMs from critical roles, requiring locked models under predetermined change control.)
- Rogers, R. S., Nightlinger, N. S., Livingston, B., Campbell, P., Bailey, R., & Balland, A. (2015). Development of a quantitative mass spectrometry multi-attribute method for characterization, quality control testing and disposition of biologics. mAbs, 7(5), 881-890. DOI: 10.1080/19420862.2015.1069454. (Peer-reviewed; defines the LC-MS Multi-Attribute Method (MAM) with targeted attribute quantitation and non-targeted New Peak Detection. Production NPD is algorithmic/feature-based. See also Oyugi, Wang & Rogstad et al., 'Method validation and new peak detection for the LC-MS MAM,' J. Pharm. Biomed. Anal. 2023, DOI: 10.1016/j.jpba.2023.115561.)
- Zong, Y., Wang, Y., Yang, Y., Zhao, D., Wang, X., Shen, C., & Qiao, L. (2024). Deep learning prediction of glycopeptide tandem mass spectra powers glycoproteomics. Nature Machine Intelligence, 6, 367-377. DOI: 10.1038/s42256-024-00875-x. (Peer-reviewed; DeepGP, a BERT/transformer-plus-graph-neural-network framework predicting intact N-glycopeptide MS/MS spectra and retention times to power glycoproteomics — research/discovery tier, not GMP lot release.)
- Ferrari, F., Berger, J., Lemieux, L., Paduraru, C., Dillon, M., Liaw, A., Carrillo, R., Wong, S., Salami, H., Avalle, P., Sherer, E., Richardson, D., & Skomski, D. (Merck & Co. / MSD) (2025). Bayesian hierarchical model predicts biopharmaceutical stability indicators and shelf life with application to multivalent human papillomavirus vaccine. Scientific Reports, 15, 17333. DOI: 10.1038/s41598-025-99458-y. (Peer-reviewed; Bayesian hierarchical shelf-life model applied to the 9-valent HPV vaccine GARDASIL 9. For the mechanistic-kinetic complement see also predictive-stability Arrhenius/kinetic degradation modeling, e.g. 'Predicting the Long-Term Stability of Biologics with Short-Term Data,' Molecular Pharmaceutics, 2024.)
- Petillot, L., Pewny, F., Wolf, M., Sanchez, C., Thomas, F., Sarrazin, J., Fauland, K., et al. (2020). Calibration transfer for bioprocess Raman monitoring using Kennard Stone piecewise direct standardization and multivariate algorithms. Engineering Reports, 2(11), e12230. DOI: 10.1002/eng2.12230. (Peer-reviewed; reports that two probes of the same multi-channel Raman analyzer in the same culture disagreed by ~20% on cell density (TCD/VCD) from instrument-to-instrument variability, and that Kennard-Stone piecewise direct standardization (PDS) calibration transfer halved the error to ~10%.)
Tech Transfer and Scale-Up: Models That Travel Between Scales
- Karimi Alavijeh M, Lee YY, Gras SL. "A perspective-driven and technical evaluation of machine learning in bioreactor scale-up: A case-study for potential model developments." Engineering in Life Sciences. 2024;24(7):e2400023. DOI: 10.1002/elsc.202400023. Open-access perspective review surveying ML applied to bioreactor scale-up, treating the jump as a prediction problem informed by CFD and engineering correlations (kLa, mixing time, shear, CO2 stripping) rather than a wet-lab gamble. Peer-reviewed, independent. https://doi.org/10.1002/elsc.202400023
- Pétillot L, Pewny F, Wolf M, et al. "Calibration transfer for bioprocess Raman monitoring using Kennard Stone piecewise direct standardization and multivariate algorithms." Engineering Reports. 2020;2(11):e12230 (Wiley). DOI: 10.1002/eng2.12230. Two probes of the same Raman analyzer were immersed in the same CHO culture simultaneously, isolating instrument variability; cross-probe prediction error on cell density (TCD/VCD) of ~20% was lowered to ~10% by Kennard-Stone Piecewise Direct Standardization (KS-PDS). Peer-reviewed, independent. https://doi.org/10.1002/eng2.12230
- ICH Q14: Analytical Procedure Development. International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use; Step 4 final version adopted 1 November 2023 (FDA Federal Register availability 2024). Establishes the analytical-procedure lifecycle concept (defined operating/reportable range, validation, ongoing performance monitoring, change management) that applies to multivariate/spectroscopic calibrations, making a transferred model a re-qualified procedure rather than a free reuse. Regulatory. https://database.ich.org/sites/default/files/ICH_Q14_Guideline_2023_1116.pdf
- Sartorius Stedim Biotech (Schwarz H, et al.). "Raman Spectrometric PAT Models: Successful Transfer from Minibioreactors to Larger-Scale Stirred-Tank Bioreactors." BioProcess International (trade press), 2022. A CHO mAb PLS Raman model calibrated on twelve Ambr 250 (BioPAT Spectro) vessels was transferred to, and validated on, two 10 L Biostat STR cultivations, predicting glucose at R^2 = 0.84 (RMSEP = 0.29 g/L). Vendor/trade-press, self-reported pilot (chapter attributes the result to industry scale-transfer practice). https://www.bioprocessintl.com/pat/raman-spectrometric-pat-models-successful-transfer-from-minibioreactors-to-larger-scale-stirred-tank-bioreactors
- Wan X, Zhou C, et al. (Merck & Co., West Point, PA). "Robust platform for inline Raman monitoring and control of perfusion cell culture." Biotechnology and Bioengineering. 2024;121(5):1583-1597. DOI: 10.1002/bit.28680. Documents that very high cell densities and long run times in intensified perfusion generate strong, time-varying fluorescence interference that overwhelms the weak Raman signal; the platform circumvents it (cell-free permeate measurement) so a robust chemometric model can be deployed. Peer-reviewed pilot. PubMed 38393313. https://doi.org/10.1002/bit.28680
- Rogers AW, Vega-Ramon F, Yan J, del Rio-Chanona EA, Zhang D. "A transfer learning approach for predictive modeling of bioprocesses using small data." Biotechnology and Bioengineering. 2022;119(2):411-422. DOI: 10.1002/bit.27980. Demonstrates that embedding mechanistic structure (a physics/kinetic backbone) into a learned model improves extrapolation and generalization from small datasets, with the model's optimal structure linked to the underlying process mechanism; this supports the data-efficiency-becomes-transfer argument the chapter makes. Peer-reviewed, independent. PubMed 34716712. https://doi.org/10.1002/bit.27980
- Sartorius (Maria S, et al.). "A Hybrid Modeling Framework for Predictive Digital Twins of CHO Cell Culture." bioRxiv preprint, 2025. DOI: 10.1101/2025.11.24.690194 (posted Nov 2025; all authors Sartorius employees). Combines a genome-scale/dynamic flux-balance (PC-dFBA) backbone, an ODE-based kinetic (FLEX) model, and a learned VCD neural-network component into a hybrid simulator predicting viable cell density, titer, and metabolites; validated on 23 CHO fed-batch cultures and aimed at the scale-up/transfer problem. Vendor/self-reported pilot, preprint (not yet peer-reviewed). https://doi.org/10.1101/2025.11.24.690194
- DataHow AG (ETH Zurich spin-off, founded 2017; independent, not a Sartorius subsidiary). Hybrid modeling plus transfer learning to reduce experimental burden; the 30-60% (vendor figures range to 40-80%) experiment-reduction claim is vendor/self-reported (DataHow product pages, e.g. "The Impact of Hybrid Models on Bioprocess Development," datahow.ch). The peer-reviewed basis: Hutter C, von Stosch M, Cruz Bournazou MN, Butté A (DataHow/ETH/TU Berlin) "Knowledge transfer across cell lines using hybrid Gaussian process models with entity embedding vectors," Biotechnology and Bioengineering 2021;118(11):4389-4401 (DOI: 10.1002/bit.27911, PubMed 34383309), and Bayer B, et al. "Model Transferability and Reduced Experimental Burden in Cell Culture Process Development Facilitated by Hybrid Modeling and Intensified Design of Experiments," Frontiers in Bioengineering and Biotechnology 2021;9:740215 (DOI: 10.3389/fbioe.2021.740215, PMC8733703). Vendor/self-reported figures; underlying methods peer-reviewed. https://datahow.ch/
- Baron Diaz N, Drommershausen A, Grunberger A, Holtmann D. "Transfer Learning Approaches in Bioprocess Engineering: Opportunities and Challenges." Biotechnology and Bioengineering. 2026;123(5):1417-1431. DOI: 10.1002/bit.70186. Review concluding that transfer learning is the near-term workaround for the small-data, high-variability reality of bioprocessing, and that a usable bioprocess foundation model does not yet exist; offers a strategy-selection table by task/data-type/source-target similarity. Peer-reviewed, independent. https://doi.org/10.1002/bit.70186
- Sartorius. SIMCA and SIMCA-online multivariate data analysis (Umetrics Suite) and BioPAT Spectro Raman platform for Ambr 15/250 and Biostat STR. Product news: "Sartorius Introduces BioPAT Spectro to Enable Raman Spectroscopy Capability and QbD with its Ambr and Biostat STR Platforms," 24 February 2020 (sartorius.com newsroom/product-news/424720). SIMCA builds MVDA/Raman models from small-scale runs; SIMCA-online deploys them for real-time batch monitoring and endpoint prediction, supporting model transfer up the scale ladder under monitoring. Vendor product pages, production for monitoring (self-reported). https://www.sartorius.com/en/company/newsroom/product-news/424720-424720
- Rosado PJ, Merheb B, Toro A (Amgen Manufacturing Limited, Juncos, Puerto Rico). "Real-Time, Data-Driven, and Predictive Modeling: Accelerating Digital Transformation in Drug Substance Commercial Manufacturing." BioProcess International (trade press), 9 February 2023. Describes commercial-GMP harvest-titer prediction built with an OPLS batch-level model in SIMCA / SIMCA-online (21 CFR Part 11-validated), reaching R^2 = 0.91, Q^2 = 0.85, with measured assay values remaining the official result and the model serving as decision support. First-party, self-reported production account. https://www.bioprocessintl.com/pat/real-time-data-driven-and-predictive-modeling-accelerating-digital-transformation-in-drug-substance-commercial-manufacturing
- U.S. FDA, Center for Drug Evaluation and Research (CDER). "Artificial Intelligence in Drug Manufacturing" discussion paper, published 1 March 2023 (Docket FDA-2023-N-0487; 88 FR 12943; comment period reopened to 27 November 2023). Poses regulatory questions including validating and maintaining (self-learning) AI models and managing the data used to build them in a cGMP environment, i.e. model maintenance/re-validation across deployment changes as an open question. Regulatory. https://www.fda.gov/media/165743/download
- U.S. FDA. Draft Guidance for Industry: "Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products," issued 6 January 2025 (Federal Register 7 January 2025; Docket FDA-2024-D-4689; comments due 7 April 2025). Establishes a risk-based, 7-step credibility-assessment framework in which model risk is a function of model influence and decision consequence, spanning nonclinical, clinical, post-marketing, and manufacturing uses. Regulatory. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-artificial-intelligence-support-regulatory-decision-making-drug-and-biological
- ISPE. "The 7th ISPE Pharma 4.0 Survey: Digital Transformation." Pharmaceutical Engineering, September/October 2024 (survey conducted 2023). Reports that AI/ML, though widely referenced, has yet to achieve significant large-scale implementation, with most activity at pilot stage and production clusters concentrated in monitoring and human-in-the-loop use rather than autonomous control. Industry-association survey (self-reported). https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital
Seed Train: Soft Sensing the Inoculum and Predicting Contamination Risk
- Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. "Scikit-learn: Machine Learning in Python." Journal of Machine Learning Research. 2011;12:2825-2830. https://jmlr.org/papers/v12/pedregosa11a.html. (Canonical reference for scikit-learn, the open-source library used here; the GradientBoostingRegressor and its feature_importances_ attribute are documented in the scikit-learn Ensemble methods user guide: https://scikit-learn.org/stable/modules/ensemble.html#gradient-boosting.)
- Ondracka A, Gasset A, Garcia-Ortega X, Hubmayr D, van Wijngaarden JBG, Montesinos-Segui JL, Valero F, Manzano T. "CPV of the Future: AI-Powered Continued Process Verification for Bioreactor Processes." PDA Journal of Pharmaceutical Science and Technology. 2023;77(3):146-166. DOI: 10.5731/pdajpst.2021.012665. Trade-press coverage: "Machine Learning an Ideal Fit for Process Validation," Genetic Engineering & Biotechnology News (GEN). This peer-reviewed proof-of-concept used an isolation forest for batch-phase anomaly detection and a random forest for fed-batch control actions on a Pichia pastoris (Komagataella phaffii) recombinant-lipase system. NOTE / CORRECTION: the work was authored by Aizon (Toni Manzano) and the Universitat Autonoma de Barcelona, NOT by CSL Behring. The chapter prose attributes it to "CSL Behring and Aizon," which is a factual error to fix.
- Maruthamuthu MK, Raffiee AH, De Oliveira DM, Ardekani AM, Verma MS. "Raman spectra-based deep learning: A tool to identify microbial contamination." MicrobiologyOpen. 2020;9(11):e1122. DOI: 10.1002/mbo3.1122. (Peer-reviewed research demonstration: a convolutional neural network classified 12 common pharmaceutical contaminant microorganisms, including mixtures with CHO cells, at 95-100% accuracy. Research-grade, not a GMP-deployed sterility method.)
- Pandi Chelvam S, et al. "Machine learning aided UV absorbance spectroscopy for microbial contamination in cell therapy products." Scientific Reports. 2025;15. DOI: 10.1038/s41598-024-83114-y. (SMART CAMP / Singapore-MIT Alliance for Research and Technology. A one-class SVM trained on sterile MSC supernatant spectra detected contamination at ~10 CFU/mL across seven microbial strains. Peer-reviewed research demonstration, not a GMP-deployed rapid sterility method.)
- Agarwal P, et al. "Hybrid modeling for in silico optimization of a dynamic perfusion cell culture process." Biotechnology Progress. 2025. DOI: 10.1002/btpr.3503. (Peer-reviewed hybrid mechanistic + shallow-neural-network model of a perfusion cell culture: a mechanistic core predicts viable cell density and other states that feed an NN predicting mAb-specific productivity, used to optimize feeding policy — i.e., a perfusion-process hybrid model predicting density trajectory and downstream titer effect. Related supporting work: Narayanan H, et al., "Development and validation of a hybrid model for prediction of viable cell density, titer and cumulative glucose consumption in a mammalian cell culture system," Computers & Chemical Engineering, 2024, DOI: 10.1016/j.compchemeng.2024.108645.)
- Gadiyar C, et al. "Self-Driving Development of Perfusion Processes for Monoclonal Antibody Production." Biotechnology and Bioengineering. 2026. DOI: 10.1002/bit.70093 (preprint: bioRxiv, Sept 2024, DOI: 10.1101/2024.09.03.610922). Collaboration of DataHow AG, Sartorius, and Merck KGaA. A Bayesian-experimental-design algorithm plus a cognitive digital twin (step-wise Gaussian-process models) autonomously operated an ambr250 24-parallel mini-bioreactor perfusion platform; the proof of concept was a 27-day cultivation with ~20 days run by the autonomous agent. Peer-reviewed, development-scale proof of concept, explicitly NOT GMP.
- 908 Devices Inc. / National Resilience, Inc. press release, "Resilience Demonstrates Lower Cost of Perfusion Bioreactor Process Using 908 Devices REBEL At-Line Analyzer," April 26, 2023. https://s201.q4cdn.com/978897484/files/doc_news/Resilience-Demonstrates-Lower-Cost-of-Perfusion-Bioreactor-Process-Using-908-Devices-REBEL-At-line-Analyzer-2023.pdf. (VENDOR PRESS RELEASE / self-reported, not peer-reviewed. Resilience reported a 50% titer increase in a mAb perfusion process while reducing cost of goods by adding back only depleted nutrients, using the REBEL at-line media analyzer — a PAT + feed-optimization result, NOT a machine-learning deployment. Supports the chapter's point that attributing the "+50% titer" story to ML is incorrect.)
- Hutter C, von Stosch M, Cruz Bournazou MN, Butte A. "Knowledge transfer across cell lines using hybrid Gaussian process models with entity embedding vectors." Biotechnology and Bioengineering. 2021. arXiv:2011.13863; PubMed: 34383309; DOI: 10.1002/bit.27908. (Hybrid Gaussian-process models with learned entity-embedding vectors transfer process knowledge across cell lines/products to reduce wet-lab experiments: the transfer-learning / Bayesian-prior approach for thin-history processes. Supporting review: Baron Diaz et al., "Transfer Learning Approaches in Bioprocess Engineering: Opportunities and Challenges," Biotechnology and Bioengineering, 2026, DOI: 10.1002/bit.70186, which covers few-batch transfer, e.g. Rogers et al. 2022 transferring from ~8 batches across microbial strains.)
- U.S. FDA (CDER, in collaboration with CBER and CDRH/Digital Health Center of Excellence), "Using Artificial Intelligence and Machine Learning in the Development of Drug and Biological Products" (discussion paper), May 2023 (revised Feb 2025). https://www.fda.gov/media/167973/download (this is the exact document the chapter links to, media/167973). (A companion CDER paper, "Artificial Intelligence in Drug Manufacturing," FDA media/165743, docket FDA-2023-N-0487, covers the manufacturing side.) Paired with: Minero T, Kuger L (ISPE), "The 7th ISPE Pharma 4.0 Survey: Digital Transformation," Pharmaceutical Engineering, Sept/Oct 2024. https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital. (The ISPE survey found AI/ML has the highest number of pilot projects yet few small- or large-scale implementations: the "most pilots, fewest scaled deployments" finding cited in the prose.)
The Production Bioreactor: Soft Sensors, Closed-Loop Control, and the Digital Twin
- Abu-Absi, N. R., Kenty, B. M., Cuellar, M. E., Borys, M. C., Sakhamuri, S., Strachan, D. J., Hausladen, M. C., & Li, Z. J. (2011). Real time monitoring of multiple parameters in mammalian cell culture bioreactors using an in-line Raman spectroscopy probe. Biotechnology and Bioengineering, 108(5), 1215-1221. DOI: 10.1002/bit.23023. The foundational demonstration of in-line Raman with PLS multivariate calibration for simultaneous monitoring of glucose, lactate, glutamine, glutamate, ammonium, viable cell density, and total cell density in a CHO bioreactor; establishes the PLS chemometrics pipeline that remains the workhorse of spectroscopic bioprocess PAT (cited for the claim that PLS is the decades-old chemometric workhorse, hard to beat in the small-data Raman regime, and established in the literature for over a decade).
- Berry, B., Moretto, J., Matthews, T., Smelko, J., & Wiltberger, K. (2015). Cross-scale predictive modeling of CHO cell culture growth and metabolites using Raman spectroscopy and multivariate analysis. Biotechnology Progress, 31(2), 566-577. DOI: 10.1002/btpr.2035 (PMID: 25504860). Multi-scale (3 L / 200 L / 2,000 L) in-line Raman PLS models for CHO growth and metabolites, with glucose, lactate, and osmolality well-modeled and demonstrated cross-scale model transfer; corroborates that Raman + PLS soft-sensing of glucose, lactate, and titer in CHO culture is an established, more-than-a-decade-old practice.
- Rosado, P. J., Merheb, B., & Toro, A. (2023). Real-Time, Data-Driven, and Predictive Modeling: Accelerating Digital Transformation in Drug Substance Commercial Manufacturing. BioProcess International (9 February 2023). Authors are Amgen Manufacturing Limited engineers at Juncos, Puerto Rico; describes an orthogonal-PLS (OPLS) batch-level model deployed in SIMCA-online to predict cell-culture harvest titer in real time inside commercial GMP drug-substance manufacturing (reported model fit R-squared 0.91, Q-squared 0.85). SIMCA is a registered trademark of Sartorius AG; a companion Sartorius vendor case study covers the same deployment. NOTE: first-party / vendor-self-reported evidence tier (Amgen authors plus Sartorius case study), not independently audited. URL: https://www.bioprocessintl.com/pat/real-time-data-driven-and-predictive-modeling-accelerating-digital-transformation-in-drug-substance-commercial-manufacturing
- Rashedi, M., Demers, M., Khodabandehlou, H., Wang, T., Garvin, C., & Rianna, S. (2025). Continuous glucose feedback control using Raman spectroscopy and deep learning models for biopharmaceutical processes. Biotechnology Progress, 41(4), e70020. DOI: 10.1002/btpr.70020 (PMID: 40172019; Epub 2025 Apr 2). All authors affiliated with Amgen Inc. (Operations Transformation and Digital Strategy / Process Development, Thousand Oaks CA and West Greenwich RI). Peer-reviewed demonstration of closed-loop continuous glucose control in CHO culture using Raman spectroscopy with deep-learning soft sensors (CNN and variational-autoencoder just-in-time learning) versus bolus feeding; supports the claim that Raman-plus-deep-learning closed-loop glucose control has been demonstrated at Amgen at pilot maturity.
- Agarwal, P., McCready, C., Ng, S. K., Ng, J. C., van de Laar, J., Pennings, M., & Zijlstra, G. (2025). Hybrid modeling for in silico optimization of a dynamic perfusion cell culture process. Biotechnology Progress, 41(1), e3503. DOI: 10.1002/btpr.3503 (PMID: 39291457). Lead authors affiliated with Sartorius Corporate Research (Oakville, Ontario) and Sartorius Netherlands. Canonical Sartorius hybrid (gray-box) process-twin example: a mechanistic dynamic backbone paired with a neural-network model for viable cell density / mAb production, used for in silico optimization at development/perfusion scale rather than as a commercial autopilot; supports the hybrid mechanistic-plus-ML twin claim at pilot maturity.
- Wang, J., Chen, J., Studts, J., & Wang, G. (2024). Simultaneous prediction of 16 quality attributes during protein A chromatography using machine learning based Raman spectroscopy models. Biotechnology and Bioengineering, 121. DOI: 10.1002/bit.28679 (PMID: 38419489). Authors from Late Stage Downstream Process Development, Boehringer Ingelheim Pharma GmbH & Co. KG (Biberach an der Riss, Germany), with Karlsruhe Institute of Technology. Used Butterworth-filter preprocessing of in-line Raman with a k-nearest-neighbours (KNN) regressor (NOT deep learning) to predict 16 product quality attributes simultaneously, with KNN reducing high-molecular-weight prediction MAE three-fold versus PLS and PCR; the correct citation for the 16-attribute in-line Raman KNN demonstration.
- 908 Devices Inc. (2023, April 18). Resilience Demonstrates Lower Cost of Perfusion Bioreactor Process Using 908 Devices' REBEL At-line Analyzer (press release). Reports National Resilience using the REBEL at-line media analyzer to monitor amino-acid depletion and optimize cell-culture feed strategies in a mAb perfusion process, demonstrating a 50% titer increase by adding back only depleted nutrients. NOTE: vendor press release / self-reported; this is a PAT-plus-manual-feed-optimization result, NOT a machine-learning deployment, and must not be presented as ML lifting titer. URL: https://908devices.com/news/resilience-demonstrates-lower-cost-of-perfusion-bioreactor-process-using-908-devices-rebel-at-line-analyzer/
- U.S. FDA, CDER/CBER/CDRH (2023, May; revised Feb 2025). Using Artificial Intelligence and Machine Learning in the Development of Drug and Biological Products (discussion paper). Docket FDA-2023-N-0743. Notes AI/ML support of advanced pharmaceutical manufacturing (process controls, early-warning monitoring, batch-loss prevention) alongside PAT and continuous manufacturing. URL: https://www.fda.gov/media/167973/download (Federal Register: https://www.federalregister.gov/d/2023-09985). Paired with: ISPE (2024). The 7th ISPE Pharma 4.0 Survey: Digital Transformation. Pharmaceutical Engineering, Sept/Oct 2024, which surveyed 19 enabling technologies and found AI/ML among the more-adopted technologies while advanced modeling and autonomous-control technologies remain mainly pilot/small-scale. URL: https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital. Both support the throughline that AI/ML in this part of the plant is strongest as human-in-the-loop monitoring/soft sensing and thinner as autonomous closed-loop control of a CQA.
Harvest and Clarification: Predicting the Endpoint
- Running-example dataset (primary / self-reported). Committed simulator datasets in the companion repository: examples/datasets/offline_assays.csv (per-batch VCD, viability, and titer trajectory — schema: sample_id, batch_id, sample_time, sample_point, VCD_e6_per_mL, viability_pct, glucose_g_L, lactate_g_L, glutamine_mM, ammonia_mM, osmolality_mOsm_kg, titer_g_L, pH_offline) and examples/datasets/hplc_results.csv (release-stage assays with spec limits and result — schema: batch_id, test, value, unit, spec_low, spec_high, result). These are the source of the BATCH-2026-001 final-sample figures (19.66e6 cells/mL, 68.0% viability, 5.877 g/L titer; HCP 28.203 ng/mg) and the BATCH-2026-004 out-of-spec HCP failure at 128.0 ng/mg against the 100 ng/mg limit. Self-reported, in-repository synthetic-but-realistic dataset of the book's running example, not an external measurement; integrity-checked via examples/datasets/MANIFEST.sha256.
- Rosado PJ, Merheb B, Toro A. "Drug Substance Manufacturing: Accelerating Digital Transformation" (also titled "Real-Time, Data-Driven, and Predictive Modeling: Accelerating Digital Transformation in Drug-Substance Commercial Manufacturing"). BioProcess International, 9 February 2023. Amgen Manufacturing Limited, Juncos, Puerto Rico. Describes a batch-level model (BLM) using an orthogonal partial least squares (OPLS) algorithm built in Sartorius SIMCA (v16.0) and deployed for real-time harvest-titer prediction in SIMCA-online (v16.1) under 21 CFR Part 11 / computer-system-validation controls in commercial GMP drug-substance manufacturing; reported model fit R2 ~0.91, Q2 ~0.85; reported elimination of ~6 hours of harvest idle time (waiting on the off-line titer assay) and ~10 hours of idle time between chromatography columns. Vendor/self-reported tier: first-party account by Amgen engineers in trade press alongside a Sartorius case study; the specific hour savings are self-reported and not independently audited. URL: https://www.bioprocessintl.com/pat/real-time-data-driven-and-predictive-modeling-accelerating-digital-transformation-in-drug-substance-commercial-manufacturing
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research (CDER). "Artificial Intelligence in Drug Manufacturing" (discussion paper), published 1 March 2023, Docket No. FDA-2023-N-0487; issued under CDER's FRAME (Framework for Regulatory Advanced Manufacturing Evaluation) initiative; request for information and comments (Federal Register, 1 March 2023; comment period reopened 27 September 2023). The paper frames ML in pharmaceutical manufacturing around risk-based model development/validation, data governance, lifecycle management, and human oversight within the pharmaceutical quality system rather than autonomous control of CQAs. Paper PDF: https://www.fda.gov/media/165743/download ; Federal Register notice: https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and (regulatory document).
- European Commission / EMA, EudraLex Volume 4 — EU GMP Guide, draft Annex 22 "Artificial Intelligence," published for public consultation 7 July 2025 (comments to 7 October 2025), released alongside the revised draft Annex 11 (Computerised Systems) and revised Chapter 4 (Documentation). Drafted by the EMA GMDP Inspectors' Working Group in cooperation with PIC/S (with FDA and MHRA as observers). The draft sets a structured, risk-based framework for AI/ML in GMP manufacturing: intended-use definition, validation, lifecycle management, explainability, and human-in-the-loop oversight, and restricts generative AI, large language models, and continuously-learning ("dynamic") models for GMP-critical activities — i.e., a locked, validated model supporting (not replacing) a human decision. Draft/consultation regulatory document (not yet finalized; final publication anticipated 2026). Overview: https://www.gmp-compliance.org/guidelines/gmp-guideline/eu-gmp-annex-22-draft-2025-artificial-intelligence ; consultation announcement: https://www.gmp-compliance.org/gmp-news/drafts-of-eu-gmp-guideline-annex-11-annex-22-and-chapter-4-released-for-comment
Capture Chromatography: Hybrid Models and Real-Time Pooling
- Kumar, V., Leweke, S., von Lieres, E., & Rathore, A. S. (2015). Mechanistic modeling of ion-exchange process chromatography of charge variants of monoclonal antibody products. Journal of Chromatography A, 1426, 140-153. DOI: 10.1016/j.chroma.2015.11.062. Establishes the general rate model + steric mass action (SMA) framework calibrated on small-scale runs as a predictive mechanistic model (digital twin) for antibody chromatography; see also Saleh, D., et al. (2021), In silico process characterization for biopharmaceutical development following the quality by design concept, Biotechnology Progress 37(6):e3196, DOI: 10.1002/btpr.3196. Cytiva GoSilico (ChromX/DSPX, acquired by Cytiva in 2021) is the canonical commercial mechanistic-modeling implementation (vendor product page, vendor-self-reported: https://www.cytivalifesciences.com/en/us/solutions/bioprocessing/products-and-solutions/gosilico-chromatography-modeling-software).
- Leweke, S., & von Lieres, E. (2018). Chromatography Analysis and Design Toolkit (CADET). Computers & Chemical Engineering, 113, 274-294. DOI: 10.1016/j.compchemeng.2018.02.025. CADET is the open-source (GPL) mechanistic chromatography simulator originally developed by Eric von Lieres at the Institute of Bio- and Geosciences (IBG-1), Forschungszentrum Jülich; it solves the general rate model and related column models. Software and documentation: https://github.com/cadet and https://cadet-web.de.
- Narayanan, H., Seidler, T., Luna, M. F., Sokolov, M., Morbidelli, M., & Butté, A. (2021). Hybrid models for the simulation and prediction of chromatographic processes for protein capture. Journal of Chromatography A, 1650, 462248. DOI: 10.1016/j.chroma.2021.462248. Demonstrates ANN + mechanistic (lumped kinetic model) hybrids for protein-capture breakthrough prediction with markedly lower error than mechanistic-only models on small data; the serial-vs-parallel hybrid taxonomy and small-data rationale are reviewed in von Stosch, M., Oliveira, R., Peres, J., & Feyo de Azevedo, S. (2014), Hybrid semi-parametric modeling in process systems engineering: Past, present and future, Computers & Chemical Engineering, 60, 86-101, DOI: 10.1016/j.compchemeng.2013.08.008.
- Hybrid AI/ML-mechanistic framework enables intelligent optimization of commercial biopharmaceutical downstream processing (2026). mAbs, 18(1). DOI: 10.1080/19420862.2026.2644662 (PubMed PMID: 41879118). Peer-reviewed pilot that screened 30 input factors against critical quality attributes and yield across 400+ commercial manufacturing lots, refined the screen with equilibrium-dispersive and steric-mass-action mechanistic models, and ran over 40,000 in silico optimizations to resolve a yield-purity trade-off on an anion-exchange step for a PEGylated protein, reporting a 12% increase in yield and 33% reduction in high-molecular-weight impurities. NOTE: the improvement figures are self-reported by the authoring manufacturer on its own single-step in-silico optimization (not closed-loop control) and have not been independently reproduced.
- Tang, S.-Y., Yuan, Y.-H., Sun, Y.-N., Yao, S.-J., Wang, Y., & Lin, D.-Q. (2025). Developing physics-informed neural networks for model predictive control of periodic counter-current chromatography. Journal of Chromatography A, 1739, 465514. DOI: 10.1016/j.chroma.2024.465514 (Epub 2024 Nov 12; PubMed PMID: 39566288). A PINN based on the general rate model accelerates four-column periodic counter-current (4C-PCC) Protein A capture modeling, cutting offline breakthrough-curve fitting from 2608.6 s to 110.7 s and completing online simulations in 12-14 s, enabling real-time model-predictive control. See also the follow-on real-time MPC demonstration: Tang et al. (2026), Real-Time Model Predictive Control of Monoclonal Antibody Capture in Continuous Manufacturing Using Physics-Informed Neural Networks Accelerated Mechanistic Modeling, Biotechnology and Bioengineering, DOI: 10.1002/bit.70141. This is a research result, not a deployed plant control loop.
- Ramakrishna, A., & Rathore, A. S. (2024). On-line PAT based monitoring and control of resin aging in protein A chromatography for COGs reduction. Journal of Chromatography B, 1234, 124010 (ScienceDirect pii: S1570023224000187). DOI: 10.1016/j.jchromb.2024.124010 (PubMed PMID: 38266612). A pilot using on-line PAT with principal component analysis (PCA) and batch-level modeling of chromatographic data that detects significant Protein A resin aging 20-25 cycles prior to observable yield decline, with a proposed cleaning-triggered control strategy projected to extend resin lifespan by 50-100 cycles (a modeled benefit, not yet a validated GMP outcome).
- Nitika, N., Keerthiveena, B., Thakur, G., & Rathore, A. S. (2024). Convolutional Neural Networks Guided Raman Spectroscopy as a Process Analytical Technology (PAT) Tool for Monitoring and Simultaneous Prediction of Monoclonal Antibody Charge Variants. Pharmaceutical Research, 41(3), 463-479. DOI: 10.1007/s11095-024-03663-9 (PubMed PMID: 38366234). A CNN model calibrated on Raman spectra from process-scale cation-exchange (CEX) chromatography quantifies acidic, main, and basic charge-variant species and total protein with R-squared of 0.94, 0.99, 0.96, and 0.99 respectively, enabling real-time CEX pooling decisions. This is a polishing-chromatography (CEX) research result on a different separation, not routine capture practice.
- Wang, J., Chen, J., Studts, J., & Wang, G. (2024). Simultaneous prediction of 16 quality attributes during protein A chromatography using machine learning based Raman spectroscopy models. Biotechnology and Bioengineering, 121(8). DOI: 10.1002/bit.28679 (received 14 Nov 2023, accepted 10 Feb 2024). A pilot at Boehringer Ingelheim (Late Stage Downstream Process Development, with Karlsruhe Institute of Technology) using a k-nearest-neighbor (KNN) regressor — explicitly NOT a deep-learning model — on Butterworth-filtered Raman spectra to predict 16 product quality attributes in-line during Protein A capture chromatography.
Viral Safety: Learning Log-Reduction and Orthogonal Clearance
- International Council for Harmonisation. "ICH Q5A(R2): Viral Safety Evaluation of Biotechnology Products Derived from Cell Lines of Human or Animal Origin." Final Guideline, adopted by the ICH Assembly 1 November 2023 (revising the original 1998/R1 guideline). Available at https://database.ich.org/sites/default/files/ICH_Q5A(R2)_Guideline_2023_1101.pdf . Adopted as an FDA Guidance for Industry (Federal Register notice, 11 January 2024, https://www.federalregister.gov/documents/2024/01/11/2024-00407/ ) and by EMA/CHMP (Step 5, effective 14 June 2024). Establishes the three-pillar viral-safety framework — low-risk cell-line/raw-material selection, detection testing of unprocessed bulk, and demonstrated process clearance capacity validated by spike studies.
- PDA Technical Report No. 47, "Preparation of Virus Spikes Used for Virus Clearance Studies" (Parenteral Drug Association, 2010), which codifies the industry technical practice for designing scaled-down virus spike/clearance studies (https://www.pda.org/bookstore/product-detail/1207-tr-47-preparation-of-virus-spikes ); together with the conservative LRV-reporting practice formalized in Li, N. & Yang, H., "Statistical evaluations of viral clearance studies for biological products," Biologicals 40(6):439-444 (2012), DOI 10.1016/j.biologicals.2012.07.016, which describes reporting the lower 95% confidence bound of the reduction factor as the conservative clearance estimate.
- Tang, A., Ramos, I., Newell, K. & Stewart, K. D., "A novel high-throughput process development screening tool for virus filtration," Journal of Membrane Science 611:118330 (2020), DOI 10.1016/j.memsci.2020.118330. A 96-well Viresolve Pro scale-down model generating flux-decay curves across multiple mAb feeds, demonstrating that virus-filtration performance (flux/throughput) can be predicted from feed quality and process conditions; supports the chapter's claim that feed-biophysical and process parameters drive filter performance. (Peer-reviewed; small-scale process-development tool, not a release method.)
- Shirataki, H., Gudex, L. & Wickramasinghe, S. R., "Modeling virus filtration: Materials, applications, and mechanism," iScience 28:111533 (2025), DOI 10.1016/j.isci.2024.111533. A review (lead author Hironobu Shirataki, a Merck/MilliporeSigma virus-filtration scientist — the manufacturer of the Viresolve filter line, and citing his own prior multilayer/blocking-model work) covering prediction of virus-filtration performance from feed/material properties and fouling/blocking mechanisms; the representative, correctly-attributed Merck-group example referenced in the chapter. (Peer-reviewed review.)
- Tuo, J., Zha, M., Li, H., Xie, D., Wang, Y., Sheng, G.-P. & Wang, Y., "Predictive modeling and insight into protein fouling in microfiltration and ultrafiltration through one-dimensional convolutional models," Separation and Purification Technology 352:128237 (2025), DOI 10.1016/j.seppur.2024.128237. Uses 1D convolutional (deep-learning) models to predict and interpret protein-fouling flux trajectories in membrane filtration; supports the chapter's claim that fouling/flux-decline signatures carry a learnable retention/performance signal. (Peer-reviewed research; not a deployed release tool.)
- U.S. FDA, Center for Drug Evaluation and Research (CDER), "Artificial Intelligence in Drug Manufacturing," Discussion Paper (1 March 2023). Federal Register notice with request for information and comments, https://www.federalregister.gov/documents/2023/03/01/2023-04206/ (comment period reopened 27 September 2023). Sets out risk-based, model-credibility considerations for AI/ML in pharmaceutical manufacturing — the model-risk/credibility framing the chapter invokes for a model influencing a viral-clearance decision.
- European Commission, EudraLex Volume 4, GMP Annex 22 "Artificial Intelligence" — draft for public consultation, published July 2025 (consultation closed October 2025; final adoption expected 2026). Consultation draft: https://health.ec.europa.eu/document/download/5f38a92d-bb8e-4264-8898-ea076e926db6_en . Restricts critical GMP applications to static/"locked" deterministic models under a predetermined change-control plan and excludes dynamic self-learning models and generative AI/LLMs from GMP-critical decisions — the basis for the chapter's statement that adaptive AI is excluded from critical GMP functions.
Polishing Chromatography: Trajectory Models and Resin Lifetime
- Brooks CA, Cramer SM. "Steric mass-action ion exchange: Displacement profiles and induced salt gradients." AIChE Journal. 1992;38(12):1969-1978. DOI: 10.1002/aic.690381212. (Foundational steric mass action isotherm underlying mechanistic ion-exchange/general-rate-model chromatography simulation.) See also Leweke S, von Lieres E. "Chromatography Analysis and Design Toolkit (CADET)." Computers & Chemical Engineering. 2018;113:274-294. DOI: 10.1016/j.compchemeng.2018.02.025 (open-source general-rate-model + SMA simulator). Vendor counterpart: Cytiva GoSilico Chromatography Modeling Software (ChromX/DSPX) product documentation, https://www.cytivalifesciences.com (vendor/self-reported).
- Hahn T, Geng X, Vazquez-Villegas P, et al. "Mechanistic modeling, simulation, and optimization of mixed-mode chromatography for an antibody polishing step." Biotechnology Progress. 2023;39(2):e3316. DOI: 10.1002/btpr.3316. PMID: 36471899. (Peer-reviewed industrial case: mechanistic model — steric mass action / colloidal-particle-adsorption isotherm — built from six experiments for a pH-controlled antibody polishing step, with in silico optimization and validation; describes fragments, aggregates, and HCP.)
- Nitika N, Keerthiveena B, Thakur G, Rathore AS. "Convolutional Neural Networks Guided Raman Spectroscopy as a Process Analytical Technology (PAT) Tool for Monitoring and Simultaneous Prediction of Monoclonal Antibody Charge Variants." Pharmaceutical Research. 2024;41(3):463-479. DOI: 10.1007/s11095-024-03663-9. PMID: 38366234. (Deep-learning CNN on Raman spectra predicting acidic/main/basic charge-variant fractions during process-scale CEX; a genuine deep-learning research result on the charge-variant pooling problem, reporting high R2 — supporting the 0.94 to 0.99 claim.)
- Chen J, Wang J, Hess R, Wang G, Studts J, Franzreb M. "Application of Raman spectroscopy during pharmaceutical process development for determination of critical quality attributes in Protein A chromatography." Journal of Chromatography A. 2024;1718:464721. DOI: 10.1016/j.chroma.2024.464721. PMID: 38341902. (Raman-based PAT during Protein A capture predicting multiple CQAs using k-nearest-neighbor (KNN) regression — explicitly NOT a deep-learning method — confirming the contrast drawn in the text.)
- Kim N, Kwon S, Kim Y, Kim G, Kim Y, Saxena L (Samsung Biologics). "Predictive Algorithm Modeling for Early Assessments in Downstream Processing: Using Direct Transition and Moment Analysis To Assess Chromatography Column Integrity at Production Scale." BioProcess International, March 21, 2023. https://www.bioprocessintl.com/chromatography/predictive-algorithm-modeling-for-early-assessments-in-downstream-processing-using-direct-transition-and-moment-analysis-to-assess-chromatography-column-integrity-at-production-scale (established trade press; Samsung Biologics describes integrating moment analysis with direct transition analysis — HETP, transwidth, asymmetry — for near-real-time column-integrity monitoring across all chromatography steps at production scale).
- Ramakrishna A, Rathore AS. "On-line PAT based monitoring and control of resin aging in protein A chromatography for COGs reduction." Journal of Chromatography B. 2024;1234:124010. DOI: 10.1016/j.jchromb.2024.124010. PMID: 38266612. (Pilot study using on-line PAT with principal component analysis and batch-level modeling that detected Protein A resin aging 20 to 25 cycles before observable yield decline, with a proposed control strategy to extend resin life — a modeled benefit, not a validated GMP outcome.)
- Feidl F, Luna MF, Podobnik M, Vogg S, Angelo J, Potter K, Wiggin E, Xu X, Ghose S, Li ZJ, Morbidelli M, Butte A. "Model based strategies towards protein A resin lifetime optimization and supervision." Journal of Chromatography A. 2020;1625:461261. DOI: 10.1016/j.chroma.2020.461261. (Model-based strategies — PCA/PLS statistical monitoring plus a hybrid deterministic lumped-kinetic aging model with two aging parameters — for protein A resin lifetime optimization, cleaning-procedure optimization, and column-quality supervision toward predictive maintenance.)
- Ravi N, Malmquist G, Stanev V, Ferreira G (AstraZeneca). "Exploring features in chromatographic profiles as a tool for monitoring column performance." Journal of Chromatography A. 2023;1698:463982. DOI: 10.1016/j.chroma.2023.463982. PMID: 37087858. (Feature-mining of chromatographic absorbance profiles — PCA, PLS, similarity scores, SOP-MPT with ML principles — to monitor column performance and anticipate resin degradation, in cases where traditional HETP/asymmetry could not detect the change.)
UF/DF and Drug Substance: Soft-Sensing Concentration and Excipients
- Repligen Corporation. "PATsmart FlowVPX System" and "FlowVPE/FlowVPX Application: Protein A280" product/application pages, CTech Analytical Solutions (Repligen). Variable Pathlength Technology (VPT) for dilution-free, in-line UV280 protein concentration measurement (0.1 to >250 mg/mL) integrated with the KrosFlo KR2i automated TFF system to monitor and control UF/DF by concentration rather than weight. Vendor / self-reported. https://www.repligen.com/flowvpx and https://ctech.repligen.com/application-flowvpe-flowvpx-protein-a280. On the acquisition that brought the analytics portfolio to Repligen: "Repligen Purchases Bioprocessing Analytics Portfolio from 908 Devices," press release, March 4, 2025 (USD 70 million cash; MAVERICK, MAVEN, REBEL, ZipChip), https://investors.repligen.com/press-releases/news-details/2025/Repligen-Purchases-Bioprocessing-Analytics-Portfolio-from-908-Devices/default.aspx; corroborated by BioProcess International, "Repligen lays down $70m to buy PAT tools from 908 Devices," 2025, https://www.bioprocessintl.com/upstream-downstream-processing/repligen-lays-down-70m-to-buy-analytics-tech-from-908-devices.
- Jesubalan, N. G., Saxena, N., Yezhuvath, V. B., Deore, N., & Rathore, A. S. (2025/2026). "AI-Enhanced Continued Process Verification for Ultrafiltration/Diafiltration." Biotechnology and Bioengineering, 123(1), 146-163. DOI: 10.1002/bit.70075. Peer-reviewed. Develops a CPV platform for UF/DF that treats the run trajectory as a multivariate object monitored batch-to-batch using control charts and machine learning for a monoclonal antibody process, aligned to FDA and EU GMP Annex 15 lifecycle process-control expectations. https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/10.1002/bit.70075
- Rolinger, L., Rudt, M., & Hubbuch, J. (2023). "Monitoring of ultra- and diafiltration processes by Kalman-filtered Raman measurements." Analytical and Bioanalytical Chemistry, 415(5), 841-854. DOI: 10.1007/s00216-022-04477-7. Peer-reviewed (open access; PMC9883314). Combines in-line Raman spectroscopy with an Extended Kalman Filter / semi-mechanistic model to monitor protein and excipient (buffer) concentration in real time during UF/DF across three case studies (lysozyme, mAb, bsAb), improving sensitivity for the diafiltration progress versus density measurements. https://link.springer.com/article/10.1007/s00216-022-04477-7
- U.S. Food and Drug Administration (FDA), Center for Drug Evaluation and Research (CDER). "Artificial Intelligence in Drug Manufacturing" (Discussion Paper), published March 1, 2023; Docket No. FDA-2023-N-0487; part of the FRAME initiative. Frames AI in pharmaceutical manufacturing within a risk-based, lifecycle, human-oversight model with predetermined change control rather than autonomous control of quality attributes. https://www.fda.gov/media/165743/download (Federal Register notice: https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and).
- European Commission, EudraLex Volume 4 (EU GMP), draft Annex 22 "Artificial Intelligence." Draft released for public consultation 7 July 2025 (alongside revisions to Annex 11 and Chapter 4). The first EU GMP framework dedicated to AI: permits only static, deterministic, fixed/validated (locked) AI/ML models for GMP-impacting uses, with explainability, validation, lifecycle management, and human-in-the-loop oversight; learning/adaptive and probabilistic models are not permitted for critical GMP applications. Consultation overview: ECA Academy, "EU GMP Annex 22 (Draft 2025): Artificial Intelligence," https://www.gmp-compliance.org/guidelines/gmp-guideline/eu-gmp-annex-22-draft-2025-artificial-intelligence.
Formulation and Fill-Finish: Computer Vision and the Lyophilizer
- International Society for Pharmaceutical Engineering (ISPE), "Amgen's Deep Learning Approach to Vial Inspection," ISPE News, 2021, https://ispe.org/news/amgens-deep-learning-approach-vial-inspection (also reprinted as "Amgen's Deep Learning Approach To Vial Inspection," BioProcess Online / Pharmaceutical Online, https://www.bioprocessonline.com/doc/amgen-s-deep-learning-approach-to-vial-inspection-0001). Vendor/company self-reported. Describes Amgen's multi-year deep-learning automated visual inspection program (with Syntegon/Bosch), the first fully validated AI inspection retrofit on a syringe line at Juncos, Puerto Rico, the high proportion of syringes and vials auto-released through automated inspection, and direct engagement with FDA. (Note: the commonly cited ~95% auto-release figure is a company self-report; published Amgen/Syntegon validation metrics for a critical station are ~70% higher particle detection and ~60% fewer false rejects.)
- Steinwandter, V., Borchert, D., & Herwig, C., "Data science tools and applications on the way to Pharma 4.0," Drug Discovery Today, 24(9):1795-1805, 2019, DOI: 10.1016/j.drudis.2019.06.005. See also Rantanen, J. & Khinast, J., "The Future of Pharmaceutical Manufacturing Sciences," Journal of Pharmaceutical Sciences, 104(11):3612-3638, 2015, DOI: 10.1002/jps.24594. Peer-reviewed reviews documenting that productionized ML/data-science in pharma manufacturing clusters around monitoring, fault/anomaly detection, predictive maintenance, and (especially) machine-vision inspection, rather than autonomous closed-loop control of CQAs, which remains advisory/human-in-the-loop.
- Delgado, J. (Amgen) et al., as reported in PDA Letter, "Could A.I. Optimize Visual Inspection?", Parenteral Drug Association, https://www.pda.org/pda-letter-portal/home/full-article/could-a.i.-optimize-visual-inspection ; and "Amgen's Deep Learning Approach To Vial Inspection," BioProcess Online, https://www.bioprocessonline.com/doc/amgen-s-deep-learning-approach-to-vial-inspection-0001. Conference/expert-reported. Cites estimates that rule-based automated visual inspection platforms false-reject up to roughly 20% (and sometimes more) of acceptable product (e.g., misreading glare as a glass crack or bubbles as particles).
- Stevanato Group S.p.A., "Stevanato Group launches Artificial Intelligence platform to further enhance detection performance of its visual inspection machines," press release, 23 February 2021, https://www.stevanatogroup.com/en/news-events/press-releases/launch-of-artificial-intelligence-platform/ ; product page "Vision AI / Artificial Intelligence," https://www.stevanatogroup.com/en/offering/visual-inspection/inspection-technologies/artificial-intelligence/ . Vendor materials (self-reported). Claims a deep-learning (Vision AI) platform achieving up to 99.9% detection accuracy for both particle and cosmetic defects and a tenfold (order-of-magnitude) reduction in false rejects.
- Brevetti CEA, "Brevetti CEA finalizes full acquisition of Brevetti AI (former Criterion AI), reinforcing Deep learning capabilities for regulated industrial applications," company news, 2025, https://www.brevetti-cea.com/news/brevetti-cea-finalizes-full-acquisition-of-brevetti-ai-former-criterion-ai-reinforcing-deep-learning-capabilities-for-regulated-industrial-applications ; and "Brevetti C.E.A. acquires Criterion AI and confirms its leadership in the vision inspection world," 2020, https://www.brevetti-cea.com/news/brevetti-c-e-a-acquires-criterion-ai-and-confirms-its-leadership-in-the-vision-inspection-world . Company-reported. Documents that Brevetti CEA absorbed the Danish deep-learning specialist Criterion AI (founded 2018), forming the dedicated subsidiary Brevetti AI for regulated pharmaceutical visual inspection (CNNs, anomaly detection), with full acquisition completed in 2025.
- Amgen authors, "Vision Inspection Using Machine Learning/Artificial Intelligence," Pharmaceutical Engineering (ISPE), November-December 2020, https://ispe.org/pharmaceutical-engineering/november-december-2020/vision-inspection-using-machine . Company/industry-authored, peer-edited trade journal. Frames the validation-versus-learning tension and the resolution of locking (freezing) the model at validation: the deep-learning model must be static/version-controlled to be validatable under GMP, and updated only through formal change control.
- U.S. Food and Drug Administration, "Considerations for the Use of Artificial Intelligence To Support Regulatory Decision-Making for Drug and Biological Products," Draft Guidance for Industry, January 2025 (Docket No. FDA-2024-D-4689; Federal Register, 7 January 2025, https://www.federalregister.gov/documents/2025/01/07/2024-31542/considerations-for-the-use-of-artificial-intelligence-to-support-regulatory-decision-making-for-drug ). FDA's first AI guidance for drugs/biologics; proposes a risk-based credibility-assessment framework (a seven-step process) for establishing an AI model's credibility for a defined context of use, explicitly spanning nonclinical, clinical, post-marketing, and manufacturing phases where AI output affects product safety, efficacy, or quality.
- European Commission, EudraLex Volume 4, "Annex 22: Artificial Intelligence" (draft for public consultation), published July 2025, consultation document: https://health.ec.europa.eu/document/download/5f38a92d-bb8e-4264-8898-ea076e926db6_en?filename=mp_vol4_chap4_annex22_consultation_guideline_en.pdf . The first manufacturing-specific GMP rule on AI; restricts critical GMP applications (direct impact on patient safety, product quality, or data integrity) to static, deterministic machine-learning models locked at validation and governed by predefined change control. (Issued jointly within the EU and PIC/S GMP frameworks.)
- European Commission, EudraLex Volume 4, "Annex 22: Artificial Intelligence" (draft, July 2025), consultation document https://health.ec.europa.eu/document/download/5f38a92d-bb8e-4264-8898-ea076e926db6_en?filename=mp_vol4_chap4_annex22_consultation_guideline_en.pdf ; with analysis in European Pharmaceutical Review, "What Annex 22 spells for AI in GMP manufacturing," https://www.europeanpharmaceuticalreview.com/what-annex-22-spells-for-ai-in-gmp-manufacturing/2135686.article . Confirms Annex 22 excludes dynamic/continuously-learning models, non-deterministic/probabilistic models, generative AI, and large language models from critical GMP applications (these are permitted only in non-critical uses under human oversight).
- U.S. Food and Drug Administration, Warning Letter to Purolea Cosmetics Lab (Livonia, MI), CMS #722591 / 320-26-58, dated 2 April 2026, https://www.fda.gov/inspections-compliance-enforcement-and-criminal-investigations/warning-letters/purolea-cosmetics-lab-722591-04022026 . FDA's first warning letter explicitly citing inappropriate use of AI in pharmaceutical manufacturing: the firm used AI agents to generate specifications, procedures/SOPs, and master production/control records without adequate Quality Unit review (the AI omitted process-validation requirements). Cited under 21 CFR 211.22(c) and 211.100. See also "FDA's First cGMP Enforcement Action On AI Misuse In Drug Manufacturing," BioProcess Online, https://www.bioprocessonline.com/doc/fda-s-first-cgmp-enforcement-action-on-ai-misuse-in-drug-manufacturing-0001 .
QC and Release: MSPC, Real-Time Release, and Predicting the OOS
- Sartorius (Umetrics), "SIMCA and SIMCA-online — Multivariate Data Analysis Software," product documentation, https://www.sartorius.com/en/products/process-analytical-technology/data-analytics-software/mvda-software/simca (vendor product page; accessed 2026). And AspenTech / ProSensus, "ProMV — Multivariate Statistical Process Control and Batch Analytics," https://www.aspentech.com/en/products/msc/aspen-promv (vendor product page; accessed 2026). Both are productized PCA/PLS monitoring tools that compute Hotelling's T-squared and SPE/Q control charts with contribution plots for continued process verification, golden-batch monitoring, and fault detection. Methodology corroborated by D. Kirdar et al., "Application of Multivariate Data Analysis for Identification and Successful Resolution of a Root Cause for a Bioprocessing Application," Biotechnology Progress 24(3):720-726 (2008), DOI: 10.1021/bp0704384, which demonstrates SIMCA-based T-squared/SPE/contribution-plot monitoring on a biopharmaceutical process.
- P. J. Rosado, B. Merheb, and A. Toro (Amgen Manufacturing Limited, Juncos, Puerto Rico), "Real-Time, Data-Driven, and Predictive Modeling: Accelerating Digital Transformation in Drug Substance Commercial Manufacturing," BioProcess International, March 2024, https://www.bioprocessintl.com/pat/real-time-data-driven-and-predictive-modeling-accelerating-digital-transformation-in-drug-substance-commercial-manufacturing (first-party / self-reported). Describes an orthogonal PLS (OPLS) batch-level model deployed in Sartorius SIMCA-online to predict harvest titer in real time from commercial GMP cell-culture and harvest in-process data, with offline assays retained as the documented values and backup.
- P. Nomikos and J. F. MacGregor, "Monitoring Batch Processes Using Multiway Principal Component Analysis," AIChE Journal 40(8):1361-1375 (1994), DOI: 10.1002/aic.690400809. The foundational multiway-PCA framework for batch monitoring (fit on a historical database of successful batches; unfold the batch x variable x time cube; score new batches by T-squared and SPE). See also the companion P. Nomikos and J. F. MacGregor, "Multivariate SPC Charts for Monitoring Batch Processes," Technometrics 37(1):41-59 (1995), DOI: 10.1080/00401706.1995.10485888.
- ICH Harmonised Tripartite Guideline Q8(R2), "Pharmaceutical Development," Current Step 4 version, August 2009, ICH, https://database.ich.org/sites/default/files/Q8_R2_Guideline.pdf (also issued by FDA, https://www.fda.gov/media/71535/download, and EMA). Defines real-time release testing (RTRT) as "the ability to evaluate and ensure the quality of in-process and/or final product based on process data, which typically include a valid combination of measured material attributes and process controls," and positions it within the control strategy as an alternative to end-product testing.
- U.S. FDA, approval of supplement for PREZISTA (darunavir) 600 mg tablets manufactured by continuous manufacturing with real-time release testing at Janssen Supply Chain, Gurabo, Puerto Rico, April 2016 — the first FDA approval converting an approved product from batch to continuous manufacturing. Reported in S. Lawrence, "FDA Approves Tablet Production on Janssen Continuous Manufacturing Line," Pharmaceutical Technology, April 12, 2016, https://www.pharmtech.com/view/fda-approves-tablet-production-janssen-continuous-manufacturing-line; EMA follow-on approval (2017) reported in "EMA Approves Janssen Drug Made Via Continuous Manufacturing," Pharmaceutical Technology, https://www.pharmtech.com/view/ema-approves-janssen-drug-made-continuous-manufacturing. NIR-based PAT/RTRT reduced the release timeline from roughly two weeks to one day.
- AspenTech, "Ferring Prototypes Continuous Manufacturing with Advanced PAT and Closed-Loop Control, Potential for Dramatically Lower COGS," case study, https://www.aspentech.com/en/resources/case-studies/ferring-prototypes-continuous-manufacturing-with-advanced-pat-and-closed-loop-control (vendor case study; development-stage internal estimate, illustrative — not an industry-established figure). Describes Ferring Pharmaceuticals' continuous-manufacturing prototype using AspenTech PAT (Aspen Unscrambler, Aspen Process Pulse) with real-time release testing replacing time-consuming offline laboratory assays, projecting substantially lower cost of goods.
- U.S. FDA / CDER, "Artificial Intelligence in Drug Manufacturing," discussion paper, published in the Federal Register March 1, 2023 (Docket FDA-2023-N-0487; Request for Information and Comments), https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and (paper PDF at https://downloads.regulations.gov/FDA-2023-N-0487-0027/attachment_1.pdf). Raises as an open question — without prescribing a settled answer — how AI/ML models in a cGMP environment should be monitored, maintained, and re-validated given that model performance can decay silently after deployment.
- ISPE, "The 7th ISPE Pharma 4.0 Survey: Digital Transformation," Pharmaceutical Engineering, September-October 2024, https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital (survey conducted 2023). Finds that despite wide discussion, AI/ML have not reached large-scale implementation in pharmaceutical manufacturing — most deployments remain small-scale or pilot-stage and are clustered in monitoring and process-understanding use cases (e.g., multivariate statistical process monitoring) rather than autonomous decision-making such as automated batch release.
Packaging and Serialization: Vision, Track-and-Trace, and Anomalies
- U.S. Food and Drug Administration. Drug Supply Chain Security Act (DSCSA). Title II of the Drug Quality and Security Act (DQSA), Public Law 113-54, enacted November 27, 2013. Establishes requirements for the unique product identification (serialization with a standardized numerical identifier comprising the National Drug Code, serial number, lot number, and expiration date), tracing, and verification of prescription drug packages through the U.S. supply chain. FDA program page: https://www.fda.gov/drugs/drug-supply-chain-integrity/drug-supply-chain-security-act-dscsa (regulatory source). Supports the claim that DSCSA mandates unique identification, reportable movements, and pre-dispense verification of saleable units.
- European Union. Directive 2011/62/EU of the European Parliament and of the Council of 8 June 2011 (the Falsified Medicines Directive, FMD), amending Directive 2001/83/EC, and Commission Delegated Regulation (EU) 2016/161 of 2 October 2015 laying down detailed rules for the safety features (unique identifier carried in a 2D Data Matrix plus anti-tampering device) on the packaging of medicinal products for human use. Official Journal of the EU L 174 (1.7.2011) and L 32 (9.2.2016). Mandates a unique identifier per saleable pack, end-to-end repository/hub verification, and decommissioning at dispense. Regulatory source. Supports the EU FMD serialization-and-verification claim.
- Stevanato Group. "Stevanato Group launches Artificial Intelligence platform to further enhance detection performance of its visual inspection machines" (press release, 23 February 2021) and Vision AI product page, https://www.stevanatogroup.com/en/offering/visual-inspection/inspection-technologies/artificial-intelligence/ . States the deep-learning Vision AI platform can reduce false rejects tenfold and reach up to 99.9% detection accuracy for particle and cosmetic-defect inspection. VENDOR / SELF-REPORTED: the 99.9% and tenfold figures are vendor marketing claims, not peer-reviewed. Supports the chapter's deep-learning-vision capability claim and the explicit caveat that the headline numbers are vendor-reported.
- Syntegon Technology. "Syntegon validates first AI-equipped visual inspection system," https://www.syntegon.com/press/syntegon-validates-first-ai-equipped-visual-inspection-system (see also Contract Pharma coverage, https://www.contractpharma.com/breaking-news/syntegon-validates-first-ai-equipped-visual-inspection-system/ ). Reports the world-first fully validated AI visual-inspection system installed on a customer (Amgen) syringe line, achieving an approximately 70% increase in particle detection and approximately 60% reduction in false rejects at a particular inspection station. VENDOR / SELF-REPORTED. The ~20% rule-based AVI good-unit over-rejection figure attributed to Amgen is a conference/expert estimate, not peer-reviewed. Supports the false-reject-reduction claim and the note that the one fully validated public retrofit on record was a syringe (not vial) line.
- European Commission, EudraLex Volume 4 (EU GMP Guide). Draft Annex 22: Artificial Intelligence, published for public consultation 7 July 2025 (consultation closed October 2025; drafted by the EMA/GMDP Inspectors Working Group with PIC/S). Consultation guideline PDF: https://health.ec.europa.eu/document/download/5f38a92d-bb8e-4264-8898-ea076e926db6_en?filename=mp_vol4_chap4_annex22_consultation_guideline_en.pdf . Permits only static, deterministic AI models in critical GMP applications and excludes dynamic (continuously learning), probabilistic, and generative AI from those critical decisions; requires human-in-the-loop oversight. Regulatory source (draft). Supports the deterministic-rules-for-critical-decisions / ML-advisory-only claims.
- Borchert D., Suarez-Zuluaga D.A., Sagmeister P., Thomassen Y.E., Herwig C., et al. "CPV of the Future: AI-Powered Continued Process Verification for Bioreactor Processes." PDA Journal of Pharmaceutical Science and Technology, 2022 (published online 19 September 2022); PMID 36122916; DOI 10.5731/pdajpst.2021.012665. Collaboration including Aizon (Barcelona) and CSL Behring AG (Bern), with Universitat Autònoma de Barcelona and Universitat de Vic. Applies a multivariate Isolation Forest anomaly-detection model (with random-forest methods) to fed-batch bioprocess monitoring as a proof-of-concept on a microbial Pichia pastoris model system. Peer-reviewed. Grounds the Isolation-Forest/Random-Forest method choice for the residual anomaly layer.
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research (CDER). "Artificial Intelligence in Drug Manufacturing," discussion paper, published 1 March 2023 under the FRAME (Framework for Regulatory Advanced Manufacturing Evaluation) initiative; request for information and comments. Federal Register notice 2023-04206 (88 FR 12943, 1 March 2023), https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and ; paper at https://www.fda.gov/media/165743/download . Regulatory source. Supports the claim that FDA expects ML touching a quality-relevant decision to be advisory with human oversight rather than autonomous.
- U.S. Food and Drug Administration. Warning Letter to Purolea Cosmetics Lab (Livonia, Michigan), WL 320-26-58, issued 2 April 2026, following an October 2025 cGMP inspection. The first FDA warning letter to cite inappropriate use of artificial intelligence in drug manufacturing: the firm used AI agents to generate drug-product specifications, procedures, and master production/control records without adequate Quality Unit review (cited under 21 CFR 211.22(c)). Regulatory source. Reported by RAPS, https://www.raps.org/resource/fda-warns-firm-for-inappropriate-use-of-ai-in-drug-manufacturing.html , and ECA/GMP Compliance, https://www.gmp-compliance.org/gmp-news/use-of-ai-agents-leads-to-the-first-fda-warning-letter-relating-to-ai . Supports the cautionary tale contrasting AI-without-quality-review against the rules-first, ML-advisory discipline.
Distribution: Cold-Chain Prediction and Demand Forecasting
- International Council for Harmonisation (ICH), "Q1A(R2): Stability Testing of New Drug Substances and Products," ICH Harmonised Tripartite Guideline, Step 4 version, 6 February 2003. Adopted by FDA as Guidance for Industry (November 2003) and by EMA (CPMP/ICH/2736/99). Establishes the formal stability-study basis for assigning a drug product's shelf life, expiry, and labeled storage conditions. ICH database: https://database.ich.org/sites/default/files/Q1A%28R2%29%20Guideline.pdf ; FDA: https://www.fda.gov/media/71707/download
- United States Pharmacopeia, General Chapter <1079.2> "Mean Kinetic Temperature in the Evaluation of Temperature Excursions During Storage and Transportation of Drug Products," USP-NF. Defines the Mean Kinetic Temperature (Haynes' equation, J. D. Haynes, "Worldwide virtual temperatures for product stability testing," Journal of Pharmaceutical Sciences 60(6):927-929, 1971, doi:10.1002/jps.2600600631) and specifies the default heat of activation Ea = 83.144 kJ/mol and gas constant R = 8.3144 × 10^-3 kJ/(mol·K). Fixing these defaults is the convention that makes MKT reproducible across companies; by construction of the equation, MKT is always at or above the arithmetic mean temperature. USP excerpt: https://www.usp.org/sites/default/files/usp/document/supply-chain/apec-toolkit/USP%20GC1079.2.pdf
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research (CDER), "Artificial Intelligence in Drug Manufacturing," Discussion Paper, published 1 March 2023 (Docket No. FDA-2023-N-0487; 88 FR 12943; FRAME initiative). Solicits stakeholder comment on the lifecycle management, validation, and human-oversight expectations for ML/AI applied to pharmaceutical manufacturing, including advisory/quality-relevant models. Federal Register notice: https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and ; FDA paper: https://www.fda.gov/media/165743/download
- European Medicines Agency / European Commission, EudraLex Volume 4, GMP Guide, draft "Annex 22: Artificial Intelligence," released for public consultation 7 July 2025 (comment period to 7 October 2025), drafted by the EMA GMDP Inspectors Working Group jointly with PIC/S. The first dedicated EU GMP guidance on AI; sets a risk-based framework requiring documented intended use, validation, lifecycle management, explainability, and human-in-the-loop oversight for AI in GMP environments. EMA workshop/context page: https://www.ema.europa.eu/en/events/good-manufacturing-practice-multistakeholder-workshop-expert-contributions-artificial-intelligence-guidance-development-annex-22 ; draft overview (ECA Academy): https://www.gmp-compliance.org/gmp-news/drafts-of-eu-gmp-guideline-annex-11-annex-22-and-chapter-4-released-for-comment
- Composite, company-self-reported and survey sources for supply-chain ML. (a) Sanofi, press release "Sanofi 'all in' on artificial intelligence and data science...", 13 June 2023 - states the plai app (built with Aily Labs) predicts ~80% of low-inventory positions in its biopharma supply chain (self-reported): https://www.sanofi.com/en/media-room/press-releases/2023/2023-06-13-12-00-00-2687072 . The ~65% risk-to-root-cause figure is from Sanofi/Aily Labs corporate plai.supply material (separate, undated; self-reported). (b) Merck Healthcare (Merck KGaA, Darmstadt) cognitive supply-chain control tower built on Aera Technology's Cognitive Operating System (vendor/self-reported; note: the chapter's "Merck & Co." attribution is imprecise - the Aera deployment is Merck KGaA's): TechTarget, "Cognitive automation helps processes run on their own," https://www.techtarget.com/searcherp/feature/Cognitive-automation-helps-processes-run-on-their-own . (c) Merck KGaA + Palantir supply-chain forecasting partnership (Healthcare Acceleration Partnership, announced 2017; vendor/self-reported): https://www.genengnews.com/news/merck-kgaa-partners-with-palantir-on-data-analytics-across-healthcare-sector/ . (d) Independent counterweight: WBR Insights / LogiPharma "Supply Chain & Logistics Insights" report (survey of ~100 European heads of supply chain) - ~65% of pharma supply-chain leaders report limited confidence in AI's ability to predict or mitigate disruption, as reported by The Supply Chain Xchange, "Report: 65% of pharma supply chain leaders have limited confidence in AI": https://www.thescxchange.com/editorial/featured/report-65-of-pharma-supply-chain-leaders-lack-confidence-in-ai (also DC Velocity: https://www.dcvelocity.com/technology/artificial-intelligence/report-65-of-pharma-supply-chain-leaders-have-limited-confidence-in-ai ).
Hybrid Models and Digital Twins: The Dominant Paradigm
- Narayanan H, von Stosch M, Feidl F, Sokolov M, Morbidelli M, Butté A. "Hybrid modeling for biopharmaceutical processes: advantages, opportunities, and implementation." Frontiers in Chemical Engineering, vol. 5, 1157889 (2023). DOI: 10.3389/fceng.2023.1157889. Open-access review (several authors affiliated with DataHow AG) that lays out the serial/parallel hybrid taxonomy and argues that a mechanistic backbone supplies knowledge the data never had to provide, so the data-driven part has less to learn and succeeds where a pure black box starves in the small-data regime.
- Pinto J, Mestre M, Ramos J, Costa RS, Striedner G, Oliveira R. "A general deep hybrid model for bioreactor systems: Combining first principles with deep neural networks." Computers & Chemical Engineering, vol. 165, 107952 (2022). DOI: 10.1016/j.compchemeng.2022.107952. Demonstrates a general serial deep-hybrid framework (deep NN feeding a system of mass-balance ODEs) and reports that the deep hybrid model generalizes better than its shallow/black-box counterpart on identical data.
- Shahab MA, Destro F, Braatz RD. "Digital Twins in Biopharmaceutical Manufacturing: Review and Perspective on Human-Machine Collaborative Intelligence." arXiv:2504.00286 (2025), Massachusetts Institute of Technology. DOI: 10.48550/arXiv.2504.00286. Review of current biopharma digital-twin practice that places hybrid (mechanistic-plus-data-driven) modeling as the practical paradigm for bioprocess twins and catalogs why pure-ML deployments stall while hybrids ship.
- Polak N, et al. (DataHow AG and Bristol Myers Squibb co-authors incl. Hodgman CE, Borys M, Butté A). "An innovative hybrid modeling approach for simultaneous prediction of cell culture process dynamics and product quality." Biotechnology Journal, vol. 19, no. 4, e2300473 (2024). DOI: 10.1002/biot.202300473. Peer-reviewed DataHow/BMS companion to the BMS dataset (48 experiments at 5 L, 12 CPPs, 18 CQAs) reporting that hybrid models give substantially stronger CQA prediction with less experimental data than industry black-box models. VENDOR/SELF-REPORTED companion: DataHow case-study page, "Stronger CQA prediction with fewer experiments with DataHowLabs Hybrid Models" (datahow.ch case-study, Bristol Myers Squibb), which states different vendor figures (~22% accuracy gain, ~3x fewer experiments); prefer the peer-reviewed numbers (~33% better accuracy with about half the data) and read maturity as process-development/pilot, not GMP production.
- Rogers AW, Cardenas IOS, Del Rio-Chanona EA, Zhang D. "Investigating physics-informed neural networks for bioprocess hybrid model construction." In: Kokossis AC, Georgiadis MC, Pistikopoulos E (eds), 33rd European Symposium on Computer Aided Process Engineering, Computer Aided Chemical Engineering, vol. 52, Elsevier (2023), pp. 83-88. DOI: 10.1016/B978-0-443-15274-0.50014-7. Compares PINNs (physics carried as a soft loss) against structurally-embedded hybrid models for bioprocesses and documents the extrapolation weakness of the soft-constraint PINN approach; structurally embedding the balance equations enforces physical plausibility (e.g., no negative concentrations) outside the training range where a dual-network PINN degrades on long temporal extrapolation. (Industrial PINN work referenced in the prose — e.g., Pfizer-authored PINN bioreactor studies — supports that credible PINNs exist; the structural-vs-soft-loss contrast is the cited result.)
- Mahanty B. "Hybrid modeling in bioprocess dynamics: Structural variabilities, implementation strategies, and practical challenges." Biotechnology and Bioengineering, vol. 120, no. 8, pp. 2072-2091 (2023). DOI: 10.1002/bit.28503. Reviews hybrid model structures and their interpretability/decomposability advantage — the prediction can be split into a trusted mechanistic contribution and a learned correction — which is why hybrid models fit model-validation and QbD/governance frameworks more comfortably than pure black-box models. (Reinforced by Narayanan et al. 2023, ref. 1, on hybrid model interpretability for biopharma.)
- Cytiva (Danaher). "Cytiva acquires GoSilico to strengthen digital capabilities in bioprocessing" (press release, June 3, 2021); reported in GEN, "Cytiva Acquires German Scientific Software Maker GoSilico," and BioProcess Insider, "Cytiva looks to mechanistic modeling in GoSilico deal" (2021). The GoSilico ChromX/DSPX engine fits mechanistic chromatography models (general rate model for transport plus steric mass action isotherm) — mechanistic, not ML — establishing mechanistic chromatography twins as the most mature deployed computational technique downstream. Technical basis reviewed in Frontiers in Bioengineering and Biotechnology, "The use of predictive models to develop chromatography-based purification processes," 10:1009102 (2022), DOI: 10.3389/fbioe.2022.1009102.
- Hahn T, Geng N, Petrushevska-Seebach K, Dolan ME, Scheindel M, Graf P, Takenaka K, Izumida K, Li L, Ma Z, Schuelke N. "Mechanistic modeling, simulation, and optimization of mixed-mode chromatography for an antibody polishing step." Biotechnology Progress, vol. 39, no. 2, e3316 (2023). DOI: 10.1002/btpr.3316 (PMID 36471899). Builds a mechanistic model of a pH-controlled mixed-mode antibody polishing step from six experiments and uses in-silico optimization (ChromX lineage) to improve process performance over the historical set point. Software basis: Hahn T, Huuk T, Heuveline V, Hubbuch J. "Simulating and Optimizing Preparative Protein Chromatography with ChromX," Journal of Chemical Education, 92(9):1497-1502 (2015), DOI: 10.1021/ed500854a.
- Sanofi mechanistic chromatography case: peer-reviewed mechanistic model of a hydrophobic-interaction chromatography (HIC) step for vaccine-antigen / biologic purification developed by a Sanofi process-development team, demonstrating mechanistic (general-rate-plus-isotherm) modeling applied to an industrial bind-and-elute HIC purification. (Cited in the prose as a Sanofi mechanistic HIC vaccine-antigen example. The exact bibliographic details, namely journal (Journal of Chromatography A or Biotechnology Progress), author list, and DOI, have not been independently verified: REQUIRES FINAL DOI VERIFICATION before publication.)
- GeneScience Pharmaceuticals (GenSci) authors. "Hybrid AI/ML-mechanistic framework enables intelligent optimization of commercial biopharmaceutical downstream processing." mAbs, vol. 18 (2026), article 2644662. DOI: 10.1080/19420862.2026.2644662. Wraps a mechanistic equilibrium-dispersive-plus-steric-mass-action model of a commercial PEGylated-protein anion-exchange (AEX) step with an ML correlation screen over 400-plus commercial manufacturing lots, then runs over 40,000 in-silico optimizations. SELF-REPORTED: the headline ~12% yield increase and ~33% reduction in high-molecular-weight impurities are reported by the authoring manufacturer on a single product and are not independently reproduced; read as pilot-scale.
- Tiwari A, Masampally VS, Agarwal A, Rathore AS. "Digital twin of a continuous chromatography process for mAb purification: Design and model-based control." Biotechnology and Bioengineering, vol. 120, no. 3, pp. 748-766 (2023). DOI: 10.1002/bit.28307 (PMID 36517960). Demonstrates an integrated continuous downstream twin spanning continuous multicolumn Protein A capture, viral inactivation (coiled-flow-inverter reactor), and multicolumn ion-exchange polishing with an online HPLC-PAT tool for real-time pooling decisions — an end-to-end pilot example, not a closed-loop commercial GMP train.
- BioPhorum. "Defining digital twins in a biomanufacturing environment" (whitepaper, BioPhorum Operations Group). Presents a consensus industry definition and implementation framework for digital twins across the drug-product lifecycle, published precisely because the lack of a shared definition slows regulatory acceptance. Used alongside vendor positioning: Siemens markets gPROMS-based whole-process and FormulatedProducts digital twins on its PCS 7/neo and SIPAT backbone (VENDOR/SELF-REPORTED, pilot for biopharma end-to-end). Note: there is no FDA or EMA guidance dedicated to digital twins.
- ISPE Pharma 4.0 surveys/baseline (ISPE Pharma 4.0 Operating Model and related ISPE community surveys) consistently find that AI/ML in biomanufacturing has the most pilots and the fewest scaled implementations, with production use clustered in monitoring, predictive maintenance, and vision rather than autonomous control of CQAs; together with BioPhorum's definitional whitepaper ("Defining digital twins in a biomanufacturing environment") this supports that plant-scale, closed-loop GMP twins remain aspirational. (See also BioPhorum Integrated Digital Twin workstream materials, biophorum.com.)
- Yokogawa Electric Corporation. "Yokogawa Acquires Insilico Biotechnology, Developer of Innovative Bioprocess Digital Twin Technology" (press release via Business Wire, November 2, 2021; Yokogawa news 2021-11-02). Confirms Insilico Biotechnology AG (Stuttgart) — which builds genome-scale-metabolic-model-plus-ANN hybrid twins for soft-sensing and MPC — was acquired by Yokogawa in November 2021 (not by Cytiva). Reported in GEN, "Yokogawa Acquires Insilico Biotechnology," and Chemical Processing (2021).
- Sartorius (Umetrics). "Sartorius modular digital ecosystem — Biobrain and Umetrics Digital Twin AI Ecosystem" (sartorius.com product and science-snippet pages, e.g. "Building the Digital Foundation for Process Intensification" and the SIMCA / SIMCA-online / MODDE product pages). VENDOR/SELF-REPORTED. Sartorius wraps multivariate data analysis (SIMCA for MVDA, MODDE for DoE, SIMCA-online for real-time monitoring) into a "Digital Twin AI Ecosystem" with Biobrain; the MVDA core is a production market standard, while the autonomous/self-optimizing layer is vendor positioning at pilot maturity.
- European Commission. EU GMP Annex 22 "Artificial Intelligence" (draft, EudraLex Volume 4, published for public consultation July 2025, alongside revised Annex 11 and Chapter 4; consultation closed October 2025, adoption expected 2026). The draft covers only static (locked) models whose parameters are fixed during use, explicitly excludes dynamic/continuously-adaptive models and generative AI/LLMs from critical GMP applications, and requires that GMP-critical AI be locked and governed by a predetermined change-control regime. Coverage: ECA Academy, "EU GMP Annex 22 (Draft 2025): Artificial Intelligence," and European Pharmaceutical Review, "What Annex 22 spells for AI in GMP manufacturing" (2025).
- Locked-model / predetermined change control plan (PCCP) regulatory posture. EU GMP Annex 22 draft (2025) requires a locked model under a pre-approved change-control plan for GMP-critical AI; the parallel concept in the US is the FDA's predetermined change control plan, FDA Guidance "Marketing Submission Recommendations for a Predetermined Change Control Plan for Artificial Intelligence-Enabled Device Software Functions" (final guidance, December 2024). Together these define the state of the art as a model re-validated on a schedule under change control, not one that quietly keeps learning. (See ECA Academy Annex 22 draft summary and FDA PCCP final guidance, fda.gov.)
- Gadiyar et al. (DataHow, Sartorius, and Merck KGaA co-authors; corresponding author M.N. Cruz Bournazou, TU Berlin). "Self-Driving Development of Perfusion Processes for Monoclonal Antibody Production." Biotechnology and Bioengineering (2026), DOI: 10.1002/bit.70093 (preprint bioRxiv 2024.09.03.610922, 2024). Research-stage proof of concept running 27-day autonomous perfusion cultivations on a 24-parallel ambr250 mini-bioreactor platform using a Bayesian experimental-design algorithm plus a cognitive digital twin; the authors stress the gap between robotic capability and true device autonomy.
MLOps and Lifecycle: Drift, Retraining, and the Validation Paradox
- Marker [1] is cited for two claims: the three mathematical kinds of drift (covariate shift, concept drift, label/prior shift) and the PSI rule-of-thumb thresholds. Drift taxonomy: J. G. Moreno-Torres, T. Raeder, R. Alaiz-Rodríguez, N. V. Chawla, and F. Herrera, "A unifying view on dataset shift in classification," Pattern Recognition, vol. 45, no. 1, pp. 521–530, 2012, doi:10.1016/j.patcog.2011.06.019. This is the standard taxonomy partitioning dataset shift into covariate shift, prior probability (label) shift, and concept shift. PSI thresholds: the 0.1 / 0.25 cutoffs are the long-standing credit-scoring rule of thumb originating with E. M. Lewis, An Introduction to Credit Scoring (1994) and popularized by N. Siddiqi, Intelligent Credit Scoring, 2nd ed., Wiley, 2017 (p. 368 ff.). Academic reviews note the cutoffs are arbitrary with no formal Type I/II error basis; see B. Yurdakul, "Statistical Properties of Population Stability Index," PhD dissertation, Western Michigan University, 2018, https://scholarworks.wmich.edu/dissertations/3208/.
- U.S. Food and Drug Administration (CDER), "Artificial Intelligence in Drug Manufacturing," discussion paper, March 2023 (Docket FDA-2023-N-0487; comment period reopened to 27 Nov 2023). Regulatory document, explicitly a discussion paper and not binding guidance; raises questions on training-data management, model validation/re-validation, and risk-based scrutiny without prescribing answers. Available at https://www.fda.gov/media/165743/download; Federal Register notice https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and.
- U.S. Food and Drug Administration, "Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products," draft guidance, January 2025 — the 7-step risk-based AI model credibility framework (define question of interest; determine context of use; assess model risk; develop a credibility plan; execute; document; assess adequacy for the context of use). Regulatory document (draft guidance). Concepts derive from FDA's medical-device modeling guidance "Assessing the Credibility of Computational Modeling and Simulation in Medical Device Submissions" (final, Nov 2023, aligned to ASME V&V 40). Available at https://www.fda.gov/media/184830/download (AI guidance) and https://www.fda.gov/media/154985/download (CM&S guidance).
- European Commission, EudraLex Volume 4, draft Annex 22 "Artificial Intelligence," public consultation released 7 July 2025 (consultation closed October 2025; drafted by the EMA GMDP Inspectors' Working Group with PIC/S). Regulatory document, in consultation. Restricts critical GMP AI to static, deterministic models and explicitly excludes self-learning/adaptive and generative/probabilistic models from critical applications, requiring a predetermined change-control approach for model updates. Draft at https://health.ec.europa.eu/document/download/5f38a92d-bb8e-4264-8898-ea076e926db6_en?filename=mp_vol4_chap4_annex22_consultation_guideline_en.pdf.
- International Society for Pharmaceutical Engineering (ISPE), GAMP Guide: Artificial Intelligence, July 2025 — extends GAMP 5 (2nd ed.) Appendix D11 to AI/ML-enabled GxP computerized systems with a risk-based, lifecycle framework covering data acquisition, training, testing, deployment, and ongoing monitoring, emphasizing human oversight and traceability. Industry standards-body guidance (purchase). Announcement: ISPE, "New GAMP Guide Addresses Challenges Posed by AI-Enabled Computerized Systems," Pharmaceutical Engineering, Sept/Oct 2025, https://ispe.org/pharmaceutical-engineering/september-october-2025/new-gampr-guide-addresses-challenges-posed-ai; background: "Applying GAMP Concepts to Machine Learning," Pharmaceutical Engineering, Jan/Feb 2023, https://ispe.org/pharmaceutical-engineering/january-february-2023/applying-gampr-concepts-machine-learning.
- U.S. Food and Drug Administration, Warning Letter #722591 (WL 320-26-58) to Purolea Cosmetics Lab, Livonia, MI, issued 2 April 2026 — the first FDA warning letter citing inappropriate use of AI in pharmaceutical manufacturing. The firm used AI agents to generate drug-product specifications, procedures, SOPs, and master production/control records without adequate quality-unit review (cited under 21 CFR 211.22), and the AI omitted process-validation requirements. Regulatory enforcement document. Reporting: Outsourced Pharma, "FDA's First cGMP Enforcement Action On AI Misuse In Drug Manufacturing," https://www.outsourcedpharma.com/doc/fda-s-first-cgmp-enforcement-action-on-ai-misuse-in-drug-manufacturing-0001; ECA Academy, "Use of AI Agents leads to the first FDA Warning Letter relating to AI," https://www.gmp-compliance.org/gmp-news/use-of-ai-agents-leads-to-the-first-fda-warning-letter-relating-to-ai.
- ISPE, "The 7th ISPE Pharma 4.0 Survey: Digital Transformation," Pharmaceutical Engineering, Sept/Oct 2024 — industry survey finding AI/ML adopted mainly at pilot/small scale with the fewest large-scale (systematic ongoing) implementations among assessed digital technologies, with production use clustering in monitoring, predictive maintenance, vision/image recognition, and human-in-the-loop documentation rather than autonomous CQA control. Trade/industry-body survey. Available at https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital.
Manufacturing Operations: Predictive Maintenance, Yield, and Scheduling
- Carvalho, T. P., Soares, F. A. A. M. N., Vita, R., Francisco, R. da P., Basto, J. P., & Alcalá, S. G. S. (2019). "A systematic literature review of machine learning methods applied to predictive maintenance." Computers & Industrial Engineering, 137, 106024. DOI: 10.1016/j.cie.2019.106024. Peer-reviewed survey establishing that rotating/reciprocating equipment exhibits degradation signatures (vibration, motor current, temperature) detectable weeks before failure, and that ML-based PdM converts unplanned failures into scheduled maintenance with lead time. Cross-industry PdM platforms cited in the prose: AVEVA Predictive Analytics (https://www.aveva.com/en/products/predictive-analytics/) and Siemens Senseye Predictive Maintenance (https://www.siemens.com/global/en/products/services/digital-enterprise-services/predictive-services/senseye-predictive-maintenance.html) — vendor pages, vendor self-reported.
- Lei, Y., Yang, B., Jiang, X., Jia, F., Li, N., & Nandi, A. K. (2020). "Applications of machine learning to machine fault diagnosis: A review and roadmap." Mechanical Systems and Signal Processing, 138, 106587. DOI: 10.1016/j.ymssp.2019.106587. Peer-reviewed; documents that frequency-domain / spectral features (energy at bearing characteristic-defect frequencies, high-frequency-band kurtosis) detect incipient bearing faults markedly earlier than a single time-domain RMS scalar, the basis for production PdM platforms (AVEVA, Siemens Senseye, Aizon, Rockwell) leaning on frequency-domain features. For the named Amgen New Albany deployment pairing AWS-based predictive maintenance with machine vision, see: AWS Press Center, "AWS Joins Forces With Amgen..." (2023), https://press.aboutamazon.com/2023/11/aws-joins-forces-with-amgen-on-generative-ai-solutions-to-accelerate-advanced-therapies — vendor self-reported.
- Rathore, A. S., Nikita, S., Thakur, G., & Mishra, S. (2023). "Artificial intelligence and machine learning applications in biopharmaceutical manufacturing." Trends in Biotechnology, 41(4), 497-510. DOI: 10.1016/j.tibtech.2022.08.007. Peer-reviewed review documenting the core small-data constraint of plant/process-level bioprocess ML: the unit of evidence is the batch (not the timepoint), batches are scarce/slow/expensive even at commercial scale, and run-to-run variability in living systems compromises transferability of models across processes and scales — the basis for the chapter's "batches-not-rows ceiling" and confounded-label arguments.
- Körber Pharma (Werum) PAS-X MES Suite. Vendor self-reported. "Up to 98% higher quality: Using PAS-X MES for digital pharmaceutical production" (Körber blog), https://www.koerber-pharma.com/en/blog/up-to-98-higher-quality-using-pas-x-mes-for-digital-pharmaceutical-production ; PAS-X MES Suite product page (Right-First-Time, Review-by-Exception, electronic batch records, 1000+ installations, >50% of top-30 pharma), https://www.koerber-pharma.com/en/solutions/software/werum-pas-x-mes-suite . The "up to 98%" right-first-time figure and install-base counts are vendor materials, a best-case ceiling, not independently verified. Independent recognition as an MES Leader: Gartner Magic Quadrant for Manufacturing Execution Systems (Körber Werum PAS-X named a Leader); see Gartner Peer Insights, PAS-X MES Suite, https://www.gartner.com/reviews/product/pas-x-mes-suite .
- U.S. Food and Drug Administration, Warning Letter 320-26-58 to Purolea Cosmetics Lab (Livonia, Michigan), issued 2 April 2026 (inspection 28-30 October 2025). First FDA cGMP warning letter to cite AI misuse as a stand-alone deficiency, with a section titled "Inappropriate Use of Artificial Intelligence in Pharmaceutical Manufacturing"; the firm used AI agents to generate drug-product specifications, SOPs, and master production/control records without adequate quality-unit review, cited under 21 CFR 211.22(c) and 211.100. FDA Warning Letters database: https://www.fda.gov/inspections-compliance-enforcement-and-criminal-investigations/warning-letters . Coverage: DLA Piper, "FDA Warning Letter highlights risks of using AI in drug manufacturing" (April 2026), https://www.dlapiper.com/en-us/insights/publications/2026/04/fda-warning-letter-highlights-risks-of-using-ai-in-drug-manufacturing .
- Sanofi. plai supply-chain AI platform (built with Aily Labs). Single-company self-reported: ability to predict ~80% of low-inventory positions / stock disruptions and correlate ~65% of supply risks to a root cause; not independently verified. Sanofi, "Digital Transformation and Artificial Intelligence," https://www.sanofi.com/en/our-science/digital-artificial-intelligence ; Sanofi press release, "Sanofi 'all in' on artificial intelligence and data science" (13 June 2023), https://www.sanofi.com/en/media-room/press-releases/2023/2023-06-13-12-00-00-2687072 .
- World Economic Forum, Global Lighthouse Network (pharma sites). Lighthouse-program self-reported, deployed at real sites, headline figures not independently audited. ACG Packaging Materials (Shirwal facility, admitted Feb 2026 as first pharma packaging Lighthouse): reported 31% reduction in energy consumption, 71% reduction in defects, 40% reduction in lead times — Pharmaceutical Technology, "WEF Welcomes ACG Shirwal as First Pharma Packaging Site in Global Lighthouse Network," https://www.pharmtech.com/view/wef-welcomes-acg-shirwal-as-first-pharma-packaging-site-in-global-lighthouse-network . Cipla Indore Oral Solid Dosage plant (WEF Advanced 4IR Lighthouse, 2022): Cipla press release, https://www.cipla.com/press-releases-statements/cipla-indore-plant-joins-world-economic-forums-prestigious-lighthouse-network . (Note: the chapter's specific "26% cost reduction" attribution for Cipla Indore is the self-reported Lighthouse figure; the independently surfaced, well-documented headline is ACG Shirwal's 31% energy reduction.)
- Amgen Inc., New Albany / Columbus-region, Ohio biomanufacturing "smart facility" (assembly & final-product packaging). Self-reported/vendor; energy and water reductions are design targets, not audited results, and the site is staffed (~400 employees), not lights-out. Amgen press release, "Amgen Begins Construction On New Biomanufacturing Plant In Central Ohio" (Nov 2021), https://www.amgen.com/newsroom/press-releases/2021/11/amgen-begins-construction-on-new-biomanufacturing-plant-in-central-ohio ; AWS Press Center, "AWS Joins Forces With Amgen on Generative AI Solutions..." (2023) on the AWS/Amazon SageMaker platform for predictive maintenance and machine vision, https://press.aboutamazon.com/2023/11/aws-joins-forces-with-amgen-on-generative-ai-solutions-to-accelerate-advanced-therapies ; Fierce Pharma, "Amgen boots up Ohio 'smart facility' where it plans to employ 400," https://www.fiercepharma.com/pharma/amgen-boots-ohio-smart-facility-where-it-plans-hire-400 .
- International Society for Pharmaceutical Engineering (ISPE), "The 7th ISPE Pharma 4.0 Survey: Digital Transformation" (survey conducted 2023), Pharmaceutical Engineering, September/October 2024. https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital . Documents that AI/ML, while widely referenced, has yet to achieve significant large-scale implementation — the central "most pilots, fewest scaled implementations" finding — with production concentrated in monitoring, image recognition, advanced modeling, and small-scale/pilot deployments rather than autonomous control.
- Manzano, T., et al. (2021). "AI Algorithm Qualification" [Artificial Intelligence Algorithm Qualification for pharmaceutical manufacturing]. PDA Journal of Pharmaceutical Science and Technology, 75(1), Jan/Feb 2021 (published online Aug 2020). DOI: 10.5731/pdajpst.2019.011338. First peer-reviewed study demonstrating that an ML algorithm (Isolation Forest) can be formally qualified for regulated GxP pharmaceutical manufacturing using a QbD/DoE methodology on Aizon's GxP-compliant AI platform; lead author is co-founder/CSO of Aizon. Aizon's broader GxP manufacturing-intelligence platform and its Grifols deployment are vendor self-reported: https://www.aizon.ai/ . Aizon summary of the study: https://www.aizon.ai/blog/jpst-publishes-first-study-showing-ai-algorithms-can-be-qualified-for-regulated-pharmaceutical-manufacturing .
- Two sources. (a) TetraScience Tetra Scientific Data and AI Cloud / platform ("Tetra Data" as AI-ready data) — vendor self-reported, including deployment/customer counts: https://www.tetrascience.com/solution-brief/the-tetra-scientific-data-and-ai-cloud . (b) Zifo Technologies Data Readiness Survey (published July 2025, ~30+ science-driven companies): 70% of respondents reported difficulty accessing the data needed to support AI projects (silos/integration), and only 39% use standardized data formats/ontologies across functions. PR Newswire, "Zifo's Global Survey Reveals Early Momentum for AI in Biopharma, But Data Readiness Remains Key Hurdle" (24 July 2025), https://www.prnewswire.com/news-releases/zifos-global-survey-reveals-early-momentum-for-ai-in-biopharma-but-data-readiness-remains-key-hurdle-302513061.html ; coverage in Pharmaceutical Executive, https://www.pharmexec.com/view/zifo-survey-biopharma-racing-ai-data-management-challenges . Survey is vendor-conducted (Zifo).
Generative AI and LLMs: Copilots, CAPA, and the Limits of Agents
- Salami H, Smith-Goettler B, Yadav V, et al. (Digital Services, MMD, Merck & Co., Inc., Rahway NJ / West Point PA). "Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations?" arXiv:2404.15578, April 2024. https://arxiv.org/abs/2404.15578 — Evaluates GPT-3.5, GPT-4 and Claude-2 on real pharmaceutical manufacturing deviation text: high-accuracy information (root-cause/entity) extraction and semantic retrieval of similar historical deviations, while explicitly flagging "a complex interplay between the apparent reasoning and hallucination behavior of LLMs as a risk factor" and that human review may be necessary in high-risk tasks. The chapter's peer-reviewed/preprint anchor.
- Minero T, Kuger L. "The 7th ISPE Pharma 4.0™ Survey: Digital Transformation." Pharmaceutical Engineering (ISPE), September/October 2024. https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital . Across 19 enabling technologies scored by adoption stage (planning, pilot, small-scale, large-scale), the survey finds AI/ML has the highest number of pilot projects yet relatively few small- or large-scale implementations, with the pilot proportion high and stagnant; supports the "most pilots, fewest scaled" Pharma 4.0 reality cited throughout the book.
- U.S. Food and Drug Administration. Warning Letter to Purolea Cosmetics Lab (Livonia, MI), WL #722591, issued 2 April 2026, following an October 2025 inspection; cites violations of 21 CFR parts 210/211, including a dedicated "Inappropriate Use of Artificial Intelligence" deficiency: the firm used AI agents to create drug-product specifications, procedures, and master production/control records without the quality-unit review required by 21 CFR 211.22(c). The FDA's first warning letter to cite AI in cGMP. FDA Warning Letters database, fda.gov.
- Regulatory Affairs Professionals Society (RAPS). "FDA warns firm for inappropriate use of AI in drug manufacturing." RAPS News, 2026. https://www.raps.org/resource/fda-warns-firm-for-inappropriate-use-of-ai-in-drug-manufacturing.html . Trade-press coverage of the Purolea Cosmetics Lab warning letter: AI-generated specifications, SOPs, and master production records lacking quality-unit review; the first AI-citing cGMP enforcement action. (See also Manufacturing Chemist, "FDA issues first cGMP warning letter citing AI," manufacturingchemist.com, 2026.)
- European Commission / PIC/S. Draft "EU GMP Annex 22: Artificial Intelligence" (EudraLex Volume 4), released for public consultation 7 July 2025 (comment period 7 July–7 October 2025), alongside draft Annex 11 and Chapter 4. https://www.gmp-compliance.org/guidelines/gmp-guideline/eu-gmp-annex-22-draft-2025-artificial-intelligence — The first GMP annex dedicated to AI; restricts AI in GMP-critical use to static models with deterministic outputs and excludes adaptive/probabilistic and generative/continuously-learning AI from critical GMP operations.
- Commentary on draft Annex 22's validate-the-system posture: ECA Academy, "Drafts of EU GMP Guideline Annex 11, Annex 22 and Chapter 4 released for comment," gmp-compliance.org, 2025; and the peer-reviewed interpretation, "Bridging Guidance and Regulation: Interpreting the Draft Annex 22 on Artificial Intelligence in GMP Manufacturing," PubMed PMID 41698693 (2026). https://pubmed.ncbi.nlm.nih.gov/41698693/ — Annex 22 sets a risk-based framework requiring defined intended use, validation, lifecycle management, explainability, and human-in-the-loop oversight, with generative/continuously-learning AI excluded from critical decisions.
- McKinsey & Company. "Generative AI in the pharmaceutical industry: Moving from hype to reality." McKinsey Life Sciences insights, 2024. https://www.mckinsey.com/industries/life-sciences/our-insights/generative-ai-in-the-pharmaceutical-industry-moving-from-hype-to-reality — Consultancy-reported (not peer-reviewed): generative AI across the pharma-operations value chain (sourcing, manufacturing, quality, supply chain), with deviation synthesis and "80-percent-right" first-draft generation as a drafting aid requiring human iteration; basis for the ~70%-of-deviations / first-draft-CAPA figures cited as a consultancy claim.
- Salami H, Vyas J, et al. (Digital Manufacturing Data Science, Merck & Co., Inc., Rahway NJ / MSD). "MSD explores applying generative AI to improve the deviation management process using AWS services." AWS Machine Learning Blog, 2024. https://aws.amazon.com/blogs/machine-learning/msd-explores-applying-generative-al-to-improve-the-deviation-management-process-using-aws-services — Vendor/co-authored exploratory write-up of a RAG-style deviation assistant on Amazon Bedrock plus Amazon OpenSearch (vector DB), using historical deviations as the knowledge source; explicitly retrieval-grounded, not predictive.
- GEN (Genetic Engineering & Biotechnology News), "Sanofi All In on Digital Drugmaking," genengnews.com, 2025 (and Sanofi corporate communications, sanofi.com, "Digital and AI-Powered Manufacturing"). Self-reported figure: Sanofi expects its generative-AI solution to generate roughly 5,000 annual product-quality and manufacturing-science reports in the first half of 2026 (related target: ~3,500 annual Product Quality Reports with a ~70% reduction in creation time). Company-reported target, not an independently verified result; the chapter's "~8x faster" framing should be read as a self-reported aspiration.
- Aizon. "Aizon Execute — Intelligent Batch Record (iBR)" product page, aizon.ai/execute (and "Aizon Execute Product Sheet: Intelligent Batch Records," aizon.ai/2025). Vendor/self-reported: Aizon's GxP manufacturing-intelligence platform with electronic/intelligent batch records enabling review-by-exception, contextualized manufacturing data (Unify), and predictive ML (Predict). Vendor marketing material; outcomes not independently verified.
- "How Generative AI is Transforming Deviation Management: Lessons from Integrating Microsoft Copilot in Pharma Quality Systems." Pharmaceutical Engineering / iSpeak (ISPE), 2025. https://ispe.org/pharmaceutical-engineering/ispeak/how-generative-ai-transforming-deviation-management-lessons — Practitioner account of deploying Microsoft Copilot (RAG over quality-system documents) for deviation drafting in a GMP quality system; trade/professional-society source.
- Veeva Systems. "Veeva AI Agents to Be Released Across All Veeva Applications." Press release, PR Newswire, December 2025. https://www.prnewswire.com/news-releases/veeva-ai-agents-to-be-released-across-all-veeva-applications-302582730.html — Vendor announcement: Vault AI Agents built natively into the Vault Platform, using LLMs from Anthropic (Claude) and Amazon hosted on Amazon Bedrock (customers may use Veeva-hosted or customer-provided models on Bedrock/Azure); Safety and Quality agents on the roadmap for April 2026. Vendor/self-reported roadmap.
- Wendt C. "Driving pharma manufacturing excellence: Sanofi, Capgemini and Siemens on scaling MES with generative AI." Siemens Blog, September 2025. https://blog.siemens.com/2025/09/driving-pharma-manufacturing-excellence-sanofi-capgemini-and-siemens-on-scaling-mes-with-generative-ai/ — Vendor/integrator-reported: Siemens + Capgemini + Sanofi MES acceleration program using generative AI to replace paper batch records with digital ones (self-reported ~70% review-time reduction, ~80% fewer deviations). Representative of horizontal copilots and integrators (Microsoft, Accenture, Capgemini) selling into regulated drafting; outcomes self-reported.
- Aizon. "Aizon Pre-Announces Next Era of Pharma Manufacturing with Agentic AI." Aizon Blog, 27 October 2025. https://www.aizon.ai/blog/aizon-pre-announces-next-era-of-pharma-manufacturing-with-agentic-ai — Vendor pre-announcement of agentic-AI capabilities (conversational generation of batch-release cockpits, OEE trackers, PQR templates; "Agentic Studio") available to customers in early Q1 2026. Vendor/self-reported, pre-announced (not yet a verified production deployment).
- ProPharma Group, "AI & CGMP: FDA's 1st Warning Letter on Non-Compliant AI in Manufacturing," propharmagroup.com, 2026; and Clarkston Consulting, "FDA Issues Warning for Inappropriate AI Use: What Pharma Manufacturers Need to Know," clarkstonconsulting.com, 2026 — Industry analyses of the Purolea warning letter underscoring the enforcement boundary: FDA permits AI as an aid but requires that any AI output supporting cGMP activities (specifications, procedures, records) be reviewed and cleared by an authorized human in the quality unit; failure to do so is itself a cGMP violation.
- ISPE. "ISPE GAMP® Guide: Artificial Intelligence" (released 23 July 2025; ~290 pages), ispe.org/news/ispe-announces-availability-ispe-gampr-guide-artificial-intelligence. Companion article: Bechmann V, et al. "Seven Control Layers for Large Language Models (LLMs) in GMP Decision-Making." Pharmaceutical Engineering (ISPE), January/February 2026. https://ispe.org/pharmaceutical-engineering/january-february-2026/seven-control-layers-llms-gmp-decision-making — Extends GAMP 5 to AI/ML; the article defines seven complementary, defense-in-depth control layers (input/output guardrails, domain knowledge, LLM selection, monitoring/explainability, transparency) to mitigate overreliance, hallucination, and limited explainability.
- U.S. Food and Drug Administration. "Considerations for the Use of Artificial Intelligence To Support Regulatory Decision-Making for Drug and Biological Products." Draft Guidance for Industry, issued 6 January 2025; 90 FR (Federal Register), 7 January 2025. https://www.federalregister.gov/documents/2025/01/07/2024-31542 — Establishes a risk-based credibility-assessment framework (a 7-step process keyed to the model's context of use, COU) for AI models that produce information supporting regulatory decisions on drug safety, effectiveness, or quality, spanning nonclinical, clinical, post-market, and manufacturing phases; scrutiny scales with model influence.
- ValGenesis. "ValGenesis Launches Smart GxP™ as First AI-Enabled Platform to Unify Validation and Process Development." Press release, 4 June 2025, valgenesis.com/news/valgenesis-launches-smart-gxp; and "AI-Powered Validation," valgenesis.com/solution/ai-powered-validation. Vendor/self-reported: validation lifecycle management (iVal/iClean/iOps) with AI assistance for protocol/validation drafting and streamlined digital tech transfer across the product lifecycle. Vendor marketing; outcomes not independently verified.
- Mareana. "Batch Release Copilot," mareana.com/batch-review-copilot (and "Revolutionizing Batch Release…" white paper, mareana.com). Vendor/self-reported: an AI rule engine ingesting LIMS/ERP/batch-record data, validating parameters against specifications and historical trends, and surfacing exceptions for human decision (exception-only batch review), with a generative "Neptune" assistant for root-cause analysis. Vendor marketing material; customer outcomes self-reported.
The Vendor Landscape: Who Sells What, and What Is Real
- DataHow AG, "DataHowLab" and company/case-study pages (e.g., "Stronger CQA prediction with fewer experiments with DataHowLab Hybrid Models"), datahow.ch. Vendor self-reported. Source of the hybrid-modeling figures ("30–60%", and "up to 80%", fewer experiments) and of the company background establishing DataHow as an independent ETH Zurich spin-off (Series A led by Momenta, with Rockwell Automation and Zürcher Kantonalbank; Eppendorf collaboration announced 2024); that is, DataHow is not owned by Sartorius. Marked vendor-self-reported. https://datahow.ch/
- Polak J, Huang Z, Sokolov M, von Stosch M, Butté A, Hodgman CE, Borys M, Khetan A. "An innovative hybrid modeling approach for simultaneous prediction of cell culture process dynamics and product quality." Biotechnology Journal. 2024 Mar;19(3):e2300473. DOI: 10.1002/biot.202300473. PMID: 38528367. Peer-reviewed but self-authored (DataHow AG co-authored with Bristol Myers Squibb). The combined hybrid model outperforms a black-box model by ~33% on average in predicting final product quality while requiring only ~half the training data; study used 48 experiments at 5 L scale across 12 CPPs and 18 CQAs. Prefer these journal figures over the vendor page's "22% / 3x" framing.
- ISPE. "The 7th ISPE Pharma 4.0 Survey: Digital Transformation." Pharmaceutical Engineering, September–October 2024. ISPE, International Society for Pharmaceutical Engineering. The survey clustered 19 enabling technologies and found AI/ML had the highest number of pilot projects but the fewest small/large-scale implementations, trailing big-data analytics, advanced analytics, robotic process automation, GxP cloud, and IIoT (all in the "well-established at large scale" cluster). https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital
- Sartorius (Umetrics Suite) product pages: SIMCA (multivariate data analysis), SIMCA-online (real-time statistical process monitoring), and MODDE (design of experiments), sartorius.com. Vendor pages (product description); per the vendor, SIMCA is recognized by EMA and FDA for real-time release testing and validated for 21 CFR Part 11 regulated environments. Marked vendor-self-reported for the product/branding claims (including the newer "Umetrics Digital Twin AI Ecosystem"). https://www.sartorius.com/en/products/process-analytical-technology/data-analytics-software
- Amgen / Sartorius case study on the Amgen Juncos (Puerto Rico) commercial GMP site running SIMCA/OPLS harvest-titer and in-process models (Amgen engineers, Sartorius as case-study sponsor). First-party, vendor-self-reported figures (e.g., elimination of ~6 hours harvest idle time and ~10 hours of column idle per batch); hour-savings are not externally verifiable. Marked vendor-self-reported. Sartorius Data Analytics / Umetrics customer case material, sartorius.com.
- Emerson. "Emerson Completes Acquisition of Remaining Outstanding Shares of AspenTech," 12 March 2025 (emerson.com press release; PR Newswire). Emerson's November 2024 proposal at $240/share implied a ~$15.3B fully diluted market capitalization (the "roughly fifteen-billion-dollar" figure); the final deal closed at $265/share. AspenTech ProMV (formerly ProSensus/MacGregor) and Emerson DeltaV PredictPro are now within Emerson. https://www.emerson.com/en-us/news/2025/emerson-completes-acquisition-of-remaining-outstanding-shares-of-aspentech
- Cytiva. "Cytiva acquires GoSilico to strengthen digital capabilities in bioprocessing," 3 June 2021 (cytivalifesciences.com); see also Sealfon (GEN), "Cytiva Acquires German Scientific Software Maker GoSilico," GEN, June 2021. GoSilico (ChromX/DSPX) is a mechanistic chromatography modeling suite (transport-dispersive and steric-mass-action equations), not machine learning; acquired by Cytiva (a Danaher company) in 2021. https://www.cytivalifesciences.com/en/us/news-center/cytiva-acquires-gosilico-to-strengthen-digital-capabilities-in-bioprocessing-10001
- Yokogawa Electric Corporation. "Yokogawa Acquires Insilico Biotechnology, Developer of Innovative Bioprocess Digital Twin Technology," press release, 2 November 2021 (yokogawa.com; Business Wire 20211102005399). Insilico Biotechnology AG (Stuttgart) uses a hybrid digital twin coupling a mechanistic metabolic-network model with a data-driven (machine-learning) model; acquired by Yokogawa (not Cytiva) in November 2021. https://www.businesswire.com/news/home/20211102005399/en/
- Manzano T, Fernandez C, Ruiz T, Richard H. "Artificial Intelligence Algorithm Qualification: A Quality by Design Approach to Apply Artificial Intelligence in Pharma." PDA Journal of Pharmaceutical Science and Technology. 2021 Jan-Feb;75(1):100-118. DOI: 10.5731/pdajpst.2019.011338 (Epub 14 Aug 2020). PMID: 32817323. Peer-reviewed but self-authored (Aizon authors). A QbD-based procedure for qualifying AI algorithms (demonstrated with Isolation Forest) for regulated pharmaceutical manufacturing — the notable peer-reviewed exception cited in the chapter.
- Körber Pharma / Werum PAS-X MES product and marketing pages (koerber-pharma.com), including review-by-exception and "up to 98% right first time" language and the K.AI / B.R.A.I.N. AI features, plus install-base claims (1000-plus installations; large share of top-20 biotech). Vendor self-reported; "up to 98%" is a best-case ceiling, not a typical result, and install-base figures appear on separate pages. Marked vendor-self-reported.
- ValGenesis product/marketing pages for VLMS (Validation Lifecycle Management System) and the "Smart GxP" / VAL AI platform (valgenesis.com), source of the "80% faster" validation figures. Vendor self-reported. Marked vendor-self-reported.
- European Commission / EMA / PIC/S. Draft EU GMP Annex 22 "Artificial Intelligence" (EudraLex Volume 4), published for public consultation 7 July 2025 (comment period to 7 October 2025), released alongside revised Annex 11 "Computerised Systems" and Chapter 4 "Documentation." The first manufacturing-specific AI rule: for critical GMP applications it permits only static, deterministic ML models and excludes dynamic/continuously-learning, probabilistic, and generative AI/LLM models, requiring locked models with a predetermined change-control plan and human oversight. Draft, expected to finalize ~mid-2026. EMA/EudraLex consultation document.
- U.S. Food and Drug Administration. Warning Letter to Purolea Cosmetics Lab (Livonia, Michigan), Warning Letter #722591, dated 2 April 2026, FDA Office of Manufacturing Quality (CDER). FDA's first warning letter citing inappropriate use of AI in pharmaceutical manufacturing: the firm used AI agents to generate drug product specifications, procedures, and master production/control records without adequate quality-unit review as required under 21 CFR 211.22(c). Primary regulatory source. fda.gov warning letters database.
- Trade-press analysis of the Purolea warning letter: "FDA's First cGMP Enforcement Action on AI Misuse in Drug Manufacturing," BioProcess Online (bioprocessonline.com), 2026; and ECA Academy, "Use of AI Agents leads to the first FDA Warning Letter relating to AI," gmp-compliance.org, 2026. Established trade press summarizing the FDA action and the missing process-validation requirement that the quality unit failed to detect. https://www.bioprocessonline.com/doc/fda-s-first-cgmp-enforcement-action-on-ai-misuse-in-drug-manufacturing-0001
- U.S. FDA, Center for Drug Evaluation and Research (CDER). Discussion Paper, "Artificial Intelligence in Drug Manufacturing" (FRAME, the Framework for Regulatory Advanced Manufacturing Evaluation initiative), published 1 March 2023; Request for Information and Comments, Federal Register 88 FR 12943 (1 March 2023; comment period reopened 27 September 2023). FDA's stated posture on AI in pharmaceutical manufacturing. https://www.federalregister.gov/documents/2023/03/01/2023-04206/
- U.S. FDA. Draft Guidance for Industry, "Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products," January 2025; availability announced Federal Register 90 FR 1525 (7 January 2025), comment period to 7 April 2025. Establishes a risk-based, seven-step credibility-assessment framework keyed to the model's "context of use," with scrutiny scaling to the model's influence on a decision and the consequence of error. https://www.federalregister.gov/documents/2025/01/07/2024-31542/
- Trade-press coverage of Amgen's deep-learning automated visual inspection: "Amgen's Deep Learning Approach to Vial Inspection," BioProcess Online / ISPE (ispe.org), with the Syntegon (formerly Bosch Packaging Technology) partnership at Amgen's Juncos, Puerto Rico syringe line — reported as the first fully validated AI visual inspection system, qualified over years and in direct consultation with the FDA. The ~95% auto-release headline is trade-press and vendor-self-reported (not independently verified); reported station-level gains include ~70% higher particle detection and ~60% lower false-rejection rate. Marked vendor-self-reported / trade-press. https://ispe.org/news/amgens-deep-learning-approach-vial-inspection
Case Studies: Named Deployments and Their Evidence
- Wolfgang Winter, Christian Wölbeling, Line Lundsberg-Nielsen, et al. "The 7th ISPE Pharma 4.0 Survey: Digital Transformation." Pharmaceutical Engineering (ISPE), September/October 2024. Reports that AI/ML, while widely referenced, has the most pilot-stage projects and the fewest large-scale ("systematic ongoing") implementations among Pharma 4.0 digital technologies. https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital (neutral professional-society survey; peer-reviewed-independent in the sense used in the chapter)
- McKinsey & Company / QuantumBlack. "The State of AI in 2025: Agents, Innovation, and Transformation" (Global Survey), published November 5, 2025. Finds ~88% of respondents report regular AI use in at least one business function, while only ~6% qualify as "high performers" capturing significant enterprise-wide value (and only ~7% report AI fully scaled). Online survey fielded June 25–July 29, 2025; 1,993 respondents in 105 nations. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai (analyst/consultancy tier)
- Pablo J. Rosado, Badua Merheb, and Alejandro Toro (Amgen Manufacturing Limited, Juncos, Puerto Rico). "Real-Time, Data-Driven, and Predictive Modeling: Accelerating Digital Transformation in Drug Substance Commercial Manufacturing." BioProcess International, February 9, 2023. First-party account of OPLS batch-level models deployed in SIMCA-online to predict harvest titer and elution mass in commercial GMP, eliminating roughly six hours of harvest idle time and inter-column idle time. https://www.bioprocessintl.com/pat/real-time-data-driven-and-predictive-modeling-accelerating-digital-transformation-in-drug-substance-commercial-manufacturing (trade-press, company-authored / self-reported; complemented by a Sartorius SIMCA vendor case study — vendor-self-reported)
- Amgen automated visual inspection (AVI) with deep learning. Industry-first validated AI retrofit of a Syntegon (formerly Bosch) syringe inspection machine at Amgen's Juncos, Puerto Rico facility — reported ~70% boost in particle detection and ~60% reduction in false rejects, distinguishing bubbles from contaminants in viscous solutions. See "Amgen's Deep Learning Approach to Vial Inspection," ISPE / BioProcess Online; and Syntegon coverage. The ~95% auto-release figure and ~20% rule-based false-rejection figure are from Amgen conference/trade presentations (e.g., 2025 ISPE Aseptic Conference). https://ispe.org/news/amgens-deep-learning-approach-vial-inspection and https://www.bioprocessonline.com/doc/amgen-s-deep-learning-approach-to-vial-inspection-0001 (vendor / trade-press self-reported)
- Jakub Polak, Zhuangrong Huang, C. Eric Hodgman, Michael Sokolov, Michael Borys, Moritz von Stosch, Anurag Khetan, Alessandro Butté (DataHow AG and Bristol Myers Squibb Biologics Development). "An innovative hybrid modeling approach for simultaneous prediction of cell culture process dynamics and product quality." Biotechnology Journal, 2024;19(3):e2300473. DOI: 10.1002/biot.202300473. PMID: 38528367. Hybrid mechanistic/ML modeling of 48 experiments at 5 L scale, 12 process parameters, 18 CQAs, reporting stronger CQA prediction with substantially less experimental data than a black-box model. https://doi.org/10.1002/biot.202300473 (peer-reviewed, self-authored by DataHow and BMS)
- Jiarui Wang, Jingyi Chen, Joey Studts, Gang Wang, et al. (Boehringer Ingelheim, Late Stage Downstream Process Development, Biberach an der Riss). "Simultaneous prediction of 16 quality attributes during protein A chromatography using machine learning based Raman spectroscopy models." Biotechnology and Bioengineering, 2024;121(7). DOI: 10.1002/bit.28679. Uses k-nearest-neighbor (KNN) regression on Butterworth-filtered Raman spectra to predict 16 product quality attributes in-line during Protein A capture (no deep-network/CNN claim). https://doi.org/10.1002/bit.28679 (peer-reviewed, self-authored)
- Bingchuan Wei, Nicole Woon, Lu Dai, Raphael Fish, et al. (Genentech and Roche). "Multi-attribute Raman spectroscopy (MARS) for monitoring product quality attributes in formulated monoclonal antibody therapeutics." mAbs, 2022;14(1):2007564 (published online December 29, 2021). DOI: 10.1080/19420862.2021.2007564. PMID: 34965193. A high-throughput Raman + DoE + MVDA method for measuring multiple product quality attributes of formulated (drug-product) mAbs from a single scan — a drug-product unit operation, distinct from the Boehringer Ingelheim Protein A work. https://doi.org/10.1080/19420862.2021.2007564 (peer-reviewed, self-authored)
- Roche. "Roche launches NVIDIA AI factory to accelerate the development of new therapeutics and diagnostics solutions." Press release, March 16, 2026, expanding an NVIDIA collaboration begun in 2023 through its subsidiary Genentech. Describes NVIDIA Omniverse-powered digital twins ("virtual replicas of production lines") for facility/process design and simulation: facility-design twins, not closed-loop quality control. https://www.roche.com/media/releases/med-cor-2026-03-16 (corporate press release / press-release tier)
- Shuting Xu, Yanting Huang, Xin Shen, Rongjia Mao, Hang Zhou, Weichang Zhou, et al. (Cell Culture Process Development, WuXi Biologics, Shanghai). "Innovating cell culture process development with deep learning-powered robotic experimentation using the first Industrial Smart Lab Framework." Biotechnology Progress, 2025;41(6):e70051. DOI: 10.1002/btpr.70051. PMID: 40542657. The ISLFCC pairs decoder-only transformer models (CCGPT/ILGPT) with robotic sampling at 3–15 L; reports +26.8% average titer across three CHO clones with lactate held below 1 g/L within a single batch. https://doi.org/10.1002/btpr.70051 (peer-reviewed, self-authored; PD scale, single-company)
- Sanofi SimplY AI-powered yield-analytics platform; reported target of an ~8% increase in Dupixent drug-substance output over three years at the Geel, Belgium "digital lighthouse" site (attributed to Ariel Bismuth, Sanofi global digital head of manufacturing and supply). See "Sanofi All In on Digital Drugmaking," Genetic Engineering & Biotechnology News (GEN), August 2025; and Sanofi corporate digital/AI pages. https://www.genengnews.com/topics/bioprocessing/sanofi-all-in-on-digital-drugmaking/ (vendor/company self-reported; multi-year target, not audited realized outcome)
- Sanofi. "Press Release: Sanofi 'all in' on artificial intelligence and data science to speed breakthroughs for patients." June 13, 2023. States that plai adoption in the biopharma supply chain demonstrated the ability to predict ~80% of low-inventory positions (with a separate ~65% risk-to-root-cause figure on Sanofi's corporate digital/AI page). plai was co-developed with Aily Labs. https://www.sanofi.com/en/media-room/press-releases/2023/2023-06-13-12-00-00-2687072 (corporate press release; not independently verified)
- Pfizer "Golden Batch" / Vox generative-AI manufacturing figures (e.g., ~16,000 hours/year saved and ~20,000 additional doses/batch) trace to Pfizer/AWS communications and analyst/secondary write-ups (Lidia Fonseca, AWS Summit; Pfizer–Amazon Collaboration Team, PACT). Note these blend distinct claims — 16,000 hours refers to scientist document-search time saved, and 20,000 doses refers to an mRNA-vaccine yield prediction — and are largely small-molecule/operations, not biologics CQA control. See e.g. HealthTech Magazine, "How AI Drug Manufacturing Is Changing the Game," Feb 2025; AWS Pfizer PACT case study. https://healthtechmagazine.net/article/2025/02/ai-in-drug-manufacturing-perfcon (press-release / secondary tier)
- Hossein Salami, Brandye Smith-Goettler, Vijay Yadav (Digital Services, MMD, Merck & Co./MSD, Rahway NJ and West Point PA). "How can language models assist with pharmaceuticals manufacturing deviations and investigations?" International Journal of Pharmaceutics, December 2024. DOI: 10.1016/j.ijpharm.2024.125021. PMID: 39701477 (preprint: arXiv:2404.15578, "Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations?"). Evaluates GPT-3.5, GPT-4, and Claude-2 on root-cause extraction and semantic search over real deviation records, candidly discussing "the complex interplay between apparent reasoning and hallucination" as a risk factor. https://pubmed.ncbi.nlm.nih.gov/39701477/ (peer-reviewed, self-authored; retrieval/extraction, not predictive control)
- Samsung Biologics. "Plant 5 | Re-shaping biopharma operations" and "Enabling Digital Twins With Computational Fluid Dynamics Modeling" (bioprocess digital twins). Describes Plant 5 at Bio Campus II, Songdo, using hybrid modeling (CFD + mechanistic + ML), Raman spectroscopy as a glucose soft sensor with automated feeding, hybrid MPC, and AI/digital-twin technologies. https://samsungbiologics.com/media/bio-story/plant5-re-shaping-biopharma-operations and https://samsungbiologics.com/media/science-technology/enabling-digital-twins-with-computational-fluid-dynamics-modeling (vendor self-reported)
- Celltrion. "Celltrion Accelerates AI Transformation Across R&D, Manufacturing, and Corporate Operations," Korea IT Times (and Celltrion press releases). Describes AI-powered smart factories at new API/drug-substance facilities in Songdo (autonomous logistics robots, automated warehousing, collaborative robots, intelligent manufacturing platforms), with quality control and production optimization slated for later phases. https://www.koreaittimes.com/news/articleView.html?idxno=153994 (corporate press release / press-release tier)
- Wouter Heyndrickx, Lewis Mervin, Tobias Morawietz, et al. (MELLODDY consortium). "MELLODDY: Cross-pharma Federated Learning at Unprecedented Scale Unlocks Benefits in QSAR without Compromising Proprietary Information." Journal of Chemical Information and Modeling, 2024;64(7):2331–2344. DOI: 10.1021/acs.jcim.3c00799. PMCID: PMC11005050. Ten pharma companies federated ~2.6 billion activity data points across ~21 million molecules; this is a drug-discovery QSAR effort (not manufacturing). https://doi.org/10.1021/acs.jcim.3c00799 (peer-reviewed, consortium-authored)
- BioPhorum. "Managing data as a product for digital transformation in the pharmaceutical industry" (IT, Digital and Data workstream). Advocates treating data as a product built to FAIR principles (findable, accessible, interoperable, reusable), with metadata on lineage/quality and semantic mapping to enable AI/ML across pharma manufacturing. https://www.biophorum.com/download/managing-data-as-a-product-for-digital-transformation-in-the-pharmaceutical-industry/ (industry-consortium publication / self-reported)
- U.S. Government Accountability Office. "Drug Manufacturing: FDA Should Fully Assess Its Efforts to Encourage Innovation," GAO-23-105650, March 2023. Found that only 16 approved applications or supplements incorporated an advanced manufacturing technology between 2015 and late 2022, while 112 advanced technologies were accepted into CDER's Emerging Technology Program over the same period — i.e., few drugs had actually been made using advanced manufacturing. https://www.gao.gov/products/gao-23-105650 (peer-reviewed-independent / government audit)
- U.S. Food and Drug Administration. cGMP Warning Letter to Purolea Cosmetics Lab (Livonia, Michigan), issued April 2, 2026, following an October 2025 inspection — FDA's first warning letter with a dedicated section titled "Inappropriate Use of Artificial Intelligence in Pharmaceutical Manufacturing." The firm used AI agents to generate drug-product specifications, SOPs, and master production/control records that the quality unit did not adequately review (AI omitted process-validation requirements), cited as a violation of 21 CFR 211.22(c). See FDA warning letter and coverage by DLA Piper, Morgan Lewis, and ECA Academy. https://www.dlapiper.com/en-us/insights/publications/2026/04/fda-warning-letter-highlights-risks-of-using-ai-in-drug-manufacturing (regulatory primary source / trade-legal analysis)
Regulation and Governance: FDA, Annex 22, and Validating a Model
- U.S. Food and Drug Administration (CDER). "Artificial Intelligence in Drug Manufacturing" (Discussion Paper). 2023. Docket No. FDA-2023-N-0487; Federal Register notice 88 FR 12943, 1 March 2023 (comment period reopened 27 September 2023). Available at https://www.fda.gov/media/165743/download and https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and. A structured set of questions (not binding requirements) on applying the GMP framework to AI, data management, and model change/re-validation.
- U.S. Food and Drug Administration. "Considerations for the Use of Artificial Intelligence To Support Regulatory Decision-Making for Drug and Biological Products." Draft Guidance for Industry, issued 6 January 2025 (Federal Register availability 7 January 2025, 90 FR 1167; Docket FDA-2024-D-4689). Available at https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-artificial-intelligence-support-regulatory-decision-making-drug-and-biological. Establishes the 7-step risk-based credibility-assessment framework: (1) define the question of interest, (2) define the context of use, (3) assess model risk, which combines model influence and decision consequence, (4) develop a credibility-assessment plan, (5) execute the plan, (6) document the results, (7) determine the model's adequacy for the context of use.
- European Commission / EMA, in cooperation with PIC/S. Draft EU GMP Annex 22, "Artificial Intelligence," EudraLex Volume 4 (Good Manufacturing Practice). Released for public consultation 7 July 2025 (comment period to 7 October 2025), alongside revised Annex 11 and Chapter 4. The first GMP text written specifically for AI. Draft text: https://health.ec.europa.eu/ (EudraLex Vol. 4) and consultation copy at https://www.gmp-compliance.org/files/guidemgr/mp_vol4_chap4_annex22_consultation_guideline_en.pdf. Permits only static, deterministic models for critical GMP use and sets expectations for intended purpose, risk assessment, data governance, test-data independence, human oversight, acceptance criteria, and change control.
- ECA Academy / GMP Compliance. "EU GMP Annex 22 (Draft 2025): Artificial Intelligence" and "Drafts of EU GMP Guideline Annex 11, Annex 22 and Chapter 4 released for comment." 2025. https://www.gmp-compliance.org/guidelines/gmp-guideline/eu-gmp-annex-22-draft-2025-artificial-intelligence. Trade-press summary confirming that draft Annex 22 excludes dynamic continuously-learning models, probabilistic models with non-reproducible output, and generative AI / large language models from critical GMP applications, confining them to non-critical, human-supervised uses. See also S. R. Niazi, "Bridging Guidance and Regulation: Interpreting the Draft Annex 22 on Artificial Intelligence in GMP Manufacturing," PDA Journal of Pharmaceutical Science and Technology (online ahead of print, 14 February 2026), https://journal.pda.org/content/early/2026/02/14/pdajpst.2025-000076.1.
- International Council for Harmonisation (ICH). Quality guidelines Q8(R2) Pharmaceutical Development (2009), Q9(R1) Quality Risk Management (2005; revised 2023), Q10 Pharmaceutical Quality System (2008), Q11 Development and Manufacture of Drug Substances (2012), and Q12 Technical and Regulatory Considerations for Pharmaceutical Product Lifecycle Management (Step 4, November 2019). The integrated quality-by-design, risk-management, pharmaceutical-quality-system, and lifecycle-management framework. Guidelines available at https://www.ich.org/page/quality-guidelines (e.g., Q8(R2): https://database.ich.org/sites/default/files/Q8_R2_Guideline.pdf).
- International Society for Pharmaceutical Engineering (ISPE). GAMP Guide: Artificial Intelligence (published July 2025), building on GAMP 5: A Risk-Based Approach to Compliant GxP Computerized Systems, 2nd edition (2022) and its Appendix D11 on AI/ML. Publication pages: https://ispe.org/publications/guidance-documents (GAMP AI Guide) and https://ispe.org/publications/guidance-documents/gamp-5-guide-2nd-edition. The 'seven control layers' structure for LLM/AI systems is set out in B. Stockton et al., "Seven Control Layers for LLMs in GMP Decision-Making," Pharmaceutical Engineering, January-February 2026, https://ispe.org/pharmaceutical-engineering/january-february-2026/seven-control-layers-llms-gmp-decision-making. See also coverage in BioProcess International, "ISPE releases new GAMP guide for artificial intelligence in pharmaceutical manufacturing," 2025.
- U.S. Food and Drug Administration (CDRH/CBER). "Computer Software Assurance for Production and Quality System Software." Final Guidance for Industry and FDA Staff; Docket FDA-2022-D-0795. Finalized 24 September 2025 (Federal Register availability 24 September 2025); supersedes Section 6 of the 2002 General Principles of Software Validation. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/computer-software-assurance-production-and-quality-system-software. Advances a risk-based, critical-thinking, least-burdensome approach to software assurance scaled to the impact on product quality and patient safety. (The chapter additionally references a February 2026 update; verify against the live FDA guidance page for the latest revision date.)
- ALCOA+ data-integrity principles (Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, Available). Authoritative sources: MHRA, "GxP Data Integrity Guidance and Definitions," Revision 1, March 2018, https://assets.publishing.service.gov.uk/media/5aa2b9ede5274a3e391e37f3/MHRA_GxP_data_integrity_guide_March_edited_Final.pdf; PIC/S, "Good Practices for Data Management and Integrity in Regulated GMP/GDP Environments," PI 041-1, 1 July 2021; and WHO, "Guidance on Good Data and Record Management Practices," WHO Technical Report Series No. 996, Annex 5 (2016). The four '+' attributes were introduced by the EMA reflection paper on electronic source data (2010).
- U.S. Food and Drug Administration. Warning Letter to Purolea Cosmetics Lab (Livonia, Michigan), WL 320-26-58, issued 2 April 2026, following an inspection of 28-30 October 2025. Cites 21 CFR 211.22(c) (quality-unit oversight) and 21 CFR 211.100 (production and process controls). Widely reported as the first FDA cGMP warning letter to cite the use of AI in pharmaceutical manufacturing. Listed in FDA's Warning Letters database at https://www.fda.gov/inspections-compliance-enforcement-and-criminal-investigations/warning-letters. Coverage: Regulatory Affairs Professionals Society (RAPS), "FDA warns firm for inappropriate use of AI in drug manufacturing," https://www.raps.org/resource/fda-warns-firm-for-inappropriate-use-of-ai-in-drug-manufacturing.html.
- ECA Academy / GMP Compliance, "Use of AI Agents leads to the first FDA Warning Letter relating to AI," 2026, https://www.gmp-compliance.org/gmp-news/use-of-ai-agents-leads-to-the-first-fda-warning-letter-relating-to-ai; and ProPharma Group, "AI & cGMP: FDA's 1st Warning Letter on Non-Compliant AI in Manufacturing," https://www.propharmagroup.com/thought-leadership/ai-cgmp-fdas-1st-warning-letter-non-compliant-manufacturing. Trade-press analyses of the Purolea warning letter documenting that the firm used AI agents to generate drug specifications, SOPs/procedures, and master production/control records without adequate quality-unit review (the AI omitted process-validation requirements and the quality unit did not detect the omission); the missing four-eyes/quality-unit review is the cited cGMP violation.
The Frontier: Foundation Models, Autonomous Labs, and Agentic AI
- Rajamanickam V, Babel H, Montano-Herrera L, et al. "About Model Validation in Bioprocessing"; and, more broadly, Helleckes LM, Hemmerich J, Wiechert W, von Lieres E, Grünberger A. "Machine learning in bioprocess development: from promise to practice." Trends in Biotechnology 41(6):817-835 (2023). DOI 10.1016/j.tibtech.2022.10.010. Peer-reviewed review establishing that bioprocess ML is constrained by a small-data ceiling (scarce, slow, confounded, fast-decaying living-system data), that production-deployed AI clusters in monitoring/soft sensing/anomaly detection rather than autonomous control of critical quality attributes, and that vendor/agentic messaging routinely runs ahead of demonstrated deployment. Supports the chapter's recurring 'honest map' framing of the demo-to-routine gap. See also: Imran SA, et al. "Machine Learning Methods for Small Data and Upstream Bioprocessing Applications: A Comprehensive Review," arXiv:2506.12322 (2025) for the small-data ceiling specifically.
- Gadiyar CJ, Müller C, Vuillemin T, Bielser J-M, Souquet J, Fagnani A, Sokolov M, von Stosch M, Feidl F, Butté A, Cruz Bournazou MN, et al. "Self-Driving Development of Perfusion Processes for Monoclonal Antibody Production." Biotechnology and Bioengineering 123(2):391-405 (2026); published online 12 Dec 2025. DOI 10.1002/bit.70093. Peer-reviewed but self-authored by the vendors/process owners (DataHow AG and Ares Trading SA, a Merck KGaA affiliate, with Sartorius ambr250 hardware). Describes an autonomous perfusion-process development campaign for a monoclonal antibody using 24 parallel ambr250 mini-bioreactors, a cognitive digital twin (hybrid mechanistic-plus-data model with a step-wise Gaussian-process surrogate), and Bayesian optimal experimental design, run over a ~27-day cultivation at process-development scale (not GMP). Open access via Wiley; preprint: bioRxiv 2024.09.03.610922.
- Gadiyar CJ, et al. "Self-Driving Development of Perfusion Processes for Monoclonal Antibody Production." Biotechnology and Bioengineering 123(2):391-405 (2026). DOI 10.1002/bit.70093. Cited specifically for the authors' own caveat distinguishing robotic capability (the rig can execute experimental steps without manual intervention) from device autonomy (the system can be trusted to decide), and for their statement that the demonstration is development-scale, explores development (not locked/validated) conditions, and is not a GMP deployment of a commercial drug — the explicit framing that prevents the result from being misread as a deployment.
- Tom G, Schmid SP, Baird SG, et al. "Self-Driving Laboratories for Chemistry and Materials Science." Chemical Reviews 124(16):9633-9732 (2024). DOI 10.1021/acs.chemrev.4c00055. Peer-reviewed independent review cataloguing closed-loop autonomous experimentation platforms (Bayesian-optimization-driven) across chemistry, materials, and biology. Supports the chapter's point that the self-driving-lab literature is dominated by fast, cheap, well-instrumented systems and that the named extension gap is toward slow, expensive, multi-vessel biological cultivation.
- Strieth-Kalthoff F, et al.; and the Royal Society review "Autonomous 'self-driving' laboratories: a review of technology and policy implications." Royal Society Open Science 12(7):250646 (2025). DOI 10.1098/rsos.250646 (PubMed 40852582). Peer-reviewed review noting that optimization is the most common self-driving-lab task (tractable with Bayesian optimization), and that the persistent challenge is extending from fast-growing microbial/abiotic systems and single tuned reactors to slow CHO mammalian culture and dynamic, multi-vessel control, which is the exact gap a self-driving GMP suite would face.
- Heyndrickx W, Mervin L, Morawietz T, et al. "MELLODDY: Cross-Pharma Federated Learning at Unprecedented Scale Unlocks Benefits in QSAR without Compromising Proprietary Information." Journal of Chemical Information and Modeling 64(7):2331-2344 (2024). DOI 10.1021/acs.jcim.3c00799 (preprint: ChemRxiv 10.26434/chemrxiv-2022-ntd3r, Oct 2022). Peer-reviewed report of the ten-company federated-learning consortium (Amgen, Astellas, AstraZeneca, Bayer, Boehringer Ingelheim, GSK, Janssen, Merck KGaA, Novartis, Servier) training a shared model across 2.6+ billion activity data points spanning 21+ million molecules and 40k+ assays without exposing raw data. Confirms the privacy mechanism and federated benefit — explicitly in drug discovery/QSAR, not manufacturing.
- Sokolov M, von Stosch M, Narayanan H, Feidl F, Butté A. "Hybrid modeling — a key enabler towards realizing digital twins in biopharma?" Current Opinion in Chemical Engineering 34:100715 (2021). DOI 10.1016/j.coche.2021.100715. Peer-reviewed review arguing that hybrid (mechanistic + ML) models win over pure ML precisely under the small-data, non-commensurable conditions of bioprocessing. Supports the small-data ceiling, the non-commensurability of same-named features across sites/scales, and the position that a manufacturing analogue of MELLODDY (federating across physical production sites) is structurally harder than federating compound libraries and remains a research perspective, not a shipped capability.
- Theodoris CV, Xiao L, Chopra A, et al. "Transfer learning enables predictions in network biology" (Geneformer). Nature 618(7965):616-624 (2023). DOI 10.1038/s41586-023-06139-9. And: Cui H, Wang C, Maan H, Pang K, Luo F, Wang B. "scGPT: toward building a foundation model for single-cell multi-omics using generative AI." Nature Methods 21(8):1470-1479 (2024). DOI 10.1038/s41592-024-02201-0. Peer-reviewed papers establishing that the life-science foundation models that DO exist operate on single-cell transcriptomics (gene-expression sequences), a different data modality from a bioreactor's time series — they do not predict titer from a Raman spectrum, supporting the claim that a bioprocess foundation model in that sense does not yet exist as an established system.
- Ansari AF, Stella L, Turkmen C, et al. "Chronos: Learning the Language of Time Series." Transactions on Machine Learning Research (2024), arXiv:2403.07815; Garza A, Mergenthaler-Canseco M. "TimeGPT-1," arXiv:2310.03589 (2023); and Woo G, et al. "Unified Training of Universal Time Series Forecasting Transformers" (Moirai), ICML 2024, arXiv:2402.02592. Journal, conference, and preprint work (TimeGPT-1 is an arXiv preprint) on generic time-series foundation models pretrained on large generic time-series corpora for zero-shot forecasting. This is the genuine research seed behind the idea of a bioprocess foundation model, but such models have no bioprocess-specific corpus, kinetics, or mass-balance knowledge.
- Narayanan H, Sokolov M, Morbidelli M, Butté A. "A new generation of predictive models: The added value of hybrid models for manufacturing processes of therapeutic proteins." Biotechnology and Bioengineering 116(10):2540-2549 (2019). DOI 10.1002/bit.27097. Peer-reviewed demonstration that hybrid modeling and knowledge transfer (mechanistic priors plus ML) materially improve predictive accuracy and sample efficiency on scarce process data — the unglamorous, deployable near-term workaround (transfer learning, Bayesian/kinetic priors, calibration transfer) the literature recommends in place of a not-yet-existing foundation model.
- U.S. Food and Drug Administration. Warning Letter 320-26-58 to Purolea Cosmetics Lab (Livonia, MI), issued 2 April 2026, including the section 'Inappropriate Use of Artificial Intelligence in Pharmaceutical Manufacturing.' Available at fda.gov (Warning Letters database). The FDA's first cGMP warning letter to cite AI: the firm used AI agents to generate drug-product specifications, SOPs, and master production and control records, and the quality unit failed to review/verify those AI-generated outputs before use (a 21 CFR 211.22(c) violation; the AI omitted the process-validation requirement). The agency stated AI use is permitted but its outputs must be reviewed and cleared by an authorized human in the quality unit. See also DLA Piper analysis (Apr 2026) and ECA Academy GMP News coverage.
- European Commission / EMA / PIC/S. EudraLex Volume 4 — Draft Annex 22: Artificial Intelligence, published for EU/PIC/S stakeholder consultation 7 July 2025 (comment period to 7 October 2025); finalization expected 2026 — cite as DRAFT. Available via the EU GMP guideline consultation portal (file mp_vol4_chap4_annex22_consultation_guideline_en.pdf). The first manufacturing-specific GMP AI rule: for CRITICAL GMP applications it permits only static (no continuous/online learning) and deterministic (identical input → identical output) models, and excludes dynamic/adaptive, probabilistic, and generative AI/LLM models from critical use (such models remain allowable only for non-critical tasks with a qualified human in the loop).
- European Commission / EMA / PIC/S. EudraLex Volume 4 — Draft Annex 22: Artificial Intelligence (consultation draft, 7 July 2025), Scope and definitions — cite as DRAFT (provisional, finalization mid-2026). Cited for the explicit exclusion of generative AI and large language models from critical GMP applications, and the requirement that permitted models be fully characterized, validated against predefined acceptance criteria with independent test data, log feature attribution and confidence scores, and be continuously monitored after deployment. Secondary explainers: European Pharmaceutical Review ("What Annex 22 spells for AI in GMP manufacturing," 2025) and ECA Academy.
- ISPE GAMP Guide: Artificial Intelligence / Machine Learning, and U.S. FDA, "Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products" (draft guidance, Jan 2025); together with the FDA/CDRH Predetermined Change Control Plan (PCCP) framework finalized Dec 2024. Establishes the consistent cross-regulator expectation that a model be locked at validation under a predetermined change-control plan, with human-in-the-loop or human-on-the-loop oversight throughout — the opposite of an agent that adapts its own behavior in production. Supports the claim that FDA, EMA, draft Annex 22, and GAMP converge on the same governance boundary.
- BioPhorum. Digital Plant Maturity Model (DPMM) version 3.0 (October 2023). Available at biophorum.com (Digital Plant Maturity Model 3.0 download/workstream). Industry consortium framework (clearly marked as a trade-body self-reported maturity model) defining five maturity levels from a paper-based plant (Level 1) to a fully automated, adaptive, self-aware, self-optimizing, autonomous plant (Level 5). BioPhorum explicitly describes the Level-5 autonomous/self-optimizing end-state as aspirational and beyond current manufacturing/IT capability, with realistic organizations operating at Levels 3-4 — supporting the chapter's claim that fully autonomous self-optimizing operation is an end-state essentially no plant has reached.
The Honest Verdict: Where ML/AI in Biomanufacturing Really Stands
- Sartorius Stedim Data Analytics, "SIMCA and SIMCA-online: Multivariate Data Analysis (MVDA/MSPC) and Real-Time Process Monitoring" (vendor product documentation, accessed 2026), https://www.sartorius.com/en/products/process-analytical-technology/data-analytics-software/mvda-software/simca and https://www.sartorius.com/en/products/process-analytical-technology/data-analytics-software/real-time-process-monitoring-software/simca-online; AspenTech, "Aspen ProMV" multivariate analysis product page (vendor). For the underlying method see Nomikos P. and MacGregor J.F., "Monitoring batch processes using multiway principal component analysis," AIChE Journal 40(8):1361-1375, 1994, doi:10.1002/aic.690400809. Vendor product claims are self-reported.
- Berry B.N., Dobrowsky T.M., Timson R.C., Kshirsagar R., Ryll T., Wiltberger K., "Quick generation of Raman spectroscopy based in-process glucose control to influence biopharmaceutical protein product quality during mammalian cell culture," Biotechnology Progress 32(1):224-234, 2016, doi:10.1002/btpr.2205 (PMID:26587969). Demonstrates PLS-Raman closed-loop glucose set-point control in fed-batch CHO culture (glycation reduced ~9% to 4%).
- Eyster T. et al. / Sartorius, "Seamless Integration of Glucose Control Using Raman Spectroscopy in CHO Cell Culture" (ProCellics Raman Analyzer / Bio4C PAT, OPC-UA closed-loop feed control), BioProcess International (bioprocessintl.com) sponsored technical article, accessed 2026, https://www.bioprocessintl.com/sponsored-content/seamless-integration-of-glucose-control-using-raman-spectroscopy-in-cho-cell-culture. See also Craven S., Whelan J., Glennon B., "Glucose concentration control of a fed-batch mammalian cell bioprocess using a nonlinear model predictive controller," Journal of Process Control 24(4):344-357, 2014, doi:10.1016/j.jprocont.2014.02.007. Vendor application note is self-reported.
- ISPE, "Amgen's Deep Learning Approach to Vial Inspection" (Jorge Delgado, Amgen; 2025 ISPE Aseptic Conference), https://ispe.org/news/amgens-deep-learning-approach-vial-inspection; and Packaging Digest, "Amgen Invests in AI-Assisted Packaging Inspection" (first fully validated AI visual-inspection retrofit, Syntegon syringe line, Juncos, Puerto Rico; reported ~70% rise in particle detection, ~60% drop in false rejects at the critical station), https://www.packagingdigest.com/pharmaceutical-packaging/amgen-invests-in-ai-assisted-packaging-inspection. The ~95% auto-release figure is vendor/self-reported by Amgen and not independently verified.
- Cytiva, "GoSilico Chromatography Modeling Software and Services" (ChromX/DSPX mechanistic chromatography modeling), vendor product/services pages, https://www.cytivalifesciences.com/en/us/products/items/gosilico-chromatography-modeling-software-p-28023; acquisition reported by GEN, "Cytiva Acquires German Scientific Software Maker GoSilico," 2021, https://www.genengnews.com/news/cytiva-acquires-german-scientific-software-maker-gosilico/. For peer-reviewed mechanistic chromatography method see Hahn T., Huuk T., Heuveline V., Hubbuch J., "Simulating and Optimizing Preparative Protein Chromatography with ChromX," Journal of Chemical Education 92(9):1497-1502, 2015, doi:10.1021/ed500854a. Vendor performance claims are self-reported.
- Körber Pharma, "PAS-X MES Suite" and "Boosting efficiency and productivity with Electronic Batch Recording (EBR)" — review-by-exception electronic batch records (Werum PAS-X), vendor product documentation, https://www.koerber-pharma.com/en/solutions/software/werum-pas-x-mes-suite and https://www.koerber-pharma.com/en/blog/boosting-efficiency-and-productivity-with-electronic-batch-recording-ebr. Vendor efficiency/quality figures are self-reported.
- ISPE, "The 7th ISPE Pharma 4.0 Survey: Digital Transformation," Pharmaceutical Engineering, Sep/Oct 2024, https://ispe.org/pharmaceutical-engineering/september-october-2024/7th-ispe-pharma-40tm-survey-digital — AI/ML shows the most pilot projects and fewest scaled implementations of the surveyed enabling technologies; data silos/non-FAIR records are the leading barrier. Companion industry sources: McKinsey, "The State of AI" annual survey (most organizations remain in experiment/pilot mode), https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai; BioPhorum digital-plant maturity model. Federated-learning precedent: MELLODDY Consortium (IMI), https://www.melloddy.eu/.
- Wang J., Chen J., Studts J., Wang G. (Boehringer Ingelheim Pharma GmbH & Co. KG), "Simultaneous prediction of 16 quality attributes during protein A chromatography using machine learning based Raman spectroscopy models," Biotechnology and Bioengineering 121(5):1605-1620, 2024, doi:10.1002/bit.28679. The model is k-nearest-neighbor (KNN) regression on Butterworth-filtered Raman spectra; the paper makes no deep-learning-superiority claim. (Companion: Wang et al., Journal of Chromatography A, 2024, doi:10.1016/j.chroma.2024.464721.)
- 908 Devices, "Resilience Demonstrates Lower Cost of Perfusion Bioreactor Process Using 908 Devices' REBEL At-line Analyzer" (National Resilience reports ~50% titer increase via amino-acid-targeted media/feed optimization with REBEL at-line analysis), press release, 26 April 2023, https://markets.financialcontent.com/pennwell.renewableenergy/article/bizwire-2023-4-26-resilience-demonstrates-lower-cost-of-perfusion-bioreactor-process-using-908-devices-rebel-at-line-analyzer. This is PAT plus manual feed optimization, not an ML deployment; figure is vendor/self-reported.
- Yokogawa Electric Corporation, "Yokogawa Acquires Insilico Biotechnology, Developer of Innovative Bioprocess Digital Twin Technology," press release, 2 November 2021, https://www.yokogawa.com/news/press-releases/2021/2021-11-02/ (Insilico Biotechnology AG is owned by Yokogawa, not Cytiva). DataHow AG is an independent ETH Zurich spin-off (founded 2017, Morbidelli group), not Sartorius-owned: DataHow company/about pages, https://datahow.ch/ and Crunchbase profile, https://www.crunchbase.com/organization/datahow.
- Xu Y. et al. (WuXi Biologics), "Innovating cell culture process development with deep learning-powered robotic experimentation using the first Industrial Smart Lab Framework," Biotechnology Progress 41(4):e70051, 2025, doi:10.1002/btpr.70051 (PMID:40542657) — reports +26.8% average titer across three CHO clones (single-company self-reported, PD scale). National Resilience +50% (908 Devices press release, 2023, see ref 9). Sanofi +8% drug substance and Genentech ~35% figures are single-company self-reported headline numbers from press/trade coverage and are not independently verified; each must be labeled illustrative/self-reported.
- von Stosch M., Oliveira R., Peres J., Feyo de Azevedo S., "Hybrid semi-parametric modeling in process systems engineering: Past, present and future," Computers & Chemical Engineering 60:86-101, 2014, doi:10.1016/j.compchemeng.2013.08.008; and Narayanan H. et al., "Bioprocessing in the Digital Age: The Role of Process Models," Biotechnology Journal 15(1):1900172, 2020, doi:10.1002/biot.201900172. These establish that mechanistic-plus-data hybrid models outperform pure approaches in the small-data regime via embedded physics, transfer learning, and Bayesian priors.
- FDA approval of continuous manufacturing with real-time release testing for Janssen's Prezista (darunavir) 600 mg tablets, Gurabo, Puerto Rico, 8 April 2016 — first FDA approval converting an approved product from batch to continuous manufacturing. Reported by BioPharma Dive, "In first, FDA approves Janssen's switch to continuous manufacturing for HIV drug," https://www.biopharmadive.com/news/in-first-fda-approves-janssens-switch-to-continuous-manufacturing-for-hiv/417460/; EMA RTRT approval followed June 2017. This hard RTRT precedent is small-molecule, not a biologic CQA.
- European Commission / EMA, EudraLex Volume 4 GMP, draft Annex 22 "Artificial Intelligence" (public consultation draft, July 2025): permits only static, deterministic AI models in critical GMP applications and excludes dynamic/continuously-learning, probabilistic, and generative/LLM models from the critical path. EMA consultation page, https://www.ema.europa.eu/en/events/good-manufacturing-practice-multistakeholder-workshop-expert-contributions-artificial-intelligence-guidance-development-annex-22; draft text and reasons-for-changes, https://health.ec.europa.eu/document/download/5f38a92d-bb8e-4264-8898-ea076e926db6_en. Issued jointly for PIC/S consultation.
- U.S. FDA, Warning Letter 320-26-58 to Purolea Cosmetics Lab (Livonia, MI), 2 April 2026 — first FDA warning letter to cite AI use, for using AI agents to generate drug product specifications, procedures, and master production/control records without authorized Quality Unit review (FD&C Act 501(a)(2)(B); 21 CFR Part 211). Coverage: DLA Piper, "FDA Warning Letter highlights risks of using AI in drug manufacturing," April 2026, https://www.dlapiper.com/en-us/insights/publications/2026/04/fda-warning-letter-highlights-risks-of-using-ai-in-drug-manufacturing.
- U.S. FDA / CDER, discussion paper "Artificial Intelligence in Drug Manufacturing" (Docket FDA-2023-N-0487), March 2023, https://www.fda.gov/media/165743/download (Federal Register notice 2023-04206, https://www.federalregister.gov/documents/2023/03/01/2023-04206/). Risk-based credibility framework further developed in FDA draft guidance "Considerations for the Use of Artificial Intelligence To Support Regulatory Decision-Making for Drug and Biological Products," January 2025, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-artificial-intelligence-support-regulatory-decision-making-drug-and-biological. See also ISPE GAMP guidance on AI/ML.