The Standards Bodies: Who Actually Builds Biopharma's Shared Vocabulary
📍 Where we are: Part VII · Ontologies in Industry Today — Chapter 24. Part VI assembled, governed, and FAIR-measured the whole model on one antibody batch. Now we leave the bench of our own example and ask the empirical question the earlier chapters quietly assumed away.
For twenty-three chapters we leaned on a comforting phrase: the shared standard. We anchored bp:DS-001 to an IDMP substance identifier, aligned our classes to BFO, borrowed mid-level scaffolding from IOF Core, and validated with SHACL. Each time, we spoke as if some authoritative vocabulary simply existed, waiting to be imported. It is time to be honest about who writes those vocabularies — because the answer reshapes how a real plant should plan its semantics.
The short version is that almost none of it comes from a software vendor. In biopharma, the engines of shared meaning are pre-competitive consortia: alliances where rival companies pool effort on the parts that confer no competitive advantage — the words, the file formats, the identifiers — precisely so they can compete on the science instead. Each consortium owns a slice of the process. None owns the whole. The result is a patchwork that is converging, but slowly.
Imagine a dozen rival restaurants that all need the same thing: an agreed list of ingredient names, so a supplier's "caster sugar" means the same thing in every kitchen. No single restaurant should own that list — the others would never trust it — so they form a club to maintain it together. Biopharma has not one such club but a handful, each curating a different aisle of the pantry: one for lab instruments, one for the drug substance, one for the recipe, one for the shipping label. The food is fine. The trouble is that the aisles were stocked by different clubs that do not always agree on where the shelves end.
What this chapter covers
This chapter is a map of the consortia, not of the vocabularies themselves; those get their own chapter next. We hold one distinction firm throughout: a true formal ontology is an OWL/RDF artifact a reasoner can run over, whereas a structured information model is an XML or object schema that organizes data without formal logical semantics. The two are easy to conflate and must not be. We tour the Allotrope Foundation and the Pistoia Alliance as the leading formal-ontology consortia; MESA International, the OPC Foundation, and PROFIBUS & PROFINET International as the structured-model bodies whose work builds around ISA-88 and ISA-95; ISPE and BioPhorum as the maturity-framework authors; GS1 for identification; and the OAGi/NIIMBL collaboration for emerging biomanufacturing ontologies. Every adoption claim carries a maturity tag in bold parentheses, because a published spec and a deployed one are years apart.
Figure: Each pre-competitive consortium owns one aisle of biopharma's shared vocabulary, color-coded by how mature its work really is, and no single body owns the whole.
Original diagram by the authors, created with AI assistance.
The formal-ontology consortia
Two consortia do the genuine OWL-and-BFO work this book has assumed.
The Allotrope Foundation is the most industrially adopted formal-ontology effort in the lab-data space (production). It was formed on 4 June 2012 as an independent legal entity, spun off from the IQ Consortium (the International Consortium for Innovation and Quality in Pharmaceutical Development) [1]. Allotrope publishes a three-part framework: the AFO (Allotrope Foundation Ontology), a BFO-aligned controlled vocabulary for analytical and lab data; the ADM (Allotrope Data Models); and the ADF (Allotrope Data Format), an HDF5-based binary data container whose v1.0 shipped in October 2015 [2]. Its weight comes from its members: nine Foundation Members (Amgen, BASF, Bayer, Boehringer Ingelheim, Dow, Genentech, GSK, Johnson & Johnson, and Merck & Co.), with instrument and software vendors such as Agilent, Bruker, SCIEX, Shimadzu, Benchling, and BIOVIA seated in a separate Partner Network tier [1]. When an earlier chapter spoke of importing a controlled term for an SEC result like our 98.611 % monomer, AFO is the kind of vocabulary it had in mind.
The Pistoia Alliance is the central vehicle for pre-competitive pharma ontologies more broadly (piloted to production, by component). It was incorporated in 2008 by representatives of AstraZeneca, GSK, Novartis, and Pfizer who had met at a conference in Pistoia, Italy; it now reports more than 200 members [4]. Its portfolio is a useful index of what the industry is trying to standardize, even where the work is unfinished:
| Project | What it standardizes | Maturity |
|---|---|---|
| IDMP Ontology (IDMP-O) | Substance and product identification; v1.0 released 24 January 2024, open-source, co-developed by 11 pharma companies [3] | (production, by component) |
| Unified Data Model (UDM) | Open chemical-reaction exchange; v6.0 January 2021, MIT license | (production) |
| Methods Hub | Machine-readable analytical-method transfer | (piloted) |
| Pharma General Ontology (PGO) | A shared upper-level reference targeting the "FAIR silos" problem — data FAIR within a company but not across companies | (proposed) |
| CMC Process Ontology | An ISA-88-aligned process vocabulary; v1.0 announced for around mid-2026 [5] | (proposed) |
IDMP-O is exactly what sits behind our bp:DS-001 in a real plant: the formal ontology a manufacturer uses to express the regulated substance identity. PGO is worth flagging for its candor — it exists because FAIR within one company has proven insufficient across companies, the precise gap that Part VI's FAIR scoring would expose the moment our model met a partner's.
The structured-model bodies
The next tier produces information models, not formal ontologies — and the distinction earns its keep here, because you cannot run a reasoner over an XML schema the way you can over OWL.
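To make the distinction concrete, here is a minimal sketch in Python using only the standard library. The `ex:` class names and the XML element names are illustrative assumptions, not terms taken from AFO, IOF, or B2MML. The toy `superclasses` function stands in for a real OWL reasoner only in spirit: it computes the transitive closure of subclass axioms and nothing more.

```python
import xml.etree.ElementTree as ET

# --- Formal-ontology side: axioms held as triples (hypothetical terms) ---
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    ("ex:SizeExclusionChromatography", SUBCLASS_OF, "ex:Chromatography"),
    ("ex:Chromatography", SUBCLASS_OF, "ex:AnalyticalMethod"),
    ("ex:run-17", TYPE, "ex:SizeExclusionChromatography"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via subClassOf (transitive)."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def inferred_types(individual, triples):
    """Asserted types plus everything their subclass axioms entail."""
    asserted = {o for s, p, o in triples if s == individual and p == TYPE}
    entailed = set(asserted)
    for cls in asserted:
        entailed |= superclasses(cls, triples)
    return entailed


# --- Structured-model side: an XML record (hypothetical element names) ---
xml_record = """
<TestResult>
  <Method>SizeExclusionChromatography</Method>
  <Value unit="percent">98.611</Value>
</TestResult>
"""
record = ET.fromstring(xml_record)
method_label = record.findtext("Method")

types_of_run = inferred_types("ex:run-17", triples)
print(sorted(types_of_run))  # includes classes nobody asserted directly
print(method_label)          # only the string that was written down
```

The triples let a program conclude that `ex:run-17` is an analytical method even though no one asserted it. The XML record yields only the label stored in it; any class hierarchy would have to live in application code outside the schema.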
The backbone is the ISA standards family: ISA-88 for batch recipe structure and ISA-95 (also published as IEC/ISO 62264) for connecting the manufacturing execution system to the enterprise, with object models for equipment, material, and personnel (production) [6]. These standards describe their objects in prose and tables; turning them into something machines exchange falls to implementations.
MESA International supplies that implementation as B2MML (Business to Manufacturing Markup Language), a royalty-free, W3C-XSD rendering of the ISA-95 family. Its latest is B2MML Version 7 (V0700), announced 24 November 2020, aligning with the 2018 ISA-95 editions and adding a first B2MML-JSON specification [6]. B2MML is widely described as the de facto MES exchange layer in pharma — but that reputation rests largely on system-integrator assertion rather than published adoption data, and should be read as such (production, vendor/integrator-reported).
Where B2MML carries transactional data, the OPC Foundation adds real-time semantics through the OPC UA for ISA-95 Common Object Model (OPC 10030), Release 1.00 dated 6 November 2013 — a companion specification that lets live equipment speak the same object vocabulary B2MML moves between systems (production) [7].
Newer, and aimed more squarely at modular plants, is the Module Type Package (MTP), standardized as VDI/VDE/NAMUR 2658 and built on OPC UA plus AutomationML to enable "plug-and-produce" modular automation. Its governance moved to PROFIBUS & PROFINET International (PI) in November 2021, jointly with NAMUR and ZVEI; MTP V2.0.0 was released in the fall of 2024, and the work is being internationalized as IEC 63280 (piloted) [8]. Pharma and biotech are repeatedly cited as the leading-interest sector for MTP — a reasonable signal of intent, not yet of broad production use.
The maturity-framework bodies
A third kind of consortium standardizes not vocabulary but how far along you are in using it. These are frameworks, and their maturity tag is (production, as a framework) — meaning the framework itself is real and used, not that any given plant has reached its top rung.
ISPE publishes the Pharma 4.0 work: Baseline Guide Vol 8 (Pharma 4.0, 1st edition), released December 2023, which extends ICH Q10 and incorporates 35 use cases [9]. It contains a Maturity Model and Self-Assessment built on the acatech Industrie 4.0 Maturity Index — a six-stage model, so the common "five-level" shorthand is imprecise — across four operating-model dimensions: Resources; Organization and Processes; Information Systems; and Culture [9].
BioPhorum maintains the Digital Plant Maturity Model (DPMM), a five-level model running from paper-based operations to a self-optimizing autonomous plant. DPMM V3 was published in October 2023, adding split QC and QMS dimensions and a Process Development dimension, and renaming a security dimension "Cybersecurity" [10]. A caution on a figure you will see quoted: the frequently cited claim that roughly 80% of members use the DPMM traces to the earlier, pre-V3 tool — treat it as legacy, self-reported evidence, not a current measurement (production, as a framework) [10].
Identification, and the newest entrant
For identification, the converged backbone is GS1: the GTIN, serial number, and GLN encoded in a 2D DataMatrix, with EPCIS for event exchange — the standards that now underpin pharmaceutical serialization (production) [11]. US DSCSA manufacturer enforcement began 27 May 2025; the EU's Falsified Medicines Directive has been live since 9 February 2019. One nuance matters for accuracy: DSCSA mandates interoperable electronic traceability but does not name GS1 as the sole permissible format — the industry chose GS1 EPCIS to meet the requirement [11].
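As a small illustration of what the GS1 layer carries, the sketch below computes the GS1 mod-10 check digit of a GTIN-14 and splits a label string into its Application Identifiers (01 for GTIN, 10 for lot, 17 for expiry, 21 for serial). The label value, lot, and serial number are made-up sample data, and the function names are hypothetical. The sketch parses the bracketed human-readable form; a scanned DataMatrix separates variable-length fields with control characters rather than parentheses, which this code does not handle.

```python
import re


def gtin_check_digit(data_digits: str) -> int:
    """GS1 mod-10 check digit for the digits that precede it."""
    if not re.fullmatch(r"[0-9]+", data_digits):
        raise ValueError("data_digits must contain ASCII digits only")
    total = 0
    # Weights alternate 3, 1, 3, ... starting from the rightmost data digit.
    for i, ch in enumerate(reversed(data_digits)):
        total += int(ch) * (3 if i % 2 == 0 else 1)
    return (10 - total % 10) % 10


def is_valid_gtin14(gtin: str) -> bool:
    """True when gtin is 14 digits and its last digit matches the check digit."""
    if not re.fullmatch(r"[0-9]{14}", gtin):
        return False
    return gtin_check_digit(gtin[:-1]) == int(gtin[-1])


# Field names on the right are this sketch's own labels, not GS1 terms.
AI_NAMES = {"01": "gtin", "10": "lot", "17": "expiry_yymmdd", "21": "serial"}


def parse_element_string(text: str) -> dict:
    """Parse the bracketed human-readable form, e.g. (01)...(21)..."""
    fields = {}
    for ai, value in re.findall(r"\((\d{2})\)([^()]+)", text):
        fields[AI_NAMES.get(ai, ai)] = value
    return fields


# Sample label with made-up lot and serial values.
label = "(01)09501101530003(17)261231(10)LOT42(21)SN0001"
parsed = parse_element_string(label)
print(parsed)
print(is_valid_gtin14(parsed["gtin"]))
```

The check digit catches single-digit keying errors in the GTIN itself; it says nothing about whether the serial number or lot is genuine, which is what the EPCIS event trail is for.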
The newest entrant points back toward formal ontologies. In June 2024, NIIMBL contributed the ontologies from its Big Data Program to the Open Applications Group (OAGi) to jointly develop open-source biopharma manufacturing ontologies, aligned to BFO and IOF Core (proposed) [12]. That effort has a published counterpart: the IOF biopharma domain ontology itself. Audited against its February 2026 release (Release_202602), it defines 171 Released classes — 44 unit operations, 17 QbD parameters, plus equipment, materials, and recipe terms — the most complete formal vocabulary of the bioprocess interior that exists today, and the one this book's running example now binds its process steps to. This is the work most directly continuous with the model this book has built — and yet, published and Released as it is, it still wants the one thing this part keeps circling back to: a plant in production that actually depends on it. The vocabulary has arrived faster than the adoption, which is the clearest marker of how young cross-industry biomanufacturing semantics still are.
The unsolved part: a published spec is not an adopted one
Pre-competitive consensus is slow and partial, and the map above shows why. A different body owns each aisle (Allotrope the lab, Pistoia the substance, MESA and PI the recipe and the plant floor, GS1 the package), and their slices overlap in some places and leave gaps in others. Worse, the dates tell a recurring story: a specification is typically published years ahead of broad adoption, so the existence of a standard says little about whether the plant down the road actually speaks it. Where this chapter could find no published deployment data, it reports the gap as "not found in public evidence" rather than treating it as evidence of absence. The consortia have done the genuinely hard part, producing governed, shared vocabulary, but no consortium can mandate its use. Getting an entire industry to converge on what already exists is the unfinished, and largely ungovernable, work.
Why it matters
Every term the earlier chapters anchored to "a shared standard" is, in reality, some consortium's deliberate and governed labor. Knowing who owns which slice is not trivia — it is the difference between a plant importing a vocabulary and reinventing one. It tells you where to reuse AFO instead of writing your own analytical terms, where IDMP-O already encodes the substance identity behind bp:DS-001, and — just as importantly — where, honestly, there is not yet a standard to import and you are building on open ground. The through-line of this book is turning records into knowledge that can be reasoned over under pressure; that knowledge is only as shareable as the vocabularies these bodies maintain.
In the real world
Sorted by what is actually deployed rather than merely published, the map tightens, and it is worth gathering the adoption realities in one place. The Allotrope Foundation Ontology is the most industrially adopted formal-ontology effort in this space, carried by nine Foundation Members and a vendor Partner Network [1]. GS1 and EPCIS are unambiguously in production and now legally load-bearing: US DSCSA manufacturer enforcement began 27 May 2025 and the EU Falsified Medicines Directive has been live since 9 February 2019. This is the serialization layer that the bp:DP-001 lot's GTIN and DataMatrix ride on [11]. B2MML is widely described as the de facto MES exchange layer, but that description rests on system-integrator assertion rather than published adoption data; it is real, and worth reading with that caveat [6]. The frameworks are genuine while their top rungs stay aspirational: BioPhorum's frequently quoted "roughly 80% of members use the DPMM" traces to the pre-V3 tool, so it is legacy self-reported evidence, not a current measurement [10], and the repeated citing of pharma and biotech as MTP's leading-interest sector is a signal of intent, not of broad production use [8]. And the work most continuous with this book, the OAGi/NIIMBL IOF biopharma ontologies that the running example's bp:BATCH-2026-001 and its process steps now bind to, is published and Released but still wants a plant in production that depends on it [12]. The consistent shape: vocabulary arrives years ahead of adoption, and a published standard says little about whether the plant down the road actually speaks it.
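The sorting described here can be sketched as a small data structure. The entries and their tags are copied from this chapter's maturity tags; the numeric ranking, the `Entry` class, and the rule that caveated entries sort after uncaveated ones are assumptions of this sketch, not part of any consortium's scheme.

```python
from dataclasses import dataclass

# Rank order is an assumption of this sketch: deployed first, intent last.
MATURITY_RANK = {"production": 0, "piloted": 1, "proposed": 2}


@dataclass(frozen=True)
class Entry:
    body: str
    artifact: str
    maturity: str
    caveat: str = ""


ENTRIES = [
    Entry("Pistoia Alliance", "CMC Process Ontology", "proposed"),
    Entry("Allotrope Foundation", "AFO", "production"),
    Entry("PI / NAMUR / ZVEI", "MTP", "piloted"),
    Entry("MESA International", "B2MML", "production", "vendor/integrator-reported"),
    Entry("GS1", "GTIN / EPCIS", "production"),
    Entry("OAGi / NIIMBL", "biopharma manufacturing ontologies", "proposed"),
    Entry("Pistoia Alliance", "Methods Hub", "piloted"),
    Entry("BioPhorum", "DPMM", "production", "as a framework"),
]


def by_deployment(entries):
    """Stable sort: most deployed first, caveated entries after clean ones."""
    return sorted(entries, key=lambda e: (MATURITY_RANK[e.maturity], bool(e.caveat)))


ranked = by_deployment(ENTRIES)
for e in ranked:
    tag = e.maturity + (f", {e.caveat}" if e.caveat else "")
    print(f"{e.artifact:<36} {e.body:<22} ({tag})")
```

Keeping the caveat as its own field, rather than folding it into the tag, lets a plant filter on "production without qualification" when deciding what it can import today.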
Key terms
- Pre-competitive consortium — an alliance of rival companies that pools effort on shared, non-differentiating assets (vocabularies, formats, identifiers) so members can compete elsewhere.
- Formal ontology vs. structured information model — the former is an OWL/RDF artifact a reasoner can operate over; the latter, such as an XML schema, organizes data without formal logical semantics.
- Allotrope Foundation — the consortium behind AFO, ADM, and the ADF data container; the most industrially adopted formal-ontology effort for lab data (production).
- Pistoia Alliance — the central pre-competitive pharma consortium; home of IDMP-O, UDM, Methods Hub, PGO, and the CMC Process Ontology.
- B2MML — MESA International's royalty-free XML and JSON implementation of the ISA-95 information model; widely asserted, on integrator evidence, as the de facto pharma MES exchange layer.
- OPC 10030 — the OPC UA companion specification mapping the ISA-95 common object model to real-time equipment data.
- Module Type Package (MTP) — VDI/VDE/NAMUR 2658, an OPC UA and AutomationML package for plug-and-produce modular automation, now governed by PI and internationalized as IEC 63280.
- Digital Plant Maturity Model (DPMM) — BioPhorum's five-level framework rating a plant from paper-based to self-optimizing autonomous operation.
- GS1 and EPCIS — the identification and event-exchange standards (GTIN, GLN, DataMatrix, EPCIS) the industry chose to satisfy the DSCSA and FMD serialization mandates.
Where this leads
We have mapped the builders; next we open their products. The following chapter, The Vocabularies in Use: From AFO to IDMP, descends from the consortia to the artifacts themselves — what AFO actually says about an analytical result, how IDMP-O structures a substance, and how these vocabularies would attach to the very batch, pool, and drug product we have modeled throughout this book.