
Maintenance: Publication, the Assembled Thread, and FAIR

📍 Where we are: Part VII · Maintenance, Publication & FAIR — the last phase of the lifecycle. Specification wrote the brief, Reuse borrowed the spine, Conceptualization and Formalization and Implementation built and instantiated the model, Validation gated it. Now the LOT (Linked Open Terms) methodology takes over the back end: publish the finished vocabulary as one reusable artifact, and measure — not assume — whether it is FAIR.

For one CHO monoclonal-antibody campaign we have laid every edge: derivedFrom from the cold-chain vial at a pharmacy shelf back through the serialized unit, the drug product, the drug substance, the polishing and capture pools, the bioreactor batch, the seed train, and the cell-bank tiers to WCB-CHO-001; affectsQuality from a feed-rate parameter to the monomer-purity attribute it shapes; release verdicts at the gate; PROV provenance reconciling two source systems that disagreed about a batch. Each was a modeling exercise. Assembled and published, they are one navigable vocabulary in which a recall can be scoped by query instead of by guesswork — and the publication phase asks the question that justifies the whole effort: is it actually useful to the world? FAIR is the answer's yardstick, and the honest discipline of this chapter is to turn FAIR from a slogan into a measurement, and to admit where the measurement comes back worse than the brochure.

The simple version

A library can own every book and still be useless if nothing is catalogued, the doors lock at random hours, and half the books are in unlabeled languages. "We have the books" is not "you can find, get, combine, and reuse them." For a mAb maker the books are batch records, cell-bank certificates, chromatograms, and cold-chain logs: owning all of them is not the same as being able to walk a failing lot back to its cell bank, or scope a recall to exactly the lots that share that bank, in one query. FAIR is the promise that the catalogue actually works — and the only way to know is to test it, not to assume the right shelving delivered it. This chapter publishes the finished CHO-campaign ontology as one vocabulary and tests it against FAIR honestly, admitting the dimension where it scores worst.

Start from the questions

Two competency questions from the specification earn this chapter, both in the provenance group, because publication is where the model's claims about where a batch's facts came from and what regulated identity its substance carries are finally exercised against the published graph. CQ-15 asks whether two source-system records — the MES that registered BATCH-2026-001 and the genealogy loader that named its bioreactor — reconcile to one curated decision without an owl:sameAs over-merge that would fuse the batch material into the vessel. CQ-16 asks whether the ISO IDMP regulatory substance identity attaches to the same DS-001 node the release gate validated, not a copy, so the regulator's view of the drug substance and the manufacturer's are one entity. Both run green over the published graph, and both are tests of FAIR-in-fact: provenance is what makes a batch record Reusable by a stranger investigating it years later, and a single shared identity is what makes the released lot Interoperable with a regulatory submission.

The whole vocabulary, assembled into one published artifact

Publication's first act is to step back and look at what the previous chapters actually built. Each one added its classes and relations to a single namespace, bp:. The result is not many vocabularies but one — a complete, loadable ontology spanning the whole process, from the day-one target to the cold-chain vial at the patient's bedside. The finished bp: vocabulary in bioproc.ttl is 206 classes, 88 object properties, and 46 datatype properties (plus one annotation property for the deprecation pattern) — every class grounded on the BFO upper spine and aligned up to BFO, IOF, the OBO biology ontologies, AFO, QUDT, RO, and PROV in align.ttl. The chapters introduced the load-bearing terms; the full leaf-level set — every CQA, every assay, every product variant — lives in the file. Collapsed to its seven top categories, the whole thing fits on one screen:

# bioproc.ttl — the assembled vocabulary, by BFO category (representative members)
bp:Material # cells, banks, cultures, batch, pools, substance, product, vials, packages
bp:Equipment # bioreactors, columns, resins, filters, instruments, loggers, sites
bp:Quality # purity, aggregate, charge variant, HCP, turbidity, integrity, criticality
bp:RealizableEntity # roles; dispositions (developability, resin binding, LRV); functions
bp:InformationArtifact # concept, sequence, recipe, spec, method, result, certificate, FAIR, IDMP
bp:ProcessParameter # the CPP types: feed rate, temperature, pH, dissolved oxygen
bp:Process # discovery, transfection, culture runs, unit operations, assays, EPCIS events

# and one transitive relation threads every material in the namespace together:
bp:derivedFrom a owl:TransitiveProperty . # WCB-CHO-001 -> ... -> DS-001 -> { DP-001 , DP-002 }

This is not a picture of an ontology — it is the ontology this book wrote for one CHO mAb. The single bp: namespace, published as one artifact under one version and one license, is the LOT deliverable: a reusable vocabulary, not twenty disconnected modelets. Its value is not that it exists but that publishing it makes the hard manufacturing questions into one-line queries.

What the published thread answers: lineage, impact, and the cross-lifecycle walk

Publication only matters because the assembled bp: graph turns three questions a quality unit used to answer with weeks of cross-referencing into single statements. The first is the lineage walk: where did this lot come from? Because derivedFrom is transitive and every edge is the trace of a real material transformation, a one-or-more-hop SPARQL property path reconstructs a drug substance's full ancestry at any depth, the same query validate.py runs:

# queries/CQ-01.rq — everything DS-001 derives from, to any depth, in one property path.
# (bp:derivedFrom)+ = one-or-more hops; works whether the lineage is 3 steps or 20.
PREFIX bp: <https://example.org/bioproc#>
SELECT ?ancestor ?type WHERE {
  bp:DS-001 (bp:derivedFrom)+ ?ancestor .
  ?ancestor a ?type .
  FILTER(STRSTARTS(STR(?type), STR(bp:)))
} ORDER BY ?ancestor

Over the loaded graph it returns the full eleven-ancestor ancestry — every real unit-operation intermediate a coarse CSV chain would collapse, from the polishing pools back to the cell-bank tiers:

[3] lineage walk from DS-001: 11 ancestors
BATCH-2026-001 CLAR-001 MCB-CHO-001 PApool-001 POLpool-001 RCB-CHO-001
SEED-001 SEEDFLASK-001 VFpool-001 VIpool-001 WCB-CHO-001

This is why cell banks need genealogy at all: the drug substance traces back through the polishing and viral-clearance pools (POLpool-001, VFpool-001, VIpool-001), the capture pool and clarified harvest (PApool-001, CLAR-001), the bioreactor batch, the seed train (SEED-001, SEEDFLASK-001), and the cell-bank tiers (WCB-CHO-001, MCB-CHO-001, RCB-CHO-001) — and the long-range reachability is inferred by the transitive closure, never asserted by hand.
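To make the "inferred, never asserted by hand" point concrete, here is a minimal Python sketch of the same one-or-more-hop walk over a plain dictionary of direct edges. The identifiers are the ones in the lineage output above, but the exact step order between them is an assumption made for illustration, as is the helper name `ancestors`; the book's own walk runs as SPARQL over the reasoned graph.

```python
# Direct derivedFrom edges only (child -> parents). The identifiers come from
# the lineage output; the step order between them is ASSUMED for illustration.
DERIVED_FROM = {
    "DS-001": ["VFpool-001"],
    "VFpool-001": ["POLpool-001"],
    "POLpool-001": ["VIpool-001"],
    "VIpool-001": ["PApool-001"],
    "PApool-001": ["CLAR-001"],
    "CLAR-001": ["BATCH-2026-001"],
    "BATCH-2026-001": ["SEED-001"],
    "SEED-001": ["SEEDFLASK-001"],
    "SEEDFLASK-001": ["WCB-CHO-001"],
    "WCB-CHO-001": ["MCB-CHO-001"],
    "MCB-CHO-001": ["RCB-CHO-001"],
}


def ancestors(node, edges):
    """Return every node reachable from `node` in one or more derivedFrom hops."""
    seen = set()
    stack = list(edges.get(node, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against accidental cycles
        seen.add(current)
        stack.extend(edges.get(current, []))
    return seen


lineage = sorted(ancestors("DS-001", DERIVED_FROM))
print(len(lineage), "ancestors:", lineage)
```

Only eleven single-hop edges are written down; every longer-range link (DS-001 to WCB-CHO-001, for instance) falls out of the traversal, which is the role the transitive closure plays in the graph.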

The second question is the inverse, and it is where the thread pays for itself in dollars and patient safety: when a lot fails, what shares its fate? When DP-004 goes out of spec on HMW aggregate, an investigation walks up its lineage to a shared ancestor and back down to every drug product that traces to it:

# queries/CQ-04.rq — when DP-004 fails, what shares its fate? Walk UP DP-004's
# lineage to common ancestors, then back DOWN to every drug product that shares
# one — i.e. every sibling lot tracing to the same cell bank. Scopes a recall by
# query instead of quarantining the whole campaign.
PREFIX bp: <https://example.org/bioproc#>
SELECT DISTINCT ?affected WHERE {
  bp:DP-004 (bp:derivedFrom)+ ?shared .    # an ancestor of the failed lot
  ?affected (bp:derivedFrom)+ ?shared .    # anything else derived from it
  ?affected a bp:DrugProduct .
  FILTER(?affected != bp:DP-004)
} ORDER BY ?affected

Because every batch in the campaign traces to the same WCB-CHO-001, the query returns both siblings — the forward fork bp:DS-001 bp:fillsInto bp:DP-001 , bp:DP-002 is exactly what makes a shared-fate impact set complete by construction rather than by luck, the difference between a scoped deviation and a blind campaign-wide quarantine:

impact of DP-004 (shared cell bank): ['DP-001', 'DP-002']
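The up-then-down logic of the impact query can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the book's implementation: the edge list is collapsed to a handful of hops, `BATCH-2026-004` is a hypothetical identifier for the failed lot's batch, and the function names are invented here.

```python
# Collapsed, illustrative derivedFrom edges (child -> parents).
# "BATCH-2026-004" is a HYPOTHETICAL id for the failed lot's batch.
DERIVED_FROM = {
    "DP-001": ["DS-001"],
    "DP-002": ["DS-001"],
    "DP-004": ["DS-004"],
    "DS-001": ["BATCH-2026-001"],
    "DS-004": ["BATCH-2026-004"],
    "BATCH-2026-001": ["WCB-CHO-001"],
    "BATCH-2026-004": ["WCB-CHO-001"],
}
DRUG_PRODUCTS = {"DP-001", "DP-002", "DP-004"}


def ancestors(node, edges):
    """Every node reachable in one or more derivedFrom hops."""
    seen, stack = set(), list(edges.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(edges.get(current, []))
    return seen


def shared_fate(failed, edges, drug_products):
    """Other drug products that share at least one ancestor with `failed`."""
    failed_ancestors = ancestors(failed, edges)
    return sorted(
        lot
        for lot in drug_products
        if lot != failed and ancestors(lot, edges) & failed_ancestors
    )


affected = shared_fate("DP-004", DERIVED_FROM, DRUG_PRODUCTS)
print("impact of DP-004:", affected)
```

In this toy graph the only ancestor the three lots have in common is the working cell bank, so the scope is decided entirely by that one shared node.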

The third question is the one only this book's modeling makes possible, because it crosses the seams a fragmented plant keeps in three dialects: which process parameter drove this lot's monomer purity? It starts at a development-era affectsQuality assertion (bp:FeedRate bp:affectsQuality bp:MonomerPct-CQA), finds the production-phase run that realized that parameter as a setpoint, steps to the batch the run output, and walks derivedFrom forward to the released drug-substance lot carrying the result — joining a development study, a plant sensor, and a QC verdict in one statement that lands the feed-rate CPP on DS-001. The point of publishing the vocabulary is that these three queries run over one artifact: lay the edges faithfully and the lineage, the recall scope, and the cross-lifecycle link are all computed, not reconstructed by archaeology.
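The shape of that third, cross-lifecycle join can be sketched over a tiny in-memory triple set. The text does not name the run or the property linking a run to the parameter it realizes, so `RUN-001` and `hasSetpointFor` are hypothetical placeholders, and the single `derivedFrom` triple stands in for the already-inferred transitive closure.

```python
# Illustrative triples. "RUN-001" and "hasSetpointFor" are HYPOTHETICAL names;
# the derivedFrom triple stands in for the inferred transitive closure.
TRIPLES = {
    ("FeedRate", "affectsQuality", "MonomerPct-CQA"),  # development-era assertion
    ("RUN-001", "hasSetpointFor", "FeedRate"),         # production run realizes the parameter
    ("RUN-001", "hasOutput", "BATCH-2026-001"),        # the run outputs the batch
    ("DS-001", "derivedFrom", "BATCH-2026-001"),       # released lot traces to that batch
    ("DS-001", "monomerPct", 98.611),                  # the QC result on the lot
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the graph."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def parameters_behind(lot, cqa):
    """Parameters asserted to affect `cqa` whose run produced an ancestor of `lot`."""
    lot_ancestors = objects(lot, "derivedFrom")
    drivers = set()
    for parameter in subjects("affectsQuality", cqa):
        for run in subjects("hasSetpointFor", parameter):
            if objects(run, "hasOutput") & lot_ancestors:
                drivers.add(parameter)
    return sorted(drivers)


drivers = parameters_behind("DS-001", "MonomerPct-CQA")
print(drivers, objects("DS-001", "monomerPct"))
```

The join only succeeds because the development study, the run, and the QC result all name the same identifiers; rename any one of them locally and the loop returns nothing.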

Diagram of the assembled bp: vocabulary. Five colored panels run across the top: Foundations (the BFO spine and core relations), Discovery and Development (target, molecule, cell line, process, analytics, tech transfer), Upstream (seed train, production bioreactor, harvest), Downstream (capture, viral safety, polishing, UF/DF), and Fill-Finish and Release (formulation, QC release, packaging, distribution), each listing representative bp: classes it contributes. A down arrow leads from every panel into a central genealogy spine showing the derivedFrom chain from WCB-CHO-001 through SEED-001, BATCH-2026-001 and DS-001, forking to DP-001 and DP-002. A full-width capstone bar summarizes one bp: ontology of 206 classes, 88 object properties and 46 datatype properties, aligned up to BFO, IOF, OBO and QUDT and loaded and SHACL-gated by validate.py.

One ontology, published as one vocabulary: every chapter adds its classes to a single bp: namespace, all grounded on one BFO spine and threaded by the transitive derivedFrom genealogy, with 206 classes assembled into the complete, loadable model the whole book was writing. Original diagram by the authors, created with AI assistance.

FAIR turned into measurement

FAIR's power is that it decomposes into concrete checks, which is what lets it be measured rather than merely claimed [1][2]. For the published CHO-campaign graph, each letter lands on a real artifact and a real manufacturing need:

  • Findable — every entity carries a globally unique, persistent IRI and indexed metadata, so an investigator can locate the exact drug-substance lot rather than hunt for it across spreadsheets. bp:DS-001 expands to one global string, https://example.org/bioproc#DS-001, the same everywhere it appears — in the lineage walk, the release gate, and the IDMP record — with a label, a type, and a release status, not an anonymous cell.
  • Accessible — the data is retrievable by that identifier over a standard protocol with clear access rules, which is what lets an auditor pull the cell-bank certificate on demand. A SPARQL endpoint with documented authorization passes; FAIR-accessible does not mean open — a GMP batch record is regulated and tightly access-controlled, yet fully FAIR if its access conditions are clear and its retrieval mechanism standard.
  • Interoperable — values use shared, formal vocabularies and qualified, unit-bearing references, so a monomerPct of 98.611 cannot be misread as a fraction by another system or a regulator's tool. A value resolving to a shared class with a QUDT unit passes; a bare string does not — and this is the dimension the genealogy itself depends on, since a derivedFrom edge is only combinable across systems if both sides recognize what it means.
  • Reusable — the data is richly described with provenance and a clear usage license, so a stranger investigating a deviation can trust the batch record without re-deriving it. A result carrying its method, its sample lineage back to the cell bank, and terms of use passes; an orphaned number does not.

These are loadable individuals, not slide bullets. The graph carries its own FAIR self-assessment of the DS-001 record as four bp:FAIRCheck nodes hung off a bp:FAIRAssessment, plus the license the Reusable check depends on:

bp:FAIR-DS-001 a bp:FAIRAssessment ; rdfs:label "FAIR assessment of the DS-001 record" ;
    bp:assesses bp:DS-001 ;
    bp:hasCheck bp:FC-F , bp:FC-A , bp:FC-I , bp:FC-R .
bp:FC-F a bp:FAIRCheck ; rdfs:label "Findable (global IRI)" ; bp:fairVerdict "PASS" .
bp:FC-A a bp:FAIRCheck ; rdfs:label "Accessible (resolves, with access conditions)" ; bp:fairVerdict "PASS" .
bp:FC-I a bp:FAIRCheck ; rdfs:label "Interoperable (shared vocab + QUDT units)" ; bp:fairVerdict "PARTIAL" .
bp:FC-R a bp:FAIRCheck ; rdfs:label "Reusable (method, lineage, licence)" ; bp:fairVerdict "PASS" .
bp:LICENSE-CC-BY a bp:UsageLicense ; rdfs:label "CC BY 4.0" .
bp:DS-001 bp:hasLicense bp:LICENSE-CC-BY .

The honesty of the model is in the one verdict that is not "PASS": Interoperable is recorded "PARTIAL", not because the QUDT units or shared-class edges are missing, but because even a graph this carefully aligned still maps only a fraction of its local terms up to verified external IRIs — and interoperability is the dimension a real graph most often fails.
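One way to see how four letters become four checks is a small scoring function. The sketch below is an assumption-laden illustration, not the logic behind the bp:FAIRCheck nodes: the record fields, the pass rules, and the function names are invented here, and the mapping table only records whether each example term from this chapter has a verified external IRI.

```python
# Whether each local term has a VERIFIED external IRI (examples from the chapter).
TERM_MAPPED = {
    "bp:Material": True,
    "bp:HostOrganism": True,
    "bp:HCPAssay": True,
    "bp:derivedFrom": True,
    "bp:affectsQuality": False,  # no canonical external leaf; stays ILLUSTRATIVE
}

# Hypothetical metadata record for DS-001; the field names are assumptions.
RECORD = {
    "iri": "https://example.org/bioproc#DS-001",
    "access_conditions": "SPARQL endpoint with documented authorization",
    "license": "CC BY 4.0",
    "has_provenance": True,
}


def interoperable_verdict(term_mapped):
    """PASS only if every term is mapped; PARTIAL if some are; FAIL if none."""
    mapped = sum(term_mapped.values())
    if term_mapped and mapped == len(term_mapped):
        return "PASS"
    return "PARTIAL" if mapped else "FAIL"


def scorecard(record, term_mapped):
    """Turn each FAIR letter into one concrete, checkable verdict."""
    reusable = bool(record.get("license")) and bool(record.get("has_provenance"))
    return {
        "Findable": "PASS" if record.get("iri", "").startswith("https://") else "FAIL",
        "Accessible": "PASS" if record.get("access_conditions") else "FAIL",
        "Interoperable": interoperable_verdict(term_mapped),
        "Reusable": "PASS" if reusable else "FAIL",
    }


card = scorecard(RECORD, TERM_MAPPED)
for letter, verdict in card.items():
    print(f"{letter:14s} {verdict}")
```

The design choice worth noting is that Interoperable is the only letter scored as a ratio rather than a presence check, which is why it is the one that can come back "PARTIAL".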

Identity card of a FAIR assessment scorecard for the bioprocess graph. Four rows, one per letter, each carry a concrete check and a verdict: Findable (persistent IRI plus indexed metadata: pass), Accessible (SPARQL endpoint with documented, possibly restricted access: pass, with a note that accessible is not open), Interoperable (values use shared ontology classes and QUDT units versus bare strings: the at-risk row, marked partial), and Reusable (provenance, lineage, and license present: pass). A machine-actionability banner runs across the top, and a callout flags Interoperability as the dimension most likely to score low because metadata is authored by hand without a controlled vocabulary.

FAIR as a scorecard, not a slogan: each letter becomes a concrete check, machine-actionability is the target, and Interoperability is flagged as the dimension a real graph most often fails, usually because of hand-authored metadata, not the engine. Original diagram by the authors, created with AI assistance.

Reusable means provenance: CQ-15 reconciles two source systems

The Reusable letter is earned by provenance, and the graph carries a real one. Two source systems disagreed about a node: the batch register (the MES) claimed BATCH-2026-001 is a Batch material, while the genealogy loader (an ETL job) claimed the run used a bioreactor that resolves to BR-101. A naive integration would owl:sameAs-merge the two and conflate the batch material with the vessel. Instead, a data steward's curation used both claims to separate them, with each claim attributed to its own source agent via W3C PROV-O. This is exactly the discipline instances.ttl records, with bp:reconciliation-001 prov:wasAssociatedWith bp:DataSteward tying the decision to the steward:

# instances.ttl — the PROV reconciliation: two source claims, one curated decision, no over-merge.
bp:BatchRegister a prov:SoftwareAgent ; rdfs:label "batch register (MES)" .
bp:GenealogyLoad a prov:SoftwareAgent ; rdfs:label "genealogy loader (ETL)" .
bp:DataSteward a prov:Agent ; rdfs:label "ontology data steward" .
bp:claim-batch-001 a prov:Entity ; rdfs:label "source claim: BATCH-2026-001 is a Batch (material)" ;
    prov:wasAttributedTo bp:BatchRegister .
bp:claim-vessel-001 a prov:Entity ; rdfs:label "source claim: BATCH-2026-001's run used a bioreactor -> resolved to BR-101" ;
    prov:wasAttributedTo bp:GenealogyLoad .
bp:reconciliation-001 a prov:Activity ; rdfs:label "curation: separate the vessel (BR-101) from the batch material" ;
    prov:used bp:claim-batch-001 , bp:claim-vessel-001 ;
    prov:wasAssociatedWith bp:DataSteward .

CQ-15 asks the published graph to return the two reconciled source claims, each attributed to its source — a SELECT over prov:used and prov:wasAttributedTo:

# queries/CQ-15.rq — the two source claims the steward's reconciliation USED, each attributed.
PREFIX bp: <https://example.org/bioproc#>
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT ?claim ?source WHERE {
  bp:reconciliation-001 prov:used ?claim .
  ?claim prov:wasAttributedTo ?source .
} ORDER BY ?claim

validate.py runs it over the loaded graph and returns both claims, the curated decision recorded as provenance rather than a destructive merge:

CQ-15 provenance PASS claim = ['claim-batch-001', 'claim-vessel-001']

That recorded provenance is precisely what makes BATCH-2026-001's record reusable by a stranger: a downstream consumer can see not only the fact but who asserted it and how the conflict was resolved.
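The join CQ-15 performs is small enough to restate in plain Python, which may help readers less used to SPARQL see what "used, then attributed" returns. The two dictionaries mirror the prov:used and prov:wasAttributedTo triples shown above; the dictionary layout and the function name are assumptions for illustration only.

```python
# prov:used edges: activity -> the claims it consumed (from the listing above).
USED = {
    "reconciliation-001": ["claim-batch-001", "claim-vessel-001"],
}

# prov:wasAttributedTo edges: claim -> the source agent that asserted it.
ATTRIBUTED_TO = {
    "claim-batch-001": "BatchRegister",
    "claim-vessel-001": "GenealogyLoad",
}


def attributed_claims(activity):
    """Each claim the activity used, paired with its source, ordered by claim."""
    return sorted(
        (claim, ATTRIBUTED_TO[claim])
        for claim in USED.get(activity, [])
        if claim in ATTRIBUTED_TO  # a claim with no attribution is dropped, as in the join
    )


claims = attributed_claims("reconciliation-001")
for claim, source in claims:
    print(f"{claim} <- {source}")
```

Both source claims survive as separate, attributed records, which is the difference between a curated decision and a merge that would have erased one of them.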

Interoperable means one shared identity: CQ-16 puts IDMP on the same node

The Interoperable letter is earned when the regulator's identity and the released quality are one entity, not a copy. The graph attaches the ISO IDMP substance identity — the regulator's name for the substance the model tracks internally as bp:DS-001 — directly to DS-001, the same node the release gate validated:

# instances.ttl — the IDMP regulatory identity attaches to the SAME node the gate validated.
bp:DS-001 bp:hasSubstanceIdentifier bp:IDMP-DS-001 .
bp:IDMP-DS-001 a bp:SubstanceIdentifier ; rdfs:label "IDMP substance identity for DS-001" ;
    bp:isAbout bp:DS-001 ;
    bp:uniiCode "ILLUSTRATIVE-UNII-0001" ;    # an FDA UNII / GSRS code (value illustrative)
    bp:mpid "ILLUSTRATIVE-MPID-mAb-A" .       # an IDMP Medicinal Product Identifier (value illustrative)

CQ-16 verifies the regulated identity and the released CQA panel hang off the same IRI with one ASK:

# queries/CQ-16.rq — the IDMP identifier and the release CQA are carried by the SAME node.
PREFIX bp: <https://example.org/bioproc#>
ASK {
  bp:DS-001 bp:hasSubstanceIdentifier ?id .
  bp:DS-001 bp:monomerPct ?m .
}

validate.py returns true: the lot carrying the IDMP/UNII identifier is the lot carrying the release monomer value, so the regulator's view and the manufacturer's view of the substance are interoperable by construction, not by a fragile join:

CQ-16 provenance PASS ASK = True
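A short sketch shows why "the same node, not a copy" is a testable property rather than a stylistic preference. It mimics the ASK over two tiny triple lists: one where the identifier sits on DS-001, and one where an integration has attached it to a duplicate. The node name `DS-001-copy` and the function name are hypothetical.

```python
def same_node_ask(triples, node):
    """True only if `node` itself carries BOTH the identifier and the CQA value."""
    has_identifier = any(
        s == node and p == "hasSubstanceIdentifier" for s, p, _ in triples
    )
    has_release_value = any(s == node and p == "monomerPct" for s, p, _ in triples)
    return has_identifier and has_release_value


# Identifier and release value on the same node, as in the published graph.
shared = [
    ("DS-001", "hasSubstanceIdentifier", "IDMP-DS-001"),
    ("DS-001", "monomerPct", 98.611),
]

# HYPOTHETICAL failure mode: the identifier was attached to a duplicate node.
copied = [
    ("DS-001-copy", "hasSubstanceIdentifier", "IDMP-DS-001"),
    ("DS-001", "monomerPct", 98.611),
]

print("shared node:", same_node_ask(shared, "DS-001"))
print("copied node:", same_node_ask(copied, "DS-001"))
```

Both lists contain the same facts, yet only the first answers the question, because the second would need an extra join that someone has to remember to write and maintain.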

Where the published thread ends: the cold chain and the factory wall

The published graph models the last leg of the campaign honestly, and that leg is where FAIR's federation limit becomes concrete. Distribution is a monitored process: a temperature-sensitive vial must stay within its validated 2-8 deg C cold chain from the plant onward, and a departure from that range — an excursion — is a consequential event that can compromise the product's quality and shelf life, revisiting the storage assumption the release decision made. The dataset carries this as real individuals — a bp:Shipment that prov:used the released DP-001, a bp:ColdChainProcess bearing the logger trace, a bp:DataLogger, a discrete bp:TemperatureExcursion, and the bp:CustodyRole that names the handoff to a distributor — with the dense logger payload referenced (index-versus-payload) rather than flattened into the graph:

# instances.ttl — the first distribution leg, modeled as the manufacturer's last authored edges.
bp:SHIP-001 a bp:Shipment ; rdfs:label "shipment of pallet P-001 to distributor" ;
    prov:used bp:DP-001 ;                     # the released drug product (true individual)
    bp:hasOutput bp:VIAL-DP-001-000042 .      # the serialized vial that ships
bp:COLDCHAIN-001 a bp:ColdChainProcess ; rdfs:label "2-8 C cold-chain transport" ;
    bp:hasParticipant bp:VIAL-DP-001-000042 ;
    bp:storageTempLowC 2.0 ; bp:storageTempHighC 8.0 ;
    bp:hasTrace bp:Trace-Logger-DP001 .
bp:LOGGER-01 a bp:DataLogger ; rdfs:label "cold-chain temperature logger" .
bp:EXCURSION-001 a bp:TemperatureExcursion ; rdfs:label "transient temperature excursion" ;
    bp:isAbout bp:VIAL-DP-001-000042 ; bp:excursionTempC 11.4 .
bp:CUSTODY-distributor a bp:CustodyRole ; rdfs:label "distributor custody role" .

Because derivedFrom is transitive, the serialized vial VIAL-DP-001-000042 (carrying its GS1 SGTIN as an identifier system cross-linked by skos:exactMatch, not subclassed) closes the whole walk from a pharmacy shelf back to WCB-CHO-001: the same eleven-ancestor lineage of DS-001, extended forward through the drug product to the vial, now stretches from a patient's hand to a frozen vial of cells. But the graph ends where authorship passes to the distributor: the excursion to 11.4 °C (bp:excursionTempC 11.4) was logged on the transit leg the manufacturer controlled, while which truck carried the pallet onward, which warehouse held it, and at what temperature are written in other organizations' systems, if at all. And the thread stops one step short of the patient on purpose, linking to the fact that a unit was dispensed without ingesting who received it, because the patient is a person with privacy rights, not a manufacturing entity, and that boundary is data-protection law, not an engineering limit. This is the structural reason the published artifact is Interoperable in fact only inside the factory and federable only as far as the wall.
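The index-versus-payload split mentioned above can be illustrated with a sketch of how a dense logger trace might be reduced to the discrete excursion events the graph actually stores. The timestamps and all readings except 11.4 are hypothetical, as is the function name; only the 2.0 to 8.0 range and the 11.4 value come from the listing.

```python
# Validated storage range from bp:storageTempLowC / bp:storageTempHighC.
LOW_C, HIGH_C = 2.0, 8.0

# HYPOTHETICAL logger payload (timestamp, temperature in Celsius).
# Only the 11.4 reading corresponds to a value given in the text.
READINGS = [
    ("2026-03-01T08:00", 4.9),
    ("2026-03-01T09:00", 5.3),
    ("2026-03-01T10:00", 11.4),
    ("2026-03-01T11:00", 6.1),
]


def excursions(readings, low, high):
    """Readings that fall outside the inclusive [low, high] validated range."""
    return [(stamp, temp) for stamp, temp in readings if not low <= temp <= high]


events = excursions(READINGS, LOW_C, HIGH_C)
print(events)
```

Only the out-of-range reading would become a bp:TemperatureExcursion individual; the in-range readings stay in the referenced payload, which keeps the graph small without losing the evidence.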

Evaluation: the published graph still passes

Publication does not get to weaken validation. validate.py loads the published artifact (bioproc.ttl + align.ttl + instances.ttl), parses 2120 triples, applies the OWL-RL closure to expand them to 7137 triples, runs all 23 competency questions, and SHACL-gates the result. The provenance pair this chapter rests on accounts for two of the green lines, and the release gate still fails only where it should: DP-004/DS-004 on hmwPct, with every other lot in spec, the DS-001 monomer at 98.611 %, and the total viral LRV at 8.7:

[1] parsed 2120 triples (bioproc + align + instances)
[2] reasoned: 2120 -> 7137 triples after OWL-RL closure
...
CQ-15 provenance PASS claim = ['claim-batch-001', 'claim-vessel-001']
CQ-16 provenance PASS ASK = True
...
23/23 competency questions PASS

ALL CHECKS PASSED

A published vocabulary that no longer passed its own ORSD would not be a release; it would be a regression. The LOT publication step ships the same green artifact the validation phase signed off.

The unsolved part: compliant on the wire, hollow in fact, fragile past the wall

Here is the gap the series has circled. When researchers assess real datasets against FAIR, a consistent pattern emerges: nearly everything is Findable, while Interoperability is the dimension that most often scores poorly, frequently the lowest of the four [2][3]. The cause is rarely the triplestore — it is metadata authored by hand on the plant floor. An operator under deadline types a CQA name into a free-text field without a controlled vocabulary to pick from, so one batch record says "monomer", the next "% main peak", the next "purity (SEC)", and the units arrive as bare strings; the fields exist but point at no shared ontology, and the lineage edge between two such systems cannot be combined even though both are valid RDF. The data is findable and downloadable yet un-combinable. That is exactly why the model's own FC-I verdict is honestly "PARTIAL": some terms map cleanly to a verified external IRI — bp:Material to BFO's material entity, bp:HostOrganism to NCBI Taxonomy's Cricetulus griseus, bp:HCPAssay to the OBI ELISA class, bp:derivedFrom to RO's derives from — while others like the QbD relation affectsQuality have no canonical external leaf and stay marked ILLUSTRATIVE rather than mapped to a plausible-but-wrong IRI, and the IOF terms were verified against the published IOF release rather than the EBI Ontology Lookup Service, which does not host IOF — so a downstream tool resolving IRIs through OLS recognizes the BFO/OBI side and silently misses the IOF side. A SHACL gate catches a missing field or a wrong datatype, but it cannot catch a human who confidently mislabels the SEC monomer field with a plausible-looking but wrong term — and a confident mislabel on a release CQA is precisely the kind of error that undercuts interoperability assurance. FAIRness is delivered by discipline and tooling, not by adopting RDF.
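The free-text problem described above is usually attacked at the point of entry, by resolving what the operator typed against a controlled vocabulary and refusing to guess when nothing matches. The sketch below assumes a hypothetical, steward-curated synonym table; the three known labels are the ones quoted in the text, while `main pk %` and the mapping target are illustrative.

```python
# HYPOTHETICAL steward-curated synonym table: free-text label -> controlled term.
# Keys are stored lower-cased; the three labels are the ones quoted in the text.
SYNONYMS = {
    "monomer": "bp:monomerPct",
    "% main peak": "bp:monomerPct",
    "purity (sec)": "bp:monomerPct",
}


def normalize(label):
    """Return the controlled term for a free-text label, or None if unknown."""
    return SYNONYMS.get(label.strip().lower())


# What operators actually typed; the last label is an invented unmatched case.
typed = ["monomer", "% Main Peak", "purity (SEC)", "main pk %"]

resolved = {label: normalize(label) for label in typed}
unmapped = [label for label, term in resolved.items() if term is None]

print(resolved)
print("needs steward review:", unmapped)
```

Returning `None` for the unknown label is deliberate: routing it to review mirrors the chapter's choice to leave a term marked ILLUSTRATIVE instead of mapping it to a plausible-but-wrong IRI. A lookup like this still cannot catch a label that matches the table but was typed into the wrong field.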

The deeper limit is the one distribution exposed and that publication makes unavoidable: the thread is ironclad inside the factory and a fragile federation beyond it. The manufacturer's published graph models the first shipment leg honestly — a bp:Shipment prov:used bp:DP-001, the cold-chain process it logged, the custody role it assigned — but it ends where authorship passes to the distributor. Past that handoff, provenance depends on a federation of independent graphs that must agree on what a unit is and choose to share what happened to it, including any cold-chain excursion in a warehouse the manufacturer never controlled. You can standardize the interface (GS1, EPCIS) and hope each party holds up its end; you cannot mandate another company's data quality. Cross-organizational digital-thread federation is a genuinely open problem no published ontology resolves on its own. So the honest FAIR standard for the published artifact is precise: genuinely Findable, Accessible, and Reusable; Interoperable in fact only as far as the alignment is verified; and federable only as far as the factory wall.

Why it matters

FAIR is the entire justification for the modeling effort — the reason a mAb maker pays the cost of ontologies, IRIs, and units rather than just storing numbers. The payoff is concrete: a published, FAIR graph turns a deviation into a lineage walk back to WCB-CHO-001, an out-of-spec lot into a recall scoped by query to exactly DP-001 and DP-002, and a release CQA into a value a regulator's tool can read without ambiguity. But that justification holds only if the FAIRness is real, and the consistent finding that real data is findable-but-not-interoperable means the value is routinely claimed and not delivered: a graph whose monomerPct is a bare string cannot actually scope the recall it promises to. Publishing the model as one versioned, licensed bp: vocabulary and then measuring its FAIRness — turning each letter into a check and scoring it honestly, including a "PARTIAL" where one is deserved — is what keeps a project from the comfortable lie that adopting standards equals achieving the goal. A graph that looks FAIR and is not is arguably worse than an honest spreadsheet, because it invites a trust the batch record has not earned.

In the real world

LOT was built explicitly for industrial ontology development and is used in standards bodies and large companies to publish reusable vocabularies, with publication, versioning, and FAIR release as first-class steps. FAIR assessment, in turn, has matured from principle to practice: there are published metrics, maturity indicators, and interpretation frameworks that let an organization score its data rather than assert compliance [2][3]. The consistent, sobering result across domains is that Interoperability and rich Reusability lag far behind Findability. In biopharma, the standards that would deliver interoperability — IOF/BMIC, Allotrope, QUDT — exist and are converging, so the bottleneck is not missing technology but the discipline of authoring to them on the floor, and the federation work — shared registries, verifiable credentials — that the supply chain is still building out past every organizational handoff.

Key terms

  • LOT (Linked Open Terms) — the industry-oriented methodology governing requirements → implementation → publication → maintenance of a reusable vocabulary; the source of this phase's publish-and-release back end.
  • Assembled vocabulary — the single bp: namespace of 206 classes, 88 object properties, and 46 datatype properties, published as one versioned, licensed artifact rather than many modelets.
  • Lineage walk / impact analysis — the queries publication earns: a (derivedFrom)+ property path that returns the eleven-ancestor ancestry of DS-001 back to WCB-CHO-001, and its inverse that scopes a failed lot's recall to the siblings (DP-001, DP-002) sharing its cell bank.
  • Cold chain / excursion — the validated 2-8 deg C range the shipped vial must hold, modeled as a monitored bp:ColdChainProcess; a departure (bp:excursionTempC 11.4) is a discrete event bearing on the unit's fitness for use.
  • Patient-identity edge — the deliberate termination of the manufacturing thread at dispensing, short of an identified individual, where data integrity gives way to data-protection law.
  • FAIR principles — that data be Findable, Accessible, Interoperable, and Reusable, with machine-actionability the explicit target; principles, not a conformance test.
  • FAIR is not open — Accessible means clear access conditions and a standard retrieval mechanism, not that everyone may read everything; restricted regulated data can be fully FAIR.
  • The interoperability gap — the consistent finding that real data is Findable but most often fails Interoperability, because metadata is hand-authored without a controlled vocabulary; why the model's own FC-I verdict is honestly "PARTIAL".
  • PROV reconciliation — recording a curated decision (CQ-15) as a prov:Activity that prov:used two attributed source claims, separating the vessel from the batch material without an owl:sameAs over-merge.
  • IDMP-on-the-same-node — attaching the ISO IDMP regulatory substance identity (CQ-16) to the very DS-001 node the release gate validated, so regulated identity and released quality are one interoperable entity.
  • Federation past the wall — the open problem of continuing the published thread through distributors and pharmacies whose systems, identifiers, and data quality the manufacturer does not control.

Where this leads

The model is built, validated, published as one FAIR-measured vocabulary, and honest about where interoperability and federation thin out. The running example is complete. Part VIII now steps out of the campaign to ask the empirical question the lifecycle assumed an answer to: does the real industry actually do any of this, and how mature is it? It opens with The Standards Bodies: Who Actually Builds Biopharma's Shared Vocabulary, surveying the pre-competitive consortia — Allotrope, the Pistoia Alliance, ISA-88/95, OPC UA and MTP, ISPE Pharma 4.0, BioPhorum, GS1, and the OAGi/NIIMBL biomanufacturing-ontology effort — that produce the shared vocabularies every earlier chapter quietly leaned on.