Validation: The Release Gate and SHACL

📍 Where we are: Part VI · Validation — the lifecycle phase where the model is held to account. Following the SAMOD test-first loop and NeOn's evaluation scenario, this chapter turns the release specification, the finish criteria, the cell-bank requirement, and the upper-spine disjointness into closed-world gates the graph enforces — and runs them against the campaign that has been building since the cell bank.

A monoclonal-antibody campaign spends months converging on one decision: may this lot of mAb-A go into a patient's vein? Everything the book has modeled feeds that decision. The SEC monomer purity on DS-001 says how much of the lot is the intact antibody rather than fragments or clumps. The HMW aggregate level says how much has clustered into the high-molecular-weight species that drive immunogenicity — an aggregated antibody can provoke anti-drug antibodies in the patient, so this is a safety attribute, not a cosmetic one. The CEX main peak says how homogeneous the charge profile is, a proxy for the post-translational modifications that shift efficacy. The host-cell protein count says how much CHO-cell debris survived purification. And the viral clearance summed across orthogonal downstream steps says the lot is safe from adventitious agents. Validation is where the ontology stops describing those attributes and starts deciding on them. The fact established in the upper spine and the axioms chapter is that OWL and SHACL do different jobs: OWL reasons in an open world, where a missing required result is merely "unknown"; SHACL validates in a closed world, where a missing required result is a failure, now. A release decision cannot tolerate "unknown" — a missing sterility test is a failed lot, not an open question — so the release specification is naturally a set of SHACL shapes: gates a lot's data must pass before it may claim release [1].

The simple version

Before a plane takes off, a checklist must be complete — every item present, checked, and initialed. A blank or a missing signature stops the flight, no matter how good the plane looks. A reasoner is the engineer who can deduce new facts but shrugs at a blank line ("maybe it's fine, I wasn't told otherwise"). SHACL is the inspector with the checklist who refuses to clear the flight until every line is genuinely filled. This chapter builds that inspector for an antibody lot — the release gate, the finish gate, the cell-bank gate, and two guards that catch impossible types — and is honest that a complete checklist is not the same as a good flight.

Start from the questions

This chapter is the home of the book's closed-world competency questions, defined in the specification, and each one is a real release question a QC group asks of every batch. CQ-08 asks whether a released drug-substance lot carries exactly one in-spec value for every required CQA — monomer, HMW, CEX-main, HCP, protein concentration — the panel that decides whether DS-001 is the antibody it claims to be. CQ-09 asks whether a released lot is attributably signed with a status from the controlled set, because under GMP an unsigned release is no release. CQ-10 asks whether the finished drug-product lots meet the finish-specific criteria the bulk never had — sterility, appearance, fill volume. CQ-11 asks whether out-of-spec lots are flagged on exactly the failing path and nowhere else, so an investigation aims at the real defect. And CQ-17 asks whether the working cell bank WCB-CHO-001 — the root of the whole genealogy — is fully characterized before a single seed culture is grown from it. These five share a shape no SPARQL SELECT can express, because "is a required test missing?" is not a question about the triples that exist — it is a question about the triples that should exist and do not. That is precisely what SHACL is for [1].
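The difference between the two readings of a missing result can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how a reasoner or a SHACL engine is implemented; the property names follow the chapter's shapes, and the incomplete record is a hypothetical one invented for the example.

```python
# Required drug-substance CQAs for CQ-08 (names follow the chapter's shapes).
REQUIRED_CQAS = {"monomerPct", "hmwPct", "cexMainPct", "hcpPpm", "proteinConcMgPerMl"}


def open_world_status(record, attribute):
    """Open-world reading: an absent value is merely unknown, never a failure."""
    return "known" if attribute in record else "unknown"


def closed_world_missing(record):
    """Closed-world reading: every required attribute that is absent is a violation."""
    return sorted(REQUIRED_CQAS - set(record))


# HYPOTHETICAL incomplete record: the HCP result was never entered.
incomplete_lot = {
    "monomerPct": 98.6,
    "hmwPct": 1.3,
    "cexMainPct": 70.7,
    "proteinConcMgPerMl": 50.2,
}

print(open_world_status(incomplete_lot, "hcpPpm"))   # the reasoner's shrug
print(closed_world_missing(incomplete_lot))          # the inspector's finding
```

The second function asks about what should exist and does not, which is the question the five competency questions share.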

The release gate: the specification is a shape, release is conformance

A release specification lists, for the drug substance and the drug product, the required tests and their acceptance criteria (monomer purity, aggregate level, charge-variant main peak, host-cell protein, protein concentration), structured by guidance such as ICH Q6B for biotech products [2]. Each limit encodes a real mAb risk: a 95 % monomer floor because a monomer fraction below it signals a product that has fragmented or aggregated; a 2.0 % HMW ceiling because aggregates above it raise the immunogenicity risk; a 60–80 % CEX-main window because a charge profile drifting outside it means the antibody's modifications are no longer the validated ones; a 100 ppm HCP ceiling because residual CHO protein is itself an immunogen. Modeled, that specification is bp:ReleaseShape, the actual shape in shapes.ttl, targeting both lot-level released materials at once, the bulk substance and the filled product:

# shapes.ttl: the release gate (excerpt). Every released lot's full CQA panel, complete and in spec.
bp:ReleaseShape a sh:NodeShape ;
    sh:targetClass bp:DrugSubstance , bp:DrugProduct ;

    # Exactly one monomer result, a float, at or above the spec floor.
    sh:property [
        sh:path bp:monomerPct ;
        sh:name "SEC %monomer" ;
        sh:minCount 1 ; sh:maxCount 1 ;
        sh:datatype xsd:float ;
        sh:minInclusive 95.0 ;
        sh:message "Monomer purity is missing, duplicated, or below the 95.0 % release limit." ] ;

    # HMW aggregate at or below its upper limit (the criterion DS-004/DP-004 trip).
    sh:property [
        sh:path bp:hmwPct ;
        sh:name "SEC %HMW aggregate" ;
        sh:minCount 1 ; sh:maxCount 1 ;
        sh:datatype xsd:float ;
        sh:maxInclusive 2.0 ;
        sh:message "HMW aggregate is missing or above the 2.0 % release limit." ] ;

    # CEX main charge-variant peak within its window.
    sh:property [
        sh:path bp:cexMainPct ;
        sh:name "CEX %main (charge variant)" ;
        sh:minCount 1 ; sh:maxCount 1 ;
        sh:datatype xsd:float ;
        sh:minInclusive 60.0 ; sh:maxInclusive 80.0 ;
        sh:message "CEX main peak is missing or outside the 60.0-80.0 % window." ] ;

    # Host-cell protein at or below its upper limit.
    sh:property [
        sh:path bp:hcpPpm ;
        sh:name "host-cell protein (ppm)" ;
        sh:minCount 1 ; sh:maxCount 1 ;
        sh:datatype xsd:float ;
        sh:maxInclusive 100.0 ;
        sh:message "HCP is missing or above the 100 ppm release limit." ] ;

    # Protein concentration within the formulation window.
    sh:property [
        sh:path bp:proteinConcMgPerMl ;
        sh:name "protein concentration (mg/mL)" ;
        sh:minCount 1 ; sh:maxCount 1 ;
        sh:datatype xsd:float ;
        sh:minInclusive 45.0 ; sh:maxInclusive 55.0 ;
        sh:message "Protein concentration is missing or outside the 45-55 mg/mL window." ] ;

    # Release status drawn from a controlled set.
    sh:property [
        sh:path bp:releaseStatus ;
        sh:minCount 1 ; sh:maxCount 1 ;
        sh:in ( "PASS" "OOS" "PENDING" ) ] ;

    # An attributable signature (21 CFR Part 11 / Annex 11).
    sh:property [
        sh:path bp:approvedBy ;
        sh:minCount 1 ;
        sh:message "Release record is unsigned." ] .

Read it as the checklist made executable, and notice it answers two competency questions at once. CQ-08 is the panel rows: the lot must carry exactly one monomer result (minCount/maxCount — no cherry-picking the favorable repeat when an assay is run twice) that is at or above the floor (minInclusive 95.0), an HMW result at or below its limit (maxInclusive 2.0), and CEX-main, HCP, and protein concentration each in their declared windows. That the same bp:ReleaseShape targets both bp:DrugSubstance and bp:DrugProduct is not an accident of convenience: the panel attributes are inherited down the genealogy. The release monomerPct sits on the drug-substance lot DS-001 — the convergence node where eleven ancestors of upstream lineage meet — and the product lots filled from it carry the same panel because they are the same antibody, simply in vials. CQ-09 is the last two rows: the releaseStatus must be drawn from the controlled set ("PASS" "OOS" "PENDING") — not a free-text flag an operator could type "released?" into — and the lot must bear an attributable approvedBy signature, the electronic-records requirement of 21 CFR Part 11 and EU Annex 11 [3]. The gate is not prose in an SOP a human must remember to apply; it is a rule the graph enforces on every lot, automatically and identically — the point at which the months-long quality thread becomes a control rather than a record.
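To make the checklist reading concrete, here is a minimal plain-Python mirror of the logic bp:ReleaseShape encodes. It is a sketch for intuition only and is not how pySHACL evaluates shapes. The limits are copied from the shape; the DS-001 panel values are those shown in the chapter's release-gate diagram; the signer name, the "OOS" status on DS-004, and DS-004's CEX, HCP, and protein values are assumptions made for illustration (the chapter says only that they are in spec).

```python
# Limits copied from bp:ReleaseShape; None means "no bound on that side".
PANEL_LIMITS = {
    "monomerPct": (95.0, None),
    "hmwPct": (None, 2.0),
    "cexMainPct": (60.0, 80.0),
    "hcpPpm": (None, 100.0),
    "proteinConcMgPerMl": (45.0, 55.0),
}
ALLOWED_STATUS = {"PASS", "OOS", "PENDING"}


def check_release(lot):
    """Return (path, reason) violations; an empty list means the lot conforms.

    `lot` maps each property to a LIST of recorded values, so that missing
    and duplicated results are both visible (minCount / maxCount).
    """
    violations = []
    for path, (low, high) in PANEL_LIMITS.items():
        values = lot.get(path, [])
        if len(values) != 1:
            # Simplification: a count failure skips the range check here.
            violations.append((path, f"expected exactly one value, found {len(values)}"))
            continue
        value = values[0]
        if low is not None and value < low:
            violations.append((path, f"{value} is below {low}"))
        if high is not None and value > high:
            violations.append((path, f"{value} is above {high}"))
    status = lot.get("releaseStatus", [])
    if len(status) != 1 or status[0] not in ALLOWED_STATUS:
        violations.append(("releaseStatus", "missing, duplicated, or outside the controlled set"))
    if not lot.get("approvedBy"):
        violations.append(("approvedBy", "release record is unsigned"))
    return violations


ds_001 = {
    "monomerPct": [98.611], "hmwPct": [1.287], "cexMainPct": [70.686],
    "hcpPpm": [12.0], "proteinConcMgPerMl": [50.2],
    "releaseStatus": ["PASS"],
    "approvedBy": ["qa.reviewer"],  # ASSUMPTION: hypothetical signer
}
# ASSUMPTION: DS-004 reuses DS-001's CEX/HCP/protein values as in-spec placeholders.
ds_004 = dict(ds_001, monomerPct=[98.687], hmwPct=[2.41], releaseStatus=["OOS"])

print(check_release(ds_001))
print(check_release(ds_004))
```

Storing each property as a list is the design choice that matters: it lets the check distinguish "no result", "one result", and "two results", which is what the minCount/maxCount pair expresses.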

The finish gate: drug-product criteria the bulk substance does not carry

A drug substance is a bulk liquid measured by concentration; a drug product is a sterile, filled, sealed unit you count, and that change of kind brings criteria the bulk never had. Fill-finish takes DS-001, combines it with excipients — polysorbate 80, histidine buffer, sucrose, modeled as real bp:hasComponent edges so the formulation is traceable structure rather than a buried recipe — and fills it under sterile conditions into a vial-stopper-seal container-closure system CCS-001. That container-closure is part of the product's identity, not packaging incidental to it, because container-closure integrity is the quality that protects sterility and stability across shelf life. Sterility itself is enabled by the fill-finish process: a bulk in a holding tank is not asked whether it is sterile in a filled-unit sense, but a sealed vial absolutely is. So those criteria belong on a separate shape, bp:DrugProductFinishShape, targeting only bp:DrugProduct — the model's answer to CQ-10 [3]:

# shapes.ttl: the finish gate, drug-product-specific criteria the bulk substance does not carry.
bp:DrugProductFinishShape a sh:NodeShape ;
    sh:targetClass bp:DrugProduct ;

    sh:property [
        sh:path bp:sterilityResult ;
        sh:name "sterility" ;
        sh:minCount 1 ; sh:maxCount 1 ;
        sh:in ( "STERILE" ) ;
        sh:message "Sterility result is missing or not STERILE." ] ;

    sh:property [
        sh:path bp:appearance ;
        sh:name "appearance" ;
        sh:minCount 1 ;
        sh:message "Appearance description is missing." ] ;

    sh:property [
        sh:path bp:fillVolumeMl ;
        sh:name "fill volume (mL)" ;
        sh:minCount 1 ; sh:maxCount 1 ;
        sh:datatype xsd:float ;
        sh:minInclusive 0.1 ;
        sh:message "Fill volume is missing or implausibly low." ] .

Splitting the gates is the modeling point, and it is the continuant/occurrent and bulk-versus-discrete distinctions made enforceable. Every bp:DrugProduct must pass both bp:ReleaseShape (the shared antibody panel) and bp:DrugProductFinishShape (the finish criteria), while a bp:DrugSubstance is held only to the panel. The product lot DP-001 carries sterilityResult "STERILE", an appearance string — "clear, colourless, essentially free of visible particles" — and fillVolumeMl "1.0", so it clears the finish gate; the bulk DS-001 is never asked for a sterility result, because the shape does not target it. Two shapes, one product, no contradiction. Note what the finish gate does not yet check: the identity of an individual vial. A lot is tens of thousands of vials, and at release we model the lot, not the item — the lot-versus-item individuation tension is real, and the model is deliberately designed so serialization can slot item identity in later without a rebuild, but at the release gate the unit of validation is the filled lot.
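The targeting behaviour described above can be sketched in plain Python. This is an illustrative mirror of bp:DrugProductFinishShape and its sh:targetClass, not the engine's implementation; the dictionary layout and function names are hypothetical, while the DP-001 values are the ones quoted in the paragraph.

```python
# Which gates target which class, mirroring the sh:targetClass declarations.
GATES_BY_CLASS = {
    "DrugSubstance": ["ReleaseShape"],
    "DrugProduct": ["ReleaseShape", "DrugProductFinishShape"],
}


def check_finish(lot):
    """Plain-Python mirror of the finish gate; returns the failing paths."""
    failing = []
    # Exactly one sterility result, and it must be "STERILE".
    if lot.get("sterilityResult", []) != ["STERILE"]:
        failing.append("sterilityResult")
    # At least one appearance description.
    if not lot.get("appearance"):
        failing.append("appearance")
    # Exactly one fill volume, at or above 0.1 mL.
    fill = lot.get("fillVolumeMl", [])
    if len(fill) != 1 or fill[0] < 0.1:
        failing.append("fillVolumeMl")
    return failing


def finish_violations(lot_class, lot):
    """Apply the finish gate only to lots whose class it targets."""
    if "DrugProductFinishShape" not in GATES_BY_CLASS.get(lot_class, []):
        return []  # not a target: the shape is never evaluated for this node
    return check_finish(lot)


dp_001 = {
    "sterilityResult": ["STERILE"],
    "appearance": ["clear, colourless, essentially free of visible particles"],
    "fillVolumeMl": [1.0],
}
ds_001 = {}  # the bulk carries no finish data at all

print(finish_violations("DrugProduct", dp_001))
print(finish_violations("DrugSubstance", ds_001))
print(finish_violations("DrugProduct", ds_001))
```

The second call returns no violations even though the record is empty, because the bulk is never a focus node for the finish gate; the third shows what the same empty record would cost a lot typed as a product.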

Figure: Release as a gate. A lot's CQA panel and signature are checked against the specification expressed as SHACL shapes (the shared ReleaseShape plus a product-only finish gate), and a violation emits a structured report that routes an investigation and a shared-fate impact query. The diagram shows DS-001 and DP-001 carrying their CQA panels (monomer 98.611, HMW 1.287, CEX main 70.686, HCP 12, protein 50.2) and signatures and flowing through both shapes to a released state with releaseStatus PASS, while the OOS lots DS-004 and DP-004 divert to a conforms-false path whose validation report names the focus node, result path hmwPct, MaxInclusive constraint, and value 2.41. Original diagram by the authors, created with AI assistance.

Two more closed-world gates: the cell bank and the disjointness guards

The same closed-world discipline gates the root of the genealogy, where it matters most. A working cell bank is the single material every downstream lot transitively derives from, so an error there propagates with full confidence to every batch — and a cell line is a living, mutating culture that drifts as it divides, the one node whose identity an IRI can stabilize on paper but biology never fully fixes. That is exactly why regulators expect a working bank to be characterized before use: WCB-CHO-001 carries four characterization results — identity (is this really the mAb-A CHO line and not a cross-contaminant?), sterility/mycoplasma, adventitious-agent viral safety, and genetic stability — each a bp:CharacterizationResult with verdict "PASS", and a passageNumber of 8 against a ValidatedPassageLimit of 40, because passage count bounds how long the culture has been dividing and therefore how far its productivity and product quality may have drifted. The OWL min-cardinality restriction on bp:WorkingCellBank states open-world that a working bank bears a characterization; bp:CellBankShape enforces it closed-world, as a failure if absent — the model's answer to CQ-17 [4]:

# shapes.ttl: the cell-bank gate (WCB-CHO-001 conforms: 4 characterizations + passage 8).
bp:CellBankShape a sh:NodeShape ;
    sh:targetClass bp:WorkingCellBank ;
    sh:property [
        sh:path bp:hasCharacterization ;
        sh:minCount 1 ;
        sh:message "A working cell bank must carry at least one characterization result." ] ;
    sh:property [
        sh:path bp:passageNumber ;
        sh:minCount 1 ; sh:maxCount 1 ; sh:datatype xsd:integer ;
        sh:message "A working cell bank must record exactly one passage count." ] .

A working bank with no characterization recorded is incomplete, not contradictory — the open-world reasoner has nothing to object to, because absence is not falsehood to OWL. But for a manufacturing root node, "we have no identity test on file" is the worst possible state to ship from: it is precisely the misidentification that, once it has propagated transitively through eleven layers of genealogy, no downstream test can catch. The cell-bank gate refuses to call the bank usable until the evidence is genuinely present, turning a regulatory expectation (ICH Q5D characterization of cell substrates) into a rule the graph enforces.
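As a rough illustration of what the cell-bank gate demands, the following plain-Python sketch mirrors the two property constraints of bp:CellBankShape as excerpted above. The record layout and function name are hypothetical, and the characterization labels are shorthand for the four results the chapter lists for WCB-CHO-001.

```python
def check_cell_bank(bank):
    """Plain-Python mirror of bp:CellBankShape; returns the failing paths."""
    failing = []
    # sh:minCount 1 on hasCharacterization: at least one result must be present.
    if len(bank.get("hasCharacterization", [])) < 1:
        failing.append("hasCharacterization")
    # Exactly one passage number, and it must be an integer.
    passages = bank.get("passageNumber", [])
    is_single_int = (
        len(passages) == 1
        and isinstance(passages[0], int)
        and not isinstance(passages[0], bool)  # bool is a subclass of int in Python
    )
    if not is_single_int:
        failing.append("passageNumber")
    return failing


wcb_cho_001 = {
    "hasCharacterization": [
        "identity", "sterility/mycoplasma", "viral safety", "genetic stability",
    ],
    "passageNumber": [8],
}
uncharacterized_bank = {"passageNumber": [8]}  # HYPOTHETICAL: no evidence on file

print(check_cell_bank(wcb_cho_001))
print(check_cell_bank(uncharacterized_bank))
```

Note that, like the shape as excerpted, this check is satisfied by a single characterization result and does not compare the passage number with the validated limit; tightening either would be a separate modeling decision.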

And two disjointness guards close a gap the open-world reasoner leaves at the level of kinds of thing. The whole BFO-grounded spine rests on splitting continuants (a batch of antibody, which persists and bears qualities through time) from occurrents (a fermentation, which happens and then is over). bioproc.ttl asserts bp:Batch owl:disjointWith bp:CellCultureProcess (a continuant is never an occurrent) and bp:Material owl:disjointWith bp:Equipment (a batch of antibody is never the stainless-steel vessel that held it) — and these are not academic: a loader that mistakes the bioreactor BR-101 for the batch material BATCH-2026-001 would collapse the genealogy, attaching release facts to a vessel that processes a hundred batches a year and breaking every lineage trace through it. But the OWL-RL profile used by the runnable validator does not act on owl:disjointWith, so a planted conflation slips past it silently. The SHACL shapes are what actually catch it:

# shapes.ttl: the continuant/occurrent and material/equipment guards (closed-world).
bp:BatchNotProcessShape a sh:NodeShape ;
    sh:targetClass bp:Batch ;
    sh:not [ sh:class bp:CellCultureProcess ] ;
    sh:message "A Batch (material/continuant) must not also be a CellCultureProcess (occurrent)." .

bp:MaterialNotEquipmentShape a sh:NodeShape ;
    sh:targetClass bp:Material ;
    sh:not [ sh:class bp:Equipment ] ;
    sh:message "A Material (batch/lot/pool) must not also be Equipment (a persisting vessel)." .

This is the division of labour the whole book turns on: the OWL axiom states the necessary condition for a reasoner to use in inference; the SHACL shape enforces the same condition as a gate over the data we actually hold. On the clean campaign both guards stay silent, because the provenance reconciliation deliberately keeps the vessel BR-101 a separate node from the batch material BATCH-2026-001 — the continuant/occurrent split that lets us walk batch genealogy without ever traversing into equipment. Plant a second rdf:type fusing them, and MaterialNotEquipmentShape fires — exactly as the disjointness guard CQ promises.
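The guard logic amounts to checking whether any node carries both members of a disjoint pair among its types. A minimal plain-Python sketch follows; it is not the SHACL engine, and it assumes that each node's type set has already been closed under subclassing, so that BATCH-2026-001 is typed both Batch and Material (an assumption consistent with the shape's message, but not stated as an axiom in this chapter).

```python
# Disjoint pairs asserted in bioproc.ttl and guarded by the two shapes.
DISJOINT_PAIRS = [("Batch", "CellCultureProcess"), ("Material", "Equipment")]


def conflations(types_by_node):
    """Return (node, class_a, class_b) for each node typed with both classes of a pair."""
    hits = []
    for node, types in sorted(types_by_node.items()):
        for class_a, class_b in DISJOINT_PAIRS:
            if class_a in types and class_b in types:
                hits.append((node, class_a, class_b))
    return hits


# Clean campaign: the vessel and the batch material are separate nodes.
clean = {
    "BATCH-2026-001": {"Batch", "Material"},  # ASSUMPTION: subclass closure applied
    "BR-101": {"Equipment"},
}
# Planted defect: a second rdf:type fuses the batch with equipment.
planted = {**clean, "BATCH-2026-001": {"Batch", "Material", "Equipment"}}

print(conflations(clean))
print(conflations(planted))
```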

Validation: the realistic out-of-spec mode, isolated to one path

validate.py parses bioproc.ttl + align.ttl + instances.ttl, applies the OWL-RL closure (2120 triples closing to 7137), and runs every shape. The honest detail of the running example is what kind of failure release exposes, and it is the failure mode that actually plagues antibody campaigns. The drug-substance lot DS-004 and its filled product DP-004 both have a monomer purity of 98.687 %, comfortably above the 95.0 % floor, so they pass monomerPct; their CEX-main, HCP, and protein concentration are all in spec too. What trips the gate is their HMW aggregate of 2.41 %, above the 2.0 % limit. This is realistic precisely because the monomer floor and the aggregate ceiling are separate acceptance criteria: a lot can be overwhelmingly intact antibody and still carry just enough of the high-molecular-weight clumps, the immunogenicity risk, to fail. Where did that aggregate come from? The graph cannot prove causation: aggregate level, like monomer purity, is a cumulative property of the whole purification chain, shifted a little at capture, viral inactivation, polishing, and UF/DF, and owned by no single step. The release gate does not need to know where the aggregate formed; it needs only to refuse the lot. So the failure is a single sh:MaxInclusiveConstraintComponent violation at sh:resultPath bp:hmwPct for each affected lot. Here is the real pySHACL report for the whole-graph validation, CQ-11 made concrete: two results, one per OOS lot, both on the same path and nowhere else:

Validation Report
Conforms: False
Results (2):
Constraint Violation in MaxInclusiveConstraintComponent (http://www.w3.org/ns/shacl#MaxInclusiveConstraintComponent):
Severity: sh:Violation
Source Shape: [ sh:datatype xsd:float ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:maxInclusive Literal("2.0", datatype=xsd:decimal) ; sh:message Literal("HMW aggregate is missing or above the 2.0 % release limit.") ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:name Literal("SEC %HMW aggregate") ; sh:path bp:hmwPct ]
Focus Node: bp:DP-004
Value Node: Literal("2.41", datatype=xsd:float)
Result Path: bp:hmwPct
Message: HMW aggregate is missing or above the 2.0 % release limit.
Constraint Violation in MaxInclusiveConstraintComponent (http://www.w3.org/ns/shacl#MaxInclusiveConstraintComponent):
Severity: sh:Violation
Source Shape: [ sh:datatype xsd:float ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:maxInclusive Literal("2.0", datatype=xsd:decimal) ; sh:message Literal("HMW aggregate is missing or above the 2.0 % release limit.") ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:name Literal("SEC %HMW aggregate") ; sh:path bp:hmwPct ]
Focus Node: bp:DS-004
Value Node: Literal("2.41", datatype=xsd:float)
Result Path: bp:hmwPct
Message: HMW aggregate is missing or above the 2.0 % release limit.
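One way to read the CQ-11 claim off a report like this is to group the results by focus node and confirm that every lot fails on the same single path. The sketch below does that in plain Python over the two results transcribed from the report above; the dictionary layout and function name are hypothetical, and a real pipeline would read these fields from the report graph rather than from hand-copied values.

```python
from collections import defaultdict

# The two results from the report, reduced to the fields used for routing.
results = [
    {"focus": "bp:DP-004", "path": "bp:hmwPct",
     "component": "MaxInclusiveConstraintComponent", "value": 2.41},
    {"focus": "bp:DS-004", "path": "bp:hmwPct",
     "component": "MaxInclusiveConstraintComponent", "value": 2.41},
]


def failing_paths_by_lot(results):
    """Group the failing result paths by the lot (focus node) they were raised on."""
    grouped = defaultdict(set)
    for result in results:
        grouped[result["focus"]].add(result["path"])
    return dict(grouped)


by_lot = failing_paths_by_lot(results)
# CQ-11: each OOS lot is flagged on exactly the failing path and nowhere else.
isolated = all(paths == {"bp:hmwPct"} for paths in by_lot.values())

print(sorted(by_lot))
print(isolated)
```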

A lot can be pure by one criterion and still fail on an aggregate, and the failure stays isolated to exactly the one path genuinely out of range — no spurious cascade onto the other panel values, which all pass. The report is the start of an investigation, not the end of one, and the structure of the genealogy is what makes the investigation tractable. Because the report is itself an RDF graph, a failure is queryable like any other fact, and it feeds straight into the shared-fate impact analysis the forward DS-to-DP fork makes computable. Fill-finish forked DS-001 forward into the sibling product lots DP-001 and DP-002 (bp:DS-001 bp:fillsInto bp:DP-001 , bp:DP-002); the OOS DP-004 derives instead from a separate substance lot DS-004, but both forks share the working cell bank WCB-CHO-001. So when DP-004 fails, the impact query walks up its lineage to the shared ancestor and back down to every drug product reachable from it — CQ-04 answered by traversal rather than by quarantining the whole site:

# queries/CQ-04.rq: when DP-004 fails, which drug products share its fate via the shared cell bank?
PREFIX bp: <https://example.org/bioproc#>
SELECT DISTINCT ?affected WHERE {
    bp:DP-004 (bp:derivedFrom)+ ?shared .    # an ancestor of the failed lot
    ?affected (bp:derivedFrom)+ ?shared .    # anything else derived from it
    ?affected a bp:DrugProduct .
    FILTER(?affected != bp:DP-004)
} ORDER BY ?affected
# -> DP-001, DP-002 (the siblings tracing to the same WCB-CHO-001)
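The traversal behind this query can be sketched without a triple store. The plain-Python version below walks a small derivedFrom map up from the failed lot and back down to any other drug product that shares an ancestor. The DS-001 chain follows the order in which the chapter lists its lineage; treating each step as a single direct edge, and linking DS-004 straight to WCB-CHO-001, are simplifying assumptions, since the chapter does not spell out DS-004's intermediates.

```python
# Direct derivedFrom edges (child -> parents).
DERIVED_FROM = {
    "DP-001": ["DS-001"],
    "DP-002": ["DS-001"],
    "DP-004": ["DS-004"],
    "DS-001": ["POLpool-001"],
    "POLpool-001": ["VFpool-001"],
    "VFpool-001": ["VIpool-001"],
    "VIpool-001": ["PApool-001"],
    "PApool-001": ["CLAR-001"],
    "CLAR-001": ["BATCH-2026-001"],
    "BATCH-2026-001": ["SEED-001"],
    "SEED-001": ["SEEDFLASK-001"],
    "SEEDFLASK-001": ["WCB-CHO-001"],
    "WCB-CHO-001": ["MCB-CHO-001"],
    "MCB-CHO-001": ["RCB-CHO-001"],
    "DS-004": ["WCB-CHO-001"],  # ASSUMPTION: intermediates omitted
}
DRUG_PRODUCTS = {"DP-001", "DP-002", "DP-004"}


def ancestors(node):
    """All nodes reachable by one or more derivedFrom steps (the '+' path)."""
    seen = set()
    stack = list(DERIVED_FROM.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(DERIVED_FROM.get(current, []))
    return seen


def shared_fate(failed_lot):
    """Other drug products that share at least one ancestor with the failed lot."""
    failed_ancestors = ancestors(failed_lot)
    return sorted(
        product for product in DRUG_PRODUCTS
        if product != failed_lot and ancestors(product) & failed_ancestors
    )


print(shared_fate("DP-004"))
print(len(ancestors("DS-001")))
```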

The same backward walk that scopes the recall is the eleven-ancestor lineage the drug substance chapter built: from DS-001 through POLpool-001, VFpool-001, VIpool-001, PApool-001, CLAR-001, BATCH-2026-001, SEED-001, SEEDFLASK-001, and up the cell-bank tiers WCB-CHO-001, MCB-CHO-001, RCB-CHO-001. The genealogy that the owl:TransitiveProperty declaration on bp:derivedFrom computes for free is what turns a SHACL failure on one product lot into a precisely scoped question about exactly two siblings. CQ-09, the signature-and-status check, also has a SPARQL twin the catalog runs, an ASK that DS-001 carries an approvedBy signer and a releaseStatus from the controlled set, and the ReleaseShape enforces the same condition closed-world over every released lot:

# queries/CQ-09.rq: a released lot is attributably SIGNED, status from the controlled set (returns true).
PREFIX bp: <https://example.org/bioproc#>
ASK {
    bp:DS-001 bp:approvedBy ?signer .
    bp:DS-001 bp:releaseStatus ?status .
    FILTER(?status IN ("PASS", "OOS", "PENDING"))
}

The unsolved part: SHACL checks completeness, not correctness

Here is the limit that matters most, and it generalizes the axioms chapter's warning that consistent is not correct. A SHACL gate verifies that a lot's record is complete, well-formed, in range, and signed — that every required test was run, recorded once, typed correctly, and inside its limit. It cannot verify that the record is true. A result that was confidently mislabeled — the HMW value entered against the wrong vial, a sample mix-up where a passing lot's SEC trace was filed under a failing lot's IRI, a transcription error that happens to land at 1.9 % instead of 2.4 % — passes the gate cleanly, because every structural rule is satisfied. "Complete" means every required test appears in the record; only data integrity upstream can vouch for whether the right sample was tested, on the right day, on a qualified instrument. SHACL is a powerful guard against the common failures (a missing sterility test, an out-of-range aggregate, an unsigned release, a duplicated monomer result), and those are most of them; but it is blind to a plausible, in-range falsehood. The gate proves the checklist is filled, not that the checks were honest.
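The blind spot is easy to demonstrate. In the sketch below, the only thing the gate ever sees is the recorded value; the 2.0 % limit comes from the release shape, and the 1.9 and 2.4 figures are the transcription-error example from the paragraph above. The variable and function names are illustrative.

```python
HMW_LIMIT = 2.0  # upper limit from bp:ReleaseShape


def hmw_in_spec(value):
    """The range check the gate applies to whatever number it is given."""
    return value <= HMW_LIMIT


true_value = 2.4      # what the assay actually measured
recorded_value = 1.9  # what was typed into the record

gate_passes = hmw_in_spec(recorded_value)  # the gate only sees the record
lot_is_in_spec = hmw_in_spec(true_value)   # the fact the gate cannot see

print(gate_passes, lot_is_in_spec)
```

Nothing in the check can tell the two numbers apart, which is why the guarantee stops at completeness and range.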

This is also why the thread is, as we warned of the digital thread, ironclad inside one factory and fragile beyond it: the gate enforces completeness over the triples one plant holds, and says nothing about a sister site whose records never entered this graph. So release, in reality, is never only a gate. The SHACL pass is a necessary precondition that automation can enforce tirelessly and identically — a genuine advance over a human re-checking a paper checklist for an antibody lot under deadline — but the release decision still rests on QC judgment, deviation review, and the qualified-person sign-off that the gate records but does not replace. A graph that conflates "passed the SHACL gate" with "is a good batch" commits exactly the over-trust this book warns against at every step. The honest standard: model the specification as SHACL so completeness and range are mechanically guaranteed and investigations are routable, and be clear that correctness — that each result truly describes this lot of mAb-A — depends on data integrity upstream and human judgment the model supports rather than supplants.

Why it matters

Validation is the lifecycle phase where the ontology earns its keep, and modeling an antibody's release specification as SHACL turns a remembered, manually-applied checklist into an enforced, auditable gate. Every CQA the book modeled — monomer purity, HMW aggregate, charge-variant homogeneity, host-cell protein, the sterility and fill that fill-finish enabled, back to the design space that defined which attributes are critical for this molecule — converges here as a shape a lot must satisfy, and a failure routes straight into the impact analysis that scopes which sibling lots share the failed bank. The five closed-world gates (release, finish, cell bank, and the two disjointness guards) cover what no SPARQL query can: the things that should be present and are not, and the type conflations — a batch fused with its vessel, a continuant fused with the process that made it — that a reasoner waves through. This is where the quality thread becomes a control rather than a record, where a recall becomes a query rather than a guess, and where the model's honest limit, completeness rather than correctness, states most clearly what an ontology is for.

In the real world

Releasing a lot of antibody against a specification, with signed records under 21 CFR Part 11 and Annex 11, is the legally binding reality of GMP manufacturing, and the structure of biotech specifications is long codified in ICH Q6B [2][3]. SHACL is a settled W3C standard, and the open-source book runs exactly this kind of BatchShape gate in tested code — sh:conforms false with a validation result naming the focus node and constraint — proving the mechanism is not hypothetical [1]. The cell-bank gate mirrors the characterization that ICH Q5D expects of every cell substrate — identity, sterility, viral safety, and genetic stability on a CHO line whose passage history is tracked against a validated limit [4]. The released drug-substance node also carries a second, regulated identity — an ISO 11238 / IDMP substance identifier (UNII, MPID) hanging off the very same DS-001 the gate validated — so the lot crosses into a PQ-CMC / eCTD submission carrying the identity a regulator re-checks rather than being re-described from scratch. What remains a human and organizational reality, not a solved technical one, is everything the gate cannot see: the deviation investigations, the data-integrity culture, and the qualified-person judgment that stand behind a release the graph can structurally verify but not vouch for. The regulatory-semantics chapter shows the FDA's KASA platform applying rule-based structured checks to CMC data — the same closed-world idea, mandated at the submission boundary, even as the GxP validation regime keeps a live reasoner off the floor.

Key terms

  • Open world vs. closed world — OWL treats a missing fact as unknown (so it cannot fail a lot for an absent sterility test); SHACL treats a missing required fact as a violation. Release needs the closed-world half.
  • Release gate (bp:ReleaseShape) — the shape targeting both bp:DrugSubstance and bp:DrugProduct, requiring exactly one in-range value per CQA (monomer ≥ 95 %, HMW ≤ 2 %, CEX-main 60–80 %, HCP ≤ 100 ppm, protein 45–55 mg/mL) plus a controlled status and a signature; answers CQ-08 and CQ-09.
  • Finish gate (bp:DrugProductFinishShape) — the product-only shape adding sterility, appearance, and fill volume — the criteria fill-finish into a container-closure system enables; a product must clear both gates, a substance only the panel; answers CQ-10.
  • Cell-bank gate (bp:CellBankShape): the closed-world twin of the working-bank OWL restriction, requiring at least one characterization result (WCB-CHO-001 carries four: identity, sterility/mycoplasma, viral safety, genetic stability) and exactly one passage count on the genealogy's root node; answers CQ-17.
  • Disjointness guards (bp:BatchNotProcessShape and bp:MaterialNotEquipmentShape): the shapes that catch the continuant/occurrent (batch vs. fermentation) and material/equipment (batch vs. vessel) conflations the OWL-RL profile leaves uncaught.
  • Shared-fate impact analysis (CQ-04) — the traversal up a failed lot's derivedFrom lineage to the shared WCB-CHO-001 and back down to its siblings, scoping a recall to DP-001/DP-002 by query rather than guesswork.
  • Validation report — the queryable RDF graph SHACL returns on failure (focus node, result path, constraint, value, severity, message), used to route an investigation; CQ-11 confirms it fires on exactly hmwPct for DS-004/DP-004.
  • Completeness versus correctness — SHACL guarantees a record is complete, well-formed, and in range, not that it is true; a plausible in-range falsehood — the right number on the wrong vial — passes, so the gate supports but does not replace human release judgment.

Where this leads

The model is now held to account: the gates run against a real mAb-A campaign, the HMW-aggregate out-of-spec mode is isolated to one path, the cell-bank root is proven characterized, and the disjointness guards catch what the reasoner waves through. But a validated ontology is not a finished one — specifications change as a molecule's knowledge deepens, terms are deprecated, and a new release criterion may need adding without breaking every lot already in the graph. The next chapter, Maintenance: Governance, Versioning, and Change, follows the model into its operational life: how a worked, change-controlled deprecation retires an old alias without silently overwriting it, how versioning keeps cross-book references valid, and how the executable competency-question suite becomes the regression test that every future change must still pass.