The Upper Spine: Continuants, Occurrents, and Why Everyone Builds on BFO
📍 Where we are: Part I · Foundations of the Model — Chapter 1. The preface promised a model built step by step. We start at the very top of that model, with the small set of categories every later term hangs from.
The preface made a promise that sounds almost too easy: build one shared model and every system can read every notebook as one story. But there is a trap hiding in it. If we let the cell-culture team build their ontology and the lab build theirs and the warehouse build theirs, we have not escaped the heterogeneity problem — we have promoted it one level up. A biologist's "process" and an engineer's "process" will drift apart exactly the way BR-101 and Lot 26-001 did, only now the disagreement is buried in the structure of the model itself, where it is far harder to see and fix.
The cure is to agree, before anyone writes a domain term, on what the most general kinds of thing even are. That agreement is an upper ontology, and this chapter is about the one most of science and industry has settled on.
Before you can write a dictionary, you need parts of speech. "Noun," "verb," and "adjective" are not in any particular dictionary — they are the categories every dictionary's entries fall into, which is exactly why dictionaries written by different people are still compatible. An upper ontology is parts of speech for reality: a tiny, domain-neutral set of categories — things that persist, things that happen, qualities, roles — that every bioprocess term, however specialized, slots into. Agree on those first, and a Bioreactor built by one team and a CellCultureProcess built by another cannot help but fit together.
What this chapter covers
We meet the Basic Formal Ontology (BFO) — the standardized upper ontology — and its one load-bearing distinction between continuants (things that persist) and occurrents (things that happen). We classify the everyday furniture of a bioprocess onto that spine, dissect a single batch to see all of BFO's categories at work at once, and then descend one level to the IOF Core mid-level that makes the abstraction usable for manufacturing. We close on the honest limit: a shared spine guarantees compatibility of structure, not agreement of choice.
What an upper ontology is, and why bioprocess needs one
An upper (or foundational) ontology is a small, domain-neutral vocabulary of the most general categories that everything falls under — independent of biology, of chemistry, of manufacturing [1]. It deliberately contains no Bioreactor and no Antibody. What it contains is the answer to "what kind of thing is a bioreactor, fundamentally?" — an object that persists — versus "what kind of thing is a fermentation?" — a process that unfolds. Build every domain ontology on the same upper categories and they become reusable and combinable by construction, the way the data-management book first argued; this chapter shows the spine itself.
The upper ontology that science and engineering have largely converged on is BFO, the Basic Formal Ontology. It is not a hobby project or one vendor's invention: it is published as an international standard, ISO/IEC 21838-2, which specifies BFO as a top-level ontology conforming to the requirements of ISO/IEC 21838-1 [1], and its design rationale is laid out at book length by its authors [2]. When the Industrial Ontologies Foundry and the biopharma working groups chose a top, they chose this one, so learning BFO is learning the grammar the manufacturing ontologies you will actually import are written in.
The one distinction that does the work: continuants and occurrents
BFO's core move is to split everything in the world into two families, and almost every modeling decision in this book traces back to which side of the line a thing sits on [2].
A continuant is a thing that persists through time as a whole, keeping its identity while it gains and loses parts. A cell, a bioreactor, a vial, a batch of drug substance, the purity of that batch, the role a vessel plays as "the production reactor" — all continuants. They exist in full at every instant they exist at all. If you point at one now and again in an hour, you are pointing at the same whole thing.
An occurrent is a thing that happens — it unfolds in time and is never present all at once, because it has temporal parts. A fermentation, a Protein A capture step, the act of taking a sample, the whole manufacturing campaign — all occurrents. You cannot point at a fermentation the way you point at a tank; at any instant you catch only a slice of it.
The split sounds abstract until you notice how often people model the wrong one. A "batch" is a continuant — the material — but the making of that batch is an occurrent. Conflate them and your graph cannot say that one bioreactor (a persisting object) hosted three different fermentations (three distinct happenings) last month, because you have only one fuzzy "batch" idea doing both jobs. Keeping the two apart is what lets the model speak precisely about equipment, material, and activity all at once.
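A minimal Turtle sketch of that point, reusing terms from the running example (bp:ProductionBioreactor, bp:CellCultureProcess, bp:occursIn, bp:hasOutput). The second and third run and batch identifiers are hypothetical, invented here only for illustration.

```turtle
# Sketch only: one persisting vessel, three distinct runs, three distinct batches.
# CCP-002, CCP-003, BATCH-2026-002, and BATCH-2026-003 are hypothetical identifiers.
@prefix bp: <https://example.org/bioproc#> .

bp:BR-101 a bp:ProductionBioreactor .     # continuant: the same vessel all month

bp:CCP-001 a bp:CellCultureProcess ;      # occurrent: first fermentation
    bp:occursIn  bp:BR-101 ;
    bp:hasOutput bp:BATCH-2026-001 .

bp:CCP-002 a bp:CellCultureProcess ;      # occurrent: second fermentation, same vessel
    bp:occursIn  bp:BR-101 ;
    bp:hasOutput bp:BATCH-2026-002 .

bp:CCP-003 a bp:CellCultureProcess ;      # occurrent: third fermentation, same vessel
    bp:occursIn  bp:BR-101 ;
    bp:hasOutput bp:BATCH-2026-003 .

bp:BATCH-2026-001 a bp:Batch .            # continuants: three separate lots of material
bp:BATCH-2026-002 a bp:Batch .
bp:BATCH-2026-003 a bp:Batch .
```

Because each run is its own node, "which runs did BR-101 host?" becomes a lookup on bp:occursIn rather than something inferred from batch records.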
Inside the continuants: objects, qualities, and realizables
Continuants subdivide in a way bioprocess uses constantly [2]. An independent continuant is a thing that exists on its own — a material entity like the bioreactor, the cell, the drug-substance lot. A specifically dependent continuant is a thing that exists only by inhering in something else: a quality such as the 98.611 % monomer purity, which cannot float free but must be the purity of some lot; or a realizable entity — a role (a vessel playing the role of production reactor in this campaign, a part it could stop playing), a disposition (a chromatography resin's tendency to bind antibody), or a function (the designed purpose of a viral filter). A generically dependent continuant is the odd, important one: information that can be copied across carriers — the recipe, the master batch record, the specification, the certificate of analysis. The spec is not the paper it is printed on and not the screen it shows on; it is the content, which BFO lets us model as a thing in its own right that those carriers bear.
That three-way cut — object, quality, information — is the difference between a model that can and cannot answer real questions. "What is the purity?" asks for a quality inhering in a lot. "What does the spec require?" asks about a generically dependent continuant. "What vessel was it?" asks about an independent continuant playing a role. One word, "batch," cannot carry all three; BFO's categories can.
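The generically dependent continuant is the least intuitive of the three, so here is a hedged sketch of one specification borne by two carriers. The identifiers bp:SPEC-DS-001, bp:DocumentCarrier, and bp:isCarrierOf are hypothetical names chosen for this illustration; only obo:BFO_0000040 (material entity) is a published IRI.

```turtle
# Sketch only: the specification as content, modeled once, with two physical carriers.
# bp:SPEC-DS-001, bp:DocumentCarrier, and bp:isCarrierOf are hypothetical names.
@prefix bp:   <https://example.org/bioproc#> .
@prefix obo:  <http://purl.obolibrary.org/obo/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

bp:DocumentCarrier rdfs:subClassOf obo:BFO_0000040 .   # carriers are material entities

bp:SPEC-DS-001 a bp:Specification .                    # the content: information, copyable

bp:SPEC-DS-001-printout a bp:DocumentCarrier ;         # the paper copy in the binder
    bp:isCarrierOf bp:SPEC-DS-001 .

bp:SPEC-DS-001-display a bp:DocumentCarrier ;          # the screen showing the same content
    bp:isCarrierOf bp:SPEC-DS-001 .
```

Shredding the printout removes one carrier node and leaves bp:SPEC-DS-001 untouched, which is the behavior the paragraph above asks of the model.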
These categories are not vague labels: each is a published BFO 2020 term with a stable IRI, and the alignment file pins every local class to one. Material entity is obo:BFO_0000040, quality is obo:BFO_0000019, and process is obo:BFO_0000015. The realizable family sits under realizable entity (obo:BFO_0000017): role (obo:BFO_0000023) and disposition (obo:BFO_0000016) are its two direct subclasses, and function (obo:BFO_0000034) is a special kind of disposition. That is why a resin's binding tendency, a vessel's production-reactor role, and a viral filter's designed purpose each land in a different box rather than blurring into one "property" field. Those opaque numeric IRIs are the point: a label like "role" drifts between languages and teams, but obo:BFO_0000023 is the same identifier in every BFO-grounded ontology on earth, so two systems that both align to it agree on meaning without ever having compared their English.
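As a sketch of what that pinning could look like for the three realizable examples, assuming two local class names that do not appear in the running example (bp:AntibodyBindingDisposition and bp:ViralRetentionFunction are hypothetical; bp:ProductionReactorRole is the class used for the vessel's role):

```turtle
# Sketch only: local realizable classes pinned to the BFO 2020 IRIs named above.
# bp:AntibodyBindingDisposition and bp:ViralRetentionFunction are hypothetical local classes.
@prefix bp:   <https://example.org/bioproc#> .
@prefix obo:  <http://purl.obolibrary.org/obo/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

bp:ProductionReactorRole      rdfs:subClassOf obo:BFO_0000023 .   # BFO 'role'
bp:AntibodyBindingDisposition rdfs:subClassOf obo:BFO_0000016 .   # BFO 'disposition'
bp:ViralRetentionFunction     rdfs:subClassOf obo:BFO_0000034 .   # BFO 'function'
```

Each local label can be renamed or translated freely; the three BFO IRIs on the right are what another system would match on.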
The BFO spine with bioprocess furniture hung on it: things that persist (objects, the qualities that inhere in them, the roles they play, the information about them) on the left; things that happen (the processes that make and purify) on the right.
Original diagram by the authors, created with AI assistance.
Anatomy of one batch, placed on the spine
The payoff of an upper ontology is best seen by taking one familiar thing apart and watching every BFO category show up inside it. Take BATCH-2026-001 — not as a single record, but as the cluster of distinct entities a careful model sees there.
The batch material itself — the cells and broth and, later, the purified protein — is a material entity, an independent continuant. The production bioreactor that held it is also a material entity, but a different one: equipment persists across many batches, so the model must not fuse the vessel into the material. The fermentation — the days of growth and production — is an occurrent, a process, with the batch material and the vessel both participating in it. The 98.611 % monomer purity is a quality, a specifically dependent continuant that inheres in the drug-substance lot and nowhere else. The vessel's standing as "this campaign's production reactor" is a role, a realizable entity it bears for the duration and could shed. And the master batch record that prescribed the whole thing is a generically dependent continuant — information, copyable, distinct from any printout.
One batch is many BFO entities at once: material (the broth and the vessel, kept separate), a process (the fermentation they participate in), a quality (the purity that inheres in the lot), a role (what the vessel is being used as), and information (the record that prescribed it).
Original diagram by the authors, created with AI assistance.
Notice what the discipline bought us. Because the vessel and the material are separate material entities, the graph can later say a different batch ran in the same vessel without contradiction. Because the fermentation is an occurrent that both participate in, "when did this happen?" has somewhere to attach. Because purity is a quality of the lot, it cannot be accidentally asserted of the empty tank. The upper ontology did not add facts; it gave every fact the right kind of home, which is what keeps the model honest as it grows.
Written as triples in the running example, that one batch fans out into the distinct BFO kinds — a material entity, a separate material entity for the vessel, the role it bears, and the occurrent that outputs the batch and occurs in the vessel:
# instances.ttl: one batch, its vessel, and its run, each a different BFO kind.
@prefix bp:  <https://example.org/bioproc#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

bp:BATCH-2026-001 a bp:Batch ;                # a material entity (independent continuant), typed once
    bp:derivedFrom    bp:SEED-001 ;
    bp:participatesIn bp:CCP-001 ;            # ...that participates in an occurrent
    bp:monomerPct     "98.611"^^xsd:float .   # ...and carries its purity quality as a literal value

bp:BR-101 a bp:ProductionBioreactor ;         # the vessel, a SEPARATE material entity
    bp:hasRole bp:BR-101-role .               # ...bearing a realizable role

bp:BR-101-role a bp:ProductionReactorRole .   # a role (realizable entity)

bp:CCP-001 a bp:CellCultureProcess ;          # the run, an occurrent (process)
    bp:occursIn  bp:BR-101 ;                  # ...that occurs in the vessel
    bp:hasOutput bp:BATCH-2026-001 .          # ...and outputs the batch material
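The instances file records purity as a plain literal on the batch, which is compact but leaves the "quality inheres in the lot" claim implicit. One way to make it explicit, offered as a sketch rather than as the running example's actual modeling, is to give the purity its own node and constrain where it may inhere. The names bp:MonomerPurity, bp:inheresIn, bp:hasValue, and bp:PURITY-2026-001 are hypothetical.

```turtle
# Sketch only: purity as its own node, constrained to inhere in a batch.
# bp:MonomerPurity, bp:inheresIn, bp:hasValue, and bp:PURITY-2026-001 are hypothetical names.
@prefix bp:   <https://example.org/bioproc#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

bp:Batch                a owl:Class .
bp:ProductionBioreactor a owl:Class .
bp:Quality              a owl:Class .
bp:inheresIn            a owl:ObjectProperty .
bp:hasValue             a owl:DatatypeProperty .

bp:MonomerPurity a owl:Class ;
    rdfs:subClassOf bp:Quality ,
        [ a owl:Restriction ;                 # anything a monomer purity inheres in is a batch
          owl:onProperty    bp:inheresIn ;
          owl:allValuesFrom bp:Batch ] .

bp:Batch owl:disjointWith bp:ProductionBioreactor .   # material and vessel never coincide

bp:PURITY-2026-001 a bp:MonomerPurity ;
    bp:inheresIn bp:BATCH-2026-001 ;
    bp:hasValue  "98.611"^^xsd:float .
```

Under these axioms, adding `bp:PURITY-2026-001 bp:inheresIn bp:BR-101` would lead an OWL reasoner to infer that bp:BR-101 is a bp:Batch, which clashes with its bp:ProductionBioreactor type through the disjointness axiom. That is the "cannot be accidentally asserted of the empty tank" guarantee stated as something a machine can check.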
From spine to plant: the IOF Core mid-level
BFO is deliberately too abstract to model with directly — it has no Equipment and no Material, only object and process. The bridge from that thin top to a usable manufacturing vocabulary is a mid-level ontology, and for industry that is the IOF Core, published by the Industrial Ontologies Foundry as a BFO-grounded set of concepts every manufacturing domain can share — equipment, materials, processes, capabilities, and the relations among them [5]. IOF, in turn, was modeled explicitly on the OBO Foundry, the life-sciences community that proved coordinated, principle-based ontology building works at scale: dozens of biomedical ontologies that interlock instead of overlap because they share an upper grounding and design rules [3].
So the published stack has three rungs, and the knowledge-graph chapter of the open-source book climbs all of them: BFO at the top (what kind of thing), IOF Core in the middle (generic manufacturing), and a biopharma domain ontology at the bottom (Bioreactor, MasterRecipe, ChromatographyColumn). Our local bp: namespace adds a fourth rung beneath them: a tiny site-specific vocabulary that aligns up to the domain ontology rather than reinventing it. Each rung specializes the one above, so under a reasoner that has the upper rungs loaded, a bp:Batch inherits its IOF type (an IOF material artifact) and its BFO type (a material entity, an independent continuant), with all the compatibility that guarantees. There is no IOF class literally named "material entity"; that term is BFO's. The IOF rung contributes material artifact, manufacturing process, piece of equipment, and the like.
Four rungs, each specializing the one above: a neutral upper ontology, an industrial mid-level modeled on the OBO Foundry, the biopharma domain, and the local vocabulary that aligns up to it — so a local class inherits compatibility for free.
Original diagram by the authors, created with AI assistance.
Those four rungs are not a metaphor: they are literally what the alignment file asserts. Each local bp: class declares itself a subclass of a real, published upper term, so bp:Batch asserts it is an IOF material artifact and bp:Material asserts it is a BFO material entity. A caveat on provenance: these IRIs were verified, but not all from one place. The BFO and OBO terms were checked against the EBI Ontology Lookup Service (OLS4), and the IOF terms against the published IOF release on GitHub, because OLS does not host IOF. Each IRI is live and dereferenceable:
# align.ttl: local vocabulary aligned UP to BFO 2020 + IOF Core + IOF biopharma (verified IRIs).
@prefix bp:   <https://example.org/bioproc#> .
@prefix obo:  <http://purl.obolibrary.org/obo/> .
@prefix iof:  <https://spec.industrialontologies.org/ontology/construct/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .   # needed for rdfs:subClassOf below

bp:Material              rdfs:subClassOf obo:BFO_0000040 .            # BFO 'material entity'
bp:Process               rdfs:subClassOf obo:BFO_0000015 .            # BFO 'process'
bp:CellCultureProcess    rdfs:subClassOf iof:ManufacturingProcess .   # IOF Core mid-level
bp:Batch                 rdfs:subClassOf iof:MaterialArtifact .       # IOF Core
bp:CaptureChromatography rdfs:subClassOf iof:CaptureStep .            # IOF biopharma unit op
bp:CellLine              rdfs:subClassOf iof:CellLine .               # IOF biopharma material
bp:Quality               rdfs:subClassOf obo:BFO_0000019 .            # BFO 'quality'
bp:InformationArtifact   rdfs:subClassOf iof:InformationContentEntity .
There is a gap worth naming, because it is where most adoptions trip. Asserting rdfs:subClassOf iof:MaterialArtifact gives a real, shared IRI that another team's IOF-aware tool can line up with. On its own, though, it does not make a reasoner conclude that a bp:Batch is a BFO material entity, because the chain from iof:MaterialArtifact up to BFO lives inside the IOF ontology, which the alignment file does not load. To turn the assertion into an inference, you add an owl:imports statement for IOF Core, which itself imports BFO 2020; the reasoner then classifies your terms through the whole stack and checks them for consistency against it.
IOF is also more than Core: it ships domain modules, and the biopharma module turns out to be substantial. Audited directly against the published release (Release_202602), it defines 171 classes, all marked Released rather than provisional, including 44 unit-operation classes and 17 QbD-parameter classes. So the running example reuses real, verified IOF terms across the whole interior rather than re-minting them: iof:Bioreactor, iof:ChromatographyColumn, and iof:MasterRecipe for the equipment and recipe; iof:CaptureStep, iof:ViralClearance, iof:ViralInactivation, iof:ViralFiltration, iof:PolishingProcess, and iof:DrugProductFormulationProcess for the unit operations; iof:CellLine and iof:ClonedCellLine for the line and the clone it descends from; and iof:QualityAttribute, iof:ProcessParameter, and the normal-operating- and proven-acceptable-range expressions for the QbD scaffolding.
Where IOF genuinely has no settled term, the alignment stays honest about the gap rather than papering over it: there is no bare iof:Specification (so bp:Specification aligns to iof:RequirementSpecification instead), and there is no fill-finish, aseptic-fill, or lyophilization class at all, so bp:FillFinishProcess stays a flagged local class.
The running example does both halves — its align.ttl reuses the IOF terms it could confirm, and a companion bioproc-imports.ttl carries the real owl:imports — while keeping the offline validator scoped to what it can prove without fetching the external stack. The honest one-line summary: the alignment buys shared vocabulary for free and cross-ontology reasoning only once you import.
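For readers who have not written an imports file, the shape is small. The sketch below is in the spirit of bioproc-imports.ttl but is not a copy of it, and every IRI in it is a placeholder: substitute the ontology IRI that the IOF release you pin actually publishes, and the IRI under which your own alignment file is served.

```turtle
# Sketch only: the general shape of an imports file.
# All three IRIs are placeholders (assumptions), not published identifiers.
@prefix owl: <http://www.w3.org/2002/07/owl#> .

<https://example.org/bioproc> a owl:Ontology ;
    owl:imports <https://example.org/bioproc/align> ,                  # the local alignment (placeholder IRI)
                <https://example.org/REPLACE-WITH-IOF-CORE-IRI> .      # IOF Core, which in turn imports BFO 2020
```

A reasoner pointed at this ontology resolves the imports first, so the subclass chain from iof:MaterialArtifact up to BFO is present when bp:Batch is classified. Pointed at align.ttl alone, it sees only the single asserted subclass link.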
The relations are part of the spine too
An upper ontology standardizes not only the kinds of thing but the relations between them, and getting those right matters just as much. The biomedical community wrote them down early in the Relation Ontology (RO), arguing that relations like part of, participates in, has participant, is about, and derives from must be defined as carefully as classes or two ontologies will use "part of" to mean two different things [4]. BFO supplies the backbone relations: a continuant participates in an occurrent (the batch participates in the fermentation), a quality inheres in a continuant (the purity inheres in the lot), a role is realized in a process. Our workhorse derivedFrom — the genealogy edge from the preface — is a domain relation that sits cleanly on this backbone, relating one material entity to the parent material it came from. Declaring it once, with a clear definition, is what stops derivedFrom, madeFrom, and comesFrom from multiplying into three half-synonyms nobody can query across.
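A hedged sketch of what "declaring it once, with a clear definition" might look like. The label, the definition wording, and the domain and range choices are illustrative assumptions, not the running example's actual declaration.

```turtle
# Sketch only: one documented genealogy relation instead of three half-synonyms.
# The label, comment text, and domain/range choices are illustrative assumptions.
@prefix bp:   <https://example.org/bioproc#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

bp:derivedFrom a owl:ObjectProperty ;
    rdfs:label   "derived from"@en ;
    rdfs:comment "Relates a material entity to the parent material it came from."@en ;
    rdfs:domain  bp:Material ;     # only material is derived...
    rdfs:range   bp:Material .     # ...and only from other material

bp:BATCH-2026-001 bp:derivedFrom bp:SEED-001 .   # the edge used in instances.ttl
```

With the domain and range in place, a reasoner infers that both ends of every bp:derivedFrom edge are bp:Material, so a stray edge from a batch to a process or a record surfaces as a typing problem rather than passing silently.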
The unsolved part: a shared spine is necessary, not sufficient
It is tempting to read all this as: adopt BFO and the meaning problem is solved. It is not, and the gap is worth naming exactly. A shared upper ontology guarantees that two independently built ontologies are structurally compatible — that both agree a process is an occurrent and a purity is a quality. It does not guarantee that two modelers facing the same plant make the same choices: one may model "harvest" as a process, another as the material that results from it, and both can be BFO-conformant while still failing to line up. Compatibility of structure is not agreement of content.
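The harvest disagreement can be written down in two lines. In this sketch the teamA: and teamB: namespaces and their class names are hypothetical; the two BFO IRIs are the process and material entity terms cited earlier in the chapter.

```turtle
# Sketch only: two BFO-conformant models of "harvest" that do not line up.
# The teamA: and teamB: namespaces and class names are hypothetical.
@prefix teamA: <https://example.org/team-a#> .
@prefix teamB: <https://example.org/team-b#> .
@prefix obo:   <http://purl.obolibrary.org/obo/> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

teamA:Harvest rdfs:subClassOf obo:BFO_0000015 .   # harvest as a process (occurrent)
teamB:Harvest rdfs:subClassOf obo:BFO_0000040 .   # harvest as the resulting material (continuant)
```

Each statement is a legitimate use of BFO. The trouble appears only at merge time: BFO treats continuants and occurrents as disjoint, so declaring the two classes equivalent would leave a class that nothing can be an instance of. The spine makes the disagreement detectable; it does not decide which team was right.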
Worse, the abstraction has a real cost that the field argues about openly. Classifying every entity correctly onto the spine takes genuine ontological expertise — the continuant/occurrent line is sharp in theory and slippery in practice (is a "batch" the material, the record, or the run? the honest answer is "three different entities," which is more work than most projects budget for). Over-modeling is its own failure: a graph where every trivial thing is decomposed into roles and dispositions becomes unusable. So the upper spine is a foundation, not a finished building. It rules out whole categories of incompatibility and silent error — which is a lot — but the discipline of choosing the same classes, and of stopping the modeling at a useful depth, remains a human practice this book returns to in model governance and in the final verdict. BFO tells you what kinds of thing exist; it cannot tell you that your colleague modeled the harvest the way you did.
Why it matters
Every later chapter in this book names new entities (a target, a design space, a chromatography pool, a release specification), and every one of them will be placed on this spine. The reason to do that work is leverage: a term anchored to BFO is structurally compatible with every other BFO-grounded ontology without a bespoke adapter for its top-level categories, which is the interoperability promise of the FAIR principles made structural. Skip the spine and you get a vocabulary that works inside one project and nowhere else: the private-dialect trap, rebuilt in more expensive materials.
In the real world
BFO's reach is not aspirational. It is the upper ontology under a large share of the OBO Foundry's biomedical ontologies, it is an ISO/IEC standard, and it is the grounding the Industrial Ontologies Foundry chose for manufacturing — including the biopharma ontologies aimed at monoclonal-antibody lines like the one this book models [1][5]. In practice you rarely hand-classify entities against raw BFO; you import a domain ontology that already did it, and you author your local terms in an editor like Protégé that can check your alignment. The spine is mostly invisible in daily use, exactly as parts of speech are invisible while you write — present in every sentence, noticed only when something is grammatically wrong.
Key terms
- Upper (foundational) ontology — a small, domain-neutral vocabulary of the most general categories, on which domain ontologies build so they stay compatible.
- Basic Formal Ontology (BFO) — the standardized upper ontology (ISO/IEC 21838-2) used across science and industry.
- Continuant — an entity that persists through time as a whole, present in full at every instant it exists (a cell, a vessel, a lot, a purity, a role).
- Occurrent — an entity that happens and unfolds in time, with temporal parts (a fermentation, a capture step, a campaign).
- Independent continuant / material entity — a thing that exists on its own, such as the bioreactor or the drug-substance lot.
- Specifically dependent continuant — a quality (the monomer purity) or realizable entity (role, disposition, function) that exists only by inhering in something else.
- Generically dependent continuant — copyable information, such as the recipe, master batch record, specification, or certificate of analysis.
- IOF Core — the BFO-grounded mid-level manufacturing ontology that supplies shared concepts (equipment, material, process) for domain ontologies to specialize.
- OBO Foundry — the life-sciences community whose coordinated, principle-based ontologies inspired the industrial equivalent.
- Relation Ontology (RO) — the effort to define relations (part of, participates in, derives from) as carefully as classes so they mean the same thing across ontologies.
Where this leads
We have the spine: the categories every term hangs from and the relations that wire them. The next chapter, Classes, Relations, and Axioms: Building the Vocabulary, descends from the upper categories to the actual act of authoring — defining a Batch class, a derivedFrom relation, and the axioms that turn a loose vocabulary into something a reasoner can check and extend. We move from what kinds of thing exist to how you write them down.