Architecture and Integration: ISA-95, OT/IT, and the Edge-to-Cloud Path
Where we are: Chapter 6 named the information systems that each own a slice of plant data; this chapter shows the architecture that stacks them into one coherent whole, from the lowest sensor to the cloud.
In the last chapter we met the cast of systems that run a biomanufacturing plant: the historian (which records every time-stamped sensor reading), the MES (Manufacturing Execution System, which drives the step-by-step batch record), the LIMS (Laboratory Information Management System, which holds quality-control test results), and the ELN (Electronic Laboratory Notebook, where scientists record experiments). Each owns its own data, and between them lie integration seams: the places where one system must hand data to another. Where the seams leak, you get data silos: islands of information that never talk to each other.
This chapter is about the floor plan that organizes those systems. Just as a building has a structural blueprint, a factory's computers and machines follow a reference architecture: an agreed map of layers, who sits where, and how information moves between them.
Think of a tall office building. The lobby and loading dock (the factory floor) handle the physical work. The middle floors coordinate the day's jobs. The top floor is the executive suite, planning weeks ahead. Information rides the elevator up as reports, and instructions ride down as orders. The reference architecture in this chapter is the building code: it specifies which function belongs on which floor and which elevators are allowed to stop where, the rules that keep the building both safe and efficient.
What this chapter covers
We start with the classic layered map of an automated plant, then explain the cultural clash between the two worlds it joins, the translation layer that turns raw signals into meaning, and the modern path that carries data from the machine edge all the way to the cloud.

An integrated upstream/downstream plant. Physically one facility; in data terms, a layered architecture from field sensors up to enterprise systems.
Integrated USP/DSP plant. "USP DSP Plant" by CC1984USA, licensed under CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/), via Wikimedia Commons; used unmodified. This image is licensed under CC BY-SA 4.0 and may be reused under the same license; this license applies to the image only, not to the rest of this book.
The ISA-95 / Purdue hierarchy: data flows up, commands flow down, and OT meets IT.
Original diagram by the authors, created with AI assistance.
The layered map: Purdue and ISA-95
We met the Purdue/ISA-95 levels briefly in Chapter 5; here we unfold all five. In 1994, Theodore Williams and colleagues at Purdue University formalized a way to describe any large industrial enterprise as a stack of layers: the Purdue Enterprise Reference Architecture, usually shortened to the Purdue model [1]. That model became the backbone of the international standard ISA-95 (also published as IEC 62264), which defines how the systems that control a factory connect to the systems that run the business [2].
ISA-95 organizes a plant into five numbered levels [2]:
- Level 0: the physical process and its field instruments. This is the physical reality: the bioreactor itself, its broth, its valves and pumps, plus the probes and sensors bolted to it. It covers the molecule being made and the instrumentation that touches it; a pH probe measures, a valve or pump acts.
- Level 1: basic control. The control function that reads those field instruments and commands them, second by second. Here lives the PLC (Programmable Logic Controller, a rugged industrial computer), for example a Rockwell Allen-Bradley CompactLogix or a Siemens SIMATIC controller programmed in TIA Portal. It reads a pH probe and drives a pump to hold a setpoint. This is the regulatory-control layer that touches the process moment by moment.
- Level 2: supervisory control. The automation an operator watches and steers across the whole unit. Here live the SCADA systems (Supervisory Control and Data Acquisition) and HMIs (Human-Machine Interfaces) shown on screen; a DCS (Distributed Control System), such as the ABB Ability System 800xA or the Emerson DeltaV widely used in biopharma, typically spans Levels 1-2, combining basic and supervisory control in one platform.
- Level 3: manufacturing operations. The plant-coordination floor: the MES (for example Siemens Opcenter/SIMATIC IT, Rockwell FactoryTalk PharmaSuite, or Dassault Systèmes Apriso), the historian (such as AVEVA/OSIsoft PI System, AVEVA Historian (formerly Wonderware Historian), Aspen InfoPlus.21, or Honeywell PHD), and the LIMS (for example LabWare LIMS, Thermo Scientific SampleManager, or Benchling) from Chapter 6 live here, scheduling work and assembling the batch record.
- Level 4: enterprise. The business systems, such as ERP (Enterprise Resource Planning, which handles orders, inventory, and finance), that plan production weeks and months ahead.
A complementary standard, ISA-88 (IEC 61512), refines the picture for batch manufacturing, the dominant mode in biopharma, where one defined quantity of product is made at a time. ISA-88 breaks the physical plant into a tidy hierarchy (process cell, unit, equipment module, control module) and separates the recipe (what to make) from the equipment (what makes it) [7]. That separation is why the same bioreactor can run one recipe today and a different one tomorrow.
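The recipe/equipment split can be made concrete with a small Python sketch. The class names mirror the ISA-88 physical hierarchy, but the fields, the `run` helper, and the recipe and equipment names are illustrative assumptions, not anything defined by the standard.

```python
from dataclasses import dataclass, field


@dataclass
class ControlModule:
    name: str  # e.g. a single pump or valve loop


@dataclass
class EquipmentModule:
    name: str
    control_modules: list[ControlModule] = field(default_factory=list)


@dataclass
class Unit:
    name: str
    equipment_modules: list[EquipmentModule] = field(default_factory=list)


@dataclass
class ProcessCell:
    name: str
    units: list[Unit] = field(default_factory=list)


@dataclass(frozen=True)
class Recipe:
    """What to make: an ordered sequence of phase names."""
    name: str
    steps: tuple[str, ...]


def run(recipe: Recipe, unit: Unit) -> list[str]:
    """Bind a recipe to a unit and return one log line per step."""
    return [f"{unit.name}: {step}" for step in recipe.steps]


# Hypothetical equipment: one unit inside one process cell.
feed_pump = ControlModule("FeedPump")
br2 = Unit("Bioreactor2", [EquipmentModule("FeedSkid", [feed_pump])])
cell = ProcessCell("UpstreamCell", [br2])

# Two hypothetical recipes that can run on the same unit.
fed_batch = Recipe("FedBatch", ("inoculate", "feed", "harvest"))
cleaning = Recipe("CleanInPlace", ("rinse", "caustic wash", "final rinse"))

today = run(fed_batch, br2)
tomorrow = run(cleaning, br2)
```

Because `Recipe` holds no reference to any equipment, the same `Unit` object can be passed to `run` with either recipe, which is the separation the paragraph above describes.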
The crucial idea is direction of flow. Data flows up: a pH reading from a probe at Level 0 is acquired by the controller at Level 1, becomes a trend in the historian at Level 3, and finally a yield report at Level 4. Commands flow down: a production order at Level 4 becomes a scheduled batch at Level 3, a recipe and supervisory action at Level 2, a setpoint enforced by the controller at Level 1, and finally a valve opening on the equipment at Level 0.
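As a rough illustration of the two directions, the sketch below lists the levels a message passes when it travels one level at a time. The `path` helper is hypothetical, and real plants often skip or merge hops (a DCS spans Levels 1-2, for instance), so treat this as a simplification.

```python
# Level numbers and short labels taken from the list above.
LEVELS = {
    0: "physical process and field instruments",
    1: "basic control (PLC)",
    2: "supervisory control (SCADA/HMI/DCS)",
    3: "manufacturing operations (MES, historian, LIMS)",
    4: "enterprise (ERP)",
}


def path(start: int, end: int) -> list[int]:
    """Levels visited, one hop at a time, from start to end inclusive."""
    step = 1 if end >= start else -1
    return list(range(start, end + step, step))


data_up = path(0, 4)        # a measurement ascending
commands_down = path(4, 0)  # a production order descending

for level in commands_down:
    print(f"Level {level}: {LEVELS[level]}")
```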
The ISA-95 / Purdue levels. Orders descend; measurements ascend.
Two worlds that must meet: OT and IT
The lower levels (0 through 2) form the world of OT (Operational Technology): the computers that directly run physical equipment. The upper levels (3 and 4) are IT (Information Technology): the ordinary world of databases, networks, and laptops. These two worlds grew up with opposite priorities, and that difference is the central tension of plant architecture.
IT and OT both care about the CIA triad (confidentiality, integrity, availability), but they emphasize its members differently. IT teams typically prioritize confidentiality first: keep the data secret. OT typically reverses the emphasis, putting availability first, and adds safety as an overriding concern that sits outside the triad altogether [3]. A control system that pauses for a software update could ruin a two-week cell culture or, worse, create a hazard. An OT engineer may decline a routine patch that an IT administrator considers mandatory, and both are justified within their own operational and regulatory logic.
Bringing these worlds together is called OT/IT convergence, and a growing research literature treats it as one of the defining challenges of industrial digitalization: the systems must connect for data to flow, yet their cultures, lifecycles, and risk models pull apart [5]. The standard architectural answer is to not let them touch directly. Between the OT levels and the IT levels sits a DMZ (demilitarized zone), a buffer network where carefully chosen data can be exchanged without exposing the control systems to the open enterprise network. This approach, which groups equipment into security zones and allows traffic between them only through defined conduits, is formalized across the IEC 62443 family of standards, which together define seven foundational security requirements, the zones-and-conduits model, and graded security levels for industrial networks. Within that family, IEC 62443-3-3 sets out the detailed system requirements and the security levels derived from those foundations [4]. The U.S. NIST SP 800-82 guide gives parallel, widely cited guidance on segmenting OT networks and the topologies that keep them safe [3].
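The default-deny idea behind zones and conduits can be sketched in a few lines of Python. The host names, zone names, and conduit rules below are hypothetical, and the sketch shows only the lookup logic; it is not an implementation of IEC 62443 or of any firewall product.

```python
# Hypothetical assignment of hosts to security zones.
ZONE_OF = {
    "plc-br2": "ot-control",
    "scada-01": "ot-supervisory",
    "historian-replica": "dmz",
    "erp-app": "it-enterprise",
}

# Each conduit is an explicitly allowed (source zone, destination zone, service).
CONDUITS = {
    ("ot-control", "ot-supervisory", "opc-ua"),
    ("ot-supervisory", "dmz", "historian-replication"),
    ("it-enterprise", "dmz", "https"),
}


def is_allowed(src: str, dst: str, service: str) -> bool:
    """Default deny: cross-zone traffic passes only through a defined conduit."""
    src_zone, dst_zone = ZONE_OF[src], ZONE_OF[dst]
    if src_zone == dst_zone:
        return True  # intra-zone traffic is not policed in this sketch
    return (src_zone, dst_zone, service) in CONDUITS


# OT pushes curated data into the DMZ; IT reads it from there.
ot_to_dmz = is_allowed("scada-01", "historian-replica", "historian-replication")
it_to_dmz = is_allowed("erp-app", "historian-replica", "https")
# No conduit links the enterprise zone directly to the controllers.
it_to_plc = is_allowed("erp-app", "plc-br2", "https")
```

Note that the rule set contains no conduit from the IT zone to either OT zone, so the enterprise side can reach plant data only through the DMZ.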
Segmentation is an architecture decision, not a bolt-on. If you wait until a plant is built to think about zones and DMZs, retrofitting them is painful and often incomplete. Security boundaries belong on the blueprint.
From raw tags to meaning: the contextualization layer
A sensor does not emit knowledge; it emits a tag: a cryptic name and a number, like FIC_204.PV = 12.4. By itself that is meaningless. It only becomes information when you know it is the feed flow rate, in litres per hour, for Bioreactor 2, during batch B-2207, in the production suite. Adding that surrounding meaning is called contextualization, and the layer that does it is the connective tissue of a modern plant [5].
The workhorse protocol that bridges Level 2 control systems to Level 3 operations systems is OPC UA (Open Platform Communications Unified Architecture, standardized as the IEC 62541 series), which moves data with a typed, self-describing information model rather than as bare numbers. It is the default connectivity layer in most modern plants.
A popular pattern here is the UNS (Unified Namespace): a single, organized, real-time map of the whole plant where every piece of data lives at a meaningful address (for example, Enterprise/Site/Suite/Bioreactor2/FeedFlow). Instead of each system asking every other system for data point-to-point (the tangle that creates silos), every system publishes its data to the namespace and subscribes to what it needs. This publish-subscribe wiring is typically carried by a message broker using the lightweight MQTT protocol, often with the Sparkplug specification, which adds a standard structure for industrial topics and payloads so that subscribers know exactly how to interpret what arrives [6]. A Sparkplug topic and birth payload for that same feed-flow tag might look like this:
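A minimal sketch of contextualization, assuming a simple in-memory lookup: the `ASSET_MODEL` table and the `contextualize` function are hypothetical stand-ins for whatever asset model a real plant maintains, and the batch ID is assumed to come from the MES or batch engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagContext:
    description: str
    eng_unit: str
    equipment: str
    area: str


# Hypothetical asset model: static facts about each tag.
ASSET_MODEL = {
    "FIC_204.PV": TagContext(
        description="feed flow rate",
        eng_unit="L/h",
        equipment="Bioreactor2",
        area="production suite",
    ),
}


def contextualize(tag: str, value: float, batch_id: str) -> dict:
    """Join a bare tag reading with its static and batch context."""
    ctx = ASSET_MODEL[tag]
    return {
        "tag": tag,
        "value": value,
        "description": ctx.description,
        "eng_unit": ctx.eng_unit,
        "equipment": ctx.equipment,
        "area": ctx.area,
        "batch_id": batch_id,
    }


record = contextualize("FIC_204.PV", 12.4, batch_id="B-2207")
```

The raw pair (`FIC_204.PV`, 12.4) goes in; a record that a person or a model can interpret without further lookups comes out.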
// Topic: spBv1.0/Site1/DBIRTH/PLC1/BR2
{
  "timestamp": 1718308800000,
  "metrics": [
    {
      "name": "Suite/Bioreactor2/FeedFlow",
      "alias": 204,
      "timestamp": 1718308800000,
      "dataType": "Float",
      "value": 12.4,
      "properties": {
        "engUnit": { "type": "String", "value": "L/h" },
        "batchId": { "type": "String", "value": "B-2207" }
      }
    }
  ],
  "seq": 1
}
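In Sparkplug B, the NBIRTH that opens a session carries sequence number 0, so the NBIRTH (seq 0) precedes this DBIRTH, which carries seq 1 as the next message from the same edge node. On the wire, Sparkplug B payloads are encoded with Protocol Buffers; the JSON above is a human-readable rendering.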
The same reading, landed in the historian or a data lake as one contextualized row, carries every field a downstream model needs:
timestamp,batch_id,equipment,tag,value,eng_unit
2024-06-13T20:00:00Z,B-2207,Bioreactor2,FIC_204.PV,12.4,L/h
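One way to get from the payload to that row is sketched below in Python. The `ALIAS_TO_TAG` lookup and the rule that the equipment name is the second segment of the metric name are assumptions made for this example; a real pipeline would take both from its asset model.

```python
import csv
import io
from datetime import datetime, timezone

# The metric from the DBIRTH payload above, as a plain Python dict.
metric = {
    "name": "Suite/Bioreactor2/FeedFlow",
    "alias": 204,
    "timestamp": 1718308800000,  # milliseconds since the Unix epoch
    "dataType": "Float",
    "value": 12.4,
    "properties": {
        "engUnit": {"type": "String", "value": "L/h"},
        "batchId": {"type": "String", "value": "B-2207"},
    },
}

# Hypothetical lookup from Sparkplug alias to the controller's tag name.
ALIAS_TO_TAG = {204: "FIC_204.PV"}

COLUMNS = ["timestamp", "batch_id", "equipment", "tag", "value", "eng_unit"]


def to_row(m: dict) -> list:
    """Flatten one metric into the column order used by the CSV row."""
    ts = datetime.fromtimestamp(m["timestamp"] / 1000, tz=timezone.utc)
    equipment = m["name"].split("/")[1]  # second path segment, by assumption
    return [
        ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        m["properties"]["batchId"]["value"],
        equipment,
        ALIAS_TO_TAG[m["alias"]],
        m["value"],
        m["properties"]["engUnit"]["value"],
    ]


buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(COLUMNS)
writer.writerow(to_row(metric))
csv_text = buf.getvalue()
print(csv_text)
```

The millisecond epoch timestamp becomes an ISO 8601 UTC string, and the nested Sparkplug properties become flat columns that a historian or data lake can index.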
A related but distinct idea is the data fabric. It is not a synonym for the UNS or the broker but an architectural approach that provides a unified, integrated access layer over many scattered sources, weaving them into one queryable whole. A UNS can be one way to feed such a fabric.
The Unified Namespace does not replace the historian, MES, or LIMS from Chapter 6. It sits beside them as a shared, real-time meeting point, so each system can contribute and consume data without bespoke point-to-point links.
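The publish-subscribe wiring can be illustrated without any network at all. The `TinyBroker` class below is a hypothetical in-memory stand-in for an MQTT broker; it matches topics exactly and ignores MQTT features such as wildcards, quality of service, and retained messages.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[str, Any], None]


class TinyBroker:
    """In-memory stand-in for a message broker (exact topic match only)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, value: Any) -> None:
        # Deliver to every subscriber of this topic; the publisher never
        # needs to know who they are.
        for handler in self._subscribers.get(topic, ()):
            handler(topic, value)


broker = TinyBroker()
topic = "Enterprise/Site/Suite/Bioreactor2/FeedFlow"

historian_rows: list[tuple[str, float]] = []  # stand-in for the historian
dashboard: dict[str, float] = {}              # stand-in for an operator display

broker.subscribe(topic, lambda t, v: historian_rows.append((t, v)))
broker.subscribe(topic, lambda t, v: dashboard.update({t: v}))

# The controller side publishes once; both consumers receive the value.
broker.publish(topic, 12.4)
```

Adding a third consumer means one more `subscribe` call and no change to the publisher, which is the property that removes point-to-point links.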
Edge to cloud: where the computing happens
The last architectural question is where the data is processed. Two answers, used together, define the modern pattern.
Edge computing means processing data right next to the equipment that produces it. This matters for PAT, the Process Analytical Technology framework from Chapter 4, whose real-time, in-line measurements feed control decisions during the run. When a measurement must drive a correction on the spot (say a Level 1 controller holding a pH or dissolved-oxygen setpoint), sending the reading to a distant data center and waiting for a reply is far too slow for time-critical control. Low-latency control belongs at the edge.
Cloud computing means processing and storing data in large, remote, elastic data centers. This is ideal for the opposite job: pooling years of batch data into a data lake (a vast store of raw and processed data), training analytics and machine-learning models across many runs, and keeping records for the long retention periods that regulators require.
Real plants therefore run a hybrid architecture: fast, deterministic decisions at the edge; heavy analytics and durable storage in the cloud; the contextualization layer streaming curated data upward between them.
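The division of labour can be sketched as follows. The `EdgeNode` class, its proportional-only control law, and the list standing in for cloud storage are all assumptions chosen for brevity; a real controller would run on a PLC or DCS rather than in Python.

```python
class EdgeNode:
    """Acts on every reading locally and ships readings upstream in batches."""

    def __init__(self, setpoint: float, gain: float, batch_size: int) -> None:
        self.setpoint = setpoint
        self.gain = gain
        self.batch_size = batch_size
        self.buffer: list[float] = []
        self.uploaded: list[list[float]] = []  # stand-in for a cloud data lake

    def on_reading(self, ph: float) -> float:
        # Edge: compute the correction immediately, with no network round trip.
        output = self.gain * (self.setpoint - ph)
        # Cloud path: buffer the reading and upload only when a batch is full.
        self.buffer.append(ph)
        if len(self.buffer) >= self.batch_size:
            self.uploaded.append(list(self.buffer))
            self.buffer.clear()
        return output


node = EdgeNode(setpoint=7.0, gain=2.0, batch_size=3)
outputs = [node.on_reading(ph) for ph in (6.5, 7.0, 7.5, 7.0)]
```

Each call returns a control output at once, while the upload happens only every third reading; losing the cloud link would delay analytics but not the control loop.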
Why it matters
For data management, architecture is destiny. The level a system sits on determines what data it can reach, what it must hand off, and where the silos form. A historian at Level 3 can only enrich a tag if Level 1 actually published the context. An analytics model in the cloud is only as good as the contextualized stream the UNS feeds it. Get the layering right and data flows as a connected thread; get it wrong and you spend the project's life writing brittle one-off connectors between systems that were never meant to meet. This is also where regulation lands: in GMP (Good Manufacturing Practice) manufacturing, EU Annex 11 governs computerised systems, explicitly requiring that the architecture preserve data integrity and be validated as fit for purpose across this OT/IT span [8]. In the U.S., 21 CFR Part 11 imposes parallel requirements for electronic records, electronic signatures, and system validation across the same boundary.
In the real world
Walk into a modern biomanufacturing facility and you can read the ISA-95 levels off the equipment: PLCs and a DCS in the control room (Levels 1-2), an MES and historian in the server room (Level 3), an ERP in the corporate cloud (Level 4). The work of U.S. institutes such as NIIMBL, including programs like its real-time lab-data integration efforts, targets precisely the seams in this stack: making a LIMS result at Level 3 land, with its full context intact, where an enterprise model at Level 4 or an operator at Level 2 can act on it. Vendors increasingly ship Unified Namespace and MQTT/Sparkplug tooling out of the box [6], and the OT/IT convergence surveyed in the recent literature is moving from aspiration to default plant design [5]. The reference architectures are not academic: they are the shared language that lets a sensor vendor, a SCADA vendor, and an ERP vendor build systems that actually connect.
Key terms
- Reference architecture: an agreed map of a system's layers and how they connect.
- Purdue model: the foundational layered model of an industrial enterprise, basis for ISA-95.
- ISA-95 (IEC 62264): the standard defining Levels 0-4 and how control systems link to business systems.
- ISA-88 (IEC 61512): the batch-control standard separating recipe from equipment in a physical hierarchy.
- Levels 0-4: physical process and field instruments, basic control, supervisory control, operations, and enterprise.
- PLC: Programmable Logic Controller; the basic/regulatory controller at Level 1 that reads instruments and drives actuators.
- SCADA / DCS: supervisory control and HMI software at Level 2; a DCS typically spans Levels 1-2.
- ERP: Enterprise Resource Planning; the business system at Level 4.
- OT / IT: Operational Technology (runs equipment) vs. Information Technology (runs data and business).
- CIA triad: confidentiality, integrity, availability; IT typically emphasizes confidentiality first, OT availability first, with safety an added concern outside the triad.
- DMZ: a buffer network between OT and IT that limits direct exposure.
- Zones and conduits: the IEC 62443 segmentation model grouping equipment and controlling traffic.
- Tag: a raw sensor name-and-value, meaningless without context.
- Contextualization: adding the surrounding meaning that turns a tag into information.
- UNS (Unified Namespace): a single real-time map where all plant data lives at meaningful addresses.
- OPC UA (IEC 62541): the typed, self-describing connectivity protocol that bridges Level 2 control to Level 3 operations.
- MQTT / Sparkplug: the lightweight publish-subscribe protocol and its industrial structuring spec.
- Data fabric / data lake: an architectural approach giving a unified, integrated access layer over scattered sources (distinct from a UNS); and a large store for raw and processed data.
- Edge / cloud computing: processing next to the equipment vs. in remote elastic data centers.
- PAT: Process Analytical Technology; the framework from Chapter 4 for designing, analyzing, and controlling manufacturing through timely measurement (not the instruments themselves).
Where this leads
We now have the floor plan and the elevators, but an elevator only helps if everyone agrees how to load it. In Connectivity and Interoperability Standards we survey the languages that actually move biomanufacturing data with structure and meaning: OPC UA, MTP, SiLA 2, AnIML and Allotrope, and B2MML/ISA-95 messaging. There we draw the distinction that animates the rest of the book, the difference between moving bytes and preserving meaning, a question whose deep dive waits in Part IV.