Automation and Process Control Data
Where we are: We move one rung up from the instruments themselves to the automation layer that reads them, runs the process, and turns their signals into structured records.
In the previous chapter, Instruments and Sensors as Data Sources, we met the physical devices that sit closest to the process stream: the in-line and on-line probes and spectrometers that measure temperature, pH, dissolved oxygen, or the chemical fingerprint of a culture. But a sensor on its own is just a voltage. Something has to read it many times a second, decide what to do about it, push a pump or a valve in response, and write down what happened. That "something" is the automation and control layer, and it is one of the busiest data factories in the entire plant.
This is the silent machinery between the sensor and the database. It does not just observe the process; it runs it, and in doing so it emits a continuous stream of numbers, alarms, operator actions, and recipe values that will later become the official record of how the batch was made.
Think of an aircraft autopilot. Sensors report the plane's altitude and speed; the autopilot compares them against the flight plan and nudges the controls to stay on course; and a flight data recorder logs every reading and every adjustment. A modern bioreactor works the same way. Controllers hold each condition on its target, and a "flight recorder" captures the whole run: the data that proves the medicine was made correctly.
What this chapter covers
We will climb the control hierarchy from sensor to controller to the screens operators watch; meet the ISA-88 standard that gives batch recipes their shape; catalog the four kinds of data this layer emits; touch on the modern equipment and interfaces it runs on; and see why all of this becomes the backbone of the electronic batch record.
From sensor to setpoint: the control hierarchy
Automation in a process plant is organized as a layered stack, often called the Purdue model and formalized in the ISA-95 standard (internationally, IEC 62264) as Levels 0 through 4 [2][3]. For this chapter we mostly need the bottom of that stack: Level 0 is the physical process with its sensors and actuators, Level 1 is the controllers, and Level 2 is supervision. Above them, Level 3 is the MES (Manufacturing Execution System), which schedules and tracks batches, and Level 4 is enterprise planning and business systems. The chapter Architecture and Integration: ISA-95 lays out all five levels in full, so here we stay close to the equipment.
At Level 1 sit the workhorses: the PLC (Programmable Logic Controller), a rugged industrial computer, and the DCS (Distributed Control System), a network of controllers that work together to run an entire process or plant in a coordinated way. (We introduce the DCS at Level 1 for simplicity, but in practice it spans basic and supervisory control, Levels 1-2, as the Architecture and Integration: ISA-95 chapter details.) In a real biopharm plant these are recognizable products: a Siemens SIMATIC S7-1500 PLC driving a skid, or an Emerson DeltaV, Yokogawa CENTUM, or ABB Ability System 800xA DCS coordinating a whole suite. These devices read sensors and drive actuators dozens of times per second. The logic they run is itself standardized: IEC 61131-3 defines the programming languages, such as Structured Text, Ladder Diagram, Function Block Diagram, and Sequential Function Chart (well suited to the step sequences ISA-88 describes), in which control engineers write that logic [6].
The core job of a controller is closed-loop control. It compares the process value (the live measurement, say, the broth at 36.4 °C) against the setpoint (the target, 37.0 °C) and acts to close the gap, adjusting a heater until the two match. The difference between them, the error, is what the controller works to drive toward zero. This single idea (measure, compare to target, correct, repeat) is the heartbeat of the plant. It is also exactly the capability that the FDA's 2004 guidance for industry, PAT: A Framework for Innovative Pharmaceutical Development, Manufacturing, and Quality Assurance, set out to encourage: real-time process control, built on continuous sensing and the PLC/DCS automation that acts on it, as the preferred path to building quality into a process rather than testing for it afterward. Which parameters get a controller and a setpoint in the first place is not arbitrary either: that control strategy is shaped by the ICH quality guidelines, notably Q9 (Quality Risk Management) and Q10 (Pharmaceutical Quality System), which steer attention toward the parameters that actually drive product quality.
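The measure, compare, correct, repeat cycle can be sketched in a few lines. What follows is a minimal proportional-only loop acting on a toy heater model; the gain, the heating coefficient, and the function name are illustrative assumptions, not values from any real controller. Production loops are normally tuned PID blocks running inside the PLC or DCS.

```python
def simulate_temperature_loop(setpoint=37.0, start=36.4, gain=0.5, steps=50):
    """Proportional-only closed loop on a toy heater model (all values hypothetical)."""
    pv = start  # process value: the live measurement
    history = []
    for _ in range(steps):
        error = setpoint - pv                      # compare measurement to target
        heater = max(0.0, min(1.0, gain * error))  # controller output, clamped to 0..1
        pv += 0.8 * heater                         # toy plant: heat added during this scan
        history.append((pv, error, heater))        # what a historian would later store
    return history


history = simulate_temperature_loop()
final_pv, final_error, final_output = history[-1]
```

Each pass shrinks the error by a fixed fraction, so the process value approaches the setpoint without overshooting in this idealized model. A model with heat loss would leave a proportional-only loop with a steady offset, which is one reason real controllers add integral action.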
Above the controllers, at Level 2, sits SCADA (Supervisory Control and Data Acquisition) and its visible face, the HMI (Human-Machine Interface): the screens through which operators watch trends, change setpoints, and respond to problems [2]. The operator rarely touches a sensor directly; they act through the HMI, and every action they take is a data point in its own right.
The automation layer doesn't just act on data; it manufactures a new class of it: setpoints, alarms, and as-executed recipes.
Original diagram by the authors, created with AI assistance.
ISA-88: the grammar of a batch
Most biologics today are still made as a batch process: a defined quantity of product run as a sequence of ordered steps (continuous/perfusion processing, which we meet later, is the growing exception). To keep batch automation consistent and portable across vendors and sites, the industry uses ISA-88, a standard that gives batch control a shared vocabulary and a layered set of models [1]. The part that matters here is Part 1 (ANSI/ISA-88.00.01-2010; its 1995 predecessor was adopted internationally as IEC 61512-1:1997), which defines the models and terminology for batch control. Its roots reach back to NAMUR's earlier recommendation NE 33, which laid out the first requirements for recipe-based operation and directly informed ISA-88 [4].
ISA-88 separates two things that plants used to tangle together: the equipment (what you have) and the procedure (what you do with it) [1][7]. The physical model breaks the plant into a hierarchy: site, area, process cell, unit (for example, one bioreactor), equipment module, control module. The procedural model mirrors it with procedures, unit procedures, operations, and phases (the smallest reusable action, such as "transfer" or "inoculate"). Keeping the two apart is what lets a recipe be written once and reused on different equipment [7].
The recipe is the heart of it. ISA-88 defines a recipe as it travels and is refined through four forms [1]:
- A general recipe is equipment-independent: the product's essential know-how, the science of how to make it.
- A site recipe adapts that to a particular plant.
- A master recipe binds the procedure to specific equipment and becomes the template a batch is run from.
- A control recipe is a single live instance of the master recipe for one actual batch and, once the batch finishes, it carries the as-executed values: what really happened, not just what was planned.
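The relationship between the last two forms, a master recipe as template and a control recipe as one live instance that accumulates as-executed values, can be sketched as a pair of small data structures. The class names, fields, and batch identifiers here are hypothetical and are not taken from the ISA-88 text or any vendor's batch engine.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecipe:
    recipe_id: str
    parameters: dict  # planned values, e.g. {"Temperature": 37.0}


@dataclass
class ControlRecipe:
    batch_id: str
    unit: str  # the specific equipment this batch runs on
    master: MasterRecipe
    as_executed: dict = field(default_factory=dict)

    def record(self, name, value):
        """Store the value actually achieved for a planned parameter."""
        if name not in self.master.parameters:
            raise KeyError(f"{name} is not defined in master recipe {self.master.recipe_id}")
        self.as_executed[name] = value

    def deviations(self):
        """Return {parameter: (planned, achieved)} wherever the two differ."""
        return {
            name: (planned, self.as_executed[name])
            for name, planned in self.master.parameters.items()
            if name in self.as_executed and self.as_executed[name] != planned
        }


master = MasterRecipe("MR-CULTURE-01", {"Temperature": 37.0, "pH": 7.0})
batch = ControlRecipe("B-0001", "BR101", master)
batch.record("Temperature", 36.8)
batch.record("pH", 7.0)
```

Here `batch.deviations()` reports only the temperature, the one parameter whose achieved value differs from the plan.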
When a recipe has to leave the control system and travel to other software (an MES, a scheduler, an analytics platform), it is commonly serialized in BatchML (Batch Markup Language), the XML implementation of ISA-88 maintained by MESA International alongside B2MML (the ISA-95 implementation), which the Connectivity and Interoperability Standards chapter discusses. A control-recipe parameter then looks roughly like this:
<RecipeElement>
  <ID>INOCULATE.Temperature.Setpoint</ID>
  <Parameter>
    <Value>
      <ValueString>37.0</ValueString>
      <UnitOfMeasure>degC</UnitOfMeasure>
    </Value>
  </Parameter>
</RecipeElement>
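A receiving system has to turn that fragment back into a typed value. The sketch below reads the simplified snippet above with Python's standard library. It assumes the element names exactly as shown and no XML namespace; real BatchML documents declare a namespace and carry many more elements, so the paths would need adjusting.

```python
import xml.etree.ElementTree as ET

SNIPPET = """<RecipeElement>
  <ID>INOCULATE.Temperature.Setpoint</ID>
  <Parameter>
    <Value>
      <ValueString>37.0</ValueString>
      <UnitOfMeasure>degC</UnitOfMeasure>
    </Value>
  </Parameter>
</RecipeElement>"""


def read_parameter(xml_text):
    """Extract id, numeric value, and unit from the simplified fragment."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.findtext("ID"),
        # The value travels as a string and must be converted explicitly.
        "value": float(root.findtext("Parameter/Value/ValueString")),
        "unit": root.findtext("Parameter/Value/UnitOfMeasure"),
    }


param = read_parameter(SNIPPET)
```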
The gap between the master recipe (the plan) and the control recipe (the reality) is exactly where the most valuable manufacturing data lives. The plan says "hold 37.0 °C for 14 days." The as-executed record says "held 36.8 to 37.1 °C, with two brief excursions on day 6." Quality is judged on the second story, not the first.
What the automation layer actually emits
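Turning a stream of readings into a statement like "two brief excursions on day 6" means grouping consecutive out-of-band samples. A minimal sketch follows, assuming an acceptance band of 36.8 to 37.1 °C and made-up sample values and labels; none of these numbers come from a real batch.

```python
def find_excursions(samples, low, high):
    """Group consecutive out-of-band samples into (first_seen, last_seen) pairs."""
    excursions = []
    start = end = None
    for timestamp, value in samples:
        if value < low or value > high:
            if start is None:
                start = timestamp  # an excursion begins
            end = timestamp        # extend it while readings stay out of band
        elif start is not None:
            excursions.append((start, end))  # back in band: close the excursion
            start = None
    if start is not None:
        excursions.append((start, end))      # still out of band at end of data
    return excursions


# Hypothetical as-executed readings around day 6 of a run.
samples = [
    ("day 5 20:00", 37.0),
    ("day 6 02:00", 37.4),
    ("day 6 02:05", 37.0),
    ("day 6 15:00", 36.5),
    ("day 6 15:05", 36.9),
]
excursions = find_excursions(samples, low=36.8, high=37.1)
```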
Running the process generates four broad streams of data, and understanding their shapes is essential to managing them downstream.
Tag time-series. A tag is a single named, addressable variable in the control system: BR101.Temp.PV for bioreactor 101's temperature process value. A single bioreactor exposes dozens of such tags, following a consistent naming convention: BR101.pH.PV, BR101.DO.PV (dissolved oxygen), BR101.Agit.PV (agitation speed), BR101.Temp.SP (the temperature setpoint), BR101.AcidPump.OUT, and so on. Each tag is sampled rapidly and continuously, producing high-frequency time-series: thousands of tags, each a long column of timestamped numbers. Written out, a few rows of one tag's history are about as plain as data gets:
timestamp, tag, value, unit
2024-06-14T10:05:23Z, BR101.Temp.PV, 36.95, degC
2024-06-14T10:05:24Z, BR101.Temp.PV, 36.96, degC
2024-06-14T10:05:25Z, BR101.Temp.PV, 37.01, degC
This is the raw, dense material that the process historian (next chapter) is built to store, and that later multivariate analysis (modeling many process variables together, Part V) depends on.
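Rows like the ones above only become usable once the timestamps and values are typed. Here is a minimal sketch using the standard library, assuming the comma-plus-space layout shown and UTC timestamps with a trailing Z; the function name is hypothetical.

```python
import csv
import io
from datetime import datetime

RAW_HISTORY = """timestamp, tag, value, unit
2024-06-14T10:05:23Z, BR101.Temp.PV, 36.95, degC
2024-06-14T10:05:24Z, BR101.Temp.PV, 36.96, degC
2024-06-14T10:05:25Z, BR101.Temp.PV, 37.01, degC
"""


def load_tag_rows(text):
    """Parse tag history rows into typed records."""
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for row in reader:
        rows.append({
            # Replace the trailing Z so older Python versions accept the string.
            "timestamp": datetime.fromisoformat(row["timestamp"].replace("Z", "+00:00")),
            "tag": row["tag"],
            "value": float(row["value"]),
            "unit": row["unit"],
        })
    return rows


rows = load_tag_rows(RAW_HISTORY)
```

Keeping the timestamp timezone-aware is deliberate: time alignment across tags and systems depends on it.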
Alarms and events (A&E). When a process value crosses a configured limit, the system raises an alarm, an audible or visual signal demanding operator attention. ISA-18.2 is the authoritative standard for managing these: it defines an alarm's full lifecycle and insists on rationalization (deciding which alarms are genuinely necessary) and prioritization so operators are not buried in noise [5]. Each alarm and each routine event is a timestamped record of something the process did or that an operator was told, typically carrying a timestamp, a source tag, an alarm identifier, a human-readable message, and a priority, like this:
timestamp, tag, alarm_id, message, priority
2024-06-14T10:42:11Z, BR101.Temp.PV, TI-101-HI, Temperature High, HIGH
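How a limit crossing becomes one alarm record, rather than one record per sample, can be sketched as edge-triggered detection. The 38.0 °C limit and the sample values are assumptions made for illustration; real alarm blocks typically add deadbands and delays, which this sketch omits.

```python
def detect_high_alarms(samples, tag, limit, alarm_id, message, priority="HIGH"):
    """Emit one alarm record each time the value rises above the limit."""
    events = []
    in_alarm = False
    for timestamp, value in samples:
        if value > limit and not in_alarm:
            # Rising edge: the value has just crossed the limit.
            events.append({
                "timestamp": timestamp,
                "tag": tag,
                "alarm_id": alarm_id,
                "message": message,
                "priority": priority,
            })
            in_alarm = True
        elif value <= limit:
            in_alarm = False  # condition cleared; the next crossing alarms again
    return events


# Hypothetical readings around the alarm shown above.
samples = [
    ("2024-06-14T10:42:10Z", 37.9),
    ("2024-06-14T10:42:11Z", 38.2),
    ("2024-06-14T10:42:12Z", 38.3),
    ("2024-06-14T10:42:13Z", 37.8),
]
alarms = detect_high_alarms(samples, "BR101.Temp.PV", 38.0, "TI-101-HI", "Temperature High")
```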
Operator actions. Every setpoint change, manual override, batch start or hold, and acknowledgment made through the HMI is logged. Under U.S. regulation 21 CFR Part 11 and the EU's GMP Annex 11 (EudraLex Volume 4, "Computerised Systems"), electronic records like these must be trustworthy and, where required, tied to electronic signatures identifying who did what and when [8][10].
Recipe parameters and their as-executed values. Finally, the layer records the recipe parameters it was given and the values it actually achieved β the marriage of plan and reality described above.
A critical thread runs through all four: the audit trail. The FDA defines it as a secure, computer-generated, time-stamped record that lets a reviewer reconstruct the course of events, and it expects these trails to be reviewed [9]. Tags, alarms, and operator actions only become trustworthy records when the layer captures them in a way no one can silently alter.
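One common way to make silent alteration detectable is to chain each entry to the hash of the one before it. The sketch below illustrates that general idea only; it is not a mechanism prescribed by the FDA or Part 11, nor a description of how any particular control system stores its audit trail, and the entry fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash used before the first entry


def _digest(previous_hash, entry):
    payload = json.dumps(entry, sort_keys=True)  # stable serialization
    return hashlib.sha256((previous_hash + payload).encode("utf-8")).hexdigest()


def append_entry(trail, entry):
    """Append an entry whose hash covers both its content and its predecessor."""
    previous_hash = trail[-1]["hash"] if trail else GENESIS
    trail.append({
        "entry": entry,
        "previous_hash": previous_hash,
        "hash": _digest(previous_hash, entry),
    })


def verify(trail):
    """Recompute every hash; any edited or removed entry breaks the chain."""
    previous_hash = GENESIS
    for record in trail:
        if record["previous_hash"] != previous_hash:
            return False
        if record["hash"] != _digest(previous_hash, record["entry"]):
            return False
        previous_hash = record["hash"]
    return True


trail = []
append_entry(trail, {"time": "2024-06-14T10:40:00Z", "user": "operator1",
                     "action": "setpoint change", "tag": "BR101.Temp.SP", "new": 37.0})
append_entry(trail, {"time": "2024-06-14T10:43:00Z", "user": "operator1",
                     "action": "alarm acknowledged", "alarm_id": "TI-101-HI"})
intact = verify(trail)
```

Changing any stored field afterwards, for example the new setpoint value in the first entry, makes `verify(trail)` return False.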
The equipment underneath
Modern biomanufacturing increasingly runs on single-use systems (sterile, disposable plastic assemblies) and skids, which are pre-assembled, self-contained units (a chromatography skid, a buffer-prep skid) that arrive with their own local controllers and tags. Single-use platforms from vendors like Sartorius (the BIOSTAT STR stirred-tank bioreactor, or the ambr bench-scale system) and Cytiva (the Xcellerex XDR series) ship with controllers embedded in the skid hardware, each carrying its own tag namespace. Each skid is thus a small island of automation that the site must integrate into the larger whole.
The bridge that carries a skid's data to the rest of the plant is most often OPC (Open Platform Communications), a family of open interface standards (the older OPC is now largely superseded by OPC UA, for Unified Architecture) that lets controllers and software from different vendors exchange tags and events using a standard protocol, reducing the custom point-to-point wiring every pairing used to require. It does not erase integration work entirely: both ends must support the OPC UA specification, and the meaning of each tag (what BR101.Temp.PV actually represents) still has to be mapped between systems. OPC lives at the edge between the control layer and the information systems above it. (We treat OPC UA and the broader connectivity standards in depth in Connectivity and Interoperability Standards.)
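The mapping step that remains after the protocol is agreed can be sketched as a plain lookup from a skid's local tag names to the site's names. The skid-side names below are invented for illustration and do not reflect any vendor's actual namespace; no OPC library is involved, only the translation that has to happen once values have been read.

```python
# Hypothetical mapping from one skid's local tags to the site convention.
SKID_TO_SITE = {
    "SKID7/TT_01": "BR101.Temp.PV",
    "SKID7/AT_01": "BR101.pH.PV",
    "SKID7/AT_02": "BR101.DO.PV",
}


def translate(readings, mapping):
    """Rename skid tags to site tags and report any tag with no mapping."""
    mapped, unmapped = {}, []
    for skid_tag, value in readings.items():
        site_tag = mapping.get(skid_tag)
        if site_tag is None:
            unmapped.append(skid_tag)  # surfaced for review, never silently dropped
        else:
            mapped[site_tag] = value
    return mapped, unmapped


readings = {"SKID7/TT_01": 36.95, "SKID7/AT_01": 7.02, "SKID7/PT_09": 1.2}
mapped, unmapped = translate(readings, SKID_TO_SITE)
```

Reporting unmapped tags rather than discarding them keeps gaps in the integration visible instead of leaving holes in the record.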
Why it matters
Here is the payoff for data management: the streams this layer emits are not a byproduct of manufacturing; they are the official manufacturing record. The electronic batch record (EBR), the legal document attesting that a batch was made correctly, is one the MES (next chapter) assembles largely from this control and recipe data: the as-executed parameters, the operator signatures, the alarm history, and the tag trends [8][9].
Because that record carries regulatory weight, the way the automation layer structures its data has consequences far downstream. If tags are named inconsistently across units, if alarms are not rationalized, if operator actions are not signed, or if the audit trail can be edited, then the EBR, and every analysis built on top of it, inherits those flaws [5][9]. Conversely, a clean, ISA-88-structured, well-tagged control system is what makes later multivariate analysis possible at all: you can only model a process whose data is consistently labeled, time-aligned, and trustworthy.
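Inconsistent tag naming is one of these flaws that can be caught mechanically. The sketch below checks tags against a pattern inferred from the examples in this chapter (unit, measurement, then a PV, SP, or OUT suffix); the exact pattern is an assumption, and a real site would encode its own naming standard.

```python
import re

# Assumed convention: <UNIT><3 digits>.<Measurement>.<PV|SP|OUT>
TAG_PATTERN = re.compile(r"^[A-Z]{2,4}\d{3}\.[A-Za-z]+\.(PV|SP|OUT)$")


def check_tags(tags):
    """Split tag names into those that follow the convention and those that do not."""
    valid = [tag for tag in tags if TAG_PATTERN.match(tag)]
    invalid = [tag for tag in tags if not TAG_PATTERN.match(tag)]
    return valid, invalid


tags = [
    "BR101.Temp.PV",
    "BR101.pH.PV",
    "BR101.AcidPump.OUT",
    "br101_temp",        # wrong case and separator
    "BR101.Temp.Value",  # unknown suffix
]
valid, invalid = check_tags(tags)
```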
In the real world
ISA-88 and ISA-95 are not academic ideals; they are the lingua franca of vendor batch software (products like Siemens Opcenter Execution Pharma, Körber's PAS-X, and Emerson Syncade) and the way real plants describe their processes to an MES (Manufacturing Execution System), the Level 3 system that orchestrates batch execution [2][3]. When NIIMBL's SABRE pilot facility (a cGMP biomanufacturing and workforce-training center being built at the University of Delaware to scale up and de-risk advanced biomanufacturing) comes online and runs intensified single-use perfusion processes, its skids, controllers, and historians will be speaking exactly this language: tags flowing up through OPC, recipes phrased in ISA-88 terms, alarms managed against ISA-18.2 expectations, and audit trails kept to Part 11 standards [1][5][8]. The standards are what let a process developed in one facility be transferred to another and still produce data that looks the same.
Key terms
- Control hierarchy (Purdue / ISA-95 levels): the layered stack from sensors (Level 0) through controllers (Level 1) and supervision (Level 2) to plant and business systems (Levels 3-4).
- PLC / DCS: a Programmable Logic Controller and a Distributed Control System; the industrial computers that run closed-loop control.
- SCADA / HMI: the supervisory layer and its operator screens.
- Closed-loop control: continuously comparing the process value to the setpoint and correcting the difference.
- Setpoint vs. process value: the target a controller aims for vs. the live measurement it reads.
- ISA-88 (IEC 61512): the batch control standard separating equipment from procedure and defining recipe types.
- Recipe types: general, site, master, and control recipes; the last carries the as-executed values.
- Phase: the smallest reusable procedural action in ISA-88.
- Tag: a single named variable in the control system, sampled into high-frequency time-series.
- Alarms and events (A&E): timestamped signals (governed by ISA-18.2) that something needs attention or was logged.
- Audit trail: a secure, time-stamped record letting a reviewer reconstruct what happened.
- Electronic batch record (EBR): the legal record that a batch was made correctly, assembled largely from control data.
- Single-use / skid: disposable plastic process assemblies and pre-built self-contained units with their own controllers.
- OPC: an open interface standard for exchanging tags and events between vendors' systems.
Where this leads
The automation layer produces the raw streams; it does not, by itself, store, organize, or reconcile them for the long term. That job belongs to a constellation of plant information systems sitting above the control layer, each owning a different slice of the data. The next chapter, The Plant Information Systems: Historian, MES, LIMS, and ELN, introduces them in turn: the process historian that swallows the tag time-series, the MES that executes the batch and assembles the EBR, the LIMS that holds laboratory results, and the ELN that records experiments, along with the integration seams and data silos that open up between them.