The Plant Information Systems: Historian, MES, LIMS, and ELN
Where we are: We climb one floor above the automation layer to meet the four information systems (historian, MES, LIMS, and ELN) that each own a slice of the plant's data.
In the previous chapter, Automation and Process Control Data, we sat down on the machine-room floor: the PLCs, DCS, and SCADA systems that run the process and stream out setpoints, alarms, events, and recipes, all structured by the ISA-88 batch-control standard. Those systems are brilliant at acting in the moment. But a single second's worth of sensor readings is useless unless something catches it, stores it, gives it meaning, and keeps it for years. That "something" is not one system; it is a constellation of them, sitting on the floor above the controllers, each speaking its own dialect.
This chapter introduces the four members of that constellation: the process historian (time-series data), the MES (batch execution), and the LIMS and ELN (the laboratory's world), plus a few important relatives. Each owns a different slice of the truth. And the gaps between them, the integration seams, are where this entire book finds its central problem.
Imagine a hospital recording one patient. A bedside heart monitor scribbles a continuous trace every second (that's the historian). A nurse fills in the official treatment chart, signing off each step against the doctor's orders (that's the MES). The lab files blood-test results in its own system (that's the LIMS). And a researcher keeps a notebook of experiments tried along the way (that's the ELN). All four describe the same patient, but they don't automatically talk to each other, and they each call the patient something different.
What this chapter covers
We'll meet each system in turn (what it is, what data it owns, and why it exists separately), then step back to see the integration seams and the silos that form along them.
One physical plant, many information systems: the data's value lives in the seams between them.
Original diagram by the authors, created with AI assistance.
The process historian: remembering every second
A process historian is a database built for one job: storing time-series data, that is, long streams of measurements, each stamped with the exact time it was taken. Every measured point in the plant (a temperature probe, a pH electrode, a flow meter) is a tag, a named channel that produces a value many times a minute. Tags follow a structured naming convention so a human (and a machine) can decode them at a glance: a tag like BR-201.Temp.PV reads as bioreactor 201, temperature, process value, while BR-201.Temp.SP is the setpoint for that same temperature loop. A mid-sized biomanufacturing line can hold tens of thousands of tags, the fastest of which produce a value every second or more often. Over a month-long production campaign, this accumulates billions of data points: around 26 billion in a single 30-day run if 10,000 tags each sample once per second.
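The 26 billion figure is easy to check. The short calculation below simply multiplies out the numbers given in the paragraph (10,000 tags, one sample per second, 30 days); the variable names are illustrative only.

```python
# Back-of-the-envelope check of the campaign volume quoted in the text.
tags = 10_000                       # tags on the line (figure from the text)
samples_per_second = 1              # each tag samples once per second
seconds_in_run = 30 * 24 * 60 * 60  # a 30-day production run

points = tags * samples_per_second * seconds_in_run
print(f"{points:,} data points")    # 25,920,000,000, i.e. about 26 billion
```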
Each stored point is more than a number. A single historian row pairs the timestamp, the tag, the value, and a quality flag that says whether the instrument was healthy when it reported:
2024-06-13T14:03:07.123Z,BR-201.Temp.PV,36.8,GOOD
2024-06-13T14:03:08.123Z,BR-201.Temp.PV,36.8,GOOD
2024-06-13T14:03:09.123Z,BR-201.pH.PV,7.02,GOOD
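To make the row structure concrete, here is a minimal sketch that parses one such row and decodes its tag. It assumes the comma-separated layout and the three-part Unit.Measurement.Attribute naming shown above; the Sample class and parse_row function are hypothetical names, not part of any historian's API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sample:
    time: datetime      # timezone-aware timestamp (UTC)
    unit: str           # e.g. "BR-201", the bioreactor
    measurement: str    # e.g. "Temp" or "pH"
    attribute: str      # e.g. "PV" (process value) or "SP" (setpoint)
    value: float
    quality: str        # instrument health flag, e.g. "GOOD"


def parse_row(row: str) -> Sample:
    """Split 'timestamp,tag,value,quality' and decode the dotted tag name."""
    stamp, tag, value, quality = row.strip().split(",")
    unit, measurement, attribute = tag.split(".")
    # Replace the trailing "Z" so older Python versions can parse the offset.
    time = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    return Sample(time, unit, measurement, attribute, float(value), quality)


sample = parse_row("2024-06-13T14:03:07.123Z,BR-201.Temp.PV,36.8,GOOD")
print(sample.unit, sample.measurement, sample.attribute, sample.value)
```

Keeping the quality flag alongside the value matters: a downstream calculation can then choose to ignore readings the instrument itself reported as unhealthy.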
Why not just use an ordinary relational database (the spreadsheet-like tables most business software runs on)? Because relational databases buckle under that firehose of writes, and storing every raw point would be ruinously expensive. Commercial historians like OSIsoft PI (now AVEVA PI System), among the most widely deployed historians in biopharma, along with GE Proficy Historian and Honeywell PHD, are purpose-built to ingest tens of thousands of tags and answer queries across years of them. They solve the volume problem with compression, classically the swinging-door (and related deadband) algorithms that keep only the points needed to reconstruct the signal within a defined tolerance and discard the redundant ones in between. Swinging-door compression, for instance, discards points that fall within a defined deadband around the projected trend line. The result is enormous storage savings while the shape of the curve is preserved.
That trade-off is also a data-integrity question. Suppose a pH probe reads 7.00, 7.02, 7.01, 7.03, 7.00, 7.02, a flutter of at most 0.03 pH over 30 seconds. A swinging-door algorithm with a 0.05 pH deadband would keep only the first and last points and discard the four in between, and you would lose nothing that matters. But widen that deadband to 0.15, and a real excursion from 7.00 up to 7.15 that lasts only 10 seconds can vanish entirely, and with it your ability to prove you caught the deviation. Compression that is too aggressive can quietly erase a real excursion, so regulators expect the original record and its meaning to survive. The FDA's data-integrity guidance and the harmonized PIC/S guidance both insist that captured process data remain complete, attributable, and reconstructable across its lifecycle, historian compression settings included [5][8].
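The effect of the deadband can be seen in a small sketch. The function below is a simplified compressor in the swinging-door family, not any vendor's implementation: it archives a point only when a straight line from the last archived point could no longer pass within the deadband of every point skipped so far. The function name, the eps guard against floating-point rounding, the sample spacing, and the excursion series are all assumptions made for illustration.

```python
from math import inf


def compress(points, deadband, eps=1e-9):
    """Keep only the points needed to rebuild the trend within +/- deadband.

    points: list of (seconds, value) pairs with strictly increasing times.
    """
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    t0, v0 = points[0]      # last archived point, the hinge of the doors
    lo, hi = -inf, inf      # range of slopes a line from the hinge may take
    prev = points[0]
    for t, v in points[1:]:
        slope = (v - v0) / (t - t0)
        if not (lo - eps <= slope <= hi + eps):
            # A line to this point would miss an earlier point by more than
            # the deadband, so archive the previous point and start again.
            kept.append(prev)
            t0, v0 = prev
            lo, hi = -inf, inf
        # Narrow the allowed slopes so later lines stay close to this point.
        lo = max(lo, (v - deadband - v0) / (t - t0))
        hi = min(hi, (v + deadband - v0) / (t - t0))
        prev = (t, v)
    kept.append(points[-1])  # always keep the most recent point
    return kept


# The pH flutter from the text, assumed to be sampled every 6 seconds.
flutter = [(0, 7.00), (6, 7.02), (12, 7.01), (18, 7.03), (24, 7.00), (30, 7.02)]
# A hypothetical 10-second excursion to pH 7.15, sampled every 5 seconds.
excursion = [(0, 7.00), (5, 7.00), (10, 7.15), (15, 7.15),
             (20, 7.00), (25, 7.00), (30, 7.00)]

print(compress(flutter, 0.05))
print(compress(excursion, 0.05))
print(compress(excursion, 0.15))
```

Run as written, the flutter collapses to its first and last points under a 0.05 deadband. The excursion survives at 0.05 but disappears at 0.15, leaving only a flat line from 7.00 to 7.00.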
A historian answers "what was the temperature at 14:03:07 on Tuesday?" in milliseconds across years of data. A relational database would struggle to even hold the question. Different jobs, different tools.
MES and the electronic batch record: the system of record
If the historian watches, the MES (Manufacturing Execution System) governs. Sitting between the control floor and the business systems above, the MES manages how a batch is actually made: it dispatches work instructions, enforces the approved recipe step by step, and refuses to let an operator skip ahead or use the wrong material. Commercial MES platforms built to enforce recipes this way include AVEVA Wonderware, Siemens Opcenter Execution, and Körber's Werum PAS-X (a pharma-specific MES).
Its signature output is the EBR (electronic batch record), the digital replacement for the old paper binder that documented every action in making a batch. The MES enforces the master recipe (the approved template for how the product is made) and produces, for each lot, a complete signed account of what was done, by whom, and when. This makes the MES the system of record for batch execution: the single authoritative source for "how this batch was manufactured."
Because that record is legally binding, it must satisfy the U.S. FDA's 21 CFR Part 11, the regulation governing electronic records and electronic signatures. Part 11 requires secure, computer-generated, time-stamped audit trails, controls over who can do what, and signatures bound to records so they cannot be transplanted or repudiated [4]. In facilities operating under EU GMP, the comparable rulebook is EU Annex 11 (EudraLex Volume 4, Good Manufacturing Practice, Annex 11: Computerised Systems, 2011), which lays down parallel requirements for validation, audit trails, and access controls. The EBR also unlocks review by exception: instead of a reviewer reading thousands of compliant entries, the system flags only the deviations (the steps that fell outside limits), so human attention goes where it's actually needed.
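Review by exception is, at its core, a filter. The sketch below shows the idea on a toy batch record; the StepRecord fields, step names, values, and limits are all hypothetical, and a real EBR carries far more context (signatures, timestamps, materials) than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepRecord:
    step: str        # name of the executed recipe step
    parameter: str   # what was measured or entered
    value: float     # the recorded value
    low: float       # lower acceptance limit
    high: float      # upper acceptance limit

    @property
    def in_limits(self) -> bool:
        return self.low <= self.value <= self.high


def review_by_exception(records):
    """Return only the entries that fell outside their limits."""
    return [r for r in records if not r.in_limits]


# Hypothetical entries from one batch record.
batch_record = [
    StepRecord("Inoculation", "temperature_C", 36.8, 36.5, 37.5),
    StepRecord("Feed addition", "pH", 7.02, 6.90, 7.10),
    StepRecord("Harvest hold", "temperature_C", 9.4, 2.0, 8.0),
]

for r in review_by_exception(batch_record):
    print(f"REVIEW: {r.step}: {r.parameter}={r.value} (limits {r.low} to {r.high})")
```

Only the harvest-hold entry reaches the reviewer here; the two compliant steps stay in the record but demand no attention.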
LIMS and ELN: the laboratory's two worlds
Manufacturing makes the product; the laboratory decides whether it's good enough to release. Two systems own that world.
The LIMS (Laboratory Information Management System) tracks samples and results. When a vial is pulled from a bioreactor, the LIMS assigns it an identity, routes it to the right tests, records who ran each test on which instrument, holds the specifications (the pass/fail limits a result must meet, set using validated analytical procedures under harmonized guidelines such as ICH Q2(R2), Validation of Analytical Procedures), and judges each result against them. LIMS vendors such as LabVantage, Thermo Fisher SampleManager, and STARLIMS specialize in exactly this sample-and-result tracking. The LIMS is the system of record for quality-control (QC) data: the structured, regulated answer to "did this batch meet spec?"
The ELN (Electronic Lab Notebook) is the digital descendant of the bound paper notebook. Where the LIMS handles routine, structured testing, the ELN captures the exploratory, narrative work: the experiments a scientist designs, the conditions tried, the reasoning, the unexpected result worth chasing. ELN platforms such as IDBS E-WorkBook, Labguru, and Benchling let scientists record experiments in this freer form. The boundary blurs in practice (many vendors bundle both), but the distinction matters: LIMS answers "is this sample within spec?", while the ELN answers "what did we try, and why?"
Both fall squarely under data-integrity expectations summarized by the modern ALCOA+ principle: data should be Attributable, Legible, Contemporaneous, Original, Accurate, and (the "+") Complete, Consistent, Enduring, and Available. The principle applies to any GxP record regardless of which system holds it. The FDA's data-integrity Q&A and PIC/S PI 041 apply these expectations to laboratory and chromatography systems just as firmly as to the shop floor [5][8].
A controlled laboratory workspace. The physical plant is mirrored by a constellation of information systems: fed by continuous sensor streams (the historian), governed by step-by-step batch instructions (the MES), tested offline against analytical results (the LIMS), with experiment notes kept separately (the ELN), each owning a different slice of the data.
Laminar-flow cabinet. Image by syed sajidul islam, licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/), via Wikimedia Commons; used unmodified.
The supporting cast
Four systems don't tell the whole story. Around them sit several relatives:
- SCADA archives: the supervisory-control layer often keeps its own short-term store of operational history, separate from the long-term plant historian.
- BMS / EMS: the Building Management System and Environmental Monitoring System watch the room, not the process: cleanroom temperature, humidity, differential pressure, and airborne-particle and microbial counts. In a sterile facility these records are part of the release decision.
- CDS: the Chromatography Data System acquires and processes the signals from analytical chromatography instruments; widely used platforms include Waters Empower, Agilent OpenLab CDS (successor to its ChemStation platform), and Shimadzu LabSolutions. Regulators have long treated chromatography data as a data-integrity focus area, which is why data-integrity guidance singles out these systems for particular attention [5].
- ERP: the Enterprise Resource Planning system, up at the business level, owns materials, inventory, and orders, and exchanges information with the MES at the boundary between operations and enterprise.
Each of these is, in the language of computerized-system validation, a GxP-relevant system that must be assessed and validated as fit for its intended use, following the risk-based discipline laid out in GAMP 5, the industry's standard playbook for assuring such systems [6].
How the pieces fit, and where they don't
Here is the constellation in one view, anchored to the layered hierarchy that the ANSI/ISA-95 standard defines for who owns what between sensors and the enterprise [1].
Each box owns a different slice of the same batch. The single-headed arrows are one-way sources (a PLC feeds the historian); the double-headed arrows are bidirectional flows where data must be reconciled across a boundary (MES ↔ ERP). Those seams, where the arrows meet, are exactly where silos form when the reconciliation breaks down.
The problem this book keeps circling lives in those seams. Each system was built by a different vendor, for a different purpose, with its own internal vocabulary. The same manufacturing batch appears in the historian as a tag prefix (e.g., BR201_BATCH0156), in the MES batch record as a formal batch number (BATCH_2024_0156), in the LIMS as the parent of many test samples (S-0156-001, S-0156-002, and so on), and in the ERP as an inventory lot (LOT-22A-MABX), each with its own identifier scheme. Worse, the LIMS entries are themselves subdivisions: the batch is one thing, but the samples drawn from it are many, so tracing a single sample result back to the original batch conditions means bridging three or four systems. The same real-world batch wears several different names, and nothing automatically knows they belong together.
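One common way to bridge these names is an explicit crosswalk: a small lookup that records which identifiers refer to the same real-world batch. The sketch below uses the example identifiers from the paragraph above; the BatchIdentity class, the build_index and resolve functions, and the idea of holding the mapping in memory are illustrative assumptions, not a description of any product.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BatchIdentity:
    historian_prefix: str          # how the historian names the batch
    mes_batch: str                 # formal batch number in the MES
    erp_lot: str                   # inventory lot in the ERP
    lims_samples: Tuple[str, ...]  # samples drawn from the batch


# One entry per real-world batch, maintained deliberately rather than guessed
# by parsing identifier strings.
CROSSWALK = [
    BatchIdentity(
        historian_prefix="BR201_BATCH0156",
        mes_batch="BATCH_2024_0156",
        erp_lot="LOT-22A-MABX",
        lims_samples=("S-0156-001", "S-0156-002"),
    ),
]


def build_index(crosswalk):
    """Map every known alias to the batch identity it belongs to."""
    index = {}
    for identity in crosswalk:
        aliases = (identity.historian_prefix, identity.mes_batch,
                   identity.erp_lot, *identity.lims_samples)
        for alias in aliases:
            index[alias] = identity
    return index


INDEX = build_index(CROSSWALK)


def resolve(alias: str) -> Optional[BatchIdentity]:
    """Return the batch an identifier belongs to, or None if unknown."""
    return INDEX.get(alias)


print(resolve("S-0156-002").mes_batch)
```

The design choice worth noting is that the mapping is stated, not inferred: the digits 0156 happen to recur in three of the four names, but the ERP lot shares nothing with them, so string matching alone would not connect it.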
This is the classic data silo: rich data, trapped in incompatible boxes. The peer-reviewed literature on digital twins in biopharma puts it bluntly: the field's data sits in disconnected sources, and an integrated data layer is the prerequisite for breaking the silos [9]. It is also exactly what the FAIR principles (that data should be Findable, Accessible, Interoperable, and Reusable) were written to fix [2].
Why it matters
For data management, the lesson is that there is no single "plant database." There is a federation of systems of record, each authoritative for its own slice, and value is created, or lost, at the boundaries between them. A question as simple as "which culture conditions gave the highest-purity batch?" requires joining the historian (conditions), the MES (which batch), and the LIMS (purity): three systems, three vocabularies, three identifier schemes. If the seams aren't bridged, that question simply can't be answered, no matter how much data was collected.
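A toy version of that join makes the dependency visible. In the sketch below, three small extracts stand in for the historian, the MES, and the LIMS; all values are invented for illustration, batch 0157 is hypothetical, and the summarize function is an assumed name rather than any system's interface.

```python
from statistics import mean

# Historian extract: temperature readings keyed by the historian's batch prefix.
historian_temps = {
    "BR201_BATCH0156": [36.8, 36.9, 36.8],
    "BR201_BATCH0157": [37.4, 37.5, 37.3],
}
# MES extract: formal batch number -> historian prefix (the bridge over the seam).
mes_to_historian = {
    "BATCH_2024_0156": "BR201_BATCH0156",
    "BATCH_2024_0157": "BR201_BATCH0157",
}
# LIMS extract: one row per sample, each pointing at its parent MES batch.
lims_results = [
    {"sample": "S-0156-001", "mes_batch": "BATCH_2024_0156", "purity_pct": 98.7},
    {"sample": "S-0156-002", "mes_batch": "BATCH_2024_0156", "purity_pct": 98.9},
    {"sample": "S-0157-001", "mes_batch": "BATCH_2024_0157", "purity_pct": 97.1},
    {"sample": "S-0157-002", "mes_batch": "BATCH_2024_0157", "purity_pct": 97.4},
]


def summarize(historian_temps, mes_to_historian, lims_results):
    """Join purity (LIMS) to conditions (historian) through the MES batch."""
    purity = {}
    for row in lims_results:
        purity.setdefault(row["mes_batch"], []).append(row["purity_pct"])
    summary = []
    for mes_batch, values in purity.items():
        prefix = mes_to_historian.get(mes_batch)
        if prefix is None or prefix not in historian_temps:
            continue  # seam not bridged: no conditions can be attached
        summary.append({
            "mes_batch": mes_batch,
            "mean_purity_pct": mean(values),
            "mean_temp_C": mean(historian_temps[prefix]),
        })
    return summary


rows = summarize(historian_temps, mes_to_historian, lims_results)
best = max(rows, key=lambda r: r["mean_purity_pct"])
print(best["mes_batch"], round(best["mean_temp_C"], 2))
```

Pass an empty mapping in place of mes_to_historian and the summary comes back empty: the historian and LIMS data are all still there, but without the bridge the question has no answer.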
In the real world
The industry's response has two layers. Technically, the move is toward OPC UA (Open Platform Communications Unified Architecture), a vendor-neutral standard that carries not just values but their meaning (their semantics) across the seams between control systems, historians, and the MES/enterprise layers [7]. Strategically, ISPE's Pharma 4.0 operating model frames this as a digital-maturity journey: converging IT and OT (the business and operational technology worlds), integrating architectures, and deliberately eliminating data silos rather than tolerating them [3]. This is the same integration imperative this book pursues: shared standards and ontologies so that a result in the LIMS and a tag in the historian can be recognized as describing the same real-world thing.
Key terms
- Process historian: a database optimized for storing high-volume time-series data, using compression to retain the signal cheaply [5].
- Tag: a named channel in the historian for one measured point, producing many timestamped values.
- Time-series data: measurements indexed by the exact time they were taken.
- MES (Manufacturing Execution System): the manufacturing-operations-level system that governs batch execution [1]; the recipe it enforces is structured by the ISA-88 batch standard.
- EBR (electronic batch record): the digital, signed record of how a specific batch was made.
- Master recipe: the approved template defining how a product is manufactured.
- Review by exception: reviewing only the flagged deviations rather than every compliant entry.
- System of record: the single authoritative source for a given slice of data.
- LIMS (Laboratory Information Management System): the system that tracks samples, tests, specs, and QC results.
- ELN (Electronic Lab Notebook): the digital notebook capturing exploratory experiments and reasoning.
- CDS (Chromatography Data System): software that acquires and processes chromatography data [5].
- BMS / EMS: the Building Management System and Environmental Monitoring System, which record cleanroom conditions.
- ERP (Enterprise Resource Planning): the enterprise-level system for materials, inventory, and orders.
- Audit trail: a secure, time-stamped record of who changed what and when, required by 21 CFR Part 11 [4].
- ALCOA+: data should be Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, and Available [5].
- Data silo: valuable data trapped in a system that can't easily share it; the opposite of FAIR [2].
- OPC UA: a vendor-neutral standard for exchanging data and its meaning across system boundaries [7].
Where this leads
We now have the cast of systems and a hard look at the seams between them. The next chapter, Architecture and Integration: ISA-95, OT/IT, and the Edge-to-Cloud Path, gives the constellation a map: the ISA-95 / Purdue hierarchy that organizes everything from Level 0 sensors to Level 4 enterprise, the convergence of operational and information technology, the contextualization layer that finally lets these systems agree on what they mean, and the modern path that carries plant data from the edge all the way to the cloud.