Speaking OT: OPC UA, MQTT, and Sparkplug B

📍 Where we are: Part II · Capturing the Process. We built a deterministic bioreactor and named its tags; now we make that data speak over the two protocols every modern plant floor runs on, with real security and an honest look at where field deployments get it wrong.

The simple version

Imagine the bioreactor is a person who only knows how to mumble numbers. Two helpers translate for the rest of the plant. The first, OPC UA, is a meticulous librarian: ask it anything and it hands you not just the value but a labelled card that says "this is temperature, in degrees Celsius, measured at 14:32:07, and I'm confident it's good." The second, MQTT with Sparkplug B, is a town crier on a party line: every device announces itself when it wakes up ("BR101 is online, here are my metrics"), shouts changes as they happen, and (cleverly) leaves a sealed death certificate with the switchboard so that if it drops dead mid-shift, everyone is told promptly. This chapter stands up both, wires them to our simulated CHO bioreactor, and is blunt about the part most people skip: actually turning on the security.

What this chapter covers

In Chapter 4 we gave every signal a disciplined name like BR101.Titer.PV. A name with no transport is just a label on an empty box. This chapter fills the box and ships it. We cover:

  • The OPC UA information model: a self-describing address space where each tag carries its value plus type, engineering unit, timestamp, and a quality flag, and how to stand up a server and client with open-source stacks (open62541, node-opcua, asyncua).
  • MQTT publish/subscribe with Eclipse Mosquitto, and the Sparkplug B birth/death lifecycle that turns a dumb message bus into a stateful, self-discovering one.
  • Real OPC UA security: the Basic256Sha256 policy, application certificates, and strict trust lists, plus the uncomfortable evidence that most field servers misconfigure exactly this.
  • The difference between syntactic transport (moving bytes safely) and semantic transport (moving meaning), and why quality flags and timestamps are not optional decoration.

Everything here is backed by code in the companion repo that actually runs on your laptop. Let's open the box.

OPC UA: the self-describing librarian

The reason OPC UA (formally IEC 62541) dominates the modern plant floor is that it refuses to ship a number naked. Its core idea is an address space: a browsable tree of nodes, where a node is not just a value but an object carrying metadata: its data type, its engineering unit, its access rights, and references to other nodes [1]. A client can connect to a server it has never seen, browse the tree, and discover what is there without a pre-shared spec sheet. That self-description is the whole point: a temperature node tells you it is a temperature, in °C, right now, and that the reading is trustworthy.

Our companion repo models exactly this. The file examples/chapters/05-connectivity-opcua-mqtt/opcua_server.py stands up an OPC UA server using asyncua, the pure-Python asyncio implementation from the FreeOpcUa project [2]. It exposes our BR101 bioreactor as an address space, one node per live process variable, and replays the deterministic fed-batch trace into those nodes. Here is the server's heart:

# examples/chapters/05-connectivity-opcua-mqtt/opcua_server.py
from asyncua import Server

ENDPOINT = "opc.tcp://0.0.0.0:4841/bioproc/"
NAMESPACE = "https://example.org/bioproc"

# the tags the server publishes (a subset of the historian tags, the live PVs)
TAGS = ["BR101.Temp.PV", "BR101.pH.PV", "BR101.DO.PV", "BR101.Agitation.PV",
        "BR101.Titer.PV", "BR101.OnlineGlucose.PV"]


async def build_server() -> tuple[Server, dict]:
    server = Server()
    await server.init()
    server.set_endpoint(ENDPOINT)
    server.set_server_name("BR101 Bioreactor (simulated)")
    # claim our own namespace; the returned index prefixes every node we add
    idx = await server.register_namespace(NAMESPACE)

    objects = server.nodes.objects
    br = await objects.add_object(idx, "BR101")
    nodes = {}
    for tag in TAGS:
        # one variable node per tag, initialised to 0.0 until the replay starts
        var = await br.add_variable(idx, tag, 0.0)
        await var.set_writable()
        nodes[tag] = var
    return server, nodes

Three details earn their keep. First, the endpoint opc.tcp://0.0.0.0:4841/bioproc/ is OPC UA's native binary TCP transport: efficient, and the default a client dials. Second, register_namespace claims a URI (https://example.org/bioproc) so our BR101 object lives in its own namespace, not tangled with the server's built-in nodes; clients resolve it by index, which is why the client code later asks for the namespace number before browsing. Third, we add BR101 as an object and hang variables under it; that hierarchy is the self-description. A browsing client sees BR101 → BR101.Titer.PV and immediately knows the titer belongs to that bioreactor.

When a value is written, asyncua stamps it with a Good status code and a source timestamp by default [2]. That trio (value plus quality plus time) is the librarian's index card, and it is the part naive transports throw away.

Reading it back: the round-trip proof

The repo doesn't just describe this; it proves it runs. The demo_roundtrip() function starts the server, replays ten minutes of the trace near peak titer, then connects a client and reads a node back:

# examples/chapters/05-connectivity-opcua-mqtt/opcua_server.py
async def demo_roundtrip() -> dict:
    """Start the server, replay a few steps, read a node with a client, stop."""
    from asyncua import Client

    server, nodes = await build_server()
    state = fed_batch.simulate().state
    async with server:
        # replay 10 minutes near peak titer so the read is interesting
        for step in range(19000, 19010):
            await _pump(nodes, state, step)
        await asyncio.sleep(0.2)
        async with Client(ENDPOINT.replace("0.0.0.0", "127.0.0.1")) as client:
            node = await client.nodes.objects.get_child(
                [f"{await _ns(client)}:BR101", f"{await _ns(client)}:BR101.Titer.PV"])
            titer = await node.read_value()
    return {"endpoint": ENDPOINT, "tags": len(nodes), "read_titer_g_L": round(float(titer), 3)}

Run it and you get a deterministic result (the simulator is pinned to SIM_SEED=2026, so this is byte-identical on your machine):

{"endpoint": "opc.tcp://0.0.0.0:4841/bioproc/", "tags": 6, "read_titer_g_L": 4.902}

The client browsed BR101, found BR101.Titer.PV by name, and read 4.902 g/L, the titer at minute 19009 of our golden batch. The underlying state the server pumped in is just as concrete:

step=19000 temp_C=36.96 pH=6.967 DO_pct=31.6 titer_g_L=4.896 glucose_g_L=1.416
step=19001 temp_C=36.98 pH=6.951 DO_pct=33.2 titer_g_L=4.897 glucose_g_L=1.413
...
step=19009 titer_g_L=4.902

This is the smallest honest unit of "connectivity": a value left one process and arrived intact in another, with its identity preserved. The chapter's test suite (tests/test_chapters.py::test_ch05_opcua_roundtrip) asserts exactly that (tags == 6 and 0 < read_titer_g_L < 10), so the round-trip can never silently rot.

The same job in other languages

asyncua is our reference, but OPC UA is a polyglot world and the book uses three open stacks deliberately. open62541 is a C99 implementation (MPL v2.0) certified against the OPC UA Server Profile and supporting the Basic256Sha256 security policy; it is the right choice when you need a tiny, fast server embedded near the instrument [3]. node-opcua is an MIT-licensed Node.js/TypeScript SDK, ideal when your collector lives in the application tier alongside web services [4]. They all speak the same wire protocol, so a node-opcua collector, the UaExpert desktop browser, or Telegraf's OPC UA input plugin can subscribe to our asyncua server without modification. That interoperability is the standard doing its job.

Turning on the security (the part everyone skips)

Here is the uncomfortable truth this chapter refuses to soften. OPC UA can be locked down beautifully: the OPC UA Security Model (OPC 10000-2 / IEC 62541-2) defines signed-and-encrypted channels, application certificates, and trust lists that say which peers a server will even talk to [5]. The Basic256Sha256 security policy uses SHA-256 and 2048-bit-plus RSA keys; it is the modern baseline.

But can and does are different planets. A 2020 internet-wide measurement study found that 92% of reachable OPC UA deployments had insecure configurations. Most damning, of 564 servers advertising the Basic256Sha256 policy, 409 (about 73%) presented certificates that did not even match that policy, falling back to MD5/SHA-1 signatures or short keys [6]. In other words, most field servers claim strong security and then hand you a broken credential. The protocol was never the weak link; the deployment was.

So when you move past our laptop demo (which uses an open opc.tcp:// endpoint for teaching), real security is a few deliberate steps. With asyncua, you load the server's own certificate and key, set the allowed policy, and, in the step everyone forgets, pin a trust list so the server rejects any client whose certificate isn't on it:

# Illustrative hardening: what the production deployment adds on top of the
# demo server; this is NOT in opcua_server.py (the runnable demo uses no
# security policy at all).
from pathlib import Path

from asyncua import ua
from asyncua.crypto.truststore import TrustStore
from asyncua.crypto.validator import CertificateValidator, CertificateValidatorOptions

await server.load_certificate("certs/server-cert.pem")
await server.load_private_key("certs/server-key.pem")
server.set_security_policy([ua.SecurityPolicyType.Basic256Sha256_SignAndEncrypt])

# Strict trust: only clients whose certs live in the trust folder may connect.
# A TrustStore loads the trusted peer certs (and CRLs), and a CertificateValidator
# rejects any client that isn't trusted.
trust_store = TrustStore(trust_locations=[Path("certs/trusted")], crl_locations=[])
await trust_store.load()
validator = CertificateValidator(
    CertificateValidatorOptions.TRUSTED | CertificateValidatorOptions.PEER_CLIENT,
    trust_store,
)
server.set_certificate_validator(validator)

The lesson the data forces on us: advertising Basic256Sha256 is worthless if you accept self-signed strangers or skip certificate validation. The honest checklist is encrypt the channel, validate the chain, and keep the trust list short and reviewed. This is also where pure OSS does fine on the wire but offers you no built-in certificate lifecycle: issuance, rotation, revocation. In a GxP plant you bolt that onto a real PKI (a Global Discovery Server or your site CA), and you document it. The stack is free; the discipline is not.
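The mismatch the study found (a server advertising Basic256Sha256 while presenting a SHA-1 or short-key certificate) is easy to test for once the certificate's facts are in hand. The following is a minimal sketch, not code from the companion repo: CertFacts and basic256sha256_problems are hypothetical names, the certificate parsing itself is assumed to happen elsewhere, and the 4096-bit upper bound reflects our reading of the policy profile rather than anything stated in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertFacts:
    """Hypothetical summary of an application certificate, extracted elsewhere."""
    signature_hash: str  # e.g. "sha256", "sha1", "md5"
    rsa_key_bits: int    # RSA modulus length in bits


def basic256sha256_problems(cert: CertFacts) -> list[str]:
    """Return the reasons a certificate falls short of Basic256Sha256 (empty if none)."""
    problems = []
    if cert.signature_hash.lower() != "sha256":
        problems.append(f"signature hash is {cert.signature_hash}, expected sha256")
    # Assumption: the policy accepts RSA keys from 2048 to 4096 bits.
    if not 2048 <= cert.rsa_key_bits <= 4096:
        problems.append(f"RSA key is {cert.rsa_key_bits} bits, expected 2048 to 4096")
    return problems


good = CertFacts(signature_hash="sha256", rsa_key_bits=2048)
weak = CertFacts(signature_hash="sha1", rsa_key_bits=1024)

print(basic256sha256_problems(good))  # []
for problem in basic256sha256_problems(weak):
    print(problem)
```

A check like this belongs in commissioning and in periodic review: it catches the "claims strong, presents weak" case before an outside scanner does.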

A two-lane diagram of the plant connectivity backbone. The top lane shows OPC UA: the BR101 bioreactor as a browsable address-space tree, each node carrying value, unit, timestamp and quality, connected over a Basic256Sha256 signed-and-encrypted channel to a collector, with a trust list gating which clients may join. The bottom lane shows MQTT with Sparkplug B: BR101 publishing an NBIRTH announcement and DBIRTH metric definitions to the Mosquitto broker, ongoing DDATA change messages, and a pre-registered NDEATH will message the broker fires automatically when the device drops, with the historian subscribing downstream.

Two complementary transports: OPC UA answers "tell me everything about this tag, securely, on request," while Sparkplug B over MQTT announces "here is who I am and what changed," and guarantees the network learns the instant a device dies. Original diagram by the authors, created with AI assistance.

MQTT and Sparkplug B: the self-announcing town crier

OPC UA is request-driven and heavyweight; it shines for rich browsing and secure point-to-point reads. But a plant with hundreds of devices and thin network links also wants a lightweight, fan-out path. That is MQTT (OASIS standard, also published as ISO/IEC 20922), a publish/subscribe protocol where devices publish to topics and a broker fans messages out to whoever subscribed [7]. It is famously frugal, which is why it runs on everything from a soil sensor to a bioreactor skid.

Our broker is Eclipse Mosquitto [8]. The dev-stack config, examples/platform/mosquitto/mosquitto.conf, is short and, importantly, honest about being dev-only:

# examples/platform/mosquitto/mosquitto.conf
# Mosquitto broker config for the local dev stack (Chapter 5).
# Dev-only: anonymous access on the plain 1883 listener. Chapter 25 (operating &
# securing) replaces this with TLS + per-client ACLs; never ship anonymous in
# a real plant.
listener 1883
allow_anonymous true

# enable the $SYS topic tree so the healthcheck can confirm the broker is alive
sys_interval 10

persistence true
persistence_location /mosquitto/data/
log_dest stdout

Read the comment as a promise: allow_anonymous true on the plain 1883 listener is fine for a laptop and forbidden in a plant. Mosquitto supports MQTT over TLS with client certificates [8]; Chapter 25 swaps this file for a TLS listener and per-client access-control lists. Showing the insecure dev config and labelling it loudly is exactly the discipline this book preaches: we never let a convenient default sneak into production.

Sparkplug B: giving the bus a heartbeat

Raw MQTT has a problem for industrial use: it is stateless and topic-anarchic. Any device can publish anything to any string, and if a device falls off the network, subscribers have no idea; they just stop hearing from it, indistinguishable from "nothing changed." Sparkplug B (Eclipse Sparkplug 3.0.0) is the open specification that fixes this by defining a strict topic namespace and a birth/death lifecycle [9]. The reference encodings live in Eclipse Tahu (EPL-2.0), which provides Sparkplug B implementations in Java, Python, and C [10].

The lifecycle has four message types you must understand:

  • NBIRTH: Node Birth. When an edge node (say, the BR101 controller) connects, it publishes a birth certificate announcing itself.
  • DBIRTH: Device Birth. For each device under that node, a birth message that defines every metric: name, datatype, and current value. This is the self-description, the moment the bus learns BR101 has a Titer.PV that is a float in g/L.
  • NDEATH / DDEATH: Node/Device Death. The announcement that the node or device has gone offline.

The genius is how death is delivered. Sparkplug leverages MQTT's Will message feature: when an edge node connects, it registers its NDEATH payload with the broker in advance [7][9]. If the node's connection drops (crash, cable pull, power loss), the broker itself publishes the pre-registered death certificate. No polling, no timeout guessing on the subscriber side. The network learns within the keep-alive window that BR101 is gone. For a process where a silent dead sensor could mean an unnoticed temperature excursion, that guarantee is worth a great deal.
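The will mechanism is easier to trust once you have seen it in miniature. The sketch below is a toy in-memory model, not Mosquitto and not a real MQTT client: ToyBroker and its methods are hypothetical names, topics match by exact string only (no wildcards), and the "offline" payload is a placeholder for the real encoded NDEATH payload. It shows only the ordering that matters: the will is stored at connect time and published by the broker when the connection is lost.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[str, str], None]


class ToyBroker:
    """In-memory stand-in for a broker; models only subscribe, publish, and wills."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
        self._wills: dict[str, tuple[str, str]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        for handler in self._subscribers[topic]:
            handler(topic, payload)

    def connect(self, client_id: str, will_topic: str, will_payload: str) -> None:
        # The will is registered at connect time, before anything goes wrong.
        self._wills[client_id] = (will_topic, will_payload)

    def disconnect_cleanly(self, client_id: str) -> None:
        # A clean MQTT DISCONNECT discards the will without publishing it.
        self._wills.pop(client_id, None)

    def connection_lost(self, client_id: str) -> None:
        # Keep-alive expired or the socket dropped: the broker publishes the
        # stored will on the client's behalf.
        will = self._wills.pop(client_id, None)
        if will is not None:
            self.publish(*will)


events: list[tuple[str, str]] = []
broker = ToyBroker()
ndeath_topic = "spBv1.0/newark/NDEATH/BR101"
broker.subscribe(ndeath_topic, lambda topic, payload: events.append((topic, payload)))

broker.connect("BR101", ndeath_topic, "offline")
broker.connection_lost("BR101")  # simulate a cable pull
print(events)  # [('spBv1.0/newark/NDEATH/BR101', 'offline')]
```

The subscriber never polls and never guesses: it receives the death certificate because the broker held it all along.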

A Sparkplug topic and a decoded DBIRTH metric look like this โ€” the shape Tahu produces for our bioreactor:

topic: spBv1.0/newark/DBIRTH/BR101/reactor
{
  "timestamp": 1768759740000,
  "metrics": [
    { "name": "BR101.Titer.PV", "datatype": "Float", "value": 4.902, "properties": { "unit": "g/L", "quality": 192 } },
    { "name": "BR101.Temp.PV", "datatype": "Float", "value": 36.96, "properties": { "unit": "degC", "quality": 192 } }
  ]
}

Notice the topic structure: spBv1.0 (Sparkplug version) / newark (the group, our site; lowercase to match the UNS path convention from Chapter 4) / DBIRTH (message type) / BR101 (edge node) / reactor (device). That rigid namespace is what lets any Sparkplug-aware consumer discover the whole plant by listening, and it dovetails with the Unified Namespace idea we develop in later chapters. The metric carries unit and a quality code: the same value-plus-meaning discipline as OPC UA, just delivered by announcement instead of by request. A note on that number: 192 (0xC0) is the OPC DA (Classic) Good quality code, not an OPC UA status. Many Sparkplug edge nodes are fronting legacy OPC DA servers and carry the DA-style quality straight through, which is why you see it on the wire. The OPC UA server we built earlier reports Good differently: UA's StatusCode Good is simply 0 (0x00000000), which is exactly what asyncua stamps on each write.
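Because the namespace is rigid, a consumer can recover the group, node, and device from the topic string alone. Here is a minimal sketch of that parsing in plain Python; SparkplugTopic and parse_topic are hypothetical names rather than part of Tahu, and the sketch deliberately ignores the separate STATE topic form used by host applications.

```python
from dataclasses import dataclass
from typing import Optional

MESSAGE_TYPES = {"NBIRTH", "NDEATH", "DBIRTH", "DDEATH",
                 "NDATA", "DDATA", "NCMD", "DCMD"}


@dataclass(frozen=True)
class SparkplugTopic:
    namespace: str
    group_id: str
    message_type: str
    edge_node_id: str
    device_id: Optional[str]  # None for node-level messages such as NBIRTH


def parse_topic(topic: str) -> SparkplugTopic:
    """Split a Sparkplug B topic into its named parts, rejecting malformed ones."""
    parts = topic.split("/")
    if len(parts) not in (4, 5):
        raise ValueError(f"expected 4 or 5 topic levels, got {len(parts)}: {topic!r}")
    namespace, group_id, message_type, edge_node_id = parts[:4]
    device_id = parts[4] if len(parts) == 5 else None
    if namespace != "spBv1.0":
        raise ValueError(f"not a Sparkplug B topic: {topic!r}")
    if message_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {message_type!r}")
    # Device-level types (D...) carry a device id; node-level types (N...) do not.
    if message_type.startswith("D") != (device_id is not None):
        raise ValueError(f"{message_type} does not match the topic depth: {topic!r}")
    return SparkplugTopic(namespace, group_id, message_type, edge_node_id, device_id)


parsed = parse_topic("spBv1.0/newark/DBIRTH/BR101/reactor")
print(parsed.group_id, parsed.edge_node_id, parsed.device_id)  # newark BR101 reactor
```

A plain MQTT topic such as temp fails this parse immediately, which is the point: the structure is what makes discovery by listening possible.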

Syntactic vs semantic: moving bytes vs moving meaning

It is worth naming the deepest idea in this chapter. Syntactic transport is moving bytes from A to B without corruption: TLS, TCP, message framing. Both OPC UA and MQTT do this well. Semantic transport is moving meaning: the number 4.902 is useless unless the receiver also learns it is a titer, in g/L, measured at a known instant, with Good quality. OPC UA carries those semantics in its address space; Sparkplug carries them in DBIRTH metric definitions. A plain MQTT message of 4.902 to topic temp carries syntax but almost no semantics, which is precisely why Sparkplug exists. Throughout the rest of the book, every time data crosses a boundary, the question is the same: did the meaning survive, or just the bytes?
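One way to picture the difference is to compare a bare float with a record that carries its own meaning. The sketch below is illustrative only: Reading, is_good_opcua, and is_good_opcda are hypothetical names, and the "opcua"/"opcda" labels are our own convention for saying which quality encoding a reading uses. The bit tests follow the two encodings discussed earlier: OPC UA puts severity in the top two bits of a 32-bit StatusCode (00 means Good), while OPC DA puts quality in bits 6 and 7 of the low byte (11 means Good).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


def is_good_opcua(status_code: int) -> bool:
    """OPC UA StatusCode: the top two bits of the 32-bit code are severity (00 = Good)."""
    return ((status_code >> 30) & 0b11) == 0


def is_good_opcda(quality: int) -> bool:
    """OPC DA quality: bits 6 and 7 of the low byte are the quality field (11 = Good)."""
    return (quality & 0xC0) == 0xC0


@dataclass(frozen=True)
class Reading:
    """A value that travels with its meaning: tag, unit, time, and quality."""
    tag: str
    value: float
    unit: str
    timestamp: datetime
    quality: int
    quality_convention: str  # "opcua" or "opcda" (labels assumed for this sketch)

    def is_good(self) -> bool:
        if self.quality_convention == "opcua":
            return is_good_opcua(self.quality)
        if self.quality_convention == "opcda":
            return is_good_opcda(self.quality)
        raise ValueError(f"unknown quality convention: {self.quality_convention!r}")


bare = 4.902  # syntax only: which tag, which unit, when, and can we trust it?

stamp = datetime.fromtimestamp(1768759740, tz=timezone.utc)
titer_da = Reading("BR101.Titer.PV", 4.902, "g/L", stamp, 192, "opcda")
titer_ua = Reading("BR101.Titer.PV", 4.902, "g/L", stamp, 0, "opcua")

print(titer_da.is_good(), titer_ua.is_good())  # True True
print(is_good_opcua(192), is_good_opcda(0))    # True False
```

The last line is the trap worth remembering: read 192 with the UA rule and it still looks Good, but read a UA Good (0) with the DA rule and it looks Bad. A quality code is only meaningful alongside the convention it was written in.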

Why it matters

This connectivity backbone is the floor everything else stands on. If the transport loses a sample, mislabels a unit, or drops quality flags, then your historian, your SPC charts, and your soft sensor are all faithfully analyzing garbage. Get the transport right (value, unit, timestamp, quality, all preserved, all secured) and the rest of the platform inherits trustworthy inputs for free.

There is a regulatory edge too. Both EU GMP Annex 11 and FDA 21 CFR Part 11 expect controls that preserve the authenticity and integrity of records as they move between systems: a documented chain of custody, not just a hopeful network [11][12]. A signed-and-encrypted OPC UA channel with a reviewed trust list, or a TLS-secured Sparkplug bus with per-client ACLs, is how you make the transport attributable: you can say who sent what and prove it wasn't tampered with in flight. The insecure-by-default configs we showed are explicitly the thing an inspector would flag.

In the real world

Walk onto a modern mAb floor and you will find OPC UA almost everywhere control systems meet the IT world (Emerson DeltaV, Siemens PCS 7, and AVEVA PI all speak it), while MQTT/Sparkplug increasingly carries high-fan-out telemetry and edge data. Our fed-batch CHO + Protein A line is the dominant approved-antibody modality, and its sensors have been talking OPC for two decades. The intensified/continuous variant (perfusion with multi-column capture) only multiplies the tag count, which is exactly when a lightweight Sparkplug bus earns its place beside OPC UA.

NIIMBL, the U.S. public-private Institute for biopharmaceutical manufacturing, funds interoperability work of precisely this flavor. Its SABRE facility, a pilot-scale cGMP (current Good Manufacturing Practice) facility at the University of Delaware that broke ground in April 2024 and remains under construction as of mid-2026, is the kind of site where an open-source connectivity layer would sit alongside validated commercial control systems.

And the honest verdict for this layer is unusually kind to open source. The OSS stacks here are genuinely production-grade: open62541 [3], node-opcua [4], and asyncua [2] are real OPC UA implementations; Mosquitto [8] and Tahu [10] are mature Eclipse projects. You can build the entire transport backbone on them with no commercial license. What OSS does not hand you is the GxP last mile: a managed PKI for certificate rotation and revocation, a vendor on the hook when an auditor calls, and validation evidence out of the box. No OSS broker or stack is 21 CFR Part 11-compliant by default; compliance is a property of your configured, validated system, not of the download. The wire is open; the certificate lifecycle, the ACL review, and the validation are work you own. Watch the licenses as you scale, too: the OPC UA stacks here are permissive or weak-copyleft (MIT/MPL/LGPL) and Mosquitto/Tahu are EPL/EDL, but the commercial broker EMQX moved to a BSL license, so don't assume every MQTT option is free for production.

Key terms

  • OPC UA / IEC 62541: a self-describing industrial protocol whose address space carries value plus type, unit, timestamp, and quality [1].
  • Address space: the browsable tree of nodes (objects and variables) an OPC UA server exposes; the source of self-description.
  • Quality flag: a status code travelling with every reading so consumers know whether to trust it. OPC UA's Good is StatusCode 0 (0x00000000), while the legacy OPC DA (Classic) Good code is 192 (0xC0), the value a Sparkplug bridge fronting a DA server often passes through.
  • Basic256Sha256: the modern OPC UA security policy (SHA-256, 2048-bit+ RSA) for signed-and-encrypted channels [5].
  • Trust list: the explicit set of peer certificates a server will accept; the step most field deployments botch [6].
  • MQTT: a lightweight publish/subscribe protocol with a central broker; OASIS/ISO 20922 [7].
  • Broker: the MQTT server (here, Mosquitto) that fans published messages out to subscribers.
  • Will message: an MQTT message the broker publishes on a client's behalf when it disconnects unexpectedly; the basis of Sparkplug death certificates [7].
  • Sparkplug B: an open spec adding a strict topic namespace and a birth/death lifecycle to MQTT [9].
  • NBIRTH / DBIRTH / NDEATH / DDEATH: Sparkplug node/device birth and death certificates; DBIRTH defines every metric, deaths announce going offline.
  • Syntactic vs semantic transport: moving bytes safely vs moving meaning (value + unit + time + quality) intact.
  • cGMP: current Good Manufacturing Practice; the binding expectation of controlled, documented, reproducible manufacturing.

Where this leads

Our bioreactor now speaks fluent OT: OPC UA for rich, secure, browsable reads, and Sparkplug B over MQTT for self-announcing, fan-out telemetry. But raw floor traffic rarely flows straight into a historian; it gets filtered, reshaped, buffered, and routed first. The next chapter, The Edge Gateway: Routing Floor Data with Node-RED, Telegraf & NiFi, builds that middle layer: the open-source plumbing that pulls from these protocols and delivers clean, contextualized streams to everything downstream.