Target and Concept: Learning Where a Molecule Should Start

📍 Where we are: Part II · Discovery & Development, Learned — Chapter 4. Part I built the foundations: why bioprocess breaks the data-science rulebook, what makes data the binding fuel, and how models are validated under GxP. Now we start walking the spine from its very first node — the choice of what to make, before any of it has to be manufactured.

A manufacturing book that begins at the bioreactor begins too late. Long before WCB-CHO-001 exists, before there is a clone or a process or a single gram of mAb-A (the running-example monoclonal antibody, or "mAb"), someone chose a biological target to hit, a modality to hit it with, and — implicitly, often carelessly — committed the company to making a particular kind of molecule for years. That commitment is the seed of every manufacturing problem and every manufacturing success downstream. This chapter is about the machine learning that lives at that first node, and about a harder, quieter idea that the rest of Book 5 depends on: manufacturability is a property you choose at concept, not one you discover at scale-up.

We must be honest about scope right away. Most of the AI you read about at "the start of drug development" is drug-discovery AI — protein-structure prediction, generative chemistry, target identification from omics — and it is a vast, fast-moving field of its own that this book does not try to cover. Our lens is narrow on purpose: we care about the slice of concept-stage learning that bears on whether the resulting molecule can be made reliably, at quality, at cost. That slice is small, under-studied, and exactly where the discovery world and the manufacturing world fail to talk to each other.

The simple version

Before an architect draws a skyscraper, someone decides which building to put up and on which plot of land. Choose a swamp and the most brilliant engineering downstream still fights the foundations forever. Drug discovery is the architecture competition — dazzling, and largely someone else's job. This chapter is the site survey: the early, unglamorous question of whether the ground can hold the building you are about to design. A molecule that "scores well" against its biological target but folds badly, aggregates, or cannot be expressed (manufactured) in CHO cells (Chinese Hamster Ovary, the workhorse production cell line) is a tower on a swamp. The cheapest place to learn that is here, at concept — and machine learning is starting to be able to tell you.

What this chapter covers

The first node of the manufacturing spine: target, mechanism of action (MoA), and modality as the choices that set everything downstream.
Where drug-discovery AI ends and manufacturing-relevant concept-stage learning begins — an honest boundary, drawn deliberately.
Target tractability as a prediction problem: how ML scores whether a target is reasonable to drug, what evidence-integration actually computes, and what that has (and has not) to do with manufacturing.
Manufacturability-by-design: the mindset, the in-silico developability signals available at concept, the real methods that compute them, and why they are weak, sparse, and worth using anyway.
The discovery → manufacturing handoff gap — a real, well-documented hole in the spine, and what a learned target-profile record would carry across it.
The anatomy of one manufacturability-aware target profile, the series signature applied to the very first decision, dissected field by field.
The assembly discipline: a runnable sketch that fuses the (illustrative) concept-stage predictions with the real downstream CQAs (Critical Quality Attributes — measured product-quality properties that must stay within spec) in one gradeable record — the chapter's actual contribution to the suite, since no confident predictor can honestly be shipped here.
The unsolved part: concept-stage manufacturability prediction is the weakest-grounded ML in this whole book, and pretending otherwise is dangerous.

The first node: target, mechanism, modality

Everything in this series — five books, one process — hangs off a spine of unit operations, and that spine has a head. Before molecule discovery generates candidate sequences, before cell-line development picks the clone that becomes WCB-CHO-001, a program must answer three coupled questions:

Target — which biological molecule (a receptor, a cytokine, an enzyme) does the disease hinge on, such that engaging it changes outcomes?
Mechanism of action (MoA) — how do we want to engage it: block it, degrade it, recruit an immune effector, deliver a payload?
Modality — what kind of molecule embodies that mechanism: a monoclonal antibody, a bispecific, an antibody-drug conjugate, a fusion protein, a cell or gene therapy?

They are coupled, not independent, and the coupling runs one way: biology constrains mechanism, mechanism constrains modality, and modality — chosen last and most casually — fixes the manufacturing problem. A target that is only reachable by blocking a protein-protein interface with a large flat epitope pushes you toward an antibody; a target that needs simultaneous engagement of two receptors on the same cell pushes you toward a bispecific; an intracellular target reachable by no biologic at all pushes you out of this book entirely. By the time the third question is answered, the manufacturing difficulty is already set, usually by people optimizing for the first.

For our running example the answers are fixed and familiar: the target is the antigen mAb-A binds, the mechanism is straightforward antigen engagement by an IgG, and the modality is a standard monoclonal antibody made in CHO. That ordinariness is the point. An IgG1 mAb in CHO is the most manufacturable thing in biologics. IgG1 is the most common antibody subclass, and the platform around it brings decades of accumulated process knowledge, Protein A capture that exploits the conserved Fc (the antibody's constant "tail" region, identical across all IgG1s; "conserved" is used here in the biology sense, meaning the same across molecules), well-understood analytics, and a release panel the whole industry shares. The platform exists because the constant region is conserved: every IgG1 presents the same Fc to the same Protein A resin and folds with the same disulfide topology, so a CDMO can run a new mAb on a process it has run a hundred times.

What is not conserved is the variable region: the Fv (the variable, antigen-binding part) and its CDRs (the loops that actually contact the target), which differ from antibody to antibody. That is exactly where the developability fight lives: two platform IgG1s can capture identically on Protein A yet differ wildly in aggregation, viscosity, and charge-variant burden, all of which trace to their non-conserved variable domains. The modality choice at concept is why BATCH-2026-001 can reach a monomer purity of 98.611% by SEC at all. Choose a fragile bispecific with two non-native chain-pairing problems, or a sticky format whose viscosity climbs until it gels at high concentration, and the same downstream machinery struggles: the molecule may not even bind the Protein A resin, the polishing windows narrow, and the high-concentration formulation a subcutaneous product needs becomes physically impossible. The first node sets the difficulty of every node after it.

This is where Book 5 diverges from a discovery textbook. A discovery scientist asks "will engaging this target help patients?" A manufacturing-minded reader asks a second question the discovery scientist often defers: "and can we make, at quality and scale, the molecule that engages it?" The two questions are not the same, and the machine learning that answers each is different. Book 4 modeled this same head-of-spine node as a knowledge-graph entity — the target and product concept that the realized drug-product lot conforms to, the head of the idea-to-vial thread (the genealogy itself roots downstream, in the frozen CHO cell bank the lot traces back to); Book 5 asks what can be learned and predicted there.

Drawing the boundary: where discovery AI ends

It would be easy, and wrong, to spend this chapter cataloguing AlphaFold-style structure prediction, generative small-molecule chemistry, and omics-driven target identification. Those are real, they are extraordinary, and they are not manufacturing. Drawing the boundary cleanly is itself a contribution, because the hype routinely blurs it — a press release about an "AI-designed drug candidate" tells you nothing about whether that candidate can be expressed in a 2,000 L bioreactor.

Here is the boundary this book uses. We treat as out of scope (acknowledged, not covered): target identification from genomics/transcriptomics, protein-structure prediction, generative de novo binder design as a discovery activity, and small-molecule cheminformatics. These are the headline acts of "AI for drug discovery," and they are someone else's chapter. We treat as in scope the concept-stage learning that produces a signal a manufacturing organization can act on:

Target tractability / druggability — but only insofar as a target's properties constrain the modality, which in turn constrains manufacturability. (A target reachable only by a membrane-spanning multispecific is a manufacturing commitment, not just a biology one.)
Manufacturability-by-design signals — in-silico predictions, available from sequence or early structure, of properties like aggregation propensity, expressibility, solubility, and viscosity that determine whether a molecule will survive cell culture, purification, and high-concentration formulation.
The handoff — the information, learned or measured at concept, that must travel down the spine for downstream models to do their jobs, and that today mostly does not.

The line is not arbitrary; it is the line of agency. A structure prediction changes how a discovery chemist reasons about a binding pocket and changes nothing a process engineer can act on. An aggregation-propensity score that flags a candidate as a downstream purification risk is the same kind of model — sequence in, property out — but it lands on a desk where someone can choose a different candidate, design a different polishing step, or budget for the fight. We keep the second kind and acknowledge the first.

The next chapter, Molecule Discovery, goes deep on the generative-design-plus-developability-prediction loop once a target is fixed and you are choosing among candidate sequences. This chapter sits one step earlier, where you are choosing the target and modality themselves — and where manufacturability is still a faint signal you can either listen for or ignore.

Target tractability as a prediction problem

The most mature concept-stage ML is target tractability: scoring how amenable a biological target is to therapeutic intervention. Framed as a learning task it is a supervised classification/ranking problem, though in practice the best-known systems are closer to evidence integration than to a single trained classifier — a distinction worth keeping straight.

The features are everything known about a target — genetic association with disease (GWAS hits, Mendelian links, rare-variant burden), expression patterns across tissues (so you can flag a target expressed everywhere and likely to cause on-target toxicity), the existence and shape of a binding pocket, prior chemical or biological matter against it, safety signals from gene knockouts and knockdowns, and network position in pathways. The label, learned from history, is whether targets like this one have yielded approved drugs — or, more weakly, have reached the clinic. The output is a tractability assessment, and the good ones split it by modality: "small-molecule tractable" versus "antibody tractable" versus "intractable today," because the answer genuinely differs by molecule class. A target buried inside the cell is antibody-intractable but may be small-molecule tractable; a cell-surface receptor with a clean extracellular domain is the reverse.

Mechanically, the canonical public system — the Open Targets platform [1] — does not train one monolithic model. It computes per-evidence scores from a dozen independent data sources, harmonizes them onto a common target-disease grid, and aggregates them into a weighted association score, then layers a separate rules-plus-evidence tractability assessment on top (structural data for pocket detection, clinical-precedence flags, ligand-based features). It is "machine learning" in the broad sense of learned weights and curated evidence pipelines rather than a deep net — and that matters for how you read its output: it is interpretable, auditable, and conservative, but it inherits every bias in its evidence base.
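To make "per-evidence scores aggregated into a weighted association score" concrete, here is a minimal sketch of the pattern. It is not Open Targets' actual formula: the source names, scores, weights, and the choice of a weighted mean are all assumptions made for illustration.

```python
# Illustrative evidence integration for ONE target-disease pair.
# Source names, scores, and weights are hypothetical; the weighted mean is a
# stand-in for whatever aggregation a real platform uses.
from typing import Dict, Optional


def association_score(evidence: Dict[str, Optional[float]],
                      weights: Dict[str, float]) -> float:
    """Weighted mean of per-source scores in [0, 1]; sources with no evidence are skipped."""
    present = {src: s for src, s in evidence.items() if s is not None}
    total_weight = sum(weights[src] for src in present)
    if total_weight == 0:
        return 0.0
    return sum(weights[src] * s for src, s in present.items()) / total_weight


# Hypothetical per-source scores (None = no evidence found for this pair).
evidence = {"genetic_association": 0.9, "expression": 0.6,
            "known_drugs": None, "literature": 0.4}
weights = {"genetic_association": 1.0, "expression": 0.5,
           "known_drugs": 1.0, "literature": 0.25}

score = association_score(evidence, weights)
# Per-source contributions keep the number auditable: each term traces to one source.
contributions = {src: weights[src] * s for src, s in evidence.items() if s is not None}
print(round(score, 3), contributions)
```

Skipping sources with no evidence is itself a design choice, and it shows how bias enters: a little-studied target is scored only on the few sources that happen to cover it.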

Open, peer-reviewed resources have made this concrete. Open Targets integrates genetic, genomic, transcriptomic, and chemical/drug evidence into scored target-disease associations and an explicit small-molecule/antibody tractability assessment — evidence-integration learning and modality-aware tractability at the head of the pipeline [1]. The framing matters for us: tractability is mostly answering "is this a good idea biologically, and reachable by some modality?" It is not answering "is the molecule that reaches it manufacturable?" Those can sharply disagree. A target that is beautifully antibody-tractable might only be engageable by a tetravalent bispecific with a known propensity to aggregate — high biological tractability, low manufacturability. The tractability model, trained on approvals, will not warn you, because plenty of hard-to-make molecules still got approved; the manufacturing pain never made it into the label. The signal that would warn you — "molecules of this modality, against epitopes like this, are expensive to purify" — is simply not in the feature set. This is the same censored-label pathology the unsolved-part section returns to: the molecules whose manufacturing pain killed them are precisely the ones absent from the training set, so the model cannot learn the boundary it most needs.

So target tractability earns its place in a manufacturing book only at one remove: it is the step where modality gets implicitly decided, and modality is the single largest lever on manufacturability. The discipline this chapter argues for is to make that implicit decision explicit — to carry a manufacturability expectation alongside the tractability score, rather than discover the consequences three years later at tech transfer, where a fragile format meets a process designed for a robust one.

Evidence

Target tractability ML (e.g., Open Targets evidence integration) is in production use as a discovery decision-support tool and rests on peer-reviewed, independent, publicly auditable infrastructure [1]. Its relevance to manufacturing is indirect and, to our knowledge, not formally validated: there is no published model that takes a target and predicts a downstream manufacturing outcome. Treat the manufacturing inference in this section as a reasoned argument from modality, not an established result.

Manufacturability-by-design: the mindset before the model

"Quality by Design" (QbD), which the data book and Book 3 both lean on, says you build quality into a process rather than testing it in at the end. Manufacturability-by-design pushes the same logic one step further upstream: build manufacturability into the molecule rather than engineering around it later. The mindset is older than the ML — protein engineers have long known to avoid free (unpaired) cysteines that scramble disulfides, Asn-Gly deamidation hotspots that create charge variants, methionine oxidation sites in the CDRs, and N-linked glycosylation sequons (Asn-X-Ser/Thr) in the wrong places. Glycosylation cuts both ways across this chapter's central boundary: an unwanted sequon is a molecule-intrinsic liability knowable at concept, but the realized glycan profile (afucosylation, high-mannose, galactosylation) that governs effector function and clearance is a process-and-host outcome no concept-stage sequence model can fix. But machine learning is beginning to turn that craft knowledge into quantitative, sequence-level predictions you can compute before a gene is even synthesized.

The vocabulary is developability: the set of biophysical properties that determine whether a candidate will survive expression, purification, formulation, storage, and administration. At concept, the relevant developability signals you can compute or estimate from sequence (and, with structure prediction, from a model of the fold) include:

Expressibility / titer potential — will CHO cells make enough of it? A molecule the production bioreactor can only push to a fraction of platform titer is a cost-of-goods problem forever; a 1 g/L molecule and a 5 g/L molecule occupy the same plant footprint, so in the simplest accounting the lower-titer molecule carries several times the upstream unit cost (illustrative — downstream and fill-finish costs do not scale with titer the same way).
Aggregation propensity — the tendency to form the high-molecular-weight species that SEC measures (recall BATCH-2026-001 carries SEC_HMW_pct = 1.287% against a 0–3% spec); high intrinsic aggregation makes every purification and formulation step harder and is the single most common developability liability that kills programs.
Solubility and viscosity — decisive for high-concentration formulation, which is where a subcutaneous mAb lives or dies; a viscous molecule may be undeliverable through a fine-gauge needle in a prefilled syringe regardless of efficacy. Viscosity is the hardest property to predict because it is driven by weak, concentration-dependent self-association, not by any single residue.
Chemical and conformational stability — deamidation, oxidation, isomerization, and unfolding liabilities that surface as charge variants (CEX) and as degradation over shelf life; these set the cold-chain burden and the shelf-life claim.
Immunogenicity and sequence liabilities — humanness scores and predicted MHC-II / T-cell epitopes, which bear on both clinical safety and the analytical burden downstream.
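The simplest of these signals, the craft-knowledge sequence liabilities (unpaired cysteines, Asn-Gly deamidation hotspots, methionine oxidation sites, N-linked glycosylation sequons), can be scanned with a few regular expressions. The sketch below is a hedged illustration, not a validated tool: the function name and the toy peptide are invented, the sequon pattern assumes the usual rule that X in Asn-X-Ser/Thr is any residue except proline, and an odd cysteine count is only a crude hint of an unpaired cysteine.

```python
# Minimal sequence-liability scan (illustrative; positions are 1-based).
import re
from typing import Dict, List


def scan_liabilities(seq: str) -> Dict[str, List[int]]:
    """Return 1-based start positions of common sequence liability motifs."""
    seq = seq.upper()
    return {
        # N-linked glycosylation sequon: Asn-X-Ser/Thr, X != Pro (lookahead allows overlaps)
        "n_glyc_sequon": [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", seq)],
        # Asn-Gly deamidation hotspot
        "deamidation_NG": [m.start() + 1 for m in re.finditer(r"(?=NG)", seq)],
        # Every methionine is listed; whether it matters depends on where it sits
        "met_oxidation": [i + 1 for i, aa in enumerate(seq) if aa == "M"],
        "cysteines": [i + 1 for i, aa in enumerate(seq) if aa == "C"],
    }


toy_peptide = "AQNGSMCKNPTCAC"  # invented toy string, not a real antibody sequence
hits = scan_liabilities(toy_peptide)
# An odd number of cysteines means at least one cannot be in a disulfide pair.
unpaired_cysteine_suspected = len(hits["cysteines"]) % 2 == 1
print(hits, unpaired_cysteine_suspected)
```

The scan knows nothing about regions: deciding whether a flagged methionine or sequon sits in a CDR still requires an annotation step this sketch does not attempt.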

The peer-reviewed picture in 2024–2025 is genuinely encouraging for antibody developability specifically, and it is worth being precise about what the real methods do. The dominant approach is supervised regression/classification on engineered features: take an antibody sequence, build a model of its Fv structure — the variable, antigen-binding region of the antibody, often predictable directly from sequence, which is what makes the whole thing scale to thousands of candidates — then compute physicochemical descriptors — surface hydrophobicity patches, net charge and charge asymmetry, the patch dipole (an asymmetry in surface charge), CDR-region properties — and regress measured developability assays on those descriptors.

PROPERMAB is an integrative framework that predicts multiple antibody developability metrics — hydrophobic-interaction-chromatography (HIC) retention time, a lab measure of how sticky a molecule is on a hydrophobic column; high-concentration viscosity; and others — from sequence- and structure-derived molecular features, with the structure features predictable directly from sequence so the whole pipeline runs at repertoire scale [2].
For the hardest property, DeepViscosity attacks high-concentration viscosity head-on. It is an ensemble of 102 artificial neural networks, trained on a relatively large and diverse panel of 229 mAbs, that classifies a molecule as low-viscosity (at or under 20 cP) or high-viscosity (above 20 cP) at 150 mg/mL, the therapeutic concentration at which the panel was measured: a large-data ensemble model for the property that defeats simpler descriptors [3].
And interpretable sequence-based viscosity models go one step further: they not only predict low-viscosity IgG1 variants from the Fv-region sequence but, because the model is interpretable, point to the specific residues driving viscosity, enabling designed mutations that experimentally reduce it — prediction plus a redesign handle [4].
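The "descriptors first, regression second" pattern behind these methods can be sketched at toy scale. The code below computes three crude, sequence-only descriptors; it is a stand-in under stated assumptions, not any published method's feature set. The real tools use structure-derived surface patches, whereas this sketch counts charged residues (ignoring histidine and chain termini), uses the Kyte-Doolittle hydropathy scale, and defines charge asymmetry simply as VH charge minus VL charge. The function name and the toy fragments are invented.

```python
# Crude sequence-only Fv descriptors (illustrative stand-in for structure-based features).
from typing import Dict

# Kyte-Doolittle hydropathy scale
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def net_charge(seq: str) -> int:
    """Approximate charge near neutral pH: (K + R) minus (D + E)."""
    return sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")


def fv_descriptors(vh: str, vl: str) -> Dict[str, float]:
    """Descriptors that a downstream regression on measured assays could consume."""
    fv = vh + vl
    return {
        "net_charge": net_charge(vh) + net_charge(vl),
        "charge_asymmetry": net_charge(vh) - net_charge(vl),  # assumed definition
        "mean_hydropathy": sum(KD[a] for a in fv) / len(fv),
    }


# Invented four-residue fragments, only to show the arithmetic.
descriptors = fv_descriptors(vh="KRKD", vl="DEEK")
print(descriptors)
```

A zero net charge with a large asymmetry, as in this toy case, is the kind of pattern a whole-molecule charge number hides, which is one reason the published models carry several charge descriptors rather than one.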

The Therapeutic Antibody Profiler and similar flag-based screens established the prior generation of this idea: compute a handful of biophysical descriptors and compare a candidate against the distribution of approved antibodies (the clinical-stage antibody landscape), flagging outliers that sit in the tails. That comparison-to-approved-distribution logic is exactly the "is this molecule like the ones that got made?" prior, made quantitative.
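The comparison-to-approved-distribution logic reduces to a percentile check. The sketch below is illustrative only: the reference values are invented, the 5% tail cutoff is an assumption, and neither reflects the actual descriptors or thresholds of any published profiler.

```python
# Flag a candidate whose descriptor falls in the tails of a reference distribution.
from typing import List


def percentile_rank(value: float, reference: List[float]) -> float:
    """Fraction of reference values at or below the candidate's value."""
    return sum(r <= value for r in reference) / len(reference)


def tail_flag(value: float, reference: List[float], tail: float = 0.05) -> str:
    """Label a value as low-tail, high-tail, or in-range relative to the reference set."""
    rank = percentile_rank(value, reference)
    if rank <= tail:
        return "low-tail"
    if rank >= 1 - tail:
        return "high-tail"
    return "in-range"


# Hypothetical descriptor values for 20 reference antibodies (made up for illustration).
reference_patch_scores = [float(x) for x in range(1, 21)]

typical = tail_flag(10.0, reference_patch_scores)
sticky = tail_flag(25.0, reference_patch_scores)
print(typical, sticky)
```

A flag here says only that the candidate is unlike the molecules that were made before, not that it will fail; that is the sense in which the check is a prior rather than a prediction.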

Two honest caveats keep this from being a solved problem. First, almost all of this maturity is antibody-specific — the moment a program chooses a non-mAb modality, the predictive tooling thins out dramatically, exactly when manufacturability risk is highest (a bispecific has chain-pairing and stability failure modes no mAb descriptor captures — heavy-chain mispairing yields homodimer and half-antibody byproducts that the conserved-Fc Protein A step cannot cleanly resolve, which is why heterodimerization engineering such as knob-into-hole exists at all). Second, most developability ML operates on candidate sequences (a Chapter 5 concern) rather than on the target/modality choice this chapter is about; the concept-stage version is more of a prior — "molecules of this class, against targets like this, tend to be hard/easy to make" — than a precise prediction. The signal at concept is real but coarse, and the published accuracies, being model-and-dataset specific, are evidence the approach works on curated panels, not a guarantee on the next molecule from a new epitope class.

Evidence

Antibody developability prediction (PROPERMAB feature-based regression [2]; DeepViscosity ensemble classification [3]; interpretable sequence-based viscosity ML [4]) is at the research stage, trending toward decision support; it is peer-reviewed and largely independent or academic. Reported accuracies are model-and-dataset specific and, like every concept-stage number in this book, should be read as evidence the approach works on curated data, not as a guarantee on your next molecule. No tier-1 evidence exists for concept-stage manufacturability prediction of the target/modality choice itself.

The handoff gap: the hole at the head of the spine

Here is the most important — and least flattering — fact in this chapter. The information generated at concept and discovery, the very information that would let downstream manufacturing models do their jobs, largely fails to travel down the spine. Discovery and manufacturing are different organizations, on different timelines, with different data systems, different vocabularies, and different incentives. The discovery group optimizes for binding affinity and efficacy and hands off a sequence; the manufacturing group inherits that sequence and re-discovers, empirically and expensively, the developability properties that were often knowable — or even computed and then discarded — at concept.

This is recognized in the field as a genuine spine gap, not a niche complaint. The phenomenon has been documented directly: developability liabilities knowable at the discovery/concept stage are routinely re-discovered downstream because discovery and manufacturing operate on different timelines, tools, and incentives [5]. It is why so much of what this book describes downstream is, in effect, paying for a decision made upstream without manufacturing in the room. Not every failure belongs to this chapter, though. The out-of-specification (OOS) sibling BATCH-2026-004 in our running dataset fails on HCP_ng_per_mg = 128.0 against a spec ceiling of 100.0. Host-cell proteins come from the CHO host and are carried through purification; their clearance is set by cell line, harvest, and polishing, not by the molecule's sequence. That makes BATCH-2026-004 a process failure, not a molecule failure, and so not one this chapter could have predicted. But many real OOS events and many "this molecule is just hard" verdicts do trace back to a concept-stage choice that nobody flagged because the information never crossed the handoff: the aggregation that fouls a polishing column, the hydrophobic patch that smears across HIC, the self-association that gels the drug substance. The honest distinction matters: concept-stage ML owns the molecule-intrinsic liabilities, not the process excursions, and BATCH-2026-004 is the example that keeps the two from being conflated.
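The two verdicts quoted in this chapter can be reproduced with a small spec check. The limits and values below are the ones stated in the text (SEC_HMW_pct 0–3%, HCP ceiling 100.0); the function name, the table layout, and the "liability owner" labels are assumptions that encode this chapter's molecule-versus-process distinction, not fields of the committed dataset.

```python
# Spec check plus a coarse "who owns this liability" label (illustrative).
from typing import Dict, Optional, Tuple

# (lower limit, upper limit); None means the side is unbounded.
SPECS: Dict[str, Tuple[Optional[float], Optional[float]]] = {
    "SEC_HMW_pct": (0.0, 3.0),
    "HCP_ng_per_mg": (None, 100.0),
}

# Hypothetical attribution following the chapter's argument.
LIABILITY_OWNER = {
    "SEC_HMW_pct": "molecule-intrinsic (in part)",
    "HCP_ng_per_mg": "process",
}


def check_spec(test: str, value: float) -> str:
    """Return PASS if the value sits inside the spec window, otherwise OOS."""
    low, high = SPECS[test]
    if low is not None and value < low:
        return "OOS"
    if high is not None and value > high:
        return "OOS"
    return "PASS"


golden = check_spec("SEC_HMW_pct", 1.287)      # BATCH-2026-001
sibling = check_spec("HCP_ng_per_mg", 128.0)   # BATCH-2026-004
print(golden, sibling, LIABILITY_OWNER["HCP_ng_per_mg"])
```

Keeping the owner label beside the verdict is what stops a process excursion from being scored against a concept-stage prediction that never claimed to cover it.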

What would close the gap is not a clever model so much as a discipline of carrying a structured record across the boundary — the same contextualization discipline Book 2 applies to a data point and Book 4 applies to a knowledge-graph node. A learned, manufacturability-aware target profile would travel with the program: the target and modality, the tractability evidence, every in-silico developability prediction with its uncertainty, and an explicit manufacturability expectation that downstream models can read, test against reality, and feed back. The technology to compute the pieces increasingly exists. The pipe to carry them — and the organizational will to act on a manufacturing signal that costs the discovery team time and may kill their favorite candidate — mostly does not. That is the unglamorous frontier.

What makes the pipe trustworthy rather than just present is that the record is semantically grounded — bound to a shared vocabulary, not to brittle column names. Book 4 gives the target/concept node a stable identity: it is a knowledge-graph entity reached by an IRI (an Internationalized Resource Identifier — a globally unique, web-style name for a thing, the same identifier scheme Book 4 uses for every tag and lot), and the genealogy spine reaches it by the typed bp:derivedFrom edge that roots every realized lot, hop by hop, in the frozen CHO cell bank. The practical payoff is concrete. A developability feature pulled by its ontology IRI — bp:aggregationPropensity of this program node — survives a system migration, a column rename, and a vendor swap that would silently break a feature keyed to a spreadsheet header; that is the difference between a feature an audit can trace and one that quietly points at the wrong column after a refactor. And the same derivedFrom edge that the knowledge graph uses for lineage is exactly the grouping key a leave-one-batch-out cross-validation needs (the batch-grouped splitting Part I made non-negotiable): two release lots that share a parent must travel to the same side of the train/test cut, and the only audit-grade way to know they share a parent is to walk the typed lineage edge rather than to string-match a batch_id. The ontology does not just store the record — it is what lets a model join and split it correctly.
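A lineage-keyed split can be sketched without any graph library. In the sketch below the derivedFrom edges are a plain dictionary and every identifier is hypothetical; a real implementation would query the knowledge graph for the typed edge rather than hard-code it.

```python
# Leave-one-group-out folds where the group is the parent reached via a derivedFrom edge.
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

# Hypothetical lineage: child -> parent (stand-in for the typed bp:derivedFrom edge).
derived_from: Dict[str, str] = {
    "LOT-A1": "DS-A", "LOT-A2": "DS-A", "LOT-B1": "DS-B",
    "DS-A": "WCB-X", "DS-B": "WCB-X",
}
lots: List[str] = ["LOT-A1", "LOT-A2", "LOT-B1"]


def leave_one_parent_out(
    lots: List[str], derived_from: Dict[str, str]
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_parent, train_lots, test_lots); siblings never straddle the cut."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for lot in lots:
        groups[derived_from[lot]].append(lot)
    for held_out in sorted(groups):
        train = [lot for parent in sorted(groups) if parent != held_out
                 for lot in groups[parent]]
        yield held_out, train, groups[held_out]


folds = list(leave_one_parent_out(lots, derived_from))
for held_out, train, test in folds:
    print(held_out, "train:", train, "test:", test)
```

With scikit-learn available, the same grouping key could be passed as the groups argument to LeaveOneGroupOut; what matters is that the key comes from the lineage edge, not from parsing a batch_id string.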

The first node of the spine, learned: target, mechanism, and modality feed a tractability model and a panel of coarse-but-real in-silico developability signals; the dashed handoff gap is the documented hole through which manufacturability knowledge usually leaks away, and the manufacturability-aware target profile is the record that should — but rarely does — carry it across. Original diagram by the authors, created with AI assistance.

A worked sketch: assembling a concept-stage target profile

This chapter contributes no heavyweight new model — concept-stage manufacturability is too weakly grounded to ship a confident predictor, and saying so honestly is part of the lesson. What it can show is the assembly discipline: pulling the running example's identity together with the kind of concept-stage signals that should accompany it, so the record exists to be carried across the handoff. The light helper lives alongside the suite's shared examples/platform/ml/dataio.py loader (which the rest of Book 5 uses to read the committed datasets), and reads the genealogy and release data we already have to anchor the profile in real values rather than invented ones. The design rule is strict: outcomes are real, predictions are labelled illustrative — the snippet is allowed to make up a developability score (no validated concept-to-manufacturing model exists to produce a real one) but is never allowed to make up a CQA, which it reads from the committed release dataset.

# examples/platform/ml/target_profile.py
# Concept-stage target profile for mAb-A. Honest by construction: the manufacturing
# OUTCOMES are real (read from the release dataset); the concept-stage PREDICTIONS are
# labelled illustrative, because no validated concept->manufacturing model exists yet.
# A custom __str__ gives a deterministic layout so the chapter can embed it verbatim.
from dataclasses import dataclass, field
import pandas as pd
from dataio import DATASETS  # shared loader used across examples/platform/ml/

@dataclass
class TargetProfile:
    program: str
    target: str
    mechanism: str
    modality: str
    tractability_score: float          # discovery-side, modality-aware (illustrative)
    # Concept-stage in-silico signals (illustrative). 'ci' is a symmetric epistemic
    # uncertainty half-width (+/-), not a frequentist confidence interval:
    # score 0.78 with ci 0.15 means 0.78 +/- 0.15.
    developability: dict
    realized_cqas: dict = field(default_factory=dict)  # what actually happened, downstream

    def __str__(self) -> str:
        # Hand-laid-out record: one developability signal per line, keys aligned.
        rows = [(repr(k) + ":", repr(v)) for k, v in self.developability.items()]
        width = max((len(label) for label, _ in rows), default=0)
        dev = ",\n".join(f"    {label:<{width}} {body}" for label, body in rows)
        return (
            f"TargetProfile(program={self.program!r}, target={self.target!r},\n"
            f"  mechanism={self.mechanism!r}, modality={self.modality!r},\n"
            f"  tractability_score={self.tractability_score!r},\n"
            f"  developability={{\n{dev}}},\n"
            f"  realized_cqas={self.realized_cqas!r})"
        )

def realized_cqas_for(batch_id: str) -> dict:
    """The downstream truth this concept choice eventually produced: REAL values."""
    df = pd.read_csv(DATASETS / "hplc_results.csv")
    # One row per test for the requested batch, indexed by test name.
    b = df[df["batch_id"] == batch_id].set_index("test")
    return {
        "SEC_monomer_pct": float(b.loc["SEC_monomer_pct", "value"]),
        "SEC_HMW_pct":     float(b.loc["SEC_HMW_pct", "value"]),   # aggregation, realized
        "release":         "PASS" if (b["result"] == "PASS").all() else "OOS",
    }

profile = TargetProfile(
    program="mAb-A",
    target="(antigen, discovery-defined)",
    mechanism="antigen engagement (IgG1)",
    modality="monoclonal antibody, CHO",
    tractability_score=0.81,           # illustrative
    developability={                   # illustrative concept-stage predictions, each with uncertainty
        "expressibility":   {"score": 0.78, "ci": 0.15, "note": "platform IgG1, high titer prior"},
        "aggregation":      {"score": 0.12, "ci": 0.10, "note": "low predicted HMW propensity"},
        "viscosity_risk":   {"score": 0.20, "ci": 0.18, "note": "below high-conc concern threshold"},
    },
    realized_cqas=realized_cqas_for("BATCH-2026-001"),
)
print(profile)

Running it prints the record (the print(profile) stdout, then the appended interpretation comment lines):

TargetProfile(program='mAb-A', target='(antigen, discovery-defined)',
  mechanism='antigen engagement (IgG1)', modality='monoclonal antibody, CHO',
  tractability_score=0.81,
  developability={
    'expressibility': {'score': 0.78, 'ci': 0.15, 'note': 'platform IgG1, high titer prior'},
    'aggregation':    {'score': 0.12, 'ci': 0.1, 'note': 'low predicted HMW propensity'},
    'viscosity_risk': {'score': 0.2, 'ci': 0.18, 'note': 'below high-conc concern threshold'}},
  realized_cqas={'SEC_monomer_pct': 98.611, 'SEC_HMW_pct': 1.287, 'release': 'PASS'})

# The concept-stage prediction (low aggregation, score 0.12) is CONSISTENT
# with the realized SEC_HMW_pct of 1.287%
# (well inside the 0..3% spec) — but a single consistent example proves
# nothing. The point is structural: predictions and outcomes now live in
# ONE record that can be carried across the handoff and graded as data accrues.

The design rule does all the work: the developability scores are labelled illustrative (placeholders for the kind of signal a concept-stage tool would emit, each carrying the uncertainty band the Anatomy section reads field by field), while the numbers that are real — 98.611, 1.287, the PASS verdict — come straight from hplc_results.csv for the golden batch, read through the same DATASETS loader every other chapter uses, so the running example's IDs and values stay identical across the book. Putting the weak early prediction and the eventual hard outcome in the same record is the whole point: one consistent example is not validation, but a thousand such records accrued over years are the only training set that could ever close the gap — and they only exist if someone builds the record now.
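What "graded as data accrues" could mean in practice is sketched below with a Brier score, a standard way to score probabilistic predictions against binary outcomes. Everything specific here is an assumption: the records are invented, the aggregation score is treated as a probability of an aggregation problem, and the binary outcome stands for whatever threshold on realized SEC_HMW_pct a program chooses.

```python
# Grading accrued (prediction, outcome) pairs with a Brier score (illustrative).
from typing import List, Tuple

# Hypothetical accrued records: (predicted aggregation risk, realized problem 0/1).
records: List[Tuple[float, int]] = [(0.12, 0), (0.70, 1), (0.30, 0), (0.55, 1)]


def brier_score(pairs: List[Tuple[float, int]]) -> float:
    """Mean squared gap between predicted probability and realized outcome (lower is better)."""
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)


model_brier = brier_score(records)

# Reference point: a "model" that always predicts the observed base rate.
base_rate = sum(y for _, y in records) / len(records)
baseline_brier = brier_score([(base_rate, y) for _, y in records])

print(round(model_brier, 6), round(baseline_brier, 6))
```

Four records prove nothing, which is the point of the surrounding argument: the comparison against the base-rate baseline only becomes meaningful once enough real records exist.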

Two properties of that record make it admissible rather than merely convenient, and both come from the ontology side of the series. First, the prediction and the outcome are not the same kind of thing, and the model must never conflate them: in BFO terms (the Basic Formal Ontology, the upper ontology Book 4 aligns to) the developability score is a continuant — a standing claim about the molecule that persists — while the realized CQA is the outcome of an occurrent, the actual culture-and-purification run that happened years later, so the card keeps the predicted aggregation and the measured SEC_HMW_pct in separate blocks rather than overwriting one with the other. Second, before such a record is fed to any downstream model, the same closed-world discipline that gates a release lot can gate the training input. Book 4's release gate is a SHACL shape (the Shapes Constraint Language — a standard for validating that graph data is present, singular, and in range) that fails a lot whose CQA panel is incomplete; a concept-completeness shape is the same idea pointed upstream — a profile that is missing its modality, or carries a developability score with no uncertainty band, or has lost its program key, is rejected before it can poison a feature matrix:

# A concept-completeness shape (illustrative): the same closed-world gate the release
# uses, pointed at the training input. A profile that fails this is not a usable row.
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix bp: <https://example.org/bp#> .   # placeholder namespace (assumption); substitute the series' own bp: IRI
bp:TargetProfileShape a sh:NodeShape ;
    sh:targetClass bp:TargetProfile ;
    sh:property [ sh:path bp:program  ; sh:minCount 1 ; sh:maxCount 1 ] ;   # the join key
    sh:property [ sh:path bp:modality ; sh:minCount 1 ;                       # the difficulty lever
                  sh:message "Modality missing: the largest manufacturability lever is unrecorded." ] ;
    sh:property [ sh:path bp:developabilityUncertainty ; sh:minCount 1 ;
                  sh:message "A concept-stage score with no uncertainty band invites false confidence." ] .

This is the FAIR (Findable, Accessible, Interoperable, Reusable) and validation machinery of the data and ontology books doing real ML work. The shape above guarantees the training input is complete the same way the release gate guarantees a lot's CQA panel is, and adding range constraints such as sh:minInclusive and sh:maxInclusive would extend the same gate to in-range checks, so "garbage in" is caught at the gate rather than discovered three folds into cross-validation.
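For readers who want to see the gate's logic without an RDF stack, here is a minimal plain-Python analogue of the three checks. It is not SHACL and not the suite's code: the function name, the dict layout, and the wording of the program-key message are assumptions made for illustration, and it checks the uncertainty band per developability entry, which is slightly stricter than the shape's single minCount.

```python
# Plain-Python analogue of the concept-completeness gate (illustrative only).
def completeness_violations(profile):
    """Return human-readable violations; an empty list means the row is usable."""
    violations = []

    # Join key: must be present and be a single non-empty string.
    program = profile.get("program")
    if not isinstance(program, str) or not program.strip():
        violations.append("Program key missing: the profile cannot join the genealogy.")

    # Modality: the difficulty lever must be recorded.
    if not profile.get("modality"):
        violations.append("Modality missing: the largest manufacturability lever is unrecorded.")

    # Every developability score must carry its uncertainty band.
    for name, entry in profile.get("developability", {}).items():
        if entry.get("ci") is None:
            violations.append(
                f"{name}: a concept-stage score with no uncertainty band invites false confidence."
            )
    return violations


complete = {
    "program": "mAb-A",
    "modality": "monoclonal antibody, CHO",
    "developability": {"aggregation": {"score": 0.12, "ci": 0.10}},
}
broken = {
    "program": "mAb-A",
    "developability": {"aggregation": {"score": 0.12}},  # no modality, no ci
}

print(completeness_violations(complete))
for message in completeness_violations(broken):
    print("REJECTED:", message)
```

The complete profile yields an empty list; the broken one is rejected twice, once for the missing modality and once for the score that arrived without its band.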

Anatomy of a manufacturability-aware target profile

Every chapter in this series dissects one record. Here the record is the artifact that should cross the handoff gap: a concept-stage target profile that fuses the discovery decision with a manufacturing expectation and, eventually, the realized outcome. It is the head-of-spine analogue of the soft-sensor prediction record and the batch node — the same idea that a useful artifact carries its provenance, its uncertainty, and the means of its own grading.

One target profile, unpacked: the discovery-side choice (target, mechanism, modality, tractability), the weak-but-explicit concept-stage developability predictions each with its uncertainty, the realized CQAs filled in years later from real release data, and the relationships that should carry it across the handoff gap and feed the grading loop back. The card's whole purpose is to make a concept-stage choice auditable against what it eventually produced. Original diagram by the authors, created with AI assistance.

Read the card field by field, the way a CMC lead inheriting the program would.

Header — program and node. mAb-A at node target-and-concept: the primary key that lets this record join the genealogy and lets every downstream artifact (WCB-CHO-001, BATCH-2026-001) trace its lineage back to a concept decision. Without this key, the profile is an orphan and the handoff cannot close.
Discovery block — target, mechanism, modality, tractability_score. The inherited decision, and crucially every field here is labelled discovery-side, so no one mistakes a binding-driven choice for a manufacturing-vetted one. target is deliberately (antigen, discovery-defined) — a placeholder, because the specific antigen is the discovery team's to name and is irrelevant to the manufacturing argument. modality ("monoclonal antibody, CHO") is the field that actually determines difficulty. tractability_score (0.81) is tagged illustrative and modality-aware — it is a biology signal that has entered this record only because modality rode in on it.
Manufacturability block — developability. This chapter's contribution. Each entry is a triple of score, ci (the uncertainty band, drawn as a horizontal bar), and a human-readable note. The ci is not decoration: a concept-stage prediction with a hidden uncertainty is worse than no prediction, because it invites false confidence in a number that is barely better than a prior, and at concept this band is overwhelmingly epistemic — uncertainty about the model and its sparse training data, not measurement noise — which is precisely why it can dwarf the point estimate. expressibility 0.78 ± 0.15, aggregation 0.12 ± 0.10, viscosity_risk 0.20 ± 0.18 — three properties, each coarse, each honest about being coarse.
Realized-outcome block — realized_cqas. The green core, deliberately empty at concept and filled in years later from real data: SEC_monomer_pct = 98.611, SEC_HMW_pct = 1.287, release = PASS. This is what turns the profile from a guess into a record that can be graded — the predicted low aggregation (0.12) can finally be checked against the realized 1.287% HMW. It is the only block in the card that is not illustrative, and it is the reason the card is worth building.
Relationships panel. Where the handoff lives: a forward edge across the dashed gap to molecule discovery, cell-line development, and the production bioreactor — the nodes that inherit the choice — and a backward feedback edge, grade-the-prediction, that closes the loop once outcomes accrue. The forward edge is what should travel and rarely does; the backward edge is what would make the next program's prediction better and almost never exists today.
Footer — the asymmetry. The same asymmetry that haunts every soft sensor in this book: the prediction is cheap and now; the truth is expensive and later, arriving only after a multi-year, multi-million-dollar program runs to commercial manufacturing. That asymmetry is the whole reason the record must persist the prediction at the time it is made — by the time the truth arrives, no one remembers (or trusts) what was predicted unless it was written down with its uncertainty.
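A small worked check makes the manufacturability block's point about uncertainty concrete. The sketch below takes the three illustrative entries from the card and expresses each band as a fraction of its point estimate, assuming ci is a symmetric half-width, as the ± notation above suggests. The helper name is hypothetical.

```python
# Relative width of each concept-stage uncertainty band (values from the card).
developability = {
    "expressibility": {"score": 0.78, "ci": 0.15},
    "aggregation":    {"score": 0.12, "ci": 0.10},
    "viscosity_risk": {"score": 0.20, "ci": 0.18},
}


def relative_band(entry):
    """Band half-width as a fraction of the point estimate."""
    return entry["ci"] / entry["score"]


relative = {name: relative_band(entry) for name, entry in developability.items()}
for name, ratio in relative.items():
    print(f"{name:<15} band is {ratio:.0%} of the point estimate")
```

The expressibility band comes out at roughly a fifth of its score, while the aggregation and viscosity bands are nearly as large as the scores they qualify. That is the sense in which these numbers are coarse, and why a card that dropped the ci field would misrepresent them.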

The unsolved part: this is the weakest-grounded ML in the book

It would be dishonest to dress this chapter as a success story. Concept-stage manufacturability prediction is, by a wide margin, the least well-validated machine learning in Book 5, and the reasons are structural rather than fixable with more compute. Three reinforce each other into a trap.

The first reason is the cold-start problem at its most extreme: the binding constraint Part I named for all of bioprocess ML, here in its purest form [6]. To learn "which concept-stage choices lead to manufacturable molecules," you need many programs that each went from concept all the way to commercial manufacturing with the concept-stage features recorded. A large company takes only a few such programs to commercial manufacturing per year; most candidates die before manufacturing for reasons unrelated to manufacturability (efficacy, safety, portfolio); and the survivors take years, often a decade, to produce a label. The training set is not merely small; it is almost nonexistent, it grows at the speed of drug development itself, and every row that does exist took hundreds of millions of dollars to generate. No amount of compute manufactures the data.

The second reason is survivorship bias baked into every label. The only molecules with a "manufacturable" label are the ones that got made — which means the model never sees the molecules that were killed for manufacturability reasons at concept, and so never learns the failure boundary it most needs. It is the censored-data problem in its harshest form: the negatives that matter are exactly the ones removed from the dataset before training. A tractability model trained on approvals has the identical blind spot pointed the other way: hard-to-make molecules that were approved anyway teach the model that difficulty does not matter, because difficulty never made it into a label.
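The censoring effect is easy to reproduce on synthetic data. The toy simulation below is a sketch under loudly stated assumptions: every number in it is made up, "difficulty" is a single latent value drawn uniformly at random, candidates above one threshold are killed at concept, and candidates above a lower threshold would fail in manufacturing. None of this is calibrated to real programs; it only shows the mechanism.

```python
import random


def simulate_survivorship(n=10_000, kill_above=0.6, fail_above=0.5, seed=0):
    """Toy model: only candidates that survive concept ever receive a label."""
    rng = random.Random(seed)
    population = [rng.random() for _ in range(n)]  # synthetic latent difficulty

    # Candidates judged too hard are killed at concept and never labelled.
    labelled = [d for d in population if d <= kill_above]

    def failure_rate(difficulties):
        return sum(d > fail_above for d in difficulties) / len(difficulties)

    return {
        "true_failure_rate": failure_rate(population),
        "labelled_failure_rate": failure_rate(labelled),
        "hardest_labelled": max(labelled),
    }


result = simulate_survivorship()
print(f"failure rate, full population : {result['true_failure_rate']:.2f}")
print(f"failure rate, labelled subset : {result['labelled_failure_rate']:.2f}")
print(f"hardest candidate ever labelled: {result['hardest_labelled']:.2f}")
```

The labelled subset shows a much lower failure rate than the full population, and it contains no example above the kill threshold at all, so a model trained on it has nothing to learn the failure boundary from.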

The third reason is the handoff gap itself as a data problem. Even where a program does run end to end, the concept-stage features and the manufacturing outcomes live in different systems, were never linked by a common key, and often the concept-stage predictions were never persisted at all. You cannot train a model to bridge a gap whose two sides were never connected — which is precisely why the assembly-discipline record above matters more, today, than any predictor. The record is not a deliverable in lieu of a model; it is the precondition for ever having one. Until programs routinely persist a keyed, uncertainty-tagged concept profile that later joins to release data, the training set for concept-stage manufacturability ML literally cannot be assembled, no matter how good the methods become.
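The data problem described here reduces to a join that either has a key or does not. The sketch below is illustrative only: the function name and flat dict layout are assumptions, the mAb-A values are the ones used throughout the chapter, and the second profile is a deliberately made-up row that stands in for a program whose outcome never reached the record.

```python
def join_on_program(profiles, outcomes):
    """Attach realized outcomes to concept profiles by their shared program key."""
    joined, orphans = [], []
    for profile in profiles:
        outcome = outcomes.get(profile["program"])
        if outcome is None:
            orphans.append(profile["program"])  # prediction with nothing to grade against
        else:
            joined.append({**profile, **outcome})
    return joined, orphans


profiles = [
    {"program": "mAb-A", "predicted_aggregation": 0.12, "ci": 0.10},
    # Made-up row for illustration: a persisted prediction with no linked outcome.
    {"program": "HYPOTHETICAL-B", "predicted_aggregation": 0.40, "ci": 0.25},
]
outcomes = {
    "mAb-A": {"SEC_HMW_pct": 1.287, "release": "PASS"},
}

joined, orphans = join_on_program(profiles, outcomes)
print(f"trainable rows: {len(joined)}, orphaned predictions: {orphans}")
```

Only rows that survive the join can ever become training data. Profiles that were never persisted do not even appear as orphans, which is the harder half of the gap.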

The honest verdict, then, is the one Part I set up and the final chapter will return to: at the head of the spine, ML is a structured-prior generator, not an oracle. It can surface modality-level manufacturability concerns ("high-concentration viscosity risk for this format," "non-platform polishing likely for this bispecific"), encode them with uncertainty, and carry them forward to be tested — and that is genuinely valuable. It cannot tell you, at concept, whether your specific molecule will hit 98.611% monomer or land in the HCP ditch that sank BATCH-2026-004. Anyone selling the second thing is selling the swamp.

What this chapter adds to the model suite

This chapter is deliberately light on new code, in proportion to how lightly grounded the science is. It contributes:

examples/platform/ml/target_profile.py — the small, honest helper sketched above: a TargetProfile dataclass that fuses the discovery decision, the (illustrative) concept-stage developability predictions with explicit uncertainties, and the real realized CQAs read from hplc_results.csv. Its purpose is the assembly discipline, not prediction.
A reliance on the suite's shared examples/platform/ml/dataio.py loader, which every later chapter uses to read the committed datasets, keeping the running example's IDs and values identical across the book.

No model is trained here on purpose. The contribution is a record schema and a stance: make the weak early signal explicit, attach its uncertainty, bind it to the outcome it will eventually be graded against, and build the pipe that carries it across the handoff. The heavyweight predictors begin in Chapter 5, where there is enough data — candidate sequences with measured developability — to actually learn.

Even this light helper honors the suite's reproducibility contract, which is the whole reason the open-source companion is trustworthy rather than just illustrative. target_profile.py reads only committed datasets through the shared loader, makes no network call, and — because its real numbers come from a fixed hplc_results.csv rather than from any sampled draw — prints the same record on every machine, so the verbatim stdout embedded above is something a reader can regenerate, not take on faith; the suite-wide run_all.py driver and the fixed SEED that the open-source analytics stack leans on are what keep the rest of the book's trained models equally re-runnable. The contract is deliberately strict here: a chapter that openly refuses to ship a confident predictor has to be doubly honest that the one thing it does ship is exactly reproducible.

Why it matters

The whole rest of Book 5 is, in a sense, the consequence of this node. Every soft sensor, every hybrid model, every computer-vision inspector downstream is working on a molecule and a modality chosen here. Get the first node right — pick a manufacturable modality, carry a manufacturability-aware profile across the handoff, listen to the weak concept-stage signal instead of ignoring it — and the downstream models have an easier job and the process has fewer fights. Get it wrong, or simply fail to carry the information, and the most sophisticated downstream ML in the world is reduced to mitigating a decision it was never allowed to influence. Manufacturability really is chosen at concept; the only question is whether anyone with manufacturing knowledge — human or model — was in the room when it was chosen. That is why a manufacturing book has to start here, even though the ML here is the thinnest in the book.

In the real world

The honest industry picture matches the honest scientific one. Target tractability decision-support (Open Targets and commercial equivalents) is routine in discovery, but as a biology tool [1]; its manufacturing relevance is rarely formalized, because no one has wired its modality output to a manufacturing-outcome model. Antibody developability prediction is the brightest spot: peer-reviewed, increasingly accurate sequence-and-structure models for aggregation, viscosity, and related liabilities are real and being adopted, though they sit mostly at the candidate-sequence stage of Chapter 5 rather than the target/modality stage [2][3][4]. A few large companies have stood up "manufacturability index" data lakes that score and rank candidates against accumulated internal manufacturing history. The CLD 4.0 line of work, which digitalizes cell-line data into a structured lake, computes a cell-line manufacturability index, and runs ML risk assessment over it, points in this direction [7]. But these are first-party systems whose performance is self-reported, and they live closer to cell-line development than to concept. In effect they are an early developability assessment (the CMC discipline that already feeds a program's Quality Target Product Profile, or QTPP) pulled forward. That is the telling pattern, and the gap this chapter names: the manufacturability scoring that does exist has settled at the later stage where data is dense (clones), and almost never reaches the earliest stage, the choice of target and modality at concept, where the leverage is highest.

On governance, this node sits before the Good Manufacturing Practice (GMP) boundary, so it is not directly touched by the draft EU/PIC/S GMP Annex 22 restrictions on AI in critical GMP tasks [8] or the FDA's model-credibility framework, which explicitly scopes nonclinical, clinical, post-marketing, and manufacturing phases and does not cover drug discovery [9]. A concept-stage developability prediction is decision support, not a GMP record. But the framing already matters: if a concept-stage model's output ever propagates into a regulatory submission or a control strategy, it inherits the credibility expectations of wherever it lands, keyed to its context of use. The discipline of attaching uncertainty and provenance to the target profile now is precisely what makes that propagation defensible later. The realistic 2026 state, consistent with the broader survey reality that AI/ML has the most pilots and the fewest scaled deployments, is that concept-stage manufacturability ML sits at research maturity, edging toward pilot decision support: promising, peer-reviewed in its antibody-developability core, and nowhere near an oracle.

Key terms

mAb (monoclonal antibody) — the running example's modality; IgG1 is its most common, most manufacturable subclass.
CHO (Chinese Hamster Ovary) — the standard mammalian host cell line that expresses (manufactures) the antibody.
CQA (Critical Quality Attribute) — a measured product-quality property (e.g. SEC monomer %) that must stay within spec; the real outcomes in this chapter's record.
Target — the biological molecule a therapy is designed to engage; the first choice at the head of the manufacturing spine.
Mechanism of action (MoA) — how the therapy engages the target (block, degrade, recruit, deliver).
Modality — what kind of molecule embodies the mechanism (mAb, bispecific, ADC, fusion, cell/gene therapy); the single largest lever on manufacturability.
Target tractability / druggability — an ML-scored (often evidence-integrated, modality-aware) estimate of how amenable a target is to therapeutic intervention; a biology signal that only indirectly bears on manufacturing via modality.
Manufacturability-by-design — building manufacturability into the molecule at concept rather than engineering around it later; the QbD mindset pushed one step upstream.
Developability — the biophysical properties (expressibility, aggregation, solubility, viscosity, stability, immunogenicity) that determine whether a candidate survives the spine.
In-silico developability prediction — computing developability signals from sequence (and predicted structure) before a molecule is made, typically via feature-based regression/classification on engineered physicochemical descriptors; mature for antibodies, thin for other modalities.
The handoff gap — the documented hole through which manufacturability knowledge generated at concept/discovery fails to reach manufacturing.
Manufacturability-aware target profile — the proposed structured record that carries the target/modality decision, its concept-stage predictions with uncertainty, and the realized outcomes across the handoff to be graded.
Semantically-grounded feature — a model input pulled by its ontology IRI (a globally unique, web-style identifier) rather than by a fragile column name, so it survives renames, migrations, and vendor swaps and stays traceable in an audit.
derivedFrom as grouping key — using the typed lineage edge that roots every lot in the cell bank to decide which records share a parent, so leave-one-batch-out cross-validation splits by true genealogy instead of by string-matched IDs.
Concept-completeness shape (SHACL) — a closed-world validation shape, the release gate pointed upstream, that rejects a target profile missing its key, modality, or uncertainty band before it can enter a training set.
Continuant vs occurrent (BFO) — the upper-ontology distinction that keeps a standing developability prediction (continuant) separate from the run that later produced the measured CQA (occurrent), so the two are never conflated in the record.
Cold start (at concept) — the extreme small-data regime where almost no programs have run concept-to-commercial with concept-stage features recorded, so concept-stage manufacturability ML is barely trainable.
Survivorship bias (in labels) — the distortion from only ever labelling molecules that survived to manufacturing, hiding the failure boundary a model most needs to learn.

Where this leads

The target is chosen, the modality is fixed, and — at best — a manufacturability-aware profile is ready to travel down the spine. The next chapter, Molecule Discovery: Generative Design and Developability Prediction, steps from the target/modality choice to the choice among candidate sequences, where there is finally enough data to train real models: the generative-design loop that proposes molecules and the developability predictors that score them, run together so that what gets advanced is not just the best binder but the best makeable binder. It is where the weak concept-stage prior of this chapter becomes a quantitative selection pressure.
