molforge.metrics¶

metrics ¶

Evaluation and benchmarking metrics.

The metrics here are task-level: they grade how good a prediction is relative to a known reference. They build on the lower-level geometry in `molforge.structure` (RMSD, superposition, etc.).

What's here:

Fold-similarity metrics (single-chain):

- `tm_score`: TM-score (Zhang & Skolnick 2004). Length-normalized, fold-level. Scores above 0.5 generally indicate the same fold.
- `gdt_ts`: CASP's GDT-TS. Average fraction of residues within 1/2/4/8 Å.
- `gdt_ha`: GDT high-accuracy. Same average with tighter 0.5/1/2/4 Å cutoffs (near-experimental accuracy).
- `gdt_per_cutoff`: per-cutoff fractions for custom analysis.
- `lddt`: superposition-free local Distance Difference Test (Mariani et al. 2013). The quantity that AlphaFold's pLDDT estimates.
- `lddt_per_residue`: per-residue lDDT, the per-residue quantity that pLDDT predicts.

Complex-quality metrics (multi-chain docking):

- `dockq`: DockQ score (Basu & Wallner 2016). Single-number docking quality with per-component breakdown.
- `fnat`, `irms`, `lrms`: the underlying CAPRI measures (fraction of native contacts, interface RMSD, ligand RMSD).

Each metric returns a float scalar, or a NumPy array or dict where noted, in the conventional direction: higher is better for TM-score / GDT / lDDT / DockQ; lower is better for RMSDs.

fnat ¶

fnat(
    model: Protein,
    reference: Protein,
    *,
    chain_a: str | None = None,
    chain_b: str | None = None,
    cutoff: float = _FNAT_CUTOFF,
) -> float

Fraction of native interface contacts recovered in the model.

Parameters:

Name	Type	Description	Default
`model`	`Protein`	Predicted complex.	required
`reference`	`Protein`	Native complex.	required
`chain_a`	`str \| None`	Which chain to compare on the receptor side. If `None`, uses the first protein chain shared between `model` and `reference`. Chain IDs must match between model and reference.	`None`
`chain_b`	`str \| None`	Which chain to compare on the partner side. See `chain_a` for default behavior.	`None`
`cutoff`	`float`	Heavy-atom distance defining a contact (default 5 Å).	`_FNAT_CUTOFF`

Returns:

Type	Description
`float`	`fnat` in `[0, 1]`. 1.0 = every native contact recovered.
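To make the contact definition concrete, here is a minimal sketch of the fnat calculation on plain coordinate data. It is not molforge's implementation: the `Chain` layout (residue number mapped to a list of heavy-atom coordinates) and the names `interface_contacts` and `fnat_sketch` are hypothetical, chosen only for illustration. It assumes residue numbering already matches between model and reference.

```python
import math

Atom = tuple[float, float, float]
Chain = dict[int, list[Atom]]  # hypothetical layout: residue number -> heavy-atom coordinates


def interface_contacts(chain_a: Chain, chain_b: Chain, cutoff: float = 5.0) -> set[tuple[int, int]]:
    """Residue pairs (one residue per chain) with any heavy-atom pair closer than `cutoff` Å."""
    contacts = set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(math.dist(p, q) < cutoff for p in atoms_a for q in atoms_b):
                contacts.add((res_a, res_b))
    return contacts


def fnat_sketch(model_a: Chain, model_b: Chain, ref_a: Chain, ref_b: Chain, cutoff: float = 5.0) -> float:
    """Fraction of the reference's interface contacts that also appear in the model."""
    native = interface_contacts(ref_a, ref_b, cutoff)
    if not native:
        raise ValueError("reference has no interface contacts at this cutoff")
    recovered = native & interface_contacts(model_a, model_b, cutoff)
    return len(recovered) / len(native)


# Toy complex: two native contacts, (1, 1) and (2, 2), each 3 Å apart.
ref_a = {1: [(0.0, 0.0, 0.0)], 2: [(10.0, 0.0, 0.0)]}
ref_b = {1: [(0.0, 3.0, 0.0)], 2: [(10.0, 3.0, 0.0)]}
# The model moves residue 2 of chain B 9 Å away, which breaks the second contact.
model_a = ref_a
model_b = {1: [(0.0, 3.0, 0.0)], 2: [(10.0, 9.0, 0.0)]}

example_fnat = fnat_sketch(model_a, model_b, ref_a, ref_b)
print(example_fnat)  # 0.5
```

Contacts that exist only in the model do not affect the score; fnat measures only how many native contacts are recovered.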

irms ¶

irms(
    model: Protein,
    reference: Protein,
    *,
    chain_a: str | None = None,
    chain_b: str | None = None,
) -> float

Interface RMSD — backbone RMSD over the interface residues only.

lrms ¶

lrms(
    model: Protein,
    reference: Protein,
    *,
    chain_a: str | None = None,
    chain_b: str | None = None,
) -> float

Ligand RMSD — superpose the receptor (larger chain) and measure the ligand (smaller chain) RMSD.
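For orientation, the three CAPRI measures above combine into DockQ as described in the published formula (Basu & Wallner 2016): each RMSD is mapped onto (0, 1] and averaged with fnat. The sketch below only illustrates that arithmetic; `dockq_sketch` is a hypothetical name, and whether molforge's `dockq` uses exactly these scaling constants is an assumption.

```python
def dockq_sketch(fnat: float, irms: float, lrms: float) -> float:
    """Average of fnat and the two scaled RMSD terms (published DockQ form)."""

    def scaled(rms: float, d: float) -> float:
        # Maps an RMSD in Å onto (0, 1]: 1.0 at rms == 0, 0.5 at rms == d.
        return 1.0 / (1.0 + (rms / d) ** 2)

    # Scaling constants from the DockQ paper: 1.5 Å for iRMS, 8.5 Å for LRMS.
    return (fnat + scaled(irms, 1.5) + scaled(lrms, 8.5)) / 3.0


perfect = dockq_sketch(fnat=1.0, irms=0.0, lrms=0.0)
halfway = dockq_sketch(fnat=0.5, irms=1.5, lrms=8.5)
print(perfect, halfway)  # 1.0 0.5
```

Because the RMSD terms are inverted by the scaling, the combined score follows the higher-is-better direction even though iRMS and LRMS themselves are lower-is-better.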

gdt_ha ¶

gdt_ha(model: Protein, reference: Protein) -> float

GDT-HA: high-accuracy variant of GDT-TS with tighter cutoffs.

Parameters:

Name	Type	Description	Default
`model`	`Protein`	The predicted structure.	required
`reference`	`Protein`	The native / target structure.	required

Returns:

Type	Description
`float`	GDT-HA in `[0, 1]`. Higher = better.

gdt_per_cutoff ¶

gdt_per_cutoff(
    model: Protein,
    reference: Protein,
    *,
    cutoffs: tuple[float, ...] = _GDT_TS_CUTOFFS,
) -> dict[float, float]

Per-cutoff fractions used internally by GDT-TS / GDT-HA.

Useful for plotting accuracy curves or building custom metrics.

Parameters:

Name	Type	Description	Default
`model`	`Protein`	Predicted structure to score.	required
`reference`	`Protein`	Reference structure (e.g. native or experimental).	required
`cutoffs`	`tuple[float, ...]`	Distance cutoffs in Å. Defaults to GDT-TS's (1, 2, 4, 8).	`_GDT_TS_CUTOFFS`

Returns:

Type	Description
`dict[float, float]`	Dict mapping each cutoff to the fraction of residues within that cutoff after optimal superposition.
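The relationship between the per-cutoff fractions and the two GDT scores can be sketched as follows. This is an illustration, not molforge's code: it takes per-residue CA distances as given (in Å, after some superposition), whereas a full GDT calculation searches for the superposition that maximizes each fraction. The names `gdt_fractions` and `gdt_average` are hypothetical.

```python
def gdt_fractions(distances: list[float], cutoffs: tuple[float, ...] = (1.0, 2.0, 4.0, 8.0)) -> dict[float, float]:
    """Fraction of residues whose model-to-reference distance is within each cutoff."""
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    return {c: sum(d <= c for d in distances) / n for c in cutoffs}


def gdt_average(distances: list[float], cutoffs: tuple[float, ...]) -> float:
    """Mean of the per-cutoff fractions: GDT-TS or GDT-HA depending on the cutoffs."""
    fractions = gdt_fractions(distances, cutoffs)
    return sum(fractions.values()) / len(fractions)


# Per-residue distances (Å) for a toy 8-residue model.
distances = [0.4, 0.9, 1.5, 3.0, 6.0, 12.0, 0.7, 2.5]

per_cutoff = gdt_fractions(distances)
ts = gdt_average(distances, (1.0, 2.0, 4.0, 8.0))  # GDT-TS cutoffs
ha = gdt_average(distances, (0.5, 1.0, 2.0, 4.0))  # GDT-HA cutoffs
print(per_cutoff)  # {1.0: 0.375, 2.0: 0.5, 4.0: 0.75, 8.0: 0.875}
print(ts, ha)      # 0.625 0.4375
```

The same distances score lower under the GDT-HA cutoffs, which is why that variant is better at separating near-experimental models.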

gdt_ts ¶

gdt_ts(model: Protein, reference: Protein) -> float

GDT-TS: the CASP standard metric for fold-level prediction quality.

Parameters:

Name	Type	Description	Default
`model`	`Protein`	The predicted structure.	required
`reference`	`Protein`	The native / target structure.	required

Returns:

Type	Description
`float`	GDT-TS in `[0, 1]`. Higher = better.

lddt_per_residue ¶

lddt_per_residue(
    model: Protein,
    reference: Protein,
    *,
    inclusion_radius: float = 15.0,
    thresholds: tuple[float, ...] = _DEFAULT_THRESHOLDS,
) -> NDArray[np.float32]

Per-residue lDDT (the per-residue confidence pLDDT estimates).

Parameters:

Name	Type	Description	Default
`model`	`Protein`	Predicted structure.	required
`reference`	`Protein`	Native / target structure.	required
`inclusion_radius`	`float`	See `lddt`.	`15.0`
`thresholds`	`tuple[float, ...]`	See `lddt`.	`_DEFAULT_THRESHOLDS`

Returns:

Type	Description
`NDArray[float32]`	`(n_residues,)` float32 array. Residues with no pair partners within `inclusion_radius` get `NaN`.
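A simplified sketch of the per-residue calculation, including the `NaN` case, is shown below. It makes several assumptions that may differ from molforge: it uses CA atoms only (the published lDDT uses all atoms), it returns a plain list rather than a float32 array, and it assumes the thresholds are the standard 0.5/1/2/4 Å, since the value of `_DEFAULT_THRESHOLDS` is not stated here. The name `lddt_per_residue_sketch` is hypothetical.

```python
import math

Point = tuple[float, float, float]


def lddt_per_residue_sketch(
    model_ca: list[Point],
    ref_ca: list[Point],
    inclusion_radius: float = 15.0,
    thresholds: tuple[float, ...] = (0.5, 1.0, 2.0, 4.0),  # assumed defaults
) -> list[float]:
    """CA-only lDDT per residue; NaN where a residue has no partners in range."""
    if len(model_ca) != len(ref_ca):
        raise ValueError("model and reference must have the same number of residues")
    n = len(ref_ca)
    scores = []
    for i in range(n):
        considered = 0  # reference pairs within the inclusion radius
        preserved = 0   # (pair, threshold) combinations where the distance is kept
        for j in range(n):
            if i == j:
                continue
            d_ref = math.dist(ref_ca[i], ref_ca[j])
            if d_ref >= inclusion_radius:
                continue
            considered += 1
            diff = abs(d_ref - math.dist(model_ca[i], model_ca[j]))
            preserved += sum(diff < t for t in thresholds)
        scores.append(preserved / (considered * len(thresholds)) if considered else math.nan)
    return scores


# Three residues in a line plus one isolated residue 100 Å away.
ref_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (100.0, 0.0, 0.0)]
# The model shifts the third residue by 1.5 Å along the chain axis.
model_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (9.1, 0.0, 0.0), (100.0, 0.0, 0.0)]

scores = lddt_per_residue_sketch(model_ca, ref_ca)
print(scores)  # [0.75, 0.75, 0.5, nan]
```

No superposition is involved: only intra-structure distances are compared, so the score is unaffected by how the model is placed relative to the reference.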

tm_score ¶

tm_score(
    model: Protein,
    reference: Protein,
    *,
    normalize_by: str = "reference",
) -> float

Compute TM-score between two CA-aligned structures.

Parameters:

Name	Type	Description	Default
`model`	`Protein`	The model (predicted / candidate) structure.	required
`reference`	`Protein`	The reference (target / native) structure.	required
`normalize_by`	`str`	Length used to compute `d0` and as the denominator in the TM formula. `"reference"` (default): the reference's length; use this for "how good is the prediction relative to the target". `"model"`: the model's length; use this for "how much of the model agrees with the reference". `"shorter"` / `"longer"`: the minimum / maximum of the two lengths, a convention used in some papers.	`'reference'`

Returns:

Type	Description
`float`	TM-score in `[0, 1]`. Higher is better.

Raises:

Type	Description
`ValueError`	If the structures don't have equal CA counts.
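The role of `normalize_by` is easiest to see in the TM-score formula itself. The sketch below applies the published formula (Zhang & Skolnick 2004) to per-residue CA distances that are taken as given; it does not search for the optimal superposition, and it is not molforge's implementation. The names `tm_d0` and `tm_score_sketch` are hypothetical, and the 0.5 Å floor on `d0` for short chains follows common TM-score tools rather than anything stated above.

```python
def tm_d0(length: int) -> float:
    """Length-dependent distance scale d0 (Å), floored at 0.5 for short chains (assumption)."""
    if length <= 15:
        return 0.5
    return max(1.24 * (length - 15) ** (1 / 3) - 1.8, 0.5)


def tm_score_sketch(
    distances: list[float],
    model_len: int,
    reference_len: int,
    normalize_by: str = "reference",
) -> float:
    """TM-score from aligned-pair distances, normalized by the chosen length."""
    lengths = {
        "reference": reference_len,
        "model": model_len,
        "shorter": min(model_len, reference_len),
        "longer": max(model_len, reference_len),
    }
    if normalize_by not in lengths:
        raise ValueError(f"unknown normalize_by: {normalize_by!r}")
    norm_len = lengths[normalize_by]
    d0 = tm_d0(norm_len)
    # Each aligned pair contributes between 0 and 1; unaligned residues contribute 0.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / norm_len


# 50 perfectly placed residues from a 50-residue model of a 100-residue reference.
by_reference = tm_score_sketch([0.0] * 50, model_len=50, reference_len=100)
by_model = tm_score_sketch([0.0] * 50, model_len=50, reference_len=100, normalize_by="model")
print(by_reference, by_model)  # 0.5 1.0
```

The two results differ only in the normalization length: the model covers half of the reference perfectly, so it scores 0.5 against the reference but 1.0 against itself.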