molforge.metrics

Evaluation and benchmarking metrics.

The metrics here are task-level: they grade how good a prediction is relative to a known reference. They build on top of the lower-level geometry in :mod:molforge.structure (RMSD, superposition, etc.).

What's here:

Fold-similarity metrics (single-chain):

  • :func:tm_score: TM-score (Zhang & Skolnick 2004). Length-normalized, fold-level. > 0.5 ≈ same fold.
  • :func:gdt_ts: CASP's GDT-TS. Average pass-fraction at 1/2/4/8 Å.
  • :func:gdt_ha: GDT high-accuracy. 0.5/1/2/4 Å (near-experimental).
  • :func:gdt_per_cutoff: per-cutoff fractions for custom analysis.
  • :func:lddt: alignment-free local Distance Difference Test (Mariani et al. 2013). What AlphaFold's pLDDT estimates.
  • :func:lddt_per_residue: per-residue lDDT (the per-residue confidence pLDDT actually predicts).

Complex-quality metrics (multi-chain docking):

  • :func:dockq: DockQ score (Basu & Wallner 2016). Single-number docking quality with per-component breakdown.
  • :func:fnat, :func:irms, :func:lrms: the underlying CAPRI measures (fraction of native contacts, interface RMSD, ligand RMSD).
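The way DockQ folds the three CAPRI measures into one number can be sketched as follows. This is a minimal illustration of the published formula (Basu & Wallner 2016), not molforge's implementation: the function name `dockq_from_components` is hypothetical, and it is an assumption that :func:dockq uses the same scaling constants (1.5 Å for interface RMSD, 8.5 Å for ligand RMSD).

```python
def dockq_from_components(fnat: float, irms: float, lrms: float) -> float:
    """Combine fnat, interface RMSD and ligand RMSD into a DockQ-style score.

    Hypothetical helper; constants follow the published DockQ definition.
    """
    # Each RMSD is mapped into (0, 1] so that 0 Å gives 1.0 and large
    # deviations approach 0; fnat is already in [0, 1].
    irms_scaled = 1.0 / (1.0 + (irms / 1.5) ** 2)
    lrms_scaled = 1.0 / (1.0 + (lrms / 8.5) ** 2)
    return (fnat + irms_scaled + lrms_scaled) / 3.0
```

A perfect model (fnat = 1, both RMSDs = 0) scores 1.0, which matches the "higher = better" direction stated for DockQ.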

All metrics return float scalars or, where relevant, NumPy arrays or dicts. Scores follow the conventional direction: higher is better for TM-score, GDT, lDDT, and DockQ; lower is better for RMSDs.

fnat

fnat(
    model: Protein,
    reference: Protein,
    *,
    chain_a: str | None = None,
    chain_b: str | None = None,
    cutoff: float = _FNAT_CUTOFF,
) -> float

Fraction of native interface contacts recovered in the model.

Parameters:

  • model (Protein, required): Predicted complex.
  • reference (Protein, required): Native complex.
  • chain_a (str | None, default None): Which chain to compare on the receptor side. If None, uses the first protein chain shared between model and reference. Chain IDs must match between model and reference.
  • chain_b (str | None, default None): Which chain to compare on the partner side. See chain_a for default behavior.
  • cutoff (float, default _FNAT_CUTOFF): Heavy-atom distance defining a contact (default 5 Å).

Returns:

  • float: fnat in [0, 1]. 1.0 = every native contact recovered.
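To make the definition concrete, here is a small sketch of how a fraction of native contacts can be computed from raw coordinates. It does not use molforge's Protein type: the `Chain` layout (residue number mapped to heavy-atom coordinates) and the names `contacts` and `fnat_sketch` are assumptions made for illustration, as is treating a distance exactly equal to the cutoff as a contact.

```python
import math
from itertools import product

Atom = tuple[float, float, float]
Chain = dict[int, list[Atom]]  # residue number -> heavy-atom coordinates (assumed layout)


def contacts(chain_a: Chain, chain_b: Chain, cutoff: float = 5.0) -> set[tuple[int, int]]:
    """Residue pairs with at least one heavy-atom pair within `cutoff` Å."""
    found = set()
    for (res_a, atoms_a), (res_b, atoms_b) in product(chain_a.items(), chain_b.items()):
        if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
            found.add((res_a, res_b))
    return found


def fnat_sketch(model_a: Chain, model_b: Chain,
                ref_a: Chain, ref_b: Chain, cutoff: float = 5.0) -> float:
    """Fraction of reference (native) contacts that also appear in the model."""
    native = contacts(ref_a, ref_b, cutoff)
    if not native:
        raise ValueError("reference has no interface contacts at this cutoff")
    recovered = native & contacts(model_a, model_b, cutoff)
    return len(recovered) / len(native)
```

Because contacts are keyed by residue number, the sketch relies on matching numbering between model and reference, in the same spirit as the requirement above that chain IDs match.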

irms

irms(
    model: Protein,
    reference: Protein,
    *,
    chain_a: str | None = None,
    chain_b: str | None = None,
) -> float

Interface RMSD — backbone RMSD over the interface residues only.

lrms

lrms(
    model: Protein,
    reference: Protein,
    *,
    chain_a: str | None = None,
    chain_b: str | None = None,
) -> float

Ligand RMSD — superpose the receptor (larger chain) and measure the ligand (smaller chain) RMSD.
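The superpose-then-measure procedure can be sketched with NumPy and the Kabsch algorithm. This is an illustration under stated assumptions rather than molforge's code: inputs are plain (N, 3) coordinate arrays with matching atom order, the receptor/ligand split is supplied by the caller, and `kabsch` and `lrms_sketch` are hypothetical names.

```python
import numpy as np


def kabsch(mobile: np.ndarray, target: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Rotation R and translation t such that mobile @ R.T + t best fits target."""
    mob_c = mobile.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # Covariance between the centered point sets.
    h = (mobile - mob_c).T @ (target - tgt_c)
    u, _, vt = np.linalg.svd(h)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    trans = tgt_c - rot @ mob_c
    return rot, trans


def lrms_sketch(model_rec: np.ndarray, model_lig: np.ndarray,
                ref_rec: np.ndarray, ref_lig: np.ndarray) -> float:
    """Fit on the receptor only, then report RMSD over the ligand atoms."""
    rot, trans = kabsch(model_rec, ref_rec)
    moved_lig = model_lig @ rot.T + trans
    return float(np.sqrt(((moved_lig - ref_lig) ** 2).sum(axis=1).mean()))
```

The key design point is that the ligand never contributes to the fit, so any ligand displacement relative to the receptor shows up in full in the reported RMSD.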

gdt_ha

gdt_ha(model: Protein, reference: Protein) -> float

GDT-HA: high-accuracy variant of GDT-TS with tighter cutoffs.

Parameters:

  • model (Protein, required): The predicted structure.
  • reference (Protein, required): The native / target structure.

Returns:

  • float: GDT-HA in [0, 1]. Higher = better.

gdt_per_cutoff

gdt_per_cutoff(
    model: Protein,
    reference: Protein,
    *,
    cutoffs: tuple[float, ...] = _GDT_TS_CUTOFFS,
) -> dict[float, float]

Per-cutoff fractions used internally by GDT-TS / GDT-HA.

Useful for plotting accuracy curves or building custom metrics.

Parameters:

  • model (Protein, required): Predicted structure to score.
  • reference (Protein, required): Reference structure (e.g. native or experimental).
  • cutoffs (tuple[float, ...], default _GDT_TS_CUTOFFS): Distance cutoffs in Å. Defaults to GDT-TS's (1, 2, 4, 8).

Returns:

  • dict[float, float]: Dict mapping cutoff to fraction of residues within that cutoff after optimal superposition.
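The relationship between the per-cutoff fractions and the GDT scores can be sketched in a few lines. The sketch assumes the per-residue distances after superposition are already available as a list of floats; a full GDT implementation typically searches over superpositions to maximize the count at each cutoff, which is omitted here. The names `per_cutoff_fractions` and `gdt_from_fractions` are hypothetical, and counting a distance exactly equal to the cutoff as a pass is an assumption.

```python
def per_cutoff_fractions(
    distances: list[float],
    cutoffs: tuple[float, ...] = (1.0, 2.0, 4.0, 8.0),
) -> dict[float, float]:
    """Fraction of residues whose model-to-reference distance is within each cutoff."""
    n = len(distances)
    return {c: sum(d <= c for d in distances) / n for c in cutoffs}


def gdt_from_fractions(fractions: dict[float, float]) -> float:
    """Average the per-cutoff fractions, as GDT-TS and GDT-HA do."""
    return sum(fractions.values()) / len(fractions)


# Illustrative distances in Å (made-up values, not real data).
distances = [0.4, 0.9, 1.5, 3.0, 7.0, 12.0]
gdt_ts_like = gdt_from_fractions(per_cutoff_fractions(distances))
gdt_ha_like = gdt_from_fractions(per_cutoff_fractions(distances, (0.5, 1.0, 2.0, 4.0)))
```

With the tighter 0.5/1/2/4 Å cutoffs the same distances give a lower average, which is why GDT-HA is the stricter of the two scores.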

gdt_ts

gdt_ts(model: Protein, reference: Protein) -> float

GDT-TS: the CASP standard metric for fold-level prediction quality.

Parameters:

  • model (Protein, required): The predicted structure.
  • reference (Protein, required): The native / target structure.

Returns:

  • float: GDT-TS in [0, 1]. Higher = better.

lddt_per_residue

lddt_per_residue(
    model: Protein,
    reference: Protein,
    *,
    inclusion_radius: float = 15.0,
    thresholds: tuple[float, ...] = _DEFAULT_THRESHOLDS,
) -> NDArray[np.float32]

Per-residue lDDT (the quantity that AlphaFold's pLDDT predicts for each residue).

Parameters:

  • model (Protein, required): Predicted structure.
  • reference (Protein, required): Native / target structure.
  • inclusion_radius (float, default 15.0): See :func:lddt.
  • thresholds (tuple[float, ...], default _DEFAULT_THRESHOLDS): See :func:lddt.

Returns:

  • NDArray[float32]: (n_residues,) float32 array. Residues with no pair partners within inclusion_radius get NaN.
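A simplified per-residue lDDT can be sketched with NumPy to show where the NaN values come from. Several things here are assumptions: the sketch uses one CA coordinate per residue (the published lDDT considers all atoms), the thresholds 0.5/1/2/4 Å are taken from Mariani et al. 2013 because the value of _DEFAULT_THRESHOLDS is not shown on this page, and `lddt_per_residue_ca` is a hypothetical name.

```python
import numpy as np


def lddt_per_residue_ca(
    model_ca: np.ndarray,
    ref_ca: np.ndarray,
    inclusion_radius: float = 15.0,
    thresholds: tuple[float, ...] = (0.5, 1.0, 2.0, 4.0),
) -> np.ndarray:
    """CA-only per-residue lDDT sketch; inputs are (n_residues, 3) arrays."""
    n = len(ref_ca)
    ref_d = np.linalg.norm(ref_ca[:, None, :] - ref_ca[None, :, :], axis=-1)
    mod_d = np.linalg.norm(model_ca[:, None, :] - model_ca[None, :, :], axis=-1)
    # Pairs are selected in the reference only; self-pairs are excluded.
    pair_mask = (ref_d < inclusion_radius) & ~np.eye(n, dtype=bool)
    diff = np.abs(ref_d - mod_d)
    # For each pair, the fraction of thresholds under which the distance is preserved.
    preserved = np.mean([diff < t for t in thresholds], axis=0)
    counts = pair_mask.sum(axis=1)
    totals = (preserved * pair_mask).sum(axis=1)
    scores = np.full(n, np.nan, dtype=np.float32)
    has_pairs = counts > 0
    scores[has_pairs] = totals[has_pairs] / counts[has_pairs]
    return scores
```

No superposition is involved at any point, since only intra-structure distances are compared; this is what makes the score alignment-free. A residue with no neighbor inside the inclusion radius has nothing to average over and stays NaN.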

tm_score

tm_score(
    model: Protein,
    reference: Protein,
    *,
    normalize_by: str = "reference",
) -> float

Compute TM-score between two CA-aligned structures.

Parameters:

  • model (Protein, required): The model (predicted / candidate) structure.
  • reference (Protein, required): The reference (target / native) structure.
  • normalize_by (str, default 'reference'): Length used to compute d0 and as the denominator in the TM formula:
      • "reference" (default): match the reference's length. Use this for "how good is the prediction relative to the target".
      • "model": match the model's length. Use when you want "how much of the model agrees with the reference".
      • "shorter" / "longer": the convention in some papers; uses min/max of the two lengths.

Returns:

  • float: TM-score in [0, 1]. Higher is better.

Raises:

  • ValueError: If the structures don't have equal CA counts.
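The role of normalize_by is easiest to see in the TM formula itself. The sketch below applies the Zhang & Skolnick (2004) definition to precomputed distances between aligned CA pairs. It makes several assumptions: the superposition is taken as given (a full TM-score maximizes over superpositions), d0 is clamped to 0.5 Å for short chains as reference implementations commonly do, and the names `tm_d0` and `tm_score_from_distances` are hypothetical rather than part of molforge.

```python
def tm_d0(length: int) -> float:
    """Length-dependent distance scale d0 in Å (clamped for short chains; assumed)."""
    if length <= 15:
        return 0.5
    return max(0.5, 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_from_distances(
    distances: list[float],
    model_len: int,
    reference_len: int,
    normalize_by: str = "reference",
) -> float:
    """TM-style score from aligned-pair distances and a chosen normalizing length."""
    lengths = {
        "reference": reference_len,
        "model": model_len,
        "shorter": min(model_len, reference_len),
        "longer": max(model_len, reference_len),
    }
    if normalize_by not in lengths:
        raise ValueError(f"unknown normalize_by: {normalize_by!r}")
    l_norm = lengths[normalize_by]
    d0 = tm_d0(l_norm)
    # The same length sets d0 and serves as the denominator.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_norm
```

Under these assumptions, 50 perfectly matched pairs against a 100-residue reference give 0.5 with "reference" or "longer" and 1.0 with "model" or "shorter", which is why the choice of normalization should be reported alongside the score.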