molforge.structure¶
structure ¶
Structural analysis: superposition, RMSD, contacts, geometry, DSSP, SASA, dihedrals.
Workhorses for analyzing the geometric properties of protein structures and comparing them.
Common entry points
- `rmsd`: RMSD between two structures (with optional superposition).
- `superpose`: Kabsch / Umeyama optimal rigid-body alignment.
- `contact_map` / `distance_map`: residue-residue contact and distance matrices.
- `residue_contacts`: all-atom contacts as a sorted list.
- `radius_of_gyration`, `centroid`, `center_of_mass`: bulk geometric properties.
- `translate`, `rotate`, `center_at_origin`: in-place coordinate transforms.
- `dssp` / `dssp_3state`: Kabsch-Sander secondary-structure assignment (8-state and 3-state).
- `sasa` / `sasa_per_residue` / `total_sasa`: solvent-accessible surface area (Shrake-Rupley).
- `phi` / `psi` / `omega` / `phi_psi_omega` / `ramachandran` / `dihedral`: backbone dihedral angles.
SuperpositionResult dataclass ¶
SuperpositionResult(
rotation: NDArray[float64],
translation: NDArray[float64],
rmsd: float,
n_atoms: int,
mobile_aligned: NDArray[float32],
)
Result of a structural superposition.
Attributes:

| Name | Type | Description |
|---|---|---|
| `rotation` | `NDArray[float64]` | |
| `translation` | `NDArray[float64]` | |
| `rmsd` | `float` | Root-mean-square deviation of the superposed structures. |
| `n_atoms` | `int` | Number of atoms used in the superposition. |
| `mobile_aligned` | `NDArray[float32]` | |
contact_map ¶
contact_map(
protein: Protein,
*,
cutoff: float = 8.0,
atom_choice: AtomChoice = "cb",
exclude_neighbors: int = 0,
) -> NDArray[np.bool_]
Binary contact map at `cutoff` Å.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Structure to analyze. | required |
| `cutoff` | `float` | Distance below which residues are in contact (default 8.0 Å, the CASP standard for CB-CB). | `8.0` |
| `atom_choice` | `AtomChoice` | Which atom defines the residue position; defaults to `'cb'`. | `'cb'` |
| `exclude_neighbors` | `int` | Set the diagonal band of width … | `0` |

Returns:

| Type | Description |
|---|---|
| `NDArray[bool_]` | Boolean residue-residue contact matrix. |
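As a rough illustration of what a cutoff-based contact map computes, the sketch below works on a plain `(n_res, 3)` array of representative points rather than a `Protein`. The helper name `contact_map_from_coords` is hypothetical and not part of the library. Because the `exclude_neighbors` description is truncated, the sketch assumes that pairs with `|i - j| < exclude_neighbors` are forced to `False`.

```python
import numpy as np

def contact_map_from_coords(coords, cutoff=8.0, exclude_neighbors=0):
    """Boolean contact matrix from one representative point per residue.

    Hypothetical helper, not the library implementation. Assumes pairs
    with |i - j| < exclude_neighbors are set to False.
    """
    coords = np.asarray(coords, dtype=np.float64)
    # Pairwise Euclidean distances via broadcasting: shape (n_res, n_res).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    contacts = dist < cutoff
    if exclude_neighbors > 0:
        idx = np.arange(len(coords))
        band = np.abs(idx[:, None] - idx[None, :]) < exclude_neighbors
        contacts[band] = False
    return contacts

# Four pseudo-residues spaced 5 Å apart along the x axis.
coords = np.array([[0.0, 0, 0], [5.0, 0, 0], [10.0, 0, 0], [15.0, 0, 0]])
cmap = contact_map_from_coords(coords, cutoff=8.0, exclude_neighbors=1)
```

With these inputs only sequence-adjacent residues (5 Å apart) are in contact. The diagonal is cleared by the assumed band rule.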
distance_map ¶
Compute a residue-by-residue distance map.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Structure to analyze. | required |
| `atom_choice` | `AtomChoice` | Per-residue representative point … | `'ca'` |

Returns:

| Type | Description |
|---|---|
| `NDArray[float32]` | Residue-by-residue distances between the representative points. |
residue_contacts ¶
residue_contacts(
protein: Protein,
*,
cutoff: float = 5.0,
chain_a: str | None = None,
chain_b: str | None = None,
) -> list[tuple[tuple[str, int], tuple[str, int], float]]
List inter-residue contacts at the all-atom level.
Unlike `contact_map`, this enumerates contacts as triples of
`((chain_a, resid_a), (chain_b, resid_b), distance)` and uses the
"any atom within cutoff" definition.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Structure to analyze. | required |
| `cutoff` | `float` | Distance threshold in Å (default 5.0). | `5.0` |
| `chain_a` | `str \| None` | If both … | `None` |
| `chain_b` | `str \| None` | See `chain_a`. | `None` |

Returns:

| Type | Description |
|---|---|
| `list[tuple[tuple[str, int], tuple[str, int], float]]` | Sorted list of contact tuples. |
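The "any atom within cutoff" definition can be sketched as follows: two residues are in contact when their closest pair of atoms is no farther apart than the cutoff. The input layout (a dict from `(chain, resid)` to an atom coordinate array), the helper name `any_atom_contacts`, the inclusive `<=` comparison, and the sort key are all assumptions made for illustration.

```python
import numpy as np

def any_atom_contacts(residues, cutoff=5.0):
    """List residue pairs whose closest atoms lie within `cutoff`.

    `residues` maps a (chain, resid) key to an (n_atoms, 3) coordinate
    array (hypothetical layout). Returns triples of
    ((chain, resid), (chain, resid), distance), ordered by residue key.
    """
    keys = sorted(residues)
    contacts = []
    for i, key_a in enumerate(keys):
        xyz_a = np.asarray(residues[key_a], dtype=np.float64)
        for key_b in keys[i + 1:]:
            xyz_b = np.asarray(residues[key_b], dtype=np.float64)
            # All atom-atom distances between the two residues.
            diff = xyz_a[:, None, :] - xyz_b[None, :, :]
            d_min = float(np.sqrt((diff ** 2).sum(axis=-1)).min())
            if d_min <= cutoff:
                contacts.append((key_a, key_b, d_min))
    return contacts

residues = {
    ("A", 1): [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]],
    ("A", 2): [[4.0, 0.0, 0.0]],
    ("B", 7): [[20.0, 0.0, 0.0]],
}
contacts = any_atom_contacts(residues)
```

The double loop is quadratic in the number of residues. A spatial index such as a cell list or k-d tree would be the usual choice for large structures.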
dihedral ¶
dihedral(
p1: NDArray[floating],
p2: NDArray[floating],
p3: NDArray[floating],
p4: NDArray[floating],
) -> float
Compute the dihedral angle (in degrees) between four 3D points.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `p1` | `NDArray[floating]` | | required |
| `p2` | `NDArray[floating]` | | required |
| `p3` | `NDArray[floating]` | | required |
| `p4` | `NDArray[floating]` | | required |

Returns:

| Type | Description |
|---|---|
| `float` | Angle in degrees in … formula which avoids the numerical issues of acos near … |
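A common way to avoid `acos` when computing a dihedral is the `atan2` form, sketched below. Whether the library uses exactly this expression is an assumption, and `dihedral_atan2` is a hypothetical name. The sign follows the IUPAC convention: looking from `p2` towards `p3`, a clockwise rotation of the front bond onto the rear bond is positive.

```python
import numpy as np

def dihedral_atan2(p1, p2, p3, p4):
    """Signed dihedral angle in degrees for four 3D points (atan2 form)."""
    p1, p2, p3, p4 = (np.asarray(p, dtype=np.float64) for p in (p1, p2, p3, p4))
    b1, b2, b3 = p2 - p1, p3 - p2, p4 - p3
    n1 = np.cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = np.cross(b2, b3)  # normal of the plane through p2, p3, p4
    x = np.dot(n1, n2)
    y = np.linalg.norm(b2) * np.dot(b1, n2)
    return float(np.degrees(np.arctan2(y, x)))

angle = dihedral_atan2([1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1])
```

Because `atan2` takes sine-like and cosine-like terms separately, the result stays well conditioned near 0° and 180°, where the derivative of `acos` diverges.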
dihedrals_batch ¶
Vectorized dihedral over an array of atom quartets.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `quartets` | `NDArray[floating]` | | required |

Returns:

| Type | Description |
|---|---|
| `NDArray[float64]` | |
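A vectorized dihedral over many quartets might look like the sketch below. The `(N, 4, 3)` input layout is an assumption, since the documented shape of `quartets` is not shown here, and `dihedrals_batch_sketch` is a hypothetical name.

```python
import numpy as np

def dihedrals_batch_sketch(quartets):
    """Dihedral angles in degrees for an (N, 4, 3) array of point quartets."""
    q = np.asarray(quartets, dtype=np.float64)
    b1 = q[:, 1] - q[:, 0]
    b2 = q[:, 2] - q[:, 1]
    b3 = q[:, 3] - q[:, 2]
    n1 = np.cross(b1, b2)
    n2 = np.cross(b2, b3)
    # Row-wise dot products without building an (N, N) matrix.
    x = np.einsum("ij,ij->i", n1, n2)
    y = np.linalg.norm(b2, axis=1) * np.einsum("ij,ij->i", b1, n2)
    return np.degrees(np.arctan2(y, x))

quartets = np.array([
    [[1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1]],   # +90 degrees
    [[1, 0, 0], [0, 0, 0], [0, 0, 1], [0, -1, 1]],  # -90 degrees
    [[1, 0, 0], [0, 0, 0], [0, 0, 1], [1, 0, 1]],   # 0 degrees (cis)
], dtype=np.float64)
angles = dihedrals_batch_sketch(quartets)
```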
omega ¶
ω (omega) angles per residue, degrees, NaN where undefined.
phi ¶
φ (phi) angles per residue, degrees, NaN where undefined.
phi_psi_omega ¶
phi_psi_omega(
protein: Protein,
) -> tuple[
NDArray[np.float64],
NDArray[np.float64],
NDArray[np.float64],
]
Per-residue backbone dihedrals (φ, ψ, ω) in degrees.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Structure to analyze. | required |

Returns:

| Type | Description |
|---|---|
| `tuple[NDArray[float64], NDArray[float64], NDArray[float64]]` | Three per-residue arrays (φ, ψ, ω) in degrees. Entries where the angle is undefined (chain termini, missing backbone atoms) are NaN. |
psi ¶
ψ (psi) angles per residue, degrees, NaN where undefined.
ramachandran ¶
Per-residue (φ, ψ) pairs for Ramachandran-plot construction.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Structure to analyze. | required |

Returns:

| Type | Description |
|---|---|
| `NDArray[float64]` | Per-residue (φ, ψ) pairs; entries where an angle is undefined contain NaN. |
dssp_3state ¶
Return the per-residue 3-state secondary-structure string.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Structure to analyze. | required |

Returns:

| Type | Description |
|---|---|
| `str` | A string of 3-state secondary-structure codes, one per residue. |
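For orientation, one widely used reduction from the 8-state DSSP alphabet to three states is sketched below. The exact mapping and the output letters used by `dssp_3state` are not stated here, so both the mapping (H, G, I to H; E, B to E; everything else to C) and the helper name `reduce_to_3state` are assumptions. Other conventions exist, for example treating G or B as coil.

```python
# Assumed reduction: helices (H, G, I) -> H, strands (E, B) -> E, rest -> C.
EIGHT_TO_THREE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}

def reduce_to_3state(ss8):
    """Collapse an 8-state DSSP string to H/E/C, one character per residue."""
    return "".join(EIGHT_TO_THREE.get(code, "C") for code in ss8)

ss3 = reduce_to_3state("-HHHHGGGTT-EEEEB-S")
```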
bounding_box ¶
Axis-aligned bounding box of a structure.
Returns:

| Type | Description |
|---|---|
| `tuple[NDArray[float64], NDArray[float64]]` | |
center_at_origin ¶
Translate the structure so its centroid is at the origin (in place).
center_of_mass ¶
Mass-weighted center of mass. Alias for centroid(mass_weighted=True).
centroid ¶
Geometric (or mass-weighted) centroid of a structure.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Input structure. | required |
| `mass_weighted` | `bool` | If True, weight by atomic mass (i.e. compute the center of mass instead). | `False` |

Returns:

| Type | Description |
|---|---|
| `NDArray[float64]` | |
radius_of_gyration ¶
Radius of gyration — RMS distance from atoms to the center of mass.
A standard compactness metric: smaller Rg means more globular.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Input structure. | required |
| `mass_weighted` | `bool` | If True (default), use mass-weighted Rg. | `True` |

Returns:

| Type | Description |
|---|---|
| `float` | Radius of gyration in angstroms. |
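The definition above (RMS distance from the atoms to the center of mass) can be written out on bare arrays as in the sketch below. The function name `radius_of_gyration_sketch` and the explicit `masses` argument are illustrative assumptions; the library function takes a `Protein` instead.

```python
import numpy as np

def radius_of_gyration_sketch(coords, masses=None):
    """RMS distance of points to their (optionally mass-weighted) center."""
    coords = np.asarray(coords, dtype=np.float64)
    if masses is None:
        masses = np.ones(len(coords))  # unweighted: every atom counts equally
    masses = np.asarray(masses, dtype=np.float64)
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = ((coords - center) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))

coords = np.array([[-1.0, 0, 0], [1.0, 0, 0], [0, 3.0, 0]])
rg_geometric = radius_of_gyration_sketch(coords)
rg_weighted = radius_of_gyration_sketch(coords, masses=[12.0, 12.0, 1.0])
```

In this toy example the light third point sits far from the two heavy ones, so the mass-weighted value is smaller than the unweighted one.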
rotate ¶
Apply a 3x3 rotation in place around the origin.
For a rotation around the centroid, translate to the origin first, rotate,
then translate back. Use `center_at_origin` as a helper.
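The translate, rotate, translate-back recipe can be sketched on a bare `(N, 3)` array as follows. This version returns a new array instead of mutating in place, and it assumes the matrix acts on column vectors (each row `x` becomes `R @ x`). The library's convention is not stated here, and `rotate_about_centroid` is a hypothetical helper.

```python
import numpy as np

def rotate_about_centroid(coords, rotation):
    """Rotate points about their centroid; returns a new (N, 3) array."""
    coords = np.asarray(coords, dtype=np.float64)
    centroid = coords.mean(axis=0)
    # Center, rotate (row x -> R @ x, written as X @ R.T), then shift back.
    return (coords - centroid) @ rotation.T + centroid

theta = np.pi / 2  # 90 degrees about the z axis
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
coords = np.array([[1.0, 1.0, 0.0], [3.0, 1.0, 0.0]])
rotated = rotate_about_centroid(coords, rot_z)
```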
translate ¶
Translate `protein` in place by `vector`.
Mutates the underlying `AtomArray.coords` directly; both
hierarchical and linear views reflect the change immediately.
rmsd_per_residue ¶
rmsd_per_residue(
mobile: Protein,
reference: Protein,
*,
subset: AtomSubset = "ca",
align: bool = True,
) -> NDArray[np.float32]
Per-residue RMSD after (optionally) aligning the structures globally.
Useful for spotting which loops moved between two conformations or where a folding model disagrees with experiment.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `mobile` | `Protein` | First structure (the one that is moved to align with `reference`). | required |
| `reference` | `Protein` | Second structure; must have the same residue count as `mobile`. | required |
| `subset` | `AtomSubset` | Atom selector for both the global alignment and the per-residue comparison. | `'ca'` |
| `align` | `bool` | Whether to superpose first. | `True` |

Returns:

| Type | Description |
|---|---|
| `NDArray[float32]` | |
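The per-residue step can be illustrated on arrays that are already superposed. The sketch assumes both inputs have shape `(n_res, n_atoms, 3)` with the same atoms per residue (one atom each for a CA-only subset). The name `per_residue_rmsd` is hypothetical, and the global alignment is deliberately left out.

```python
import numpy as np

def per_residue_rmsd(mobile, reference):
    """RMSD per residue for two (n_res, n_atoms, 3) arrays, no alignment."""
    mobile = np.asarray(mobile, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if mobile.shape != reference.shape:
        raise ValueError("coordinate arrays must have identical shapes")
    sq_dist = ((mobile - reference) ** 2).sum(axis=-1)  # (n_res, n_atoms)
    return np.sqrt(sq_dist.mean(axis=-1))               # (n_res,)

reference = np.zeros((3, 1, 3))
mobile = reference.copy()
mobile[2, 0] = [3.0, 4.0, 0.0]  # only the third residue has moved
profile = per_residue_rmsd(mobile, reference)
```

A peak in such a profile marks the residues that differ most between the two conformations.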
rmsd_raw ¶
RMSD between two equal-length coordinate sets, no alignment.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `a` | `NDArray[floating]` | First coordinate set. | required |
| `b` | `NDArray[floating]` | Second coordinate set. | required |

Returns:

| Type | Description |
|---|---|
| `float` | Root-mean-square deviation in the input units (Å for biology). |
sasa_per_residue ¶
sasa_per_residue(
protein: Protein,
*,
probe_radius: float = 1.4,
n_sphere_points: int = 100,
) -> NDArray[np.float64]
Per-residue SASA, summed across atoms in each residue.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `protein` | `Protein` | Input structure. | required |
| `probe_radius` | `float` | See … | `1.4` |
| `n_sphere_points` | `int` | See … | `100` |

Returns:

| Type | Description |
|---|---|
| `NDArray[float64]` | Per-residue SASA values, in array order. |
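To show how `probe_radius` and `n_sphere_points` enter a Shrake-Rupley calculation, and how per-atom areas are summed per residue, here is a minimal sketch on bare arrays. The function names, the golden-spiral point set, the explicit `radii` argument, and the brute-force neighbor loop are assumptions for illustration, not the library's implementation.

```python
import numpy as np

def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral lattice)."""
    i = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * i / n)
    azimuth = np.pi * (1.0 + 5.0 ** 0.5) * i
    return np.column_stack([
        np.cos(azimuth) * np.sin(polar),
        np.sin(azimuth) * np.sin(polar),
        np.cos(polar),
    ])

def shrake_rupley(coords, radii, probe_radius=1.4, n_sphere_points=100):
    """Per-atom SASA in Å² by counting exposed test points."""
    coords = np.asarray(coords, dtype=np.float64)
    expanded = np.asarray(radii, dtype=np.float64) + probe_radius
    unit = sphere_points(n_sphere_points)
    areas = np.empty(len(coords))
    for i in range(len(coords)):
        points = coords[i] + expanded[i] * unit
        exposed = np.ones(n_sphere_points, dtype=bool)
        for j in range(len(coords)):
            if j == i:
                continue
            # A test point is buried if it lies inside a neighbor's expanded sphere.
            d2 = ((points - coords[j]) ** 2).sum(axis=1)
            exposed &= d2 >= expanded[j] ** 2
        areas[i] = 4.0 * np.pi * expanded[i] ** 2 * exposed.mean()
    return areas

def per_residue_sums(atom_areas, residue_index, n_residues):
    """Sum per-atom areas into per-residue totals."""
    return np.bincount(residue_index, weights=atom_areas, minlength=n_residues)

# Two overlapping atoms in residue 0, one isolated atom in residue 1.
coords = np.array([[0.0, 0, 0], [1.5, 0, 0], [30.0, 0, 0]])
radii = np.array([1.7, 1.7, 1.7])
atom_areas = shrake_rupley(coords, radii)
residue_areas = per_residue_sums(atom_areas, np.array([0, 0, 1]), 2)
```

More sphere points give a finer area estimate at proportionally higher cost. The isolated atom keeps the full area of its probe-expanded sphere, while the overlapping pair each lose part of theirs.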
total_sasa ¶
Total solvent-accessible surface area (Ų).
kabsch_rmsd ¶
kabsch_rmsd(
mobile: NDArray[floating],
reference: NDArray[floating],
*,
weights: NDArray[floating] | None = None,
) -> float
Return the minimum RMSD over all rigid-body alignments.
Convenience wrapper around `superpose` for when you only want
the RMSD value.
superpose ¶
superpose(
mobile: NDArray[floating],
reference: NDArray[floating],
*,
weights: NDArray[floating] | None = None,
) -> SuperpositionResult
Superpose `mobile` onto `reference` by optimal rigid-body fit.
Implements the Kabsch / Umeyama algorithm via SVD of the weighted covariance matrix. The returned rotation is guaranteed to be a proper rotation (det = +1), not a reflection.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `mobile` | `NDArray[floating]` | | required |
| `reference` | `NDArray[floating]` | | required |
| `weights` | `NDArray[floating] \| None` | Optional … | `None` |

Returns:

| Type | Description |
|---|---|
| `SuperpositionResult` | A `SuperpositionResult` holding the rotation, translation, post-superposition RMSD, and aligned mobile coords. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If shapes mismatch or fewer than 3 atoms are given (degenerate; rotation under-determined). |
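The SVD-based fit described above can be sketched in plain NumPy as follows. This is a textbook Kabsch implementation written for illustration, not the library's source. The name `kabsch_sketch`, the tuple return value (instead of a `SuperpositionResult`), and the convention `aligned = mobile @ rotation.T + translation` are assumptions.

```python
import numpy as np

def kabsch_sketch(mobile, reference, weights=None):
    """Return (rotation, translation, rmsd, aligned) for two (N, 3) arrays.

    Assumed convention: aligned = mobile @ rotation.T + translation.
    """
    mobile = np.asarray(mobile, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if mobile.shape != reference.shape or mobile.ndim != 2 or mobile.shape[1] != 3:
        raise ValueError("mobile and reference must both have shape (N, 3)")
    if len(mobile) < 3:
        raise ValueError("need at least 3 atoms to determine a rotation")
    w = np.ones(len(mobile)) if weights is None else np.asarray(weights, dtype=np.float64)
    w = w / w.sum()

    # Weighted centroids, then centered coordinates.
    mob_center = w @ mobile
    ref_center = w @ reference
    p = mobile - mob_center
    q = reference - ref_center

    # Weighted covariance matrix and its SVD.
    h = (p * w[:, None]).T @ q
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that det(rotation) = +1 (no reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = ref_center - rotation @ mob_center

    aligned = mobile @ rotation.T + translation
    rmsd = float(np.sqrt((w * ((aligned - reference) ** 2).sum(axis=1)).sum()))
    return rotation, translation, rmsd, aligned

# Build a rotated and shifted copy of a random point set, then recover the fit.
rng = np.random.default_rng(0)
reference = rng.normal(size=(10, 3))
theta = 0.7
true_rot = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
mobile = reference @ true_rot.T + np.array([1.0, -2.0, 0.5])
rotation, translation, rmsd, aligned = kabsch_sketch(mobile, reference)
```

The determinant correction is what keeps the result a proper rotation: without it, a mirror-image input could be "aligned" by a reflection, which no rigid-body motion can achieve.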