
molforge.wrappers.md

md

MD-engine wrappers.

Concrete engines
  • :class:OpenMM — implemented (Python-first MD, GPU-accelerated)
  • :class:GROMACS — implemented (CLI-based; the classic MD workhorse)
Shared
  • :class:MDEngine — abstract base for the engine contract
  • :class:MDEngineNotInstalledError — raised when an engine's dependencies (OpenMM, or the gmx executable) aren't found.

All engines expose the same prepare -> minimize -> run flow so users can swap engines without rewriting their pipeline.

MDEngine

Bases: ABC

Abstract base for MD engines (OpenMM, GROMACS, ...).

Subclasses live under :mod:molforge.wrappers.md and must implement :meth:prepare, :meth:minimize, and :meth:run. The contract is deliberately small so users can swap engines without rewriting their pipeline.

Attributes:

Name Type Description
name str

Human-readable engine name (set by subclasses).
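The contract can be illustrated with a toy engine. This is a minimal sketch, not molforge code: `ToySimulation`, `ToyMDEngine`, `HarmonicEngine`, and `run_pipeline` are hypothetical stand-ins that mirror only the documented method names and keyword arguments, with the molecular types replaced by plain Python values.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ToySimulation:
    """Hypothetical stand-in for Simulation: a single 1-D coordinate."""
    x: float
    force_field: str


class ToyMDEngine(ABC):
    """Mirrors the documented prepare -> minimize -> run contract."""

    name: str

    @abstractmethod
    def prepare(self, structure: float, *, force_field: str,
                **kwargs: object) -> ToySimulation: ...

    @abstractmethod
    def minimize(self, simulation: ToySimulation, *, max_iterations: int = 1000,
                 tolerance: float = 10.0, **kwargs: object) -> ToySimulation: ...

    @abstractmethod
    def run(self, simulation: ToySimulation, *, n_steps: int,
            save_every: int = 1, **kwargs: object) -> list: ...


class HarmonicEngine(ToyMDEngine):
    """Toy engine: a particle in the potential U(x) = x**2."""

    name = "harmonic"

    def prepare(self, structure, *, force_field, **kwargs):
        return ToySimulation(x=structure, force_field=force_field)

    def minimize(self, simulation, *, max_iterations=1000, tolerance=10.0, **kwargs):
        # Gradient descent until |dU/dx| = |2x| drops to the tolerance.
        for _ in range(max_iterations):
            if abs(2 * simulation.x) <= tolerance:
                break
            simulation.x -= 0.1 * 2 * simulation.x
        return simulation  # same object, updated in place

    def run(self, simulation, *, n_steps, save_every=1, **kwargs):
        frames = [simulation.x]  # the initial state is frame 0
        for step in range(1, n_steps + 1):
            simulation.x *= 0.99  # placeholder dynamics, not real integration
            if step % save_every == 0:
                frames.append(simulation.x)
        return frames


def run_pipeline(engine: ToyMDEngine, structure: float) -> list:
    """Engine-agnostic pipeline: only the contract methods are used."""
    sim = engine.prepare(structure, force_field="toy")
    sim = engine.minimize(sim, tolerance=1e-3)
    return engine.run(sim, n_steps=10, save_every=5)


frames = run_pipeline(HarmonicEngine(), structure=4.0)
print(len(frames))  # 10 // 5 + 1 frames
```

Because `run_pipeline` touches nothing but `prepare`, `minimize`, and `run`, any other subclass could be passed in its place without changing the function.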

prepare abstractmethod

prepare(
    protein: Protein, *, force_field: str, **kwargs: object
) -> Simulation

Build a :class:Simulation from a protein structure.

Concrete engines handle the engine-specific setup: parameterize the system against the force field, build the topology, place the structure in a (possibly periodic) simulation box, add solvent if requested, etc.

minimize abstractmethod

minimize(
    simulation: Simulation,
    *,
    max_iterations: int = 1000,
    tolerance: float = 10.0,
    **kwargs: object,
) -> Simulation

Energy-minimize the system in place and return it.

Parameters:

Name Type Description Default
simulation Simulation

A :class:Simulation (typically just returned from :meth:prepare).

required
max_iterations int

Limit on minimizer steps.

1000
tolerance float

Convergence tolerance (kJ/mol/nm).

10.0

Returns:

Type Description
Simulation

The same :class:Simulation with updated coordinates.

run abstractmethod

run(
    simulation: Simulation,
    *,
    n_steps: int,
    save_every: int = 1,
    **kwargs: object,
) -> Trajectory

Integrate the simulation for n_steps and return a :class:Trajectory containing the recorded frames.

Parameters:

Name Type Description Default
simulation Simulation

A :class:Simulation.

required
n_steps int

Number of integrator steps to run.

required
save_every int

Record a frame every save_every steps. A trajectory has n_steps // save_every + 1 frames (the +1 is the initial state).

1

Returns:

Type Description
Trajectory

A :class:Trajectory containing the recorded frames.
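The frame-count rule for save_every is easy to check by hand. The helper below is a hypothetical illustration of the documented formula, not part of molforge, and its argument validation is an assumption.

```python
def expected_frames(n_steps: int, save_every: int = 1) -> int:
    """Frame count per the documented rule: n_steps // save_every + 1."""
    if n_steps < 0 or save_every < 1:  # assumed bounds, not taken from molforge
        raise ValueError("n_steps must be >= 0 and save_every must be >= 1")
    return n_steps // save_every + 1


print(expected_frames(5000, 500))    # 11
print(expected_frames(50_000, 100))  # 501
print(expected_frames(1050, 100))    # 11: the trailing 50 steps add no frame
```

Integer division means steps after the last full save_every interval are integrated but not recorded.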

MDEngineNotInstalledError

Bases: ImportError

Raised when an MD engine's heavy dependencies aren't installed.
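Because the error derives from ImportError, callers can catch it specifically or through a broader `except ImportError`. The sketch below shows one way to fall back to another engine. It defines a local stand-in exception with the same base class, and `first_available` plus the two probe functions are hypothetical; each probe stands for whatever first call actually touches the engine.

```python
class MDEngineNotInstalledError(ImportError):
    """Local stand-in with the documented base class."""


def first_available(probes):
    """Return (label, result) for the first probe that does not raise."""
    missing = []
    for label, probe in probes:
        try:
            return label, probe()
        except MDEngineNotInstalledError as exc:
            missing.append(f"{label}: {exc}")
    raise MDEngineNotInstalledError("no MD engine available; " + "; ".join(missing))


def probe_gromacs():
    # Simulates a machine without the gmx executable.
    raise MDEngineNotInstalledError("gmx executable not found")


def probe_openmm():
    # Simulates a working engine; a real probe would return a Simulation.
    return "prepared"


label, result = first_available([("gromacs", probe_gromacs), ("openmm", probe_openmm)])
print(label)  # openmm
```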

Simulation dataclass

Simulation(
    topology: Protein,
    coordinates: NDArray[float32],
    velocities: NDArray[float32] | None = None,
    time: float = 0.0,
    force_field: str = "",
    temperature: float = 300.0,
    timestep: float = 0.002,
    engine_handle: object | None = None,
    metadata: dict[str, object] = dict(),
)

The state of an in-progress MD simulation.

Attributes:

Name Type Description
topology Protein

The system's :class:Protein (atoms + connectivity).

coordinates NDArray[float32]

(n_atoms, 3) float32 current positions in Å.

velocities NDArray[float32] | None

(n_atoms, 3) float32 current velocities. None until the simulation has been initialized with a thermostat target.

time float

Current simulation time (ps).

force_field str

Force-field name (e.g. "amber99sb", "amber14-all").

temperature float

Thermostat target temperature (K).

timestep float

Integrator timestep (ps).

engine_handle object | None

Engine-private. Not part of the public API. An opaque reference to whatever live state the engine wrapper that produced this :class:Simulation needs to resume it — for OpenMM this is the openmm.app.Simulation object, for GROMACS a handle to the run directory, and so on. Its concrete type is intentionally object: callers must not inspect it, depend on its type, or set it themselves. It is engine wrapper ↔ engine wrapper plumbing.

Two consequences worth being explicit about:

  • It is not serialized. :class:Simulation is a plain dataclass, but engine_handle typically wraps C-extension state that cannot be pickled. Any persistence layer must drop this field and have the engine wrapper rebuild it on resume.
  • It carries no semantic-versioning guarantee. The set of things that may appear here, and the fact that the field exists at all, can change between minor releases.

For per-simulation data you do want to read, use :attr:metadata.

metadata dict[str, object]

Free-form engine-specific extras. Unlike engine_handle this is plain, inspectable data (strings, numbers, arrays) and is safe to read and serialize.
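The persistence rule for engine_handle can be sketched with a stand-in dataclass. `SimState`, `dump_state`, and `load_state` are hypothetical names that copy only a few of the documented fields; a `threading.Lock` plays the role of unserializable engine state.

```python
import json
import threading
from dataclasses import asdict, dataclass, field, replace
from typing import Optional


@dataclass
class SimState:
    """Hypothetical stand-in with a subset of the documented Simulation fields."""
    time: float = 0.0
    force_field: str = ""
    engine_handle: Optional[object] = None
    metadata: dict = field(default_factory=dict)


def dump_state(state: SimState) -> str:
    """Serialize a copy of the state with the engine-private handle dropped."""
    portable = replace(state, engine_handle=None)
    return json.dumps(asdict(portable))


def load_state(text: str) -> SimState:
    """Rebuild the state; the engine wrapper would recreate engine_handle."""
    return SimState(**json.loads(text))


live = SimState(time=2.0, force_field="amber14-all",
                engine_handle=threading.Lock(),  # cannot be copied or serialized
                metadata={"run_dir": "/tmp/run"})

restored = load_state(dump_state(live))
print(restored.engine_handle)  # None
print(restored.metadata)       # {'run_dir': '/tmp/run'}
```

Dropping the handle before calling `asdict` matters here: `asdict` deep-copies field values, and that fails for objects such as locks.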

Trajectory dataclass

Trajectory(
    topology: Protein,
    coordinates: NDArray[float32],
    times: NDArray[float64] | None = None,
    energies: NDArray[float64] | None = None,
    temperatures: NDArray[float64] | None = None,
    metadata: dict[str, object] = dict(),
)

A frame-indexed MD trajectory.

Attributes:

Name Type Description
topology Protein

A :class:molforge.core.Protein defining the atoms, their elements/names/connectivity. The topology is the same across all frames.

coordinates NDArray[float32]

(n_frames, n_atoms, 3) float32 array of per-frame coordinates in Å.

times NDArray[float64] | None

(n_frames,) float array of simulation time per frame, in picoseconds. None if not recorded.

energies NDArray[float64] | None

(n_frames,) float array of potential energies (kJ/mol). None if not recorded.

temperatures NDArray[float64] | None

(n_frames,) float array of instantaneous temperatures (K). None if not recorded.

metadata dict[str, object]

Engine-specific extras (force field name, integrator, timestep, etc.).

n_frames property

n_frames: int

Number of frames in the trajectory.

n_atoms property

n_atoms: int

Number of atoms (same across all frames).

frame

frame(i: int) -> Protein

Return frame i as a :class:Protein snapshot.

The returned Protein shares the topology of this trajectory but has its own coordinate array.

GROMACS

GROMACS(
    *,
    gmx_executable: str = "gmx",
    water_model: str = "none",
    box_margin: float = 1.0,
    box_type: str = "cubic",
    verbose: bool = False,
)

Bases: MDEngine

Wrapper around the GROMACS MD engine.

Parameters:

Name Type Description Default
gmx_executable str

Name or path of the GROMACS driver binary. Defaults to "gmx"; set this when GROMACS is installed under a different name (e.g. "gmx_mpi") or not on PATH. Resolution is lazy — construction never touches the filesystem, so a GROMACS() instance is cheap to create even where GROMACS is not installed.

'gmx'
water_model str

Water model passed to pdb2gmx -water. Use "none" (the default) for a vacuum simulation; any other value triggers a gmx solvate step in :meth:prepare.

'none'
box_margin float

Minimum distance (nm) between the solute and the box edge, passed to editconf -d.

1.0
box_type str

Box shape for editconf -bt ("cubic", "dodecahedron", "octahedron", ...).

'cubic'
verbose bool

When True, GROMACS subprocess stdout/stderr is not captured, so it streams to the console. Useful for debugging a failing run.

False
Example

from molforge.wrappers.md import GROMACS
import molforge as mf

protein = mf.load("protein.pdb")
engine = GROMACS(water_model="tip3p")
sim = engine.prepare(protein, force_field="amber99sb-ildn")
sim = engine.minimize(sim, max_iterations=500)
traj = engine.run(sim, n_steps=5000, save_every=500)
traj.n_frames  # 11
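The lazy resolution of gmx_executable can be pictured as follows. This is a sketch of the general pattern only, assuming a `shutil.which` lookup; `LazyExecutable` is a hypothetical helper, the exception is a local stand-in, and the wrapper's real lookup logic may differ.

```python
import shutil
from typing import Optional


class MDEngineNotInstalledError(ImportError):
    """Local stand-in with the documented base class."""


class LazyExecutable:
    """Hypothetical helper: remember a name now, look it up on first use."""

    def __init__(self, name: str = "gmx") -> None:
        self.name = name  # no filesystem access at construction time
        self._path: Optional[str] = None

    def resolve(self) -> str:
        if self._path is None:
            # Searches PATH for a bare name, or checks an explicit path directly.
            found = shutil.which(self.name)
            if found is None:
                raise MDEngineNotInstalledError(f"executable {self.name!r} not found")
            self._path = found
        return self._path


gmx = LazyExecutable("definitely-not-a-real-gmx-binary")  # cheap, never raises
try:
    gmx.resolve()
except MDEngineNotInstalledError as exc:
    print(f"not installed: {exc}")
```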

prepare

prepare(
    protein: Protein,
    *,
    force_field: str = "amber99sb-ildn",
    temperature: float = 300.0,
    timestep: float = 0.002,
    **_kwargs: object,
) -> Simulation

Build a GROMACS :class:Simulation from a protein structure.

Runs pdb2gmx → editconf → (optionally) solvate in a fresh run directory.

Parameters:

Name Type Description Default
protein Protein

Input structure. pdb2gmx needs every heavy atom of every residue it is asked to parameterize, and only knows the force field's standard residues.

required
force_field str

A GROMACS force-field name (see :data:_KNOWN_FORCE_FIELDS).

'amber99sb-ildn'
temperature float

Thermostat target (K), stored on the returned :class:Simulation and used by :meth:run.

300.0
timestep float

Integrator timestep (ps), likewise stored and used by :meth:run.

0.002

Returns:

Type Description
Simulation

A :class:Simulation whose engine_handle and metadata["run_dir"] both hold the run-directory path.

Raises:

Type Description
MDEngineNotInstalledError

If gmx cannot be found.

ValueError

If force_field is not recognized.

RuntimeError

If any GROMACS step fails.
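For orientation, the three preparation steps correspond roughly to the command lines built below. This is a sketch, not the wrapper's actual invocation: `prepare_commands` is a hypothetical helper, every file name is a placeholder, and `spc216.gro` is assumed as the solvent box (it suits three-site water models such as tip3p). Only the -ff, -water, -d, and -bt options are tied to parameters documented on this page.

```python
def prepare_commands(gmx="gmx", force_field="amber99sb-ildn", water_model="none",
                     box_margin=1.0, box_type="cubic"):
    """Return argv lists for pdb2gmx -> editconf -> (optional) solvate."""
    commands = [
        # Parameterize the structure and write the topology.
        [gmx, "pdb2gmx", "-f", "input.pdb", "-o", "processed.gro",
         "-p", "topol.top", "-ff", force_field, "-water", water_model],
        # Center the solute in a box with the requested margin (nm) and shape.
        [gmx, "editconf", "-f", "processed.gro", "-o", "boxed.gro",
         "-c", "-d", str(box_margin), "-bt", box_type],
    ]
    if water_model != "none":
        # Fill the box with water and update the topology.
        commands.append([gmx, "solvate", "-cp", "boxed.gro", "-cs", "spc216.gro",
                         "-o", "solvated.gro", "-p", "topol.top"])
    return commands


for argv in prepare_commands(water_model="tip3p"):
    print(" ".join(argv))
```

With the default water_model="none" the list has two entries, which matches the vacuum case described for the constructor.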

minimize

minimize(
    simulation: Simulation,
    *,
    max_iterations: int = 1000,
    tolerance: float = 10.0,
    **_kwargs: object,
) -> Simulation

Energy-minimize the system with steepest descent.

Writes an EM .mdp, assembles a .tpr with grompp, and runs mdrun. Returns the same :class:Simulation with its coordinates updated to the minimized structure.

Parameters:

Name Type Description Default
simulation Simulation

A :class:Simulation from :meth:prepare.

required
max_iterations int

Cap on steepest-descent steps (nsteps).

1000
tolerance float

Convergence tolerance in kJ/mol/nm (emtol).

10.0

Raises:

Type Description
MDEngineNotInstalledError

If gmx cannot be found.

ValueError

If the simulation has no GROMACS run directory.

RuntimeError

If a GROMACS step fails.
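The mapping from the minimize arguments to an EM .mdp might look like the sketch below. `em_mdp` is a hypothetical helper; the option names (integrator, nsteps, emtol) are standard GROMACS .mdp keys, but the file the wrapper actually writes may contain more settings.

```python
def em_mdp(max_iterations: int = 1000, tolerance: float = 10.0) -> str:
    """Render a minimal steepest-descent .mdp as text."""
    options = {
        "integrator": "steep",     # steepest descent
        "nsteps": max_iterations,  # cap on minimizer steps
        "emtol": tolerance,        # stop when max force < emtol (kJ/mol/nm)
    }
    return "".join(f"{key:<12}= {value}\n" for key, value in options.items())


print(em_mdp(max_iterations=500, tolerance=5.0), end="")
```

A file like this is what grompp combines with the structure and topology to produce the .tpr that mdrun executes.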

run

run(
    simulation: Simulation,
    *,
    n_steps: int,
    save_every: int = 1,
    **_kwargs: object,
) -> Trajectory

Integrate the system and return a :class:Trajectory.

Writes a production MD .mdp, assembles the .tpr, and runs mdrun. It then reads the frames back by converting the .xtc to a multi-model PDB with trjconv, and extracts the energies with gmx energy.

Parameters:

Name Type Description Default
simulation Simulation

A :class:Simulation from :meth:prepare (typically after :meth:minimize).

required
n_steps int

Number of integrator steps.

required
save_every int

Record a frame every save_every steps. The trajectory has n_steps // save_every + 1 frames (the +1 is the initial frame).

1

Raises:

Type Description
MDEngineNotInstalledError

If gmx cannot be found.

ValueError

If n_steps or save_every is invalid, or the simulation has no run directory.

RuntimeError

If a GROMACS step fails or produces no frames.
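A production .mdp that honours n_steps and save_every could be generated as sketched here. `md_mdp` is a hypothetical helper: the validation rules and the V-rescale thermostat settings are assumptions made for illustration, while the option names themselves are standard GROMACS .mdp keys.

```python
def md_mdp(n_steps: int, save_every: int = 1, *, timestep: float = 0.002,
           temperature: float = 300.0) -> str:
    """Render a production-MD .mdp; the validation rules are assumed."""
    if n_steps < 1:
        raise ValueError(f"n_steps must be >= 1, got {n_steps}")
    if save_every < 1:
        raise ValueError(f"save_every must be >= 1, got {save_every}")
    options = {
        "integrator": "md",
        "dt": timestep,                    # ps
        "nsteps": n_steps,
        "nstxout-compressed": save_every,  # .xtc frame interval, in steps
        "nstenergy": save_every,           # energy-file interval, in steps
        "tcoupl": "V-rescale",             # assumed thermostat choice
        "tc-grps": "System",
        "tau_t": 0.1,                      # ps, assumed coupling time
        "ref_t": temperature,              # K
    }
    return "".join(f"{key:<20}= {value}\n" for key, value in options.items())


print(md_mdp(5000, 500), end="")
```

Writing coordinates and energies at the same interval keeps the per-frame energies aligned with the per-frame coordinates.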

OpenMM

OpenMM(
    *,
    platform: str | None = None,
    precision: str = "mixed",
    nonbonded_cutoff: float = 1.0,
    nonbonded_method: str = "NoCutoff",
    constraints: str | None = "HBonds",
    add_hydrogens: bool = True,
)

Bases: MDEngine

OpenMM MD engine wrapper.

Parameters:

Name Type Description Default
platform str | None

"CUDA", "CPU", "OpenCL", or None to let OpenMM pick the fastest available. CUDA is by far the fastest on supported NVIDIA GPUs.

None
precision str

"mixed" (default) or "single" / "double". Mixed is the standard choice — accurate enough for biology, ~2x faster than double on GPU.

'mixed'
nonbonded_cutoff float

Cutoff distance for nonbonded interactions in nanometers. Default 1.0 nm. Increase for larger boxes; leave alone for typical small-protein simulations.

1.0
nonbonded_method str

"NoCutoff" (default, suitable for implicit solvent / vacuum), "CutoffNonPeriodic", "PME" (requires periodic box).

'NoCutoff'
constraints str | None

"HBonds" (default — bonds to H are constrained, allowing 2-fs timestep), "AllBonds", or None.

'HBonds'
add_hydrogens bool

When True (default), missing hydrogens are added with OpenMM's Modeller.addHydrogens during :meth:prepare. This is what makes a heavy-atom structure — the normal output of folding and docking engines — usable as-is: a force field needs explicit hydrogens, and without this step prepare fails with a cryptic "no template found" error. The step is idempotent, so a structure that already has hydrogens is unaffected. Set False only if you have pre-protonated the structure yourself and want OpenMM to use exactly those atoms.

True
Example

from molforge.wrappers.md import OpenMM

engine = OpenMM(platform="CUDA")
sim = engine.prepare(my_protein, force_field="amber14-all")
sim = engine.minimize(sim)
traj = engine.run(sim, n_steps=50_000, save_every=500)
traj.n_frames  # 101

prepare

prepare(
    protein: Protein,
    *,
    force_field: str = "amber14-all",
    temperature: float = 300.0,
    timestep: float = 0.002,
    **_kwargs: object,
) -> Simulation

Build an OpenMM simulation from a :class:Protein.

Parameters:

Name Type Description Default
protein Protein

input structure (no solvent — vacuum / implicit by default).

required
force_field str

name in :data:_FORCE_FIELD_FILES or any XML filename OpenMM can find.

'amber14-all'
temperature float

thermostat target in K (default 300).

300.0
timestep float

integrator timestep in picoseconds (default 0.002 = 2 fs; compatible with HBonds constraints).

0.002

Returns:

Type Description
Simulation

A :class:Simulation whose engine_handle is the OpenMM Simulation object. Drop down to OpenMM's API via that attribute for anything not exposed by molforge.

minimize

minimize(
    simulation: Simulation,
    *,
    max_iterations: int = 1000,
    tolerance: float = 10.0,
    **_kwargs: object,
) -> Simulation

Energy-minimize simulation's current configuration in place.

Parameters:

Name Type Description Default
simulation Simulation

a :class:Simulation from :meth:prepare.

required
max_iterations int

cap on minimizer steps. 0 = unlimited.

1000
tolerance float

convergence threshold in kJ/mol/nm.

10.0

Returns:

Type Description
Simulation

The same simulation with updated coordinates and zero velocities (OpenMM resets velocities after minimization).

run

run(
    simulation: Simulation,
    *,
    n_steps: int,
    save_every: int = 100,
    **_kwargs: object,
) -> Trajectory

Integrate simulation for n_steps and return a Trajectory.

Parameters:

Name Type Description Default
simulation Simulation

a :class:Simulation (should be minimized first).

required
n_steps int

number of MD steps to run. With the default 2 fs timestep, 50,000 steps = 100 ps.

required
save_every int

record a frame every N steps. Default 100; a 50,000-step run with save_every=100 gives 501 frames.

100

Returns:

Type Description
Trajectory

A :class:Trajectory whose coordinates has shape (n_frames, n_atoms, 3) and whose times holds the corresponding picosecond timestamps.
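The relationship between n_steps, save_every, the timestep, and the timestamps can be worked through in a few lines. `frame_times_ps` is a hypothetical helper, and it assumes frames are evenly spaced starting from the simulation's current time; the wrapper reports whatever times the engine recorded.

```python
def frame_times_ps(n_steps: int, save_every: int, timestep_ps: float = 0.002,
                   start_ps: float = 0.0) -> list:
    """Timestamps of the saved frames, assuming even spacing from start_ps."""
    n_frames = n_steps // save_every + 1
    return [start_ps + i * save_every * timestep_ps for i in range(n_frames)]


times = frame_times_ps(50_000, 100)
print(len(times))          # 501 frames
print(times[1])            # 0.2 ps between consecutive frames
print(round(times[-1], 6)) # 100.0 ps of simulated time
```

With the default 2 fs timestep, save_every=100 therefore corresponds to one frame every 0.2 ps.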