molforge.wrappers.folding¶
folding ¶
Folding-engine wrappers.
Concrete engines
- ESMFold: implemented (single-sequence transformer; fast)
- AlphaFold: implemented (MSA-based via ColabFold)
- Boltz: implemented (Boltz-1 / Boltz-2 via subprocess)
- RoseTTAFold: implemented (RoseTTAFold All-Atom; subprocess)
All engines write per-residue confidence to
protein.metadata["confidence_per_residue"] so downstream code can
read confidence uniformly regardless of which engine produced the
structure.
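As a minimal sketch of what uniform access allows, the helper below flags low-confidence residues without knowing which engine produced the structure. It assumes the stored value is a sequence of floats on a 0-100 scale, one per residue; the source does not state the type or scale, and the threshold of 70 is arbitrary. `SimpleNamespace` stands in for a real `Protein`.

```python
from types import SimpleNamespace


def low_confidence_residues(protein, threshold=70.0):
    """Return 0-based indices of residues whose confidence is below threshold."""
    # Assumption: one float per residue, on a 0-100 (pLDDT-like) scale.
    scores = protein.metadata["confidence_per_residue"]
    return [index for index, score in enumerate(scores) if score < threshold]


# Stand-in for a Protein returned by any of the engines.
fake_protein = SimpleNamespace(
    metadata={"confidence_per_residue": [91.0, 65.5, 88.2, 42.0]}
)
print(low_confidence_residues(fake_protein))  # [1, 3]
```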
FoldingEngine ¶
Bases: ABC
Abstract base for sequence-to-structure prediction engines.
Subclasses must implement predict. The default implementation
of predict_many is a simple loop; engines that support
batching (most do) should override it for efficiency.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | Human-readable engine name (set by subclasses). |
predict abstractmethod ¶
Predict a single structure from a sequence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sequence` | `str` | One-letter amino-acid sequence. Whitespace is stripped; non-letter characters raise an error. | required |
| `**kwargs` | `object` | Engine-specific options. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Protein` | A Protein carrying, at minimum, per-residue confidence in metadata["confidence_per_residue"]. |
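The input handling described for `sequence` could look roughly like the sketch below. The helper name `normalize_sequence` is hypothetical, and `ValueError` is an assumption: the exception class is not legible in this page.

```python
def normalize_sequence(sequence: str) -> str:
    """Strip all whitespace and reject anything that is not an ASCII letter."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces removes spaces, tabs, and newlines.
    cleaned = "".join(sequence.split())
    # isalpha() is False for the empty string, so empty input is rejected too.
    if not (cleaned.isascii() and cleaned.isalpha()):
        # Assumption: the real wrapper may raise a different exception class.
        raise ValueError(f"sequence contains non-letter characters: {sequence!r}")
    return cleaned


print(normalize_sequence(" MKTV RQER\nLKSI "))  # MKTVRQERLKSI
```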
predict_many ¶
Predict structures for a batch of sequences.
The default implementation is a serial loop. Engines with batch APIs (almost all of them) should override this.
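The base-class contract can be sketched as follows. Everything here is a stand-in for illustration: the `Protein` dataclass, the `ConstantEngine` subclass, and the exact `predict_many` signature are assumptions, since the page documents the behavior but not the signature.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Protein:
    """Hypothetical stand-in for the real Protein class."""
    sequence: str
    metadata: dict = field(default_factory=dict)


class FoldingEngine(ABC):
    """Sketch of the documented contract, not the library's source."""
    name: str = "base"

    @abstractmethod
    def predict(self, sequence: str, **kwargs: object) -> Protein:
        """Predict a single structure from a sequence."""

    def predict_many(self, sequences, **kwargs):
        # Default behavior: a serial loop over predict().
        return [self.predict(sequence, **kwargs) for sequence in sequences]


class ConstantEngine(FoldingEngine):
    """Toy engine that assigns the same confidence to every residue."""
    name = "constant"

    def predict(self, sequence: str, **kwargs: object) -> Protein:
        scores = [50.0] * len(sequence)
        return Protein(sequence, {"confidence_per_residue": scores})


engine = ConstantEngine()
proteins = engine.predict_many(["MKTV", "AC"])
print([len(p.metadata["confidence_per_residue"]) for p in proteins])  # [4, 2]
```

An engine with a real batch API would override `predict_many` to submit all sequences in one call instead of looping.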
FoldingEngineNotInstalledError ¶
Bases: ImportError
Raised when a folding engine's heavy dependencies aren't installed.
The message points at the relevant pip install extras so users
can fix it without grepping the docs.
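One common way to produce such an error is to import the heavy dependency lazily and re-raise with an install hint. The sketch below illustrates that pattern under assumptions: the helper `require_dependency` is hypothetical, and the `molforge[<extra>]` spelling of the extras is a guess, not taken from the package.

```python
import importlib
from types import ModuleType


class FoldingEngineNotInstalledError(ImportError):
    """Stand-in mirroring the documented class (an ImportError subclass)."""


def require_dependency(module_name: str, extra: str) -> ModuleType:
    """Import module_name, or raise an error that names the pip extra."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Assumption: the extras name and syntax are illustrative only.
        raise FoldingEngineNotInstalledError(
            f"{module_name!r} is required for this engine. "
            f"Install it with: pip install 'molforge[{extra}]'"
        ) from exc


try:
    require_dependency("module_that_does_not_exist_xyz", "esmfold")
except ImportError as err:  # existing ImportError handlers still match
    print(type(err).__name__)  # FoldingEngineNotInstalledError
```

Because the class derives from `ImportError`, callers that already guard optional imports with `except ImportError` keep working unchanged.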
AlphaFold ¶
AlphaFold(
*,
mode: Literal["local", "server"] = "local",
num_models: int = 5,
num_recycles: int = 3,
msa_mode: str = "mmseqs2_uniref_env",
device: str | None = None,
model_type: str = "AlphaFold2-ptm",
)
Bases: FoldingEngine
Wrapper around AlphaFold via ColabFold's Python API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mode` | `Literal['local', 'server']` | | `'local'` |
| `num_models` | `int` | How many of the 5 AlphaFold models to run. The default, 5, runs the full ensemble. Set to 1 for faster preview predictions; the AlphaFold paper showed that the single best of the five models captures most of the accuracy. | `5` |
| `num_recycles` | `int` | Number of AlphaFold recycling iterations. The default, 3, matches the original paper. More iterations are slower but slightly more accurate, which is useful for low-confidence regions. | `3` |
| `msa_mode` | `str` | ColabFold MSA pipeline. | `'mmseqs2_uniref_env'` |
| `device` | `str \| None` | | `None` |
| `model_type` | `str` | | `'AlphaFold2-ptm'` |
Example

    >>> from molforge.wrappers.folding import AlphaFold
    >>> engine = AlphaFold(num_models=1, num_recycles=3)  # fastest preview
    >>> protein = engine.predict("MKTVRQERLKSIVRILERSK")
    >>> protein.metadata["mean_confidence"]
    87.2
predict ¶
Fold a single sequence into a Protein.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sequence` | `str` | One-letter amino-acid sequence. | required |
| `**kwargs` | `object` | Reserved for future per-call options. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Protein` | A Protein. As for every engine, per-residue confidence is stored in metadata["confidence_per_residue"]. |
Boltz ¶
Boltz(
*,
model_version: Literal["boltz1", "boltz2"] = "boltz2",
use_msa_server: bool = True,
recycling_steps: int | None = None,
diffusion_samples: int | None = None,
sampling_steps: int | None = None,
device: str | None = None,
executable: str | None = None,
cache_dir: str | None = None,
)
Bases: FoldingEngine
Wrapper around the Boltz biomolecular prediction model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_version` | `Literal['boltz1', 'boltz2']` | | `'boltz2'` |
| `use_msa_server` | `bool` | If … | `True` |
| `recycling_steps` | `int \| None` | How many trunk-recycling rounds Boltz runs. | `None` |
| `diffusion_samples` | `int \| None` | Number of diffusion samples drawn per prediction. | `None` |
| `sampling_steps` | `int \| None` | Number of diffusion sampling steps. | `None` |
| `device` | `str \| None` | Which device to use. | `None` |
| `executable` | `str \| None` | Path to the … | `None` |
| `cache_dir` | `str \| None` | Where Boltz looks for, or downloads, its weights. | `None` |
Example

    >>> from molforge.wrappers.folding import Boltz
    >>> engine = Boltz(model_version="boltz2", use_msa_server=True)
    >>> protein = engine.predict("MKTVRQERLKSIVRILERSK")
    >>> protein.metadata["mean_confidence"]
    87.3
    >>> protein.metadata["ptm"]
    0.84
predict ¶
Fold a single sequence into a Protein via the boltz CLI.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sequence` | `str` | One-letter amino-acid sequence. | required |
| `**kwargs` | `object` | Reserved for future per-call options. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Protein` | A Protein. As for every engine, per-residue confidence is stored in metadata["confidence_per_residue"]. |
Raises:
| Type | Description |
|---|---|
| `FoldingEngineNotInstalledError` | If the … |
| `RuntimeError` | If the CLI runs but produces no output, or its output can't be parsed. |
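The two failure modes above follow a pattern common to subprocess-based wrappers: check that the executable exists, run it, then check that it wrote something. The sketch below illustrates that pattern only; `run_prediction_cli` is a hypothetical helper, no real boltz flags are used, and treating a non-zero exit code as a `RuntimeError` is an assumption.

```python
import shutil
import subprocess
from pathlib import Path


class FoldingEngineNotInstalledError(ImportError):
    """Stand-in for the documented exception."""


def run_prediction_cli(executable: str, args: list[str], out_dir: Path) -> list[Path]:
    """Run a CLI and return the files it wrote to out_dir."""
    resolved = shutil.which(executable)
    if resolved is None:
        raise FoldingEngineNotInstalledError(
            f"{executable!r} was not found on PATH"
        )
    result = subprocess.run([resolved, *args], capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: a failing exit code is reported as RuntimeError.
        raise RuntimeError(
            f"{executable} exited with code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    outputs = sorted(p for p in Path(out_dir).iterdir() if p.is_file())
    if not outputs:
        raise RuntimeError(f"{executable} finished but wrote nothing to {out_dir}")
    return outputs
```

A caller can then catch `FoldingEngineNotInstalledError` to print an install hint and `RuntimeError` to report a failed run, keeping the two cases separate.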
ESMFold ¶
ESMFold(
*,
model_name: str = "facebook/esmfold_v1",
device: str | None = None,
chunk_size: int | None = None,
dtype: str = "float32",
)
Bases: FoldingEngine
Wrapper around Meta AI's ESMFold (single-sequence transformer folder).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_name` | `str` | HuggingFace model identifier. | `'facebook/esmfold_v1'` |
| `device` | `str \| None` | Where to run inference. | `None` |
| `chunk_size` | `int \| None` | Axial-attention chunk size; lower values use less memory but run slower. | `None` |
| `dtype` | `str` | | `'float32'` |
Example

    >>> from molforge.wrappers.folding import ESMFold
    >>> engine = ESMFold(device="cuda")
    >>> protein = engine.predict("MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVS")
    >>> protein.metadata["mean_confidence"]
    82.4
predict ¶
Fold a single sequence into a Protein.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sequence` | `str` | One-letter amino-acid sequence. | required |
| `**kwargs` | `object` | Reserved for future per-call options; currently unused. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Protein` | A Protein with the predicted structure; per-residue confidence is stored in metadata["confidence_per_residue"]. |
RoseTTAFold ¶
RoseTTAFold(
*,
repo_dir: str | None = None,
python_executable: str | None = None,
max_cycle: int | None = None,
job_name: str = "molforge_prediction",
extra_overrides: list[str] | None = None,
)
Bases: FoldingEngine
Wrapper around RoseTTAFold All-Atom (RFAA) for single-chain protein folding.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `repo_dir` | `str \| None` | Path to the cloned … | `None` |
| `python_executable` | `str \| None` | Path to the Python interpreter that has the RFAA environment activated. | `None` |
| `max_cycle` | `int \| None` | Hydra override for … | `None` |
| `job_name` | `str` | Name used for output files. | `'molforge_prediction'` |
| `extra_overrides` | `list[str] \| None` | Additional Hydra-style overrides (e.g. …). | `None` |
Example

    >>> from molforge.wrappers.folding import RoseTTAFold
    >>> engine = RoseTTAFold(repo_dir="/opt/RoseTTAFold-All-Atom",
    ...                      max_cycle=10)
    >>> protein = engine.predict("MKTVRQERLKSIVRILERSK")
    >>> protein.metadata["mean_confidence"]
    82.4
    >>> protein.metadata["pae_inter"]  # RFAA's headline confidence
    4.8
predict ¶
Fold a single sequence into a Protein via RFAA.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sequence` | `str` | One-letter amino-acid sequence. | required |
| `**kwargs` | `object` | Reserved for future per-call options. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Protein` | A Protein. As for every engine, per-residue confidence is stored in metadata["confidence_per_residue"]. |
Raises:
| Type | Description |
|---|---|
| `FoldingEngineNotInstalledError` | If … |
| `RuntimeError` | If the CLI fails or produces no output. |
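Hydra overrides are passed on the command line as `key=value` strings, so constructor options such as `job_name`, `max_cycle`, and `extra_overrides` would plausibly be flattened into a list of that shape before the subprocess call. The sketch below shows one way to do that. The function name and both config keys (`job_name`, `example.max_cycle`) are placeholders, not RFAA's real configuration keys, which this page does not show.

```python
def build_hydra_overrides(job_name, max_cycle=None, extra_overrides=None):
    """Assemble Hydra-style key=value overrides in a fixed order."""
    # Placeholder key: the real RFAA config key is not shown in the source.
    overrides = [f"job_name={job_name}"]
    if max_cycle is not None:
        # Placeholder key as well; only emitted when the caller sets a value.
        overrides.append(f"example.max_cycle={max_cycle}")
    # User-supplied overrides go last and are passed through unchanged.
    overrides.extend(extra_overrides or [])
    return overrides


print(build_hydra_overrides("molforge_prediction", max_cycle=10))
# ['job_name=molforge_prediction', 'example.max_cycle=10']
```

Leaving `max_cycle` as `None` emits no override at all, so the tool's own default applies.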