Cluster & HPC
We provide SLURM-oriented scripts for cluster workflows.
Pipeline Scripts
| Script | Description |
|---|---|
| `complex_and_ts_search_local.sh` | Local pipeline template |
| `complex_and_ts_search_cpu.sh` | SLURM CPU pipeline template |
| `complex_and_ts_search_gpu.sh` | GPU validator-oriented template |
| `validation_goflow.sh` | Validate model-generated TS guesses on cluster |
Running on SLURM
As with all the other scripts, make sure to adjust them to your environment before submitting.
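One way to adapt a template without editing it is to pass site-specific settings on the `sbatch` command line, where they generally take precedence over `#SBATCH` directives inside the script. The sketch below only assembles and prints such a command. The variable names and their default values (`cpu`, `24:00:00`, `4`) are placeholders, not settings taken from this project.

```shell
#!/usr/bin/env bash
# Hypothetical site settings: replace the defaults with values valid on your cluster.
PARTITION="${PARTITION:-cpu}"
TIME_LIMIT="${TIME_LIMIT:-24:00:00}"
CPUS="${CPUS:-4}"

# Assemble the sbatch call as an array so each argument stays a single word.
build_sbatch_cmd() {
  local script="$1"
  SBATCH_CMD=(
    sbatch
    --partition="$PARTITION"
    --time="$TIME_LIMIT"
    --cpus-per-task="$CPUS"
    "$script"
  )
}

build_sbatch_cmd complex_and_ts_search_cpu.sh

# Print the command for review; run "${SBATCH_CMD[@]}" instead to submit.
printf '%s\n' "${SBATCH_CMD[*]}"
```

Printing first makes it easy to check the partition and time limit against your cluster's policies before anything is queued.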
Parallel Execution with Hydra
python -m motsart.complex_finder.complex_finder -m \
hydra/launcher=joblib \
hydra.launcher.n_jobs=4 \
env=cluster \
"env.rxn_num=range(0,32)"
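The command above sweeps `env.rxn_num` over 32 values and runs up to four of them at a time on one node. If the sweep should be spread over several nodes, one possible approach is to give each SLURM array task its own sub-range. The helper below is a sketch of that idea, not part of the provided scripts. The function name `rxn_range_override` and the chunk size of 8 are illustrative, and it assumes Hydra's `range` sweep excludes the stop value, as Python's `range` does.

```shell
#!/usr/bin/env bash
# Print the Hydra override for one chunk of reaction indices.
# Arguments: task index, chunk size, total number of reactions.
rxn_range_override() {
  local task_id="$1" chunk="$2" total="$3"
  local start=$(( task_id * chunk ))
  local end=$(( start + chunk ))
  # Clamp the last chunk so it never runs past the total.
  if (( end > total )); then
    end=$total
  fi
  echo "env.rxn_num=range(${start},${end})"
}

# SLURM_ARRAY_TASK_ID is set by SLURM inside array jobs; default to 0 elsewhere.
OVERRIDE="$(rxn_range_override "${SLURM_ARRAY_TASK_ID:-0}" 8 32)"

# Show the full command this task would run.
echo "python -m motsart.complex_finder.complex_finder -m hydra/launcher=joblib hydra.launcher.n_jobs=4 env=cluster \"${OVERRIDE}\""
```

With a chunk size of 8 and 32 reactions, array indices 0 to 3 cover the whole range without overlap.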
Learning on Cluster
Typical workflow:
- Push code and data: `bash push_to_musica.sh`
- Run base pipeline: `sbatch complex_and_ts_search_musica.sh`
- Prepare training/eval data: `bash create_fine_tune_dft_data.sh` or `bash create_preprocess_rtsp_pretrain_data.sh`
- Import model-generated samples: `bash fetch_and_push_data_pkl_to_results.sh`
- Validate model-generated guesses: `sbatch validation_goflow.sh`
- Compute stats:
python -m motsart.validator.compute_stats \
--cluster-folder /data/results_cluster \
--learning-folder /data/results_goflow/finetune_noise_1_TS \
--validator DFTValidator \
--output-csv /data/results_goflow/finetune_noise_1_TS/stats_al.csv \
--cluster-ts-method racer_ts \
--al-ts-method learning \
--mode both
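The compute-stats call repeats the run folder in both `--learning-folder` and `--output-csv`, so the two can drift apart when the command is copied for another run. A small wrapper that derives both from one run name avoids this. In the sketch below, the flags and paths are copied from the command above, while the variable names (`CLUSTER_DIR`, `RESULTS_ROOT`, `RUN_NAME`) and the function `build_stats_cmd` are illustrative.

```shell
#!/usr/bin/env bash
# Defaults mirror the paths in the command above; override via the environment.
CLUSTER_DIR="${CLUSTER_DIR:-/data/results_cluster}"
RESULTS_ROOT="${RESULTS_ROOT:-/data/results_goflow}"
RUN_NAME="${RUN_NAME:-finetune_noise_1_TS}"

# Build the compute_stats call so the learning folder and CSV path stay in sync.
build_stats_cmd() {
  local learning_dir="${RESULTS_ROOT}/${RUN_NAME}"
  STATS_CMD=(
    python -m motsart.validator.compute_stats
    --cluster-folder "$CLUSTER_DIR"
    --learning-folder "$learning_dir"
    --validator DFTValidator
    --output-csv "${learning_dir}/stats_al.csv"
    --cluster-ts-method racer_ts
    --al-ts-method learning
    --mode both
  )
}

build_stats_cmd

# Print the shell-quoted command for review; run "${STATS_CMD[@]}" to execute it.
printf '%q ' "${STATS_CMD[@]}"; echo
```

Setting `RUN_NAME` in the environment before calling the script switches both arguments to another run folder at once.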
For a general walkthrough, see Paper Reproduction Workflow.