BASALT: Binning Across a Series of Assemblies Toolkit
How to download BASALT_WEIGHT from Hugging Face
We highly recommend installing huggingface_hub first:
pip install -U huggingface_hub
Then run the following commands to log in and download the model weights into ./BASALT under your current directory:
huggingface-cli login
huggingface-cli download PKU-EMBL/BASALT_WEIGHT --local-dir ./BASALT
You can then set the BASALT_WEIGHT environment variable as described in the installation section below, instead of fetching the weights from Google Drive or by another download method.
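Before pointing BASALT_WEIGHT at the download, it can help to confirm that the target directory exists and is not empty. The following is a minimal sketch, assuming the weights were downloaded to ./BASALT as in the command above; the function name `weights_dir_ok` is hypothetical and not part of BASALT.

```shell
# Hypothetical helper (not part of BASALT): succeed only if the given
# directory exists and contains at least one entry.
weights_dir_ok() {
    local dir="$1"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Assumption: the weights were downloaded to ./BASALT.
if weights_dir_ok ./BASALT; then
    export BASALT_WEIGHT="$(pwd)/BASALT"
    echo "BASALT_WEIGHT set to $BASALT_WEIGHT"
else
    echo "./BASALT is missing or empty; re-run the download step" >&2
fi
```

This only sets the variable for the current shell session; the ~/.bashrc entry described in the installation section is still needed to make it permanent.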
BASALT v1.2.0 INSTALLATION
Please refer to the installation guide of BASALT v1.2.0:
git clone https://github.com/EMBL-PKU/BASALT.git
cd BASALT
conda create -n basalt_env -c conda-forge -c bioconda \
    python=3.12 \
    megahit metabat2 maxbin2 concoct prodigal semibin \
    bedtools blast bowtie2 diamond checkm2 \
    unicycler spades samtools racon pplacer pilon \
    ncbi-vdb minimap2 miniasm idba hmmer entrez-direct \
    biopython uv --yes
conda activate basalt_env
uv pip install tensorflow torch torchvision tensorboard tensorboardx \
    lightgbm scikit-learn numpy==1.26.4 python-igraph scipy pandas matplotlib \
    cython biolib joblib tqdm requests checkm-genome

Download the BASALT deep learning model weights:
# Change the download path to suit your environment
python BASALT_models_download.py --path "my_model_folder"

Download the BASALT script files and change their permissions:
chmod +x install.sh
bash install.sh
chmod +x /path/to/basalt/bin/*

Set environment variables by adding the following lines to your ~/.bashrc file:
nano ~/.bashrc
export CHECKM2DB=/path/to/checkm2db/CheckM2_database/uniref100.KO.1.dmnd
export CHECKM_DATA_PATH=/path/to/checkmdb
export BASALT_WEIGHT=/path/to/BASALT
source ~/.bashrc

The Google Drive link below provides the essential files for checkm_db and checkm2_db, as well as the newest Singularity image:
https://drive.google.com/drive/folders/1d0e_2FpYRBAZLwKXl8fA-yDK4b5PBA_E?usp=sharing
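After editing ~/.bashrc, a quick check that the three variables are set and point at existing paths can catch typos early. The following is a minimal sketch that relies only on the three variable names listed above; `check_basalt_env` is a hypothetical helper name, not a command shipped with BASALT.

```shell
# Hypothetical helper (not part of BASALT): report any of the three
# variables that is unset or points at a path that does not exist.
# Returns 0 only when all three are set and their paths exist.
check_basalt_env() {
    local rc=0 name value
    for name in CHECKM2DB CHECKM_DATA_PATH BASALT_WEIGHT; do
        # Read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            rc=1
        elif [ ! -e "$value" ]; then
            echo "$name points at a missing path: $value" >&2
            rc=1
        else
            echo "$name=$value"
        fi
    done
    return "$rc"
}

check_basalt_env || echo "Fix the variables above in ~/.bashrc, then run: source ~/.bashrc" >&2
```

The check only confirms that each path exists; it does not validate the contents of the databases or the model weights.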
Citation
If you use this software in your research, please cite our paper:
Z Qiu, L Yuan, C Lian, B Lin, J Chen, R Mu, X Qiao, L Zhang, Z Xu, L Fan, Y Zhang, S Wang, J Li, H Cao, B Li, B Chen, C Song, Y Liu, L Shi, Y Tian, J Ni, T Zhang, J Zhou, W Zhuang, K Yu. BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis. Nat. Commun. 2024, 15, 2179. https://doi.org/10.1038/s41467-024-46539-7
@article{qiu2024basalt,
title={BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis},
author={Qiu, Zhiguang and Yuan, Li and Lian, Chun-Ang and Lin, Bin and Chen, Jie and Mu, Rong and Qiao, Xuejiao and Zhang, Liyu and Xu, Zheng and Fan, Lu and others},
journal={Nature communications},
volume={15},
number={1},
pages={2179},
year={2024},
publisher={Nature Publishing Group UK London}
}
References
- Qiu, Z. et al. BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis. Nature Communications 15, 2179 (2024).
- Sieber, C.M. et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nature microbiology 3, 836-843 (2018).
- Uritskiy, G.V., DiRuggiero, J. & Taylor, J. MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 1-13 (2018).
- Olm, M.R., Brown, C.T., Brooks, B. & Banfield, J.F. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. The ISME journal 11, 2864-2868 (2017).
- Xue, W., Liu, Z., Zhang, Y. et al. LorBin: efficient binning of long-read metagenomes by multiscale adaptive clustering and evaluation. Nat Commun 16, 9353 (2025).