BASALT: Binning Across a Series of Assemblies Toolkit

💻 How to download BASALT_WEIGHT from Hugging Face?

We highly recommend installing huggingface_hub first:

pip install -U huggingface_hub

Then, run the following command to download the model to your current directory:

huggingface-cli login

huggingface-cli download PKU-EMBL/BASALT_WEIGHT --local-dir ./BASALT

You can then set the BASALT_WEIGHT environment variable as described in the installation section below, instead of fetching the weights from Google Drive or by another download method.
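As a minimal sketch, assuming the weights were downloaded to ./BASALT with the command above, the variable can be pointed at that directory by resolving it to an absolute path. The helper name abs_dir is hypothetical and not part of BASALT.

```shell
#!/usr/bin/env bash
# abs_dir is a hypothetical helper: print the absolute path of a directory,
# or return non-zero if the directory does not exist.
abs_dir() {
    (cd "$1" 2>/dev/null && pwd)
}

# Assumption: the weights live in ./BASALT (the --local-dir used above).
if weight_dir="$(abs_dir ./BASALT)"; then
    export BASALT_WEIGHT="$weight_dir"
    echo "BASALT_WEIGHT=$BASALT_WEIGHT"
else
    echo "./BASALT not found; run the download command first." >&2
fi
```

The export only lasts for the current shell session; to make it permanent, add the export line to ~/.bashrc as described in the installation section.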

⏬ BASALT v1.2.0 INSTALLATION

  1. BASALT 1.2.0 installation

    Please refer to the installation guide of BASALT v1.2.0:

    git clone https://github.com/EMBL-PKU/BASALT.git
    
    cd BASALT
    
    conda create -n basalt_env -c conda-forge -c bioconda \
        python=3.12 \
        megahit metabat2 maxbin2 concoct prodigal semibin \
        bedtools blast bowtie2 diamond checkm2 \
        unicycler spades samtools racon pplacer pilon \
        ncbi-vdb minimap2 miniasm idba hmmer entrez-direct \
        biopython uv --yes
    
    conda activate basalt_env
    
    uv pip install tensorflow torch torchvision tensorboard tensorboardx \
        lightgbm scikit-learn numpy==1.26.4 python-igraph scipy pandas matplotlib \
        cython biolib joblib tqdm requests checkm-genome
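    After the environment is created, a quick check that the main command-line tools are on PATH can catch a failed install early. The following is a sketch only: missing_tools is a hypothetical helper, and the executable names are assumed to match the usual command names of the conda packages listed above.

```shell
#!/usr/bin/env bash
# missing_tools is a hypothetical helper: print each command that is not
# found on PATH, one per line. Returns 0 if all are found, 1 otherwise.
missing_tools() {
    local rc=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Assumption: these executable names correspond to the conda packages above.
missing_tools megahit metabat2 prodigal bowtie2 samtools minimap2 diamond checkm2 \
    || echo "The tools listed above were not found; check the conda environment." >&2
```

    Run it with basalt_env activated; no output means every listed tool was found.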
    

    Download BASALT Deep Learning Model Weights:

     # Please change the download path to suit your environment
     
     python BASALT_models_download.py --path "my_model_folder"
    

    Download the BASALT script files and make them executable:

    chmod +x install.sh
    
    bash install.sh
    
    chmod +x /path/to/basalt/bin/*
    

    Set environment variables by adding the following lines to your ~/.bashrc file:

    nano ~/.bashrc
    
    export CHECKM2DB=/path/to/checkm2db/CheckM2_database/uniref100.KO.1.dmnd
    export CHECKM_DATA_PATH=/path/to/checkmdb
    export BASALT_WEIGHT=/path/to/BASALT
    
    source ~/.bashrc
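    Once ~/.bashrc has been reloaded, it can help to confirm that each variable is set and points to something that exists. This is a sketch under stated assumptions: check_path_var is a hypothetical helper, and it tests only that the path exists, not that the contents are valid.

```shell
#!/usr/bin/env bash
# check_path_var is a hypothetical helper: verify that the variable named by
# $1 is set and that its value is an existing file or directory.
check_path_var() {
    local name="$1"
    local value="${!name:-}"   # bash indirect expansion: value of the named variable
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -e "$value" ]; then
        echo "$name points to a missing path: $value"
        return 1
    fi
    echo "$name OK: $value"
}

# Report on the three variables exported above without aborting on failure.
for var in CHECKM2DB CHECKM_DATA_PATH BASALT_WEIGHT; do
    check_path_var "$var" || true
done
```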
    

    The Google Drive link below provides the essential files for checkm_db and checkm2_db, as well as the newest Singularity image.

    https://drive.google.com/drive/folders/1d0e_2FpYRBAZLwKXl8fA-yDK4b5PBA_E?usp=sharing
    

✏️ Citation

If you use this software in your research, please cite our paper:

Z Qiu, L Yuan, C Lian, B Lin, J Chen, R Mu, X Qiao, L Zhang, Z Xu, L Fan, Y Zhang, S Wang, J Li, H Cao, B Li, B Chen, C Song, Y Liu, L Shi, Y Tian, J Ni, T Zhang, J Zhou, W Zhuang, K Yu. BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis. Nat. Commun. 2024, 15, 2179. https://doi.org/10.1038/s41467-024-46539-7

@article{qiu2024basalt,
  title={BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis},
  author={Qiu, Zhiguang and Yuan, Li and Lian, Chun-Ang and Lin, Bin and Chen, Jie and Mu, Rong and Qiao, Xuejiao and Zhang, Liyu and Xu, Zheng and Fan, Lu and others},
  journal={Nature communications},
  volume={15},
  number={1},
  pages={2179},
  year={2024},
  publisher={Nature Publishing Group UK London}
}

📖 References

  1. Qiu, Z. et al. BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis. Nature Communications 15, 2179 (2024).
  2. Sieber, C.M. et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nature microbiology 3, 836-843 (2018).
  3. Uritskiy, G.V., DiRuggiero, J. & Taylor, J. MetaWRAP—a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 1-13 (2018).
  4. Olm, M.R., Brown, C.T., Brooks, B. & Banfield, J.F. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. The ISME journal 11, 2864-2868 (2017).
  5. Xue, W., Liu, Z., Zhang, Y. et al. LorBin: efficient binning of long-read metagenomes by multiscale adaptive clustering and evaluation. Nat Commun 16, 9353 (2025).