Training a DBRX-like model

#13 by nguyenthanhdo

Hi, I've seen the team mention that the code used for training DBRX consists of optimized versions of Composer, LLM Foundry, MegaBlocks, and Streaming, but I found those codebases quite challenging to navigate. I want to pretrain an MoE model with a DBRX-like architecture. Could you please guide me on how to do that with the open-sourced versions of those libraries? To make the question concrete, below is a minimal plain-PyTorch sketch (my own illustration, not DBRX code) of the kind of top-k routed MoE feed-forward block I have in mind; DBRX reportedly uses a fine-grained dropless MoE (16 experts, 4 active per token) implemented with MegaBlocks, and the dimensions and gating details here are just placeholders.
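
```python
# Minimal sketch of a top-k routed MoE feed-forward block in plain PyTorch.
# This is an illustration only, not DBRX or MegaBlocks code; dimensions,
# expert count, and the softmax-over-selected-logits gate are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEFeedForward(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, num_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        # Router scores every token against every expert.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, seq, d_model) -> flatten tokens for routing
        tokens = x.reshape(-1, x.shape[-1])
        gate_logits = self.router(tokens)
        # Keep only the top_k experts per token and renormalize their weights.
        weights, expert_idx = torch.topk(gate_logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # Find which tokens (and which of their top_k slots) chose expert e.
            token_ids, slot = torch.where(expert_idx == e)
            if token_ids.numel() == 0:
                continue
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(tokens[token_ids])
        return out.reshape_as(x)


if __name__ == "__main__":
    layer = MoEFeedForward()
    y = layer(torch.randn(2, 8, 512))
    print(y.shape)  # torch.Size([2, 8, 512])
```

I understand an optimized implementation would replace the per-expert Python loop with MegaBlocks' dropless expert kernels and wrap training in Composer/LLM Foundry, which is exactly the part I'd appreciate guidance on.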
