Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization Paper • 2404.03605 • Published Apr 4 • 1
Granite Code Models: A Family of Open Foundation Models for Code Intelligence Paper • 2405.04324 • Published 3 days ago • 9
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent 18 days ago • 67
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming Paper • 2404.08676 • Published Apr 6 • 2
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 10 items • Updated 2 days ago • 101
Granite Code Models Collection Granite code models trained by IBM. • 2 items • Updated 15 days ago • 1
JetMoE: Reaching Llama2 Performance with 0.1M Dollars Paper • 2404.07413 • Published 29 days ago • 32
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models Paper • 2404.05567 • Published Apr 8 • 10
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback Paper • 2402.02479 • Published Feb 4 • 2
Article Saving Memory Using Padding-Free Transformer Layers during Finetuning By mayank-mishra • Mar 9 • 4
Article Aurora-M: The First Open Source Biden-Harris Executive Order Red teamed Multilingual Language Model By mayank-mishra • Apr 2 • 4
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30 • 39
Teaching Large Language Models to Reason with Reinforcement Learning Paper • 2403.04642 • Published Mar 7 • 42
Aurora-M models Collection Aurora-M models (base, biden-harris redteams and instruct) • 5 items • Updated 4 days ago • 15
Slim-Pajama 1B models Collection A set of 1B models trained on 629B tokens of the Slim Pajama dataset • 4 items • Updated Mar 11 • 1
Variational Inference with Latent Space Quantization for Adversarial Resilience Paper • 1903.09940 • Published Mar 24, 2019 • 1
Adversarial Approximate Inference for Speech to Electroglottograph Conversion Paper • 1903.12248 • Published Mar 28, 2019 • 1
Variational Learning for Unsupervised Knowledge Grounded Dialogs Paper • 2112.00653 • Published Nov 23, 2021 • 1
Joint Reasoning on Hybrid-knowledge sources for Task-Oriented Dialog Paper • 2210.07295 • Published Oct 13, 2022 • 1
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 23
ModuleFormer: Learning Modular Large Language Models From Uncurated Data Paper • 2306.04640 • Published Jun 7, 2023 • 7