Hoptimizer

bunnycore

https://scifilogic.com/

AI & ML interests

LLM, Text to Image and Video, Home Assistant Model

Organizations

None yet

bunnycore's activity

upvoted a collection 5 days ago

Best LLAMA 3 Models

4 items • Updated 5 days ago • 1

upvoted a collection 8 days ago

Neo-Models

Neo • 9 items • Updated 10 days ago • 14

upvoted a paper about 2 months ago

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18 • 51

upvoted 2 collections about 2 months ago

Coding LLM

2 items • Updated Apr 18 • 1

Slider LORA

7 items • Updated Apr 17 • 1

upvoted an article about 2 months ago

Article

Mergoo: Efficiently Build Your Own MoE LLM

6 days ago • 36

upvoted 3 collections about 2 months ago

Small MOE

3 items • Updated Apr 14 • 1

General Purpose LLM Dataset

7 items • Updated 5 days ago • 1

General Purpose LLM

4 items • Updated 20 days ago • 1

upvoted 2 collections 2 months ago

Small VLM

6 items • Updated 14 days ago • 1

Science LLM

8 items • Updated Apr 27 • 1

upvoted a paper 2 months ago

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4 • 74

upvoted a collection 2 months ago

Recent Mamba Papers

[NB: Notes are from TuringPost] • 3 items • Updated Mar 26 • 8

upvoted 2 papers 3 months ago

Generic 3D Diffusion Adapter Using Controlled Multi-View Editing

Paper • 2403.12032 • Published Mar 18 • 14

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Paper • 2403.05525 • Published Mar 8 • 39

upvoted a collection 3 months ago

Transformers compatible Mamba

This release includes the `mamba` repositories compatible with the `transformers` library • 5 items • Updated Mar 6 • 27

upvoted 2 papers 3 months ago

DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models

Paper • 2403.00818 • Published Feb 26 • 13

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 568

upvoted 2 papers 4 months ago

Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling

Paper • 2402.10211 • Published Feb 15 • 8

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

Paper • 2402.10210 • Published Feb 15 • 28

upvoted 8 papers 5 months ago

LEGO: Language Enhanced Multi-modal Grounding Model

Paper • 2401.06071 • Published Jan 11 • 10

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Paper • 2401.06080 • Published Jan 11 • 23

ReFT: Reasoning with Reinforced Fine-Tuning

Paper • 2401.08967 • Published Jan 17 • 27

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 135

Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 74

Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

Paper • 2401.02994 • Published Jan 4 • 44

Mixtral of Experts

Paper • 2401.04088 • Published Jan 8 • 154

LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Paper • 2401.01325 • Published Jan 2 • 25

upvoted a collection 5 months ago

🔮 Mixture of Experts

MoE done using mergekit and LazyMergekit: https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb#scrollTo=d5mYzDo1q96y • 13 items • Updated 12 days ago • 21

upvoted a paper 5 months ago

LLaMA Pro: Progressive LLaMA with Block Expansion

Paper • 2401.02415 • Published Jan 4 • 50

upvoted a collection 5 months ago

Vision Models (GGUF)

How to use: Download a "mmproj" model file + one or more of the primary model files. • 5 items • Updated Dec 22, 2023 • 34

upvoted 2 papers 5 months ago

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2 • 61

I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models

Paper • 2312.16693 • Published Dec 27, 2023 • 12

upvoted 2 papers 6 months ago

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 26

YUAN 2.0: A Large Language Model with Localized Filtering-based Attention

Paper • 2311.15786 • Published Nov 27, 2023 • 7

upvoted 3 papers 7 months ago

Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster

Paper • 2311.08263 • Published Nov 14, 2023 • 14

Fine-tuning Language Models for Factuality

Paper • 2311.08401 • Published Nov 14, 2023 • 26

Language Models can be Logical Solvers

Paper • 2311.06158 • Published Nov 10, 2023 • 14

upvoted a paper 8 months ago

GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond

Paper • 2309.16583 • Published Sep 28, 2023 • 12

upvoted 3 papers 9 months ago

Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Paper • 2309.10020 • Published Sep 18, 2023 • 39

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 37

GPT Can Solve Mathematical Problems Without a Calculator

Paper • 2309.03241 • Published Sep 6, 2023 • 17

upvoted a paper 10 months ago

SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

Paper • 2308.06873 • Published Aug 14, 2023 • 24