SwallowCode Collection: Rewriting Pre-Training Data Boosts LLM Performance in Math and Code • 66 items • Updated May 7 • 3
SwallowMath Collection: Rewriting Pre-Training Data Boosts LLM Performance in Math and Code • 11 items • Updated May 7 • 3
LLM-jp-3 Pre-trained Models Collection: Pre-trained models in the LLM-jp-3 model series • 10 items • Updated 23 days ago • 6
LLM-jp-3 Fine-tuned Models Collection: Fine-tuned models in the LLM-jp-3 model series • 25 items • Updated 23 days ago • 6
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search Paper • 2503.04412 • Published Mar 6 • 1
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization Paper • 2502.19261 • Published Feb 26 • 7
TinySwallow Collection: Compact Japanese models trained with "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models" • 5 items • Updated Jan 30 • 17
Agent Skill Acquisition for Large Language Models via CycleQD Paper • 2410.14735 • Published Oct 16, 2024 • 2
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs Paper • 2407.03963 • Published Jul 4, 2024 • 19