lhl PRO

leonardlin

https://randomfoo.net/

lhl

AI & ML interests

None yet

Articles

Not Legal Advice on AI Training Data in Japan

8 days ago

• 1

Evaling llm-jp-eval (evals are hard)

15 days ago

• 4

Organizations

Posts 5

Post

1783

Interesting, I've just seen my first HF spam on one of my new model uploads: shisa-ai/shisa-v1-llama3-70b. Someone has an SEO spam page as an HF space attached to the model!?! Wild. Who do I report this to?

Post

1553

For those with an interest in JA language models, this Llama 3 70B test ablation looks like the strongest publicly released, commercially usable, open model currently available. A lot of caveats, I know, but it also matches gpt-3.5-turbo-0125's JA performance, which is worth noting, and it is tuned *exclusively* with the old shisa-v1 dataset (so its chart position will be very short-lived).

shisa-ai/shisa-v1-llama3-70b

augmxnt/ultra-orca-boros-en-ja-v1

Collections 22

spaces 1

Shisa Ablations

models

None public yet

datasets

None public yet

Collections 22

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU

Accelerating LLM Inference with Staged Speculative Decoding

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

QuIP: 2-Bit Quantization of Large Language Models With Guarantees

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
