
Tansu Turkoglu

tansutt

AI & ML interests

None yet

Recent Activity

liked a model 4 months ago
RunDiffusion/Juggernaut-XL-v9
updated a dataset 4 months ago
tansutt/MedQA-USMLE-4-options-hf
updated a model 4 months ago
tansutt/medqa-usmle-mistral-7B-Instruct-v0.3

Organizations

MLX Community

tansutt's activity

reacted to mrfakename's post with 👀 11 months ago
Mistral AI recently released a new Mixtral model. It is another Mixture-of-Experts model with 8 experts of 22B parameters each. Running it requires over 200GB of VRAM in float16, or over 70GB in int4; even so, individuals have successfully fine-tuned it on Apple Silicon laptops using the MLX framework. It features a 64K context window, twice the 32K of their previous models.
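As a rough back-of-the-envelope check on those VRAM figures (a sketch only; the ~141B total parameter count is an assumption, since the eight 22B experts share attention layers rather than summing to 176B):

```python
# Rough estimate of memory needed just to hold the model weights.
# Assumes ~141B total parameters for Mixtral 8x22B (an assumption,
# not an official figure) and ignores activation/KV-cache overhead.
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Return approximate weight memory in GB."""
    return params_billion * bytes_per_param

print(weight_memory_gb(141, 2.0))  # float16: 2 bytes/param -> 282.0 GB
print(weight_memory_gb(141, 0.5))  # int4: 0.5 bytes/param -> 70.5 GB
```

Both estimates line up with the "over 200GB in float16" and "over 70GB in int4" figures in the post; real usage is higher once activations and the KV cache are included.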

The model was released via torrent, a distribution method Mistral has often used recently. The license has not been confirmed yet, but a moderator on their Discord server suggested yesterday that it is Apache 2.0.

Sources:
https://twitter.com/_philschmid/status/1778051363554934874
https://twitter.com/reach_vb/status/1777946948617605384