---
license: apache-2.0
datasets:
  - google-research-datasets/paws
language:
  - en
metrics:
  - accuracy
base_model:
  - HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---

# Introduction

## Running SLMs in web browsers

This repository is part of a playbook of experiments on fine-tuning small language models with LoRA, exporting them to ONNX, and running them locally on ONNX-compatible runtimes such as JavaScript (Node.js) and WASM (in the browser).
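The export step in that pipeline can be sketched with Hugging Face Optimum. This is a hypothetical outline, not the exact commands used for this repository, and it assumes the LoRA adapters have already been merged into the base model:

```shell
# Export the (merged) model to ONNX with Hugging Face Optimum.
# Assumes: pip install "optimum[exporters]" onnx
optimum-cli export onnx \
  --model HuggingFaceTB/SmolLM2-135M-Instruct \
  --task text-generation-with-past \
  smollm2-onnx/
```

The resulting `smollm2-onnx/` directory can then be loaded by onnxruntime-node on the server or onnxruntime-web (WASM) in the browser.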

# Before you start

## To run the Node.js example (Node.js + onnxruntime, server side)

- Simply run `node app.js`. This is what you should see:

*(Screenshots: Node.js application paraphrasing screen; Node.js runtime memory usage.)*
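The generation loop that such an app runs can be sketched as below. The ONNX session is stubbed with a toy four-token "model" so the sketch is self-contained and runnable; in the real `app.js` the `runModel` call would instead invoke `ort.InferenceSession.run()` from onnxruntime-node and read the logits for the last position:

```javascript
// Minimal greedy-decoding sketch. `runModel` is a stand-in for an
// actual onnxruntime inference call; the vocabulary, token ids, and
// EOS id below are hypothetical.
const EOS_ID = 3;

function runModel(tokenIds) {
  // Stub: fake "logits" over a 4-token vocab, favoring (last + 1) % 4.
  const next = (tokenIds[tokenIds.length - 1] + 1) % 4;
  return [0, 1, 2, 3].map((id) => (id === next ? 1.0 : 0.0));
}

function argmax(arr) {
  let best = 0;
  for (let i = 1; i < arr.length; i++) if (arr[i] > arr[best]) best = i;
  return best;
}

function generate(promptIds, maxNewTokens = 8) {
  const ids = [...promptIds];
  for (let i = 0; i < maxNewTokens; i++) {
    const nextId = argmax(runModel(ids)); // pick the most likely token
    ids.push(nextId);
    if (nextId === EOS_ID) break;         // stop at end-of-sequence
  }
  return ids;
}

console.log(generate([0, 1])); // → [ 0, 1, 2, 3 ]
```

The same loop works unchanged in the browser build; only the session creation (onnxruntime-web with the WASM execution provider) differs.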

## To run the web-browser demo (WASM-based in-browser inference)

- Simply open `web.html` from a local server (for example, http://localhost:3000/web.html).

This is what you should see:

*(Screenshots: browser memory usage while running the ONNX model via WASM.)*

# Citation

```bibtex
@misc{allal2024SmolLM,
      title={SmolLM - blazingly fast and remarkably powerful},
      author={Loubna Ben Allal and Anton Lozhkov and Elie Bakouch and Leandro von Werra and Thomas Wolf},
      year={2024},
}
```