Complete Overhaul (without conflict)

#3
  • increased context length
  • increased usable time for people (duration was 120 sec, now lowered to 20 sec, so each request costs users 100 sec less GPU quota; see the sketch after this list)
  • changed theme
  • added 4k model
  • made it one place for all Phi-3 Medium models
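
For context, a minimal sketch of what the duration change could look like on a ZeroGPU Space, assuming the app uses the `spaces` decorator (the function name and body here are placeholders, not the Space's actual code):

```python
import spaces  # ZeroGPU helper package available on Hugging Face Spaces

# Hypothetical handler; lowering the reserved slot from 120 s to 20 s
# means each request deducts far less from a user's GPU quota.
@spaces.GPU(duration=20)  # previously duration=120
def generate(prompt: str) -> str:
    # Model inference runs here while the GPU slot is held.
    ...
```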
Walmart-the-bag changed pull request status to merged

Thank you πŸ’«

If the duration is at 20, doesn't that mean they only have 20 seconds of GPU time?

Duration is the minimum GPU time required and also the maximum generation time.

RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":830, please report a bug to PyTorch.

Each generation is getting this.
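
One commonly reported trigger for this NVML assert on ZeroGPU Spaces is initializing CUDA in the main process instead of inside the GPU-decorated function. This is a sketch of that mitigation under that assumption, not a confirmed diagnosis of this Space's bug; the model ID and generation settings are illustrative:

```python
import spaces
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "microsoft/Phi-3-medium-4k-instruct"  # illustrative checkpoint

# Load on CPU at startup; touching CUDA outside the decorated
# function is a frequently reported cause of this assert.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

@spaces.GPU(duration=20)
def generate(prompt: str) -> str:
    model.to("cuda")  # move to the GPU only inside the allocated slot
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    output = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```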

It's taking too long to load.
(screenshot attached)

Walmart-the-bag pinned discussion
Walmart-the-bag unpinned discussion

And if you could make the chatbox bigger, it'd be amazing. (Open a new PR.)
(screenshot attached)
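
If it helps, `gr.Chatbot` takes a pixel `height`, so the bigger chatbox is likely a one-line change. A minimal sketch, where the value 600 is just a guess:

```python
import gradio as gr

with gr.Blocks() as demo:
    # A larger height makes the chat area visibly taller.
    chatbot = gr.Chatbot(height=600)  # hypothetical value; tune to taste

demo.launch()
```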

I have saved your code and swapped the old version back in until it's fixed.

If you don't have it, tell me; I saved it.

I duplicated it, and I'm solving the error and also making it faster using FlashAttention-2 (sketch below).
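
A minimal sketch of what enabling FlashAttention-2 in transformers usually looks like, assuming the flash-attn package is installed and the GPU is Ampere or newer; the checkpoint name is one of the Phi-3 Medium models mentioned above, not necessarily the one this Space uses:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-medium-4k-instruct",   # illustrative Phi-3 Medium checkpoint
    torch_dtype=torch.bfloat16,             # flash-attn requires fp16/bf16
    attn_implementation="flash_attention_2",
)
```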

Alright, when it's ready you can open a new PR and I'll merge :)

(screenshot attached)

Is this an OK size?

Amazing.
