deprecation warning??

#1
by ctranslate2-4you - opened

Can you clarify whether this is a remnant, or whether you in fact intend to remove ctranslate2 as an "engine" in the future? If so, do you have an idea when?

Before I spend an inordinate amount of time revising my scripts to use ctranslate2 for my RAG application: did you find out that it's not actually faster and/or higher quality than the other options out there? Just curious, because as far as I know ct2 is still superior in a lot of ways. Anyway, here's the source code that prompted my question:

"""
logger.warning(
    "deprecated: ct2 inference is deprecated and will be removed in the future."
)
"""

Hey, would you mind opening that on GitHub as an issue?

I am considering keeping it, as long as it remains low-maintenance.

  • Currently, CTranslate2 delivers only about 1/4 of the performance of my optimizations in torch. This is on my RTX 3060 laptop with CUDA.

CT2:

Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            7997

Document Path:          /embeddings
Document Length:        Variable

Concurrency Level:      10
Time taken for tests:   60.003 seconds
Complete requests:      10
Failed requests:        0
Total transferred:      20780760 bytes
Total body sent:        7196080
HTML transferred:       20779460 bytes
Requests per second:    0.17 [#/sec] (mean)
Time per request:       60002.693 [ms] (mean)
Time per request:       6000.269 [ms] (mean, across all concurrent requests)
Transfer rate:          338.21 [Kbytes/sec] received
                        117.12 kb/s sent
                        455.33 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   0.5      1       2
Processing:  4770 27597 18316.0  29418   55234
Waiting:     4766 27596 18316.4  29414   55233
Total:       4770 27599 18316.0  29419   55235

TORCH:

Document Length:        Variable

Concurrency Level:      10
Time taken for tests:   7.417 seconds
Complete requests:      10
Failed requests:        0
Total transferred:      20781050 bytes
Total body sent:        7196080
HTML transferred:       20779750 bytes
Requests per second:    1.35 [#/sec] (mean)
Time per request:       7417.268 [ms] (mean)
Time per request:       741.727 [ms] (mean, across all concurrent requests)
Transfer rate:          2736.05 [Kbytes/sec] received
                        947.44 kb/s sent
                        3683.49 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   0.2      1       1
Processing:   555 3341 2295.0   3627    6862
Waiting:      553 3339 2295.6   3626    6861
Total:        556 3342 2295.2   3628    6863

Percentage of the requests served within a certain time (ms)
  50%   3628
  66%   4489
  75%   5336
  80%   6088
  90%   6863
  95%   6863
  98%   6863
  99%   6863
 100%   6863 (longest request)
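
As a quick sanity check, the two runs above can be compared directly. This is a minimal sketch that only reuses the "Time taken for tests" and "Requests per second" figures from the ApacheBench output; the dictionary keys and the `speedup` helper are illustrative names, not part of any library. On these figures the gap comes out nearer 8x than 4x, though with only 10 requests per run the numbers are noisy.

```python
# Figures copied from the two ApacheBench runs above (CT2 vs. TORCH).
ct2_run = {"seconds": 60.003, "requests_per_second": 0.17}
torch_run = {"seconds": 7.417, "requests_per_second": 1.35}


def speedup(slow: dict, fast: dict) -> dict:
    """Return how many times faster `fast` is than `slow` on both metrics."""
    return {
        # Total wall time for the same 10 requests.
        "wall_time_ratio": slow["seconds"] / fast["seconds"],
        # Mean throughput as reported by ApacheBench.
        "throughput_ratio": fast["requests_per_second"] / slow["requests_per_second"],
    }


result = speedup(ct2_run, torch_run)
for name, value in result.items():
    print(f"{name}: {value:.2f}x")
```
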
michaelfeil changed discussion status to closed

@ctranslate2-4you If you want, you can remove the deprecation warning from infinity via a pull request, and I'll approve it. I'm always looking for contributors who want to get familiar with the repo.
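
Until such a pull request lands, the warning could also be silenced on the user side with a standard `logging.Filter`. This is only a sketch: the logger name `"infinity_emb"` is an assumption (check the library source for the logger that actually emits the message), and `DropCt2Deprecation` is a hypothetical helper, not something infinity provides.

```python
import logging


class DropCt2Deprecation(logging.Filter):
    """Discard log records that carry the ct2 deprecation message."""

    MARKER = "ct2 inference is deprecated"

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False drops the record; all other records pass through.
        return self.MARKER not in record.getMessage()


# ASSUMPTION: the warning is emitted on a logger with this name.
# A logger-level filter only sees records logged directly on that logger.
logger = logging.getLogger("infinity_emb")
logger.addFilter(DropCt2Deprecation())
```

Attaching the filter to a handler instead of the logger would also catch records that propagate up from child loggers.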
