Inference Endpoint / Inference API

#100
by Abhi0401 - opened

Hi All,

I am new to Hugging Face, so I was wondering: is it possible to use multiple instances of the Inference API with the same access token, all running at the same time? If yes, will there be any extra latency for the parallel requests? If no, what is a possible solution to achieve this?

An API access token is typically issued to a consumer/user of the endpoint so they can authenticate on the fly and gain access to the resources or data it serves. Tokens can be issued at whatever level you wish: one token per user, or a single global token for a web service/app — that's up to you. And when you call an endpoint with an access token, it's NOT checking with all the other endpoints in existence and comparing tokens to see if they match; it simply cares whether the token is valid or not. If it's valid, the API works and responds ... if it's not, you get an error code. Pretty simple.
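So to the original question: parallel requests with one shared token should just work. Here's a minimal, untested sketch of firing several concurrent requests at the serverless Inference API with a single token — the model name, prompts, and `HF_TOKEN` env var are placeholders, not anything official:

```python
import os
from concurrent.futures import ThreadPoolExecutor

import requests

API_URL = "https://api-inference.huggingface.co/models/gpt2"  # placeholder model
HEADERS = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}  # same token for every request

def query(prompt: str) -> dict:
    # Each request is authenticated on its own; the server only checks
    # that the bearer token is valid, not who else is using it right now.
    resp = requests.post(API_URL, headers=HEADERS, json={"inputs": prompt})
    resp.raise_for_status()
    return resp.json()

# Four parallel calls, one shared token.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(query, ["Hello", "Bonjour", "Hola", "Ciao"]))
```

Any latency you see under load would come from rate limits or the endpoint's own capacity, not from the token being shared.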

No magic or voodoo going on to prevent you from reusing tokens. But it's "good practice" to limit what can read/access tokens/secrets and to split your apps/scripts into sensible groups (whatever makes sense in your project or org/team), each with its own token(s) ... that way you can enable/disable certain things independently and swap out or refresh keys. For example, say you had one web app that does NLP/text operations, one that generates/refines images, and one that does text-to-speech ... it'd be wise to at least give each app its own token, so if you need to perform maintenance or deal with security or other issues, each part of your little ecosystem and its APIs can be managed independently ... see the sketch below.
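Something like this, as a rough illustration of the "one token per app" idea — the env var names are made up, and the tokens themselves would each be created separately on the Hub:

```python
import os

# One token per app, each in its own environment variable, so any one of
# them can be rotated or revoked on the Hub without touching the others.
TOKENS = {
    "nlp_app": os.environ["HF_TOKEN_NLP"],
    "image_app": os.environ["HF_TOKEN_IMAGES"],
    "tts_app": os.environ["HF_TOKEN_TTS"],
}

def headers_for(app: str) -> dict:
    # e.g. revoking HF_TOKEN_IMAGES disables only the image app.
    return {"Authorization": f"Bearer {TOKENS[app]}"}
```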

I'd really like to know where all the parameters are listed for each type of endpoint. I set up an experimental SDXL image-generation endpoint, and its results look really awful ... it seems like the guidance scale or the number of steps/iterations is way too low (i.e., it never really finishes "diffusing" the image), or maybe it's not using the "refiner" model properly ... either way, it looks super 💩, and I cannot, for the life of me, find any straightforward list or documentation of the options/parameters for such endpoints and how I can control SDXL's behavior ...
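For what it's worth, the knobs you're describing (steps, guidance) can be passed per-request. Here's a hedged sketch using `huggingface_hub`'s `InferenceClient`, assuming the endpoint runs the standard text-to-image task — the model id, token, and the specific values are placeholders to tune, not gospel:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="stabilityai/stable-diffusion-xl-base-1.0",  # or your endpoint's URL
    token="hf_...",  # your access token
)

image = client.text_to_image(
    "an astronaut riding a horse",
    num_inference_steps=40,   # too few steps -> half-"diffused" mush
    guidance_scale=7.5,       # typical SDXL range is roughly 5-9
    negative_prompt="blurry, low quality",
)
image.save("out.png")  # returns a PIL image
```

Whether the base/refiner split is applied depends on what the endpoint's handler actually runs, so that part may need a custom handler rather than a request parameter.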
