License

#3
by rooa - opened

Hi! Thank you for the great work. I may have missed it, but I could not find the license for this model/repository.

Would it also follow BigCode Open RAIL-M v1?

BigCode org

The model currently doesn't have a license; we only have the Terms of Use that are part of the gated access. Does that work for you?

Is there any intention to release StarPII under the same Open RAIL-M license as StarCoder? Unfortunately, the current gated terms are too restrictive for our desired model use cases, in that we would like to leverage the model for more than just cleaning datasets.

BigCode org

Could you specify in what ways you plan to leverage the model? (Open RAIL-M is designed for text/code generation models, and StarPII doesn't fall into that category.)

Secrets detection generally, outside the bounds of just dataset cleaning. It could fill the same role as more traditional tooling, like trufflehog (https://github.com/trufflesecurity/trufflehog).

BigCode org

Thanks for the added information!

The restrictive Terms of Use are motivated by the strong dual-use potential of PII detectors: secrets detection in general can easily include malicious uses, such as scanning open repos for security vulnerabilities to exploit. Sharing the model therefore requires us to balance the risks and benefits, hence the restricted uses, since we do not currently have the capacity to work with potential users more closely.

TLDR: we won't be releasing the model under a more permissive license in the near future, but efforts are ongoing on the governance of sensitive models :)
