Asankhaya Sharma PRO

codelion

AI & ML interests

AI/ML, Dev Tools and Application Security

Organizations

codelion's activity

replied to their post 5 days ago
posted an update 5 days ago
After yesterday's announcements, I got a chance to try the new gemini-1.5-flash model from @goog1e. It is almost as good as gpt-4o on the StaticAnalysisEval benchmark (patched-codes/static-analysis-eval). It is also a bit faster than gpt-4o and much cheaper.

I did run into a recitation flag with one example in the dataset: the API refused to fix the vulnerability and flagged the input as using copyrighted content. You cannot turn this off even through the safety filter settings, and it seems to be an existing bug: https://issuetracker.google.com/issues/331677495
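
For anyone running their own evaluation, here is a minimal sketch of how a harness might detect such a refusal and skip the example instead of scoring an empty fix. It assumes the response has already been parsed into a dict shaped like the Gemini REST response (a `candidates` list with `finishReason` and `content.parts`). The helper name `extract_fix` is hypothetical and is not part of patchwork or the benchmark.

```python
from typing import Optional


def extract_fix(response: dict) -> Optional[str]:
    """Return the generated text, or None if there is nothing usable.

    Assumption: `response` is a parsed JSON dict with a `candidates` list,
    where each candidate may carry `finishReason` and `content.parts`.
    """
    candidates = response.get("candidates", [])
    if not candidates:
        return None

    candidate = candidates[0]
    # A recitation flag means the API declined to return the content.
    if candidate.get("finishReason") == "RECITATION":
        return None

    parts = candidate.get("content", {}).get("parts", [])
    text = "".join(part.get("text", "") for part in parts)
    return text or None


# Usage: a flagged candidate yields None, so the caller can skip or retry.
flagged = {"candidates": [{"finishReason": "RECITATION"}]}
print(extract_fix(flagged))
```

Returning None rather than raising keeps the evaluation loop simple: the caller can count skipped examples separately from failed fixes.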

But overall you get gpt-4o-level performance for 7% of the price, so we are thinking of making it the default in patchwork (https://github.com/patched-codes/patchwork). You can use the google_api_key and model options to choose gemini-1.5-flash-latest when running patchwork.
replied to their post 6 days ago

At the moment we do not have any multimodal examples in the benchmark. The focus has been on vulnerability remediation, and I cannot think of a way to use multimodality in coding-related tasks. Do you have any ideas on how it could be exploited for something like coding?

posted an update 7 days ago
The new gpt-4o model seems to be a very good coder. OpenAI reported a 90+ score on openai_humaneval.

We tried the new model on our patched-codes/static-analysis-eval which evaluates the model on vulnerability remediation. gpt-4o has reclaimed the top spot on our leaderboard (from meta-llama/Meta-Llama-3-70B-Instruct).

You can now use the new model with our open-source framework PatchWork - https://github.com/patched-codes/patchwork by passing model=gpt-4o on the CLI.
replied to their post 26 days ago
replied to Jaward's post 26 days ago

Great, thanks! I would love to see the kind of output it produces directly. We have been trying to automate agentic workflows using an open source framework called patchwork - https://github.com/patched-codes/patchwork

It is more deterministic, and we are focusing only on specific workflows, so we would love to compare it with something like Devin.

posted an update 26 days ago
Happy to announce patchwork, an open source framework to turbocharge DevOps - https://github.com/patched-codes/patchwork

You can use it to build patchflows - workflows that use LLMs for software development tasks like bug fixing, pull request review, library migration and documentation.

Supports any LLM of your choice including our own MoE model - patched-codes/patched-mix-4x7B

Give it a try!
replied to Jaward's post 26 days ago

Can you share the apps that it created?

posted an update 29 days ago
replied to WizardLM's post about 1 month ago

The weights seem to have been taken down?

posted an update about 1 month ago
We just released a new MoE model (meraGPT/mera-mix-4x7B) that is half as large as Mixtral-8x7B while still being competitive with it across different benchmarks. mera-mix-4x7B achieves 76.37 on the open LLM eval.

You can check mera-mix-4x7B out on HF here - meraGPT/mera-mix-4x7B