atayloraerospace PRO

Taylor658

AI & ML interests

Computer Vision 🔭 | Multimodal Gen AI 🤖 | AI in Healthcare 🩺 | AI in Aerospace 🚀

Organizations

Taylor658's activity

posted an update about 5 hours ago
Cohere for AI, Argilla, and Hugging Face are collaborating on an Open Science Project to enhance multilingual model evaluations. The project focuses on the widely-used MMLU dataset, which spans 57 subjects like mathematics, computer science, and law. However, existing translations often miss linguistic and cultural nuances, thus embedding biases. 🤔

To address this, they have annotated a subset of the MMLU test set and are inviting people around the world to review prompts, highlighting cultural specifics and required knowledge. They note that these insights will help shape future multilingual model evaluations, making them more inclusive and accurate. 🗺️ 📝 🙌

▶️ To get started, go to: CohereForAI/MMLU-evaluation

🌍 They also have an Aya Discord server for collaboration with other participants: https://discord.gg/9gVhdfnQMN
posted an update 1 day ago
posted an update 5 days ago
Researchers from Anthropic managed to extract millions of interpretable features from their Claude 3 Sonnet model, making it easier to identify and understand specific behaviors and patterns within the model.

This advance in understanding closed-source AI models could make them safer by showing how specific features relate to concepts and affect the model's behavior.

Read the Article: https://www.anthropic.com/research/mapping-mind-language-model?utm_source=substack&utm_medium=email

Read The Paper: https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html
replied to their post 5 days ago
posted an update 6 days ago
The Google DeepMind team just released a new technical report on Gemini 1.5 Pro and Gemini 1.5 Flash.

In addition to architecture, benchmark, and evaluation details, the report also provides a few real-world use cases for the models, such as professional task optimization and translation of lesser-known languages.

You can check out the full report here: https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf?utm_source=substack&utm_medium=email

posted an update 8 days ago
A new paper, "Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning," was just published. The approach improves VLMs' decision-making abilities in goal-directed tasks.

This is accomplished with chain-of-thought (CoT) reasoning, which significantly enhances performance. Removing CoT reasoning drops effectiveness, highlighting its crucial role.

Check out the paper here: https://arxiv.org/abs/2405.10292
replied to codelion's post 14 days ago

Yes, some potential multi-input or multi-source code-related tasks could be executing shell commands directly from a script, deserializing untrusted data, or parsing XML data, for example. CWE-611 and CWE-502 might already cover a couple of these coding scenarios, though...
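The three scenarios named in this reply can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, nothing here is taken from "static-analysis-eval", and the XML check is a crude guard rather than a complete defense against CWE-611.

```python
import json
import subprocess
import sys
import xml.etree.ElementTree as ET


def run_without_shell(untrusted_arg: str) -> str:
    """Shell commands (CWE-78): pass untrusted text as a single argv entry.

    No shell is involved, so characters like ';' stay literal text.
    The child process here simply echoes its first argument back.
    """
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted_arg],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


def load_untrusted_data(payload: str) -> dict:
    """Deserialization (CWE-502): decode with json instead of pickle.

    json only builds plain data (dicts, lists, strings, numbers).
    """
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


def parse_xml_rejecting_dtd(text: str) -> ET.Element:
    """XML parsing (CWE-611): refuse documents that declare a DTD or entities."""
    if "<!DOCTYPE" in text or "<!ENTITY" in text:
        raise ValueError("DTD and entity declarations are not accepted")
    return ET.fromstring(text)
```

For example, `run_without_shell("hello; echo injected")` returns the string unchanged instead of running a second command, and `parse_xml_rejecting_dtd` raises `ValueError` for any input containing a `<!DOCTYPE` declaration.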

posted an update 14 days ago
replied to codelion's post 15 days ago

Thanks for posting results for gpt-4o so fast!

You will have to post the latest Gemini model results tomorrow after I/O announcements. :-)

Since we are squarely in the age of multimodal models, I am curious whether any of the 76 standard scripts run for vulnerability remediation in "static-analysis-eval" demonstrate multimodal vulnerabilities?

posted an update 17 days ago
Red Hat and IBM have announced InstructLab, an open-source project for LLM contributions. InstructLab offers a model-agnostic approach for the community to contribute "skills" and/or "knowledge" to LLMs via a CLI and tuning backend.

This community-driven approach to GenAI model development is novel, to say the least. It will be interesting to see how effective it is in the long run, especially on models beyond the initial Granite and Merlinite families.

Check out the GitHub here: https://github.com/instructlab
Read the LAB Paper: https://arxiv.org/abs/2403.01081
View Model Builds: https://huggingface.co/instructlab
replied to mattmdjaga's post 19 days ago
posted an update 19 days ago
🤗 The first submissions from the Community Hugging Face Computer Vision Course (https://huggingface.co/learn/computer-vision-course/unit0/welcome/welcome) are being posted on HF Spaces! 🤗

OmAlve/Swin-Transformer-Foods101
Rageshhf/medi-classifier

It is amazing that the first group of students has completed the course and in record time!

Looking forward to seeing more submissions from the course soon.

A nice swag item that students get when they complete the course and make their submission is this cool Hugging Face Certificate of Completion. (It's suitable for framing.) 🤗
👇
posted an update 26 days ago