OpenOrca

community

alignment_lab


AI & ML interests

Superintelligence Alignment



Open-Orca's activity

louisbrulenaudet

posted an update 24 days ago

Post

904

I’ve just released logfire-callback on PyPI, designed to facilitate monitoring of Hugging Face Transformer training loops using Pydantic Logfire 🤗

The callback will automatically log training start with configuration parameters, periodic metrics and training completion ⏱️

Install the package using pip:

pip install logfire-callback

First, ensure you have a Logfire API token and set it as an environment variable:

export LOGFIRE_TOKEN=your_logfire_token

Then use the callback in your training code:

from transformers import Trainer, TrainingArguments
from logfire_callback import LogfireCallback

# Initialize your model, dataset, etc.

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    # ... other training arguments
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    callbacks=[LogfireCallback()]  # Add the Logfire callback here
)

trainer.train()

If you have any feedback, please reach out at @louisbrulenaudet

not-lain

posted an update about 1 month ago

Post

2014

🚀AraClip is now fully integrated with Hugging Face 🤗

AraClip is a specialized CLIP model that was created by @pain and optimized for Arabic text-image retrieval tasks🔥

🔗 Try it out 🔗
🤖 model: Arabic-Clip/araclip
🧩 Gradio demo: Arabic-Clip/Araclip-Simplified
🌐 website: https://arabic-clip.github.io/Arabic-CLIP/

2 replies

Alignment-Lab-AI

updated a dataset about 2 months ago

Open-Orca/OpenOrca

Viewer • Updated Feb 19 • 2.94M • 9.88k • 1.39k

louisbrulenaudet

posted an update about 2 months ago

Post

3277

I am pleased to introduce my first project built upon Hugging Face’s smolagents framework, integrated with Alpaca for financial market analysis automation 🦙🤗

The project implements technical indicators such as the Relative Strength Index (RSI) and Bollinger Bands to provide momentum and volatility analysis. Market data is retrieved through the Alpaca API, enabling access to historical price information across various timeframes.
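
As a rough illustration of the two indicators named above, the following stdlib-only sketch computes a simple-average RSI and Bollinger Bands from a list of closing prices. It is not taken from the project: the function names, the 14- and 20-period defaults, and the use of a plain average instead of Wilder's smoothing are assumptions made for this example.

```python
from statistics import mean, pstdev


def rsi(closes, period=14):
    """Simple-average RSI over the last `period` price changes.

    Hypothetical helper for illustration; many charting tools use
    Wilder's smoothing instead, which gives slightly different values.
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    # Pair each of the last `period` closes with the close before it.
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = mean([max(c, 0.0) for c in changes])
    avg_loss = mean([max(-c, 0.0) for c in changes])
    if avg_loss == 0:
        # No losses in the window: RSI is conventionally pinned at 100.
        return 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (lower, middle, upper) bands for the last `window` closes."""
    if len(closes) < window:
        raise ValueError("need at least `window` closing prices")
    recent = closes[-window:]
    middle = mean(recent)                # simple moving average
    spread = num_std * pstdev(recent)    # population standard deviation
    return middle - spread, middle, middle + spread
```

RSI summarizes momentum on a 0 to 100 scale, while the band width tracks recent volatility, which is why the two are often read together.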

AI-powered insights are generated through Hugging Face’s inference API, which analyzes market trends in natural language. DuckDuckGo search integration supplies financial news for real-time sentiment analysis 🦆

Link to the GitHub project: https://github.com/louisbrulenaudet/agentic-market-tool

arielnlee

authored 2 papers 2 months ago

From Text to Pose to Image: Improving Diffusion Model Control and Quality

Paper • 2411.12872 • Published Nov 19, 2024

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 9

not-lain

posted an update 3 months ago

Post

4377

I have just released a new blog post about KV caching and its role in speeding up inference 🚀
🔗 https://huggingface.co/blog/not-lain/kv-caching/
some takeaways :

4 replies

not-lain

posted an update 3 months ago

Post

1695

we now have more than 2000 public AI models using ModelHubMixin🤗

not-lain

posted an update 3 months ago

Post

4058

Published a new blogpost 📖
In this blog post I walk through the transformer architecture, emphasizing how tensor shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :

jph00

authored 2 papers 4 months ago

The Matrix Calculus You Need For Deep Learning

Paper • 1802.01528 • Published Feb 5, 2018 • 2

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 145

flavoredquark

authored 4 papers 5 months ago

Dockerface: an Easy to Install and Use Faster R-CNN Face Detector in a Docker Container

Paper • 1708.04370 • Published Aug 15, 2017 • 1

Fine-Grained Head Pose Estimation Without Keypoints

Paper • 1710.00925 • Published Oct 2, 2017

Subject-driven Text-to-Image Generation via Apprenticeship Learning

Paper • 2304.00186 • Published Apr 1, 2023

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 72

not-lain

posted an update 5 months ago

Post

2365

ever wondered how you can make an API call to a visual-question-answering model without sending an image URL? 👀

you can do that by converting your local image to base64 and sending it to the API.

recently I made some changes to my library "loadimg" that make converting images to base64 a breeze.
🔗 https://github.com/not-lain/loadimg

API request example 🛠️:

from loadimg import load_img
from huggingface_hub import InferenceClient

# imgPath_url_pillow_or_numpy is a placeholder: pass a local file path, a URL,
# a Pillow image, or a NumPy array
my_b64_img = load_img(imgPath_url_pillow_or_numpy, output_type="base64")

client = InferenceClient(api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe this image in one sentence."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": my_b64_img  # base64 allows using images without uploading them to the web
                }
            }
        ]
    }
]

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=messages,
    max_tokens=500,
    stream=True
)

for chunk in stream:
    # delta.content can be None on some chunks, so fall back to an empty string
    print(chunk.choices[0].delta.content or "", end="")
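
For readers curious about what the base64 step amounts to, here is a stdlib-only sketch that turns a local image file into a base64 data URL. It is not loadimg's implementation: the helper name `image_to_data_url` is hypothetical, and the assumption that a `data:<mime>;base64,...` string is what belongs in the `url` field is mine, so check it against the library's actual output.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        mime_type = "application/octet-stream"
    # Read the raw bytes and encode them as ASCII-safe base64 text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Because the image travels inside the request body, nothing has to be hosted at a public URL first, at the cost of a payload roughly one third larger than the original file.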

louisbrulenaudet

posted an update 5 months ago

Post

2093

I’ve published a new dataset to simplify model merging 🤗

This dataset facilitates the search for compatible architectures for model merging with @arcee_ai’s mergekit, streamlining the automation of high-performance merge searches 📖

Dataset: louisbrulenaudet/mergekit-configs

1 reply

Alignment-Lab-AI

posted an update 6 months ago

Post

1381

remember boys and girls, always keep all your data, it's never a waste of time!

louisbrulenaudet

posted an update 6 months ago

Post

1346

Introducing Lemone-router, a series of classification models designed to produce an optimal multi-agent system for different branches of tax law.

The models were trained on a base of 49k lines comprising synthetic questions generated by GPT-4 Turbo and Llama 3.1 70B, further refined through evol-instruction tuning and manual curation, together with authority documents. They are based on an 8-category decomposition of the classification scheme derived from the Bulletin officiel des finances publiques - impôts:

label2id = {
    "Bénéfices professionnels": 0,
    "Contrôle et contentieux": 1,
    "Dispositifs transversaux": 2,
    "Fiscalité des entreprises": 3,
    "Patrimoine et enregistrement": 4,
    "Revenus particuliers": 5,
    "Revenus patrimoniaux": 6,
    "Taxes sur la consommation": 7
}
	
id2label = {
    0: "Bénéfices professionnels",
    1: "Contrôle et contentieux",
    2: "Dispositifs transversaux",
    3: "Fiscalité des entreprises",
    4: "Patrimoine et enregistrement",
    5: "Revenus particuliers",
    6: "Revenus patrimoniaux",
    7: "Taxes sur la consommation"
}

It achieves the following results on the evaluation set:
- Loss: 0.4734
- Accuracy: 0.9191

Link to the collection: louisbrulenaudet/lemone-router-671cce21d6410f3570514762
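
The two mappings in this post are inverses of each other, so one can be derived from the other. The sketch below does that and adds a hypothetical `route` helper that picks the category with the highest score; the helper and the idea of passing it a plain list of eight scores are assumptions for illustration, not part of the released models.

```python
label2id = {
    "Bénéfices professionnels": 0,
    "Contrôle et contentieux": 1,
    "Dispositifs transversaux": 2,
    "Fiscalité des entreprises": 3,
    "Patrimoine et enregistrement": 4,
    "Revenus particuliers": 5,
    "Revenus patrimoniaux": 6,
    "Taxes sur la consommation": 7,
}

# Derive the inverse mapping instead of maintaining it by hand.
id2label = {idx: label for label, idx in label2id.items()}


def route(scores):
    """Return the category label with the highest score (hypothetical helper)."""
    if len(scores) != len(label2id):
        raise ValueError("expected one score per category")
    # Index of the largest score, i.e. an argmax over the class scores.
    best_id = max(range(len(scores)), key=scores.__getitem__)
    return id2label[best_id]
```

In a multi-agent setup, the returned label could then select which specialist agent receives the question.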

flavoredquark

authored a paper 6 months ago

Unbounded: A Generative Infinite Game of Character Life Simulation

Paper • 2410.18975 • Published Oct 24, 2024 • 38


Team members 41
