Bot (inflatebot)

AI & ML interests

"Potentially one of my biggest flaws is that I genuinely think that the science appreciates when you commit to a bit." - Tom ExtractionsAndIre

Organizations

Alfitaria, Allura

inflatebot's activity

New activity in Casual-Autopsy/Llama-3-VNTL-Yollisa-8B about 24 hours ago
New activity in trashpanda-org/QvQ-72B-Preview-Unsee 2 days ago

Put it back :( (#1 opened 3 days ago by inflatebot)
replied to clem's post 3 days ago

Basically this, yeah. I'd love for them to prove me wrong and knock it out of the park; I just have little faith in them. The most they've done over the last couple of years is scalemaxx and publicize techniques we'd already been using for a while.

replied to clem's post 4 days ago

I'm not convinced they aren't about to just give us their scraps. GPT-4.5 was a tire fire and nobody wanted it if they could even afford it.

If the new OpenAI model is good, that'd be awesome, but my hopes are not terribly high.

New activity in huggingface/InferenceSupport 7 days ago
reacted to aifeifei798's post with 😎👀👍 18 days ago
😊 This program removes emojis from a given text. It uses a regular expression (regex) pattern to match emojis and replace them with an empty string. The pattern covers ranges of Unicode characters for various emoji types, such as emoticons, symbols, and flags. This is useful for cleaning up text data before processing, analysis, or any other application where emojis are not desired. 💻
import re

def remove_emojis(text):
    # Define a broader emoji pattern
    emoji_pattern = re.compile(
        "["
        u"\U0001F600-\U0001F64F"  # emoticons
        u"\U0001F300-\U0001F5FF"  # symbols & pictographs
        u"\U0001F680-\U0001F6FF"  # transport & map symbols
        u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
        u"\U00002702-\U000027B0"
        u"\U000024C2-\U0001F251"
        u"\U0001F900-\U0001F9FF"  # supplemental symbols and pictographs
        u"\U0001FA00-\U0001FA6F"  # chess symbols and more emojis
        u"\U0001FA70-\U0001FAFF"  # more symbols and pictographs
        u"\U00002600-\U000026FF"  # miscellaneous symbols
        u"\U00002B50-\U00002B59"  # additional symbols
        u"\U0000200D"             # zero width joiner
        u"\U0000200C"             # zero width non-joiner
        u"\U0000FE0F"             # emoji variation selector
        "]+", flags=re.UNICODE
    )
    return emoji_pattern.sub("", text)
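
For illustration, a quick usage example (my own addition; the sample string is hypothetical, not from the original post):

sample = "Great work team! 🚀🎉 See you tomorrow 😊"
print(remove_emojis(sample))  # prints "Great work team!  See you tomorrow "
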
New activity in TheDrummer/Gemmasutra-Small-4B-v1 19 days ago

Bad timing. (#1 opened 24 days ago by linkpharm)
reacted to Kseniase's post with 🔥 19 days ago
15 types of attention mechanisms

Attention mechanisms allow models to dynamically focus on specific parts of their input when performing tasks. In our recent article, we discussed Multi-Head Latent Attention (MLA) in detail; now it's time to summarize the other existing types of attention.

Here is a list of 15 types of attention mechanisms used in AI models:

1. Soft attention (Deterministic attention) -> Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473)
Assigns a continuous weight distribution over all parts of the input. It produces a weighted sum of the input using attention weights that sum to 1.

2. Hard attention (Stochastic attention) -> Effective Approaches to Attention-based Neural Machine Translation (1508.04025)
Makes a discrete selection of some part of the input to focus on at each step, rather than attending to everything.

3. Self-attention -> Attention Is All You Need (1706.03762)
Each element in the sequence "looks" at other elements and "decides" how much to borrow from each of them for its new representation. (A NumPy sketch covering items 1 and 3-5 follows this list.)

4. Cross-Attention (Encoder-Decoder attention) -> Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation (2104.08771)
The queries come from one sequence and the keys/values come from another sequence. It allows a model to combine information from two different sources.

5. Multi-Head Attention (MHA) -> Attention Is All You Need (1706.03762)
Multiple attention “heads” are run in parallel. The model computes several attention distributions (heads), each with its own set of learned projections of queries, keys, and values.

6. Multi-Head Latent Attention (MLA) -> DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (2405.04434)
Extends MHA by compressing the keys and values into a shared low-rank latent vector, which shrinks the KV cache and makes inference substantially cheaper.

7. Memory-Based attention -> End-To-End Memory Networks (1503.08895)
Involves an external memory and uses attention to read from and write to this memory.

See other types in the comments 👇
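
To make a few of these concrete, here is a minimal NumPy sketch (an illustration of the generic techniques, not code from the linked papers; all dimensions and weights are toy assumptions). The softmax weighting is soft attention (1); calling the function with the same sequence as queries and keys/values gives self-attention (3), while using two different sequences gives cross-attention (4); the wrapper at the end runs several heads in parallel as in MHA (5).

import numpy as np

def softmax(x, axis=-1):
    # Subtract the max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Soft attention: continuous weights over all keys that sum to 1.
    # Q: (n_q, d_k), K: (n_kv, d_k), V: (n_kv, d_v).
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n_q, n_kv) similarity scores
    weights = softmax(scores, axis=-1)   # each row is a distribution over keys
    return weights @ V                   # weighted sum of the values

def multi_head_attention(x_q, x_kv, W_q, W_k, W_v, W_o, n_heads):
    # Self-attention: x_q and x_kv are the same sequence.
    # Cross-attention: x_q comes from one sequence, x_kv from another.
    d_model = x_q.shape[-1]
    d_head = d_model // n_heads
    Q, K, V = x_q @ W_q, x_kv @ W_k, x_kv @ W_v
    heads = []
    for h in range(n_heads):             # each head attends over its own slice
        s = slice(h * d_head, (h + 1) * d_head)
        heads.append(scaled_dot_product_attention(Q[:, s], K[:, s], V[:, s]))
    return np.concatenate(heads, axis=-1) @ W_o  # concat heads, project out

# Toy usage: a 4-token sequence with model width 8 and 2 heads.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W_q, W_k, W_v, W_o = (rng.normal(size=(8, 8)) * 0.1 for _ in range(4))
out = multi_head_attention(x, x, W_q, W_k, W_v, W_o, n_heads=2)  # self-attention
print(out.shape)  # (4, 8)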