Grounded Bias Eval

community

AI & ML interests

None defined yet.

Recent Activity

grounded-bias-eval's activity

shanchen

authored a paper about 2 months ago

ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

Paper • 2411.06469 • Published Nov 10, 2024 • 17

daniellebitt

authored a paper about 2 months ago

ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

Paper • 2411.06469 • Published Nov 10, 2024 • 17

shanchen

authored 2 papers 3 months ago

Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation

Paper • 2409.20385 • Published Sep 30, 2024

WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation

Paper • 2410.12722 • Published Oct 16, 2024 • 5

daniellebitt

authored a paper 7 months ago

Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks

Paper • 2406.12066 • Published Jun 17, 2024 • 8

gallifantjack

authored a paper 7 months ago

Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks

Paper • 2406.12066 • Published Jun 17, 2024 • 8

shanchen

authored 3 papers 7 months ago

Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks

Paper • 2406.12066 • Published Jun 17, 2024 • 8

Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias

Paper • 2405.05506 • Published May 9, 2024 • 1

Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly

Paper • 2310.12300 • Published Oct 18, 2023 • 1

stellaathena

authored a paper 7 months ago

Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?

Paper • 2406.04391 • Published Jun 6, 2024 • 7

stellaathena

authored a paper 10 months ago

On the Societal Impact of Open Foundation Models

Paper • 2403.07918 • Published Feb 27, 2024 • 16

stellaathena

authored 9 papers 11 months ago

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources

Paper • 2201.10066 • Published Jan 25, 2022

What Language Model to Train if You Have One Million GPU Hours?

Paper • 2210.15424 • Published Oct 27, 2022 • 2

Recasting Self-Attention with Holographic Reduced Representations

Paper • 2305.19534 • Published May 31, 2023 • 2

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Paper • 2206.15076 • Published Jun 30, 2022 • 3

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration

Paper • 2306.01481 • Published Jun 2, 2023 • 1

The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs

Paper • 2210.14986 • Published Oct 26, 2022 • 5

Team members 7
