Hi! Introduce yourself! 👋

#2
by fdaudens HF staff - opened
Journalists on Hugging Face org

We're so excited to see so much interest in the Journalists on Hugging Face community! To start, we'd love to learn a bit more from you: what you'd like to see more of on this page, what aspects of AI journalism interest you, and what projects you're working on, if you'd like to share. 🤗

Journalists on Hugging Face org

Hi! Thank you for this great initiative. It would be wonderful to begin a crowd-sourced curation of published journalistic stories or projects using AI!

Journalists on Hugging Face org

Would love to better understand what tools we can provide to journalists.

Journalists on Hugging Face org

I'd like to explore auto tagging articles with our taxonomy so writers can spend less time dealing with that. No one wants to slog through a list of taxonomy terms they need to click when they’re on deadline.
A personal community service hobby project I’m working on is a RAG application for city government meeting agendas and minutes. The retrieval is more complicated than other RAGs I’ve built because the metadata really matters. And I want it to include links to both source document chunks and the full documents.
Agents seem potentially useful. Could they help with article research?
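
To make the metadata point above concrete, here is a minimal sketch of metadata-aware retrieval: filter chunks on their metadata first, rank what is left, and return links to both the chunk and the full document. Everything in it is an assumption for illustration only: the `Chunk` fields, the `retrieve` helper, the example.org URLs, and the sample records are hypothetical, and plain word overlap stands in for whatever embedding search a real RAG pipeline would use.

```python
from dataclasses import dataclass
from datetime import date
import re


@dataclass
class Chunk:
    text: str
    body: str            # hypothetical metadata field, e.g. "City Council"
    meeting_date: date   # hypothetical metadata field
    doc_url: str         # link to the full source document
    chunk_url: str       # link to where this chunk sits in that document


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(chunks, query, body=None, since=None, top_k=3):
    """Filter on metadata first, then rank the survivors by word overlap."""
    query_tokens = tokenize(query)
    candidates = [
        c for c in chunks
        if (body is None or c.body == body)
        and (since is None or c.meeting_date >= since)
    ]
    scored = [(len(query_tokens & tokenize(c.text)), c) for c in candidates]
    scored = [(score, c) for score, c in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]


# Made-up sample records, not real meeting data.
CHUNKS = [
    Chunk("Council approved the downtown parking budget.",
          "City Council", date(2024, 3, 5),
          "https://example.org/minutes/2024-03-05.pdf",
          "https://example.org/minutes/2024-03-05.pdf#page=4"),
    Chunk("Planning board discussed parking requirements for new housing.",
          "Planning Board", date(2024, 2, 12),
          "https://example.org/minutes/2024-02-12.pdf",
          "https://example.org/minutes/2024-02-12.pdf#page=2"),
    Chunk("Council debated the parking budget for next year.",
          "City Council", date(2023, 6, 20),
          "https://example.org/minutes/2023-06-20.pdf",
          "https://example.org/minutes/2023-06-20.pdf#page=7"),
]

results = retrieve(CHUNKS, "parking budget",
                   body="City Council", since=date(2024, 1, 1))
for chunk in results:
    print(chunk.meeting_date, chunk.chunk_url, chunk.doc_url)
```

Filtering before ranking keeps a strong text match from the wrong body or the wrong year out of the results, and carrying both URLs on every chunk means the answer can always cite the passage and the full document.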

clem changed discussion status to closed
clem changed discussion status to open
Journalists on Hugging Face org

@smach Interesting use cases! About tagging: in my previous job, I fine-tuned a model for a slightly different task (categorization) with AutoTrain, and it worked like a charm. If you have a dataset of previous articles and their associated keywords, I'm sure you could get very interesting results.

There is a no-code interface here: https://huggingface.co/autotrain and the docs are here: https://huggingface.co/docs/autotrain/en/index. This tutorial could also be helpful, though it's a little bit old: https://www.youtube.com/watch?v=OH_e0wOkpZc

About your other project, have you seen this: https://www.vikramoberoi.com/how-citymeetings-nyc-uses-ai-to-make-it-easy-to-navigate-city-council-meetings/ ?

Journalists on Hugging Face org

Hi, I have a degree in Journalism and a postgraduate degree in Machine Learning. I'm studying and developing language models fine-tuned for Brazilian Portuguese and training image models that represent Brazilian culture (I haven't published them to HF yet, only on Civitai). I believe these initiatives can help produce more content in Portuguese and spread knowledge of Brazilian culture through more faithful representation in AI-generated images.

My interest in joining this org is to help build this bridge across languages and cultures.

@smach this is an old project, but it could work for your use case: https://huggingface.co/spaces/pleonova/multi-label-summary-text

I would also recommend using SetFit, it works like a charm!

LLMs are pretty good at giving you categories/topics and you could use those in your RAG pipeline, something I am doing currently.

Journalists on Hugging Face org

@lucianosb Welcome to the community! I'm curious: is it easy to find datasets in Portuguese or tailored to the Brazilian culture?

Journalists on Hugging Face org

@fdaudens Thanks for those links!!! They look very useful and I hope to dive in soon.
@pleonova Appreciate those suggestions also! I know many before me have done projects like this, and it's very useful to learn from others. Glad to be here.

Journalists on Hugging Face org

Olá @lucianosb ! My city has a lot of residents who are originally from Brazil. It would be useful if some LLMs translated to Brazilian Portuguese well.

Journalists on Hugging Face org

I work in broadcast television, and I'm excited to learn from everyone. Headed to the Local Media Consortium conference next week (https://event.localmediaconsortium.com/). Will tell everyone about this group. I keep a blog with all of the links I see every week, if anyone's interested: https://ethanbholland.com/. It's purely a labor of love to keep me conversant. Thanks for including me.

Journalists on Hugging Face org

Hi everyone, I'm exploring some use cases for media, and I already have a first PoC product that was piloted at some events with 200+ attendees. I'd be super interested to hear more about journalists' needs, so feel free to reach out if you'd like to have a chat!

Journalists on Hugging Face org

"I'd like to explore auto tagging articles with our taxonomy so writers can spend less time dealing with that. No one wants to slog through a list of taxonomy terms they need to click when they’re on deadline."

I am also very interested in automatic tagging. Apart from articles, I would like to use speech-to-text transcripts to tag radio and TV programmes according to a standard taxonomy: https://iptc.org/standards/media-topics/
I find it hard to figure out what is the best way to handle such a multi-label problem with quite a lot of hierarchically structured classes. Especially when it comes to tagging non-English content (in our case mostly German).
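
One common way to cope with hierarchically structured classes is to let a model score every label and then enforce the hierarchy afterwards. The sketch below illustrates that idea only, under loud assumptions: the `PARENT` mapping is a made-up mini-taxonomy (not real IPTC Media Topics entries), the scores are invented, and `with_ancestors` and `decode_top_down` are hypothetical helper names. It is independent of language and of whichever classifier produces the scores.

```python
# Hypothetical mini-taxonomy: child -> parent (None marks a top-level topic).
# These labels are placeholders, not actual IPTC Media Topics entries.
PARENT = {
    "sport": None,
    "football": "sport",
    "bundesliga": "football",
    "politics": None,
    "election": "politics",
}


def with_ancestors(labels, parent=PARENT):
    """Return the labels plus all of their ancestors.

    Useful when preparing training data: an item tagged with a leaf topic
    is also a positive example for every topic above that leaf.
    """
    expanded = set()
    for label in labels:
        node = label
        # Walk upwards; stop early if this branch was already added.
        while node is not None and node not in expanded:
            expanded.add(node)
            node = parent[node]
    return expanded


def decode_top_down(scores, threshold=0.5, parent=PARENT):
    """Keep a label only if it and all of its ancestors pass the threshold."""
    def keep(label):
        if scores.get(label, 0.0) < threshold:
            return False
        up = parent[label]
        return up is None or keep(up)

    return {label for label in scores if keep(label)}


# Invented per-label scores, as a multi-label classifier might output them.
scores = {
    "sport": 0.9,
    "football": 0.8,
    "bundesliga": 0.3,
    "politics": 0.2,
    "election": 0.7,
}

print(sorted(with_ancestors({"bundesliga"})))
print(sorted(decode_top_down(scores)))
```

Here "election" is dropped even though its own score is high, because its parent "politics" falls below the threshold. Whether that strictness is desirable depends on how reliable the top-level predictions are.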

Journalists on Hugging Face org

"@lucianosb Welcome to the community! I'm curious: is it easy to find datasets in Portuguese or tailored to the Brazilian culture?"

I believe there is still room for improvement in Portuguese-based datasets. A lot more datasets have become available since the LLM hype peaked, which has allowed for a lot of fine-tuned models, as you can see on the Open PT LLM Leaderboard.

Journalists on Hugging Face org

@constantinSch, I discussed this with the team and thought you might find this resource useful for training your own model:

https://github.com/huggingface/transformers/tree/main/examples/pytorch/audio-classification

To give you a point of reference for the amount of training data needed, the Keyword Spotting (KWS) task uses about 25 hours of labeled audio data.

clem changed discussion title from Hi! 👋 to Hi! Introduce yourself! 👋
clem pinned discussion

Looks like my comments and I are not welcome in this group and I have been removed from it. I wish someone had reached out if I violated your group policies.

Btw, there might be a bug where people who are not part of this group can still post messages.

My best wishes to you all.
