Spaces: Delik/pyannote-speaker-diarization-3.1

Nick088 committed 29 days ago

Commit 633b502 • 1 Parent(s): 7954b59

added more info (#3)

- added more info (17744c2b17ce3c4f667f3c51bd5d2d06a6d5870a)
- added more info again (14d28e572b742cade2a26be3d716836eb8666f3f)

Co-authored-by: Nick088 <Nick088@users.noreply.huggingface.co>

Files changed (1)

app.py +10 -5


@@ -32,7 +32,7 @@ def save_audio(audio):
     return "temp.wav"
-@spaces.GPU(duration=90)
+@spaces.GPU(duration=60 * 2)
 def diarize_audio(temp_file, num_speakers, min_speakers, max_speakers):
     if pipeline is None:
         return "Error: Pipeline not initialized"
@@ -107,15 +107,20 @@ with gr.Blocks() as demo:
     Please upload an audio file and adjust the parameters as needed.
-    The maximum length of the audio file it can process is around **35-40 minutes**.
+    The maximum length of the audio file that can be processed depends based on the hardware it's running on. If you are on the ZeroGPU HuggingFace Space, it's around **35-40 minutes**.
     If you find this space helpful, please ❤ it.
     Join my server for support and open source AI discussion: https://discord.gg/osai
+    IF YOU LEAVE ALL THE PARAMETERS BELOW TO 0, IT WILL BE ON AUTO MODE, AUTOMATICALLY DETECTING THE SPEAKERS, ELSE USE THE ONES BELOW FOR MORE COSTUMIZATION & BETTER RESULTS
     """)
-    audio_input = gr.Audio(type="filepath", label="Upload Audio")
-    num_speakers_input = gr.Number(label="Number of Speakers", value=0)
+    audio_input = gr.Audio(type="filepath", label="Upload Audio File")
+    num_speakers_input = gr.Number(label="Number of Speakers", info="Use it only if you know the number of speakers in advance, else leave it to 0 and use the parameters below", value=0)
+    gr.Markdown("Use the following parameters only if you don't know the number of speakers, you can set lower and/or upper bounds on the number of speakers, if instead you know it, leave the following parameters to 0 and use the one above")
     min_speakers_input = gr.Number(label="Minimum Number of Speakers", value=0)
     max_speakers_input = gr.Number(label="Maximum Number of Speakers", value=0)
     process_button = gr.Button("Process")
@@ -127,4 +132,4 @@ with gr.Blocks() as demo:
         inputs=[audio_input, num_speakers_input, min_speakers_input, max_speakers_input],
         outputs=[diarization_output, label_file_link]
 )
-demo.launch()
+demo.launch(share = False)
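
Review note: the help text added in this commit describes three modes (all fields at 0 for automatic detection, a known speaker count, or lower/upper bounds). The body of `diarize_audio` is not part of this diff, so the following is only a sketch of how those three `gr.Number` values could be mapped to keyword arguments. `build_speaker_kwargs` is a hypothetical helper, not code from `app.py`, and the sketch assumes that `pipeline` is a pyannote.audio diarization pipeline accepting the `num_speakers`, `min_speakers` and `max_speakers` keywords.

```python
from typing import Dict, Optional


def build_speaker_kwargs(num_speakers: Optional[float] = 0,
                         min_speakers: Optional[float] = 0,
                         max_speakers: Optional[float] = 0) -> Dict[str, int]:
    """Hypothetical helper: turn the three UI values into pipeline kwargs.

    gr.Number yields floats (or None when the field is cleared), so every
    value is normalised to an int first, with 0 meaning "not set".
    """
    num = int(num_speakers or 0)
    lo = int(min_speakers or 0)
    hi = int(max_speakers or 0)

    # A known speaker count takes priority over the bounds.
    if num > 0:
        return {"num_speakers": num}

    if lo > 0 and hi > 0 and lo > hi:
        raise ValueError("min_speakers must not exceed max_speakers")

    # Only pass the bounds that were actually set; an empty dict means
    # auto mode, where the pipeline estimates the number of speakers.
    kwargs: Dict[str, int] = {}
    if lo > 0:
        kwargs["min_speakers"] = lo
    if hi > 0:
        kwargs["max_speakers"] = hi
    return kwargs


# Assumed usage inside diarize_audio (not shown in this diff):
#     diarization = pipeline(temp_file, **build_speaker_kwargs(
#         num_speakers, min_speakers, max_speakers))
```

Returning an empty dict for the all-zero case keeps the auto mode explicit: no speaker-related keyword is passed at all, rather than passing a 0 that the pipeline might interpret literally.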