---
title: README
emoji: 💻
colorFrom: green
colorTo: green
sdk: static
pinned: false
---
Deep Learning 101
The top private AI Meetup in Taiwan, launched in 2016
http://DeepLearning101.TWMAN.ORG |
https://huggingface.co/DeepLearning101 |
https://www.youtube.com/@DeepLearning101
### [Speech Processing (語音處理)](https://github.com/Deep-Learning-101/Speech-Processing-Paper): [Pitfalls We Hit in Speech Processing](https://blog.twman.org/2021/04/ASR.html): [Analyzing and Recognizing Interviews and Conversations](https://www.twman.org/AI/ASR)
Speech Processing
Speech Recognition (語音識別)
- [Chinese Speech Recognition](https://www.twman.org/AI/ASR)
- [Speech Recognition Quality Inspection + Timestamps: Whisper Large V2](https://huggingface.co/spaces/DeepLearning101/Speech-Quality-Inspection_whisperX)
- [Whisper](https://github.com/Deep-Learning-101/Speech-Processing-Paper/blob/main/Whisper.md)
- [WeNet](https://github.com/Deep-Learning-101/Speech-Processing-Paper/blob/main/WeNet.md)
- [FunASR](https://github.com/Deep-Learning-101/Speech-Processing-Paper/blob/main/FunASR.md)
Speaker Recognition (聲紋識別)
- [Chinese Speaker (Voiceprint) Recognition](https://www.twman.org/AI/ASR/SpeakerRecognition)
- [WeSpeaker](https://github.com/Deep-Learning-101/Speech-Processing-Paper/blob/main/WeSpeaker.md)
- [SincNet](https://github.com/Deep-Learning-101/Speech-Processing-Paper/blob/main/SincNet.md)
Speech Enhancement (語音增強)
- [Chinese Speech Enhancement (Denoising)](https://www.twman.org/AI/ASR/SpeechEnhancement)
- [Speech Quality Inspection + Noise Removal: Meta Denoiser](https://huggingface.co/spaces/DeepLearning101/Speech-Quality-Inspection_Meta-Denoiser)
- [Denoiser](https://github.com/Deep-Learning-101/Speech-Processing-Paper/blob/main/Denoiser.md)
Speech Separation (語音分離)
- [Chinese Speaker Separation (Segmentation)](https://www.twman.org/AI/ASR/SpeechSeparation)
- [Mossformer](https://github.com/Deep-Learning-101/Speech-Processing-Paper/blob/main/Mossformer.md)
- [TOLD@FunASR](https://github.com/alibaba-damo-academy/FunASR/tree/main/egs/callhome/TOLD)
- [TOLD: A Speaker Diarization Framework That Can Model Overlapping Speech](https://zhuanlan.zhihu.com/p/650346578)
Speech Synthesis (語音合成)
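- [Rectified Flow Matching Speech Synthesis, Open-Sourced by Shanghai Jiao Tong University](https://www.speechhome.com/blogs/news/1712396018944970752): https://github.com/cantabile-kwok/VoiceFlow-TTS
- [Next-Generation Open-Source Speech Library Coqui TTS Reaches 20.5k GitHub Stars](https://zhuanlan.zhihu.com/p/661291996): https://github.com/coqui-ai/TTS/
- [Tsinghua University's LightGrad-TTS, with a Streaming Implementation](https://zhuanlan.zhihu.com/p/656012430): https://github.com/thuhcsi/LightGrad
- [Mobvoi MeetVoice: Making Synthesized Voices Indistinguishable from Real Ones](https://zhuanlan.zhihu.com/p/92903377)
- [VALL-E: Microsoft's New Speech Synthesis Model Can Replicate Anyone's Voice Within 3 Seconds](https://zhuanlan.zhihu.com/p/598473227)
- [BLSTM-RNN, Deep Voice, Tacotron… Have You Mastered Them All? A Summary of Essential Classic Speech Synthesis Models (Part 1)](https://new.qq.com/rain/a/20221204A02GIT00)
- [Tacotron2, GST, Glow-TTS, Flow-TTS… Have You Mastered Them All? A Summary of Essential Classic Speech Synthesis Models (Part 2)](https://cloud.tencent.com/developer/article/2250062)
- Bark: https://github.com/suno-ai/bark
- [The Most Powerful Text-to-Speech Tool: Bark, a Detailed Tutorial on Local Installation, Cloud Deployment, and Online Demo](https://zhuanlan.zhihu.com/p/630900585)
- [Optimizing the Text-to-Speech Model Bark with Transformers](https://zhuanlan.zhihu.com/p/651951136)
Natural Language Processing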
Large Language Model (大語言模型)
- [LangChain](https://github.com/Deep-Learning-101/Natural-Language-Processing-Paper#langchain)
- [Retrieval Augmented Generation](https://github.com/Deep-Learning-101/Natural-Language-Processing-Paper#rag)
- [LLM Model](https://github.com/Deep-Learning-101/Natural-Language-Processing-Paper#llm-%E6%A8%A1%E5%9E%8B%E4%BB%8B%E7%B4%B9)
Information/Event Extraction (資訊/事件擷取)
- [HugNLP](https://github.com/Deep-Learning-101/Natural-Language-Processing-Paper/blob/main/HugNLP.md)
- [DeepKE](https://github.com/Deep-Learning-101/Natural-Language-Processing-Paper/blob/main/DeepKE.md)
- [ERNIE-Layout](https://github.com/Deep-Learning-101/Natural-Language-Processing-Paper/blob/main/ERNIE-Layout.md)
Machine Reading Comprehension (機器閱讀理解)
- [Chinese Machine Reading Comprehension](https://www.twman.org/AI/NLP/MRC)
- [Traditional Chinese Reading Comprehension: BERT](https://huggingface.co/spaces/DeepLearning101/Reading-Comprehension_Bert)
Named Entity Recognition (命名實體識別)
Correction (糾錯)
Classification (分類)
Similarity (相似度)
Image Processing
Optical Character Recognition (光學字元辨識)
- [OCR for Traditional Chinese Medical Certificates and Receipts: PaddleOCR](https://huggingface.co/spaces/DeepLearning101/OCR101TW)
- PaddleOCR
Document Layout Analysis (文件結構分析)
- [arXiv-2020_LayoutLM](https://github.com/Deep-Learning-101/Computer-Vision-Paper/blob/main/LayoutLM.md)
- [arXiv-2021_LayoutLMv2](https://github.com/Deep-Learning-101/Computer-Vision-Paper/blob/main/LayoutLMv2.md)
- arXiv-2021_LayoutXLM
- arXiv-2022_LayoutLMv3
Document Understanding (文件理解)
Object Detection (物件偵測)
Handwriting Recognition (手寫識別)
Face Recognition (人臉識別)