machine learning Transformers have become the staple architecture for deep learning models NLP Diffusion models natural language processing deep learning Deep Learning Support vector machines random forests probability distribution Cross entropy loss Kullback leibler divergence Shannon entropy Activation functions ATM deep fakes AGI AI deep trouble artificial intelligence deep diving artificial snow shallow waters deep end RELU sigmoid GELU RNN CNN Gaussian