Vincent Granville

vincentg64

AI & ML interests

GenAI, LLM, synthetic data, optimization, fine-tuning, model evaluation

Recent Activity

posted an update 8 days ago
The Rise of Specialized LLMs for Enterprise - https://mltblog.com/3QXXE4I
posted an update about 2 months ago
Spectacular Connection Between LLMs, Quantum Systems, and Number Theory | https://mltblog.com/3DgambA

In my recent paper 51 on cracking the deepest mathematical mystery, available at https://mltblog.com/3zsnQ2g, I paved the way to solving a famous centuries-old math conjecture. The question is whether or not the digits of numbers such as π are evenly distributed. Currently, no one knows whether the proportion of '1' even exists in these binary digit expansions: it could oscillate forever without ever converging. Of course, mathematicians believe that it is 50% in all cases. Trillions of digits have been computed for various constants, and they pass all randomness tests.

In this article, I offer a new framework to solve this mystery once and for all, for the number e. Rather than closing the topic, it is a starting point that opens new research directions in several fields. Applications include cryptography, dynamical systems, quantum dynamics, high-performance computing, LLMs that answer difficult math questions, and more. The highly innovative approach involves iterated self-convolutions of strings and working with numbers as large as (2^n + 1)^(2^n), with n larger than 100,000. No one has ever analyzed the digits of such titanic numbers before!

To read the full article, participate in the AI & LLM challenge, get the very fast Python code, read about the ground-breaking research, and see all the applications, visit https://mltblog.com/3DgambA
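As a toy illustration of the kind of arithmetic involved, here is a minimal sketch, assuming the titanic numbers in question are of the form (2^n + 1)^(2^n); n is kept tiny here, versus n > 100,000 in the article, and this is not the article's fast code.

```python
# Toy version of the "titanic number" arithmetic described above, assuming
# the object of interest is (2^n + 1) raised to the power 2^n.
n = 12                               # the article works with n > 100,000
N = (2**n + 1) ** (2**n)             # exact big-integer arithmetic in Python

bits = bin(N)[2:]                    # full binary expansion of the result
share = bits.count("1") / len(bits)
print(f"{len(bits):,} binary digits, share of '1': {share:.4f}")
```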


Posts 23

The Rise of Specialized LLMs for Enterprise - https://mltblog.com/3QXXE4I

In this article, I discuss the main problems of standard LLMs (OpenAI and the like), and how the new generation of LLMs addresses these issues. The focus is on Enterprise LLMs.

LLMs with Billions of Parameters: Most LLMs still fall into that category. The first ones (ChatGPT) appeared around 2022, though BERT is an early precursor. Most recent books discussing LLMs still define them as transformer architectures with deep neural networks (DNNs), costly training, and reliance on GPUs. Training is optimized to predict the next token or missing tokens. However, that task is only remotely relevant to what modern LLMs now deliver to the user, yet it requires time and intensive compute resources. Indeed, this type of architecture works best with billions or trillions of tokens. In the end, most of these tokens are noise, requiring smart distillation to improve performance.
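To make the training objective concrete, here is a minimal sketch of next-token prediction, with a toy bigram counter standing in for the transformer; real LLMs learn the same conditional distribution with deep networks, and none of this code comes from the article.

```python
# Minimal sketch of the next-token objective: estimate P(next | current)
# from token co-occurrence counts. A transformer learns the same kind of
# conditional distribution, just over much longer contexts.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat ran".split()

# "Training": count which token follows which.
transitions = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current][nxt] += 1

def next_token_distribution(token):
    """Empirical distribution over the next token, given the current one."""
    counts = transitions[token]
    total = sum(counts.values())
    return {t: round(c / total, 3) for t, c in counts.items()}

print(next_token_distribution("the"))   # {'cat': 0.667, 'mat': 0.333}
```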

The main issues are:

➡️ Performance: Requires GPUs and large corpora as input data. Re-training is expensive. Hallucinations are still a problem. Fine-tuning is delicate (black box). You need prompt engineering to get the best results. Mixture-of-experts architectures (multiple sub-LLMs, as in DeepSeek) are one step towards improving accuracy; see the sketch after this list.

➡️ Cost: Besides the GPU costs, the pricing model charges by the token, incentivizing vendors to rely on models that consume and produce huge numbers of tokens; a pricing illustration follows below.
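To illustrate the mixture-of-experts idea from the performance item above, here is a hedged sketch in which a router dispatches each prompt to one of several specialized sub-models. The expert functions and routing keywords are hypothetical placeholders, not DeepSeek's or any other vendor's actual components.

```python
# Toy mixture-of-experts routing: a router picks which specialized
# sub-model answers each prompt. All names are hypothetical placeholders.
def code_expert(prompt):
    return f"[code expert] {prompt}"

def math_expert(prompt):
    return f"[math expert] {prompt}"

def general_expert(prompt):
    return f"[general expert] {prompt}"

EXPERTS = {"code": code_expert, "math": math_expert, "general": general_expert}

def route(prompt):
    """Toy keyword router; real MoE models learn soft routing weights per token."""
    text = prompt.lower()
    if any(word in text for word in ("python", "bug", "compile")):
        return "code"
    if any(word in text for word in ("prove", "digits", "integral")):
        return "math"
    return "general"

prompt = "prove that the digits of e are evenly distributed"
print(EXPERTS[route(prompt)](prompt))   # -> [math expert] prove that ...
```

And a tiny arithmetic illustration of the per-token pricing point; all prices below are made up, not any vendor's actual quotes.

```python
# Illustrative per-token pricing arithmetic (hypothetical prices).
in_price, out_price = 2.50, 10.00        # USD per million tokens (assumed)
tokens_in, tokens_out = 50_000, 8_000    # tokens sent and received
cost = tokens_in / 1e6 * in_price + tokens_out / 1e6 * out_price
print(f"request cost: ${cost:.2f}")      # about $0.21 for this example
```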

Read the full article, which describes more issues and how LLM 2.0 addresses them, at https://mltblog.com/3QXXE4I

More links:

- To receive the latest updates: https://mltblog.com/4iTvQec
- About LLM 2.0: https://mltblog.com/4g2sKTv
- PowerPoint presentation: https://mltblog.com/43DYviE
- Our company website: https://mlt

LLM Challenge with Petabytes of Data to Prove Famous Number Theory Conjecture https://mltblog.com/3F3Y9Yd

In my recent article “Piercing the Deepest Mathematical Mystery”, I paved the way to proving a famous centuries-old conjecture: are the digits of major mathematical constants such as π, e, log 2, or √2 evenly distributed? No one has ever managed to prove even the most basic facts, such as whether the proportion of ‘0’ or ‘1’ in the binary expansion of any of these constants even exists, or whether it oscillates indefinitely between 0% and 100%.

Here I provide an overview of the new framework built to uncover deep results about the digit distribution of Euler’s number e, discuss the latest developments, share a 10x faster version of the code, and feature new potential research areas in LLMs, AI, quantum dynamics, high-performance computing, cryptography, dynamical systems, number theory, and more, arising from my discovery. Perhaps the most interesting part is testing LLMs and other AI tools to assess their reasoning capabilities on a fascinating math problem whose solution is not posted anywhere. A naive digit-counting sketch follows below.
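As a naive illustration of the empirical side of this question (my own sketch, not the article's self-convolution framework or its fast Python code), the snippet below builds e as an exact rational partial sum of 1/k! and counts the '1' bits in its binary expansion.

```python
# Naive check of the '1'-bit share in the binary expansion of e.
# e is built as an exact partial sum of 1/k!, accurate to well over
# the 10,000 bits inspected (1/1300! < 2^-11000).
import math

N = 1300
den = math.factorial(N)
num = sum(den // math.factorial(k) for k in range(N + 1))   # num/den ~ e

frac = num - 2 * den                     # fractional part of e, as frac/den
bits = []
for _ in range(10_000):                  # first 10,000 fractional bits
    frac *= 2
    bit, frac = divmod(frac, den)        # next binary digit and remainder
    bits.append(bit)

# Expected to land near 0.50 if the digits behave randomly, as conjectured.
print(f"share of '1' among first {len(bits)} bits: {sum(bits) / len(bits):.4f}")
```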

➡️ Read about the challenge at https://mltblog.com/3F3Y9Yd
