Mistral Hypnosis Finetune

#1
by Philipp-Sc - opened

Hi @jtatman ,

I am curious about this model, did you do any work on this?
I am working on something similar.

Best regards,
Philipp-Sc

Owner

I did work on this, and it's mostly unfinished at this point, unfortunately. I appreciate your reaching out!

I spent a few hundred hours generating various hypnosis scenarios with OpenAI, Cohere, and Hugging Face models of various capacities. I started with a general scraping of professional hypnosis scripts that were specific in intent and patterns, identified those patterns and differences, and settled on Ericksonian hypnosis as the best overall approach for development. Generation is hit or miss and ends up requiring a whole lot of filtering to get anything useful. Lots of manual work.

I figure this goes hand in hand with any future therapeutic bot interface gaining adoption, and the potential has been emerging ever since generated voices caught up a bit in quality. However, I haven't found a voice that is level and soothing enough, with very little variance, to serve as an output interface for effective guided sessions - IMHO. Obviously it's a little touchy, as a) hypnosis has a slightly bad reputation because of the fake stage and street acts that make it seem like a dangerous joke, and b) here it's the ghost in the machine responsible for synthetic guided hypnosis. I stopped working on this to consider the abuse potential and voice selection - which is also why I focused on legitimate Ericksonian hypnosis in the end.

Would appreciate any feedback or experiences you might be having - it's a bit of uncharted territory, not being formally trained and all, but having experienced the benefits of hypnosis myself, I wanted to move the proverbial ball a bit. Hope you're encountering less of that while sorting out the particulars. I'm appreciating the recent push toward smaller, less cluttered starting models - easier to specialize than to just apply a LoRA, or so it seems...

Cheers and Best!

Same here, it's challenging to build a great dataset.
Apart from OpenAI, there is also Mistral-Trismegistus-7B, which has some ability to write hypnosis transcripts, but in the end I did not get many good results.
I ended up transcribing a good amount of hypnosis sessions, mainly from YouTube, using Whisper, and applied quite a few methods (also using LLMs) to clean the data. Then I generated a prompt for each transcript using Mistral-7b-reverse-instruct to get a hypnosis instruct dataset.
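For illustration, the pairing step could be sketched like this: each reverse-generated prompt is stored alongside its cleaned transcript as one instruction-tuning record. The Alpaca-style keys, the helper name, and the example strings are my assumptions, not the actual pipeline:

```python
import json

def make_instruct_record(generated_prompt: str, transcript: str) -> dict:
    # Pair a reverse-generated prompt with its source transcript,
    # using Alpaca-style keys (an assumed target format).
    return {
        "instruction": generated_prompt.strip(),
        "input": "",
        "output": transcript.strip(),
    }

# Hypothetical example; in practice the instruction comes from the
# reverse-instruct model and the output from a cleaned transcript.
record = make_instruct_record(
    "Write a calming progressive-relaxation script for sleep.",
    "Settle back, let your eyes close, and notice your breathing slow...",
)
print(json.dumps(record, indent=2))
```

One JSONL line per transcript would then give a dataset most instruct-tuning tooling can consume directly.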

I trained a Mistral-7b hypnosis instruct model a few weeks ago. But then I realized nobody really wants to formulate a well-written query to get a customized hypnosis session. So I am now working on a second assistant that collects the user's preferences, which is also challenging due to my lack of data in this regard (and the number of possible ways a user can interact with it).

That's why I am curious what your dataset looks like. Maybe there are better ways to do it so one only needs a single model.

Regarding the voice: maybe now is the first time it has become possible; ElevenLabs is great in this regard.

The abuse potential could be serious when exploited by bad actors; unwanted triggers or negative reinforcement could unintentionally do damage. It would be sensible to evaluate this before releasing a hypnosis model.
Maybe you have some ideas how to evaluate for this. In general, though, I think that if done right, a hypnosis assistant powered by advanced AI - providing users with personalized, convenient, and accessible guided sessions that promote stress relief, improved focus, enhanced creativity, and overall emotional well-being - should outweigh the abuse potential. ;-)

Best regards,
Philipp-Sc

Owner

I hear you about Trismegistus; I was excited to see that one - a little esoteric sometimes with the added occult and alternative material, but great nonetheless.

I think you're probably right about the single model - I'm trending toward specialized models on minimal scaffolding these days. The vision I had was that misuse (hypnotizing without consent, using methods without disclosure, etc.) has to be curtailed without massive alignment - but that's as far as I went, because after all the models are flexible stats machines: they want to return something even if it's wrong. I was wondering the other day if there's a way to construct a mixture of experts such that one expert is always the filter or gatekeeper for abuse situations - as in, it generates the worst possible case and then evals through a sentiment layer that specifically looks for manipulation in every request. Of course, just defining the depths of human manipulation is staggering in itself, and always changing: a devious approach one day is a different approach the next, and so forth.

I really appreciate the nod to reverse-instruct for generating prompts - I was using summarization and multiple restatements coded by sentiment scores, which was overly complex at best. The dataset is masked for social reasons - a lot of the data is from archaic books from the 1920s and so on, when hypnosis was cool and emergent, or transcripts of Erickson and co. doing their thing way back when. I like the idea of the YouTube source-and-transcribe approach - did you have to make many corrections on transcription? I had a real problem with old scans producing very confusing paragraphs and breaking the trance - a lot of material had to be thrown right out because it was unnerving to hear jagged speech in the middle of a session. I'm really excited about the recent OpenVoice cloning for getting a tonal pattern right for mood, delivery, and consistency, but mood or not, when the trance breaks it all has to be started again.

Agree about the potential - self-hypnosis is a real key: if it's personal enough and can be trusted, there's much less potential for abuse. If this could be distilled to that kind of output - timed or variable-length sessions, subject matter from a list of possibles, targeted voice output (one-directional text-to-speech or something) - there's probably even less, even if used manipulatively. Combined with a therapy model that recommends audible hypnosis sessions based on certain conditions or approaches, it would be fully on target, in my mind anyhow. A mixture of experts in their modalities? :)

Someone capable can always fine-tune the model to produce harmful content, but at least that takes some effort. I did some experiments where I trained a model on crime interrogation, which had the funny end result that a bad actor would end up being shifted into an interrogation interview rather than a hypnosis session. The model has to decide whether a request is a crime investigation or a hypnosis session, which is similar to your idea.
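A minimal sketch of that gatekeeper idea - classify the request first, generate only if it passes - could look like this. The prompt wording, labels, and the stub classifier are all hypothetical; in practice `classify` would be a call to the fine-tuned model:

```python
from typing import Callable

# Hypothetical gatekeeper prompt; labels and wording are assumptions.
GATE_PROMPT = (
    "Classify the user request as exactly one label:\n"
    "HYPNOSIS - a consensual, self-directed guided session\n"
    "MANIPULATION - covert influence, interrogation, or coercion\n\n"
    "Request: {request}\nLabel:"
)

def route(request: str, classify: Callable[[str], str]) -> str:
    # Run the gatekeeper before any session generation happens.
    label = classify(GATE_PROMPT.format(request=request)).strip().upper()
    if label != "HYPNOSIS":
        return "Refused: this assistant only produces consensual self-hypnosis."
    return "OK: generate session"

# Stub classifier standing in for an LLM call (assumption, not a real API).
print(route("Help me relax before sleep", lambda p: "HYPNOSIS"))
```

The nice property is that the refusal path never touches the generation model at all, so a jailbroken generation prompt still has to get past the separate classifier first.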

I am wondering: as the models get good enough, prompt engineering becomes viable, and one might not even need to fine-tune, given that one can add enough samples to the prompt. Maybe that's the way to get started - collect data first, then later fine-tune a smaller model to save costs.
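As a sketch of that few-shot approach, sample sessions could be assembled directly into the prompt; the header markers and function name here are my own assumptions for illustration:

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], request: str) -> str:
    # Concatenate (request, session) example pairs, then append the new
    # request so the model completes the final "### Session" block.
    parts = []
    for instr, script in examples:
        parts.append(f"### Request\n{instr}\n### Session\n{script}\n")
    parts.append(f"### Request\n{request}\n### Session\n")
    return "\n".join(parts)

prompt = build_few_shot_prompt(
    [("A short body-scan session for stress.",
      "Begin by noticing the weight of your body in the chair...")],
    "A five-minute focus session before studying.",
)
print(prompt)
```

The same examples could later become fine-tuning data verbatim, which is what makes this a cheap way to bootstrap before committing to training a smaller model.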

Overall it sounds like you have an intriguing dataset; YouTube transcripts, on the other hand, are maybe more vanilla and less sophisticated, as they are mostly for entertainment. Whisper is great, but it often produced strange artifacts or duplicate sentences. I used multiple regexes and went sentence by sentence over the transcripts using Mistral to correct obvious grammar mistakes. A little compute-intensive...
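A minimal example of the kind of regex pass described - collapsing Whisper's consecutive duplicate sentences before the LLM grammar pass - assuming a simple split on terminal punctuation:

```python
import re

def clean_transcript(text: str) -> str:
    # Split into sentences on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    cleaned = []
    for s in sentences:
        # Drop exact consecutive repeats, a common Whisper artifact.
        if not cleaned or s.casefold() != cleaned[-1].casefold():
            cleaned.append(s)
    # Collapse any remaining runs of whitespace.
    return re.sub(r"\s+", " ", " ".join(cleaned))

print(clean_transcript("Relax now. Relax now. Breathe deep.  Breathe deep."))
```

Near-duplicates with small wording changes would slip through this, which is presumably where the per-sentence LLM pass earns its compute cost.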

Right now I am building an app for AI-guided self-hypnosis, mainly because I want to see how good personalized hypnosis can get. What I can say is that the questions a hypnotherapist would ask to personalize their session often lead to self-reflection, which provides a link to regular therapy; an AI that has both capabilities would definitely be interesting.
