I can't make it sound in character with my audio reference, as the reference audio needs much, MUCH less emotion; otherwise the SoVITS model will copy its emotions.