This is a model that takes in visual and audio information and applies edits accordingly. It can also be trained to follow an editor's style of editing, but you will need at least 16 cores (32 threads) and 60 GB of RAM to train it.
The data sets are the videos I used in the edits and the XML files exported from the finished edits.
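A minimal sketch of how such a data set could be paired up on disk, assuming each video and its exported XML share a file name (for example clip01.mp4 and clip01.xml). The folder layout, the function name pair_videos_with_xml and the extension list are assumptions for illustration and are not part of EdAixml.

```python
from pathlib import Path

# Assumed set of video extensions; adjust to match your footage.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv"}


def pair_videos_with_xml(video_dir, xml_dir):
    """Return (video, xml) path pairs whose file stems match."""
    # Index the exported XML files by their name without extension.
    xml_by_stem = {p.stem: p for p in Path(xml_dir).glob("*.xml")}
    pairs = []
    for video in sorted(Path(video_dir).iterdir()):
        if video.suffix.lower() not in VIDEO_EXTENSIONS:
            continue  # skip non-video files
        xml = xml_by_stem.get(video.stem)
        if xml is not None:  # keep only videos that have a finished edit
            pairs.append((video, xml))
    return pairs
```

Videos without a matching XML file are skipped, so unfinished edits do not end up in the training set.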
So far it only works for: -Vegas Pro 18
In theory it should work for: -DaVinci Resolve -Final Cut Pro
Not tested yet: -Premiere Pro -etc.
Example use:
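from EdAixml import VideoAudioFeatureExtractor
# Arguments: path to the source video, then the path of the XML file (here an export location).
extractor = VideoAudioFeatureExtractor("D:\\path\\to\\video.mp4", "D:\\path\\to\\Export\\Location\\generated.xml")
# Run the model on the video.
extractor.process_video()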
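To run the same call over a whole folder of clips, the XML path for each video can be worked out first. This is only a sketch, assuming the second constructor argument is where the generated XML is written (as the example path suggests); plan_exports and its naming scheme are hypothetical and not part of EdAixml.

```python
from pathlib import Path


def plan_exports(video_dir, export_dir, pattern="*.mp4"):
    """Map every matching video to an XML path inside export_dir."""
    export_dir = Path(export_dir)
    return [
        # One XML file per video, named after the video's file stem.
        (str(video), str(export_dir / f"{video.stem}.xml"))
        for video in sorted(Path(video_dir).glob(pattern))
    ]
```

Each (video, xml) pair can then be passed to VideoAudioFeatureExtractor and process_video() in the same way as the single-file example.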