GPT-4o Added!
After you run inference on an image, the response stays in the left sidebar, where the sidebar buttons let you edit or display it and download the .md file to save it. If you don't want to keep the history, a Delete All button clears all response history files so that each demonstration starts clean.
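The page does not show how the response history is stored, so the following is only a minimal sketch of the save, list, and delete-all behavior described above. It assumes each response is written as a timestamped .md file in one folder; the function names and the file-naming scheme are hypothetical, not taken from the app.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional


def save_response(history_dir: Path, text: str, stamp: Optional[datetime] = None) -> Path:
    """Write one response to a timestamped .md file (naming scheme is an assumption)."""
    history_dir.mkdir(parents=True, exist_ok=True)
    stamp = stamp or datetime.now()
    path = history_dir / f"{stamp:%Y%m%d-%H%M%S-%f}.md"
    path.write_text(text, encoding="utf-8")
    return path


def list_responses(history_dir: Path) -> List[Path]:
    """Return saved responses, oldest first (names sort chronologically)."""
    return sorted(history_dir.glob("*.md"))


def delete_all_responses(history_dir: Path) -> int:
    """Remove every saved .md response and report how many were deleted."""
    files = list_responses(history_dir)
    for f in files:
        f.unlink()
    return len(files)
```

A sidebar could call `list_responses` to build its entries and bind the Delete All button to `delete_all_responses`.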
Copying the output from the image model and pasting it into the text input below the markdown triggers an arXiv search for research papers relevant to that text. It can also compose a story that changes the feel of the text, using three other LLMs whose output differs fundamentally from what GPT-4o would produce.
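The page does not say how the arXiv search is implemented. As a hedged illustration only, the sketch below assumes the public arXiv query API (an Atom feed at export.arxiv.org) and shows how pasted text could become a query URL and how a returned feed could be reduced to date, title, abstract link, and PDF link. The function names and the `all:` search field are choices made for this example, not the app's.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the feed


def build_query_url(text: str, max_results: int = 20) -> str:
    """Build an arXiv API URL that searches all fields for the given text."""
    params = {"search_query": f"all:{text}", "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_feed(atom_xml: str) -> list:
    """Extract date, title, and links from an arXiv Atom feed string."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        abs_url = entry.findtext(f"{ATOM}id", default="").strip()
        papers.append({
            # <published> is an ISO timestamp; keep only the YYYY-MM-DD part.
            "date": entry.findtext(f"{ATOM}published", default="").strip()[:10],
            # Titles may wrap across lines in the feed; collapse whitespace.
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "abs_link": abs_url,
            "pdf_link": abs_url.replace("/abs/", "/pdf/"),
        })
    return papers
```

To run a live search, the URL from `build_query_url` can be fetched with `urllib.request.urlopen` and the decoded response body passed to `parse_feed`.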
🔎 Create a story with these concepts from my markdown outline: 🎹 Grand Piano: The central element of the image is a grand piano. 🎼 Musical Notation: There are various musical notations and symbols above and around the piano. 🎨 Color Scheme: The image features a warm color palette with shades of orange, yellow, and red. 🔠 Letters on Keys: The piano keys have letters written on them, possibly indicating notes or a specific coding system. 📊 Charts and Graphs: There are several charts and graphs on the right side of the image, likely representing musical data or analysis. 📏 Lines and Measurements: Vertical and horizontal lines are drawn, possibly indicating measurements or connections between different elements. 🔍 Detailed Annotations: The image includes detailed annotations and labels, suggesting a technical or educational purpose. 🎶 Sheet Music: The top part of the image resembles sheet music with staves and notes. 🖼️ Abstract Art: The background includes abstract shapes and patterns, adding an artistic touch. 📚 Educational Tool: The combination of musical elements and data suggests that this image could be used as an educational tool for learning about music theory or piano.
In the heart of a warm, inviting studio, a grand piano takes center stage. Its polished ebony surface gleams under the soft glow of ambient lighting, reflecting the vibrant hues of the surrounding space. The room is alive with the spirit of creativity, as if each colorful stroke of paint on the walls has infused it with inspiration.
Above the piano, a cascade of musical notation and symbols dance in mid-air, defying gravity. They weave together in an intricate ballet, guiding the eye from one end of the room to the other. Each note, rest, and clef is meticulously crafted, their lines and curves forming a visual symphony that echoes the melodies waiting to be coaxed from the piano below.
The keys of the grand piano themselves are no ordinary set of ivory and ebony. Instead, they bear the imprint of letters, arranged in a mysterious coding system that seems to hold secrets of its own. As the viewer's gaze traces the pattern, they can't help but wonder if these letters represent hidden messages or a new way of understanding music.
On the right side of the room, charts and graphs spring to life, their lines and shapes illustrating the complex relationships between notes, chords, and rhythm. Each graph tells a story, revealing patterns and connections that might otherwise go unnoticed. The data visualizations are both beautiful and informative, offering a unique perspective on the world of music.
Vertical and horizontal lines crisscross the space, connecting the various elements like an invisible web. These measurements serve as a reminder that music, like any other form of art, is grounded in mathematics and structure. They also suggest that there is order and purpose behind the seemingly chaotic swirl of colors and shapes.
Detailed annotations and labels accompany each element, providing context and clarity. The text is precise and technical, yet approachable, inviting the viewer to explore and learn. It's clear that this image was designed with education in mind, offering a rich tapestry of information for those willing to delve into its depths.
At the top of the image, a staff of sheet music beckons, its notes and rests waiting to be played. The staves seem to stretch beyond the boundaries of the page, inviting the viewer to imagine the music that
🩺🔍 Search Results
01 Dec 2021 | Score Transformer: Generating Musical Score from Note-level Representation | ⬇️
Masahiro Suzuki
In this paper, we explore the tokenized representation of musical scores using the Transformer model to automatically generate musical scores. Thus far, sequence models have yielded fruitful results with note-level (MIDI-equivalent) symbolic representations of music. Although the note-level representations can comprise sufficient information to reproduce music aurally, they cannot contain adequate information to represent music visually in terms of notation. Musical scores contain various musical symbols (e.g., clef, key signature, and notes) and attributes (e.g., stem direction, beam, and tie) that enable us to visually comprehend musical content. However, automated estimation of these elements has yet to be comprehensively addressed. In this paper, we first design score token representation corresponding to the various musical elements. We then train the Transformer model to transcribe note-level representation into appropriate music notation. Evaluations of popular piano scores show that the proposed method significantly outperforms existing methods on all 12 musical aspects that were investigated. We also explore an effective notation-level token representation to work with the model and determine that our proposed representation produces the steadiest results.
07 Aug 2019 | Deep Learning Techniques for Music Generation -- A Survey | ⬇️
Jean-Pierre Briot, Gaëtan Hadjeres and François-David Pachet
This paper is a survey and an analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis: Objective - What musical content is to be generated? Examples are: melody, polyphony, accompaniment or counterpoint. - For what destination and for what use? To be performed by a human(s) (in the case of a musical score), or by a machine (in the case of an audio file). Representation - What are the concepts to be manipulated? Examples are: waveform, spectrogram, note, chord, meter and beat. - What format is to be used? Examples are: MIDI, piano roll or text. - How will the representation be encoded? Examples are: scalar, one-hot or many-hot. Architecture - What type(s) of deep neural network is (are) to be used? Examples are: feedforward network, recurrent network, autoencoder or generative adversarial networks. Challenge - What are the limitations and open challenges? Examples are: variability, interactivity and creativity. Strategy - How do we model and control the process of generation? Examples are: single-step feedforward, iterative feedforward, sampling or input manipulation. For each dimension, we conduct a comparative analysis of various models and techniques and we propose some tentative multidimensional typology. This typology is bottom-up, based on the analysis of many existing deep-learning based systems for music generation selected from the relevant literature. These systems are described and are used to exemplify the various choices of objective, representation, architecture, challenge and strategy. The last section includes some discussion and some prospects.
15 Dec 2023 | N-Gram Unsupervised Compoundation and Feature Injection for Better Symbolic Music Understanding | ⬇️
Jinhao Tian, Zuchao Li, Jiajia Li, Ping Wang
The first step to apply deep learning techniques for symbolic music understanding is to transform musical pieces (mainly in MIDI format) into sequences of predefined tokens like note pitch, note velocity, and chords. Subsequently, the sequences are fed into a neural sequence model to accomplish specific tasks. Music sequences exhibit strong correlations between adjacent elements, making them prime candidates for N-gram techniques from Natural Language Processing (NLP). Consider classical piano music: specific melodies might recur throughout a piece, with subtle variations each time. In this paper, we propose a novel method, NG-Midiformer, for understanding symbolic music sequences that leverages the N-gram approach. Our method involves first processing music pieces into word-like sequences with our proposed unsupervised compoundation, followed by using our N-gram Transformer encoder, which can effectively incorporate N-gram information to enhance the primary encoder part for better understanding of music sequences. The pre-training process on large-scale music datasets enables the model to thoroughly learn the N-gram information contained within music sequences, and subsequently apply this information for making inferences during the fine-tuning stage. Experiment on various datasets demonstrate the effectiveness of our method and achieved state-of-the-art performance on a series of music understanding downstream tasks. The code and model weights will be released at https://github.com/CinqueOrigin/NG-Midiformer.
27 Jul 2023 | Graph-based Polyphonic Multitrack Music Generation | ⬇️
Emanuele Cosenza, Andrea Valenti, Davide Bacciu
Graphs can be leveraged to model polyphonic multitrack symbolic music, where notes, chords and entire sections may be linked at different levels of the musical hierarchy by tonal and rhythmic relationships. Nonetheless, there is a lack of works that consider graph representations in the context of deep learning systems for music generation. This paper bridges this gap by introducing a novel graph representation for music and a deep Variational Autoencoder that generates the structure and the content of musical graphs separately, one after the other, with a hierarchical architecture that matches the structural priors of music. By separating the structure and content of musical graphs, it is possible to condition generation by specifying which instruments are played at certain times. This opens the door to a new form of human-computer interaction in the context of music co-creation. After training the model on existing MIDI datasets, the experiments show that the model is able to generate appealing short and long musical sequences and to realistically interpolate between them, producing music that is tonally and rhythmically consistent. Finally, the visualization of the embeddings shows that the model is able to organize its latent space in accordance with known musical concepts.
24 Jun 2019 | A Convolutional Approach to Melody Line Identification in Symbolic Scores | ⬇️
Federico Simonetta and Carlos Cancino-Chacón and Stavros Ntalampiras and Gerhard Widmer
In many musical traditions, the melody line is of primary significance in a piece. Human listeners can readily distinguish melodies from accompaniment; however, making this distinction given only the written score -- i.e. without listening to the music performed -- can be a difficult task. Solving this task is of great importance for both Music Information Retrieval and musicological applications. In this paper, we propose an automated approach to identifying the most salient melody line in a symbolic score. The backbone of the method consists of a convolutional neural network (CNN) estimating the probability that each note in the score (more precisely: each pixel in a piano roll encoding of the score) belongs to the melody line. We train and evaluate the method on various datasets, using manual annotations where available and solo instrument parts where not. We also propose a method to inspect the CNN and to analyze the influence exerted by notes on the prediction of other notes; this method can be applied whenever the output of a neural network has the same size as the input.
29 Jul 2020 | Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining | ⬇️
TJ Tsai and Kevin Ji
This paper studies composer style classification of piano sheet music images. Previous approaches to the composer classification task have been limited by a scarcity of data. We address this issue in two ways: (1) we recast the problem to be based on raw sheet music images rather than a symbolic music format, and (2) we propose an approach that can be trained on unlabeled data. Our approach first converts the sheet music image into a sequence of musical "words" based on the bootleg feature representation, and then feeds the sequence into a text classifier. We show that it is possible to significantly improve classifier performance by first training a language model on a set of unlabeled data, initializing the classifier with the pretrained language model weights, and then finetuning the classifier on a small amount of labeled data. We train AWD-LSTM, GPT-2, and RoBERTa language models on all piano sheet music images in IMSLP. We find that transformer-based architectures outperform CNN and LSTM models, and pretraining boosts classification accuracy for the GPT-2 model from 46% to 70% on a 9-way classification task. The trained model can also be used as a feature extractor that projects piano sheet music into a feature space that characterizes compositional style.
07 Jan 2021 | Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs | ⬇️
Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, Yi-Hsuan Yang
To apply neural sequence models such as the Transformers to music generation tasks, one has to represent a piece of music by a sequence of tokens drawn from a finite set of pre-defined vocabulary. Such a vocabulary usually involves tokens of various types. For example, to describe a musical note, one needs separate tokens to indicate the note's pitch, duration, velocity (dynamics), and placement (onset time) along the time grid. While different types of tokens may possess different properties, existing models usually treat them equally, in the same way as modeling words in natural languages. In this paper, we present a conceptually different approach that explicitly takes into account the type of the tokens, such as note types and metric types. And, we propose a new Transformer decoder architecture that uses different feed-forward heads to model tokens of different types. With an expansion-compression trick, we convert a piece of music to a sequence of compound words by grouping neighboring tokens, greatly reducing the length of the token sequences. We show that the resulting model can be viewed as a learner over dynamic directed hypergraphs. And, we employ it to learn to compose expressive Pop piano music of full-song length (involving up to 10K individual tokens per song), both conditionally and unconditionally. Our experiment shows that, compared to state-of-the-art models, the proposed model converges 5 to 10 times faster at training (i.e., within a day on a single GPU with 11 GB memory), and with comparable quality in the generated music.
22 Feb 2024 | Structuring Concept Space with the Musical Circle of Fifths by Utilizing Music Grammar Based Activations | ⬇️
Tofara Moyo
In this paper, we explore the intriguing similarities between the structure of a discrete neural network, such as a spiking network, and the composition of a piano piece. While both involve nodes or notes that are activated sequentially or in parallel, the latter benefits from the rich body of music theory to guide meaningful combinations. We propose a novel approach that leverages musical grammar to regulate activations in a spiking neural network, allowing for the representation of symbols as attractors. By applying rules for chord progressions from music theory, we demonstrate how certain activations naturally follow others, akin to the concept of attraction. Furthermore, we introduce the concept of modulating keys to navigate different basins of attraction within the network. Ultimately, we show that the map of concepts in our model is structured by the musical circle of fifths, highlighting the potential for leveraging music theory principles in deep learning algorithms.
04 Feb 2020 | Learning the helix topology of musical pitch | ⬇️
Vincent Lostanlen, Sripathi Sridhar, Brian McFee, Andrew Farnsworth, Juan Pablo Bello
To explain the consonance of octaves, music psychologists represent pitch as a helix where azimuth and axial coordinate correspond to pitch class and pitch height respectively. This article addresses the problem of discovering this helical structure from unlabeled audio data. We measure Pearson correlations in the constant-Q transform (CQT) domain to build a K-nearest neighbor graph between frequency subbands. Then, we run the Isomap manifold learning algorithm to represent this graph in a three-dimensional space in which straight lines approximate graph geodesics. Experiments on isolated musical notes demonstrate that the resulting manifold resembles a helix which makes a full turn at every octave. A circular shape is also found in English speech, but not in urban noise. We discuss the impact of various design choices on the visualization: instrumentarium, loudness mapping function, and number of neighbors K.
25 Nov 2021 | A-Muze-Net: Music Generation by Composing the Harmony based on the Generated Melody | ⬇️
Or Goren, Eliya Nachmani, Lior Wolf
We present a method for the generation of Midi files of piano music. The method models the right and left hands using two networks, where the left hand is conditioned on the right hand. This way, the melody is generated before the harmony. The Midi is represented in a way that is invariant to the musical scale, and the melody is represented, for the purpose of conditioning the harmony, by the content of each bar, viewed as a chord. Finally, notes are added randomly, based on this chord representation, in order to enrich the generated audio. Our experiments show a significant improvement over the state of the art for training on such datasets, and demonstrate the contribution of each of the novel components.
20 Apr 2020 | Music Gesture for Visual Sound Separation | ⬇️
Chuang Gan, Deng Huang, Hang Zhao, Joshua B. Tenenbaum, Antonio Torralba
Recent deep learning approaches have achieved impressive performance on visual sound separation tasks. However, these approaches are mostly built on appearance and optical flow like motion feature representations, which exhibit limited abilities to find the correlations between audio signals and visual points, especially when separating multiple instruments of the same types, such as multiple violins in a scene. To address this, we propose "Music Gesture," a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music. We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals. Experimental results on three music performance datasets show: 1) strong improvements upon benchmark metrics for hetero-musical separation tasks (i.e. different instruments); 2) new ability for effective homo-musical separation for piano, flute, and trumpet duets, which to our best knowledge has never been achieved with alternative methods. Project page: http://music-gesture.csail.mit.edu.
07 Feb 2024 | PBSCSR: The Piano Bootleg Score Composer Style Recognition Dataset | ⬇️
Arhan Jain, Alec Bunn, Austin Pham, and TJ Tsai
This article motivates, describes, and presents the PBSCSR dataset for studying composer style recognition of piano sheet music. Our overarching goal was to create a dataset for studying composer style recognition that is "as accessible as MNIST and as challenging as ImageNet". To achieve this goal, we use a previously proposed feature representation of sheet music called a bootleg score, which encodes the position of noteheads relative to the staff lines. Using this representation, we sample fixed-length bootleg score fragments from piano sheet music images on IMSLP. The dataset itself contains 40,000 62x64 bootleg score images for a 9-way classification task, 100,000 62x64 bootleg score images for a 100-way classification task, and 29,310 unlabeled variable-length bootleg score images for pretraining. The labeled data is presented in a form that mirrors MNIST images, in order to make it extremely easy to visualize, manipulate, and train models in an efficient manner. Additionally, we include relevant metadata to allow access to the underlying raw sheet music images and other related data on IMSLP. We describe several research tasks that could be studied with the dataset, including variations of composer style recognition in a few-shot or zero-shot setting. For tasks that have previously proposed models, we release code and baseline results for future works to compare against. We also discuss open research questions that the PBSCSR data is especially well suited to facilitate research on and areas of fruitful exploration in future work.
03 Aug 2016 | A Stochastic Temporal Model of Polyphonic MIDI Performance with Ornaments | ⬇️
Eita Nakamura, Nobutaka Ono, Shigeki Sagayama, Kenji Watanabe
We study indeterminacies in realization of ornaments and how they can be incorporated in a stochastic performance model applicable for music information processing such as score-performance matching. We point out the importance of temporal information, and propose a hidden Markov model which describes it explicitly and represents ornaments with several state types. Following a review of the indeterminacies, they are carefully incorporated into the model through its topology and parameters, and the state construction for quite general polyphonic scores is explained in detail. By analyzing piano performance data, we find significant overlaps in inter-onset-interval distributions of chordal notes, ornaments, and inter-chord events, and the data is used to determine details of the model. The model is applied for score following and offline score-performance matching, yielding highly accurate matching for performances with many ornaments and relatively frequent errors, repeats, and skips.
24 Dec 2023 | Combinatorial music generation model with song structure graph analysis | ⬇️
Seonghyeon Go and Kyogu Lee
In this work, we propose a symbolic music generation model with the song structure graph analysis network. We construct a graph that uses information such as note sequence and instrument as node features, while the correlation between note sequences acts as the edge feature. We trained a Graph Neural Network to obtain node representation in the graph, then we use node representation as input of Unet to generate CONLON pianoroll image latent. The outcomes of our experimental results show that the proposed model can generate a comprehensive form of music. Our approach represents a promising and innovative method for symbolic music generation and holds potential applications in various fields in Music Information Retrieval, including music composition, music classification, and music inpainting systems.
31 May 2022 | Towards Context-Aware Neural Performance-Score Synchronisation | ⬇️
Ruchit Agrawal
Music can be represented in multiple forms, such as in the audio form as a recording of a performance, in the symbolic form as a computer readable score, or in the image form as a scan of the sheet music. Music synchronisation provides a way to navigate among multiple representations of music in a unified manner by generating an accurate mapping between them, lending itself applicable to a myriad of domains like music education, performance analysis, automatic accompaniment and music editing. Traditional synchronisation methods compute alignment using knowledge-driven and stochastic approaches, typically employing handcrafted features. These methods are often unable to generalise well to different instruments, acoustic environments and recording conditions, and normally assume complete structural agreement between the performances and the scores. This PhD furthers the development of performance-score synchronisation research by proposing data-driven, context-aware alignment approaches, on three fronts: Firstly, I replace the handcrafted features by employing a metric learning based approach that is adaptable to different acoustic settings and performs well in data-scarce conditions. Secondly, I address the handling of structural differences between the performances and scores, which is a common limitation of standard alignment methods. Finally, I eschew the reliance on both feature engineering and dynamic programming, and propose a completely data-driven synchronisation method that computes alignments using a neural framework, whilst also being robust to structural differences between the performances and scores.
16 Jun 2021 | Listen to Your Favorite Melodies with img2Mxml, Producing MusicXML from Sheet Music Image by Measure-based Multimodal Deep Learning-driven Assembly | ⬇️
Tomoyuki Shishido, Fehmiju Fati, Daisuke Tokushige, and Yasuhiro Ono
Deep learning has recently been applied to optical music recognition (OMR). However, currently OMR processing from various sheet music images still lacks precision to be widely applicable. Here, we present an MMdA (Measure-based Multimodal deep learning (DL)-driven Assembly) method allowing for end-to-end OMR processing from various images including inclined photo images. Using this method, measures are extracted by a deep learning model, aligned, and resized to be used for inference of given musical symbol components by using multiple deep learning models in sequence or in parallel. Use of each standardized measure enables efficient training of the models and accurate adjustment of five staff lines in each measure. Multiple musical symbol component category models with a small number of feature types can represent a diverse set of notes and other musical symbols including chords. This MMdA method provides a solution to end-to-end OMR processing with precision.
29 Mar 2022 | Machine Composition of Korean Music via Topological Data Analysis and Artificial Neural Network | ⬇️
Mai Lan Tran and Dongjin Lee and Jae-Hun Jung
Common AI music composition algorithms based on artificial neural networks are to train a machine by feeding a large number of music pieces and create artificial neural networks that can produce music similar to the input music data. This approach is a blackbox optimization, that is, the underlying composition algorithm is, in general, not known to users. In this paper, we present a way of machine composition that trains a machine the composition principle embedded in the given music data instead of directly feeding music pieces. We propose this approach by using the concept of Overlap matrix proposed in [TPJ]. In [TPJ], a type of Korean music, so-called the Dodeuri music such as Suyeonjangjigok has been analyzed using topological data analysis (TDA), particularly using persistent homology. As the raw music data is not suitable for TDA analysis, the music data is first reconstructed as a graph. The node of the graph is defined as a two-dimensional vector composed of the pitch and duration of each music note. The edge between two nodes is created when those nodes appear consecutively in the music flow. Distance is defined based on the frequency of such appearances. Through TDA on the constructed graph, a unique set of cycles is found for the given music. In [TPJ], the new concept of the Overlap matrix has been proposed, which visualizes how those cycles are interconnected over the music flow, in a matrix form. In this paper, we explain how we use the Overlap matrix for machine composition. The Overlap matrix makes it possible to compose a new music piece algorithmically and also provide a seed music towards the desired artificial neural network. In this paper, we use the Dodeuri music and explain detailed steps.
27 Dec 2019 | Structural characterization of musical harmonies | ⬇️
Maria Rojo González and Simone Santini
Understanding the structural characteristics of harmony is essential for an effective use of music as a communication medium. Of the three expressive axes of music (melody, rhythm, harmony), harmony is the foundation on which the emotional content is built, and its understanding is important in areas such as multimedia and affective computing. The common tool for studying this kind of structure in computing science is the formal grammar but, in the case of music, grammars run into problems due to the ambiguous nature of some of the concepts defined in music theory. In this paper, we consider one of such constructs: modulation, that is, the change of key in the middle of a musical piece, an important tool used by many authors to enhance the capacity of music to express emotions. We develop a hybrid method in which an evidence-gathering numerical method detects modulation and then, based on the detected tonalities, a non-ambiguous grammar can be used for analyzing the structure of each tonal component. Experiments with music from the XVII and XVIII centuries show that we can detect the precise point of modulation with an error of at most two chords in almost 97% of the cases. Finally, we show examples of complete modulation and structural analysis of musical harmonies.
23 Jun 2020 | Audeo: Audio Generation for a Silent Performance Video | ⬇️
Kun Su, Xiulong Liu, Eli Shlizerman
We present a novel system that gets as an input video frames of a musician playing the piano and generates the music for that video. Generation of music from visual cues is a challenging problem and it is not clear whether it is an attainable goal at all. Our main aim in this work is to explore the plausibility of such a transformation and to identify cues and components able to carry the association of sounds with visual events. To achieve the transformation we built a full pipeline named "Audeo" containing three components. We first translate the video frames of the keyboard and the musician hand movements into raw mechanical musical symbolic representation Piano-Roll (Roll) for each video frame which represents the keys pressed at each time step. We then adapt the Roll to be amenable for audio synthesis by including temporal correlations. This step turns out to be critical for meaningful audio generation. As a last step, we implement Midi synthesizers to generate realistic music. Audeo converts video to audio smoothly and clearly with only a few setup constraints. We evaluate Audeo on "in the wild" piano performance videos and obtain that their generated music is of reasonable audio quality and can be successfully recognized with high precision by popular music identification software.
20 Mar 2020 | Exploring Inherent Properties of the Monophonic Melody of Songs | ⬇️
Zehao Wang, Shicheng Zhang, Xiaoou Chen
Melody is one of the most important components in music. Unlike other components in music theory, such as harmony and counterpoint, computable features for melody is urgently in need. These features are highly demanded as data-driven methods dominating the fields such as musical information retrieval and automatic music composition. To boost the performance of deep-learning-related musical tasks, we propose a set of interpretable features on monophonic melody for computational purposes. These features are defined not only in mathematical form, but also with some considerations on composers' intuition. For example, the Melodic Center of Gravity can reflect the sentence-wise contour of the melody, the local / global melody dynamics quantifies the dynamics of a melody that couples pitch and time in a sentence. We found that these features are considered by people universally in many genres of songs, even for atonal composition practices. Hopefully, these melodic features can provide novel inspiration for future researchers as a tool in the field of MIR and automatic composition.
Date: 01 Dec 2021
Title: Score Transformer: Generating Musical Score from Note-level Representation
Abstract Link: https://arxiv.org/abs/2112.00355
PDF Link: https://arxiv.org/pdf/2112.00355
Date: 07 Aug 2019
Title: Deep Learning Techniques for Music Generation -- A Survey
Abstract Link: https://arxiv.org/abs/1709.01620
PDF Link: https://arxiv.org/pdf/1709.01620
Date: 15 Dec 2023
Title: N-Gram Unsupervised Compoundation and Feature Injection for Better Symbolic Music Understanding
Abstract Link: https://arxiv.org/abs/2312.08931
PDF Link: https://arxiv.org/pdf/2312.08931
Date: 27 Jul 2023
Title: Graph-based Polyphonic Multitrack Music Generation
Abstract Link: https://arxiv.org/abs/2307.14928
PDF Link: https://arxiv.org/pdf/2307.14928
Date: 24 Jun 2019
Title: A Convolutional Approach to Melody Line Identification in Symbolic Scores
Abstract Link: https://arxiv.org/abs/1906.10547
PDF Link: https://arxiv.org/pdf/1906.10547
Date: 29 Jul 2020
Title: Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining
Abstract Link: https://arxiv.org/abs/2007.14587
PDF Link: https://arxiv.org/pdf/2007.14587
Date: 07 Jan 2021
Title: Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs
Abstract Link: https://arxiv.org/abs/2101.02402
PDF Link: https://arxiv.org/pdf/2101.02402
Date: 22 Feb 2024
Title: Structuring Concept Space with the Musical Circle of Fifths by Utilizing Music Grammar Based Activations
Abstract Link: https://arxiv.org/abs/2403.00790
PDF Link: https://arxiv.org/pdf/2403.00790
Date: 04 Feb 2020
Title: Learning the helix topology of musical pitch
Abstract Link: https://arxiv.org/abs/1910.10246
PDF Link: https://arxiv.org/pdf/1910.10246
Date: 25 Nov 2021
Title: A-Muze-Net: Music Generation by Composing the Harmony based on the Generated Melody
Abstract Link: https://arxiv.org/abs/2111.12986
PDF Link: https://arxiv.org/pdf/2111.12986
Date: 20 Apr 2020
Title: Music Gesture for Visual Sound Separation
Abstract Link: https://arxiv.org/abs/2004.09476
PDF Link: https://arxiv.org/pdf/2004.09476
Date: 07 Feb 2024
Title: PBSCSR: The Piano Bootleg Score Composer Style Recognition Dataset
Abstract Link: https://arxiv.org/abs/2401.16803
PDF Link: https://arxiv.org/pdf/2401.16803
Date: 03 Aug 2016
Title: A Stochastic Temporal Model of Polyphonic MIDI Performance with Ornaments
Abstract Link: https://arxiv.org/abs/1404.2314
PDF Link: https://arxiv.org/pdf/1404.2314
Date: 24 Dec 2023
Title: Combinatorial music generation model with song structure graph analysis
Abstract Link: https://arxiv.org/abs/2312.15400
PDF Link: https://arxiv.org/pdf/2312.15400
Date: 31 May 2022
Title: Towards Context-Aware Neural Performance-Score Synchronisation
Abstract Link: https://arxiv.org/abs/2206.00454
PDF Link: https://arxiv.org/pdf/2206.00454
Date: 16 Jun 2021
Title: Listen to Your Favorite Melodies with img2Mxml, Producing MusicXML from Sheet Music Image by Measure-based Multimodal Deep Learning-driven Assembly
Abstract Link: https://arxiv.org/abs/2106.12037
PDF Link: https://arxiv.org/pdf/2106.12037
Date: 29 Mar 2022
Title: Machine Composition of Korean Music via Topological Data Analysis and Artificial Neural Network
Abstract Link: https://arxiv.org/abs/2203.15468
PDF Link: https://arxiv.org/pdf/2203.15468
Date: 27 Dec 2019
Title: Structural characterization of musical harmonies
Abstract Link: https://arxiv.org/abs/1912.12362
PDF Link: https://arxiv.org/pdf/1912.12362
Date: 23 Jun 2020
Title: Audeo: Audio Generation for a Silent Performance Video
Abstract Link: https://arxiv.org/abs/2006.14348
PDF Link: https://arxiv.org/pdf/2006.14348
Date: 20 Mar 2020
Title: Exploring Inherent Properties of the Monophonic Melody of Songs
Abstract Link: https://arxiv.org/abs/2003.09287
PDF Link: https://arxiv.org/pdf/2003.09287
🔍Run of Multi-Agent System Paper Summary Spec is Complete
Start time: 2024-05-15 13:56:45
Finish time: 2024-05-15 13:57:09
Elapsed time: 24.00 seconds