opmo
Posted January 5

Opusmodus 3.0.29517

With the new SPECTRAL-ANALYSIS function, you can analyse audio files directly in Opusmodus. The SPEAR application is no longer required to generate spectral data.

SPECTRAL-ANALYSIS

spectral-analysis filename &key start end fft-size window-size hop-size frame-interval min-peak-diff min-amp-db under-peak-db window min-freq max-freq normalize sample-rate

Arguments and Values:

filename: String. Path to the audio file to be analysed. Supported formats are AIFF and WAV.

start (optional): Float. Start time (in seconds) of the audio segment to analyse. If not specified, analysis starts from the beginning of the audio file. Default is NIL.

end (optional): Float. End time (in seconds) of the audio segment to analyse. If not specified, analysis continues to the end of the audio file. Default is NIL.

fft-size (optional): Integer. Number of points for the Fast Fourier Transform (FFT). Determines the frequency resolution of the analysis. A larger FFT size provides higher frequency resolution but increases computational load. Default is 16384.

window-size (optional): Integer. Size of the window applied to each frame before performing the FFT. If not specified, defaults to one-fourth of the fft-size.

hop-size (optional): Integer. Number of samples between the starts of consecutive frames. Determines the overlap between frames. A smaller hop size increases temporal resolution but also computational load. Default is (floor (* sample-rate frame-interval)).

frame-interval (optional): Float. Interval between frames in seconds. Determines how frequently frames are extracted from the audio signal. Default is 0.01.

min-peak-diff (optional): Float. Minimum difference between adjacent peaks in the magnitude spectrum for a peak to be considered valid. Helps in peak detection by filtering out insignificant variations. Default is 0.01.

min-amp-db (optional): Float.
Minimum amplitude threshold in decibels (dB) for a frequency component to be considered a partial. Components below this threshold are ignored. Default is -90.

under-peak-db (optional): Float. Under-peak amplitude threshold in decibels (dB) for peak detection. Used to filter out less prominent peaks. Default is -60.

window (optional): Function. Type of window function applied to each frame, given as a keyword. Default is :hanning. Supported window functions: :rectangular, :blackman-harris, :bartlett, :welch, :parzen, :exponential, :riemann, :cauchy, :tukey, :bohman, :gaussian, :connes, :hann-poisson, :kaiser, :bartlett-hann, :blackman-nuttall, :flat-top, :nuttall, :planck-taper, :cosine, :triangular, :blackman, :hamming, :hanning and :ultraspherical.

min-freq (optional): Float. Minimum frequency in Hertz (Hz) to consider during peak detection. Helps in focusing the analysis on the desired frequency range. Default is 8.1758.

max-freq (optional): Float. Maximum frequency in Hertz (Hz) to consider during peak detection. Helps in focusing the analysis on the desired frequency range. Default is 12543.855.

normalize (optional): Boolean. If T (true), the amplitude of each frame’s partials will be normalised to have a maximum amplitude of 1.0. If NIL, no normalisation is performed. Default is T.

sample-rate (optional): Integer. Optional override of the audio file’s sample rate. If provided, it replaces the file’s actual sample rate for analysis purposes. Default is NIL.

Description:

The SPECTRAL-ANALYSIS function performs a spectral analysis on audio files in AIFF or WAV formats. It processes the audio data to extract frames of partials (frequency components) using Fast Fourier Transform (FFT), windowing, and peak detection. The function is versatile, allowing customisation through various parameters to suit different analytical needs.
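As a rough illustration of how the size parameters above relate to each other, here is a small sketch in plain Common Lisp. It is not Opusmodus code: the helper names default-window-size, default-hop-size, bin-width and midi-to-hz are hypothetical and exist only for this example. The first two follow the defaults stated above. The bin spacing (sample rate divided by FFT size) is a general property of the FFT. The last helper reflects an observation rather than anything stated in the documentation: the default min-freq and max-freq values coincide with MIDI notes 0 and 127 in equal temperament, assuming A4 = 440 Hz.

```lisp
;; Hypothetical helpers in plain Common Lisp; not Opusmodus functions.

(defun default-window-size (fft-size)
  "Window size when none is given: one-fourth of FFT-SIZE."
  (floor fft-size 4))

(defun default-hop-size (sample-rate frame-interval)
  "Hop size in samples, as documented: (floor (* sample-rate frame-interval))."
  (floor (* sample-rate frame-interval)))

(defun bin-width (sample-rate fft-size)
  "Spacing in Hz between adjacent FFT bins."
  (/ sample-rate fft-size 1.0))

(defun midi-to-hz (note)
  "Equal-tempered frequency of MIDI NOTE, assuming A4 (note 69) = 440 Hz."
  (* 440.0d0 (expt 2.0d0 (/ (- note 69) 12.0d0))))

(default-window-size 16384)    ; => 4096
(default-window-size 32768)    ; => 8192
(default-hop-size 44100 0.05)  ; => 2205
(bin-width 44100 16384)        ; roughly 2.69 Hz per bin
(midi-to-hz 0)                 ; roughly 8.1758 Hz
(midi-to-hz 127)               ; roughly 12543.85 Hz
```

Doubling fft-size halves the bin spacing, which is why a larger FFT separates close partials more clearly, at the cost of more computation per frame.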
The function returns a list of frames, where each frame is a list of partials. Each partial is represented as a pair (frequency amplitude), where:

frequency: The frequency of the partial in Hertz (Hz).
amplitude: The normalised amplitude of the partial, typically ranging between 0.0 and 1.0.

Additionally, the :start and :end parameters allow users to focus the analysis on a specific portion of the audio file. Only the specified range is extracted and processed, so the function can handle large audio files efficiently or concentrate on sections of interest.

Opusmodus is configured to use the ~/Opusmodus/Media/Audio/ directory as its default location for storing and accessing audio files. If the audio file you intend to analyse is not located within the ~/Opusmodus/Media/Audio/ directory, you must provide the complete file path when invoking analysis functions.

Examples:

Analysing an AIFF file with default parameters:

(spectral-analysis "cello.aiff")

AIFF File: Samples: 195378, SR: 44100, Channels: 1, Bit Depth: 16
Hop Size: 427, Window Size: 4096, Window Function: hanning
Audio Duration: 4.4303403, Specified Duration: 4.4303403
Computed Duration: 4.3456, Segment Duration: 4.4303403
Frame Count: 448
=> ((61.90525 0.437551 88.81198 0.742432 172.25395 1.0 . . .)
    (10.708511 0.788477 35.021386 0.264489 56.52694 0.442652 . . .)
    . . .)

Analysing an AIFF file with a larger FFT size for higher frequency resolution and a longer frame interval:

(spectral-analysis "cello.aiff" :fft-size 32768 :frame-interval 0.05)

AIFF File: Samples: 195378, SR: 44100, Channels: 1, Bit Depth: 16
Hop Size: 2205, Window Size: 8192, Window Function: hanning
Audio Duration: 4.4303403, Specified Duration: 4.4303403
Computed Duration: 4.25, Segment Duration: 4.4303403
Frame Count: 85
=> ((16.129637 0.443842 41.70884 0.129888 60.54874 0.343018 . . .)
    (45.74839 0.246823 61.901658 0.537477 105.00895 0.105809 . . .)
    . . .)
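In the printed results above, each frame appears as a flat list in which frequency and amplitude values alternate. Assuming that layout (an inference from the console output, not a documented guarantee), the following plain Common Lisp sketch groups a frame into explicit (frequency amplitude) pairs and picks out the strongest partial. The names frame-pairs, loudest-partial and *frame* are hypothetical and are not part of Opusmodus.

```lisp
;; Hypothetical helpers in plain Common Lisp; not Opusmodus functions.
;; Assumes a frame is a flat list: (freq1 amp1 freq2 amp2 ...).

(defun frame-pairs (frame)
  "Group a flat FRAME into a list of (frequency amplitude) pairs."
  (loop for (freq amp) on frame by #'cddr
        collect (list freq amp)))

(defun loudest-partial (frame)
  "Return the (frequency amplitude) pair with the highest amplitude in FRAME."
  (let ((best nil))
    (dolist (pair (frame-pairs frame) best)
      (when (or (null best) (> (second pair) (second best)))
        (setf best pair)))))

;; First three partials of the first frame printed in the default example.
(defparameter *frame*
  '(61.90525 0.437551 88.81198 0.742432 172.25395 1.0))

(frame-pairs *frame*)
;; => ((61.90525 0.437551) (88.81198 0.742432) (172.25395 1.0))

(loudest-partial *frame*)
;; => (172.25395 1.0)
```

With normalisation enabled (the default), the strongest partial of a frame has amplitude 1.0, which matches the 172.25395 Hz partial in this excerpt.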
Analysing an AIFF file focusing on a specific frequency range and using a Blackman window:

(spectral-analysis "cello.aiff" :min-freq 100.0 :max-freq 8000.0 :window :blackman)

AIFF File: Samples: 195378, SR: 44100, Channels: 1, Bit Depth: 16
Hop Size: 427, Window Size: 4096, Window Function: blackman
Audio Duration: 4.4303403, Specified Duration: 4.4303403
Computed Duration: 4.3456, Segment Duration: 4.4303403
Frame Count: 448
=> ((102.17515 1.0 115.712296 0.223771 134.71875 0.2579 . . .)
    (118.50911 0.355596 128.92746 0.292709 139.72049 0.325564 . . .)
    . . .)

A larger FFT size increases frequency resolution, allowing finer distinction between close frequency components:

(spectral-analysis "cello.aiff" :fft-size 65536)

AIFF File: Samples: 195378, SR: 44100, Channels: 1, Bit Depth: 16
Hop Size: 427, Window Size: 16384, Window Function: hanning
Audio Duration: 4.4303403, Specified Duration: 4.4303403
Computed Duration: 4.0740004, Segment Duration: 4.4303403
Frame Count: 420
=> ((28.935587 0.064061 43.740155 0.028993 61.90408 0.100941 . . .)
    (28.935658 0.058497 35.65732 0.024704 61.905793 0.09388 . . .)
    . . .)

Higher fft-size: Increases frequency resolution, allowing for finer analysis of frequency components. However, it also increases computational time and memory usage.
Lower fft-size: Decreases frequency resolution but speeds up processing and reduces memory consumption.
Recommendation: Choose a size that balances resolution needs with available computational resources. Common sizes are powers of two (e.g., 4096, 8192, 16384).

Analysing a Specific Segment:

(spectral-analysis "cello.aiff" :start 1.5 :end 2.5)

AIFF File: Samples: 195378, SR: 44100, Channels: 1, Bit Depth: 16
Hop Size: 427, Window Size: 4096, Window Function: hanning
Audio Duration: 4.4303403, Specified Duration: 1.0
Computed Duration: 0.9118, Segment Duration: 1.0
Frame Count: 94
=> ((56.571057 0.078394 67.283196 0.023841 118.45378 0.080271 . . .)
    (67.20516 0.117893 118.45953 0.071553 129.13939 0.022749 . . .))

Frames will include frequency and amplitude data for this 1-second segment. Both Segment Duration and Computed Duration will reflect this time window.

(spectral-analysis "cello.aiff" :start 0.5 :end 0.6)

AIFF File: Samples: 195378, SR: 44100, Channels: 1, Bit Depth: 16
Hop Size: 427, Window Size: 4096, Window Function: hanning
Audio Duration: 4.4303403, Specified Duration: 0.100000024
Computed Duration: 0.0097, Segment Duration: 0.1
Frame Count: 1
=> ((91.54743 1.0 118.52162 0.016846 128.9997 0.012361 . . .))

A short 0.1-second segment will be analysed. Only one frame will be generated due to the short duration.

Creating and Utilising a Spectral Analysis Library in Opusmodus

Creating a spectral analysis library is a best practice that streamlines your workflow by reducing computational time and memory usage for future analyses. By storing precomputed spectral data, you can quickly access and manipulate this information without reprocessing the original audio files each time.

Below is a step-by-step guide illustrating how to perform spectral analysis on an audio file, create a library from the analysis results, retrieve data from the library, convert partials to pitch values, assemble an OMN form sequence, and display the final result in standard musical notation.

1. Performing Spectral Analysis on an Audio File

Begin by conducting spectral analysis on your target audio file (e.g., "cello.aiff"). This process decomposes the audio signal into its constituent frequency components (partials), which are essential for detailed musical analysis and processing.

(setf spectral (spectral-analysis "cello.aiff"))

2. Creating a Spectral Library

After obtaining the spectral data, it’s advisable to create a library to store and organise this information systematically. This library serves as a repository, enabling efficient retrieval and management of spectral analyses for various audio files.
The newly created library is automatically loaded and stored in the ~/Opusmodus/User Source/Libraries/Def-Library directory.

(create-library 'cello-sa 'partials 'p spectral :file "cello-sa")

3. Retrieving Spectral Analysis Data from the Library

To utilise the stored spectral data, you need to retrieve it from the library. This allows you to access specific frames or partials for further processing or analysis.

(setf partials (library 'cello-sa 'partials nil :collect :all))

SPECTRAL-TO-OMN

spectral-to-omn partials &key min-amp max-amp min-freq max-freq crf normalize quantize resolution ambitus min-frame-size min-vel max-vel

Arguments and Values:

partials: A list of spectral data. The function supports two-point data: frequency and amplitude.

min-amp (optional): Float. Specifies the minimum amplitude value to be included in the output. Partials with amplitudes below this threshold will be ignored. Default is NIL (no threshold).

max-amp (optional): Float. Specifies the maximum amplitude value to be included in the output. Partials with amplitudes above this threshold will be ignored. Default is NIL (no threshold).

min-freq (optional): Float. Specifies the minimum frequency in Hertz (Hz) to include in the output. Frequencies below this value will be filtered out. Default is NIL.

max-freq (optional): Float. Specifies the maximum frequency in Hertz (Hz) to include in the output. Frequencies above this value will be filtered out. Default is NIL.

crf (optional): Integer. Specifies the maximum number of consecutive repetitions for any given partial in the sequence. If set to 1 (default), consecutive repetitions of spectral partials are removed, ensuring uniqueness in the output sequence. Increasing this value allows more repetitions before filtering.

normalize (optional): Boolean. Specifies whether to normalise the amplitude of each frame’s partials to a maximum of 1.0. If T (default), partials are normalised; if NIL, amplitudes are returned as-is.

quantize (optional): Ratio.
Specifies the frequency quantization level. Options include NIL (no quantization), 1/2 (semitone), 1/4 (quarter tone) and 1/8 (eighth tone). Default is NIL (no quantization).

resolution (optional): Length symbol or ratio. Specifies the temporal resolution for the OMN notation. Examples: 's (sixteenth note), 'e (eighth note), or custom values. Default is 's.

ambitus (optional): Specifies the pitch range for filtering and/or transposing the output. Can be an instrument name (e.g., 'piano), an integer range, or a list of two pitches (low high). Pitches outside this range will be transposed to fit within the range. Default is 'piano.

min-frame-size (optional): Integer. Specifies the minimum number of partials required in a frame for it to be included in the output. Frames with fewer partials are replaced by a rest or ignored. Default is NIL.

min-vel (optional): Velocity symbol. Specifies the minimum velocity level for the output in OMN notation. Default is 'ppp (pianississimo).

max-vel (optional): Velocity symbol. Specifies the maximum velocity level for the output in OMN notation. Default is 'fff (fortississimo).

Description:

Spectral music, with its emphasis on the acoustic properties of sound and its focus on the frequency spectrum, resonates deeply with the capabilities of the SPECTRAL-TO-OMN function. In spectral music, composers seek to explore the harmonic and timbral richness inherent in the sound spectrum, often using sophisticated analysis techniques to extract frequencies, amplitudes, and temporal structures. The SPECTRAL-TO-OMN function aligns with this compositional philosophy, as it transforms spectral data into Opusmodus Notation (OMN), enabling composers to directly integrate spectral material into their musical works.

By processing spectral data into OMN (length pitch velocity) events, the SPECTRAL-TO-OMN function provides a practical and creative bridge between sound analysis and composition.
This integration is achieved through its key features:

Temporal Mapping: The resolution parameter determines the rhythmic resolution for the output. When combined with a specified tempo in the score, the temporal mapping can faithfully reproduce the original timing of the spectral data. This allows all the partials to occur at their precise times, reflecting the temporal evolution of the sound. Frames that do not meet the amplitude (min-amp) or size (min-frame-size) thresholds are replaced by rests, ensuring the rhythmic structure reflects the energy and significance of the spectral data.

Frequency to Pitch Conversion: Spectral frequencies extracted from audio are quantised (e.g., to semitones) and converted to pitches. The ambitus parameter ensures all pitches fit within a specified range, such as an instrument’s playable range, making the output musically viable.

Dynamic Control via Mean Velocity: The amplitude values of partials within a frame are summed and averaged (i.e., divided by the number of partials in the frame) to calculate a mean value. This mean amplitude is mapped to a velocity range (min-vel to max-vel) using the VECTOR-TO-VELOCITY function (internally). The resulting velocity dynamically reflects the energy of the chord or event.

Filtering and Normalisation: Filters are applied to include only partials within specified amplitude and frequency ranges (min-amp, max-amp, min-freq, max-freq). Amplitudes can be optionally normalised (normalize), ensuring consistent dynamic control across the output.

Through these features, the SPECTRAL-TO-OMN function enables composers to translate spectral characteristics (harmonics, overtones, and even noise components) into musical notation, creating scores that embody the essence of spectral music. For instance, analysing a bell sound using this function can yield clusters of pitches and dynamics that reflect the bell’s rich overtone structure.
These clusters can then be orchestrated, manipulated, or layered to produce compositions inspired by spectral principles but directly derived from real-world sounds.

The SPECTRAL-TO-OMN function’s ability to integrate the analytical and creative aspects of spectral music expands the possibilities for transforming sound into music.

Spectral Analysis:

1. Performing Spectral Analysis on an Audio File

(setf spectral (spectral-analysis "marangona.aiff" :start 0.0 :end 1.0))

WAV File: Samples: 2158420, SR: 44100, Channels: 2, Bit Depth: 16
Hop Size: 427, Window Size: 4096, Window Function: hanning
Audio Duration: 24.471882, Specified Duration: 1.0
Computed Duration: 0.9118, Segment Duration: 1.0
Frame Count: 94

2. Creating a Spectral Library

After obtaining the spectral data, it’s advisable to create a library to store and organise this information systematically. The newly created library is automatically loaded and stored in the ~/Opusmodus/User Source/Libraries/Def-Library directory.

(create-library 'marangona-1.0 'partials 'p spectral :file "marangona-1.0")

3. Retrieving Spectral Analysis Data from the Library

To utilise the stored spectral data, you need to retrieve it from the library. This allows you to access specific frames or partials for further processing or analysis.

(setf pframes (library 'marangona-1.0 'partials nil :collect :all))

Examples:

This example processes the 1 second of spectral data stored in the "marangona-1.0" library. Only partials with an amplitude of 0.0081 or higher are included. Frequencies are quantised to the nearest semitone (1/2), and the output is represented in OMN format.

(spectral-to-omn pframes :min-amp 0.0081 :quantize 1/2)

Applying Minimum Frame Size:

Frames with fewer than 3 partials are replaced with rests. The output includes only frames that meet the size and amplitude thresholds.
(spectral-to-omn pframes :min-frame-size 3 :min-amp 0.0081 :quantize 1/2)

Filtering by Frequency Range:

This example focuses on a specific frequency range (10 Hz to 5000 Hz). Frames with fewer than 3 partials are replaced by rests. Pitches are quantised to semitones and converted to OMN.

(spectral-to-omn pframes :min-freq 10.0 :max-freq 5000.0 :min-frame-size 3 :min-amp 0.0081 :quantize 1/2)

Applying a Custom Ambitus:

(setf omn (spectral-to-omn pframes :min-frame-size 3 :min-amp 0.0081 :quantize 1/2 :ambitus '(c2 c6)))

Pitches are transposed to fit within the range of c2 to c6 (:ambitus '(c2 c6)). Frequencies are quantised to semitones and converted to OMN.

(ps 'gm :pg (list omn) :tempo 112 :time-signature '(1 4))

Best wishes and Happy New Year,
Janusz