2023.12.28

sshlien
2023-12-28 14:57:49 -05:00
parent 9fa917b1bd
commit b3d18d9722
6 changed files with 331 additions and 121 deletions

@@ -1,2 +1,2 @@
December 23 2023 December 26 2023

@@ -15182,6 +15182,28 @@ case statements were removed since they are unnecessary.
December 28 2023
abc2midi: tuplet bug
The following example produces the error
Warning in line-char 7-8 : Different length notes in tuple
X:1
T:Test
L:1/4
Q:1/4=90
M:3/4
K:D
(3[ac']/d'/[ac']/ [ac']/z/ |
Analysis: though it is legal to have different length notes (and
rests) in a tuplet, the warning is clearly a bug here. The message occurs in the
function event_note() in store.c. tnote_num and tnote_denom should
contain the expected length of the note in the tuplet based on the
first note encountered in the tuplet. The value of tnote_denom was
not adjusted by event_chordoff to compensate for the length value
specified at the end of the [ac'] chord, resulting in the problem.

@@ -1,107 +0,0 @@
Advanced Percussion Analysis
in the Midistats Program
This is an addendum to the midistats.1 file.
The MIDI file devotes channel 9 to the percussion instruments
and over 60 percussion instruments are defined in the MIDI
standard. Though there is a lot of diversity in the percussion
track, for most MIDI files only the first 10 or so percussion
instruments are important in defining the character of the track. The
program Midiexplorer has various tools for exposing the percussion
channel which are described in the documentation. The goal
here is to find the essential characteristics of the percussion
track which distinguishes the MIDI files. This is attempted
in the program midistats. Here is a short description.
-corestats
Produces a line with 5 numbers separated by tabs. eg
1 8 384 4057 375
It returns the number of tracks, the number of channels, the
number of divisions per quarter note beat (ppqn),
the number of note onsets in the midi file, and the maximum
number of quarter note beats in midi file.
-pulseanalysis
Counts the number of note onsets as a function of their onset time
relative to the beat, grouping them into 12 intervals, and returns
the result as a discrete probability distribution. Generally,
the distribution consists of a couple of peaks corresponding
to quarter notes or eighth notes. If the distribution is flat,
it indicates that the times of the note occurrences have not been
quantized into beats and fractions. Here is a sample output.
0.3496,0.0000,0.0000,0.1602,0.0000,0.0002,0.2983,0.0000,0.0000,0.1914,0.0002,0.0001
-panal
Counts the number of note onsets for each percussion instrument. The first
number is the code (pitch) of the instrument, the second number is the
number of occurrences. eg.
35 337 37 16 38 432 39 208 40 231 42 1088 46 384 49 42 54 1104 57 5 70 1040 85 16
-ppatfor n
where n is the code number of the percussion instrument. Each beat
is represented by a 4 bit number where the position of the on-bit
indicates the time in the beat when the drum onset occurs. The bits
are ordered from left to right (higher order bits to lower order
bits). This is the order of bits that you would expect in a
time series.
Thus 0 indicates that there was no note onset in that beat, 1 indicates
a note onset at the end of the beat, 4 indicates a note onset
in the middle of the beat, etc. The function returns a string
of numbers ranging from 0 to 15 indicating the presence of note onsets
for the selected percussion instrument for the sequence of beats
in the midi file. Here is a truncated sample of the output.
0 0 0 0 0 0 0 0 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 4 4 0
1 0 0 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 0 0
1 0 5 0 1 0 5 0 1 etc.
One can see a repeating 4 beat pattern.
-ppat
midistats attempts to find two percussion instruments in the midi file
which come closest to acting as the bass drum and snare drum.
If it is unsuccessful, it reports the failure. Otherwise, it
encodes the position of these drum onsets in an 8-bit byte for each
quarter note beat in the midi file. The lower (right) 4 bits encode the
bass drum and the higher (left) 4 bits encode the snare drum in the
same manner as described above for -ppatfor.
0 0 0 0 0 0 0 0 0 0 33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145
33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145
33 145 33 145 33 145 33 145 33 145 33 etc.
-ppathist
Computes and displays the histogram of the values that would appear
when running -ppat. eg.
bass 35 337
snare 38 432
1 (0.1) 64 32 (2.0) 8 33 (2.1) 136 144 (9.0) 8 145 (9.1) 136
The bass percussion code, the number of onsets, and the snare
percussion code and the number of onsets are given in the
first two lines. In the next line the number of occurrences of
each value in the -ppat listing is given. The number in parentheses
splits the two 4-bit values with a period. Thus 33 = (2*16 + 1).
-nseqfor -n
Note sequence for channel n. This option produces a string of bytes
indicating the presence of a note in a time unit corresponding to
an eighth note. Thus each quarter note beat is represented by two
bytes. The pitch class is represented by the line number on the
staff, where 0 is C. Thus the notes of a scale are represented
by 7 numbers, and sharps and flats are ignored. The line number is
then converted to a bit position in the byte, so that the pitch
classes are represented by the numbers 1, 2, 4, 8, etc. A chord
consisting of two note onsets would set two of the corresponding
bits. If we were to represent the full chromatic scale consisting
of 12 pitches, then we would require two-byte integers or
twice as much memory.
Though the pitch resolution is not sufficient to distinguish
major or minor chords, it should be sufficient to identify some
repeating patterns.

@@ -1,4 +1,4 @@
.TH MIDISTATS 1 "17 November 2023" .TH MIDISTATS 1 "27 December 2023"
.SH NAME .SH NAME
\fBmidistats\fP \- program to summarize the statistical properties of a midi file \fBmidistats\fP \- program to summarize the statistical properties of a midi file
.SH SYNOPSIS .SH SYNOPSIS
@@ -55,6 +55,8 @@ file.
.PP .PP
pitchbends specifies the total number of pitchbends in this file. pitchbends specifies the total number of pitchbends in this file.
.PP .PP
pitchbendin c n specifies the number of pitchbends n in channel c
.PP
progs is a list of all the midi programs addressed progs is a list of all the midi programs addressed
.PP .PP
progsact the amount of activity for each of the above midi programs. progsact the amount of activity for each of the above midi programs.
@@ -74,9 +76,20 @@ instruments.
pitches is a histogram for the 12 pitch classes (C, C#, D ...B) pitches is a histogram for the 12 pitch classes (C, C#, D ...B)
that occur in the midi file. that occur in the midi file.
.PP .PP
key indicates the key of the music, the number of sharps (positive) or
flats (negative) in the key signature, and a measure of the confidence
in this key signature. The key was estimated from the above pitch histogram.
A confidence level below 0.4 indicates that the pitch histogram does
not follow the histogram of a major or minor scale. (It may be the
result of a mixture of two key signatures.)
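
A script that collects these summaries may want to read the key line back and apply the 0.4 threshold mentioned above. The following is a minimal sketch, assuming the line has the form produced by the printf in keymatch() ("key", a key name such as Dmaj, the number of sharps or flats, and the confidence). The function names parse_key_line and key_is_reliable are hypothetical and are not part of midistats.

```c
#include <stdio.h>

/* Parses a line such as "key Dmaj 2 0.712345".
   name must have room for 16 characters.
   Returns 1 on success and 0 if the line does not match. */
int parse_key_line(const char *line, char name[16], int *sf, float *confidence)
{
    return sscanf(line, "key %15s %d %f", name, sf, confidence) == 3;
}

/* Applies the threshold from the man page: a confidence below 0.4
   suggests the histogram does not follow a major or minor scale. */
int key_is_reliable(float confidence)
{
    return confidence >= 0.4f;
}
```

A caller would typically skip files for which key_is_reliable() returns 0, or flag them for manual inspection.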
.PP
pitchact is a similar histogram but is weighted by the length of pitchact is a similar histogram but is weighted by the length of
the notes. the notes.
.PP .PP
chanvol indicates the value of the control volume commands in the
midi file for each of the 16 channels. The maximum value is 127.
The loudness (velocity) of the notes in a channel is scaled by this value.
.PP
chnact returns the amount of note activity in each channel. chnact returns the amount of note activity in each channel.
.PP .PP
trkact returns the number of notes in each track. trkact returns the number of notes in each track.
@@ -87,21 +100,172 @@ all channels except the percussion channel.
collisions. Midistats counts the bar rhythm patterns using a hashing collisions. Midistats counts the bar rhythm patterns using a hashing
function. Presently collisions are ignored so occasionally two function. Presently collisions are ignored so occasionally two
distinct rhythm patterns are counted as one. distinct rhythm patterns are counted as one.
.SH Advance Percussion Analysis Tools
.PP .PP
In addition, midistats may return other codes that describe
other characteristics. They include
.br
unquantized - the note onsets are not quantized
.br
triplets - 3 notes played in the time of 2 notes are present
.br
qnotes - the rhythm is basically simple
.br
clean_quantization - the note onsets are quantized into 1/4, 1/8, 1/16 time units.
.br
dithered_quantization - small variations in the quantized note onsets.
.br
Lyrics - lyrics are present in the meta data
.br
programcmd - there may be multiple program changes in a midi channel
.SH Advanced Percussion Analysis Tools
.PP
The MIDI file devotes channel 9 to the percussion instruments
and over 60 percussion instruments are defined in the MIDI
standard. Though there is a lot of diversity in the percussion
track, for most MIDI files only the first 10 or so percussion
instruments are important in defining the character of the track. The
program Midiexplorer has various tools for exposing the percussion
channel which are described in the documentation. The goal
here is to find the essential characteristics of the percussion
track which distinguishes the MIDI files. This is attempted
in the program midistats. Here is a short description.
.br
A number of experimental tools for analyzing the percussion channel A number of experimental tools for analyzing the percussion channel
(track) were introduced into midistats and are accessible through (track) were introduced into midistats and are accessible through
the runtime arguments. When these tools are used in a script which the runtime arguments. When these tools are used in a script which
runs through a collection of midi files, you can build a database runs through a collection of midi files, you can build a database
of percussion descriptors. Some more details are given in the of percussion descriptors.
file drums.txt which comes with this documentation.
.SH OPTIONS .SH OPTIONS
.TP .PP
.B -corestats -corestats
.TP .br
.B -pulseanalysis outputs a line with 5 numbers separated by tabs. eg
.TP .br
1 8 384 4057 375
.br
It returns the number of tracks, the number of channels, the
number of divisions per quarter note beat (ppqn),
the number of note onsets in the midi file, and the maximum
number of quarter note beats in midi file.
.PP
-pulseanalysis
.br
Counts the number of note onsets as a function of their onset time
relative to the beat, grouping them into 12 intervals, and returns
the result as a discrete probability distribution. Generally,
the distribution consists of a couple of peaks corresponding
to quarter notes or eighth notes. If the distribution is flat,
it indicates that the times of the note occurrences have not been
quantized into beats and fractions. Here is a sample output.
.br
0.349,0.000,0.000,0.160,0.000,0.000,0.298,0.000,0.000,0.191,0.000,0.000
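
The binning described for -pulseanalysis can be sketched as follows. This is an illustration under stated assumptions, not the midistats code: onset times are taken to be MIDI ticks, ppqn is the number of ticks per quarter note, and the name pulse_distribution is hypothetical.

```c
#define PULSE_BINS 12

/* Fills dist[] with the fraction of onsets that fall in each twelfth
   of a beat. onsets[] holds onset times in ticks.
   Returns 0 on success, -1 if there is nothing to count. */
int pulse_distribution(const long *onsets, int n, int ppqn,
                       double dist[PULSE_BINS])
{
    int counts[PULSE_BINS] = {0};
    int i;

    if (n <= 0 || ppqn <= 0) return -1;
    for (i = 0; i < n; i++) {
        long phase = onsets[i] % ppqn;      /* position inside the beat */
        if (phase < 0) phase += ppqn;
        counts[(int)(phase * PULSE_BINS / ppqn)]++;
    }
    for (i = 0; i < PULSE_BINS; i++)
        dist[i] = (double)counts[i] / n;
    return 0;
}
```

With ppqn = 384, onsets on the beat land in bin 0, onsets a quarter of the way through the beat in bin 3, and onsets halfway through in bin 6, which corresponds to the peaks in the sample output above.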
.PP
-panal
.br
Counts the number of note onsets for each percussion instrument. The first
number is the code (pitch) of the instrument, the second number is the
number of occurrences. eg.
.br
35 337 37 16 38 432 39 208 40 231 42 1088 46 384 49 42 54 1104 57 5 70 1040 85 16
.PP
-ppatfor n
.br
where n is the code number of the percussion instrument. Each beat
is represented by a 4 bit number where the position of the on-bit
indicates the time in the beat when the drum onset occurs. The bits
are ordered from left to right (higher order bits to lower order
bits). This is the order of bits that you would expect in a
time series.
Thus 0 indicates that there was no note onset in that beat, 1 indicates
a note onset at the end of the beat, 4 indicates a note onset
in the middle of the beat, etc. The function returns a string
of numbers ranging from 0 to 15 indicating the presence of note onsets
for the selected percussion instrument for the sequence of beats
in the midi file. Here is a truncated sample of the output.
.br
0 0 0 0 0 0 0 0 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 4 4 0
1 0 0 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 0 0
1 0 5 0 1 0 5 0 1 etc.
.br
One can see a repeating 4 beat pattern.
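
A rough sketch of this per-beat encoding is given below. It assumes that each beat is divided into four equal slots and that the first slot maps to the highest bit (8) and the last slot to the lowest bit (1); the actual slot boundaries used by midistats may differ. The name beat_patterns is hypothetical.

```c
/* Encodes the onsets of one percussion instrument as one 4-bit value
   per beat. onsets[] holds onset times in ticks, ppqn is the number of
   ticks per quarter note, and pattern[] receives nbeats values.
   Assumption: slot 0 of a beat sets bit 8, slot 3 sets bit 1. */
void beat_patterns(const long *onsets, int n, int ppqn,
                   int *pattern, int nbeats)
{
    int i;

    for (i = 0; i < nbeats; i++) pattern[i] = 0;
    if (ppqn <= 0) return;
    for (i = 0; i < n; i++) {
        long beat, slot;
        if (onsets[i] < 0) continue;
        beat = onsets[i] / ppqn;
        if (beat >= nbeats) continue;
        slot = (onsets[i] % ppqn) * 4 / ppqn;   /* 0..3 */
        pattern[beat] |= 8 >> slot;
    }
}
```

Under these assumptions an onset in the last quarter of a beat yields 1, and a beat with no onsets yields 0.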
.PP
-ppat
.br
midistats attempts to find two percussion instruments in the midi file
which come closest to acting as the bass drum and snare drum.
If it is unsuccessful, it reports the failure. Otherwise, it
encodes the position of these drum onsets in an 8-bit byte for each
quarter note beat in the midi file. The lower (right) 4 bits encode the
bass drum and the higher (left) 4 bits encode the snare drum in the
same manner as described above for -ppatfor.
.br
0 0 0 0 0 0 0 0 0 0 33 145 33 145 33 145 33 145 33 145 33 145 33 145
.br
33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145
.br
33 145 33 145 33 145 33 145 33 145 33 etc.
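
The packing of the two 4-bit patterns into one byte can be illustrated with a short sketch. The function name pack_beat is hypothetical; only the bit layout (snare in the higher 4 bits, bass in the lower 4 bits) is taken from the description above.

```c
/* Packs the 4-bit bass and snare patterns of one beat into a byte:
   snare in the higher (left) 4 bits, bass in the lower (right) 4 bits. */
unsigned char pack_beat(int bass, int snare)
{
    return (unsigned char)(((snare & 0x0F) << 4) | (bass & 0x0F));
}
```

For example, a bass pattern of 1 with a snare pattern of 2 gives 33, and a bass pattern of 1 with a snare pattern of 9 gives 145, the two values that alternate in the sample output.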
.PP
-ppathist
.br
Computes and displays the histogram of the values that would appear
when running -ppat. eg.
.br
bass 35 337
.br
snare 38 432
.br
1 (0.1) 64 32 (2.0) 8 33 (2.1) 136 144 (9.0) 8 145 (9.1) 136
.br
The bass percussion code, the number of onsets, and the snare
percussion code and the number of onsets are given in the
first two lines. In the next line the number of occurrences of
each value in the -ppat listing is given. The number in parentheses
splits the two 4-bit values with a period. Thus 33 = (2*16 + 1).
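
The histogram and the parenthesized notation can be reproduced from a -ppat listing along these lines. This is a sketch only; the names ppat_histogram and split_nibbles are hypothetical.

```c
/* Counts how often each packed beat value (0..255) occurs. */
void ppat_histogram(const unsigned char *ppat, int nbeats, int counts[256])
{
    int i;

    for (i = 0; i < 256; i++) counts[i] = 0;
    for (i = 0; i < nbeats; i++) counts[ppat[i]]++;
}

/* Splits a packed value into its two 4-bit halves,
   e.g. 33 gives high = 2 and low = 1, written as (2.1). */
void split_nibbles(int value, int *high, int *low)
{
    *high = (value >> 4) & 0x0F;
    *low = value & 0x0F;
}
```

Printing each nonzero count together with the two halves returned by split_nibbles() gives entries of the same shape as "33 (2.1) 136".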
.PP
-pitchclass
.br
Returns the pitch class distribution for the entire midi file.
.PP
-nseqfor
.br
Note sequence for channel n. This option produces a string of bytes
indicating the presence of a note in a time unit corresponding to
an eighth note. Thus each quarter note beat is represented by two
bytes. The pitch class is represented by the line number on the
staff, where 0 is C. Thus the notes of a scale are represented
by 7 numbers, and sharps and flats are ignored. The line number is
then converted to a bit position in the byte, so that the pitch
classes are represented by the numbers 1, 2, 4, 8, etc. A chord
consisting of two note onsets would set two of the corresponding
bits. If we were to represent the full chromatic scale consisting
of 12 pitches, then we would require two-byte integers or
twice as much memory.
.br
Though the pitch resolution is not sufficient to distinguish
major or minor chords, it should be sufficient to identify some
repeating patterns.
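
The mapping from pitch to bit position can be sketched as below. How midistats assigns accidentals to staff lines is not stated here, so the table degree_of assumes the sharp spelling (C# shares the line of C, and so on); that choice, and the names pitch_to_bit and chord_byte, are assumptions made for illustration.

```c
/* Staff degree (C = 0 ... B = 6) for each of the 12 pitch classes.
   Assumption: accidentals take the sharp spelling, so C# maps to C. */
static const int degree_of[12] = {0, 0, 1, 1, 2, 3, 3, 4, 4, 5, 5, 6};

/* Returns the bit (1, 2, 4, ... 64) that represents a MIDI pitch. */
unsigned char pitch_to_bit(int midi_pitch)
{
    int pc = ((midi_pitch % 12) + 12) % 12;
    return (unsigned char)(1u << degree_of[pc]);
}

/* Combines the notes sounding in one eighth-note unit into one byte. */
unsigned char chord_byte(const int *pitches, int n)
{
    unsigned char b = 0;
    int i;

    for (i = 0; i < n; i++) b |= pitch_to_bit(pitches[i]);
    return b;
}
```

Seven bits are enough for the seven staff degrees, which is why a single byte per time unit suffices in this scheme.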
.PP
-ver (version number)
.B etc. (See drums.txt in doc folder.) .B etc. (See drums.txt in doc folder.)

@@ -6,7 +6,7 @@ abc2abc version 2.20 February 07 2023
yaps version 1.92 January 06 2023 yaps version 1.92 January 06 2023
abcmatch version 1.82 June 14 2022 abcmatch version 1.82 June 14 2022
midicopy version 1.39 November 08 2022 midicopy version 1.39 November 08 2022
midistats version 0.82 December 17 2023 midistats version 0.83 December 26 2023
24th January 2002 24th January 2002
Copyright James Allwright Copyright James Allwright

@@ -18,7 +18,23 @@
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
*/ */
#define VERSION "0.82 December 17 2023 midistats" #define VERSION "0.83 December 27 2023 midistats"
/* midistats.c is a descendant of midi2abc.c, which was becoming too
large. The object of the program is to extract statistical characteristics
of a midi file. It is mainly called by the midiexplorer.tcl application,
but it is now also used to create some databases using runstats.tcl, which
comes with the midiexplorer package.
By default the program produces a summary that is described in the
midistats.1 man file. This is done by making a single pass through
the midi file. If the program is called with one of the runtime
options, the program extracts particular information by making more
than one pass. In the first pass it creates a table of all the
midievents which is stored in memory. The midievents are sorted in
time, and the requested information is extracted by going through
this table.
*/
#include <limits.h> #include <limits.h>
/* Microsoft Visual C++ Version 6.0 or higher */ /* Microsoft Visual C++ Version 6.0 or higher */
@@ -52,6 +68,7 @@ void stats_finish();
float histogram_perplexity (int *histogram, int size); float histogram_perplexity (int *histogram, int size);
void stats_noteoff(int chan,int pitch,int vol); void stats_noteoff(int chan,int pitch,int vol);
void stats_eot (); void stats_eot ();
void keymatch();
#define max(a,b) (( a > b ? a : b)) #define max(a,b) (( a > b ? a : b))
#define min(a,b) (( a < b ? a : b)) #define min(a,b) (( a < b ? a : b))
@@ -584,6 +601,9 @@ for (i=35;i<100;i++) {
printf("\npitches "); /* [SS] 2017-11-01 */ printf("\npitches "); /* [SS] 2017-11-01 */
for (i=0;i<12;i++) printf("%d ",pitchhistogram[i]); for (i=0;i<12;i++) printf("%d ",pitchhistogram[i]);
keymatch();
printf("\npitchact "); /* [SS] 2018-02-02 */ printf("\npitchact "); /* [SS] 2018-02-02 */
if (npulses > 0) if (npulses > 0)
for (i=0;i<12;i++) printf("%5.2f ",pitchclass_activity[i]/(double) npulses); for (i=0;i<12;i++) printf("%5.2f ",pitchclass_activity[i]/(double) npulses);
@@ -1253,6 +1273,117 @@ for (i=0;i<lastBeat;i++) printf("%d ",drumpat[i]);
printf("\n"); printf("\n");
} }
/*
The key match algorithm is based on the work of Craig Sapp
Visual Hierarchical Key Analysis
https://ccrma.stanford.edu/~craig/papers/05/p3d-sapp.pdf
published in Proceedings of the International Computer Music
Conference, 2001,
and the work of Krumhansl and Schmuckler.
Craig Sapp's simple coefficients (mkeyscape)
Major C scale
The algorithm correlates the pitch class histogram with
the ssMj or ssMn coefficients trying all 12 key centers, and
looks for a maximum.
The algorithm returns the key, sf (the number of sharps or
flats), and the maximum peak which is relatable to the
level of confidence we have of the result.
*/
static float ssMj[] = { 1.25, -0.75, 0.25, -0.75, 0.25, 0.25,
-0.75, 1.25, -0.75, 0.25, -0.75, 0.25};
/* Minor C scale (3 flats)
*/
static float ssMn[] = { 1.25, -0.75, 0.25, 0.25, -0.75, 0.25,
-0.75, 1.25, 0.25, -0.75, 0.25, -0.75};
static char *keylist[] = {"C", "C#", "D", "Eb", "E", "F",
"F#", "G", "Ab", "A", "Bb", "B"};
static char *majmin[] = {"maj", "min"};
/* number of sharps or flats for major keys in keylist */
static int maj2sf[] = {0, 7, 2, -3, 4, -1, 6, 1, -4, 3, -2, 5};
static int min2sf[] = {-3, 4, -1, -6, 1, -4, 3, -2, -7, 0, -5, 2};
void keymatch () {
int i;
int r;
int k;
float c2M,c2m,h2,hM,hm;
float rmaj[12],rmin[12];
float hist[12];
float best;
int bestIndex,bestMode;
int sf; /* number of flats or sharps (flats negative) */
int total;
float fnorm;
c2M = 0.0;
c2m = 0.0;
h2 = 0.0;
best = 0.0;
bestIndex = 0;
bestMode = -1;
total =0;
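for (i=0;i<12;i++) {
total += pitchhistogram[i];
}
if (total == 0) {
/* no pitched notes; avoid dividing by zero below */
printf("zero histogram\n");
return;
}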
for (i=0;i<12;i++) {
hist[i] = (float) pitchhistogram[i]/(float) total;
}
fnorm = 0.0;
for (i=0;i<12;i++) {
fnorm = hist[i]*hist[i] + fnorm;
}
fnorm = sqrt(fnorm);
for (i=0;i<12;i++) {
hist[i] = hist[i]/fnorm;
}
for (i=0;i<12;i++) {
c2M += ssMj[i]*ssMj[i];
c2m += ssMn[i]*ssMn[i];
h2 += hist[i]*hist[i];
}
if (h2 < 0.0001) {
printf("zero histogram\n");
return;
}
for (r=0;r<12;r++) {
hM = 0.0;
hm = 0.0;
for (i=0;i<12;i++) {
k = (i - r) % 12;
if (k < 0) k = k + 12;
hM += hist[i]*ssMj[k];
hm += hist[i]*ssMn[k];
}
rmaj[r] = hM/sqrt(h2*c2M);
rmin[r] = hm/sqrt(h2*c2m);
}
for (r=0;r<12;r++) {
if(rmaj[r] > best) {
best = rmaj[r];
bestIndex = r;
bestMode = 0;
}
if(rmin[r] > best) {
best = rmin[r];
bestIndex = r;
bestMode = 1;
}
}
if (bestMode == 0) sf = maj2sf[bestIndex];
else sf = min2sf[bestIndex];
/*printf("\nkeymatch: best = %f bestIndex = %d bestMode = %d",best,bestIndex,bestMode);*/
printf("\nkey %s%s %d %f",keylist[bestIndex],majmin[bestMode],sf,best);
}
void percsummary () { void percsummary () {