Spect version 1.0 AUG 93 by Kevin C. Fandre *************** About: Spect is a sampling and spectral analysis program written for the Gravis Ultrasound. It utilizes i386 32 bit instructions for speed but since no protected mode instructions are used it should run in 8086 virtual mode, ie it won't interfere with memory managers and other such protected mode programs. It will not, however, run under windows but who needs to since this program was written for speed, speed, speed and that precludes windows. So, at minimum, you must have a 386 or better computer, a VGA compatible video card, and a Gravis Ultrasound. A math coprocessor is NOT required and the presence of one will not affect the speed of the program since spect uses only integer instructions. *************** Disclaimer: I am not responsible for any damages whatsoever whether it be eye strain, migraines, tidal waves, heart failure, or otherwise caused by prolonged viewing of this program. Loss of data, hardware failures, stock crashes, and everything else caused by this program are solely the responsibility of the user. (I always wanted to write one of those.) *************** Public Domain: Use of this program is free and I will not try to make you feel guilty for using it and not paying for it. It may be distributed freely for non-profit only and I reserve all rights to this program. However, if you use this program very frequently, use it for *non-private* purposes, or would like to spur me on to further develop this program and others based on FFT analysis, you may show your interest by sending a donation commensurate with your level of usage of this program to: Kevin C. Fandre 1726 Leonard Dr. Jacksonville, AR 72706 Such donations are strictly non-obligatory but I would strongly encourage those who use this program for professional purposes, such as in education or as a demo in a commercial store for instance, to send in a donation. Donations in the range of ten to fifteen dollars per copy are quite acceptable. Please include any comments or suggestions you might have and, most especially, please describe to what use you are putting this program. This will help me to better gauge your needs and improve upon this program. In the future I was planning on adding: -Window smoothing to clear up spectrums and improve definition -Adding various kinds of digital filters -Support for other sound cards or a home built ADC kit -Support for Gravis' own 16bit daughter card -Adding mouse and window support and other user niceties -Adding oscilliscopes -Adding wavelet analysis -?(Whatever you want) *************** Comments: Please tell me what you think! I would love to hear from you. Send your comments to either the above address or my email address: good till May of 94 *************** To test: Spect is configurable from the command line but for a quick test try running spect without any command line arguments. This will set up spect to sample from both the line in and microphone channels in stereo at 8000 Hz. These parameters should give adequate results and fast screen updates even on a slow computer but aliasing caused by too low a sampling rate will cause artifacts(unwanted lines) to appear if there is an abundance of frequencies higher than 4000 Hz. This means that if you are analyzing rock music for instance, cymbal crashes will look like static superimposed on the lower frequencies. The only way to avoid this is to switch to a higher sampling rate( up to 44100 Hz on the Ultrasound) but this has its negative side-effects also. But more on this later. First I will briefly describe what the Fast Fourier Transform(FFT) does and how it does it and this will give you a better understanding of how to configure spect to give you optimum performance. *************** The FFT: The FFT is based on the Discrete Fourier Transform which is a discrete extension of the Fourier integral, an integral which can be interpreted as the continuous representation of the spectral components of any given function. The Fourier integral itself derives from the Fourier series which is capable of describing any function over an interval in terms of a *unique* infinite summation of sines and cosines whose periods are integer multiples of each other. Well, that was a mouthful but basically it is possible to describe a function, and sampled data over a range constitutes a function, as a summation of sine waves. The mathematics of all this was worked out by the mega-geniuses of the past(Fourier,Laplace, and Euler) and boils down to this: 1 N-1 jk X(j) = - SUM x(k)W N k=0 where, N = number of points W = exp(2*pi*i)/N X(j) - N complex transformed points x(k) - N complex samples The points which are graphed on the screen are merely |X(j)| which come from the sample points x(k). Since we are dealing with real data the complex parts of the samples are just set to zero. Now, as you can see the number of operations that must be gone through is proportional to N^2, N summations by N multiply-adds. Up until 1963 this is the way spectrums were done until Cooley-Tukey came up with a faster alternative that better utilizes intermediate results. Their original paper described a method which worked with any number of samples so long as N was not prime and came to be know as the Fast Fourier Transform. A special case arises when the number of samples is a power of 2(or a power of 5 or any number for that matter). Spect uses just such an algorithm and runs in a time proportional to N*log(N). The time savings over N^2 is enormous. Its difficult to describe in words how the FFT algorithm utilizes intermediate results to reduce computation time. Basically, it amounts to splitting up the set of samples into two odd and even sets, applying the summation detailed above except that exponents are all multiplied by two. With the resulting two sets any desired X(j) can be computed by adding an element from the even results with an appropriate element from the set of odd results multiplied by W raised to a fractional power depending on which X(j) is being computed. The neat part is that this procedure can be applied recursively up to log2(N) times if N is a power of 2 until we produce N sets of one element each which are nothing more than the sample data themselves. Then the recursion stops and we can work our way backwards using the simple combining rules to produce an X(j) using only big O log2(n) operations to do so. Over N X(j), this is proportional to N*log2(n) total compute time. The algorithm used by Spect I owe to who knows who but it dispenses with the stack pushing overhead involved in recursion and utilizes the patterns evident in doing all the substitutions of the combining rules into themselves that was affected by the recursion anyway. The only draw back is that the results are in a scrambled order but a simple table lookup alleviates the headache of unscrambling them. The code was written using 32bit integer instructions, utilizes table lookups wherever possible, and squeezes every last CPU cycle out of the i386. ************** Command line Arguments: -? or -h : list options -l :turn on the UltraSound line in -q :put in quite mode, turn of output -m :enable microphone input -L :turn on the logarithmic scale -f#:allows you to specify the frequency -s :tell spect to sample stereo data Currently file playing is not supported but will be in future versions. Typing a filename will work though, but there will be no sound, only graphics. It was there for diagnostic purposes and I decided to leave it. ************** How to Use: From within the program, typing F1 will print the help screen. Here's an example of typical command line parameters: spect -f16000 -lsL This tells spect to use a logarithmic scale and sample in stereo from the line in at 16000 Hz. If you dont use either the l switch or the m switch you will be recording dead silence so you must use at minimum one of these switches. The frequency defaults to 8000Hz but can be specified as desired. Now press the "+" key once. Notice that this screen is identical to the screen that results when spect is run without any command line arguments only that is will be a little bit cleaner because aliasing effects are reduced. Also you can press the PgUp key to see the other half of the transformed samples since pressing the "+" key doubles the number of samples to 512. Doing an FFT transform on real data produces only N/2 relevant points. The other points are just a mirror image and so are discarded by spect. The number of bars displayed by spect is 128 so the minimum number of samples required to produce 128 significant transform points is 256. This is the minimum. Spect is capable of utilizing up to 2048 points but at the cost of execution speed. However, what you do get is increased resolution. It is possible to dynamically increase and decrease the number of samples computed from within the program by pressing the + key to increase the number of samples and the - key to decrease the number of samples. Fortunately, increasing the number of samples to 2048 dosent slow the computer to a crawl for the reasons described above. A fast 486 helps when using a lot of samples and you will need them when sampling at higher frequencies unless you are willing to sacrifice a lot of detail. Since the display is limited to 128 bars, when zoomed in it is possible to shift the display left and right one bar at a time by using the cursor keys and 128 bars at a time by using the PgUp and PgDwn keys. The frequencies in hertz listed along the bottom of the screen will automatically adjust to reflect the new range represented by the group of 128 bars currently on screen. A logarithmic scale is provided to more closely represent the way the ear perceives sound(decibels are a logarithmic scale afterall). The logarithmic scale tends to more clearly discriminate fainter signals and deemphasizes the differences between stronger signals. You can select this scale option from the command line with the -L(case matters) switch. From within the program it possible to switch between linear and logarithmic scales with the "/" key. To determine the spectral resolution of the transform just divide the sampling rate by the total number of samples. Thus for the default 8000Hz the spectral resolution under a 256 point transform is 31.25Hz, meaning that each bar is 31.25 Hz greater than the next. You can halve this number by pressing the "+" key and doubling the number of samples. The maximum observable frequency is always one half of the sampling rate as per the Nyquist frequency law. When you experiment with Spect try to pick out the harmonics in various signals. Guitar solos really stand out and voices look really beautiful. After starring at the patterns that voices produce you can get a glimmer of how one might go about writing a voice recognition program since the patterns are fairly regular. If you cant see the patterns too clearly then try reducing the sampling frequency. Most of the action, in music at least, occurs below 4000Hz. The higher harmonics are there of course but they are fainter and not as interesting to look at and can be done away with by choosing a lower sampling rate. Only, take care that the higher harmonics are not too strong or they will fold over onto the lower frequencies and give false lines if the sampling rate is too low. If you have more horsepower at your disposal try doubling the sampling rate and doubling the zoom. You will get the same screen; it will be a little slower though, but much cleaner. Also, make sure the volume of your input source is set high enough since the Ultrasound's ADC is not all that sensitive. Keep in mind that because the Ultrasound only records in 8 bits and lacks significant dynamic range it will take some fine control of the volume control to get enough volume to make all of the harmonics show up while at the same time prevent clipping. You don't want clipping since that will transform into false lines that aren't in your source. A little won't hurt too much though... If you have access to a signal generator and test it out on spect don't be surprised if you don't get nice sharp bars for every frequency tried. Since the DFT works on a limited number of points inaccuracies are bound to show up in the form of broadening. You can see what I mean by just whistling into a microphone. You will see a nice sharp peak that broadens as it nears the bottom of the screen. This is due to the inherent inaccuracies involved in representing a continuous signal by a finite number of discrete points. Finding the best combination of speed and accuracy by choosing the frequency and number of samples is a matter of trial and error. In general having a fast computer will allow you greater flexibility in choosing higher sampling rates and larger samples while still maintaining decent refresh speeds. A really fast computer is not essential though. I developed this program on a 386-20 and it runs pretty fast as far as I can tell(9 frames per second with a 256 point transform!). A fast 486 should run over three times as fast I imagine and that will provide near realtime performance with a 256 point transform. True realtime performance is not possible since it takes a finite amount of time to record a block of data but with delays in the hundredths of seconds it should be virtually instantaneous to the eye. Experiment and have fun! Oh, one last thing. The graphics look great in the dark with the stereo on.