STAT03
A collection of statistical/matrix utilities for the HP48G(X)
(c) 1996 by Christian Meland
christian.meland@pfi.no
IMPORTANT: This is the third version, and it has not been fully
tested. Backup your memory. Also, I do not know if it will run on
any other version than rom version R, as some of the unsupported
statistics entries have moved around a bit.
0.0 DISCLAIMER
==============
STAT03 is distributed in the hope that it will be useful, but the
copyright holder provide the program "as is" without warranty of
any kind, either expressed or implied, including, but not limited to,
the implied warranties of merchantability and fitness for a
particular purpose. In no event will the copyright holder be liable
to you for damages, including any general, special, incidental or
consequential damages arising out of the use or inability to use the
program.
1.0 CREDITS
===========
Tamir Demri wrote the quiksort program I use in MEDI, PTILE
and SRT. He also wrote the heapsort-program I modified to sort
arrays with two columns (more than two columns are
delt with by using indexes.)
Laurent Bercot (HPZeus) wrote the NDUPN program.
The code used to revert the stack is from Jim Donnelly's "An intro
to hp-48 sysRPL and ass. lang."
I used Mika Heiskanen's JAZZ to write the library.
Scott Hyde suggested to implement harmonic and geometric means
and did some beta-testing.
2.0 OVERVIEW
============
STAT03 is a collection of utilities I find useful for users working
with statistics and numerical matrices. It is still mostly basic
commands, but they should be of great help for those who would
like to implement more advanced tests and tasks.
2.1 HISTORY
===========
This is the third release. The number of commands has increased to
49, and the size to about 9,3 kb. More than 23 % is ml, the rest is
sysRPL. So far I have not got much response on my programs, so
maybe that's why the progress is slow :-)? If you use it, give me a
word or two, suggestions ar appreciated (christian.meland@pfi.no)
3.0 INSTALLATION
================
The STAT03 library works in any port of a HP48G(X).
To install STAT03 on your HP48 download the file 'stat03.lib' to
your calculator and store in it a port of your choice. For example to
store the library to port 0, type in (or press suitable keys)
'stat03.lib' DUP RCL SWAP PURGE 0 STO
and power-cycle the calculator. The library will autoattach itself to
the HOME directory.
To delete the library, press ON-C, then type (press)
:p: 1042 DUP DETACH PURGE
where p is the port number where the library is stored. (If this
makes no sense to you, the port number is 0 ;-) )
4.0 COMMANDS
============
->VEC (alias VEC)
Stack: (real-meta
or list
or array --> [ vector ]
->CVEC (alias CV)
Stack: (real-meta
or list
or array --> [[ column ]] (n*1)
MANIP
MANIP is used to manipulate matrices, like "let column 4 be equal
to exp(column 5)". The user has access to an algebraic command
CNR, that allows him/her to spesify column-numbers.
Stack: ([[real]] col_nr << program >>
or [[real]] col_nr 'algebraic'
or [[real]] col_nr 'global'
or [[real]] col_nr 'local'
or [[real]] col_nr real --> [[real]]
col_nr can be from 1 to p+1, where p=number of columns in input-
array. If col_nr are p+1, a new column is appended, or else the
spesified column is overwritten.
Examples:
[[1 4][2 5][3 6]] 3 << 1 CNR 2 CNR + >> MANIP
-> [[1 4 5][2 5 7][3 6 9]]
[[1 4 5][2 5 7][3 6 9]] 3 'Sum(I=1,3,CNR(I))' MANIP
-> [[1 4 10][2 5 14][3 6 18]]
CNR
CNR is used in MANIP, and will not give any reaponse if used
outside MANIP.
Syntax: col_nr CNR or 'CNR(col_nr)'
You will find that many of the M... commands are stack-
replacements for some of the built-in stat. commands. Most of
them also give the user some additional features, such as mean of a
spesific column or mean of the entire array.
MTOT
Stack: [[real]] -> [ tot1 tot2 ... totp]
or [[real]] col_nr -> tot(col_nr)
or [[real]] 0 -> tot(whole array)
MMEAN
Stack: [[real]] -> [ mean1 mean2 ... meanp]
or [[real]] col_nr -> mean(col_nr)
or [[real]] 0 -> mean(whole array)
MSDEV
Stack: [[real]] -> [ sdev1 sdev2 ... sdevp]
or [[real]] col_nr -> sdev(col_nr)
or [[real]] 0 -> sdev(whole array)
MMAX
Stack: [[real]] -> [ max1 max2 ... max]
or [[real]] col_nr -> max (col_nr)
or [[real]] 0 -> max(whole array)
MMIN
Stack: [[real]] -> [ min1 min2 ... minp]
or [[real]] col_nr -> min(col_nr)
or [[real]] 0 -> min(whole array)
MVAR
Stack: [[real]] -> [ var1 var2 ... varp]
or [[real]] col_nr -> var(col_nr)
or [[real]] 0 -> var(whole array)
MSSQ
Stack: [[real]] -> [ ssq1 ssq2 ... ssqp]
or [[real]] col_nr -> ssq(col_nr)
or [[real]] 0 -> ssq(whole array)
Sum of squares, Sum(X^2)
MSXY
Stack: [[real]] col_x col_y -> sum(x*y)
Sum of products, Sum(x*y)
MSLN
Stack: [[real]] -> [ sumln1 sumln2 ... sumlnp]
or [[real]] col_nr -> sumln(col_nr)
or [[real]] 0 -> sumln(whole array)
Sum of logaritms, Sum(ln(x))
MGMEAN
Stack: [[real]] -> [ gmean1 gmean2 ... gmeanp]
or [[real]] col_nr -> gmean(col_nr)
or [[real]] 0 -> gmean(whole array)
Geometric mean, Prod(X)^(1/n)
MHMEAN
Stack: [[real]] -> [ hmean1 hmean2 ... hmeanp]
or [[real]] col_nr -> hmean(col_nr)
or [[real]] 0 -> hmean(whole array)
Harmonic mean, n/(sum(inv(x)))
MMEDI
Stack: [[real]] -> [ medi1 medi2 ... medip]
or [[real]] col_nr -> medi(col_nr)
or [[real]] 0 -> medi(whole array)
Median of data.
MC
Stack: [[real]] -> number_of_cols
or [real] -> number_of_elements
Number of columns. For vectors: number of elements.
MR
Stack: [[real]] -> number_of_cols
or [real] -> number_of_elements
Number of rows. For vectors: number of elements.
SQARR
Stack: ([[real]] --> [[real]]
or [real] --> [real])
Squares all values in an array.
EADD
Stack: [[real]] [[real]] -> [[real]] (both can be vectors)
or [[real]] real -> [[real]]
or real [[real]] -> [[real]]
Addition of arrays, or addition by a constant. If two arrays are
added, the number of elements must be equal, and the result will
inherit the dimensions from the array on stack level 1
ESUB
Stack: [[real]] [[real]] -> [[real]] (both can be vectors)
or [[real]] real -> [[real]]
or real [[real]] -> [[real]]
Subtraction of arrays, or subtraction of a constant. If two arrays are
subtracted, the number of elements must be equal, and the result
will inherit the dimensions from the array on stack level 1
EMUL
Stack: [[real]] [[real]] -> [[real]] (both can be vectors)
or [[real]] real -> [[real]]
or real [[real]] -> [[real]]
Element-by-element multiplication of arrays, or multiplication by a
scalar. If two arrays are multiplied, the number of elements must be
equal, and the result will inherit the dimensions from the array on
stack level 1
EDIV
Stack: [[real]] [[real]] -> [[real]] (both can be vectors)
or [[real]] real -> [[real]]
or real [[real]] -> [[real]]
Element-by-element division of arrays, or division by a divisor. If
two arrays are divided, the number of elements must be equal, and
the result will inherit the dimensions from the array on stack level
1
MEDI
Stack: --> [ medians ] or median
Returns the median of the array stored as SigmaDAT. Very fast
(returns the median of a { 100 10 } RANDom array in 1.5 sec's).
WMEAN
Stack: --> mean
WMEAN returns the weighted mean value. The data-values are
stored in the xcol of SigmaDAT, and the weights (or frequencies)
are stored in the ycol.
WSDEV
Stack: --> std
WSDEV returns the weighted standard deviation value. The data-
values are stored in the xcol of SigmaDAT, and the weights (or
frequencies) are stored in the ycol.
WVAR
Stack: --> var
WVAR returns the weighted variance value. The data-values are
stored in the xcol of SigmaDAT, and the weights (or frequencies)
are stored in the ycol.
WPSDEV
Stack: --> pstd
WPSDEV returns the weighted standard deviation for the
population (denominator n instead of the usual n-1). The data-
values are stored in the xcol of SigmaDAT, and the weights (or
requencies) are stored in the ycol.
WPVAR
Stack: --> pvar
WPSTD returns the weighted variance for the population
(denominator n instead of the usual n-1). The data-values are
stored in the xcol of SigmaDAT, and the weights (or frequencies)
are stored in the ycol.
COLM
Stack: [[real]] col_nr --> [[real]]
Extracts the spesific column from an array. Same as COL- SWAP
DROP ->COL, but faster, the result is still a column, and does not
error if the input has only one column.
ROWM
Stack: [[real]] row_nr --> [real]
Extracts the spesific row from an array. Same as ROW- SWAP
DROP, but faster.
COLSW
Stack: [[real]] col_nr1 col_nr2 --> [[real]]
Swap two columns in an array. Same as CSWP, but faster.
BINS1
Stack: min delta nbins --> [[bins]] [below above]
Fast replacement for the built in BINS-command, with minor
changes. Very fast in some cases, fast in other. A large number of
bins and/or a large amount of data located right below max and
(min+max)/2 will slow the process much more than a high number
of data to count.
Comparison: (all data are produced by RAND, with an initial 123
RDZ)
Program #data min delta nbins Time #data/sec
BINS1 100 0 1/20 20 0.1878 532.3
BINS 2.184 9.16
BINS1 1000 0 1/100 100 2.850 350.8
BINS 44.38 22.53
BINS1 1000 0 1/10 10 0.5624 1778
BINS 41.52 24.08
BINS1 3000 0 1/100 100 7.908 380
BINS ca. 133 ca. 22
BINS1 3000 0 1/2 2 1.266 2370
BINS1 3000 -1 1 1 0.891 3366
BINS1 1000 -1 1/100 100 3.226 310
BINS1 1000 1 1/100 100 0.726 1377
Given min, delta and nbins, it returns the number of observations
in each cell [[ min,min+d>,[min+d,min+2d>,...,[min+(nbins-1)d,
min+nbins*d>] and [number below, number above]
Note that the last interval is open [...> and not closed [...], as with
BINS. The program does not handle data with exponent equal to
499 properly, but this should not be any problem in most cases.
PTILE
Stack: [[real]] p --> percentile
or [real] p --> percentile
Returns the p'th percentile of the array. 0<=p<=100. Fast.
SRT (alias QSRT)
Stack: [[real]] --> [[real]]
or { reals } --> { reals }
or real_meta --> real_meta
or [[real]] col_nr --> [[real]]
<---- NEW
or [[real]] -col_nr --> [[real]] (descending)
<---- NEW
or { list of any objects } << program >> -> { sorted list }
<---- NEW
This is an attempt to make a fast general purpose sort command.
The three new "sub-commands" use a heapsort algoritm written by
Tamir Demri, I have made some modifications and added some ml
indexing and rearrengment routines. The other "sub-commands"
use a quicksort algoritm written by Tamir Demri. They require
about 1.6 times size of object to sort free space. To sort in
decreasing order, use REV after sorting is done. SRT is very fast. It
sorts an 1000 elem. array (made by RAND) in 1.54 sec's.
When input is [[real]] col_nr, it will sort the array by that spesific
column, and in decreasing order if col_nr is negative.
When input is {list} << program>>, the list can contain any object,
and different types can be mixed. The << program >> should, for
each element in the list, return a real that is to be the sorting-key.
For example, to sort a list of elements by their size, one would use
<< BYTES SWAP DROP >> as sorting-key. The program is quite
fast, to sort a list of 500 random complex numbers by their
imaginary part, only takes 4 sec. if you use :: C>Im ; as <<
program >>. The list is decomposed and rebuilt, so you may run
into a heavy garbage-collection from time to time. Also you need
some free space ( appx. 2 times size of list + 16 times number of
elements) To sort lists of strings, remember that NUM only give
you the value of the first character. You may use the built in SORT
command, or use :: ROMPTR 412 5D #>% ; as << program >>
(ROMPTR 412 5D is the routine to revert strings, #>% is the body
of B->R, but it works well with strings, too.)
Tips: To randomize a list, simply use << DROP RAND >> as
sorting-key.
TRNS
Stack: [[real]] --> [[real]]
or [[complex]] --> [[complex]]
or 'local' -->
or 'global' -->
Faster replacement for the built in TRN. Conjugates the elements
of a complex array (like TRN does).
META
Stack: n --> 1 2 ... n n
Makes a real meta object.
Example: 5 META --> 1 2 3 4 5 5
NDUPN
Stack: (ob n --> ob ob ... ob n)
Code written by HPZeus. Makes n copies of the object on stack
level 2.
REV
Stack: {list} --> {list}
or meta --> meta
or string --> string
or [[real]] --> [[real]]
or [real] --> [real]
Reverses the elements/characters/numbers of an object.
ZALPHA (algebraic)
Stack: alpha --> Z(alpha)
For a given alpha value 0<=alpha<=1, computes the z (standard
normally distr.) value such that P(Z>z)=alpha (=> 0 1 z UTPN =
alpha).
Flag 54:
Clear: ZALPHA gives a (good enough) numerical approximation.
Typical time: 0.06 sec. Accuracy is in most cases better than +/-
0.001
Set: ZALPHA gives an accurate answer.
Part of algoritm found in Numerical Recipes
TALPHA (algebraic)
Stack: df alpha --> t(df,alpha)
For a given set of (df, alpha), df >=1, 0<=alpha<=1, TALPHA
computes the t (student-t distr.) value such that P(T>t|df)=alpha
(=> df t UTPT = alpha).
Flag 54:
Clear: TALPHA gives a (good enough) numerical approximation.
Typical time: 0.13 sec. Accuracy is in most cases better than +/-
0.01
Set: TALPHA gives an accurate answer.
Part of algoritm found in Numerical Recipes
UTPG (algebraic)
Stack: k x --> utpg
Returns the upper tail prob. that a gamma(k) deviate should be
greater than x.
= 1 - integral(0,x,exp(-t)*t^(k-1),t)/gamma(k)
Uses a numerical solution found in Numerical Recipes. Note that
although the gammadistribution has two parameters, you can
always standardize x such that only k (the shape-parameter) is
needed. I did not want to implement the standardisation, because
the gamma-distribution is presented in different ways in different
books. The most usual ones require that you either multiply or
divide x by a (a is the scale-parameter).
DIGAMMA (algebraic)
Stack: x --> digamma(x)
Returns the digammafunction of x.
Digamma(x)= d/dx[ln(gamma(x))]
UNI
Stack: n -> [uniform(0,1)]
or { n k } -> [[uniform(0,1)]]
Return a vector or array filled with RAND-numbers.
NORM
Stack: --> norm
Generates a normally distr. value with expectation 0 and variance
1. Norm=sqrt(-2*ln(%RAN)*SINRAD(2PI*%RAN)
NNORM
Stack: mean sdev n --> [[real]]
Given mean, sdev and number of deviates, generates n independent
values from a normal(mean,sdev^2) distribution, and collects them
in a { n 1 } array. Uses a acsept/reject algoritm (donÆt know the
correct english phrase)
MAKERN
Stack: n ->
Routine to draw öitemsö without replacement. (like picking
numbered balls out of a hat.) This command make a program in the
current directory named ÆRANnÆ where n is the number of balls to
pick. MAKERN will error if n is greater than 125000. The size of
the ÆRANnÆ program is n/2+20,5 bytes.
Example: Make a program to pick 5 balls.
5 MAKERN
You will now find a program named ÆRAN5Æ in your directory.
With an initial 1 RDZ it will return:
RAN5 -> 4
RAN5 -> 3
RAN5 -> 5
RAN5 -> 2
RAN5 -> 1
RAN5 -> Error: No More Balls Left!
The ÆRANnÆ program is self-modifying, so donÆt use it from a port.
To see that it actually is self-modifying, recall a copy of the
program to the stack, then execute << RANn DROP >> several
times.
STEST
Stack: {list of names/objects} n --> [[number mean std.
lower upper]...]
Appends a program named 'TestPrg' (must exist in the same
directory, and have stackdiagram ( ob_from_list --> real)) to each
of the elements in the list n times, and records the results of each of
the n trials. The order of which the list is traversed, is randomized.
I use it mostly to test which of several procedures are the fastest
one.
For example, to see which of SIN and COS is in general the fastest
on to perform, I would store SIN and COS under the names 'S' and
'C' (you are free to choose your own names), then store this
program under the name 'TestPrg' (assuming RAD mode)
<< RAND 6.2832 * SWAP TIM DTAG SWAP DROP >>
(TIM is in M. Heiskanen's HACK library) Then I would execute
{ S C } 50 STEST
On my calc, it returns
[[ 1 15.38... 1.55... 14.94... 15.82...]
[ 2 15.56... 0.94... 15.30... 15.83... ]]
indicating that even if SIN was a bit faster than COS, it was not
signifficantly faster (because the intervals cover eachother).
The level of confidence of the interval 95%.
Note that TestPrg may take input from the user, and such be used
as a data collector.
PREGR
Used in polynomial regression. Still not complete, but this
will help you on the way.
Stack: [[real]] m -> [[1 r1 r1^2 ... r1^m]
...[1 rn rn^2 ... rn^m]]
To fit a 4'th degree polynomial to the data in a column of an array,
given that the array has two columns and the first is x, the second
is y, I would execute the following program:
<< ->COL DROP SWAP 4 PREGR LSQ >>
A faster version of LSQ is << DUP TRNS DUP ROT * INV
SWAP * SWAP * >>. It is not as accurate though.
Things to do:
t-tests (one and two-sample, indep. and depn.)
ANOVA (one and two-way)
Konfidence-intervals for p, my, sigma
UTPBinomial
Trigamma
Maybe simulation of some more distributions
(exponential...yes, I know it's easy)
Chi-square tests
Any suggestions?