Modules in Package SAS

Package SAS (Installed on ITL)

ACECLUS
Obtains approximate estimates of the pooled within-cluster covariance matrix when the clusters can be assumed multivariate normal with equal covariance matrices. Neither cluster membership nor the number of clusters need to be known. Options: weights, missing values.
ANOVA
Performs univariate and multivariate analysis of variance for balanced data, including Latin-square designs, certain balanced incomplete block designs, and completely nested (hierarchical) designs. Options: numerous means comparisons, missing values.
CANCORR
Performs canonical correlation analysis and tests correlation hypotheses using an F approximation. Produces both standardized and unstandardized canonical coefficients, as well as correlations between the canonical variables and the original variables. Options: canonical redundancy analysis, partial canonical correlation, weights, output data sets of scores on each canonical variable and canonical coefficients.
CANDISC
Performs a canonical discriminant analysis, computes Mahalanobis distances, and does both univariate and multivariate one-way analyses of variance. Tests zero correlations using an F approximation. Options: weights, missing values.
CATMOD
Analyzes two-dimensional contingency tables by fitting linear models to functions of response frequencies using a maximum-likelihood estimation of parameters for log-linear models and the analysis of generalized logits, or using a weighted-least-squares estimation of parameters for general linear models. Options: weights, parameter testing.
CHART
Produces line printer vertical and horizontal bar charts (histograms), X-Y-Z block charts, pie charts, star charts, frequency and cumulative frequency plots.
CLUSTER
Hierarchically clusters observations by one of eleven procedures (standard linkage methods, density linkage (including kth-nearest-neighbor and two-stage), and maximum-likelihood for mixtures of spherical multivariate normal distributions). Input data can be either coordinates or distances. Options: trimming input data, missing values.
CORR
Computes correlation coefficients between variables, including Pearson product-moment and weighted product-moment correlations. Can also compute Spearman's rank-order correlation, Kendall's tau-b, and Hoeffding's measure of dependence. Options: some univariate descriptive statistics, missing values.
DISCRIM
Computes linear or quadratic discriminant functions for classifying observations into two or more groups. The distribution within each group should be approximately multivariate normal. The classification criterion can be based on either the individual within-group covariance matrices or the pooled covariance matrix. Options: homogeneity of the within-group covariance test, missing values.
FACTOR
Performs several types of common factor and component analysis for multivariate data, a correlation matrix, a covariance matrix, a factor pattern, or a matrix of scoring coefficients. A variety of methods are available for extracting factors, for prior communality estimation, and for rotation. Options: weights, factor scores.
FASTCLUS
Performs a disjoint cluster analysis by minimizing the sum of squared distances from the cluster means. The user specifies the maximum number of clusters and, optionally, the minimum radius of the clusters. Designed for use with large data sets. Options: weights, missing values.
FREQ
Builds frequency or crosstabulation tables for one-way to n-way categorical data. Can compute tests and measures of association for two-way tables and can do stratified analysis and compute statistics within as well as across strata for n-way tables. Options: missing values, weights, additional analysis.
GLM
Performs simple and multiple least-squares regression, analysis of variance (especially for unbalanced data), analysis of covariance, response-surface regression, polynomial regression, partial correlation, multivariate analysis of variance, and repeated measures analysis of variance. Options: weights, missing values.
LIFEREG
Fits parametric models to failure-time data that may be right censored. Models include exponential, Weibull, log normal, and log logistic. Parameters are estimated by maximum likelihood using a Newton-Raphson algorithm. Independent variables may be continuous or discrete.
LIFETEST
Computes nonparametric estimates of the survival distribution (by the product limit method or the life table method) and computes rank tests for association of the response variable with covariates for stratified data that may be right censored. Options: tests homogeneity between strata, missing values, printer plots.
MEANS
Produces univariate descriptive statistics for numeric variables in an entire data set or for groups of observations in the data set. Options: weights, missing values.
NEIGHBOR
Performs a nearest-neighbor discriminant analysis, classifying observations into groups according to either the nearest-neighbor rule or the k-nearest-neighbor rule when the classes do not have multivariate normal distributions. Proximity is determined by either Mahalanobis or Euclidean distances. Options: use of prior probabilities in the classification, missing values.
NESTED
Performs analysis of variance and analysis of covariance for nested random designs. Especially good for designs involving large numbers of classification levels and observations. The data set must be sorted by the classification variables (assumed to form a nested set of effects).
NLIN
Performs nonlinear least-squares regression using one of four iterative methods (modified Gauss-Newton; Marquardt; gradient or steepest-descent; and multivariate secant or false position (DUD)). The user provides starting values for the parameters, and derivatives of the model for all but the DUD method. Options: weights, bounds on the parameter estimates, objective function to be minimized, grid search for starting values.
NPAR1WAY
Performs nonparametric one-way analysis of variance on ranks and four rank scores (Wilcoxon, median, van der Waerden, and Savage).
PLAN
Generates random permutations of positive integers for experimental plans (e.g., completely random, split-plot, and hierarchical designs) given a specification of the randomized plan including number of levels of nesting. Option: seed of first permutation.
PLOT
Produces a Y vs. X line printer scatterplot, a superimposed plot, or a contour plot. Options: missing values, user control of plot features.
PRINCOMP
Performs principal component analysis on raw data, a correlation matrix, or a covariance matrix. Options: weights, missing values.
PROBIT
Calculates maximum-likelihood estimates of the intercept, slope and natural (threshold) response rate for biological assay data using a modified Gauss-Newton algorithm.
RANK
Computes ranks for one or more numeric variables across the observations of a data set. Options: group continuous data into ranges, fractional ranks, normal scores (Blom, Tukey, or van der Waerden), Savage (exponential) scores.
REG
Fits least-squares estimates to linear regression models. Options: weights; parameter estimates, predicted values, residuals, Studentized residuals, confidence limits, hypothesis tests; collinearity diagnostics; influence diagnostics including partial regression leverage plots; Durbin-Watson statistic; hypothesis tests involving multiple dependent variables; parameter estimates subject to linear restriction.
RSQUARE
Uses the R-squared statistic to select optimal subsets of independent variables for multiple regression. Can specify largest and smallest number of independent variables for a subset and number of subsets of each size. Options: weights; statistics for each model selected including Akaike's information criterion, Mallows' C-p, and others.
RSREG
Estimates a quadratic response surface using least-squares regression and determines critical values to optimize the response. Options: weights, lack-of-fit test, surface plotting, eigenvalues of the associated quadratic form.
SORT
Sorts observations by one or more variables in ascending or descending order. Options: handle duplicate records, maintain order of observations with identical values of the sorting variables, several collating sequences.
STANDARD
Standardizes some or all of the variables in a data set to a given mean and standard deviation. Options: weights, replace missing values with variable mean.
STEPDISC
Performs a stepwise discriminant analysis by forward selection, backward elimination, or stepwise selection of variables. The classes are assumed to be multivariate normal with a common covariance matrix. Options: weights, missing values.
STEPWISE
Provides five methods (forward selection, backward elimination, stepwise, maximum and minimum R-squared improvements) for stepwise regression. Recommends twenty independent variables. Options: weights, form of input, significance levels, Mallows' C-p statistic.
SUMMARY
Produces univariate descriptive statistics for numeric variables. Options: weights, missing values.
TABULATE
Produces hierarchical tables of descriptive statistics from compositions of classification variables, analysis variables, and statistics keywords. Options: weights, missing values.
TIMEPLOT
Produces a line printer plot of one or more variables over time intervals. Options: missing values, user control of plot features.
TREE
Prints a tree diagram from the output generated by SAS procedures CLUSTER or VARCLUS. Can also create an output data set identifying disjoint clusters at a specified level in the tree. Optional user control of plot features.
TTEST
Computes a t statistic for testing the hypothesis that the means of two groups of observations are equal.
UNIVARIATE
Produces simple descriptive statistics for numeric variables including extreme values and quantiles. Options: distribution plots, frequency table, missing values, weights, normality tests.
VARCLUS
Performs either disjoint or hierarchical clustering of variables by maximizing the variation accounted for by either the first principal component or the centroid component of each cluster. Options: weights, missing values.
VARCOMP
Provides four methods (Type I, MIVQUE0, maximum-likelihood, and restricted maximum-likelihood) for estimating variance components in a general linear model containing random effects and optionally fixed effects. Option: missing values.
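Illustration (not SAS code): as a rough guide to what two of the simpler entries above compute, the following Python sketch rescales a variable to a given mean and standard deviation, as the STANDARD entry describes, and computes a two-sample t statistic of the kind described under TTEST. The function names, the use of the sample (n - 1) standard deviation, and the equal-variance (pooled) form of the t statistic are assumptions made for this example; the sketch does not reproduce the options, missing-value handling, or output of the SAS procedures.

```python
import math
from statistics import mean, stdev, variance


def standardize(values, target_mean=0.0, target_std=1.0):
    """Rescale values so they have the requested mean and sample std.

    Hypothetical helper illustrating the STANDARD entry; assumes the
    sample standard deviation (n - 1 denominator) and no missing values.
    """
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [target_mean + target_std * (v - m) / s for v in values]


def pooled_t_statistic(group_a, group_b):
    """Two-sample t statistic for equality of two group means.

    Hypothetical helper illustrating the TTEST entry; assumes both
    groups share a common variance (pooled estimate).
    """
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(group_a) - mean(group_b)) / std_err


if __name__ == "__main__":
    scaled = standardize([1, 2, 3, 4, 5], target_mean=10, target_std=2)
    print(scaled)
    print(pooled_t_statistic([1, 2, 3], [2, 3, 4]))
```

The t statistic has na + nb - 2 degrees of freedom under the equal-variance assumption; a procedure for unequal variances would use a different standard error.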