automatedtest

Automatically select and run the best statistical test for your data with just one line of code. Supporting one-sample-tests, two-sample-tests, multiple-sample-tests, and even correlations! automatedtests

What is `automatedtest`?

automatedtests is an R package designed to simplify statistical testing. It automatically analyzes your data, determines the most fitting statistical test (based on structure and content), and executes it. shortening the time spent deciding what test to use.

The package supports tidy data frames and a set of numeric/categorical vectors! non tidy data will have to be reshaped.

Features

Auto-detects best statistical test based on your data type and structure.
Handles tidy data: optional identifier exclusion.
Returns an AutomatedTest object with many different results including the full test $get_result().

Supported Tests

number	test
1	One-proportion test
2	Chi-square goodness-of-fit test
3	One-sample Student’s t-test
4	One-sample Wilcoxon test
5	Multiple linear regression
6	Binary logistic regression
7	Multinomial logistic regression
8	Pearson correlation
9	Spearman’s rank correlation
10	Cochran’s Q test
11	McNemar’s test
12	Fisher’s exact test
13	Chi-square test of independence
14	Student’s t-test for independent samples
15	Welch’s t-test for independent samples
16	Mann-Whitney U test
17	Student’s t-test for paired samples
18	Wilcoxon signed-rank test
19	One-way ANOVA
20	Welch’s ANOVA
21	Repeated measures ANOVA
22	Kruskal-Wallis test
23	Friedman test

Installation

You can install the package from CRAN:

install.packages("automatedtests")

# Load library
library(automatedtests)

Usage

Using a data frame

# Automatically runs appropriate test(s) on the cars dataset
test1 <- automatical_test(cars)

# Get quick overview
test1

# Get detailed results
test1$get_result()

Using individual vectors

# Compare Sepal.Length across Species
test2 <- automatical_test(iris$Species, iris$Sepal.Length)
test2$get_result()

One-sample tests

# Compare a numeric vector to a fixed value
automatical_test(c(3, 5, 4, 6, 7), compare_to = 5)

Arguments

Argument	Description
`...`	A data frame or multiple equal-length vectors
`compare_to`	Value to compare against in one-sample tests (numeric or assumed uniform for categorical data)
`identifiers`	Logical; if TRUE, the first column is treated as identifiers and excluded from testing
`paired`	Logical; if TRUE, the test will become paired, by default FALSE

Output

Returns an object of class AutomatedTest with methods and properties like:

print(object) - overview of executed test and its results.
$get_result() - detailed summary of the test performed, containing all information including p.value, statistics etc.
$get_test() - test type selected
$is_parametric() - Whether the numeric feature were parametric
$is_paired() - Returns if a paired test was used.
$get_strength() - Shows the strength of the test/correlation. This is a different kind of value for each test. It will also return what the value is. These are the different types of data it can return:

coefficient     – strength and direction of predictor effects  
r               – strength and direction of correlation  
mean difference – size of difference between group means  
statistic       – test statistic indicating group difference or association  
F statistic     – variance ratio across group means  
proportion      – estimated proportion of successes in a sample  
non-existent    – no interpretable strength measure available

$get_parametric_list() - Returns a list of all numeric features’ distributions and the parametric tests used.
$get_datatypes() - Shows what type of data the features used in the corresponding test contain.
$is_significant() - TRUE/FALSE if result is statistically significant (p.value < 0.05), to show the result in the blink of an eye!

Example Output

# Automated Test:
# Data:  speed, dist 
# Test:  Spearman's rank correlation 
# Test:  Spearman's rank correlation 
# Results: 
#  p.value:  8.824558e-14 
#  Strength:  r = 0.83 
#  Significant:  TRUE

Method to choose stastitical test

- By Antoine Soetewey

Dependencies

R6
MASS
nnet
nortest
stats,
DescTools

These are automatically handled during installation.

Author

Wouter Zeevat

License

This package is licensed under the GPL-3 License.

You can freely use, modify, and redistribute the software under the terms of the GNU General Public License v3 (GPL-3). The key conditions of the GPL-3 license are:

You can use the package for personal, academic, or commercial purposes.
If you modify the package and distribute it, you must distribute the source code of your modified version.
Any derivative work must also be licensed under GPL-3.

For more information, see the full GPL-3 License.