--- title: "Generating Small, Medium, and Large Datasets" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Example} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Overview This vignette demonstrates how to use the {samplezoo} package to generate datasets of varying sizes (small, medium, and large) with variables from multiple probability distributions. Each dataset contains: * Variables/columns from common distributions such as Normal, Binomial, Poisson, and others. * Adjustable sample sizes to meet needs. ```{r setup} library(samplezoo) ``` ### Generate a small dataset (i.e., 100 rows) ```{r} data_small <- samplezoo("small") head(data_small) ``` ### Generate a medium sized dataset (i.e., 1,000 rows) ```{r} data_medium <- samplezoo("medium") head(data_medium) ``` ### Generate a large sized dataset (i.e., 10,000 rows) ```{r} data_large <- samplezoo("large") head(data_large) ``` ### Adding Variation or Ensuring Reproducibility with set.seed() To ensure reproducibility and introduce controlled variation in your dataset, use set.seed() before generating random data. ```{r} set.seed(123) data_large <- samplezoo("large") head(data_large) ``` ```{r} set.seed(456) data_large <- samplezoo("large") head(data_large) ```