--- title: "MicrobialGrowth" author: "Florentin L'Homme^[florentin.lhomme@gmail.com], Clarisse Breard^[INRAE, clarisse.breard@inrae.fr]" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true fig_width: 7 fig_height: 5 fig_align: "center" vignette: > %\VignetteIndexEntry{MicrobialGrowth} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: inline --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` ```{r packages, echo=FALSE, fig.show='hide', eval=TRUE} library(MicrobialGrowth) ``` ## Introduction {#introduction} `MicrobialGrowth` calculates metrics with biological meaning to describe microbiological growth curves. Microbial growth is measured in various biological experiments. Modern means makes it possible to recover great datasets on the evolution of these curves, for example by measuring the optical density of culture broth. These data need to be processed, analyzed and often compared. In the `MicrobialGrowth` package, we propose to fit growth curves to various known microbial growth models. All of them allow straight-forward biological interpretation with initial population ($N_0$), final population ($N_{max}$), growth rate ($\mu$) and a lag time ($\lambda$) and describe the population size $N_t$ as a function of the time $t$. These models are : - The modified form of the Gompertz equation (see [Zwietering *et al.*](https://doi.org/10.1128/aem.56.6.1875-1881.1990)) (see [Zwietering *et al.*, Modeling of the Bacterial Growth Curve, Appl Environ Microbiol. 1990 Jun; 56(6), p. 1875-1881](https://doi.org/10.1128/aem.56.6.1875-1881.1990)): $$ln\left(\frac{N_t}{N_0}\right) = ln\left(\frac{N_{max}}{N_0}\right) e^{-e^{\frac{\mu \cdot e}{ln\left(N_{max}/N_0\right)}(\lambda - t) + 1}}$$ - The Rosso model (see [Rosso *et al.*](https://doi.org/10.1006/jtbi.1993.1099)) (see [Rosso *et al.*, An Unexpected Correlation between Cardinal Temperatures of Microbial Growth Highlighted by a New Model, Journal of Theoretical Biology. 1993, Jun; 162(4) , p. 447-463](https://doi.org/10.1006/jtbi.1993.1099)): $$ln\left(N_t\right) = \left\{ \begin{array}{ll} ln(N_0) & \mbox{if } t \le \lambda\\ ln(N_{max})-ln\left(1+\left(\frac{N_{max}}{N_0}-1\right)e^{-\mu (t-\lambda)}\right) & \mbox{if } t > \lambda \end{array} \right.$$ - The Baranyi model (see [Baranyi and Roberts](https://doi.org/10.1016/0168-1605%2894%2990157-0)) (see [Baranyi and Roberts, A dynamic approach to predicting bacterial growth in food, International Journal of Food Microbiology. 1994 Nov; 23(3-4), p. 277-294](https://doi.org/10.1016/0168-1605%2894%2990157-0)): $$ ln\left(\frac{N_t}{N_0}\right) = \mu A(t) - ln\left(1+\frac{e^{\mu A(t)}-1}{\frac{N_{max}}{N_0}}\right)$$ with $$A(t) = t + \frac{1}{\mu} ln\left(e^{-\mu t} + e^{-\mu \lambda} - e^{-\mu(t+\lambda)}\right)$$ - The linear model (see [Dantigny *et al.*](https://doi.org/10.1016/j.ijfoodmicro.2004.10.013)) (see [Dantigny, P., Guilmart, A., Bensoussan, M., 2005. Basis of predictive mycology. International Journal of Food Microbiology, The Fourth International Conference on Predictive Modelling in Foods 100, 187–196](https://doi.org/10.1016/j.ijfoodmicro.2004.10.013)): $$N(t) = \mu * (t-\lambda) $$ From your data, `MicrobialGrowth`finds the best fitting values of $N_0$, $N_{max}$, $\mu$ and $\lambda$ for the model you choose. Additional functions allow graphical visualization of the results. This package has been designed to be as free to use as possible. ```{r gompertz.explain, echo=FALSE, eval=FALSE} gompertz.explain() mtext("Use `gompertz.explain()` to reproduce this plot", side = 1, line = 3, cex = 0.75) ``` ## Install and load the package You can install the latest released version from CRAN from within R with: ```{r install-cran, eval=FALSE} install.packages("MicrobialGrowth") ``` You can install the latest development version from gitlab with: ```{r install-gitlab, eval=FALSE} # install devtools first if you don't already have the package if (!("devtools" %in% installed.packages())) install.packages("devtools"); devtools::install_gitlab("florentin-lhomme/public/MicrobialGrowth") ``` Once installed, you can load the package as usual: ```{r load, eval=FALSE} library(MicrobialGrowth) ``` ## Input data {#input-data} ### example_data {#example_data} The `MicrobialGrowth` package provides some example data, to practice and be able to apply the few examples attached. These data are stored in the variable `example_data`. Below are the first 10 lines of `example_data`. ```{r, echo = FALSE, eval = TRUE} # Load the sample growth curve data provided with the package # The first column is the time in hours, and there is one column # for each well in a 96-well plate. knitr::kable(head(example_data, 10), digits = 3) ``` The first column corresponds to the time, the following columns correspond to optical density values. From column `y1` to `y15`, you will find examples of curves from the cleanest to the most chaotic (in fact, no growth). ### Your own data {#your-own-data} You are probably using a software as Excel, LibreOffice Calc, Numbers or a derivative. We recommend you to export your data in [CSV format]{style="text-decoration: underline;"} which avoids depending on a library of one of these software. Moreover, using CSV format helps saving memory space (good for the planet!), for our `example_data`, using the CSV format allowed us saving 20 to 86% of memory space compared to the software mentioned above. Make sure that your read data are **numeric** values and that each **header is unique**. You can use the `read.csv(...)` function to read your CSV file. You may need to play around with the `sep`, `dec` or other parameters to read your file correctly. For example: `data <- read.csv("mydata.csv", sep = ";", dec = ",")` if you use comma "," for decimals and semicolon ";" to separate your columns (see the `read.csv` help for more details). For headers, you should fix the problem directly in your source file. For example, if you have a triplicate of an experiment A, don't name them all "exp A", instead name them "exp A.1", "exp A.2" and "exp A.3" to differentiate them. ## Feature overview {#feature-overview} ### The `MicrobialGrowth` class {#gompertz-class} The `MicrobialGrowth` package provides several features that require a structured data. To do this, we have created a class `MicrobialGrowth` (which we will call MicrobialGrowth-object) which will allow us to call on its members (variables) and methods (functions) useful for the smooth running of other functions. These MicrobialGrowth-objects can currently be obtained by one of the two functions: `MicrobialGrowth` (regression of given dataset) and `MicrobialGrowth.create` (user-defined growth parameters). The structure of a MicrobialGrowth-object is as follows: - Members - `x`: The x values used for the regression, or the simulated x values with `MicrobialGrowth.create`. - `y`: The y values used (i.e. once clipped) for the regression, or the simulated y values with `MicrobialGrowth.create`. - `clip`: The clipping values used for y, or `NULL` with `MicrobialGrowth.create`. - `start`: The start values used for the regression, or `NULL` with `MicrobialGrowth.create`. - `lower`: The lower values used for the regression, or `NULL` with `MicrobialGrowth.create`. - `upper`: The upper values used for the regression, or `NULL` with `MicrobialGrowth.create`. - `reg`: The regression returned by nls, or `NULL` with `MicrobialGrowth.create`. - `message`: The error message if any, or `NULL` with `MicrobialGrowth.create`. - `isValid`: A boolean indicating whether the MicrobialGrowth-object is valid (plottable, *etc.*) - `coefficients`: The N0, Nmax, mu and lambda values obtained by regression, or those set with `MicrobialGrowth.create`. - Methods (included if `f` variable) - `confint(level)`: A function returning the (default) 95% confidence interval of the N0, Nmax, mu and lambda values obtained by regression, or values set with `MicrobialGrowth.create`. - `doublingTime(level)` : A function returning the doubling time calculated with mu. - `intersection(x,y,base)`: A function returning, for a given x, the value y in logarithm form (by default), or the x value to reach a given y value in logarithm form. - `auc(from, to,level,base)` : A function returning the area under the curve, by default in logarithm. - `formula()` : A function returning the formula of the object (corresponding to the 4 different models). ### The `MicrobialGrowth` regression function {#gompertz-regression} The main feature of this package is the `MicrobialGrowth` function, allowing automatical fitting of your data to the 4 models presented ealier: modified Gompertz equation, Baranyi, Rosso and linear. #### Simple uses The `MicrobialGrowth` function requires three parameters, `x` which indicates the temporality, `y` the values to process and `model` the model to fit (`gompertz`, `rosso`, `baranyi` or `linear`). Here is an example of use: ```{r simple-regression} g <- MicrobialGrowth(example_data$time, example_data$y1, model = "gompertz") g ``` The return of this function is a MicrobialGrowth-object (see section [The `MicrobialGrowth` class](#MicrobialGrowth-class)). You can therefore display the result directly in the console as above, access one of its members (*e.g.* the growth rate $\mu$ from member `coefficients`) or even plot it (see section [The `plot` function](#MicrobialGrowth-plot)). It is possible to perform multiple regressions in a single line of code by specifying multiple `y`-values. The example below shows how to perform a regression on all the data in `example_data`: ```{r multiple-regression} G <- MicrobialGrowth(example_data$time, example_data[2:ncol(example_data)], model="gompertz") #For `y`, we simply exclude the first column of time G$y1 # Print the regression of y1 ``` #### Advanced parameters The basic application of the `MicrobialGrowth` function, which should be suitable for most datasets, may not work for yours. But there are several parameters you will be able to modify to fit better your data. Before explaining these parameters, we need to explain the principle of the regression. The aim is to find the combination of coefficients ($N_0$, $N_{max}$, $\mu$ and $\lambda$) that fits better, meaning this combination minimizes the distance between the model and experimental data (with the *Nonlinear Least Squares* function, see the `nls` help for more details). To do so, the algorithm will test multiple combinations until finding the optimal one. To limit the number of tested combinations (and therefore the calculation time), we use the "port" algorithm. It helps us guide the regression by reducing the ranges of coefficient values in which it has to search. We provide the algorithm with a `lower` and a `upper` corresponding to the range of possible values for the four coefficients, as well as a `start` (contained between `lower` and `upper`) ideally close to the final solution (to converge faster). For example, a growth followed by optical density and for 24 h will have a $N_{max}$ included in `[0,2]` and a $\lambda$ included in `[0,24]`. Some other optional parameters are available for example for debugging purposes, to understand where the error comes from or to bypass it. These parameters are described below: - `clip`: given the application of the logarithm function, the values of `y` must be strictly positive. Therefore, we clip these values between the smallest y-value greater than zero and infinity. - `start`, `lower` and `upper`: these parameters help guiding the regression and minimizing calculation time. Default values are calculated as described in section "`start`, `lower` and `upper`" below. - `callbackError`: modifies the behavior of the `MicrobialGrowth` function in the event of regression failure. By default, regression failure is not blocking (no error raised) and returns a MicrobialGrowth-object with `NULL` values. - `nls.args`: modifies the parameters used when calling the `nls` function (which is inside the `MicrobialGrowth` function; see the `nls` help for more details). Here are some uses of the advanced parameters of the `nls` function: - the `weights` parameter which allows you to apply a larger weight to some of your data during regression - the `trace` parameter to debug - the `control = list(warnOnly=TRUE)` parameter to force a return value despite the regression failure ##### `clip` By default, $clip = (min(y), +\infty) \mbox{ with } y > 0$, but you might want to change it. Example: if the input is `y = c(-0.5, -0.25, 0, 0.25, 0.5, 0.75, 1)`, it will by default be transformed into `y = c(0.25, 0.25, 0.25, 0.25, 0.5, 0.75, 1)`. By applying the parameter `clip = c(0.1, Inf)`, we would have `y = c(0.1, 0.1, 0.1, 0.25, 0.5, 0.75, 1)`. Be careful, the `clip` parameter (set automatically or by yourself) can have a more or less strong influence on the return values of the regression. For example, the data series `y11` has negative values. With the default clipping value (set automatically at `0.002`) a growth rate $\mu$ of `0.2488` is returned. However, when fixing the clipping value at `0.001`, a growth rate $\mu$ of `0.3` is returned. ```{r clip-sensitivity-pretty, eval=FALSE} data.frame(cbind( default = unlist(MicrobialGrowth(example_data$time, example_data$y11,model="gompertz")$coefficients), # clip equivalent to c(0.002, Inf) custom = unlist(MicrobialGrowth(example_data$time, example_data$y11, model="gompertz", clip = c(0.001, Inf))$coefficients) )) ``` ```{r clip-sensitivity-real, echo=FALSE} print( round( data.frame(cbind( default = unlist(MicrobialGrowth(example_data$time, example_data$y11, model="gompertz")$coefficients), # clip equivalent to c(0.002, Inf) custom = unlist(MicrobialGrowth(example_data$time, example_data$y11, model="gompertz", clip = c(0.001, Inf))$coefficients) )), digits = 4 ) ) ``` This sensitivity will depend on your data. To be as comparable as possible, try to choose a global clipping value (applied to all regressions) that matches your data. Ideally, pre-process your data upstream to avoid the use of clipping. ##### `start`, `lower` and `upper` By default, these coefficients are calculated according to the following table for `gompertz`, `rosso` and `baranyi` models. | | default lower | default upper | default start | |:--------|:-------------------:|:-------------------:|:--------------------:| | N0 | $\small \frac{1}{\mbox{largeValue}} \approx 0^+$ | $\small exp((log(Y_{max})-log(Y_{min})) \times 0.2+log(Y_{min}))$ | $\small exp($average of values between $\small log(Y_{min})$ and $\small (log(Y_{max})-log(Y_{min})) \times 0.2)$ | | Nmax | $\small \exp((log(Y_{max}) -log(Y_{min})) \times 0.8+log(Y_{min}))$ | $\small 2 \times Y_{max}$ | $\small exp($average of values between $\small (log(Y_{max}) -log(Y_{min}))\times0.8$ and $\small log(Y_{max}))$ | | mu | $\small \frac{log(Y_{max}) -log(Y_{min})}{max(X)-min(X)}\ $ | Maximum slope between two adjacent points | slope of $\small log(Y)$ during growth phase | | lambda | $\small 0$ | $\small max(X)$ | First value above $\small startN0$ | For linear model, only values lambda and mu must be estimated, mu was estimated as above except on non transformed values. For lambda, both start and upper value were defined as the first value for which an increase is observed. But you may need to change them. When called in the function, these coefficients are presented as named lists, that can include our four coefficients. However, when specifying one of the parameter value (e.g. the `N0` `start` value), you are not obliged to specify the others. If not specified, they will keep the default value. The following example shows the change of all the starting coefficients as well as the lower bound for the lambda parameter: ```{r algorithm-parameters} MicrobialGrowth(example_data$time, example_data$y1, model="gompertz", start = list(N0=0.1, Nmax=2, mu=0.05, lambda=40), lower = list(lambda = 40)) ``` ##### `callbackError` A regression can go wrong. By default, the behavior of the `MicrobialGrowth` function is to remain silent and return a MicrobialGrowth-object; in this case with `NULL` values for most members. You can make verbose problems occur using the `callbackError` parameter. The `callbackError` parameter requires a pointer to a function (i.e. the variable containing the function, without the parentheses) which will take the error as input. For example, you can use the `print` function to display the error without making it blocking, or on the contrary use the `stop` function to block your script at the slightest error. Note: you should use the `warning` function instead of `print` to get a more suitable display (colored red, more succinct error; the `print` function also displays the function call concerned; etc.) ```{r callbackError, error=TRUE} g <- MicrobialGrowth(c(1,2,3,4,5),c(1,1,1,1,1), model="gompertz", callbackError = warning) # Warning g <- MicrobialGrowth(c(1,2,3,4,5),c(1,1,1,1,1), model="gompertz", callbackError = stop) # Stop by re-raising the error ``` Note that when using multiple data regression, simply using the `print` function causes you to lose the information about the data series that caused the error. You have two solutions: 1) use a classic `for` loop: ```{r callbackError-custom-forloop} myprint <- function(column.name) { function(x) { cat(paste("Error on", column.name, ":", x$message)) } } for (y in colnames(example_data)[2:ncol(example_data)]) { g <- MicrobialGrowth(example_data$time, example_data[[y]], model="baranyi", callbackError = myprint(y)) } ``` 2) use advanced functions: ```{r callbackError-custom-advanced, eval=FALSE} myprint <- function(x) { cat(paste("Error on", names(substitute(X, sys.frame(2)))[substitute(i, sys.frame(2))], ":", x$message)) } G <- MicrobialGrowth(example_data$time, example_data[2:ncol(example_data)], model="baranyi", callbackError = myprint) ``` The first is more recommended since you control what you do (unlike the second which depends on calling functions). The second solution can still be useful if you have already created a script that regresses multiple data in this way rather than in a traditional `for` loop. ##### `nls.args` Finally, you can use the `nls.args` parameter to specify parameters to use in the `nls` function (see the `nls` help); except `formula`, `algorithm`, `start`, `lower` and `upper` which are hard-coded in the `MicrobialGrowth` function. For example, if you want to give more weight to certain points in your data, you can use the `weights` parameter: ```{r nls.args-weights, results='hold'} cat("Normal:\n") print(unlist(MicrobialGrowth(example_data$time, example_data$y1, model="gompertz")$coefficients)) # More weight on the steep part (~mu) to the detriment of the other points/parameters. cat("\nWeighted (×10 for x∈[50, 70]):\n") print(unlist(MicrobialGrowth(example_data$time, example_data$y1, model="gompertz", nls.args = list(weights = (function(x){(x >= 50 & x <= 70)*9 + 1})(example_data$time)))$coefficients)) ``` In some cases, the regression will fail to converge properly, which means the algorithm is unable to find a good combination of the coefficients ($N_0$, $N_{max}$, $\mu$ and $\lambda$) that minimizes the distance between the model and the data. It can happen when there are few growth measurements. In this case, the `control = list(warnOnly = TRUE)` parameter can be used, that returns a combination of parameters that is satisfying but that is not the result of the complete convergence and therefore not the optimized combination. This result _"might be useful for further convergence analysis, but not for inference."_ [https://rdrr.io/r/stats/nls.html]. It can therefore be used to help narrow the range of possible values to test. ```{r nls.args-warnOnly, results='hold'} x = c(0, 1, 2, 3, 4, 6, 8, 9, 10, 11) y = c(1.26e+03, 1.00e+03, 7.94e+02, 1.26e+03, 1.26e+03, 3.16e+04, 7.94e+05, 5.01e+05, 1.26e+06, 1.00e+07) cat("Failure case:\n") print(unlist(MicrobialGrowth(x, y, model="gompertz")$coefficients)) cat("\nFailure case (bypassed):\n") print(unlist(suppressWarnings(MicrobialGrowth(x, y, model="gompertz", nls.args = list(control = list(warnOnly = TRUE)))$coefficients))) cat("\nUsing narrowed lower, start and upper values based on bypassed results:\n") print(unlist(MicrobialGrowth(x, y, model="gompertz",lower=list(N0=9E+2,Nmax=1E7,mu=1.25,lambda=3.5),start = list(N0=9.5E+2, Nmax=1.3E+7, mu=1.5, lambda=3.9),upper=list(N0=1E+4,Nmax=1.5E7,mu=1.75,lambda=4))$coefficients)) ``` #### Results interpretation ##### `summary` function The main results from the `MicrobialGrowth` function are the estimated values of the coefficients. Those coefficients are displayed once you run the regression but you can access them later with the `summary` function if you registered the regression in a variable. ```{r} g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz") summary(g) ``` In addition to those coefficients, their p-value is available. It can be useful to verify the validity of your regression. Indeed even though coefficients can be estimated, they might not be all significant, meaning the model isn't valid. It is especially the case for data with no growth. In the `example_data` the y15 curve seem to present a very slight beginning of growth but taking into account an error of DO reading we cannot consider their was growth. When realizing the regression, coefficients can be estimated but when looking at the p-value, lambda coefficient does not show significance. ```{r} plot(example_data$time,example_data$y15) # slight increase of DO after time 60 g = MicrobialGrowth(example_data$time,example_data$y15, model="gompertz") summary(g) # regression successful but with no significatvity on all coefficients ``` To avoid this problem we recommend you to first make a preselection on your data to select those for which a growth is observed, for example with a difference of at least 0.05 unit of DO is observed between the first and last points of the curve. ##### coefficients Estimated coefficients with the regression can also be recovered using the `coefficients` element. ```{r} g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz") g$coefficients # returns the four parameters g$coefficients$mu # returns only mu ``` ##### data You can recover the clipped data that was used for the regression by looking at the `data` element of the regression, as shown below. ```{r} g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz") head(g$data$x) # we only show the first values of the x data with head head(g$data$y) ``` ##### call With the `call` element of the regression you can access the various parameters used during the regression: - `g$call$x` and `g$call$y` provide you with the data used - `g$call$clip` provides you with the chosen value the the clip - `g$call$start`, `g$call$lower`, `g$call$upper` provide you with the chosen start, lower and upper values - `g$call$nls.args` provides you with the various `g$call$nls.args` parameters you implemented. ##### message Writing `g$message` returns the error message if applicable. ##### f With the `f` element of the regression you can access the following functions: - `g$f$confint()` returning the confidence intervals of each estimated coefficient. By default, confidence level is at 0.95 but you can specify the one you want. ```{r confint} g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz") g$f$confint() # by default level = 0.95 g$f$confint(level=0.99) ``` - `g$f$confint.lower()` and `g$f$confint.upper()` returning the lower and upper limits of the confidence band (see [The `plot` function](#MicrobialGrowth-plot) section to visualize graphically the confidence band with the `display.confint = TRUE` parameter of the plot) - `doublingTime(level)` : returning the doubling time calculated with mu as well as the time at which population is first doubled. By default, confidence level is at 0.95 but you can specify the one you want. ```{r doublingTime} g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz") g$f$doublingTime() # by default level = 0.95 g$f$doublingTime(level=0.99) ``` - `intersection()` returning, for a given x, the value y in logarithm form (by default), or the x value to reach a given y value in logarithm form. Base can be set to `NULL` for results not in logarithmic form or to `10` for results in base log10. It should be taken into account that formula used under base 10 or 2 correspond to `log(N/N0)`, while in base `NULL`, it is `N`. For linear model, only base `NULL` is available. ```{r intersection} x = c(0, 1, 2, 3, 4, 6, 8, 9, 10, 11) y = c(1000, 6500, 62000, 32000, 320000, 114000000, 300000000, 1600000000, 380000000, 350000000) g = MicrobialGrowth(x,y, model="gompertz") g$f$intersection(y=3,base=10) # time to 3 log increase calculated with base 10 g$f$intersection(y=6-log10(g$coefficients$N0),base=10) # time to reach a population of 6 log calculated with base 10 g$f$intersection(y=10^(3+log10(g$coefficients$N0)), base=NULL) # time to 3 log increase calculated with base NULL g$f$intersection(y=10^6, base=NULL) # time to reach a population of 6 log calculated with base NULL ``` - `g$f$formula()` returning the formula of the object (corresponding to the 4 different models) ```{r formula} g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz") g$f$formula() ``` - `auc()` : A function returning the area under the curve, by default in logarithm. As for `interaction`, the base for calculation can be changed and for linear model, only base `NULL` is available. ```{r auc} g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz") g$f$auc() ``` ##### isValid To verify if the regression went well and coefficients were found, you can write `g$isValid`, that will return TRUE or FALSE. ##### quick overview If you want to access all the data present in the `data`, `call`, `message`, `coefficients`, `f` and `is.valid` elements without having to write them in the console, you can simply write `View(g)` ### The `MicrobialGrowth.create` function {#MicrobialGrowth-create} Another way to create a MicrobialGrowth-object is to use the `MicrobialGrowth.create` function. This can be useful, especially if you have already done the regression work previously and stored the results (N0, ..., lambda), and just want to recreate a plot (without having to redo the regressions). The `MicrobialGrowth.create` function requires the four coefficients `N0`, `Nmax`, `mu` and `lambda`, the `model` as well as a parameter `xlim` in order to simulate data over the given interval. In the case of a linear model, even though their are no N0 and Nmax, numerical values should be provided anyway to keep the same object structure. You can provide for example N0=1 and Nmax=2, the numerical value won't impact the MicroiologicalGrowth-object. ```{r MicrobialGrowth-create-simple} g <- MicrobialGrowth.create(N0 = 0.14, Nmax = 1.43, mu = 0.07, lambda = 45, model = "gompertz", xlim = c(0, 100)) ``` ```{r MicrobialGrowth-create-simple-linear} g <- MicrobialGrowth.create(N0 = 1, Nmax = 2, mu = 1, lambda = 5, model = "linear", xlim = c(0, 10)) ``` This function returns a MicrobialGrowth-object very similar to those obtained with the `MicrobialGrowth` regression function: the structure is the same, as are their uses (`print`, `plot`, etc.). The main difference is that many members have a `NULL` value (`reg`, `clip`, `start`, etc.) There are also some minor differences, notably in the `print` display and the non-availability of the `summary` method (which makes no sense without regression). Here are some comparisons between regression and user-defined MicrobialGrowth-object: ```{r MicrobialGrowth-create-comparison-pretty, eval=FALSE} g.reg <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz") g.custom <- MicrobialGrowth.create(g.reg$coefficients$N0, g.reg$coefficients$Nmax, g.reg$coefficients$mu, g.reg$coefficients$lambda, c(min(g.reg$data$x), max(g.reg$data$x)), model="gompertz") cat("1. Regression\n") print(g.reg) cat("\n2. User-defined\n") print(g.custom) oldpar <- par(mfrow = c(1,2)) plot(g.reg, coefficients.args = list(cex = 0.75), main = "Regression") plot(g.custom, coefficients.args = list(cex = 0.75), main = "User-defined") par(oldpar) ``` ```{r MicrobialGrowth-create-comparison-real-1, echo=FALSE, results='hold'} g.reg <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz") g.custom <- MicrobialGrowth.create(g.reg$coefficients$N0, g.reg$coefficients$Nmax, g.reg$coefficients$mu, g.reg$coefficients$lambda, c(min(g.reg$data$x), max(g.reg$data$x)), model="gompertz") cat("1. Regression\n") print(g.reg) cat("\n2. User-defined\n") print(g.custom) ``` ```{r MicrobialGrowth-create-comparison-real-2, echo=FALSE} g.reg <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz") g.custom <- MicrobialGrowth.create(g.reg$coefficients$N0, g.reg$coefficients$Nmax, g.reg$coefficients$mu, g.reg$coefficients$lambda, c(min(g.reg$data$x), max(g.reg$data$x)), model="gompertz") oldpar <- par(mfrow = c(1,2)) plot(g.reg, coefficients.args = list(cex = 0.75), main = "Regression") plot(g.custom, coefficients.args = list(cex = 0.75), main = "User-defined") par(oldpar) ``` For each of the four coefficients, you can specify from one to three values. Specifying two or three values allows you to simulate confidence intervals. For three values, the minimum and maximum values will be used respectively for the lower and upper bounds of the confidence interval, and the median value as the coefficient. For two values, the minimum and maximum values will be used respectively for the lower and upper bounds of the confidence interval, and the coefficient will be defined as the mean of the two values. For a single value, the coefficient and the bounds of the confidence interval will all be equal. ```{r MicrobialGrowth-create-confint} g <- MicrobialGrowth.create(N0 = 0.14, # Single value Nmax = c(1.41, 1.45), # Two values mu = c(0.06, 0.07, 0.08), # Three values lambda = c(45, 46, 44), # Values will be reordered model = "gompertz", xlim = c(0, 100)) print(g) g$f$confint() ``` ### The `plot` function {#MicrobialGrowth-plot} MicrobialGrowth-objects have their own plotting function available through the generic `plot` function. So you can just write: ```{r plot-simple} g <- MicrobialGrowth(example_data$time, example_data$y1,model="gompertz") plot(g) ``` The `print` function will display the data in y-logged form ($ln(N/N_0)$) together with the curve and the values\* obtained by regression. [\*Values are rounded to $10^{-4}$]{style="color:#AAAAAA"} `MicrobialGrowth` tries to offer you the greatest freedom, especially with the `plot` function which offers a great level of customization. Indeed, you can use the usual graphical parameters such as `pch`, `col`, `xlab`, *etc.* (see the `plot` help) ```{r plot-custom-classic} g <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz") plot(g, pch = 4, cex = 2, col = "blue", xlab = "Time (hours)", main = "Gompertz regression") ``` Additional parameters are provided for plotting MicrobialGrowth-object: - hide the coefficients by setting the `display.coefficients` parameter to `FALSE` - modify the curve obtained by regression via the `reg.args` parameter by specifying any parameter compatible with the `curve` function (see the `curve` help) ```{r plot-custom-reg} g <- MicrobialGrowth(example_data$time, example_data$y1,model="gompertz") plot(g, display.coefficients = FALSE, reg.args = list(col = "green", lty = 2, lwd = 5)) ``` - display of confidence intervals by setting the `display.confint` parameter to TRUE. ```{r plot-confint} g <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz") plot(g, display.confint = TRUE) ``` - customize these confidence intervals with the `confint.args` parameters containing the sublist(s) - `lines` (for the lower and upper curves): takes as value a list containing parameters compatible with the `lines` function (see the `lines` help) - `area` (for the area between the two curves): takes as value a list containing parameters compatible with `polygon` (see the help `polygon`) as well as an `opacity` parameter facilitating the transparency of this area. ```{r plot-confint-custom} g <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz") plot(g, xlim = c(80, 100), ylim = c(1.8, 2.4), # Zoom in to see the example better display.confint = TRUE, confint.args = list( lines = list(col = "purple", lty = 2, lwd = 2), area = list(col = "green", opacity = 0.1) )) ``` - modify the display of the texts (coefficients, title and axis labels) via the `title.args` parameter for the title and the axis labels (These three texts use the `title` function internally) or the `coefficients.args` parameter for displaying the coefficients (which internally uses an `mtext`). ```{r plot-text-custom} g <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz") plot(g, main = "Gompertz", coefficients.args = list(cex = 1.5, side = 4, line = 1), title.args = list(col.main = "blue", col.lab = "red")) ``` - define the number of points that will be plotted with the `n` parameter. It is applied to the curve obtained by regression as well as the curves and the area of the confidence interval. Increase its value if you observe visual anomalies. - as presented for the `MicrobialGrowth.create` function, for each of the four coefficients, you can specify two or three values to simulate confidence intervals. ```{r plot-conf-interval} g <- MicrobialGrowth.create(N0 = 0.14, Nmax = 1.43, mu = c(0.05, 0.08, 0.09), lambda = c(42, 48), model = "gompertz", xlim = c(0, 100)) plot(g, display.confint = TRUE, main = "From MicrobialGrowth.create", sub = "With custom variation on mu and lambda") ``` - it is possible to plot the curve in the desired base (classical (you directly see values of y) or base 10 (corresponding to log10)). By default, the curve is plotted in ln (equivalent to base exp(1)). Display in base 10 can be useful to display growth curves obtained by enumeration. ```{r plot-base} g <- MicrobialGrowth.create(N0 = 0.14, Nmax = 1.43, mu = c(0.05, 0.08, 0.09), lambda = c(42, 48), model = "gompertz", xlim = c(0, 100)) oldpar <- par(mfrow = c(2,2)) plot(g, display.coefficients = FALSE, main = "default base ln") plot(g, base=10, display.coefficients = FALSE, main = "base 10") plot(g, base=NULL, display.coefficients = FALSE, main = "y values") par(oldpar) ``` Note that all of these features are of course available with a user-created MicrobialGrowth object via `MicrobialGrowth.create`. ## How to ... This section brings together different questions and answers to find a solution to your questions more quickly without having to reread this entire document or browse through all the documentation. ### How to regress mutliple data Solution 1: directly by supplying several `y` to the `MicrobialGrowth` function ```{r how-to-multiple-regression} G <- MicrobialGrowth(example_data$time, example_data[2:ncol(example_data)], model="gompertz") #For `y`, we simply exclude the first column of time ``` Solution 2: loop through your different `y` values with a `for` loop ```{r how-to-multiple-regression-forloop} G <- list() for (y in colnames(example_data)[2:ncol(example_data)]) { #For `y`, we simply exclude the first column of time G[[y]] <- MicrobialGrowth(example_data$time, example_data[[y]], model="gompertz") } ``` ### How to deal with a regression failure Solution 1: Modify the lower, upper or start value to improve regression by providing narrowed range of possible values for the coefficients than the ones provided by default (see section `start`, `lower` and `upper`. ```{r how-to-regression-failure1} x = c(0, 1, 2, 3, 4, 6, 8, 9, 10, 11) y = c(1.26e+03, 1.00e+03, 7.94e+02, 1.26e+03, 1.26e+03, 3.16e+04, 7.94e+05, 5.01e+05, 1.26e+06, 1.00e+07) MicrobialGrowth(x,y,lower=list(N0=700)) ``` Solution 2: Use the `nls.args` parameter and the `control = list(warnOnly=TRUE)` subparameter to force return of coefficients (remain vigilant because it provides only a good combination among others, see `nls.args` section). ```{r how-to-force-coefficients-return, results='hold'} x = c(0, 1, 2, 3, 4, 6, 8, 9, 10, 11) y = c(1.26e+03, 1.00e+03, 7.94e+02, 1.26e+03, 1.26e+03, 3.16e+04, 7.94e+05, 5.01e+05, 1.26e+06, 1.00e+07) MicrobialGrowth(x,y,nls.arg=list(control=list(warnOnly=TRUE))) ``` ### How to separate successful from failed regressions You can use the `isValid` member ```{r how-to-separate-regression} G <- MicrobialGrowth(example_data$time, example_data[2:ncol(example_data)], model="gompertz") G.success = Filter(function(x) x$isValid, G) G.failed = Filter(function(x) !x$isValid, G) ``` Be careful, if you use MicrobialGrowth-objects created with `MicrobialGrowth.create`, these will be marked as valid. Likewise, if you force the return of coefficient values despite a regression problem (with the parameter `nls.args = list(control = list(warnOnly = TRUE))`) these will be marked as valid. ### How to export coefficients as data.frame ```{r how-to-coefficients-to-data.frame} df <- as.data.frame(lapply(G, function(x) unlist(x$coefficients))) # t(df) #To swap axes # write.csv(df, "file.csv") #To export as CSV file ``` ### How to apply a theme to all plots You want to modify the visual aspect of the plots but without having to specify a whole bunch of parameters each time ? You can create a wrapper function that will take care of calling the `plot` function for the MicrobialGrowth-objects with the parameters you want. ```{r how-to-theme-plot} my.plot <- function(x, ...) { plot(x, pch = 4, col = colorRampPalette(c("blue", "red"))(length(x$data$x)), reg.args = list(col = "green", lty = 2, lwd = 2), coefficients.args = list(side = 1, line = 4), ...) } # Keep `...` to use some customizations as the main title specific to each MicrobialGrowth-object my.plot(MicrobialGrowth(example_data$time, example_data$y1), main = "Plot with my theme") ```