MicrobialGrowth

Florentin L’Homme1, Clarisse Breard2

2025-03-08

Introduction

MicrobialGrowth calculates metrics with biological meaning to describe microbiological growth curves.

Microbial growth is measured in various biological experiments. Modern means makes it possible to recover great datasets on the evolution of these curves, for example by measuring the optical density of culture broth. These data need to be processed, analyzed and often compared.

In the MicrobialGrowth package, we propose to fit growth curves to various known microbial growth models. All of them allow straight-forward biological interpretation with initial population (\(N_0\)), final population (\(N_{max}\)), growth rate (\(\mu\)) and a lag time (\(\lambda\)) and describe the population size \(N_t\) as a function of the time \(t\). These models are :

\[ln\left(\frac{N_t}{N_0}\right) = ln\left(\frac{N_{max}}{N_0}\right) e^{-e^{\frac{\mu \cdot e}{ln\left(N_{max}/N_0\right)}(\lambda - t) + 1}}\]

\[A(t) = t + \frac{1}{\mu} ln\left(e^{-\mu t} + e^{-\mu \lambda} - e^{-\mu(t+\lambda)}\right)\]

\[N(t) = \mu * (t-\lambda) \]

From your data, MicrobialGrowthfinds the best fitting values of \(N_0\), \(N_{max}\), \(\mu\) and \(\lambda\) for the model you choose. Additional functions allow graphical visualization of the results. This package has been designed to be as free to use as possible.

Install and load the package

You can install the latest released version from CRAN from within R with:

install.packages("MicrobialGrowth")

You can install the latest development version from gitlab with:

# install devtools first if you don't already have the package
if (!("devtools" %in% installed.packages())) install.packages("devtools");

devtools::install_gitlab("florentin-lhomme/public/MicrobialGrowth")

Once installed, you can load the package as usual:

library(MicrobialGrowth)

Input data

example_data

The MicrobialGrowth package provides some example data, to practice and be able to apply the few examples attached. These data are stored in the variable example_data. Below are the first 10 lines of example_data.

time y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12 y13 y14 y15
0.000 0.158 0.171 0.218 0.224 0.161 0.184 0.206 0.185 -0.018 -0.017 -0.017 0.159 0.277 0.218 0.201
0.167 0.165 0.185 0.230 0.208 0.193 0.181 0.198 0.203 -0.018 -0.018 -0.018 0.195 0.208 0.225 0.219
0.333 0.132 0.132 0.199 0.155 0.181 0.137 0.152 0.174 -0.018 -0.018 -0.018 0.180 0.162 0.144 0.181
0.500 0.136 0.138 0.204 0.176 0.166 0.147 0.151 0.153 -0.018 -0.018 -0.018 0.167 0.171 0.160 0.190
0.667 0.132 0.133 0.169 0.161 0.142 0.143 0.157 0.139 -0.018 -0.018 -0.018 0.131 0.154 0.168 0.177
0.833 0.136 0.135 0.158 0.161 0.143 0.149 0.144 0.136 -0.018 -0.017 -0.017 0.131 0.159 0.156 0.176
1.000 0.136 0.134 0.161 0.171 0.135 0.139 0.140 0.140 -0.018 -0.017 -0.017 0.132 0.159 0.160 0.178
1.167 0.137 0.138 0.163 0.174 0.141 0.146 0.146 0.142 -0.018 -0.018 -0.018 0.135 0.173 0.160 0.197
1.333 0.138 0.138 0.147 0.165 0.139 0.141 0.143 0.144 -0.018 -0.018 -0.018 0.135 0.171 0.150 0.177
1.500 0.133 0.134 0.163 0.161 0.137 0.140 0.139 0.139 -0.018 -0.018 -0.018 0.134 0.173 0.161 0.168

The first column corresponds to the time, the following columns correspond to optical density values. From column y1 to y15, you will find examples of curves from the cleanest to the most chaotic (in fact, no growth).

Your own data

You are probably using a software as Excel, LibreOffice Calc, Numbers or a derivative. We recommend you to export your data in CSV format which avoids depending on a library of one of these software. Moreover, using CSV format helps saving memory space (good for the planet!), for our example_data, using the CSV format allowed us saving 20 to 86% of memory space compared to the software mentioned above.

Make sure that your read data are numeric values and that each header is unique. You can use the read.csv(...) function to read your CSV file. You may need to play around with the sep, dec or other parameters to read your file correctly. For example: data <- read.csv("mydata.csv", sep = ";", dec = ",") if you use comma “,” for decimals and semicolon “;” to separate your columns (see the read.csv help for more details). For headers, you should fix the problem directly in your source file. For example, if you have a triplicate of an experiment A, don’t name them all “exp A”, instead name them “exp A.1”, “exp A.2” and “exp A.3” to differentiate them.

Feature overview

The MicrobialGrowth class

The MicrobialGrowth package provides several features that require a structured data. To do this, we have created a class MicrobialGrowth (which we will call MicrobialGrowth-object) which will allow us to call on its members (variables) and methods (functions) useful for the smooth running of other functions. These MicrobialGrowth-objects can currently be obtained by one of the two functions: MicrobialGrowth (regression of given dataset) and MicrobialGrowth.create (user-defined growth parameters).

The structure of a MicrobialGrowth-object is as follows:

The MicrobialGrowth regression function

The main feature of this package is the MicrobialGrowth function, allowing automatical fitting of your data to the 4 models presented ealier: modified Gompertz equation, Baranyi, Rosso and linear.

Simple uses

The MicrobialGrowth function requires three parameters, x which indicates the temporality, y the values to process and model the model to fit (gompertz, rosso, baranyi or linear). Here is an example of use:

g <- MicrobialGrowth(example_data$time, example_data$y1, model = "gompertz")
g
## MicrobialGrowth, model gompertz: 
##          N0        Nmax          mu      lambda 
##  0.13980296  1.42694570  0.06962468 44.68998591

The return of this function is a MicrobialGrowth-object (see section The MicrobialGrowth class). You can therefore display the result directly in the console as above, access one of its members (e.g. the growth rate \(\mu\) from member coefficients) or even plot it (see section The plot function).

It is possible to perform multiple regressions in a single line of code by specifying multiple y-values. The example below shows how to perform a regression on all the data in example_data:

G <- MicrobialGrowth(example_data$time, example_data[2:ncol(example_data)], model="gompertz") #For `y`, we simply exclude the first column of time
G$y1 # Print the regression of y1
## MicrobialGrowth, model gompertz: 
##          N0        Nmax          mu      lambda 
##  0.13980296  1.42694570  0.06962468 44.68998591

Advanced parameters

The basic application of the MicrobialGrowth function, which should be suitable for most datasets, may not work for yours. But there are several parameters you will be able to modify to fit better your data.

Before explaining these parameters, we need to explain the principle of the regression. The aim is to find the combination of coefficients (\(N_0\), \(N_{max}\), \(\mu\) and \(\lambda\)) that fits better, meaning this combination minimizes the distance between the model and experimental data (with the Nonlinear Least Squares function, see the nls help for more details). To do so, the algorithm will test multiple combinations until finding the optimal one. To limit the number of tested combinations (and therefore the calculation time), we use the “port” algorithm. It helps us guide the regression by reducing the ranges of coefficient values in which it has to search. We provide the algorithm with a lower and a upper corresponding to the range of possible values for the four coefficients, as well as a start (contained between lower and upper) ideally close to the final solution (to converge faster). For example, a growth followed by optical density and for 24 h will have a \(N_{max}\) included in [0,2] and a \(\lambda\) included in [0,24].

Some other optional parameters are available for example for debugging purposes, to understand where the error comes from or to bypass it.

These parameters are described below:

  • clip: given the application of the logarithm function, the values of y must be strictly positive. Therefore, we clip these values between the smallest y-value greater than zero and infinity.
  • start, lower and upper: these parameters help guiding the regression and minimizing calculation time. Default values are calculated as described in section “start, lower and upper” below.
  • callbackError: modifies the behavior of the MicrobialGrowth function in the event of regression failure. By default, regression failure is not blocking (no error raised) and returns a MicrobialGrowth-object with NULL values.
  • nls.args: modifies the parameters used when calling the nls function (which is inside the MicrobialGrowth function; see the nls help for more details). Here are some uses of the advanced parameters of the nls function:
    • the weights parameter which allows you to apply a larger weight to some of your data during regression
    • the trace parameter to debug
    • the control = list(warnOnly=TRUE) parameter to force a return value despite the regression failure
clip

By default, \(clip = (min(y), +\infty) \mbox{ with } y > 0\), but you might want to change it. Example: if the input is y = c(-0.5, -0.25, 0, 0.25, 0.5, 0.75, 1), it will by default be transformed into y = c(0.25, 0.25, 0.25, 0.25, 0.5, 0.75, 1). By applying the parameter clip = c(0.1, Inf), we would have y = c(0.1, 0.1, 0.1, 0.25, 0.5, 0.75, 1).

Be careful, the clip parameter (set automatically or by yourself) can have a more or less strong influence on the return values of the regression.

For example, the data series y11 has negative values. With the default clipping value (set automatically at 0.002) a growth rate \(\mu\) of 0.2488 is returned. However, when fixing the clipping value at 0.001, a growth rate \(\mu\) of 0.3 is returned.

data.frame(cbind(
  default = unlist(MicrobialGrowth(example_data$time, example_data$y11,model="gompertz")$coefficients), # clip equivalent to c(0.002, Inf)
  custom = unlist(MicrobialGrowth(example_data$time, example_data$y11, model="gompertz", clip = c(0.001, Inf))$coefficients)
))
##        default  custom
## N0      0.0017  0.0008
## Nmax    1.0169  0.9708
## mu      0.2488  0.3000
## lambda 15.5046 15.2904

This sensitivity will depend on your data. To be as comparable as possible, try to choose a global clipping value (applied to all regressions) that matches your data. Ideally, pre-process your data upstream to avoid the use of clipping.

start, lower and upper

By default, these coefficients are calculated according to the following table for gompertz, rosso and baranyi models.

default lower default upper default start
N0 \(\small \frac{1}{\mbox{largeValue}} \approx 0^+\) \(\small exp((log(Y_{max})-log(Y_{min})) \times 0.2+log(Y_{min}))\) \(\small exp(\)average of values between \(\small log(Y_{min})\) and \(\small (log(Y_{max})-log(Y_{min})) \times 0.2)\)
Nmax \(\small \exp((log(Y_{max}) -log(Y_{min})) \times 0.8+log(Y_{min}))\) \(\small 2 \times Y_{max}\) \(\small exp(\)average of values between \(\small (log(Y_{max}) -log(Y_{min}))\times0.8\) and \(\small log(Y_{max}))\)
mu \(\small \frac{log(Y_{max}) -log(Y_{min})}{max(X)-min(X)}\ \) Maximum slope between two adjacent points slope of \(\small log(Y)\) during growth phase
lambda \(\small 0\) \(\small max(X)\) First value above \(\small startN0\)

For linear model, only values lambda and mu must be estimated, mu was estimated as above except on non transformed values. For lambda, both start and upper value were defined as the first value for which an increase is observed.

But you may need to change them. When called in the function, these coefficients are presented as named lists, that can include our four coefficients. However, when specifying one of the parameter value (e.g. the N0 start value), you are not obliged to specify the others. If not specified, they will keep the default value.

The following example shows the change of all the starting coefficients as well as the lower bound for the lambda parameter:

MicrobialGrowth(example_data$time, example_data$y1, model="gompertz", start = list(N0=0.1, Nmax=2, mu=0.05, lambda=40), lower = list(lambda = 40))
## MicrobialGrowth, model gompertz: 
##          N0        Nmax          mu      lambda 
##  0.13980297  1.42694565  0.06962468 44.68998739
callbackError

A regression can go wrong. By default, the behavior of the MicrobialGrowth function is to remain silent and return a MicrobialGrowth-object; in this case with NULL values for most members. You can make verbose problems occur using the callbackError parameter.

The callbackError parameter requires a pointer to a function (i.e. the variable containing the function, without the parentheses) which will take the error as input. For example, you can use the print function to display the error without making it blocking, or on the contrary use the stop function to block your script at the slightest error.

Note: you should use the warning function instead of print to get a more suitable display (colored red, more succinct error; the print function also displays the function call concerned; etc.)

g <- MicrobialGrowth(c(1,2,3,4,5),c(1,1,1,1,1), model="gompertz", callbackError = warning) # Warning
## Error in numericDeriv(form[[3L]], names(ind), env, dir = -1 + 2 * (internalPars < : Valeur manquante ou infinie obtenue au cours du calcul du modèle
g <- MicrobialGrowth(c(1,2,3,4,5),c(1,1,1,1,1), model="gompertz", callbackError = stop) # Stop by re-raising the error
## Error in numericDeriv(form[[3L]], names(ind), env, dir = -1 + 2 * (internalPars < : Valeur manquante ou infinie obtenue au cours du calcul du modèle

Note that when using multiple data regression, simply using the print function causes you to lose the information about the data series that caused the error. You have two solutions:

  1. use a classic for loop:
myprint <- function(column.name) { function(x) { cat(paste("Error on", column.name, ":", x$message)) } }

for (y in colnames(example_data)[2:ncol(example_data)]) {
    g <- MicrobialGrowth(example_data$time, example_data[[y]], model="baranyi", callbackError = myprint(y))
}
## Error on y9 : NA/NaN/Inf dans un appel à une fonction externe (argument 1)
  1. use advanced functions:
myprint <- function(x) { cat(paste("Error on", names(substitute(X, sys.frame(2)))[substitute(i, sys.frame(2))], ":", x$message)) }
G <- MicrobialGrowth(example_data$time, example_data[2:ncol(example_data)], model="baranyi", callbackError = myprint)

The first is more recommended since you control what you do (unlike the second which depends on calling functions). The second solution can still be useful if you have already created a script that regresses multiple data in this way rather than in a traditional for loop.

nls.args

Finally, you can use the nls.args parameter to specify parameters to use in the nls function (see the nls help); except formula, algorithm, start, lower and upper which are hard-coded in the MicrobialGrowth function. For example, if you want to give more weight to certain points in your data, you can use the weights parameter:

cat("Normal:\n")
print(unlist(MicrobialGrowth(example_data$time, example_data$y1, model="gompertz")$coefficients))

# More weight on the steep part (~mu) to the detriment of the other points/parameters.
cat("\nWeighted (×10 for x∈[50, 70]):\n")
print(unlist(MicrobialGrowth(example_data$time, example_data$y1, model="gompertz", nls.args = list(weights = (function(x){(x >= 50 & x <= 70)*9 + 1})(example_data$time)))$coefficients))
## Normal:
##          N0        Nmax          mu      lambda 
##  0.13980296  1.42694570  0.06962468 44.68998591 
## 
## Weighted (×10 for x∈[50, 70]):
##          N0        Nmax          mu      lambda 
##  0.14104750  1.42427062  0.07255779 45.71461731

In some cases, the regression will fail to converge properly, which means the algorithm is unable to find a good combination of the coefficients (\(N_0\), \(N_{max}\), \(\mu\) and \(\lambda\)) that minimizes the distance between the model and the data. It can happen when there are few growth measurements. In this case, the control = list(warnOnly = TRUE) parameter can be used, that returns a combination of parameters that is satisfying but that is not the result of the complete convergence and therefore not the optimized combination. This result “might be useful for further convergence analysis, but not for inference.” [https://rdrr.io/r/stats/nls.html]. It can therefore be used to help narrow the range of possible values to test.

x = c(0, 1, 2, 3, 4, 6, 8, 9, 10, 11)
y = c(1.26e+03, 1.00e+03, 7.94e+02, 1.26e+03, 1.26e+03, 3.16e+04, 7.94e+05, 5.01e+05, 1.26e+06, 1.00e+07)

cat("Failure case:\n")
print(unlist(MicrobialGrowth(x, y, model="gompertz")$coefficients))

cat("\nFailure case (bypassed):\n")
print(unlist(suppressWarnings(MicrobialGrowth(x, y, model="gompertz", nls.args = list(control = list(warnOnly = TRUE)))$coefficients)))

cat("\nUsing narrowed lower, start and upper values based on bypassed results:\n")
print(unlist(MicrobialGrowth(x, y, model="gompertz",lower=list(N0=9E+2,Nmax=1E7,mu=1.25,lambda=3.5),start = list(N0=9.5E+2, Nmax=1.3E+7, mu=1.5, lambda=3.9),upper=list(N0=1E+4,Nmax=1.5E7,mu=1.75,lambda=4))$coefficients))
## Failure case:
##     N0   Nmax     mu lambda 
##     NA     NA     NA     NA 
## 
## Failure case (bypassed):
##           N0         Nmax           mu       lambda 
## 9.488826e+02 1.330271e+07 1.522831e+00 3.871129e+00 
## 
## Using narrowed lower, start and upper values based on bypassed results:
##           N0         Nmax           mu       lambda 
## 9.488771e+02 1.330431e+07 1.522814e+00 3.871108e+00

Results interpretation

summary function

The main results from the MicrobialGrowth function are the estimated values of the coefficients. Those coefficients are displayed once you run the regression but you can access them later with the summary function if you registered the regression in a variable.

g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz")
summary(g)
## 
## Formula: log(y) ~ log(N0) + (log(Nmax) - log(N0)) * exp(-exp(((mu * exp(1))/(log(Nmax) - 
##     log(N0))) * (lambda - x) + 1))
## 
## Parameters:
##         Estimate Std. Error t value Pr(>|t|)    
## N0     1.398e-01  3.207e-04   435.9   <2e-16 ***
## Nmax   1.427e+00  8.336e-03   171.2   <2e-16 ***
## mu     6.962e-02  4.071e-04   171.0   <2e-16 ***
## lambda 4.469e+01  1.028e-01   434.9   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03534 on 601 degrees of freedom
## 
## Algorithm "port", convergence message: relative convergence (4)

In addition to those coefficients, their p-value is available. It can be useful to verify the validity of your regression. Indeed even though coefficients can be estimated, they might not be all significant, meaning the model isn’t valid. It is especially the case for data with no growth.

In the example_data the y15 curve seem to present a very slight beginning of growth but taking into account an error of DO reading we cannot consider their was growth. When realizing the regression, coefficients can be estimated but when looking at the p-value, lambda coefficient does not show significance.

plot(example_data$time,example_data$y15)  # slight increase of DO after time 60

g = MicrobialGrowth(example_data$time,example_data$y15, model="gompertz")
summary(g)  # regression successful but with no significatvity on all coefficients
## 
## Formula: log(y) ~ log(N0) + (log(Nmax) - log(N0)) * exp(-exp(((mu * exp(1))/(log(Nmax) - 
##     log(N0))) * (lambda - x) + 1))
## 
## Parameters:
##         Estimate Std. Error t value Pr(>|t|)    
## N0     1.677e-01  6.801e-03  24.664  < 2e-16 ***
## Nmax   2.052e-01  2.929e-03  70.034  < 2e-16 ***
## mu     3.243e-03  4.271e-04   7.594 1.19e-13 ***
## lambda 0.000e+00  1.493e+01   0.000        1    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.08399 on 601 degrees of freedom
## 
## Algorithm "port", convergence message: relative convergence (4)

To avoid this problem we recommend you to first make a preselection on your data to select those for which a growth is observed, for example with a difference of at least 0.05 unit of DO is observed between the first and last points of the curve.

coefficients

Estimated coefficients with the regression can also be recovered using the coefficients element.

g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz")
g$coefficients     # returns the four parameters
## $N0
## [1] 0.139803
## 
## $Nmax
## [1] 1.426946
## 
## $mu
## [1] 0.06962468
## 
## $lambda
## [1] 44.68999
g$coefficients$mu  # returns only mu
## [1] 0.06962468
data

You can recover the clipped data that was used for the regression by looking at the data element of the regression, as shown below.

g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz")
head(g$data$x)   # we only show the first values of the x data with head
## [1] 0.0000000 0.1666667 0.3333333 0.5000000 0.6666667 0.8333333
head(g$data$y)
## [1] 0.158 0.165 0.132 0.136 0.132 0.136
call

With the call element of the regression you can access the various parameters used during the regression: - g$call$x and g$call$y provide you with the data used - g$call$clip provides you with the chosen value the the clip - g$call$start, g$call$lower, g$call$upper provide you with the chosen start, lower and upper values - g$call$nls.args provides you with the various g$call$nls.args parameters you implemented.

message

Writing g$message returns the error message if applicable.

f

With the f element of the regression you can access the following functions: - g$f$confint() returning the confidence intervals of each estimated coefficient. By default, confidence level is at 0.95 but you can specify the one you want.

g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz")
g$f$confint()   # by default level = 0.95
##              2.5 %      97.5 %
## N0      0.13917312  0.14043281
## Nmax    1.41057360  1.44331780
## mu      0.06882522  0.07042413
## lambda 44.48819246 44.89177936
g$f$confint(level=0.99)   
##              0.5 %      99.5 %
## N0      0.13897424  0.14063169
## Nmax    1.40540405  1.44848736
## mu      0.06857279  0.07067656
## lambda 44.42447536 44.95549646
  • g$f$confint.lower() and g$f$confint.upper() returning the lower and upper limits of the confidence band (see The plot function section to visualize graphically the confidence band with the display.confint = TRUE parameter of the plot)
  • doublingTime(level) : returning the doubling time calculated with mu as well as the time at which population is first doubled. By default, confidence level is at 0.95 but you can specify the one you want.
g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz")
g$f$doublingTime()   # by default level = 0.95
##                  2.5 %     97.5 %
## doublingTime  10.07112   9.842467
## start        120.57091 111.075169
g$f$doublingTime(level=0.99)   
##                 0.5 %     99.5 %
## doublingTime  10.1082   9.807313
## start        122.9708 110.047652
  • intersection() returning, for a given x, the value y in logarithm form (by default), or the x value to reach a given y value in logarithm form. Base can be set to NULL for results not in logarithmic form or to 10 for results in base log10. It should be taken into account that formula used under base 10 or 2 correspond to log(N/N0), while in base NULL, it is N. For linear model, only base NULL is available.
x = c(0, 1, 2, 3, 4, 6, 8, 9, 10, 11)
y = c(1000, 6500, 62000, 32000, 320000, 114000000, 300000000, 1600000000, 380000000, 350000000)
g = MicrobialGrowth(x,y, model="gompertz")
g$f$intersection(y=3,base=10)  # time to 3 log increase calculated with base 10
## [1] 4.473836
g$f$intersection(y=6-log10(g$coefficients$N0),base=10)   # time to reach a population of 6 log calculated with base 10
## [1] 4.050293
g$f$intersection(y=10^(3+log10(g$coefficients$N0)), base=NULL)    # time to 3 log increase calculated with base NULL
## [1] 4.473836
g$f$intersection(y=10^6, base=NULL)   # time to reach a population of 6 log calculated with base NULL
## [1] 4.050293
  • g$f$formula() returning the formula of the object (corresponding to the 4 different models)
g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz")
g$f$formula()   
## function (x) 
## {
##     A = log(Nmax/N0)
##     return({
##         (A * exp(-exp((mu * exp(1)/A) * (lambda - x) + 1))) * 
##             log(exp(1), base)
##     })
## }
## <bytecode: 0x558c5788fd98>
## <environment: 0x558c578973f8>
  • auc() : A function returning the area under the curve, by default in logarithm. As for interaction, the base for calculation can be changed and for linear model, only base NULL is available.
g = MicrobialGrowth(example_data$time,example_data$y1, model="gompertz")
g$f$auc()  
## 85.86874 with absolute error < 8e-05
isValid

To verify if the regression went well and coefficients were found, you can write g$isValid, that will return TRUE or FALSE.

quick overview

If you want to access all the data present in the data, call, message, coefficients, f and is.valid elements without having to write them in the console, you can simply write View(g)

The MicrobialGrowth.create function

Another way to create a MicrobialGrowth-object is to use the MicrobialGrowth.create function. This can be useful, especially if you have already done the regression work previously and stored the results (N0, …, lambda), and just want to recreate a plot (without having to redo the regressions).

The MicrobialGrowth.create function requires the four coefficients N0, Nmax, mu and lambda, the model as well as a parameter xlim in order to simulate data over the given interval. In the case of a linear model, even though their are no N0 and Nmax, numerical values should be provided anyway to keep the same object structure. You can provide for example N0=1 and Nmax=2, the numerical value won’t impact the MicroiologicalGrowth-object.

g <- MicrobialGrowth.create(N0 = 0.14, Nmax = 1.43, mu = 0.07, lambda = 45, model = "gompertz", xlim = c(0, 100))
g <- MicrobialGrowth.create(N0 = 1, Nmax = 2, mu = 1, lambda = 5, model = "linear", xlim = c(0, 10))

This function returns a MicrobialGrowth-object very similar to those obtained with the MicrobialGrowth regression function: the structure is the same, as are their uses (print, plot, etc.). The main difference is that many members have a NULL value (reg, clip, start, etc.) There are also some minor differences, notably in the print display and the non-availability of the summary method (which makes no sense without regression).

Here are some comparisons between regression and user-defined MicrobialGrowth-object:

g.reg <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz")
g.custom <- MicrobialGrowth.create(g.reg$coefficients$N0, g.reg$coefficients$Nmax, g.reg$coefficients$mu, g.reg$coefficients$lambda, c(min(g.reg$data$x), max(g.reg$data$x)), model="gompertz")

cat("1. Regression\n")
print(g.reg)
cat("\n2. User-defined\n")
print(g.custom)

oldpar <- par(mfrow = c(1,2))
plot(g.reg, coefficients.args = list(cex = 0.75), main = "Regression")
plot(g.custom, coefficients.args = list(cex = 0.75), main = "User-defined")
par(oldpar)
## 1. Regression
## MicrobialGrowth, model gompertz: 
##          N0        Nmax          mu      lambda 
##  0.13980296  1.42694570  0.06962468 44.68998591 
## 
## 2. User-defined
## MicrobialGrowth, model gompertz: 
##          N0        Nmax          mu      lambda 
##  0.13980296  1.42694570  0.06962468 44.68998591

For each of the four coefficients, you can specify from one to three values. Specifying two or three values allows you to simulate confidence intervals. For three values, the minimum and maximum values will be used respectively for the lower and upper bounds of the confidence interval, and the median value as the coefficient. For two values, the minimum and maximum values will be used respectively for the lower and upper bounds of the confidence interval, and the coefficient will be defined as the mean of the two values. For a single value, the coefficient and the bounds of the confidence interval will all be equal.

g <- MicrobialGrowth.create(N0 = 0.14,   # Single value
                     Nmax = c(1.41, 1.45),     # Two values
                     mu = c(0.06, 0.07, 0.08), # Three values
                     lambda = c(45, 46, 44),   # Values will be reordered
                     model = "gompertz",
                     xlim = c(0, 100))
print(g)
## MicrobialGrowth, model gompertz: 
##     N0   Nmax     mu lambda 
##   0.14   1.43   0.07  45.00
g$f$confint()
##          min   max
## N0      0.14  0.14
## Nmax    1.41  1.45
## mu      0.06  0.08
## lambda 44.00 46.00

The plot function

MicrobialGrowth-objects have their own plotting function available through the generic plot function. So you can just write:

g <- MicrobialGrowth(example_data$time, example_data$y1,model="gompertz")
plot(g)

The print function will display the data in y-logged form (\(ln(N/N_0)\)) together with the curve and the values* obtained by regression. *Values are rounded to \(10^{-4}\)

MicrobialGrowth tries to offer you the greatest freedom, especially with the plot function which offers a great level of customization. Indeed, you can use the usual graphical parameters such as pch, col, xlab, etc. (see the plot help)

g <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz")
plot(g, pch = 4, cex = 2, col = "blue", xlab = "Time (hours)", main = "Gompertz regression")

Additional parameters are provided for plotting MicrobialGrowth-object: - hide the coefficients by setting the display.coefficients parameter to FALSE - modify the curve obtained by regression via the reg.args parameter by specifying any parameter compatible with the curve function (see the curve help)

g <- MicrobialGrowth(example_data$time, example_data$y1,model="gompertz")
plot(g, display.coefficients = FALSE, reg.args = list(col = "green", lty = 2, lwd = 5))

g <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz")
plot(g, display.confint = TRUE)

g <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz")
plot(g, xlim = c(80, 100), ylim = c(1.8, 2.4), # Zoom in to see the example better 
     display.confint = TRUE, 
     confint.args = list(
         lines = list(col = "purple", lty = 2, lwd = 2), 
         area = list(col = "green", opacity = 0.1)
     ))

g <- MicrobialGrowth(example_data$time, example_data$y1, model="gompertz")
plot(g, main = "Gompertz", 
     coefficients.args = list(cex = 1.5, side = 4, line = 1), 
     title.args = list(col.main = "blue", col.lab = "red"))

g <- MicrobialGrowth.create(N0 = 0.14, Nmax = 1.43, mu = c(0.05, 0.08, 0.09), lambda = c(42, 48), model = "gompertz", xlim = c(0, 100))
plot(g, display.confint = TRUE, main = "From MicrobialGrowth.create", sub = "With custom variation on mu and lambda")

g <- MicrobialGrowth.create(N0 = 0.14, Nmax = 1.43, mu = c(0.05, 0.08, 0.09), lambda = c(42, 48), model = "gompertz", xlim = c(0, 100))
oldpar <- par(mfrow = c(2,2))
plot(g, display.coefficients = FALSE, main = "default base ln")
plot(g, base=10, display.coefficients = FALSE, main = "base 10")
plot(g, base=NULL, display.coefficients = FALSE, main = "y values")
par(oldpar)

Note that all of these features are of course available with a user-created MicrobialGrowth object via MicrobialGrowth.create.

How to …

This section brings together different questions and answers to find a solution to your questions more quickly without having to reread this entire document or browse through all the documentation.

How to regress mutliple data

Solution 1: directly by supplying several y to the MicrobialGrowth function

G <- MicrobialGrowth(example_data$time, example_data[2:ncol(example_data)], model="gompertz") #For `y`, we simply exclude the first column of time

Solution 2: loop through your different y values with a for loop

G <- list()
for (y in colnames(example_data)[2:ncol(example_data)]) { #For `y`, we simply exclude the first column of time
    G[[y]] <- MicrobialGrowth(example_data$time, example_data[[y]], model="gompertz")
}

How to deal with a regression failure

Solution 1: Modify the lower, upper or start value to improve regression by providing narrowed range of possible values for the coefficients than the ones provided by default (see section start, lower and upper.

x = c(0, 1, 2, 3, 4, 6, 8, 9, 10, 11)
y = c(1.26e+03, 1.00e+03, 7.94e+02, 1.26e+03, 1.26e+03, 3.16e+04, 7.94e+05, 5.01e+05, 1.26e+06, 1.00e+07)
MicrobialGrowth(x,y,lower=list(N0=700))
## MicrobialGrowth, model gompertz: 
##           N0         Nmax           mu       lambda 
## 9.488770e+02 1.330430e+07 1.522814e+00 3.871108e+00

Solution 2: Use the nls.args parameter and the control = list(warnOnly=TRUE) subparameter to force return of coefficients (remain vigilant because it provides only a good combination among others, see nls.args section).

x = c(0, 1, 2, 3, 4, 6, 8, 9, 10, 11)
y = c(1.26e+03, 1.00e+03, 7.94e+02, 1.26e+03, 1.26e+03, 3.16e+04, 7.94e+05, 5.01e+05, 1.26e+06, 1.00e+07)
MicrobialGrowth(x,y,nls.arg=list(control=list(warnOnly=TRUE)))
## Warning in (function (formula, data = parent.frame(), start, control =
## nls.control(), : Convergence failure: singular convergence (7)
## MicrobialGrowth, model gompertz: 
##           N0         Nmax           mu       lambda 
## 9.488826e+02 1.330271e+07 1.522831e+00 3.871129e+00

How to separate successful from failed regressions

You can use the isValid member

G <- MicrobialGrowth(example_data$time, example_data[2:ncol(example_data)], model="gompertz")
G.success = Filter(function(x) x$isValid, G)
G.failed = Filter(function(x) !x$isValid, G)

Be careful, if you use MicrobialGrowth-objects created with MicrobialGrowth.create, these will be marked as valid. Likewise, if you force the return of coefficient values despite a regression problem (with the parameter nls.args = list(control = list(warnOnly = TRUE))) these will be marked as valid.

How to export coefficients as data.frame

df <- as.data.frame(lapply(G, function(x) unlist(x$coefficients)))
# t(df) #To swap axes
# write.csv(df, "file.csv") #To export as CSV file

How to apply a theme to all plots

You want to modify the visual aspect of the plots but without having to specify a whole bunch of parameters each time ?

You can create a wrapper function that will take care of calling the plot function for the MicrobialGrowth-objects with the parameters you want.

my.plot <- function(x, ...) {
  plot(x, pch = 4, col = colorRampPalette(c("blue", "red"))(length(x$data$x)), reg.args = list(col = "green", lty = 2, lwd = 2), coefficients.args = list(side = 1, line = 4), ...)
}

# Keep `...` to use some customizations as the main title specific to each MicrobialGrowth-object
my.plot(MicrobialGrowth(example_data$time, example_data$y1), main = "Plot with my theme")


  1. ↩︎

  2. INRAE, ↩︎