Hacking the waveRAPID Data Fitting

Author
Affiliations

Eric Largy

ARNA, INSERM U1212, CNRS UMR 5320, Université de Bordeaux

UFR des Sciences Pharmaceutiques, Université de Bordeaux

Published

March 18, 2026

Introduction

This document provides an overview of our current efforts in fitting “waveRAPID” data using the direct estimation method described in a 2020 patent (Fievet and Cottier 2020), and used in the publication by Kartal and colleagues (Kartal et al. 2021). The workflow includes the generation of fake experimental data, smoothing, derivation, initial parameter estimation, nonlinear optimization, and bootstrapping for confidence interval determination.

The goal is to estimate kinetic parameters (\(k_a\), \(k_d\), \(R_{\text{max}}\)) from potentially noisy sensorgram data with a non-commercial approach, while assessing its robustness and reproducibility.

Fake experimental data generation

To generate fake, but realistic, experimental data, we first created an Excel file gathering the pulse presets available on the commercial system. The presets are defined for weak, intermediate and strong binders, and include the duration of each pulse, inter-dissociation time, last dissociation time, total time of the analysis, and acquisition frequency. To avoid typing mistakes, the total time of the analysis is actually re-calculated from the pulse durations and dissociation times prior to generating the fake data. A custom preset column was also included to allow for testing of custom pulse profiles.

Code
librarian::shelf(readxl, data.table, DT)

presets <- readxl::read_excel("R/pulse_presets.xlsx") |> as.data.table()
DT::datatable(presets, options = list(pageLength = 10, scrollX = TRUE))
Table 1: Pulse presets used for simulating waveRAPID data, including pulse durations, inter-dissociation times, last dissociation times, and total analysis times for weak, intermediate, strong binders, and a custom preset.
Code
# Function to determine the current pulse
source("R/sim_waveRAPID.R")

# True parameters used for data simulation 1
ka <- 2710          # Association rate constant (M^-1 s^-1)
kd <- 0.278         # Dissociation rate constant (s^-1)
Rmax <- 14.77       # Maximum response (RU)
conc <- 100E-6      # Analyte concentration (M)
noise_sd <- 0.01   # Standard deviation of noise
cutoff_time <- 50   # Time cutoff for simulation (s)

# Simulate waveRAPID data using the defined parameters and the intermediate pulse preset,
# with a time cutoff of 50 seconds to focus on the most informative part of the response, 
# and a random seed for reproducibility. 
# The noise standard deviation is set to mimic realistic experimental conditions while allowing for parameter estimation.
sim <- simulate_waveRAPID(
  true_ka = ka,
  true_kd = kd,
  true_Rmax = Rmax,
  pulse_hgt = conc,
  preset    = "intermediate",
  time_cutoff = cutoff_time,
  seed      = 42,
  noise_sd  = noise_sd,
  file = "R/pulse_presets.xlsx"
)

Fake experimental data was generated by simulating the response of a Langmuir binding model to a waveRAPID concentration profile. As a reminder, the concentration profile consists of multiple pulses of same concentration of analyte but increasing durations. The response is generated by solving the corresponding ordinary differential equation (ODE) with added noise to mimic experimental conditions.

Here, we developed an R function coined sim_waveRAPID, which i/ builds the pulse profile and concentration function \(c(t)\), then ii/ models the binding kinetics using the Langmuir model (Equation 1):

\[\frac{dR}{dt} = k_a \cdot c(t) \cdot (R_{\text{max}} - R) - k_d \cdot R \tag{1}\]

In this example, the parameters were set to \(c(t)=\) \(10^{-4}\), \(k_a =\) 2710, \(k_d =\) 0.278, and \(R_{\text{max}} =\) 14.77. The pulse starts and ends are those from the intermediate preset (Table 1). The signal was “collected” every 10 seconds over a total duration of 50 seconds (shorter than in the preset to focus on the most informative part of the response).

The ODE was solved using the ode function from the deSolve package, which provides a numerical solution for the response \(R(t)\) over time. The resulting response was then stored in a data table, and random noise was added with a normal distribution of standard deviation 0.01 to create a more realistic dataset for parameter estimation. The resulting kinetic data, presented in Figure 1, replicate faithfully the first example of Figure 4 from (Kartal et al. 2021).

Code
sim_plot(sim) #plot simulated data
Figure 1: Simulated waveRAPID data showing the observed response (with noise) and the concentration profile over time. Grey ribbons indicate the association phases (when the analyte is present), following the pulses shown in the panel below.

Fitting of the fake experimental data

The following calculation and plotting were performed using the fit_waveRAPID function, which implements the smoothing and derivative estimation of data (Section 3.1), direct estimation method for waveRAPID data (Section 3.2), as well as nonlinear optimization for parameter refinement (Section 3.3). The function takes the simulated data as input, along with options for smoothing and regularization, and returns a fitted model object containing the estimated parameters and various diagnostic plots. The function is structured to allow for flexibility in the fitting process, including the choice of optimization method (OLS vs WLS) and the ability to perform bootstrapping for confidence interval estimation, using boot_waveRAPID (Section 3.4.2).

Data Smoothing and Derivative Estimation

Code
source("R/fit_waveRAPID.R") #function to fit

# Setup
smooth_value <- 0.01 # Smoothing parameter for the smoothing spline, set to a small value to allow for a close fit to the data while reducing noise-induced artifacts in the derivative estimation.
lbd <- 0   # Regularization parameter for the direct estimation method, set to 0 for no regularization in this initial fitting step. This can be tuned based on the level of noise and the desired balance between bias and variance in the parameter estimates.

fit <- fit_waveRAPID(
  sim_data = sim,
  do_optimization = FALSE,
  smooth_param = smooth_value,
  lbd = lbd
)

To estimate the kinetic parameters from the noisy response data, we first need to smooth the observed response and compute its derivative. The derivative will be used in the direct estimation method to construct the observation matrix and vector for parameter estimation. Noisy data can lead to artifacts in this derivative, so smoothing is essential to obtain reliable estimates.

Here, we use a smoothing spline to fit the observed response_data and then compute the derivative of the smoothed response dR_dt (Figure 2).

Code
fit_plot(fit, "deriv")
Figure 2: Smoothing spline fit to the observed response data and its derivative.

Initial Parameter Estimation

Observation Matrix and Vector Construction

The direct estimation method for waveRAPID data involves constructing an observation matrix \(X\) and vector \(b\) from the smoothed response and its derivative. This approach is derived from the linearization of the Langmuir binding model, as described in the patent (Fievet and Cottier 2020). The observation matrix \(X\) is constructed as:

\[ X = \begin{bmatrix} -c(t) & c(t) \cdot \hat{R}(t) & \hat{R}(t) \end{bmatrix} \tag{2}\]

or more explicitely:

\[ X = \begin{bmatrix} -c_1 & c_1 \cdot \hat{R}_1 & \hat{R}_1 \\ \vdots & \vdots & \vdots \\ -c_n & c_n \cdot \hat{R}_n & \hat{R}_n \end{bmatrix} \tag{3}\]

where:

  • \(c(t)\) is the concentration profile,
  • \(\hat{R}(t)\) is the smoothed response,
  • \(t\) is the time index, from 1 to n.

The observation vector \(b\) is simply the derivative of the smoothed response \(\frac{d\hat{R}}{dt}\): \[ b = \begin{bmatrix} (\frac{d\hat{R}}{dt})_1 \\ \vdots \\ (\frac{d\hat{R}}{dt})_n \end{bmatrix} \tag{4}\]

Parameter Estimation with Tikhonov Regularization

The intermediate parameters \(\theta = [\theta_1, \theta_2, \theta_3]^T\) are estimated by solving the regularized linear system:

\[ \hat{\theta} = -(X^T X + \Lambda I)^{-1} X^T b \tag{5}\]

where \(\Lambda\) is a regularization parameter to stabilize the solution. This follows the principle of Tikhonov regularization (also called Ridge regression), which is commonly used to estimate coefficients in multiple-regression problems, especially when the variables are highly correlated.

Here \(\Lambda\) is a diagonal matrix with \(\lambda\) on the diagonal. This regularization term may help to prevent overfitting and ensure that the parameter estimates are stable, especially when the observation matrix \(X\) is ill-conditioned or when there is significant noise in the data. \(\lambda\) (and therefore \(\Lambda\)) can be tuned based on the level of noise and the desired balance between bias and variance in the parameter estimates. It was here kept at 0.

The kinetic parameters are then recovered as:

\[ k_a = \theta_2, \quad k_d = \theta_3, \quad R_{\text{max}} = \frac{\theta_1}{\theta_2} \tag{6}\]

Here, we found the following values for the kinetic parameters:

\(k_a =\) 1498, \(k_d =\) 0.2583, and \(R_{\text{max}} =\) 17.02.

Note that R was artifically decreased to 80% of its estimated value, as more accurate estimations produced issues with the optimization step, below. Hence, the corresponding simulation of the response using the ODE model does not match the experimental response intensity (Figure 3). However, the kinetic parameters are close to the true parameters, and the shape of the response is reasonably well captured, which is sufficient for the optimization step.

Code
fit_plot(fit, "init_estim")
Figure 3: Comparison of the fake observed response with the response predicted using the initial parameter estimates obtained from the direct estimation method with regularization.

Parameter Optimization by Ordinary Least Squares (OLS) and Weighted Least Squares (WLS) Fitting

Objective Function Definition and Optimization Setup

Code
fit <- fit_waveRAPID(
  sim_data = sim,
  do_optimization = TRUE,
  smooth_param = smooth_value,
  lbd = lbd
)

To further refine the parameter estimates obtained from the direct estimation method, we can perform nonlinear optimization using ordinary least squares (OLS) or weighted least squares (WLS) fitting. The objective function is defined as the sum of squared residuals (SSR) between the observed response and the predicted response from the ODE simulation.

\[ SSR_{\text{OLS}} = \sum_{i=1}^{n} (R_{\text{obs}, i} - R_{\text{pred}, i})^2 \tag{7}\]

Here, we chose to optimize the parameters in log-space to ensure positivity and to handle the wide range of parameter values effectively during optimization. Hence, the log_objective_function takes the log-transformed parameters, simulates the response using the ODE model, and computes the SSR. The ODE is solved with the lsoda method, which automatically handles stiff and non-stiff problems, and we set tolerances and maximum steps to ensure a balance between accuracy and computational efficiency. We leveraged the R implementation included in the deSolve package (Soetaert, Petzoldt, and Setzer 2010), which interfaces with the underlying Fortran code (Petzold 1983).

If the dissoc_only flag is set to TRUE, the optimization switches to WLS by by assigning zero weights to the association phases in the SSR calculation, and therefore focuses on dissociation data only. This is suggested in the original publication to avoid RI disturbance (Kartal et al. 2021), so that misfits in the association parts of the pulses do not impact the parameter estimation. Concretely, we defined association and dissociation phases based on the concentration profile, where association phases correspond to times when \(c(t)\) is greater than zero and dissociation phases correspond to times when \(c(t)\) is nil. Equation Equation 7 is therefore modified to include weights as follows:

\[ SSR_{\text{WLS}} = \sum_{i=1}^{n} w_i (R_{\text{obs}, i} - R_{\text{pred}, i})^2 \tag{8}\]

where \(w_i\) is the weight for each time point, defined as: \[w_i = \begin{cases}0 & \text{if } c(t_i) > 0 \quad (\text{association phase}) \\1 & \text{if } c(t_i) = 0 \quad (\text{dissociation phase})\end{cases} \tag{9}\]

The optimization is performed using the COBYLA algorithm (Constrained Optimization BY Linear Approximations) for derivative-free optimization with nonlinear inequality and equality constraints. We used the R implementation of COBYLA available in the nloptr package (Ypma and Johnson 2011), adapted from Jean-Sébastien Roy’s C implementation of the original Fortran code by M. J. D. Powell (Powell 1994).

The optimization is bounded around the initial estimates to ensure that the parameter search is focused and to prevent divergence. The default output of the fit function is shown below, including the nubmer of iterations, the final parameter estimates, and the value of the objective function at the optimum.

Code
fit
── waveRAPID_fit ──────────────────────────────────────────
  Stages run    : 1, 2, 3
  True params   : ka = 2710  |  kd = 0.278  |  Rmax = 14.77
  Initial estim : ka = 1497.54  |  kd = 0.258332  |  Rmax = 17.0173
  OLS           : ka = 2269.83  |  kd = 0.280808  |  Rmax = 16.7971
    Convergence : status = 4  |  iter = 99  |  obj = 0.967274
    Message     : NLOPT_XTOL_REACHED: Optimization stopped because xtol_rel or xtol_abs (above) was reached.
  WLS           : ka = 2487.79  |  kd = 0.281455  |  Rmax = 15.8079
    Convergence : status = 4  |  iter = 169  |  obj = 0.175332
    Message     : NLOPT_XTOL_REACHED: Optimization stopped because xtol_rel or xtol_abs (above) was reached.
  Residual summary (sd):
    OLS                  sd = 0.0389
    WLS                  sd = 0.0290
  Available plots: obs, deriv, init_estim, fit, comparison, residuals
───────────────────────────────────────────────────────────

Table 2 presents the kinetic parameters obtained from the initial estimation, OLS optimization, and WLS optimization, along with their comparison to the true parameters used for data simulation. The table also includes the percentage of the true parameters represented by the initial estimates and the optimized estimates for both approaches. WLS allows the closest estimation of the true values; in particular the improvement of the association constant is significant. The \(R_{\text{max}}\) value is also closer to reality.

These refined estimates of the kinetic parameters were used to simulate the response, and compare it to the observed data for visual validation (Figure 4). The slight overestimation of \(R_{\text{max}}\) by OLS is visible but modest, and the fits are generally visually excellent.

Code
get_param_table(fit)
Table 2: Comparison of the true parameters, initial estimates, and optimized estimates obtained from the nonlinear optimization of the objective function, showing both OLS (association + dissociation) and WLS (dissociation only) approaches.
Code
fit_plot(fit, "fit")
Figure 4: Comparison of the observed response data with the predicted response using the optimized parameter estimates obtained from the nonlinear optimization of the objective function, showing both OLS (association + dissociation) and WLS (dissociation only) approaches.

Fitting diagnostics and confidence

Residuals

Residuals are a key diagnostic tool for assessing the quality of the fit and identifying any systematic deviations between the observed data and the model predictions. They were calculated as the difference between the observed response and the predicted response from the ODE simulation using the estimated parameters. The residuals were then plotted over time to visually inspect for patterns, trends, or heteroscedasticity, which may indicate model misspecification or areas where the fit could be improved. Figure 5 shows an improvement in the residuals using WLS, as expected from the observations above. However, these residuals do not seem to be normally distributed.

Q-Q plots can help to identify any deviations from normality, such as skewness or heavy tails, which could impact the validity of the parameter estimates and the confidence intervals derived from them. In particular, OLS optimizations assumes that the residuals are normally distributed. Here, the points in the OLS Q-Q plot deviate substantially from the dashed reference line, especially at both tails (lower and upper quantiles), suggesting that the residuals are not normally distributed. This may be due to the influence of RI during association phases. In contrast, the WLS Q-Q plot shows points that are much closer to the reference line, indicating that the residuals from the WLS fit are more normally distributed. There is still some deviation, particularly in the lower quantiles. Other approaches to opmitization could be explored to further improve the fit and the normality of the residuals, such as using more robust regression methods (e.g., Huber regression, Tukey’s bisquare). Generalized least squares (GLS) could also be considered to account for any heteroscedasticity in the residuals, by explicitely modeling their covariance structure.

Code
fit_plot(fit, "residuals")
Figure 5: Left: Residuals of the fit for the initial parameter estimates and the optimized parameter estimates obtained from both OLS (association + dissociation) and WLS (dissociation only) approaches, plotted over time to assess the quality of the fit and identify any systematic deviations. Right: Q-Q plot of the residuals to assess their normality, which is an assumption of the OLS fitting method.

Bootstrapping

Finally, we performed non-parametric bootstrapping analysis to provide a robust estimate of the uncertainty in the kinetic parameters. The principle of bootstrapping is to assess the variability of parameter estimates by resampling the observed data with replacement and re-estimating the parameters for each bootstrap sample. This approach allows us to derive confidence intervals for the parameters without relying on strict parametric assumptions, which is particularly useful in complex models like ours where the distribution of parameter estimates may not be normal.

Practically, we generated bootstrap samples by resampling the residuals from the best-fitting model (using WLS) and adding them back to the smoothed response. For each bootstrap sample, we re-estimated the parameters using the same RAPID linear approach followed by nonlinear refinement with COBYLA. We carried out 100 iterations to balance accuracy and computational time. To significantly decrease the latter, we implemented the bootstrap procedure in parallel on multiple CPU cores using the parallel package.

The resulting distributions of parameter estimates across all bootstrap iterations were then summarized to obtain median values and 95% confidence intervals for each parameter (Table 3).

Code
param_table(boot) 
Table 3: Bootstrap summary table showing the median and 95% confidence intervals for each kinetic parameter, along with the true parameter values used for data simulation.

The bootstrap distributions are also shown as density plots, showing the true parameter values (dashed blue lines), the median of the bootstrap estimates (dashed pink), and the 95% confidence intervals (dotted pink; Figure 6).

Code
plot(boot, "density")
Figure 6: Bootstrap distributions of kinetic parameters, showing the true parameter values, the median of the bootstrap estimates, and the 95% confidence intervals. The density plots illustrate the variability in the parameter estimates across bootstrap iterations, while the vertical lines indicate the true values (blue dashed), the median of the bootstrap estimates (pink dashed), and the 95% confidence intervals (pink dotted).

Additionally, we plotted the correlation between the bootstrap estimates of \(k_a\) and \(k_d\), colored by \(R_{\text{max}}\), to explore any potential relationships between the parameters across the bootstrap iterations (Figure 7). The somewhat linear relationship that is observed between \(k_a\), \(k_d\) and \(R_{\text{max}}\) across the OLS bootstrap iterations may indicate a trade-off between these parameters, which is important to consider when interpreting the results. This is much less the case with WLS.

Code
plot(boot, "scatter")
Figure 7: Correlation between bootstrap estimates of ka and kd, colored by Rmax, with a linear regression line to show the correlation between the parameters across the bootstrap iterations. The blue dashed lines indicate the true parameter values used for data simulation.

Application to other experiments

Below, we apply the same approach to the two other experiments shown in Figure 4 from the original publication (Kartal et al. 2021). Relatively good fits are obtained, with a significant improvement when using WLS.

Code
ka_2 <- 1.16E4        
kd_2 <- 4.60E-2       
Rmax_2 <- 26.30    
conc_2 <- 10E-6     
noise_sd_2 <- 0.01  
cutoff_time_2 <- 50  

ka_3 <- 8.40E3
kd_3 <- 5.50E-3
Rmax_3 <- 25.80
conc_3 <- 10E-6
noise_sd_3 <- 0.01  
cutoff_time_3 <- 50  
Code
fit_2 <- fit_waveRAPID(
  sim_data =  simulate_waveRAPID(
    true_ka = ka_2,
    true_kd = kd_2,
    true_Rmax = Rmax_2,
    pulse_hgt = conc_2,
    preset    = "intermediate",
    time_cutoff = cutoff_time_2,
    seed      = 42,
    noise_sd  = noise_sd_2,
    file = "R/pulse_presets.xlsx"
  ),
  do_optimization = TRUE,
  smooth_param = smooth_value,
  lbd = lbd
)


fit_3 <- fit_waveRAPID(
  sim_data =  simulate_waveRAPID(
    true_ka = ka_3,
    true_kd = kd_3,
    true_Rmax = Rmax_3,
    pulse_hgt = conc_3,
    preset    = "intermediate",
    time_cutoff = cutoff_time_3,
    seed      = 42,
    noise_sd  = noise_sd_3,
    file = "R/pulse_presets.xlsx"
  ),
  do_optimization = TRUE,
  smooth_param = smooth_value,
  lbd = lbd
)
Code
fit_plot(fit_2, "fit")
Figure 8: Comparison of the observed response data with the predicted response for experiment 2.
Code
get_param_table(fit_2)
Table 4: Comparison of the true parameters, initial estimates, and optimized estimates obtained from the nonlinear optimization of the objective function for experiment 2, showing both OLS (association + dissociation) and WLS (dissociation only) approaches.
Code
fit_plot(fit_3, "fit")
Figure 9: Comparison of the observed response data with the predicted response for experiment 3.
Code
get_param_table(fit_3)
Table 5: Comparison of the true parameters, initial estimates, and optimized estimates obtained from the nonlinear optimization of the objective function for experiment 3, showing both OLS (association + dissociation) and WLS (dissociation only) approaches.
Code
boot_2 <- boot_waveRAPID(fit_2, n_boot = n_boot)
Code
boot_3 <- boot_waveRAPID(fit_3, n_boot = n_boot)

Noisy data

In this last example, we reproduce the data from the first example but with 10 times more noise (Figure 10), to better evaluate the robustness of the approach to noise, and the potential improvement brought by WLS.

Code
# True parameters used for data simulation 1
ka <- 2710          # Association rate constant (M^-1 s^-1)
kd <- 0.278         # Dissociation rate constant (s^-1)
Rmax <- 14.77       # Maximum response (RU)
conc <- 100E-6      # Analyte concentration (M)
noise_sd <- 0.1     # Standard deviation of noise (10 times higher)
cutoff_time <- 50   # Time cutoff for simulation (s)

# Simulate waveRAPID data using the defined parameters and the intermediate pulse preset,
# with a time cutoff of 50 seconds to focus on the most informative part of the response, 
# and a random seed for reproducibility. 
# The noise standard deviation is set to mimic realistic experimental conditions while allowing for parameter estimation.
sim_noise <- simulate_waveRAPID(
  true_ka = ka,
  true_kd = kd,
  true_Rmax = Rmax,
  pulse_hgt = conc,
  preset    = "intermediate",
  time_cutoff = cutoff_time,
  seed      = 42,
  noise_sd  = noise_sd,
  file = "R/pulse_presets.xlsx"
)
Code
#+ label: fig-noise
#| fig-cap: "Simulated waveRAPID data with increased noise, showing the observed response (with higher noise) and the concentration profile over time. Blue ribbons indicate the association phases (when the analyte is present), while white areas indicate the dissociation phases (when the analyte is absent).
#| fig-height: 8
sim_plot(sim_noise)

Influence of the smoothing parameter

We use below three smoothing parameters values, spanning three orders of magnitudes, to evaluate the influence of this parameter on the fit quality in the context of increased noise in the data. The regularization parameter is kept at 0 to focus on the impact of smoothing only.

The fit quality is visually similar (Figure 11) for the different smoothing parameters, and the estimated values are generally very close. The largest smoothing value (0.1) seems to produce a slightly worse fit, and a slightly worse estimation of the kinetic parameters, but the differences are not very significant. The WLS approach seems to be more robust to estimate the association constant, which in any case remains relatively far from the true value.

Code
smooth_values <- c(0, 0.001, 0.01, 0.1) 
captions <- paste("Smoothing parameter:", smooth_values)
lbd <- 0   

fit_noises <- lapply(smooth_values, function(x) {
  fit_waveRAPID(
    sim_data = sim_noise,
    do_optimization = TRUE,
    smooth_param = x,
    lbd = 0
  ) 
})

p_noise_1 <- (fit_plot(fit_noises[[1]], "deriv") + ggtitle(captions[1])) /
  (fit_plot(fit_noises[[2]], "deriv") + ggtitle(captions[2])) /
  (fit_plot(fit_noises[[3]], "deriv") + ggtitle(captions[3])) /
  (fit_plot(fit_noises[[4]], "deriv") + ggtitle(captions[4]))
Code
p_noise_2 <- (fit_plot(fit_noises[[1]], "fit") + ggtitle(captions[1])) /
  (fit_plot(fit_noises[[2]], "fit") + ggtitle(captions[2])) /
  (fit_plot(fit_noises[[3]], "fit") + ggtitle(captions[3])) /
  (fit_plot(fit_noises[[4]], "fit") + ggtitle(captions[4]))
Code
p_noise_1
Figure 10: Comparison of the observed response data with the predicted response using the optimized parameter estimates obtained from the nonlinear optimization of the objective function for different smoothing parameters, showing both OLS (association + dissociation) and WLS (dissociation only) approaches, in the context of increased noise in the data. The impact of the smoothing parameter on the fit quality is evaluated, especially in the context of increased noise in the data.
Code
p_noise_2
Figure 11: Fit results of noisy data using increasingly large smooth parameters.
Code
tbl_noises <- lapply(fit_noises, function(fitty) {
  get_param_table(fitty)
})

tbl_noises[[1]]
tbl_noises[[2]]
tbl_noises[[3]]
tbl_noises[[4]]
Table 6: Estimated kinetic parameters for the fits of noisy data using increasingly large smooth parameters.

Influence of the Λ regularization parameter

Here, we use four values of the regularization parameter, spanning several orders of magnitude, to evaluate the influence of this parameter on the fit quality in the context of increased noise in the data. The smoothing parameter is kept at 0.01 to focus on the impact of regularization only. The fit quality is visually similar (Figure 12) for \(\lambda \ll 10^{-6}\), and the estimated values are generally very close. The largest regularization value (\(\lambda = 10^{-6}\)) fails to fit correctly the data.

Code
smooth_value <- 0.01 
lbds <- c(0, 1E-12, 1E-9, 1E-6)
captions <- paste("&Lambda; =", lbds)

fit_lbds <- lapply(lbds, function(x) {
  fit_waveRAPID(
    sim_data = sim_noise,
    do_optimization = TRUE,
    smooth_param = smooth_value,
    lbd = x
  ) 
})


p_lambda_2 <- (fit_plot(fit_lbds[[1]], "fit") + ggtitle(captions[1])) /
  (fit_plot(fit_lbds[[2]], "fit") + ggtitle(captions[2])) /
  (fit_plot(fit_lbds[[3]], "fit") + ggtitle(captions[3])) /
  (fit_plot(fit_lbds[[4]], "fit") + ggtitle(captions[4]))
Code
p_lambda_2 
Figure 12: Fit results of noisy data using increasingly large lambda parameters.
Code
tbl_lbds <- lapply(fit_lbds, function(fitty) {
  get_param_table(fitty)
})
 
tbl_lbds[[1]]
tbl_lbds[[2]]
tbl_lbds[[3]]
tbl_lbds[[4]]
Table 7: Estimated kinetic parameters for the fits of noisy data using increasingly large lambda parameters.

Conclusion

Overall, we have successfully implemented the direct estimation method for fitting waveRAPID data, as described in the original patent and publication. The approach allows for the estimation of kinetic parameters from noisy sensorgram data, with the option to focus on dissociation phases to mitigate RI effects. The use of bootstrapping provides a robust way to assess the uncertainty in the parameter estimates. Future work could explore alternative optimization methods to further improve the fit quality and parameter estimation, especially in the context of high noise levels.

References

Fievet, Lucas, and Kaspar Cottier. 2020. Method for calculating kinetic parameters of a reaction network. WO2020016669A1. PCT, issued January 23, 2020. https://patents.google.com/patent/WO2020016669A1/en.
Kartal, Önder, Fabio Andres, May Poh Lai, Rony Nehme, and Kaspar Cottier. 2021. “waveRAPIDA Robust Assay for High-Throughput Kinetic Screens with the Creoptix WAVEsystem.” SLAS Discovery 26 (8): 995–1003. https://doi.org/10.1177/24725552211013827.
Petzold, Linda. 1983. “Automatic Selection of Methods for Solving Stiff and Nonstiff Systems of Ordinary Differential Equations.” SIAM Journal on Scientific and Statistical Computing 4 (1): 136–48. https://doi.org/10.1137/0904010.
Powell, M. J. D. 1994. “A Direct Search Optimization Method That Models the Objective and Constraint Functions by Linear Interpolation.” In, 51–67. Springer Netherlands. https://doi.org/10.1007/978-94-015-8330-5_4.
Soetaert, Karline, Thomas Petzoldt, and R. Woodrow Setzer. 2010. “Solving Differential Equations inR: PackagedeSolve.” Journal of Statistical Software 33 (9). https://doi.org/10.18637/jss.v033.i09.
Ypma, Jelmer, and Steven G. Johnson. 2011. “Nloptr: R Interface to NLopt.” The R Foundation. https://doi.org/10.32614/cran.package.nloptr.