BIOMARKER QUANTITATION: ANALYTICAL CONSIDERATIONS FOR LIGAND BINDING ASSAY REGRESSION CURVES AND QUALITY CONTROL SAMPLES

As biomarkers grow in relevance for both the design and support of therapeutics and the clinical trials associated with them, there is an ever increasing need for accurate quantitation of these biochemical entities in biological matrices. While quantifying many biotherapeutics via ligand binding assay platforms can be fairly straightforward, biomarkers present some unique challenges that must be taken into account during assay development, validation and subsequent sample analysis. These challenges can be especially confounded by the relationship between two ligand binding assay tools: the regression curve and quality control samples. Due diligence must be performed to develop an assay that takes into account matrix vs. buffer effects and endogenous biomarker presence. Lack of diligence in these areas can lead to less than reliable results, thus potentially rendering the intended use of the assay moot.


INTRODUCTION
Biomarkers play an important role in the development of therapeutics. By up-regulating or down-regulating in response to disease states or pharmacological intervention, biochemical biomarkers are important indicators of disease progression and drug efficacy (Frank and Hargreaves, 2003). They have been used to show proof of mechanism for drug efficacy, as safety indicators in response to drug dosing and even as screening criteria for potential patient enrollment in clinical trials (Colburn and Lee, 2003;Chau et al., 2008). Because of this underlying importance, accurate quantitation of biomarkers in various biological matrices requires an understanding of not only how they differ biochemically from therapeutics, but also what the optimal method of quantitation might be for each individual analyte of interest. The intended use of the assay, whether fit-for-purpose in early development or fully quantitative in support of clinical trials, will drive the need for accuracy and reproducibility in results.

Biomarker Quantitation
There are multiple quantitative platforms available that have both the sensitivity and dynamic range to adequately assess biomarker concentrations in various matrices. Many of these fall into the ligand binding assay category (Sittampalam et al., 1997;Jong et al., 2005), specifically the microtiter variety. Standard Enzyme-Linked Immunosorbent Assays (ELISA) have been around for decades (Weeman and Schuurs, 1971;Engvall and Perlmann, 1971) and although they are known to have fairly narrow dynamic ranges (up to 1½ logs), these assays are simple and well established in most bioanalytical labs (Porstmann and Kiessig, 1992). Greater sensitivity and increased range have been shown with the use of fluorometric substrates in lieu of standard colorimetric substrates (Rodriguez et al., 1998), mainly due to an improved signal-to-background ratio. Still, project needs may necessitate greater sensitivity, particularly if a biomarker is at low levels in a diseased population or is a therapeutic target and expected to become scarce in matrix after drug administration.
Electrochemiluminescent ELISA has been developed to enable greater sensitivity for analyte quantitation via use of light counts from redox reactions triggered by electrical stimulation. Most notable in this area is Meso Scale Discovery, whose plates are manufactured with electrodes in each microwell to enable this stimulation at time of data capture. This technology has been shown to have advantages in dynamic range, sensitivity and reduced interference levels when quantifying biotherapeutics (Thway et al., 2010) and biomarkers (Lembo et al., 2009;Sloan et al., 2012) compared to the standard ELISA described above. This platform also has the added advantage of multiplexing potential, whereby a single sample can be assayed for several analytes simultaneously.
Nanoliter scale immunoassays have come into increasing use in the past decade. Leading the way is Gyros, whose Gyrolab™ workstation utilizes microfluidics on compact discs that automate workflow for reduced assay times and increased throughput of samples (Barry and Ivanov, 2004). The nano-scale microfluidics is particularly useful for biomarker quantitation in rare matrices, as there are minimal sample volume requirements. Automation of the assay workflow eliminates manual liquid handling and operator variability, vastly improving reproducibility and precision of the quantitated analyte. Gyrolab™ consumables costs have been a concern for some in the bioanalytical arena (Roman et al., 2011;Funelas and Klakamp, 2012), but reduced assay development time and sample throughput advantages often make this platform a practical choice.
In terms of absolute sensitivity, high definition immunoassays on the Singulex® Erenna® are commonly used to detect low levels (picogram/mL or femtogram/mL) of analyte. Although there are rare circumstances whereby these levels of sensitivity can be achieved by other ligand binding assay formats, Singulex® is the easiest format with which to achieve them. The technology uses magnetic microparticles to increase binding surface area and advanced digital detection in conjunction with a proprietary single molecule curve fit algorithm. Although not known for high sample throughput, this platform can be especially useful for biomarkers (Todd et al., 2007) and has far reaching potential for multivariate quantitation (Tarasow et al., 2011).
All of the ligand binding assay platforms described above use a few common tools for analyte quantitation. The first is a set of control samples, usually referred to as standards or calibrators, that are run with each assay and used to construct a regression curve. The type of regression curve used is usually a 4 Parameter Logistic (4PL) or 5 Parameter Logistic (5PL) fit, based on the non-linear relationship of concentration versus raw data which is inherent to ligand binding assays (Findlay and Dillard, 2007). The non-linear relationship arises because ligand binding assay formats measure signal from a series of interactions governed by the law of mass action and binding affinity kinetics, whereby response error relationships are not constant and highest precision does not necessarily coincide with highest sensitivity (Miller and DeSilva, 2007). Unknown sample concentrations are extrapolated from the regression curve, so its importance cannot be overstated.
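As an illustration of how a concentration is read back from a 4PL regression curve, the following minimal sketch defines one common parameterization of the 4PL function and its algebraic inverse. The parameter names (a, b, c, d) and the numeric values are assumptions chosen for illustration; they are not taken from any assay described in this article, and the curve fitting step itself (normally done by the plate reader or analysis software) is not shown.

```python
def four_pl(conc, a, b, c, d):
    """4PL response for a given concentration.

    a = response at zero concentration (background)
    d = response at infinite concentration (plateau)
    c = inflection point of the curve
    b = slope factor
    """
    return d + (a - d) / (1.0 + (conc / c) ** b)


def back_calculate(signal, a, b, c, d):
    """Invert the 4PL to obtain a concentration from a signal.

    Returns None when the signal lies outside the open interval between
    the two asymptotes, where the curve has no solution.
    """
    lo, hi = min(a, d), max(a, d)
    if not lo < signal < hi:
        return None
    return c * ((a - d) / (signal - d) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters, for illustration only
params = dict(a=0.05, b=1.2, c=400.0, d=3.0)

signal = four_pl(200.0, **params)          # signal a 200-unit sample would give
recovered = back_calculate(signal, **params)
print(round(recovered, 3))
```

Returning None for signals outside the asymptotes is a design choice of this sketch; in practice such results are typically reported as below or above the limits of quantitation.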
The second tool is a set of samples of known concentrations usually referred to as quality controls that are used to assess assay performance relative to extrapolated results from the curve. Biomarker quality control samples can be endogenous or spiked with known concentrations of reference material. Since unknown sample concentrations are extrapolated from regression curves, the reference material used to formulate the standards (and quality control samples if not endogenous) must be well characterized and representative of the analyte to be measured (Viswanathan et al., 2007). When available for biomarkers, endogenous material is preferable for this purpose because it is most representative of the marker being quantified; however, well-characterized purified recombinant material is often used when native markers are not available or impractical to extract from matrix. If the biomarker reference standard is a recombinant protein, it must be evaluated versus its endogenous counterpart to ensure similar assay performance. This is done most effectively via parallelism studies (Plikaytis et al., 1994;Gottschalk and Dunn, 2005).
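One common way to summarize a parallelism study is to serially dilute a sample containing endogenous analyte, back-calculate each dilution from the recombinant standard curve, correct for dilution and look at the spread of the corrected values. The sketch below assumes that approach; the readings and the idea of judging parallelism by percent CV are illustrative assumptions, and any acceptance limit would need to be set by the laboratory.

```python
from statistics import mean, stdev


def parallelism_cv(back_calculated, dilution_factors):
    """Return (%CV, dilution-corrected concentrations).

    back_calculated: concentrations read from the standard curve
    dilution_factors: fold dilution applied to each reading
    """
    corrected = [conc * fold for conc, fold in zip(back_calculated, dilution_factors)]
    cv = 100.0 * stdev(corrected) / mean(corrected)
    return cv, corrected


# Hypothetical readings for one endogenous sample, not real data
factors = [2, 4, 8, 16]
observed = [410.0, 198.0, 104.0, 49.0]

cv, corrected = parallelism_cv(observed, factors)
print(corrected, round(cv, 2))
```

A low %CV across dilutions suggests that the endogenous analyte dilutes in parallel with the recombinant calibrators; a trend in the corrected values with increasing dilution would suggest otherwise.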

Analytical Challenges of Biomarker Quantitation
There is an important distinction to make between the quantitation of biotherapeutics and biomarkers. Most biotherapeutics are humanized recombinant proteins that simply do not have a native presence in any matrices. Because of this, the concentration of reference material spiked into control matrix for the formulation of standard and quality control samples is easy to calculate because theoretically it is the only protein of its type in the matrix. Assuming reagent selectivity for the molecule, quantitation is straightforward. By contrast, biomarkers are present in their respective matrices at levels that are dependent upon many variables. The endogenous biomarker level of a matrix sample is the naturally circulating concentration that can vary not only from one subject to another, but from one time point to another (days, hours, even minutes). If sample stability is an issue, then levels can also vary significantly from one analytical run to another. This endogenous presence complicates the formulation of standard and quality control samples because reference material is being spiked into a matrix that already contains a certain concentration of the biomarker. There are several analytical approaches that can be used to address this complication (Lee, 2003;2009;Lee et al., 2005;Rifai et al., 2006;Miller et al., 2001), which are discussed in detail below.

Analytical Strategies for Regression Curve Formulation
Science Publications

AJI
When formulating matrix calibrators, the matrix used is typically a pool from numerous donors (healthy or diseased population, depending on the assay needs). While it is possible to formulate standard curves from pooled matrices containing endogenous biomarkers, assay sensitivity needs usually preclude this option because the lowest calibrator, even if just the endogenous level, might very well be at a concentration above the desired sensitivity. Perhaps the most common method of overcoming the endogenous analyte issue when formulating calibrators is the use of treated matrix. The endogenous biomarker can be removed from the intended matrix by a number of procedures including charcoal stripping, heat inactivation, hydrolysis and affinity chromatography. Once the matrix is treated and there is no measurable biomarker remaining, formulation of calibrators is a relatively simple exercise of spiking known concentrations of analyte at optimized levels over the determined calibration curve range. Preparation of calibrators in the matrix of interest, even when treated via one of the methods described above, has the advantage of controlling for non-specific interference in the assay. This control of non-specific interference is a primary reason why the use of matrix for calibrator formulation is the preferred method for pharmacokinetic assays, which are not encumbered by the endogenous analyte issue (DeSilva et al., 2003;Findlay et al., 2000), and a common approach for titration curve calibrators in immunogenicity assays (Liang et al., 2007;Klakamp et al., 2007). It is also worth noting that treated matrix calibrators are a common generic tool for non-ligand binding biomarker assay formats such as liquid chromatography/mass spectrometry (Haughton et al., 2009).
Sometimes treatment of matrix does not entirely remove the intended biomarker, treatment is not feasible from a convenience or cost standpoint, or the matrix is rather rare (e.g., cerebrospinal fluid, tears, fetal fluids, tissues) and not available in quantities suitable for assay validation and subsequent sample analysis. An alternative approach is the use of a surrogate matrix, of which there are a few types. Heterologous matrices are from a different species than that being analyzed and may be either completely deficient in the biomarker of interest or contain a homolog that is less reactive with assay components. Although not of the same species, a heterologous matrix can have several components in common with the matrix of interest and thus render its surrogate nature less pronounced. Another type of surrogate matrix is a protein-containing buffer that lacks the biomarker, but usually has better stability and convenience for long term use due to lack of individual variability. This surrogate matrix type is commonly found in commercially available ligand binding assay biomarker kits because it offers a great deal of control between production lots and ease of use for the customer. However, what these buffer surrogates lack is true comparability to the matrix of interest in a biological sense.
The strategies described above essentially provide substitute calibrator matrices that differ from the test sample matrices. This preparation of calibrators in substitute matrices is a major difference between biomarker assays and assays for biotherapeutics and other drug compounds. The potential differences in assay performance that might exist between calibrators prepared in matrix (treated or untreated) vs. surrogate matrix should not be underestimated; in fact, it has been a fairly common practice to use a matrix vs. buffer curve comparison to assess potential assay matrix interference and selectivity (ability to measure the analyte of interest in the presence of other sample components) (Shah et al., 1991). The differences often expected between the two underscore just how dissimilar the two curve types can be. While there are no limits to what can be used as appropriate substitutes for calibrator formulation, assurance that concentration-response relationships are similar in both the substitute and test sample matrices must be secured during pre-study development and assay validation. This is another situation where parallelism studies are highly appropriate (Valentin et al., 2011).

Analytical Strategies for Quality Control Sample Formulation
Quality Control (QC) samples are tested with each assay run to assess assay performance. They are typically run at three levels: High (approximately 75% of the upper limit of quantitation), medium (geometric mean of calibration range) and low (approximately 3X the lower limit of quantitation). The concentrations of these QC levels are determined during method development and established during method validation. Additionally, two more QC samples are used during method validation to establish the upper and lower limits of quantitation for the assay. The upper and lower limits of quantitation are the highest and lowest concentrations at which the assay can accurately and reproducibly measure within the standard curve range.
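The three routine QC levels described above can be written as a short calculation. This is a minimal sketch that assumes "geometric mean of calibration range" means the geometric mean of the lower and upper limits of quantitation; the function name and the example limits are hypothetical.

```python
import math


def qc_targets(lloq, uloq):
    """Approximate target concentrations for routine QC samples."""
    return {
        "high": 0.75 * uloq,              # about 75% of the upper limit
        "mid": math.sqrt(lloq * uloq),    # geometric mean of the range (assumed definition)
        "low": 3.0 * lloq,                # about 3X the lower limit
    }


# Hypothetical assay range, for illustration only
targets = qc_targets(lloq=10.0, uloq=1000.0)
print(targets)
```

These values are starting points only; as noted above, the final QC concentrations are determined during method development and established during validation.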
Ideally, all QC samples are an accurate reflection of test samples. From a biomarker quantitation standpoint, this is most easily accomplished by using endogenous analyte in donor matrix. Several donors can be screened with the assay to determine varying levels of biomarker and those with certain concentrations can be used as QC samples. The screened donor samples should be assessed multiple times to assure an accurate and reproducible concentration for each. If biomarker concentrations are particularly high in most donor matrices, lower QC levels (up to and including the lower limit of quantitation) can be reached by dilution in treated matrix or surrogate buffer. Pooling of matrix can nullify potential individual donor selectivity issues, but this can also reduce the availability of potential QC sample concentrations within a donor sample population. The use of endogenous biomarker for QC samples is advantageous in that it is a true comparison to test samples, as opposed to a recombinant counterpart which may have different immunochemical properties and require additional experimentation during assay validation.
If the use of endogenous biomarker is not feasible, then spiking matrix with recombinant analyte is a viable option. The recombinant can be spiked into one of two types of matrix: Those that are treated in some way to ensure biomarker removal as described above with calibrator formulation and those that are untreated and contain a certain amount of endogenous analyte. QC sample formulation with treated matrices is a relatively simple process, as there is no endogenous material of interest and calculating final biomarker concentrations is based solely on reference stock concentration and volume spiked. Conversely, spiking reference material into untreated matrices has an additive effect and the endogenous concentration of the analyte (determined over several assay iterations) must be taken into account when determining final QC sample concentrations. A common calculation used to determine final concentration in such a circumstance is:

$$\text{Expected Concentration} = \frac{(\text{SpikeConc})(\text{SpikeVol}) + (\text{BasepoolConc})(\text{BasepoolVol})}{\text{TotalVol}}$$
If the additive effect of endogenous biomarker is not taken into account, it can have drastic effects on assay results, particularly at lower concentrations (Table 1). Such inaccuracies can lead to assay failure based on lack of QC sample recovery and cause unnecessary sample re-analysis over the course of a study.
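The expected concentration calculation can be sketched in a few lines. The example assumes that the total volume is simply the sum of the spike and base pool volumes, and all stock concentrations, volumes and endogenous levels are hypothetical values chosen to show why the additive effect matters more at low concentrations.

```python
def expected_concentration(spike_conc, spike_vol, basepool_conc, basepool_vol):
    """Final QC concentration after spiking into matrix with endogenous analyte.

    Assumes TotalVol = SpikeVol + BasepoolVol.
    """
    total_vol = spike_vol + basepool_vol
    return (spike_conc * spike_vol + basepool_conc * basepool_vol) / total_vol


# Hypothetical values: 10000 ng/mL stock spiked into a pool with 50 ng/mL endogenous
low_qc = expected_concentration(10000.0, 0.02, 50.0, 0.98)
high_qc = expected_concentration(10000.0, 0.08, 50.0, 0.92)

# Nominal values if the endogenous contribution were (incorrectly) ignored
low_naive = expected_concentration(10000.0, 0.02, 0.0, 0.98)
high_naive = expected_concentration(10000.0, 0.08, 0.0, 0.92)

for label, naive, actual in (("low", low_naive, low_qc), ("high", high_naive, high_qc)):
    bias = 100.0 * (naive - actual) / actual
    print(label, round(naive, 1), round(actual, 1), round(bias, 1))
```

With these assumed numbers, ignoring the endogenous level understates the low QC by roughly 20% but the high QC by only about 5%, which is the pattern described above.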
QC samples formulated in buffer are used for the same reasons that standard curve samples prepared in buffer are used: matrix treatment isn't practical or effective from a cost or feasibility standpoint, matrix is rare and must be conserved, or the QC samples are provided in commercial kits. While there are certain advantages to using buffer QC samples (no native analyte to account for in spike calculations, use with multiple matrices, fewer stability issues), there are a few significant drawbacks. First and foremost, buffer QC samples simply are not an accurate reflection of matrix study samples. Additionally, signal suppression from matrix effects will not be seen, particularly if buffer QC samples are run with buffer curves. If buffer QC samples are run with matrix curves, then signal suppression can cause inaccuracies, especially at the lower end of the curve where background tends to be higher with matrix.

An Analytical Example
To illustrate the analytical strategies described above, an experiment was performed using a commercially available colorimetric ELISA kit for the quantitation of fibroblast growth factor 21 (FGF-21). FGF-21 is an important biomarker in the fields of obesity and diabetes (Zhang et al., 2008;Mraz et al., 2009) and thus is often quantified in immunoassays. Using recombinant FGF-21 provided in the kit, standard curve calibrators ranging from 1920 ng mL⁻¹ to 30 ng mL⁻¹ were formulated in each of three conditions: matrix (pooled human K2 EDTA plasma, 2X charcoal stripped), kit buffer and Phosphate Buffered Saline (PBS). In parallel, QC samples (800 ng mL⁻¹, 400 ng mL⁻¹ and 200 ng mL⁻¹) were also formulated with the recombinant kit FGF-21 in each of the same three conditions. Standard curve samples were run in duplicate wells and QC samples were assayed twice, each iteration in duplicate wells. The assay was performed according to the protocol provided with the kit. Each set of QC samples (treated matrix, kit buffer and PBS) was extrapolated from each of the three calibration curves (treated matrix, kit buffer and PBS). Figure 1 graphically shows results for the three regression curves. The curve fit is the commonly used 4PL and what becomes readily apparent is the drastic difference between the matrix curve and the two others in terms of background. While all three curves have similar slopes, the kit buffer and PBS curves are nearly identical with respect to background. This is not unusual, as many kit buffers are simple formulations of buffered saline with a small percentage of carrier protein such as Bovine Serum Albumin (BSA).
Whereas Fig. 1 is a graphical representation of the curves, Table 2 is a summary of the regressed results for all nine QC sample sets (three sample formulations from each of three curves). When the QC sample and calibrator treatment is alike (matrix/matrix, kit buffer/kit buffer, PBS/PBS), the assay produces excellent accuracy for the QC samples (relative error (%RE) between -9.7% and 8.0%). However, when QC and calibrator treatments are mixed, results are highly skewed when the combination involves matrix and non-matrix counterparts. When the calibrators are matrix and QC samples are kit buffer or PBS, the QC samples drastically under-recover, so much so that the low QC samples are Below the Limit of Quantitation (BLQ) of the assay. When calibrators are kit buffer or PBS and QC samples are matrix, the QC samples grossly over-recover and again the effect is most drastic at the low QC concentrations. When both calibrators and QC samples are non-matrix but not the same buffer formulation, the effects are far less pronounced but still have potential to impact assay acceptance. When kit buffer QC samples are extrapolated from the PBS curve, all QC samples over-recover; the high and medium QC levels have excellent recovery, but the low QC level's over-recovery is pronounced (up to 33.3% RE) and depending on assay acceptance criteria could be cause for concern.

When PBS samples are extrapolated from the kit buffer curve, all QC samples under-recover. In this instance, the high QC level again has excellent recovery, but the under-recovery of the medium and low QC levels (up to -21.1% RE) could be cause for concern in terms of assay acceptance criteria. The assay results summarized above are a rather stark example of just how important biomarker ligand binding assay strategies are for standard curve and QC sample formulation. While the kit buffer and PBS curves are graphically quite similar, extrapolated results of one from another show a distinct difference, with over- and under-recovery of QC samples. This is an excellent example of how not all buffers are created equal and due diligence should be performed to ensure that correct choices are made for the betterment of the assay. Background resulting from matrix curves or QC samples can often be muted with sample dilution, but there can be a delicate balance between negating background and diluting away any desired assay sensitivity. As previously mentioned, parallelism studies are common tools for assurance of calibrator and QC sample compatibility.
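The direction of the recovery errors described above can be reproduced with a toy model. The sketch below assumes two 4PL curves that differ only in background (the zero-concentration response), with the matrix curve having the higher background; all parameter values are hypothetical and are not the fitted values from this experiment. Relative error is computed as 100 × (measured − nominal) / nominal.

```python
def four_pl(conc, a, b, c, d):
    """4PL response; a = background, d = plateau, c = inflection, b = slope."""
    return d + (a - d) / (1.0 + (conc / c) ** b)


def back_calculate(signal, a, b, c, d):
    """Invert the 4PL; None when the signal is outside the asymptotes."""
    lo, hi = min(a, d), max(a, d)
    if not lo < signal < hi:
        return None
    return c * ((a - d) / (signal - d) - 1.0) ** (1.0 / b)


def percent_re(measured, nominal):
    """Relative error (%RE) of a measured value against its nominal value."""
    return 100.0 * (measured - nominal) / nominal


# Hypothetical curves that differ only in background
buffer_curve = dict(a=0.05, b=1.0, c=1000.0, d=3.0)
matrix_curve = dict(a=0.30, b=1.0, c=1000.0, d=3.0)

nominal = 200.0
# Buffer QC read from a matrix curve: its signal looks "too low" for the curve
buffer_qc_on_matrix = back_calculate(four_pl(nominal, **buffer_curve), **matrix_curve)
# Matrix QC read from a buffer curve: its signal looks "too high" for the curve
matrix_qc_on_buffer = back_calculate(four_pl(nominal, **matrix_curve), **buffer_curve)

re_under = percent_re(buffer_qc_on_matrix, nominal)
re_over = percent_re(matrix_qc_on_buffer, nominal)

# A low buffer QC can fall below the matrix curve's background entirely
low_buffer_qc = back_calculate(four_pl(50.0, **buffer_curve), **matrix_curve)

print(round(re_under, 1), round(re_over, 1), low_buffer_qc)
```

In this simplified model the buffer QC under-recovers on the matrix curve, the matrix QC over-recovers on the buffer curve, and a sufficiently low buffer QC cannot be read from the matrix curve at all, which mirrors the qualitative pattern reported for the mixed matrix and non-matrix combinations.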

CONCLUSION
Biomarkers have a well established role in the design and support of therapeutics and the clinical trials associated with them. Because of this, there is an ever increasing need for accurate quantitation in biological matrices. Quantifying biomarkers can present some unique challenges that must be taken into account during assay development, validation and subsequent sample analysis. One of the most important challenges is understanding the relationship between standard curve samples and quality control samples. Due diligence must be performed to develop an assay that takes into account matrix vs. buffer effects and endogenous biomarker presence. A lack of thorough investigation into these assay parameters can lead to an assay that is not a best fit for its intended purpose.

ACKNOWLEDGEMENT
The researcher would like to thank Dr. Stephanie Fraser and Mr. Rick Steenwyk for their review of this article.