Interassay Variation

Interassay variation between different quantitative serum tests and between different qualitative urine tests can cause confusion and misdiagnosis of pregnancy. Most clinical laboratories participate in external proficiency testing through the College of American Pathologists (CAP), and CAP surveys have been the clearest indicator of interassay variation. The five most recent CAP surveys (each with a single hCG preparation) of 2307 to 2324 testing laboratories showed 1.50- to 1.59-fold variations from the mean in results for the 16 most commonly used hCG assays in the United States (18). In 1990, Bock wrote an article titled "hCG Assays: A Plea for Uniformity," strongly criticizing the wide variation in CAP survey hCG results and demanding that manufacturers resolve these differences (19). Clearly, the plea was not effective, and the problem, although slightly less pronounced, still exists today. Because of the differences between assays, assay-specific reference intervals must be established and should not be interchanged. This can be a source of confusion when hCG is measured at different medical centers.

Figure 4, Panel A illustrates the interassay variation in the 10 most commonly used hCG tests from the most recent CAP survey report (18). We recently requested that 11 laboratories quantitate the 1st RR, the newest, purest, and supposedly the most homogeneous WHO hCG standard available (16), using the same 10 hCG tests (each calibrated against the current 3rd IS). As shown in Fig. 4, Panel B, wide variation in results was recorded between assays (1.6-fold variation). Interestingly, the tests giving the highest and lowest results with the 1st RR standard [the Bayer Centaur (Bayer, Medfield, MA) and Roche Elecsys (Roche Diagnostics, Indianapolis, IN) tests, respectively] are completely different from those giving the highest and lowest results with CAP proficiency test material [the DPC Immulite and Beckman Access-2 (Beckman Coulter, Fullerton, CA), respectively]. Similarly, when the same 10 assays were calibrated directly with the 1st RR, 1.6-fold variation remained (data not shown), with the Dade Dimension RXL (Dade Behring, Newark, DE) yielding the highest and the Bayer Centaur the lowest result (17). All of these observations indicate the complexity of interassay variation.

We attribute much of the interassay variation to two main factors: (1) differences in assay specificity, and (2) standardization material. Each will be discussed in more detail below.

Assay Specificity

One of the causes of interassay variation between hCG measurements is the use of antibodies that recognize different hCG molecules and degradation products (Table 4). The manufacturer's choice of antibodies significantly affects results because antibody specificity determines the number of different hCG molecules the assay is able to detect. For instance, the reason the DPC Immulite hCG test consistently gives the highest result with the CAP proficiency test material, and not with the pure 1st RR, is that it is the only commercial hCG test that detects β-core fragment (Fig. 5). Whereas the 1st RR is pure and free of such contaminants, the CAP hCG proficiency test material is derived from a crude urine preparation (CAP technical service; Northfield, IL; personal communication), which likely contains significant β-core fragment, recognized only by the DPC Immulite hCG test. In addition, serum hCG assays differ in their ability to recognize hyperglycosylated hCG (Fig. 5). Quantification of a purified hyperglycosylated hCG standard ranged from 468 IU/L (Dade Dimension RXL) to 1544 IU/L (Roche Elecsys) (9,12,17). This is a 3.3-fold variation. Similarly, serum hCG assays differ in their ability to recognize hCG free β-subunit (2.2-fold variation) and nicked hCG (1.5-fold variation). For a comprehensive list of commercially available hCG immunoassays and their specificity, see Cole (12).
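The 3.3-fold figure quoted for the hyperglycosylated hCG standard can be reproduced with a short calculation. This is a minimal sketch that assumes "fold variation" here means the ratio of the highest to the lowest result; the function name `fold_variation` is illustrative and not taken from the cited studies.

```python
def fold_variation(results):
    """Return the ratio of the highest to the lowest result."""
    values = list(results)
    lowest, highest = min(values), max(values)
    if lowest <= 0:
        raise ValueError("results must be positive")
    return highest / lowest


# Results (IU/L) for the purified hyperglycosylated hCG standard,
# as quoted in the text for the two extreme assays.
hyperglycosylated = {"Dade Dimension RXL": 468, "Roche Elecsys": 1544}

ratio = fold_variation(hyperglycosylated.values())
print(f"{ratio:.1f}-fold variation")
```

Note that the CAP survey figures earlier in this section are expressed as variation from the mean, which is a different measure from the highest-to-lowest ratio used in this sketch.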

Significant interassay variation is also observed with home pregnancy tests and urine point-of-care tests. These tests are all approved for use with urine, yet in a recent study, we found that only 2 of 14 home pregnancy tests and a similarly low proportion of urine point-of-care devices detect β-core fragment, the principal immunoreactive subunit in pregnancy urine throughout gestation (3). These tests are intended for use in early pregnancy detection. However, in a study of 14 home pregnancy test devices, 3 (Rite Aid, Long's, and Inverness Medical One-Step) had poor detection or completely failed to detect hyperglycosylated hCG, the principal hCG molecule produced in the first weeks of gestation. In a more recent study, three other products (Sav-Osco, Target, and CVS one-step tests) were also shown to either not recognize or poorly recognize hyperglycosylated hCG (Cole L, Khanlian S, Sutton J, and Davies S, unpublished data). It was also observed that two products, Answer and Answer Quick and Simple, preferentially detected hyperglycosylated hCG.

Fig. 4. Interassay variation of hCG tests. Interassay variation of 10 tests was investigated: the Abbott AxSym (test 1), Bayer ACS-180 (test 2), Bayer Centaur (test 3), Beckman Access-2 (test 4), Dade Dimension (test 5), Dade Stratus (test 6), DPC Immulite (test 7), Roche Elecsys (test 8), Tosoh A1A600 (test 9), and Vitros Eci (test 10). Panel A shows interassay variation as reported as part of ongoing CAP interlaboratory comparisons, involving 2307 laboratories (18). Panel B shows a blind 11-laboratory comparison with the same 10 assays and the new 1st RR hCG standard.

Fig. 5. Interassay variation of hCG tests. Interassay variation of 10 tests was investigated: the Abbott AxSym (test 1), Bayer ACS-180 (test 2), Bayer Centaur (test 3), Beckman Access-2 (test 4), Dade Dimension (test 5), Dade Stratus (test 6), DPC Immulite (test 7), Roche Elecsys (test 8), Tosoh A1A600 (test 9), and Vitros Eci (test 10). All were determined blindly in 11 laboratories (17). Panel A is pure hyperglycosylated hCG, calibrated by mass as 100 µg/L. Panel B is pure free β-subunit (250 µg/L). Panel C is pure nicked hCG, nicked only at β47-48 (7) (400 µg/L). Panel D is pure β-core fragment (50 µg/L).

Standardization Material

The second factor that impacts interassay variation is the supply of pure WHO standards (20). As stated earlier, the quantities of WHO standards are limited and are not sufficient for the provision of calibrators for all laboratories or standards for quality controls. As such, manufacturers purchase crude or partially purified urine-derived hCG and calibrate these with the WHO standard for use as calibrators and quality control material. This can greatly affect the hCG result. If a test manufacturer, for instance, uses a WHO-calibrated commercial hCG standard that contains significant nicked hCG, and the test does not detect nicked hCG, it will appear as if the assay gives lower results than it actually should (9,12,17). However, even if WHO standards were available in larger quantities, as stated earlier, they are based on urine-based material, which may be inappropriate as a calibrator for serum-based assays.


One of the principal uses of an hCG assay is to aid in the early detection of pregnancy. Home pregnancy and point-of-care urine tests are often the first indication of pregnancy during the third, fourth, and fifth weeks following the last menstrual period. Most home pregnancy and point-of-care test devices carry manufacturer claims of "over 99% accurate" and "use as early as the first day of missing a period." How valid are these claims?

Production of hCG does not begin until the blastocyst implants in the uterus. A study by Wilcox et al. reported that 10% of the 136 pregnancies they examined had not yet implanted by the first day of the missed menses (10). Therefore, the highest possible screening sensitivity for an hCG test on the first day of the missed menses is 90% (95% confidence interval [CI] 84-94%). By 1 wk after the day of the missed menses, the highest sensitivity was estimated to be 97% (95% CI 94-99%). In addition, once hCG production has begun, urine hCG concentrations vary greatly between individuals at the same gestational age. A recent study by our group showed that to detect 95% of pregnancies with urine tests on the day of the missed period, and at days 1, 2, and 3 after missing menses, would require test detection limits as low as 12.4, 21, 35, and 58 mIU/mL, respectively (21). Although most urine point-of-care tests have claimed detection limits of 25-100 mIU/mL, we found that only 1 of 18 home pregnancy tests (First Response Early Result) had a detection limit as low as 12.5 mIU/mL (21). A urine concentration of 100 mIU/mL hCG was needed, together with extended incubation times beyond the time suggested by manufacturers, for 18 of 18 devices to yield positive results. At a detection limit of 100 mIU/mL it was estimated that only 16% of pregnancies would be detected on the day of the missed menses (21). Based on these recent studies, we determined that a detection limit of 25 mIU/mL should detect 95% of pregnancies somewhere between 1 and 2 d after missing menses, and approx 74% of pregnancies at the time of missed menses (21).
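The confidence interval attached to the 90% ceiling follows from the size of the Wilcox cohort. The sketch below is only an illustration: the text does not state the exact number of implanted pregnancies or the interval method used, so the count of 123 of 136 and the choice of the Wilson score interval are both assumptions.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Assumed count: the text says only that about 10% of 136 pregnancies
# had not implanted by the first day of the missed menses.
n_pregnancies = 136
implanted = 123

low, high = wilson_interval(implanted, n_pregnancies)
print(f"max sensitivity {implanted / n_pregnancies:.0%} "
      f"(95% CI {low:.0%}-{high:.0%})")
```

Under these assumptions the interval lands close to the one quoted above; a different count or an exact binomial method would shift the bounds slightly.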

How can manufacturers claim such a high accuracy in very early gestation? The answer relates to an arcane Food and Drug Administration (FDA) 510(k) regulation. The manufacturer needs to demonstrate only that its test results agree with those of an existing test more than 99% of the time in order to advertise "greater than 99% accuracy." A new product is compared to an older FDA-approved test and evaluated with more than 100 urine samples supplemented with hCG at a concentration close to the claimed sensitivity of the old test and with more than 100 urine samples containing no added hCG. The suggested use at the time of the missed menses and the 25 mIU/mL cutoff both come from studies performed with serum samples in the 1960s and 1970s and from the now-disproven assumption that serum and urine hCG concentrations are the same (3). These FDA guidelines and 510(k) evaluations, however, have no bearing on the ability of a product to detect early pregnancy. New guidelines are required for both home and point-of-care pregnancy tests that would require determination of the proportion of pregnancies detected by the product on a specific day of gestation (e.g., the time of missed menses).

A word of caution should be given about the use of urine in general. Urine hCG concentrations can vary significantly depending on fluid intake. Therefore, a serum hCG-positive individual may test negative on a urine hCG test because of very dilute urine. This is one of the limitations of urine testing to confirm pregnancy, especially on random samples. A positive result is useful, but a negative result must be interpreted with caution.

Serum-Based Tests

Most quantitative serum hCG tests have a lowest detection limit of greater than 1 mIU/mL. Considering that there can be background hCG of pituitary origin in serum (22), most manufacturers recommend reporting negative results as less than 5 mIU/mL and using 5 mIU/mL as a cutoff for detecting pregnancy. Clearly, serum hCG tests are many times more sensitive than home pregnancy and point-of-care tests, and can detect more than 95% of pregnancies on or before the time of the missed menses. For this reason, quantitative serum hCG tests are the test of choice for pregnancy confirmation or for accurate early pregnancy detection.

Between one-quarter and one-third of naturally fertilized pregnancies and a much higher proportion of in vitro fertilized or assisted reproductive technology pregnancies fail to implant properly and result in early pregnancy losses. These lead to a transient increase in serum and urine hCG concentrations that diminishes by the time of menses. An early pregnancy loss may also postpone menses by 2 d (23). This transient production of hCG can potentially cause a so-called false-positive pregnancy test result with urine or serum tests at or around the time of the missed menses. This is referred to as a "biochemical pregnancy." Repeat testing 2 to 3 d later will yield a true negative test result (13,23,24). Recent studies by O'Connor and colleagues indicate minimal hyperglycosylated hCG production by early pregnancy losses, and specific measurement of hyperglycosylated hCG may avoid the false detection of early pregnancy losses (13).

In summary, it is clear that no test, serum or urine, can detect pregnancy with 99% accuracy at the first day of the missed menses. However, taking into account all of the above factors, it is estimated that urine or serum tests can detect pregnancy with 99% accuracy by 5 wk of gestation, or 1 wk following a missed menses.

LIMITATIONS OF hCG TESTS

Manufacturing Defects

Home pregnancy and point-of-care urine hCG tests are inexpensive disposable devices. Some poorly made devices can give false-positive results in the absence of hCG; others can fail to function properly, resulting in false negatives. These devices incorporate a test window and a control window. The test window indicates a positive or negative result, and the control window indicates that the device is functioning properly (see "Principles of hCG Tests" above). Invalid tests fail to show a band in the control window. In a recent study examining home pregnancy tests (21), two devices (Confirm and Clear Choice home pregnancy tests) gave one or more false-positive results with a urine solution containing 0 mIU/mL hCG. Similarly, 9 of 30 tests using the Clear Choice and 10 of 30 tests using the Confirm were invalid because they lacked the proper formation of a band in the control window. It is important to confirm all home pregnancy and point-of-care urine test results with quantitative serum hCG tests. Qualitative and semiquantitative serum point-of-care tests are calibrated and somewhat better controlled products. Although they offer an improvement over urine point-of-care products, they are still a "second best choice," and positive results need to be confirmed using a quantitative serum test.

The Hook Effect

The "hook effect" is a major limitation of all serum and urine one-step immunometric assays (see "Principles of hCG Tests" section). The hook effect occurs when an extremely high concentration of an analyte such as hCG occupies all the sites on both the capture and detection antibodies and prevents the formation of a so-called "sandwich." The end result is that few or no tracer antibody + hCG + immobilized-antibody complexes are formed, yielding a false-negative result. The hook effect does not occur with two-step quantitative assays because excess analyte is washed away before the tracer antibody is added. This problem has been documented for both quantitative and qualitative hCG assays (25-29). As a rule, if a physician finds that the hCG result is inconsistent with the clinical presentation (e.g., the patient is clearly at 8-10 wk of pregnancy, and a quantitative hCG is 10 mIU/mL, or a point-of-care test is negative), the qualitative or quantitative test should be repeated with 10- and 100-fold diluted sample (9,22,30).
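The dilution rule can be expressed as a simple comparison: if a diluted sample, multiplied back by its dilution factor, reads far higher than the undiluted sample, the original result was probably suppressed by the hook effect. The following is a minimal sketch only. The function names, the `min_ratio` threshold, and all sample values are hypothetical and are not taken from the cited reports.

```python
def dilution_corrected(measured, dilution_factor):
    """Scale a diluted-sample result back to the undiluted concentration."""
    return measured * dilution_factor


def suggests_hook_effect(neat_result, diluted_results, min_ratio=2.0):
    """Flag a possible hook effect.

    diluted_results maps dilution factor -> measured result (mIU/mL).
    min_ratio is a hypothetical threshold chosen for illustration only.
    """
    return any(
        dilution_corrected(value, factor) > min_ratio * neat_result
        for factor, value in diluted_results.items()
    )


# Hypothetical values: undiluted sample reads 10 mIU/mL, but the
# 10- and 100-fold dilutions read far higher once corrected.
neat = 10.0
diluted = {10: 5000.0, 100: 1200.0}

for factor, value in diluted.items():
    print(f"1:{factor} dilution -> {dilution_corrected(value, factor):.0f} mIU/mL")
print("possible hook effect:", suggests_hook_effect(neat, diluted))
```

In a sample without a hook effect, the corrected values from each dilution would be expected to agree roughly with the undiluted result.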

Heterophilic Antibodies

Heterophilic antibodies are human antibodies against other antibodies (human or animal) that can link a capture and tracer antibody in the absence of specific analyte and give a false-positive hCG result (30-33). Because antibodies are large glycoproteins and generally do not cross the glomerular basement membrane and enter urine, this is only a problem in serum, not urine, assays (30-33). The USA hCG Reference Service is a reference facility consulting with physicians in cases of conflicting or nonrepresentative hCG test results. In the last 5 yr, the USA hCG Reference Service has identified 54 cases of women who were erroneously treated for gestational trophoblastic disease, choriocarcinoma, or ectopic pregnancy because of false-positive hCG results (4,22,32-38). In the first few months of operation, the USA hCG Reference Service investigated three unusual cases (22). In all three cases, the women had an incidental pregnancy test that was positive. The positive hCG persisted with small apparent changes in concentrations. Ultrasound, dilation and curettage, and laparoscopy ruled out pregnancy or ectopic pregnancy. The diagnosis of gestational trophoblastic disease or choriocarcinoma was assumed. In two of the three cases, chemotherapy was started, and in one case a hysterectomy was carried out. At this time, the reported hCG concentrations were 17, 53, and 110 IU/L, respectively. In all three cases, the presence of hCG was not confirmed, and it was shown that these assays were subject to interference by heterophilic antibodies. To date, approx 145 individuals have been referred to the USA hCG Reference Service for investigation of potential false-positive hCG results. Fifty-four subjects were confirmed to have false-positive results. False-positive results were identified by the following criteria (22,32-36):

1. the finding of more than a fivefold difference in serum hCG results when tested with an alternative immunoassay (critical criterion)

2. the presence of hCG in serum and absence of detectable hCG or hCG-related molecule immunoreactivity in a parallel urine sample (critical criterion)

Note: Where possible, urine testing should be performed using a quantitative hCG test with high sensitivity (sensitivity 2 mIU/mL). Even though this is an "off-label" application, it is the most appropriate and quickest confirmation of real or false-positive hCG. Point-of-care urine tests are limited to a sensitivity of 25 mIU/mL. We have found that a qualitative urine test is not certain to confirm a serum hCG result unless that result exceeds at least 100 mIU/mL. In addition, because most qualitative assays currently detect intact hCG, they will not detect free β-hCG in the urine in cases of β-hCG-producing germ cell tumor. Germ cell tumors that produce only free β chain can present a clinical picture similar to that of heterophilic antibodies (i.e., unexpected positive quantitative serum hCG, no clinical signs of pregnancy, and a negative qualitative urine hCG test).

3. the observation of false-positive results in other tests for molecules not normally present in serum, such as urine β-core fragment (confirmatory criterion)

4. the finding that a heterophilic antibody blocking agent (such as HBR, produced by Scantibodies Inc.) prevents or limits false-positive results (confirmatory criterion)
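The four criteria above can be read as a checklist applied to a patient workup. The sketch below is illustrative only: the `Workup` structure, its field names, and the example values are hypothetical, and the 5 mIU/mL serum cutoff is borrowed from the manufacturer recommendation mentioned earlier rather than from the Reference Service protocol. Because the text does not state how the criteria are combined into a final determination, the function reports only which criteria are met.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Workup:
    serum_primary: float        # mIU/mL, assay that raised the concern
    serum_alternative: float    # mIU/mL, alternative immunoassay
    urine_detectable: bool      # hCG-related immunoreactivity in parallel urine
    other_tests_false_positive: Optional[bool] = None  # None = not tested
    blocked_by_agent: Optional[bool] = None             # None = not tested


def criteria_met(w: Workup, serum_cutoff: float = 5.0) -> List[int]:
    """Return the numbers of the criteria (1-4) satisfied by this workup."""
    met = []
    high = max(w.serum_primary, w.serum_alternative)
    low = min(w.serum_primary, w.serum_alternative)
    # Criterion 1: more than a fivefold difference between immunoassays.
    if high > 0 and (low <= 0 or high / low > 5):
        met.append(1)
    # Criterion 2: hCG present in serum but not detectable in urine.
    if w.serum_primary >= serum_cutoff and not w.urine_detectable:
        met.append(2)
    # Criterion 3: false positives in tests for other molecules.
    if w.other_tests_false_positive:
        met.append(3)
    # Criterion 4: blocking agent prevents or limits the positive result.
    if w.blocked_by_agent:
        met.append(4)
    return met


# Hypothetical workup: 53 mIU/mL on the original assay, 2 mIU/mL on an
# alternative assay, nothing detectable in urine, blocked by the agent.
example = Workup(53.0, 2.0, urine_detectable=False, blocked_by_agent=True)
print(criteria_met(example))
```

Returning the list of satisfied criteria, rather than a single verdict, leaves the clinical judgment about critical versus confirmatory findings to the reader.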

In all 54 confirmed false-positive cases, there was no prior history of trophoblastic disease or other tumors. Patients were treated for a diagnosis of ectopic pregnancy, gestational trophoblastic disease, or choriocarcinoma. Each case started with an incidental pregnancy test. Forty-five of 54 received needless surgery or single-agent chemotherapy; many received an unnecessary hysterectomy or other major surgery, or cytotoxic combination chemotherapy (32-38). To the best of our knowledge, in all cases, after false-positive hCG was identified, all treatment was halted, even though the quantitative test remained positive. Women having false-positive hCG results may also have falsely elevated results in other unrelated tests such as CEA, CA125, PSA, thyroid hormones, troponin, and other tumor and cardiac markers (39).

In most cases, false-positive hCG results (in the USA hCG Reference Service assays) were eliminated by pretreatment of serum with a heterophilic antibody blocking agent, HBT (Scantibodies Inc., San Diego, CA) (36). It is noteworthy that certain hCG assays seem to have a propensity for producing false-positive results. All of the 53 false-positive cases arose from physicians who were monitoring patients with the Abbott AxSym hCGβ assay, Bayer Centaur test, Bayer ACS-180, Bayer Immuno-1, Roche Elecsys, J&J Vitros ECI, Tosoh Nexia, and Dade Dimension RXL quantitative serum tests or with the Beckman Icon 2 serum point-of-care hCG test. Of the 53 false-positive cases detected by the USA hCG Reference Service, 43 arose from centers monitoring patients with the Abbott AxSym hCGβ assay. This has been observed by other centers (38). The Abbott AxSym test appears to be particularly prone to giving false-positive hCG results. An examination of the instruction sheet shows that animal serum is added in the Abbott AxSym test to just the diluent, rather than to the antibody preparation. As such, undiluted samples may not be protected from heterophilic antibody interference. This may explain the preponderance of false-positive results with this test (36,37).

Many of the clinicians who managed the 54 false-positive cases were misled by transient decreases in the hCG values after chemotherapy or surgery, because the decreases falsely suggested the presence of disease or indicated the success of therapy. The transient decreases were likely caused by an interim weakening of the immune system after chemotherapy or surgery, which reduced circulating heterophilic and antianimal antibody concentrations and led to decreased false-positive hCG results.
