During my recent talk at the International Marine Aquarium Conference, I outlined the evolution of modern reef keeping, explaining that the hobby had evolved through a series of stages to reach its current status. I suggested that the haphazard trial and error that had helped the hobby reach its current level of understanding would carry us no further. To continue to evolve and grow, the hobby had to enter a new phase where scientific methods replaced the "voodooism" that had characterized too much of what had guided the hobby to this point.
A recent series of articles on metals in artificial seawater and reef tanks would seem to be an example of what I was advocating (Shimek 2002a, 2002b). Although that is true to some degree, the series also illustrates the potential pitfalls and dangers that we face as we enter this new phase of reef keeping. The response to the articles also demonstrates that many reef hobbyists are naive regarding scientific methods and are ill prepared to interpret the validity and usefulness of seemingly significant research.
If the hobby is to benefit from scientific methods, hobbyists need to develop a greater understanding of the principles and methods of science as well as an ability to distinguish between good and bad science. The goal of this and subsequent articles examining the reef tank metals articles is to help develop a sufficient understanding within reef keeping that hobbyists can critically judge articles that appear in hobbyist literature.
What is science?
In its most basic form, science is simply systematic study of phenomena. The Oxford Dictionary defines science a little more elaborately as, "a connected body of demonstrated truths or with observed facts systematically classified and more or less comprehended by general laws, and which includes reliable methods for the discovery of new truths." There are many moving parts to the definition and each matters in deciding whether something is science. It has also been said that the history of science consists of a series of conjectures and refutations. That's because in science, most conjectures are wrong and few are right. In other words, scientists are wrong more often than they are right. So one might more accurately describe scientists as seekers of falsehoods. (This is one key difference between the psyche of the scientist and that of the reef keeper. Scientists accept error as inevitable and the uncovering of errors of others as essential elements of scientific progress. Many reef keepers seem uncomfortable with this notion.)
Scientific knowledge has grown through the centuries as scientists have built on the discoveries (and errors) of previous work. Observations led to hypotheses, which in turn led to tests and experiments. Some experiments supported hypotheses while others disproved them, but each step of the way scientists learned a little more about our world. Scientific methods refer to the use of tested and accepted methods to study phenomena. The use of accepted scientific methods lends credence to one's findings and helps other scientists understand the results. It also helps others replicate the studies to confirm their findings.
The metals studies are useful in illustrating both the use and abuse of scientific methods. In the following sections, I'll examine each stage of the studies and explain what is right and wrong about the author's methods and conclusions.
Scientific work involves a series of steps, all of which determine the accuracy and usefulness of the work. A misstep at any point in the process can undermine the entire process, so each step must be closely examined. Scientific papers generally begin with an introductory section explaining the central question under study. This section will generally review previous studies and relevant literature. It will explain why the study is important, and outline the hypotheses to be tested. In the metals articles the author asserted that metal concentrations in the average reef tank were significantly higher than on natural reefs and that scientific studies had conclusively demonstrated that at the levels found in reef tanks metals were toxic to marine organisms. He proposed that metal accumulation in reef tanks might explain why some tanks deteriorate over time (old tank syndrome).
When critically examining research, one should first question the premises of the author. Are the arguments of the author logical? Given what we know, are his assertions reasonable? If they are, will the approach he proposes address the issues he raises?
Methods-and why you can't always trust the government
Scientific papers always have a methods section devoted to outlining the methods used in the study. One way to judge a study is to examine the methods used and see if the methods are a reasonable means to test the author's hypotheses. To prove that the average reef tank has high levels of metals, one has to measure metal levels in a reasonable number of tanks representing a cross section of hobbyists. Ideally, a sample of participants would be drawn reflecting all the possible variables that might affect metal levels. The age of each system, the experience of each reef keeper, the different sources of make-up water, and so on should be considered in designing a sample. In the case of this study, the author solicited the help of hobbyists who would be willing to submit tank water for evaluation and pay for the analysis. Ultimately, only 23 hobbyists submitted samples. In statistical terms, this is a self-selected sample. Rather than sample a cross section of tanks, the author simply accepted whoever had the money and inclination to participate. As a rule, self-selected samples tend to be unrepresentative of large populations. With only 23 self-selected participants, one should be quite cautious about assuming that any analyses of these tanks can be extrapolated to the hobby as a whole.
The methods section should also outline how the levels of metals will be determined. In the metals articles, the author listed the method as, "Inductively Coupled Plasma Emission Spectrometry or ICP Scan, EPA method 200.7." A long impressive name like this lends an air of authenticity to the work. Most hobbyists have probably never heard of the technique and are in no position to judge the appropriateness of the method. In a scientific paper, the methods section will often present an explanation of why a certain technique was chosen. It will also explain any limitations in the method. The series author, unfortunately, did not address limitations in the method chosen. A review of the scientific literature on the subject of measuring metals in seawater finds significant problems with the ICP method. (Crompton 1989) Saltwater is a complex soup of chemicals in widely varying concentrations. For technical reasons addressed in a recent column by Randy Holmes-Farley, an ICP scan has great difficulty differentiating metals, particularly toxic metals. (Holmes-Farley 2003) Scientists do use ICP scans to study seawater, but the metals are first concentrated using resins or other methods.
So why has the EPA approved the use of ICP scans? Because they are fast and inexpensive. The Federal government's interest is in finding a method that can provide cost-effective data for monitoring sites and enforcing environmental laws. Other methods are more sensitive, but more time-consuming and costly. Consequently, an EPA endorsement of any methodology is not evidence that it is the most accurate or useful for scientific studies.
Every experiment and study has multiple potential errors, and it is important to consider how the author deals with potential error. The samples were collected by individual hobbyists and then shipped to the author who in turn shipped the samples to the lab. This means that 23 different people collected the 23 samples. No attempt was made to filter particulates out of the tested water, and it isn't clear how careful the 23 hobbyists were in collecting their samples. Under these circumstances, the risk for contamination is great. The metals of interest are in extremely small concentrations, and it would be easy for a hobbyist to inadvertently introduce foreign substances into his sample. For example, even if rinsed repeatedly, using the same cup one uses to add supplements or feed the tank would inevitably contaminate the sample water. In a study of professional marine scientists, it was found that even professionals produced wildly varying results when analyzing metals in seawater, probably because of contamination. Because contamination is so easy when testing for metals, very elaborate procedures have been developed to make sure that contamination is minimized at each stage of analysis. The author makes no mention of handling techniques, so it is unlikely that the hobbyists involved used accepted procedures for handling the water.
The hobby tends to confuse accuracy and precision. A measurement might be carried out to three decimal places and still be inaccurate, whereas a measurement carried out to a single decimal place may be much more accurate. ICP scans carry out most metal levels to two decimal places, but does that mean the measurements are accurate to that level? Not necessarily. Computer programs analyze the results of an ICP scan, and then make their best estimate of the levels detected. The stated detection level is the level at which a single element can be detected. However, it does not tell us how well the ICP machine can differentiate multiple metals simultaneously, and that is what a scan does. Consequently, the practical limits of detection are much less precise than the theoretical limits of the technology. Because seawater contains high levels of some metals like sodium and magnesium, ICP operators make sequential dilutions of the samples to determine the concentrations of some of the metals. Each dilution introduces another potential for contamination and error. While professional labs do their best to avoid such problems, detecting metals in seawater tests the limits of even the most conscientious lab technician.
So without going much further than the methods portion of the study, one finds serious methodological flaws that raise questions regarding the likelihood that valid data will come out of the study. The small sample of reef tanks may not be representative of most reef tanks. The method used to determine metal concentrations, the ICP scan, has serious flaws as used here, and the handling of the samples may have introduced contamination.
Lies, damn lies, and statistics
Mark Twain once wrote that there were three kinds of lies: lies, damn lies, and statistics. This brings us to the analysis portion of the articles. Once one has collected the data, the next step is to aggregate it into some sort of useful summary. Since the goal of the study was to determine metal concentrations in reef tanks, a logical first step would be to average the results from all 23 tanks to determine what an "average" tank looks like. On the face of it, this seems simple and obvious, but as it turns out, one of the most serious methodological mistakes in these articles occurred at this point.
An average or mean is calculated by summing all the values and dividing by the number of values. Consider five tanks that measure as follows:
0.02 0.01 0.02 0.03 0.02
The mean for the five tanks is 0.02, which seems like a reasonable average value. We can also calculate something called the standard deviation, which tells us how much the values vary from the mean. In this case, the standard deviation is about 0.01, which means that, for data following a typical bell-shaped distribution, roughly two-thirds of all values fall within 0.01 of the mean and about 95% fall within 0.02, or two standard deviations, of the mean. What happens if one tank is very different from the others? Let's say that five different tanks measure as follows:
0.02 0.01 0.02 1.00 0.02
In this case, the mean is 0.21, roughly ten times the mean of the first example. While mathematically correct, a hobbyist should be wary of assuming that for this sample of tanks, the mean is synonymous with average. Four of the tanks were within .01 of one another, so it seems like the "average" tank should be closer to .02, but the one high value distorts the results. This points out the most serious problem in using a mean to characterize data. Means are sensitive to extreme values. In statistical terms, the 1.00 value is called an outlier. It is a data point so removed from the rest of the data that it is reasonable to suspect that it is an error or aberration.
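The arithmetic for both sets of five tanks can be checked with a short Python sketch. The variable and function names are mine, and I am assuming the sample standard deviation (dividing by n - 1), since the articles do not say which form was used.

```python
import statistics

# The two five-tank examples from the text (units are arbitrary).
typical_tanks = [0.02, 0.01, 0.02, 0.03, 0.02]
tanks_with_outlier = [0.02, 0.01, 0.02, 1.00, 0.02]


def summarize(values):
    """Return (mean, sample standard deviation) for a list of readings."""
    return statistics.mean(values), statistics.stdev(values)


for label, values in [("typical", typical_tanks),
                      ("with outlier", tanks_with_outlier)]:
    mean, sd = summarize(values)
    print(f"{label}: mean = {mean:.3f}, standard deviation = {sd:.3f}")
```

For the first set the standard deviation comes out near 0.007, which rounds to the 0.01 quoted above. For the second set, the single 1.00 reading pulls the mean up to about 0.21 and inflates the standard deviation to more than twice the mean, a sign that the mean no longer describes a typical tank.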
A statistician will look at data and first determine whether the values are reasonable. If outliers exist, he may exclude them, suggest that the tests be repeated, or qualify the results by noting the outliers. Another option is to use the median of the data rather than the mean. This is particularly useful if one has a limited data set and does not want to exclude any of the data. The median is the mid-point in a distribution. It is the middle value when the data are sorted from lowest to highest, so half the values fall at or below it and half at or above it. The median for both examples is .02, which is probably closer to the average tank than the mean value.
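A minimal Python sketch, using the two sets of five tanks from the examples above, shows how little the median moves when one extreme reading is present. The function name is mine and is only for illustration.

```python
import statistics

# The two five-tank examples from the text.
typical_tanks = [0.02, 0.01, 0.02, 0.03, 0.02]
tanks_with_outlier = [0.02, 0.01, 0.02, 1.00, 0.02]


def middle_value(values):
    """Median: the middle reading once the values are sorted."""
    return statistics.median(values)


print(middle_value(typical_tanks))       # 0.02
print(middle_value(tanks_with_outlier))  # 0.02
```

The 1.00 reading simply sits at the top of the sorted list and has no influence on which value lands in the middle.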
One can calculate the median only if one has access to the original data, and the author has refused to publish the data of individual tanks. Consequently, we have no way to calculate the median metal levels. The author did, however, provide standard deviations for each of the examined metals, so we can make some inferences about the range of values. High standard deviations indicate widely varying values. For example, in the study cobalt had an average value of .037 mg/l with a standard deviation of .031. These numbers suggest that two-thirds of all tanks have cobalt levels between .006 and .068, roughly a tenfold difference. Furthermore, they also tell us that 95% of reef tanks have cobalt levels between -.025 (let's call that zero) and nearly 0.1 mg/l. Such a wide range of possibilities raises the question of whether we can draw any conclusions about the level of cobalt in an average reef tank based on these data.
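The cobalt ranges follow directly from the published mean and standard deviation. The sketch below reproduces them in Python. It assumes the readings are roughly bell-shaped, which is what the two-thirds and 95% rules of thumb require, and the function name is my own.

```python
def spread_ranges(mean, sd):
    """Return the one- and two-standard-deviation ranges around a mean."""
    one_sd = (mean - sd, mean + sd)
    two_sd = (mean - 2 * sd, mean + 2 * sd)
    return one_sd, two_sd


# Cobalt figures quoted from the study, in mg/l.
cobalt_mean, cobalt_sd = 0.037, 0.031
one_sd, two_sd = spread_ranges(cobalt_mean, cobalt_sd)

# A concentration cannot be negative, so the lower bound is clamped at zero.
print(f"about two-thirds of tanks: {one_sd[0]:.3f} to {one_sd[1]:.3f} mg/l")
print(f"about 95% of tanks: {max(two_sd[0], 0.0):.3f} to {two_sd[1]:.3f} mg/l")
```

A lower bound that falls below zero is itself a hint that the data are not bell-shaped, most likely because a few high readings are stretching the distribution.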
Large standard deviations should be a red flag for anyone reviewing the results of a study. In this case it means that for some metals, the concentration levels varied widely among the 23 tanks. It also means that we should be very cautious about assuming that the mean values in the articles really represent the average reef tank. This by itself is a serious problem for the study, but an even more egregious statistical error was committed in analyzing the results. Some of the tanks had metal levels below the detectable limits of the ICP scan. For example, if the level of antimony detected was less than .01 mg/l, the print-out read < .01. In other words, the machine was saying that we know the level is no higher than .0099 mg/l, but there is no way of knowing how much lower it might be.
The author chose to ignore any undetectable levels when calculating mean values. For example, the detectable limit for arsenic is .01 mg/l. One tank had a level of .02 mg/l and none of the other tanks had detectable levels. The author then claimed that the mean for arsenic was .02 mg/l. Is this reasonable? Excluding 22 of 23 tanks because the ICP scan detected no arsenic potentially creates the false impression that high levels are present in the average tank. A more reasonable method to treat undetectable levels is to use the detection level to calculate a mean. Using the detection level for the other 22 tanks, mean arsenic becomes about .01, half what the author claims. And at .01 mg/l, the mean probably overestimates the level of arsenic in the average reef tank.
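The arsenic example can be worked through in a few lines of Python. The function name is hypothetical, and the substitution of zero for the non-detects is my own addition, included only to show the lowest value the mean could take.

```python
def mean_with_substitution(detected, n_nondetect, substitute):
    """Mean over all tanks, with each non-detect replaced by `substitute`."""
    total = sum(detected) + n_nondetect * substitute
    return total / (len(detected) + n_nondetect)


detection_limit = 0.01  # mg/l, arsenic detection limit quoted in the text
detected = [0.02]       # the single tank with a measurable reading
n_nondetect = 22        # the remaining tanks, which read "< .01"

detects_only = sum(detected) / len(detected)  # the approach the text criticizes
at_limit = mean_with_substitution(detected, n_nondetect, detection_limit)
at_zero = mean_with_substitution(detected, n_nondetect, 0.0)  # my assumption

print(f"detected tanks only:     {detects_only:.4f} mg/l")
print(f"non-detects at the limit: {at_limit:.4f} mg/l")
print(f"non-detects at zero:      {at_zero:.4f} mg/l")
```

Substituting the detection limit gives an upper bound on the mean, and substituting zero gives a lower bound. The true mean lies somewhere between the two, and both are well below the .02 mg/l obtained by counting only the one tank with a detectable reading.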
Once the author had calculated means for all of the metals, he then proceeded to show elevated metals by comparing the tank test results with published metal levels of natural seawater. The tank concentrations were much higher than the published NSW concentrations. Is this conclusive proof? Not necessarily. The published concentrations of metals in natural seawater are the result of elaborate studies using sophisticated equipment and procedures. None of the published studies used ICP scans to determine metal levels.
One might take the position that if levels found in reef tanks exceed the detection limits of ICP, the method of testing is irrelevant: levels still exceed NSW. The reality is somewhat more complex. As I pointed out earlier, the practical detection limits of ICP when testing seawater are considerably higher than the theoretical limits. Because of this, pristine natural seawater might test higher in metals using an ICP scan than using the methods of published studies. Given this possibility, the author should have tested natural seawater along with reef tank water to see if higher reef tank metal concentrations were an artifact of the chosen methodology.
In part two, we'll look at metals in natural seawater using the ICP scan. We'll also take a look at metal concentrations in several reef tanks not included in the original study to see if reef tanks are really the toxic waste dumps that the author believes they are.
Crompton, T.R. 1989. Analysis of Seawater. Butterworths & Company, London.
Holmes-Farley, Randy. 2003. Aluminum in the Reef tank. Advanced Aquarist July 2003. http://www.advancedaquarist.com/issues/july2003/chem.htm
Shimek, R.L. 2002a. It's (in) the water. Reefkeeping.com February 2002. http://reefkeeping.com/issues/2002-02/rs/feature/index.htm
Shimek, R.L. 2002b. It's still in the water. Reefkeeping.com March 2002. http://reefkeeping.com/issues/2002-03/rs/feature/index.htm