VetInfo Digest
June 2006
Table of Contents:
Statistics in Veterinary Medicine
This Month’s Note:
The first thing that you should know about statistics and veterinarians is that most of the “statistics” that are quoted in veterinary offices are made up on the spot. This sounds awful, but people want information quickly and respond best to familiar forms of information, so it is easier to say “50% of patients in this condition will survive” than it is to say “I have no idea what the true odds of survival are because the human memory is very fallible and like most humans I have a tendency to link cause and effect in a way that might not reflect reality, but I’m feeling strongly that your pet has as much chance of survival as it does of dying.” Plus, people seem to think I’m smarter and more reliable when I give them the first line than when I give them the second one, even though the reverse is almost certainly true.
It is probably important to point out here that I am not a statistician. I am aware of the fact that I do not fully understand statistics but I have tried to come to at least a general understanding of them because of the important role they play in understanding veterinary medicine. If anyone has a better or more clear explanation for any of the concepts covered and wants to share it with other subscribers, feel free to write to me.
You are probably familiar with the phrase “The art of medicine”. The meaning of this phrase is based on the way that medicine has been practiced for centuries. Individual practitioners learned through experience how to care for patients. Often there was sharing of information between practitioners but it was still based on the experiences of individuals. Eventually it became possible to compile and compare information among much larger groups of practitioners and their patients. The data collected in making these comparisons had to be analyzed in some way and statistical analysis of medicine became more common. Only very recently have human physicians made the case for “evidence based medicine”, that is, for practicing based primarily on the analysis of studies. In veterinary medicine it is not possible to make this case, yet. In many areas of veterinary medicine there simply are not enough well designed, well executed studies that include enough patients to be able to practice by sticking primarily to procedures, tests and treatments that can be backed up by statistically valid evidence. It is doubtful that there will ever be a time when it is possible to practice completely “scientifically” in veterinary medicine, but it is a good idea to at least understand the basis behind doing this.
People are very resistant to the idea that the connections they make between their observations of events and the causes for them are often incorrect assumptions. Humans have such a strong desire to link cause and effect that they often assume connections between events that are just not there. Once such a connection has taken hold in a person’s memory or thought process it is very hard for people to give up on it. I sometimes still have a very hard time relying on data and statistics that I just don’t believe, for whatever reason, so this isn’t just a problem for people who haven’t been introduced to the idea of analyzing data to improve the accuracy of their assumptions.
The availability of information, good and bad, on the internet, in textbooks, in numerous journals and other written materials makes it nearly impossible for a practitioner to be up to date on every subject. It is helpful if you are willing to take on some of the responsibility and research yourself. Find the breed related problems, keep an eye out for information about health care issues, such as the recent pet food recall and do a little research when your pet is suspected of having a disorder such as hyperthyroidism or pancreatitis. You may be able to find information useful to you and to your vet. You will have a lot more success discussing this information with your vet if you take the time to sort out the unsupported information from the information that has statistically valid data to support it, though. Even if you want to pursue something that isn’t supported scientifically it helps a great deal in the discussion with your vet if you are aware of that fact before initiating the discussion.
Statistics play an incredibly important role in modern medicine. It is critical for veterinarians to have some understanding of the statistical basis for practicing a certain way and to be able to explain that basis in a manner that clients can understand. Some veterinarians who really understand statistics and how to apply them don’t have a clue how to explain them to a client. Other vets who are very good at communicating have less skill in correctly interpreting statistics. For this reason, if you are capable of doing your own research and are interested in researching conditions that affect your pets, it is helpful to also have some idea how statistics work in relation to medicine and medical practices. Even if you have no interest in these things, it still helps in asking questions about your pet’s health if you can interpret the statistical basis for the answers given, at least to some degree.
Mark Twain credited this quote to Disraeli, so hopefully he is correct: “There are lies, damned lies and statistics.” I think that very many people have this opinion. It is not the fault of the statistics themselves, usually, when statistics mislead. It is more often the result of inappropriate use of statistics, or even deliberately misleading use of statistics, that gives this impression. It doesn’t help that similar studies sometimes yield very different statistical information about a particular problem but understanding why that can occur can make it much more clear when you are hearing a lie and when you are hearing a statistic.
Perhaps the first step in reviewing how statistics are used in veterinary medicine is to review how the scientific process is supposed to work. In science the ideal situation is to formulate an hypothesis, design a study that will prove or disprove that hypothesis, test the hypothesis with the study and then revise the process based on the information obtained, until it is possible to prove or disprove the original hypothesis or to formulate a new hypothesis that matches the data generated. This sounds complicated just to describe but it is a much more complicated task in medicine than it is in some other fields of endeavor. In many fields it is possible to isolate the subjects of the study so that only one variable changes in the environment, making it the sole reason that an action causes a reaction or a process occurs. In medicine it is often impossible to completely remove all variables except one. There are just too many variables that affect a living organism. This is one of the reasons that many people continue to argue that medicine is an art rather than a science. Despite that feeling, it is possible to come up with ways to study some aspects of medicine in a scientific manner. It seems unlikely that we will ever figure out a way to test all aspects of medicine using scientific principles, though. There are far too many variables to control all of them when studying live subjects.
The very basics of statistics
At the very basic level the study of statistics consists of compiling information and attempting to apply some order to it so that the information makes sense. This is accomplished in a number of familiar ways. I’m sure that in computing you have probably heard the expression “garbage in, garbage out”. It is equally true in studying medicine that the data collected must be good in order to obtain a useful study.
The data collection method has to be accurate, precise and repeatable for a study to mean much. These terms overlap some. Accurate data is usually repeatable data but accuracy refers to the “correctness” of the data. Precision is a little different but also implies repeatability. To give an example, if the speedometer in your car is 5 mph slow, but always exactly 5 mph slow, it is not accurate but it is precise. This type of variation in data can easily occur with laboratory equipment, as well.
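The speedometer example can be put into numbers with a short Python sketch. The readings below are invented for illustration, and the variable names are my own; the point is only that the average reading is far from the truth (poor accuracy) while the readings agree closely with each other (good precision).

```python
import statistics

# Hypothetical readings, invented for illustration: the car is really
# travelling at 60 mph each time the speedometer is read.
true_speed = 60.0
readings = [55.0, 55.1, 54.9, 55.0, 55.0]

# Bias (a measure of accuracy): how far the average reading is from the truth.
bias = statistics.mean(readings) - true_speed

# Spread (a measure of precision): how much the readings differ from each other.
spread = statistics.stdev(readings)

print(f"bias: {bias:+.2f} mph, spread: {spread:.2f} mph")
```

A large bias with a small spread is the "always exactly 5 mph slow" situation: wrong, but wrong in a predictable way that can be corrected for.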
There are a number of types of errors that can occur during data collection. Machines often introduce predictable errors. It is not usually a problem in a study when a machine error is recognized and accounted for. An example of one type of machine error is non-linearity, the tendency of machines to have a variation in the accuracy or precision of data as weight increases or color changes (in measurements involving colorimetry). As long as the changes are predictable and accounted for this sort of problem usually doesn’t impact the value of data produced.
Human error and human bias are more difficult to detect and to control when collecting data samples. Humans have a very strong tendency to see what they want to see, or what they expect to see. This often has an impact on data collection. This type of error is particularly prevalent in subjective studies where the examiner must make a judgment, such as improvement in lameness or in studies that depend on owner surveys of how a pet feels. Lack of familiarity with testing equipment can lead to errors. Boredom while doing repetitive tasks such as counting bacterial colonies can impact the accuracy of results. In general, it is a good idea to read the descriptions of how a study was done to get an impression of how likely it is that human error is impacting the results of the study.
Once data has been collected it has to be organized and presented in a manner that is understandable. Most people have seen examples of a pie chart or a bar graph designed to make it easy to see what the underlying data represents. For uncomplicated collections of data these are often sufficient to allow people to interpret the data correctly by themselves. More complex problems require more complex methods of presentation and interpretation often requires some effort on the part of the reader. Sometimes people just look at the statistical information included in a study, figure it isn’t worth trying to figure out what it means and decide to just trust the author of the study. This works better when reading studies published in refereed journals than in popular magazines but even there it isn’t a very good plan.
I do think that most veterinarians, over the course of their careers, eventually settle on a few authors or researchers whose work and opinions they trust. Conversely, there will be a few authors whose research or opinions do not appear trustworthy. In general I think that this approach works well (if the initial choice to trust them was a good one), as long as somewhere in the back of your mind you remember that even people whose work is generally good sometimes make mistakes and people whose work is often bad still usually do some good work and it is important to try to recognize it when that happens.
One of the reasons that people find statistics to be daunting and perhaps one explanation for why they often seem like they might be misleading, is the manipulation of data using mathematical formulas before including it in a paper. There are often very good reasons for doing this but it really can distort the meaning of the data if you are not familiar with the particular method of manipulation used.
A very simple example is the arithmetic mean. This is obtained by looking at the number of subjects (5 people, for instance) and the data set (the weights of the subjects, as an example). The individual weights are added up and then divided by the number of subjects. The arithmetic mean can be very misleading when there are subjects whose associated data point is much larger or much smaller than the rest. A 600 lb. person changes the arithmetic mean a great deal in a small study of humans, for instance.
To get around the problems with the mean, early statisticians came up with the concept of the median value. This is the data point that is right in the middle of a set of data. In our example, if the weights were 155 lbs, 168 lbs, 192 lbs, 224 lbs. and 600 lbs., the arithmetic mean would be 267.8 lbs. This obviously distorts the data. On the other hand, the median would be 192 lbs., which seems to better represent the whole group. The median is limited in its usefulness, too. It isn’t really a mathematical calculation, making it hard to use in other mathematical formulations that explain data. It isn’t as likely to be affected by a few individuals whose data varies widely from the group but it still can be.
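For readers who like to check the arithmetic, here is a small Python sketch using the five weights from the example. It also shows what happens when the 600 lb. subject is left out, which is my own addition to illustrate how much one extreme value moves the mean compared with the median.

```python
import statistics

# The five weights (in lbs) from the example in the text.
weights_lbs = [155, 168, 192, 224, 600]

mean_all = statistics.mean(weights_lbs)      # sum of weights / number of subjects
median_all = statistics.median(weights_lbs)  # middle value of the sorted list

# Same calculation with the 600 lb. subject left out (illustration only).
without_outlier = weights_lbs[:-1]
mean_wo = statistics.mean(without_outlier)
median_wo = statistics.median(without_outlier)  # average of the two middle values

print(f"all five subjects: mean {mean_all}, median {median_all}")
print(f"without 600 lbs:   mean {mean_wo}, median {median_wo}")
```

Removing one subject shifts the mean from 267.8 lbs to 184.75 lbs, while the median only moves from 192 lbs to 180 lbs.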
The geometric mean is a calculation that is used because it changes the distribution of the data in a way that makes it easier to present. It is obtained by using the logarithm of the values that make up the data set to obtain a mean. It tends to be closer to the median value than the arithmetic mean in samples that are skewed (a simplification) and it can be used more easily to calculate other statistical values. Sometimes if you miss the mention of the use of a geometric mean rather than an arithmetic mean you get the feeling that the study is flawed as you make a mental estimate of what the arithmetic mean should be.
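As a rough illustration, the geometric mean of the same five weights used earlier (155, 168, 192, 224 and 600 lbs) can be computed in Python by averaging the logarithms and converting back. This is a sketch of the general technique, not the method of any particular study.

```python
import math
import statistics

# The five weights (in lbs) from the earlier example.
weights_lbs = [155, 168, 192, 224, 600]

# Take the logarithm of each value, average the logarithms,
# then convert the average back with the exponential function.
logs = [math.log(w) for w in weights_lbs]
geo_mean = math.exp(statistics.mean(logs))

arith_mean = statistics.mean(weights_lbs)
median = statistics.median(weights_lbs)

print(f"median {median}, geometric mean {geo_mean:.1f}, arithmetic mean {arith_mean}")
```

For this skewed sample the geometric mean comes out at roughly 232 lbs, between the median (192 lbs) and the arithmetic mean (267.8 lbs). Python 3.8 and later also provide `statistics.geometric_mean`, which gives the same result.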
Mean, geometric mean, median and mode (the data point that is repeated most often) are used to try to condense or summarize a group of observations in a single value. The concepts of ranges and variance are an effort to give an overview of all the data. The entire range is simply the difference between the smallest and greatest observations in a set. It is very useful to know the boundaries of a set of observations. Often it is more useful to know the boundaries of portions of the data, though.
Because the mean varies from one sample to the next, the concept of a confidence interval for the mean was developed. This is a range, calculated from a sample, that is expected to include the true mean of the whole population the sample was drawn from. A confidence interval includes a low value and a high value, between which the population mean should lie. A narrow confidence interval (small difference between the low and high value) shows that the sample gives a good estimate of the whole population’s mean. A wide confidence interval shows that the sample gives a poor estimate. The confidence interval is stated with a percentage value. A 95% confidence interval, calculated the same way from many samples, will contain the population mean about 95% of the time. This number is most useful for obtaining an overview of the reliability of the data as it pertains to the population being tested.
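The "95% of the time" idea can be checked with a small simulation. The Python sketch below makes several simplifying assumptions that are mine, not from any study: the population is invented, it follows a bell-shaped (normal) curve, and its spread is treated as known. Real studies usually have to estimate the spread from the sample, which makes the intervals somewhat wider.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population, invented for illustration.
POP_MEAN = 100.0
POP_SD = 15.0
SAMPLE_SIZE = 25
TRIALS = 2000

# About 1.96: the multiplier for a 95% interval under a normal curve.
z = statistics.NormalDist().inv_cdf(0.975)
half_width = z * POP_SD / SAMPLE_SIZE ** 0.5

hits = 0
for _ in range(TRIALS):
    # Draw a fresh sample and build an interval around its mean.
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    sample_mean = statistics.fmean(sample)
    if sample_mean - half_width <= POP_MEAN <= sample_mean + half_width:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the intervals contained the population mean")
```

The printed share should land close to 95%. Any single interval either contains the population mean or it does not; the 95% describes how often the method succeeds over many samples.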
In medicine there are several measurements that relate to the range of values that are frequently used. One of the more frequently quoted ones is the “normal range” or reference range. This is an estimate of the range of values found in healthy animals for a measurement such as the white blood cell count. The most commonly used normal range includes 95% of the healthy population. The very low and very high values for a data set are excluded from the normal range. This can cause a lot of confusion when evaluating laboratory results. Blood is drawn from individuals assumed to be healthy based on clinical examination and history. By definition, 5% of these healthy patients will have blood values that fall outside the “normal range”, even though they appear to be healthy. Patients who are sick may still have blood values that fall inside the reference range, even for values that relate to their illness. To give some idea of how this might work, some dogs will have white blood cell counts on a routine basis that are at or below the bottom of the reference range, usually about 5,000 WBCs. When these dogs experience a doubling of their WBC, the actual count still falls within the normal range because the high end of the reference range for a dog is about 15,500. Still, the reference range is a useful value because it gives some basis for making assumptions about health. These assumptions can be modified based on the individual patient as more is known about him or her.
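To see why some healthy animals must fall outside a 95% reference range, here is a Python sketch. The white blood cell counts are simulated, not real laboratory data, and the average and spread used to generate them are assumptions chosen only so the result lands near the 5,000 to 15,500 range mentioned above.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical WBC counts for 1,000 healthy dogs (simulated, not real data).
healthy_wbc = [random.gauss(10_000, 2_700) for _ in range(1000)]

# Cut the sorted values into 40 equal slices (2.5% each). The first and
# last cut points bound the middle 95% of the healthy animals.
cuts = statistics.quantiles(healthy_wbc, n=40)
low, high = cuts[0], cuts[-1]

outside = sum(1 for v in healthy_wbc if v < low or v > high)
fraction_outside = outside / len(healthy_wbc)

print(f"reference range: {low:,.0f} to {high:,.0f}")
print(f"healthy dogs outside the range: {fraction_outside:.1%}")
```

Every simulated dog here is healthy, yet about 5% fall outside the range simply because of how the range is constructed.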
Another measure applied to ranges is the standard deviation. This is a way of combining the idea of looking at the range with looking at the mean. The standard deviation is, roughly, a calculation of the average of how far individual data points vary from the mean. This gives some idea of where the majority of values in a range will lie. When the data follow the familiar bell-shaped (normal) curve, two standard deviations from the mean in both directions (plus and minus) will include about 95% of the subjects, matching the most commonly used reference range. The standard deviation helps a lot when values in a study vary over a wide range, as it gives a way to tell whether a value should be worried about when it varies a small amount or whether much larger variations from the mean are likely.
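The two standard deviation rule of thumb can be checked with simulated data. This Python sketch assumes bell-shaped (normally distributed) values with an invented average of 50 and spread of 10; with strongly skewed data the share inside two standard deviations can be noticeably different.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical, simulated measurements that follow a bell-shaped curve.
values = [random.gauss(50.0, 10.0) for _ in range(10_000)]

mean = statistics.fmean(values)
sd = statistics.stdev(values)  # sample standard deviation

# Count how many values lie within two standard deviations of the mean.
within = sum(1 for v in values if abs(v - mean) <= 2 * sd)
fraction_within = within / len(values)

print(f"mean {mean:.1f}, standard deviation {sd:.1f}")
print(f"within two standard deviations: {fraction_within:.1%}")
```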
The statistic that was used most commonly to show that a study was valid when I started out in veterinary medicine was the probability value (P-value). A small P-value is better than a larger one, so a P-value of 0.05 is better than 0.10, for instance. It is extremely important to understand two things about probability values as they are used in medical studies. The first thing that you must understand is that the probability value doesn’t work in a vacuum. It is a probability as it relates to the hypothesis the author set up for testing, which is usually a “null hypothesis” stating that there is no difference or no effect. The P-value is most useful when that hypothesis is wrong, based on the data collected. In this case, it is possible to say that there is a statistically significant reason to reject the hypothesis. If the P-value doesn’t show a reason to reject the hypothesis, that doesn’t mean that it proves the hypothesis is true. It just doesn’t prove that it’s not true. So the two things that you have to be sure to recognize are 1) a result can only be statistically significant if it tends to disprove the hypothesis being tested; 2) the P-value can’t be used to prove a hypothesis, only to show that it has to be considered among the likely explanations for the data. I am not sure that I am correct in this assumption, but I think that the P-value is skewed a little in the author’s favor in most studies, because the author gets to select the way the hypothesis is written. On the other hand, some statisticians say that the P-value is skewed against veterinary authors’ hypotheses because the sample sizes are usually small and this makes it harder to make a statistically valid case.
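One simple way to see where a P-value comes from is a permutation test, sketched below in Python. This is only one of many ways to obtain a P-value and is not necessarily what any given study uses. The scores are invented for illustration. The hypothesis being tested is that the treatment makes no difference, so the group labels could be shuffled without changing anything that matters.

```python
from itertools import combinations

# Hypothetical scores invented for illustration (higher = better).
treated = [12, 14, 15, 16, 18]
control = [8, 9, 10, 11, 13]


def mean(xs):
    return sum(xs) / len(xs)


observed = mean(treated) - mean(control)

# If the treatment made no difference, any 5 of the 10 scores could have
# ended up labelled "treated". Try every possible relabelling and count
# how often the difference is at least as large as the one observed.
pooled = treated + control
n = len(treated)
total = 0
as_extreme = 0
for idx in combinations(range(len(pooled)), n):
    chosen = set(idx)
    group_a = [pooled[i] for i in chosen]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diff = mean(group_a) - mean(group_b)
    total += 1
    if abs(diff) >= abs(observed) - 1e-12:  # small tolerance for rounding
        as_extreme += 1

p_value = as_extreme / total
print(f"observed difference: {observed:.1f}, P-value: {p_value:.4f}")
```

Only 4 of the 252 possible relabellings give a difference this large, so the P-value is about 0.016. That is a reason to doubt the "no difference" hypothesis; it is not proof that the treatment works, and with only five animals per group the result says little about how large the benefit really is.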
If you were paying attention you noticed that the author’s assumptions have a lot to do with the way that the statistics are used. It is critically important to read what the assumptions were and to make sure that they really apply to the data presented. It is also important to note what isn’t stated. In a paper I read comparing two methods of cruciate ligament repair, the author noted in the conclusion that neither method was superior. He also stated that after five years the patients who had either surgery were indistinguishable from the control patients. I believe it should be clear that this means that neither surgery provided any benefit, but that wasn’t how the author stated the conclusion.
It is important to remember that just because a result of a study can be shown to be valid based on statistical analysis, that still doesn’t mean that it has clinical or biological significance. If 5% of patients respond to a medication for arthritis pain does that make it a good choice, even if it is pretty clear that the improvement will occur in that many patients, especially if there are medications which work 75% of the time? Does a temporary improvement in the clinical situation matter at all if the condition being studied is rapidly fatal? These types of questions can be hard to answer but it is still important to ask them.
How do the statistics relate to the clinical reality? Many dogs suffer from arthritis pain that severely impacts their quality of life. If you hear that there is twice the risk of bone cancer due to a particular arthritis medication, does that matter? It sounds impressive, but if the risk of bone cancer is 1 in 10,000, a doubling (a 100% increase) means the risk is 1 in 5,000. If the medication is likely to contribute to a comfortable life for your pet, does that small an increase in risk matter? The prospect of great improvement in the quality of life for 4,999 patients but a poor outcome for 1 out of the 5,000 seems like a reasonable gamble to me. You have to assess your own aversion to risk to know how you feel about these sorts of statistics, but do try to find out what the actual increase in risk is rather than the relative risk (percentage changes in risk).
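The difference between relative and actual (absolute) risk is easy to work out once the baseline risk is known. This Python sketch uses the 1 in 10,000 baseline and the "twice the risk" figure from the example; the variable names are my own.

```python
from fractions import Fraction

baseline_risk = Fraction(1, 10_000)  # baseline risk from the example
relative_risk = 2                    # "twice the risk"

# Risk for patients on the medication.
risk_on_drug = baseline_risk * relative_risk

# Absolute increase: the extra risk actually added by the medication.
absolute_increase = risk_on_drug - baseline_risk

# Relative increase, expressed as a percentage of the baseline.
relative_increase_pct = (relative_risk - 1) * 100

extra_cases_per_10000 = absolute_increase * 10_000

print(f"risk on medication: 1 in {risk_on_drug.denominator:,}")
print(f"relative increase:  {relative_increase_pct}%")
print(f"absolute increase:  {extra_cases_per_10000} extra case per 10,000 patients")
```

A doubling of risk is a 100% relative increase, but in absolute terms it adds only one extra case for every 10,000 patients treated.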
If you don’t know anything at all about statistics it is often possible to carefully review the raw data and to draw reasonably good conclusions from that analysis. If the conclusions drawn by the author don’t match your analysis of the data then it is important to understand the statistical analysis used. This is the point where many people throw in the towel and just assume they are right or assume that the author is right. In this case most people choose to believe whichever conclusion they favor in the first place. This situation should at least make you want to do a little additional research to see if other papers support the conclusion you have drawn.
I hope that it is obvious after reading this newsletter that depending on the synopses of articles available in databases online, such as PubMed, has the serious drawback of not allowing you to look at the raw data, the selection process and other elements of the study that might cause you to make a different interpretation of the results than the author. It is OK to use abstracts when they are the only thing available but it is better to have several concurring abstracts rather than relying on one since you can’t know for sure how valid the author’s conclusions are.
I want to return to the note at the beginning of this month’s issue. It really is important to understand that many of the statistics that you hear in veterinary offices, and probably in other medical offices, are more of an attempt to convey general information than they are an attempt to provide factual information. As much as I would like to be able to provide each and every patient a tailor made statistical analysis of their individual condition there is no real way to do that. Unfortunately, it is too easy to use the framework of statistical analysis in conveying information without having the real data to back up what is said. Once again, I truly do not believe that your veterinarian is trying to mislead you deliberately. In most cases your veterinarian is trying very hard to give you information in a format he or she believes will be easy for you to understand. Despite the good intentions it is necessary to ask your vet where the statistics actually came from when you are told something like “50 percent of the patients with this disorder recover without medication.” If you learn to ask where the data came from you will probably get more accurate information over time as your vet learns that you will question him or her about where statistical information came from. It just becomes easier to actually look up the information the first time for clients who are persistent in asking for valid data. You are not being a pain when you ask for this information. You are being a good guardian for your pet!
Contacting Us
We have had some serious problems with email contacts over the last six months. Some of this has been due to problems with our virtual hosting service for the web site. Some of it appears to be due to impersonation of us through the use of false email addresses rigged to look like real email from our web site. Finally, we are forced by the sheer volume of mail to use spam filters and occasionally we miss a legitimate email that gets shunted to the spam folders. If you are having problems contacting me by email you can use any one of these addresses. Please use them one at a time so that email doesn’t get too confusing for me:
mervet@inna.net
mervet@vetinfo.com
miker0409@aol.com
Michal can be reached at vetinfo@vetinfo.com.
If you send us an email and do not get a reply within 2 to 3 days please try again. We try hard to answer most mail within 72 hours, unless we have posted a notification that we are on vacation or otherwise unavailable on the subscriber web site. If you have forgotten the URL for the subscriber site it is www.vetinfo.com/subscriber/subscriber.html
Thanks for your support!
The VetInfo Digest is published by:
TierCom, Inc.
P.O. Box 476
Cobbs Creek, VA 23035.
The opinions expressed in this newsletter are those of Michael Richards, DVM, author.
Copyright 2007, TierCom, Inc.