Independence and Randomness

While no reader actually submitted this question, it was presented to me while I was chatting with someone online. The question goes something like this: Suppose you have a ten-sided die (0-9). You want to create four-digit numbers with it, so you roll it four times and write down the results in order, and that gives you one four-digit number. The question is whether numbers with repeated digits will be less frequent.

When I first heard this, I wasn't exactly sure what was meant. What does "less frequent" mean? It turns out that he meant less frequent than you would expect. How often do we expect a number with the same four digits to occur, anyway? Let's assume that we allow 0000 (zero) to be a "four-digit repeat." That is, if our first roll is a zero, we do not have to re-roll, because any digit may appear in any position. In that case there are only ten possible four-digit repeats (0000, 1111, ..., 9999), and there are 10,000 possible outcomes (0000-9999), so on average one in a thousand of our random numbers would be a four-digit repeat.

The idea that four-digit repeats would be less likely is somewhat absurd, really, once you consider a wonderful test for a very basic idea in statistics, that of independence. Of course, independence has a formal definition, and it's not very hard to grasp, but the test I suggest is to ask yourself, "Does the ___ know what came before?" In this case, does the die know what rolls came before? No, so each roll is independent and the probabilities we calculated above are accurate. (If we are dealing out cards without replacement, does the deck "know" what came before?)

And think about this: a four-digit repeat has no meaning to dice. It is only in our own minds that we can look at a number and see significance like that. Would one expect one's birthday to turn up less often than expected because it's such a special number?
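The counting argument above is small enough to check by brute force. The following is a minimal sketch, assuming a fair ten-sided die; the variable names are illustrative only. It enumerates all sequences of four rolls, counts the four-digit repeats, and then checks that the fourth roll matches the first three just as often as it would match anything else.

```python
from itertools import product

# Every possible sequence of four rolls of a ten-sided die (0-9).
outcomes = list(product(range(10), repeat=4))

# A "four-digit repeat" is a sequence in which all four rolls show the same digit.
repeats = [rolls for rolls in outcomes if len(set(rolls)) == 1]

total = len(outcomes)
repeat_count = len(repeats)
probability = repeat_count / total

# Independence check: among sequences whose first three rolls already match,
# how often does the fourth roll match as well?
first_three_match = [rolls for rolls in outcomes if len(set(rolls[:3])) == 1]
fourth_also_matches = [rolls for rolls in first_three_match if rolls[3] == rolls[0]]
conditional = len(fourth_also_matches) / len(first_three_match)

print(total, repeat_count, probability, conditional)
```

The conditional fraction comes out to one in ten, the same as the chance of any single digit on any single roll, which is what "the die does not know what came before" means in numbers.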

Infinity

Infinity is a funny concept. I can give you infinitely many things while at the same time taking away infinitely many things and leave you with nothing, infinitely many things, or any number of things I want! Think about this scenario: It's one second to noon. I give you ten gold coins (number them 1-10) and take away the first gold coin I gave you (number 1). At 1/2 second to noon I give you another ten coins (11-20) and take away the second coin I gave you (2). If I continue giving you coins 10n-9 through 10n and taking away coin n at 1/n seconds to noon, when noon arrives you will have no coins at all! (Why? Is infinity a number, so that two times infinity is greater than infinity? Google "cardinality" to learn more.)
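A finite version of the coin scenario can be played out in code. This is only a sketch: the function name `coins_after` is made up for illustration, and a program can run only finitely many steps, so it illustrates the puzzle without proving anything about noon itself. What it does show is the tension: the pile grows by nine coins per step, yet every individual coin k is gone once step k has happened.

```python
def coins_after(steps):
    """Return the set of coin numbers held after the given number of steps."""
    held = set()
    for n in range(1, steps + 1):
        # Receive coins 10n-9 through 10n.
        held.update(range(10 * n - 9, 10 * n + 1))
        # Give back coin n (it was received at step n or earlier).
        held.remove(n)
    return held


for steps in (1, 10, 100):
    held = coins_after(steps)
    # Number of coins held, and the lowest-numbered coin still in hand.
    print(steps, len(held), min(held))
```

After N steps the pile holds 9N coins, but the lowest-numbered coin remaining is N+1. Pick any coin number you like: run enough steps and it has been taken away.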

Inadequate English

The scenario: Bob tosses two quarters and, looking at them, casually announces that at least one of them is heads. What is the probability that both are heads? Your first impulse might be to say "50 percent," but think about this: the set of possible outcomes is {(H,H),(H,T),(T,H),(T,T)}. Since at least one of the coins is heads in three of these cases, and only one of those three is double heads, does that make it one third? In other words, there are two cases in which Bob would say "at least one of them is heads" when the other coin is tails, and only one case where both are heads. Or is that the wrong way to think about it? Does order not matter, so that the probability that both are heads is the same as the probability that just one coin, the "other coin," is heads, and the probability is in fact 50 percent?
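The two readings of Bob's statement can be compared by enumerating the four equally likely outcomes. The sketch below assumes fair coins, and the helper `prob_both_heads` is a hypothetical name introduced here. It conditions first on "at least one coin is heads" and then on "one particular coin (here, the first) is heads." Which of these matches what Bob meant is exactly the ambiguity in the English, and the code does not settle it.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two fair coins.
outcomes = list(product("HT", repeat=2))


def prob_both_heads(condition):
    """Probability of (H, H) among the outcomes that satisfy the condition."""
    consistent = [o for o in outcomes if condition(o)]
    both = [o for o in consistent if o == ("H", "H")]
    return Fraction(len(both), len(consistent))


# Reading 1: all we know is that at least one of the two coins is heads.
at_least_one = prob_both_heads(lambda o: "H" in o)

# Reading 2: we know that one specific coin (the first) is heads.
first_is_heads = prob_both_heads(lambda o: o[0] == "H")

print(at_least_one, first_is_heads)
```

The first reading gives one third and the second gives one half, so the answer depends on how the sentence is turned into a condition.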

Impossible?

How can we define "impossible" statistically? As every elementary statistics or probability text will tell you, for an event A, P(A) = 0 does not mean that A cannot or will never occur. (Why is this? Do you think this is true in the "real world"?) If a probability of zero is insufficient to make an event impossible, we must look at something other than likelihood. Surely the only way for an event in probability-world (and maybe real life) to be impossible is for it to contain a contradiction. That is, the event A must require both B and NOT B, where B is some other event.

Tools for Statisticians

What tools do statisticians use? If you're into free, you can try R, which is a command-based statistical package. It comes with the ability to do plotting, script-writing, and a variety of statistical tests. It does not offer a way to manipulate your data visually, however. If you're desperate, you can use Microsoft Excel. If you want to do much beyond basic descriptive stats, though, you should look elsewhere, because you may have to program the tests yourself. The professionals with big budgets mostly use SPSS and SAS, and by big I mean $1500+ just for the core system. They both offer excellent raw data viewing and manipulation as well as advanced stat tests with many options, and they work with very big datasets. SPSS stands for "Sweet Piece of Statistics Software."*