I can’t say why I needed to know, but let’s just say I have been a little distracted recently while working on a problem. It turns out I could map my problem onto the German tank problem.
I had actually kind of pulled an equation out of the air and proclaimed (to myself), “this looks and feels right”. But I needed something more than my gut telling me that. It turns out that for a uniform distribution (which, for the most part, my problem is) the best estimate of the true population size from a limited sample of serial numbers, in the sense of being the minimum-variance unbiased estimate, is:
N = m + m/k - 1
Where ‘N’ is the population estimate, ‘m’ is the largest serial number of the samples you have and ‘k’ is the number of samples you have.
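For anyone who wants to poke at the formula, here is a minimal Python sketch. It assumes the serial numbers start at 1, are sequential, and that no serial appears twice in the sample. The function names and the sample values are made up for illustration; they are not from my actual problem. The second function averages the estimate over every possible sample of size k for a small, known N, which is a quick way to see that the estimate is right on average.

```python
from fractions import Fraction
from itertools import combinations


def estimate_population(serials):
    """Estimate N from distinct serial numbers assumed to be drawn from 1..N."""
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one serial number")
    if len(set(serials)) != k or min(serials) < 1:
        raise ValueError("serial numbers must be distinct positive integers")
    m = max(serials)
    # N = m + m/k - 1, kept as an exact fraction to avoid rounding
    return m + Fraction(m, k) - 1


def average_estimate(true_n, k):
    """Average the estimate over every possible k-sample from 1..true_n."""
    estimates = [
        estimate_population(s)
        for s in combinations(range(1, true_n + 1), k)
    ]
    return sum(estimates) / len(estimates)


# Made-up sample, purely for illustration
sample = [19, 40, 42, 60]
print(float(estimate_population(sample)))  # 60 + 60/4 - 1 = 74.0

# Averaged over all 3-samples from 1..20, the estimate comes out to exactly 20
print(average_estimate(20, 3))  # 20
```

The exhaustive average only works for small populations, since the number of possible samples grows quickly, but it is enough to convince yourself the formula is not biased high or low.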
This could be used, presuming the serial numbers are sequential, to estimate the number of iPhones or Androids sold. This is far, far from my application, but it is still a fun application of statistics.
In my application I could substitute in an expression for ‘m’ which made my problem identical to the German tank problem. After rearranging the resulting equation I came up with the exact equation I had, essentially, pulled out of the air!
I’m still marveling at the implications of that result. In a few days I have a meeting with people who may or may not be thrilled to know that much of the work they have done for the past couple of years is bogus and that I have the solution to make it all better.