Subject:      Statistics, Lies and Business (Was: The Key Issue)
From: (Arjun Ray)
Date:         1996/10/05
Message-ID:   <534sc1$>
References:   <>
Organization: FUDGE Dispersal Systems
Newsgroups:   comp.infosystems.www.authoring.html

In <>, 
Ken Bigelow <> writes:

| businesses have to pay attention to statistics; they have no choice.

| *Businesses will always base their decisions on recorded facts.* 

Truisms. Don't forget what statistics and which facts.

| You and I both know the truth about "liars, damned liars, and
| statisticians" or variations thereof. 

The correct variation is "There are three kinds of lies: lies, damned
lies, and statistics". Its mordant truth consists of the propensity to
confuse statistics with facts...

| It still won't bother the businesses.

... and businesses are not immune. 

Some businesses (e.g. insurance, credit cards, mortgage loans) are
based on professional statistical procedures, even though most of this
expertise has been automated and few if any experts exist in-house.
Another class of businesses (e.g. market research, polling) in effect
sells such professional services, but barring any rigorous analysis of
methodology, for which the clients demonstrably are *not* qualified,
the quality of the product is moot. The vast majority of businesses
either buy their statistics (directly or indirectly) from consultants
or crank them out themselves, in either case simply taking it on faith
that the presumed relation between *true* statistics and *relevant*
facts holds for those reports on that desk, er, screen.

In general, the faith is misplaced. The biggest mistake (among many)
is to attribute to statistics the reliability it has in experimental
(branches of) sciences. Typically it takes the form of ignoring the
concept of the control and therefore failing to formulate a Null
Hypothesis -- the conclusion you want to reject based on statistical
*un*likelihood. A related error is to ignore (or be unaware of)
systematic bias or drift, and instead blithely assume that averaging
over a larger data set will somehow automatically be more "accurate".
A third error is to forget that random sampling is difficult even
under laboratory conditions (which web sites definitely do not even
remotely approximate), and so *unexplained* variation in the data is
*not* likely to be due to random causes. And so on. The common thread
is the naive fallacy that statistics establish facts. Quite the
contrary: assertions are tested for statistical likelihood.
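
To make the second error concrete, here is a minimal sketch, assuming a
made-up measurement whose readings all carry the same constant offset.
The function name, the true value, the bias and the noise level are
illustrative assumptions, not figures from any real log.

```python
import random

def biased_sample_mean(n, true_value=5.0, bias=0.5, noise=1.0, seed=0):
    """Mean of n noisy readings that all share one systematic offset.

    All parameters are hypothetical; they only illustrate the point.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    readings = [true_value + bias + rng.gauss(0.0, noise) for _ in range(n)]
    return sum(readings) / n

# Larger samples shrink the random scatter, but not the offset.
for n in (10, 1_000, 100_000):
    print(n, round(biased_sample_mean(n), 3))
```

As the sample grows, the mean settles near true_value + bias rather than
near true_value: more data buys precision, not accuracy.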

That said, businesses still do what they do. If they've cranked their
own statistics, drawn their own conclusions, and simply tell you what
they want done, as a hired gun you have no choice -- other than to
refuse the contract, i.e. you're being paid for labor, not expertise.
But when you're supposed to be the expert, what then? You could play
on their ignorance and tell them anything you gauge they're likely to
believe (let's say because you want the numbers to "say" that they
"need" what you're good at), or you could as a professional actually
conduct a statistical analysis.   

| According to these numbers, it still makes no business sense to
| double the costs to go after 1% of the market. There's no profit in
| it; even if that entire 1% bought something, it wouldn't pay back
| the development costs to get it. 

This is both obvious as stated and misleading in its context. Look at
it this way: what percentage *would* justify doubling the costs? And if
such a percentage exists, wouldn't having to double the costs suggest
fundamental flaws in the original plan?

You just knocked down a straw man. The principle you're trying to
establish has a correct formulation, but this wasn't it.

| If you want these businesses to change their approach, you'll have
| to convince them that it makes good sense *to them* to accommodate
| you.

They might be relying on *your* good sense:-)

| This means you'll have to gather provable statistics on how many
| users actually use text-based browsers to check out business sites
| before looking at the images. [...] you'll need to show, by the
| numbers, that there are enough such users on the net to make it
| profitable, *from a business sense,* for those businesses to expend
| that extra effort.

Much closer this time:-) The notion of "extra effort" is very
relevant. But from what baseline? In which direction? Why? Let's look
at your own testimony.

| My resources are measured in time at this point, and my 'profit' as
| clients who hit my site, stay to look around, and maybe come back to
| see what else I may have done. When/if I go commercial, I'll have
| that same business requirement to maximize my profits, which means
| maximum return for minimum outlay. 

Right. Consider first the outlay "quantum" that could be considered
Fixed or Sunk Costs. If you're a professional, it should be no real
effort to produce structurally correct HTML. You know what elements
can be used where, and apart from inadvertent typos or transpositions
this is all relatively routine work. Note that this covers or defines
the effective 100% of the market you could *ever* be interested in.
The first quantum of *extra* effort is in bells and whistles that
degrade gracefully. This too is essentially "costless" -- what
economists call a Pareto-superior outcome (i.e. some gain, nobody
loses.) You're now at the "frontier" -- the zone of true *economic*
decision: because now any further "optimization" requires you to
choose a particular direction where some will gain and others lose.

Time for some numbers, and an example. Let's assume a 5% callback rate
on your site (i.e. 5% of the hits lead to business), and consider the
classic choice of whether to put an imagemap as the only navigation on
your entry or home page: by assumption, this choice means losing all
business from people browsing in text-mode. Let this percentage be 20%
(your 1% figure appears extreme, and I'm choosing a number mainly for
ease of arithmetic). For this choice to break even, you have to make
up the 5%x20% = 1% business you've deliberately forgone with *extra*
business in the *narrower* market -- your callback rate has to go up
to 6.25% on the 80% you went after. In other words, the *economic*
justification of the imagemap can be cast in the following terms:
without it, you have 5% on 100%, and with it, you have *more* than
6.25% of 80%. 
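
The arithmetic above can be sketched in a few lines. This is only an
illustration under the same assumptions as the example (a 5% callback
rate, 20% of visitors lost to the imagemap); the function and variable
names are my own.

```python
def break_even_callback_rate(base_rate, lost_share):
    """Callback rate needed on the remaining audience to match
    base_rate earned on the whole audience."""
    if not 0.0 <= lost_share < 1.0:
        raise ValueError("lost_share must be in [0, 1)")
    return base_rate / (1.0 - lost_share)

base_rate = 0.05    # assumed: 5% of hits lead to business
lost_share = 0.20   # assumed: 20% browse in text mode and are lost

forgone = base_rate * lost_share
required = break_even_callback_rate(base_rate, lost_share)

print(f"business forgone: {forgone:.2%} of all hits")
print(f"rate needed on the remaining {1 - lost_share:.0%}: {required:.2%}")
```

Changing lost_share shows how the hurdle moves: with the 1% figure from
the quoted text, the required rate is only about 5.05%.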

But will this be true? There's no ROI analysis out there on web sites
that I know of based on anything more than SWAGs (sophisticated
wild-ass guesses.) Don't take this particular example too literally:
but the principle involved -- tradeoffs at the margin -- should be
clear, and yes, good numbers will help, too:-)

Besides, what if your next client knows his economics and