Tuesday, February 12, 2008

More stats on the world. (updated)

It would be lovely if there were easy access to more fine-grained statistics about the world. That would help in making risk assessments.

For example, I enjoy creative writing. I'd like to estimate my chances of ever getting published. If it's low, e.g. 1%, I just won't bother sending out manuscripts. Instead I'll write solely for enjoyment.

A panel of authors once said that 10% of aspiring writers get published. This number is not helpful. The pertinent data would be a set of writing samples at various levels of ability. For each level, the data would give the percentage of authors who eventually got published.

Burgeoning writers can then look for a level similar to their own (or ask friends to rate them) and get the associated probability.
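
To make the idea concrete, here is a minimal Python sketch of such "spectrum odds". Everything in it is an assumption for illustration: the records are invented, the three ability levels are arbitrary, and `publication_rate_by_level` is a hypothetical helper name. It only shows how an overall rate can hide very different rates at each level.

```python
from collections import defaultdict

# HYPOTHETICAL data, invented purely for illustration: each record is
# (ability_level, eventually_published). No real survey is implied.
records = [
    (1, False), (1, False), (1, False), (1, False),
    (2, False), (2, False), (2, True),
    (3, True), (3, False), (3, True),
]


def publication_rate_by_level(records):
    """Return {ability_level: fraction of writers at that level who got published}."""
    counts = defaultdict(lambda: [0, 0])  # level -> [published, total]
    for level, published in records:
        counts[level][1] += 1
        if published:
            counts[level][0] += 1
    return {level: pub / total for level, (pub, total) in counts.items()}


# Overall rate across everyone, ignoring ability level.
overall_rate = sum(published for _, published in records) / len(records)
# Rate conditioned on ability level.
rates = publication_rate_by_level(records)

print(f"overall: {overall_rate:.0%}")
for level in sorted(rates):
    print(f"level {level}: {rates[level]:.0%}")
```

A writer would look up only the rate for their own level and ignore the overall figure.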

Another example is marriage and divorce. Supposedly there are well-known factors that influence the probability of staying together. The well-touted 50% divorce rate is skewed by shotgun weddings, according to a story I heard. If you're not marrying due to unexpected pregnancy, your divorce probability falls below 50%. But it rises if you live together first. Your marriage also has a much better chance if the ratio of positive to negative interactions exceeds 5:1.

It would be nice if there exists a service where people could pay money to fill out a bunch of information about them and their significant other, and get a customized probability score. They might decide to ignore it, but at least curious people can get an assessment.

Another example is startups. 90% of startups fail. Are these startups with great founders and innovative product ideas? Or is it skewed by people who caught entrepreneur fever, without accompanying skills?

Stats are good. More detailed stats would be better.


Some comments are along the lines of "I'm sure if J. K. Rowling knew her chances of getting published were very low, maybe she would have never written Harry Potter."

This completely misses the point of the spectrum-odds. J. K. Rowling is an amazing writer. She is a master of suspense. She weaves a spellbinding world. If she used the spectrum system I talked about, she would probably find that while overall odds of publishing are very low, the J.K. Rowling-specific odds are actually high.

Dan's comment captures this well:

Many of the comments are supporting ignorance. "You probably won't succeed, so it's best that you don't know the odds." Ridiculous: if the overall odds are against you, all the more reason to know the conditional odds.

Another set of comments implies "Don't give up writing just because the odds of publishing are low." Who said anything about giving up writing? If you actually read my post above, I said:

I'd like to estimate my chances of ever getting published. If it's low, e.g. 1%, I just won't bother sending out manuscripts. Instead I'll write solely for enjoyment.

The time spent on printing out manuscripts, binding them, and mailing them could be spent on doing more writing.

Why do we promote making blind decisions? Instead of saying, "The odds are low, so you should avoid discovering what they are," we should be saying, "The odds are low, and you should find out exactly what they are, so you can make a rational choice on whether you will try anyway."


John K. Lin said...

I'm sure if J. K. Rowling knew her chances of getting published were very low, maybe she would have never written Harry Potter. But she had to write. She was a divorced, single mother on welfare who had never been published, let alone written a novel. Her chances of success were probably less than 1%...

Wouldn't it be great if life were so easily deterministic and calculable? Sure, one can always get more detailed stats, but each individual case and situation is usually unique enough that I doubt any detailed stats could truly be tuned to provide an accurate probability of success, especially with things regarding matters of the heart and human spirit.

Philipp Lenssen said...

I believe you need to fight against the odds to get published, Niniane. Besides, there are too many factors involved for any human to come up with a precise number anyway. And probably, if you're the 1%-or-whatever of successful authors, you will not merely write to be published, but because you have something important or challenging to say -- so challenging/ahead of its time that it may well initially fail your "do people get it" rating test. Probably, in the end, the success number also has a lot more to do with how hard you try... and without a little belief against the odds, you might not even end up trying hard.

Now, when it comes to things like shotgun weddings, yeah, I mean then we enter the realms of good advice for success. Apparently, as you indicate, shotgun weddings are not the best recipe for good marriages. Many good similar pieces of advice may be available for other realms, like writing too, and should be reflected by the author (and rejected where it conflicts with personal artistic vision).

Anonymous said...

i think some absurdly low percentage of people who interview at google get jobs there - i sure as hell wouldn't have wanted to know that!

statistician bib said...


Liars use statistics.

But statistics never lie.


pacifist suck said...

Interesting Stats

Military losses for 20 years...

These are some rather eye-opening facts: Since the start of the war on terror in Iraq and Afghanistan, the sacrifice has been enormous. In the time period from the invasion of Iraq in March 2003 through now, we have lost over 3000 military personnel to enemy action and accidents.

As tragic as the loss of any member of the US Armed Forces is, consider the following statistics - the annual fatalities of military members while actively serving in the armed forces from 1980 through 2006:

1980 .......... 2,392
1981 .......... 2,380
1984 .......... 1,999
1988 .......... 1,819
1989 .......... 1,636
1990 .......... 1,508
1991 .......... 1,787
1992 .......... 1,293 <------------------------------
1993 .......... 1,213
1994 .......... 1,075
1995 .......... 2,465
1996 .......... 2,318 8 Clinton years @ 13,417 deaths
1997 ............ 817
1998 .......... 2,252
1999 .......... 1,984 <-----------------------------
2000 .......... 1,983
2001 ............ 890
2002 .......... 1,007 7 Bush years @ 9,016 deaths
2003 .......... 1,410
2004 .......... 1,887
2005 ............ 919
2006 ............ 920
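
The totals annotated in the table depend on which years are grouped together, and that is easy to check. The short Python sketch below sums the yearly figures exactly as posted above; they are not verified against any outside source, and `total` is a hypothetical helper name introduced here.

```python
# Yearly figures copied from the table as posted above (not independently verified).
deaths = {
    1980: 2392, 1981: 2380, 1984: 1999, 1988: 1819, 1989: 1636,
    1990: 1508, 1991: 1787, 1992: 1293, 1993: 1213, 1994: 1075,
    1995: 2465, 1996: 2318, 1997: 817, 1998: 2252, 1999: 1984,
    2000: 1983, 2001: 890, 2002: 1007, 2003: 1410, 2004: 1887,
    2005: 919, 2006: 920,
}


def total(first_year, last_year):
    """Sum the posted figures for all listed years in [first_year, last_year]."""
    return sum(n for year, n in deaths.items() if first_year <= year <= last_year)


# Groupings marked by the arrows and annotations in the table.
print("1992 to 1999:", total(1992, 1999))
print("2000 to 2006:", total(2000, 2006))

# Groupings by the calendar years each president was actually in office.
print("1993 to 2000:", total(1993, 2000))
print("2001 to 2006:", total(2001, 2006))
```

With the numbers as posted, 1992 to 1999 and 2000 to 2006 reproduce the annotated 13,417 and 9,016, while the term years 1993 to 2000 and 2001 to 2006 give 14,107 and 7,033 from the same table.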

If you are confused when you look at these figures...so was I.

Do these figures mean that the losses from the two latest conflicts in the Middle East are LESS than the losses of military personnel during Mr. Clinton's presidency, when America wasn't even involved in a war?

And I was even more impressed when I read that in 1980, during the reign of President (Nobel Peace Prize) Jimmy Carter, there were 2,392 US military fatalities!

These figures indicate that many of our Media & Politicians pick and choose. They present only those "facts" which support their agenda-driven reporting.

Why do so many of them march in lock-step to twist the truth?

Where do so many of them get their marching-orders for their agenda?

Our mainstream print and TV media, and many politicians, like to slant the story to suggest that these brave men and women who are losing their lives in Iraq are mostly minorities! Wrong AGAIN--- just one more media lie! The latest census of Americans shows the following distribution of American citizens by race:


European descent (White) ...... 69.12%
Hispanic ...................... 12.5%
Black ......................... 12.3%
Asian .......................... 3.7%
Native American ................ 1.0%
Other .......................... 2.6%


Now... here are the fatalities by race; over the past three years in Iraqi Freedom:

European descent (white) ...... 74.31%
Hispanic ...................... 10.74%
Black .......................... 9.67%
Asian .......................... 1.81%
Native American ................ 1.09%
Other .......................... 0.33%

You do the math! These figures don't lie... but, the liberal media chooses to ignore the facts...and they sway public opinion! These statistics are published by Congressional Research Service, and they may be confirmed by anyone at:


Now ask yourself these two questions:

"Why does the mainstream print and TV media never print statistics like these?" and "Why do the mainstream media hate the web as much as they do?"

Because the mainstream media is controlled by liberals who have an agenda that is not supported by publication of these facts. Bottom Line? Do your own research and do not be swayed by what you see on TV or read in most newspapers.

Anonymous said...

You're just a killjoy for spontaneity and being surprised aren't you?

Assume P = 1,

1) send your manuscript to lulu
2) marry a robot and
3) work for Fortune 500 company which began as a startup


logical one said...

Hey.. you... "pacifist sux" guy

Interesting, but...

Stop quoting statistics.....

Liberals become emotional and stage mindless protests when they try to decipher logic.

It hurts their fragile widdle minds.

Berkeley is too busy as it is now with the Marine recruiter protests.

Stop making trouble.

Anonymous said...

pacifist suck - you're an idiot or you're manipulating the facts (though you correctly attributed 1980 to Carter)

Clinton was *elected* in 1992, but he didn't start his term until January 20th, 1993, so when you add up all the military casualties during the Clinton years (1993 -> 2000 (not counting Jan 1st -> Jan20th 2001)), there were 7,500 deaths over 8 years.

Under George W. Bush from 2001 -> 2006, 8,792 in *6 years* according to your table and the .pdf you reference. According to CNN, there were 853 soldiers killed in Iraq (not including Afghanistan, other conflicts and accidental deaths), bringing the Bush era deaths to a total of 9,645.

If there had been an active war during the Carter and Clinton years, there would have been a lot more military deaths.

And the American public is angry about the deaths in Iraq because there were many before the war was started that were *against* the war, and a lot more who want the U.S. out of the pre-emptive war we waged in Iraq.

Yes, facts are facts, and the reality is, more Americans die in car accidents every year (about 40,000) and about 20,000 Americans die a year from the flu - a lot more than those killed on 9/11.

Liberal media my ass - the media failed us post-9/11 for fear of looking un-American and let George W. Bush have a free ride going pre-emptively into Iraq.

Anonymous said...

I meant (not counting Jan 1st -> Jan20th 2000 - not 2001)

Anonymous said...

Oops, maybe I am an idiot too :-) - I did mean (not counting Jan 1st -> Jan20th, 2001)

Strider Aragorn said...

I agree that, had some writer (any writer) known their chances of publishing that first book, it wouldn't have stopped them. They wrote just because they enjoyed it or were compelled to do so in some way. Likewise, while there is a degree of luck and chance involved with startups, many new CEOs do it for the opportunity and the chance. My CEO said it best: a good startup will do its best and do it well. Money is like air. You don't worry about air when you need to breathe, you just know that it'll be there. If you focus on good products and doing your job well, the money will be there.

So, focus on doing what you love and doing it well. Success, money, whatever else comes along with it, will come on its own. If you spend too much time focusing on those things, you'll lose sight of the big picture and end up lost in a place you don't want to be.

Alex said...

It's likely that your initial submissions will be rejected. You'll then use the experience you gain (from feedback and so on) to make your later submissions better. This process will increase over time the likelihood of your submissions being accepted.

If there were some magical process that could tell you your initial likelihood of success, it would likely yield a very low number which would put you off. That would be a shame, since then you would not have the opportunity to improve.

Thus the availability of the statistics you desire would actually be detrimental to your overall likelihood of success.

Anonymous said...

C-3PO: Sir, the possibility of successfully navigating an asteroid field is approximately 3,720 to 1.

Han Solo: Never tell me the odds.

Alex said...

The author of "The One Thing You Need to Know" claims that the best predictor of a successful marriage is that each partner has a higher opinion of their partner than their partner has of themself.

So I suppose the most efficient dating strategy would be to begin each date by both filling out a questionnaire where they grade themselves on a scale of 1-10 on axes such as attractiveness, intelligence, wittiness, etc. At the end of the date, you each fill out a report for the other where you grade them similarly. If you both consistently grade the other higher than they graded themselves, then you get engaged!

This seems like a very simple, practical, and efficient way to perpetuate the species. Having written all this, I find myself very surprised that it is not already popular!

Anonymous said...

“It is known that there are an infinite number of worlds, simply because there is an infinite amount of space for them to be in. However, not every one of them is inhabited. Any finite number divided by infinity is as near nothing as makes no odds, so the average population of all the planets in the Universe can be said to be zero. From this it follows that the population of the whole Universe is also zero, and that any people you may meet from time to time are merely products of a deranged imagination.”

-Douglas Adams

Case said...

Check out "The Black Swan".

It talks about succeeding in writing being one of those hard-to-predict events.

Lots of other discussion along the line of your thoughts.

John K. Lin said...

Silicon Valley male: "The odds of a Silicon Valley female being in a relationship are a lot better than those of a Silicon Valley male."

Silicon Valley female: "The odds are good, but the goods are odd."

Case - I think Nassim Nicholas Taleb's "Fooled by Randomness" is better than his second book, "The Black Swan." On that note, I've actually seen a Black Swan - having traveled to Australia.

Adam Sweet said...

Writers, like any other artists, write because they have to. Your success is dictated by where you submit to, not how good you are.

I say, if you're a real writer you will continue to write, no matter what the odds are.

And if you want to make money, you'll submit your material to as many publishers as you possibly can.

Anonymous said...

The bottom line is that there are an infinite number of variables to control for in such a quest to quantify things like relationships and likelihood of publication. And statistics seek to define probabilities and means. What you're really looking for is certainty, reassurance that you won't make a mistake.

That problem is better solved through finding meaning in whatever choices you make in life, not expecting the choices themselves to have any Meaning whatsoever in a larger sense.

BarryNg 黄国团 said...

"It would be nice if there exists a service where people could pay money to fill out a bunch of information about them and their significant other, and get a customized probability score. They might decide to ignore it, but at least curious people can get an assessment."

Good positioning for thoughts; encourage u to start one then!

Min Liu said...

dear niniane,

please write more frequently. thank you.



John K. Lin said...

"Um, ... Well, I think probably, uh, the #1 factor that contributed to our [Google's] success over the past 7 years, uh, is luck...uh, so I don't know if there's much great advice I can give about, you know, how, how you, about you getting that. We were very fortunate in many ways... I think we followed our hearts in terms of research areas."


- Sergey Brin, Web 2.0 conference, as interviewed by John Battelle -
Minute 2:25, 2005-10-06


LMB said...

anonymous & pacifist suck

If not now when?

The Mideast has been perpetual shit-hole since time forgotten.

So you think Uday and Qusay would have led Iraq to a better place?
50 years from now with Uday in charge Iraq would still be a boil on the buttocks of the world for sure.

Saddam lied that he had WMD to keep his evil neighbors at bay.
(In his own words.. he thought the USA would only bomb a few sites and not actually invade)

Played the wrong card. Thought the USA would let it slide that he had WMD

Sad that Iraq is not like Canada or Mexico. Canada and Mexico decry the USA but reap the benefits of not needing more than a cute little military.
(wow how cheap for them)
Attack them and the USA will attack you.

Saddam also said, "Sad but you cannot select your offspring" (referring to his heir Uday)

Who's to blame for his invasion?
The one who has been called on his bluff for sure.
Not the one who made the best move based on meticulous card counting consultants.

If you lie to an authority figure.. a teacher, a cop, a judge, the UN, USA, Congress...
You will not like the results. And deservedly so... (Well the UN has no teeth or balls)

At some point in time I pray the people in the Mid East can have the great life that I have had in
Silicon Valley. My biggest worry.... Will my stock options fall.. Oh the horror...
Or that I had to wing it at the Outlook-booked meeting because I did not prepare.

My good Irish friend grew up in Iran under the Shah's administration.
His parents were working for an orphanage.
Wow .. such hope and brightness.

Uday, Qusay, Amhad, Brasher, have brought no potential advancements to society.
They all need to be destroyed... and in short order.

Painful in the short run.
Necessary sooner or later.
The sooner the better.

I would say Fuck that side of the world.
But the people there are no different than anyone.

I pray for them everyday.

Is world peace possible?

No weapons of mass destruction....
Just Microsoft's attorneys vs. Yahoo's kind of peace.

Dan said...

Many of the comments are supporting ignorance. "You probably won't succeed, so it's best that you don't know the odds." Ridiculous: if the overall odds are against you, all the more reason to know the conditional odds.

Have you read Stephen Jay Gould's The Median Isn't the Message?

"An hour later, surrounded by the latest literature on abdominal mesothelioma, I realized with a gulp why my doctor had offered that humane advice. The literature couldn't have been more brutally clear: mesothelioma is incurable, with a median mortality of only eight months after discovery. I sat stunned for about fifteen minutes, then smiled and said to myself: so that's why they didn't give me anything to read."

Niniane said...

> I'm sure if J. K. Rowling knew her chances of getting published were very low,

You are making an assumption. J. K. Rowling is an amazing writer. If she used the spectrum odds that I talked about, there's a good chance that it would say, "While the overall chances of publishing are extremely low, your chances are actually very good."

John K. Lin said...

I'm not sure I'm assuming J.K. Rowling is an amazing writer - I've never read a Harry Potter book. I just know that her books have been best sellers, which doesn't necessarily mean she's a great writer.

I guess the distinction should be made between being a great writer, being published, and having a best seller.

In Rowling's case, as a never before published writer, her chances of being published were still low - like any other unpublished writer. I think getting an accurate spectrum of odds for Rowling's specific case of being published would be quite difficult, if not impossible. But I guess one can wish.

Niniane said...

I didn't say you assumed she's a great writer.

*I* am saying she's a great writer.

Also, I disagree that she had low odds just because she was unpublished. Her writing talent raised them.

Why are you spouting all this stuff if you've never even read a Harry Potter book?

metal said...

I'm surprised no one else has said this but...

only people who play the lottery actually win.

John K. Lin said...

Even great talents can go unrewarded. How many starving artists died before their works of art rose to fame? Even Shakespeare wasn't as well known or regarded in his own day as he is today.

I was just using J.K Rowling as an example - because she had no friends or connections in the publishing industry, no track record. (I've seen the movies :-)) Publishing (and anything creative / entertainment related - art, tv, movies, music, comedy) is a competitive field where not all talent is rewarded and where the evaluation of talent can be highly subjective.

So, as much as I like the concept of having a spectrum of specific odds / probability distribution, I think the reality of applying what you propose to specific situations would be difficult if not impossible.

And I'm not saying that one should blindly make decisions.

In any case, I also recall seeing a 60 Minutes interview with Rowling and didn't realize the obstacles she had to overcome.

Anonymous said...

Statistics is like a bikini; what it reveals is suggestive, what it hides is vital.

Alex said...

Rather than getting steamed up about the potential benefits of analyzing a bunch of statistics that don't exist, why not seek an answer to the question "What are the chances of someone with my background and interests getting published?" by getting in touch with people who have similar backgrounds to yours and who have already had their work published?

Anonymous said...

Would the quantum hoops stop playing if they had better stats?

Strider Aragorn said...

I'm not saying that Rowling would have stopped or even tried harder had she known the odds. I'm just saying that she didn't write them to be published. The story I heard (not verified) is that she couldn't afford money to buy books for her kids, so instead she wrote her own. It was a friend who read them and told her to try to publish them that led to her fame. To somewhat rephrase what I said earlier, do what you want to do and the rest will fall into place. To do anything less is not only cheating yourself, but everyone else that might benefit from your work.

Anonymous said...

the point of not knowing odds is that they tend to have a discouraging effect which needlessly hampers someone who might have great ideas. jk rowling might have had a lot of feedback prior to being published, indicating that she's a great writer, but the majority do not have that kind of feedback. unfortunately the "rational choice" paradigm as applied to a lot of far-out successes fails completely: consider that the tech industry is led by people who never finished their degrees! how rational is that?

whenever this topic comes up i always think of the infamous discovery of huffman encoding in information theory. whether or not you want to make a rational choice to pursue an endeavor based on odds, i'd rather consider that most of the choices that have resulted in success were not actually made rationally.

Philipp Lenssen said...

> "The odds are low, and you should find out exactly what
> they are, so you can make a rational choice on whether
> you will try anyway."

Niniane, you cannot find out the odds if it's art -- art consists of creating something *original* so by definition there is no prior data on your specific work to base stats on. If you contemplate writing genre fiction, say pulp spy novels, OK, then maybe you can calculate your odds; maybe you can even automate the process of writing this using algorithms (welcome to Orwell's "Fiction Department"). But that's probably not what you're after. And again, if you're an artist working in whatever medium -- writing, painting, movie-making etc. -- you will do so because you *must*, because you have something to say, because you see the world around you and you have a reaction and you MUST get out the reaction... not because you want to find a fail-safe way to get rich or something, but because it's your *desire* to get it out.

Jim Norris said...

I'm not much of a writer, so I have no intuition about what motivates most writers to ply their craft. I get distracted too easily to spend much time writing. For example, I got distracted by the random anti-liberal-media comments posted by "pacifist suck" since those stats were pretty surprising to me. I clicked on the referenced link and found the corresponding table from the FAS doc. Half of p.s.'s death count stats exactly matched the FAS/CRS data, and half of them were just completely made up. Who are these people, and are they so lazy that they can't even come up with a data source that matches their supposed facts? I'm offended that anyone would have such a low opinion of my ability to line up columns of numbers. It reminds me of something I think muller or egnor once said, something like that the most important thing to know about American culture is that if you repeat a lie often enough in enough random places from enough allegedly different sources, it becomes true.