Sunday, September 2, 2012

A nice way to start off a Sunday

Arsenal beat Liverpool 2-0 away from home. Also, Arsenal's new striker, Lukas Podolski, scored his first goal.

Being both Polish and a fanatic soccer fan, I can't believe I never wrote anything about Podolski all this time. For those unfamiliar with this player, here's some background: he was born in Poland, both his parents are Polish, but they left Poland and settled in Germany when he was two years old. He holds dual citizenship. According to FIFA rules, if a player with dual citizenship is wanted by two national soccer teams, he can choose which one he wants to play for; however, he only gets to make that choice once and cannot change his mind later. Podolski chose to represent Germany and not Poland, mainly because the Polish federation was very late to the party and did not call him up to the Polish national squad until he was 21. Some Polish soccer fans are very angry because of this, either at Podolski himself (for being a "traitor") or at the German soccer federation (for "stealing" a Polish player).

What's interesting in all this is that Podolski plays for Germany even though he feels more Polish than German. He always speaks Polish to the Polish media. Whenever Germany plays Poland, he makes a point of singing both national anthems before the game; also, when he scored two goals against Poland in the opening game of Euro 2008, he chose not to celebrate as a show of respect for Polish fans. He is married to a Polish woman and says that he will move to Poland once his professional soccer career is over.

All in all though, one has to remember that even though he is Polish, he is not a Polish soccer player. He started playing soccer in Germany, was picked out by German scouts, was developed by German youth coaches, and gained his experience in the German soccer league. Polish fans who think he's been stolen need to realize that had his parents never moved to Germany, he'd most likely not have become a player of the quality that he now has.

Friday, August 17, 2012

Worshipping numbers

Some people worship numbers. Paradoxically, number-worshippers are bad with numbers. Because they are so bad with them, they’re unable to critically evaluate claims that involve numbers. You can make them believe almost any idiocy, as long as you use lots of made-up or otherwise incorrect numbers in the course of selling it to them.

Some time ago in the UK there was a criminal trial in which the defendant was a mother accused of killing her two babies. Her defense was that both her babies died of sudden infant death syndrome. The prosecution called an expert witness, a pediatrician, who testified that, since studies show that the risk of sudden infant death syndrome occurring in a family similar to the defendant’s is 1 in 8,500, the likelihood of two such deaths occurring in one family is 1 in 73 million. Because the prosecution, the jury, the judge, and the defense were all number-worshippers, this idiotic claim went completely unchallenged. He’s talking numbers. He must be right.

The defendant’s conviction was later overturned on appeal in which it was shown that the expert’s probability claim was completely bogus. Statisticians and health researchers have shown that, first, the assumption of independence is totally unwarranted, and second, that the original calculation involved unconditional probabilities where conditional probabilities should have been used instead. But there is a far simpler reason why this calculation is ridiculous. In order to see this reason, you don’t even have to be good with numbers or know much about probabilities. All you need is a mind that thinks and does not worship numbers. One of the journalists reporting the initial trial saw it right away. Wait a minute, he said, are you telling me that the probability that this woman is innocent is 1 in 73 million? Surely this can’t be right. A great majority of mothers whose babies die did not murder them. But when, instead of thinking, you worship numbers, such simple truths are inaccessible to you. So you can wield your sanctimonious judgment on innocent people with a clear conscience.
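The expert's arithmetic and the journalist's objection can be sketched in a few lines of Python. The 1-in-8,500 figure is the one quoted above; the double-murder rate is a made-up placeholder, included only to show what kind of comparison actually matters. The rounded 1-in-8,500 input gives about 1 in 72 million; the 73 million quoted above presumably came from a less rounded rate.

```python
# Figure quoted in the testimony as described above.
p_sids = 1 / 8500

# The expert's step: squaring the rate, which is valid only if the two
# deaths are independent events.
p_two_sids_if_independent = p_sids ** 2
print(f"1 in {1 / p_two_sids_if_independent:,.0f}")

# The journalist's point: P(two deaths | innocent) is not P(innocent | two deaths).
# What matters is how the competing explanations compare with each other.
# HYPOTHETICAL number, for illustration only: prior chance of a double infant murder.
p_two_murders = 1 / 500_000_000


def posterior_innocent(p_natural, p_murder):
    """P(innocent | two deaths), assuming these are the only two explanations."""
    return p_natural / (p_natural + p_murder)


print(posterior_innocent(p_two_sids_if_independent, p_two_murders))
```

Even with the unwarranted independence assumption left in place, a tiny probability for the innocent explanation says nothing by itself; it has to be weighed against the (also tiny) probability of the guilty one.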

(In this video you can learn about the details of the trial.)

Sunday, August 12, 2012

Soft drinks and GDP

I just read that Coca-Cola in Poland employs about 2,700 people and produces 0.002 (that is, 0.2%) of Poland's GDP. If this is true, then if every worker were that productive, Poland's GDP could be produced by a labor force of about 1.4 million people. (In reality the labor force is almost 18 million people.) Or, equivalently, keeping the size of the labor force at its current level but raising everyone's productivity to Coke level, Poland's GDP per capita would be well over $100,000.
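For anyone who wants to check the back-of-the-envelope math, here is a rough sketch. The first three inputs are the figures from the post; the GDP per capita figure is my own round assumption (roughly Poland's nominal level at the time), not something from the article.

```python
coke_workers = 2_700
coke_share_of_gdp = 0.002
labor_force = 18_000_000  # "almost 18 million" in the post

# Workers needed to produce all of GDP at Coke-level productivity.
workers_needed = coke_workers / coke_share_of_gdp
print(f"{workers_needed:,.0f}")

# How many times more productive a Coke worker is than the average worker.
productivity_multiple = labor_force / workers_needed
print(round(productivity_multiple, 1))

# ASSUMPTION (not from the post): current GDP per capita of about $13,000.
assumed_gdp_per_capita = 13_000
implied_gdp_per_capita = assumed_gdp_per_capita * productivity_multiple
print(f"${implied_gdp_per_capita:,.0f}")
```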

Saturday, August 11, 2012

My dear Mr. Babbage

I am very much obliged to you for sending me cards for your parties, but I am afraid of accepting them, for I should meet some people there, to whom I have sworn by all the saints in Heaven, I never go out.
(Charles Darwin to Charles Babbage.)

Friday, August 10, 2012

The customer is always right. But which customer?

How many times have you heard statements to the effect that bond credit ratings must be biased since institutions that issue bonds pay for these ratings? A lot, probably. These statements are shallow, and wrong. The fact that someone pays to be evaluated does not mean that the evaluation must be biased. If you decide to go to grad school, you'll probably have to take the GRE or some other standardized test. For that test, you'll have to pay the testing company. Does this mean your score will be biased in your favor? Definitely not. Why not? Because test scores, as long as they're not biased, provide schools with valuable information about prospective students, so they want those prospective students to be screened that way. If test scores were biased, schools would no longer require them, which means prospective students would not be willing to pay for them anymore. Even though test-takers pay to be tested, testing companies have strong financial incentives to keep test scores as honest as possible.

Tuesday, August 7, 2012

The most important Master's thesis of all time

Was written in 1937 by Claude Shannon, and is titled A Symbolic Analysis of Relay and Switching Circuits. In it, Shannon shows that certain electric circuits are isomorphic to Boolean algebra.
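A toy illustration of the correspondence, not taken from the thesis itself: treat a closed switch as True and an open one as False, so that two switches in series behave like AND and two in parallel behave like OR. The function names are mine. A Boolean identity then becomes a statement that two differently wired circuits conduct under exactly the same switch positions.

```python
from itertools import product


def series(a, b):
    """Two switches in series conduct only if both are closed (AND)."""
    return a and b


def parallel(a, b):
    """Two switches in parallel conduct if either is closed (OR)."""
    return a or b


# Distributivity, x AND (y OR z) == (x AND y) OR (x AND z), checked by
# comparing the two circuits on every combination of switch positions.
equivalent = all(
    series(x, parallel(y, z)) == parallel(series(x, y), series(x, z))
    for x, y, z in product([False, True], repeat=3)
)
print(equivalent)
```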

Sunday, July 29, 2012

Algebra-shmalgebra

If you have a fever, throw away the thermometer. Yeah, that sounds like a good idea.

Edit: Here is the best rebuke of this ridiculous op-ed that I've found.

Saturday, July 28, 2012

You didn't earn that

There are a lot of Nobel Prize winning Americans who agree with me -- because they want to give something back.  They know they didn’t -- look, if you’ve won the Nobel Prize, you didn’t get there on your own.  You didn’t get there on your own.  I’m always struck by people who think, well, it must be because I was just so smart.  There are a lot of smart people out there.  It must be because I worked harder than everybody else.  Let me tell you something -- there are a whole bunch of hardworking people out there.  (Applause.)

If you were successful, somebody along the line gave you some help.  There was a great teacher somewhere in your life.  Somebody helped to create this unbelievable American education system that we have that allowed you to thrive.  Somebody invested in high schools and universities.  If you’ve got the Nobel Prize-- you didn’t earn that.  Somebody else made that happen.  All these physics labs didn’t build themselves.  Government research created those labs so that all the scientists could use them to do their research.

Warum hast du Angst vor mir?

I was taking my three-year-old Polish Lowland Sheepdog for a night walk. There was a middle-aged woman walking our way on the sidewalk; she was pushing a shopping cart filled with plastic bags filled with stuff. When she was passing me, my dog got really nervous. She's naturally wary of all strangers, and did not like the woman at all; she moved across the sidewalk to be as far away from the woman as possible. Seeing which, the woman said, 'Why are you afraid of me?' Only, she said it in German. (The original is this post's title, in case you were wondering.) Which for some reason totally floored me. Only in New York, I thought quite nonsensically.

When I relayed the incident to my wife, she said, 'Well, she is a Polish sheepdog after all, so no wonder she's afraid of Germans.'

Wednesday, July 25, 2012

Gimme 30

George Dvorsky writes:
Dave Asprey, the Bulletproof Executive, claims that his IQ was raised 30 points by taking creatine and going through Dual N-Back training exercises.
This is one of my smaller pet peeves: talking about IQ scores as though the scale were linear. Increasing IQ from 100 to 130 means an increase from 1 in 2 to 1 in 44, whereas increasing it from 130 to 160 means a jump from 1 in 44 to 1 in 32,000. Another 30 points and you're 1 in a billion.
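These rarities are easy to check, assuming IQ scores follow a normal distribution with mean 100 and standard deviation 15 (the usual convention, but an assumption all the same; real score distributions are not exactly normal in the tails). The function name is mine.

```python
import math


def rarity(iq, mean=100.0, sd=15.0):
    """Return N such that roughly 1 person in N scores at or above `iq`."""
    z = (iq - mean) / sd
    # Upper tail of the standard normal distribution.
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2))
    return 1 / upper_tail


for iq in (100, 130, 160, 190):
    print(iq, f"1 in {rarity(iq):,.0f}")
```

Each equal step of 30 points multiplies the rarity by a factor that itself keeps growing, which is the whole point.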

Sunday, July 22, 2012

For any belief there's a quote supporting it

A hipster clothing store in the East Village posted a chalk-board sign outside their door with a quote from Hemingway: "In order to write about life, you must live it first!" Or something to that effect.

I guess neither Hemingway nor the staff of the hipster clothing store in the East Village have ever heard of Marcel Proust. You know, the guy who is a direct counter-example to the above quote.

Friday, July 13, 2012

On the necessary existence of the elephant in the room

There is a philosophical view which says that mathematics is a way of evolutionary signaling, i.e. an elaborate way of demonstrating fitness (in this case, intelligence) to one's potential mates. This may very well be true but there is an elephant in the room that it does not address. Which is: while the purpose of doing mathematics may be an accident of evolution, its content cannot. It is possible to conceive of a world in which this function of doing mathematics is not plausible, or even a world in which natural selection does not exist at all; however, it is not possible to conceive of a world in which theorems of mathematics are false. Provable propositions are true in every possible world. Inasmuch as mathematics contains theorems about certain modes of reasoning, it follows that there is a realm of human thinking which cannot be an accident of evolution.

Wednesday, July 11, 2012

How come it's not correcting itself?

Sam Wang writes about the deficiencies of the Intrade prediction market:
Even when lots of data are available, such as political polls,  InTrade can still fail. One simple reason is bias: InTrade bettors appear to skew Republican. This could explain why there is such a mismatch between the poll-based Obama win probability (>99% for an election today, probably >80% in November) and the InTrade price (equivalent to a probability of about 0.56). This could be excused on the grounds that the election is far off, and there is uncertainty as to what will happen in the next 4 months. However, there is a third flaw. As I’ve written before, InTrade bettors are habitually underconfident in the face of polling data, even on the eve of an election. Even a 10-point lead in a race is insufficient to drive a market-based probability estimate above 80%. This is perplexing since such a lead is basically a sure thing.
A functional market should be self-correcting. Any systematic bias such as underconfidence or leaning Republican creates an arbitrage opportunity which should be expected to draw new bettors until the point when prices adjust and bias disappears. The question is why this isn't happening in this case.
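To put a number on the opportunity, here is a minimal sketch using the figures from the quote: a contract priced at 0.56 that pays 1 if Obama wins, and a true probability taken at face value as 0.80. It ignores fees, limits on position size, and the cost of tying up money until November, all of which are my simplifications.

```python
def expected_profit(price, true_probability, payout=1.0):
    """Expected profit per contract bought at `price` that pays `payout` on a win."""
    return true_probability * payout - price


profit = expected_profit(price=0.56, true_probability=0.80)
print(round(profit, 2))                 # expected profit per contract
print(round(profit / 0.56 * 100, 1))    # as a percentage of the stake
```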

Monday, July 9, 2012

Your subconscious can only use call-by-value semantics

It's completely incapable of calling by reference. That amazing trick falls squarely within the domain of explicitly conscious reasoning. The subconscious is only able to deal with things directly and does not make the distinction between use and mention. This leads to some very specific bugs, such as the inability to correctly evaluate conditionals. For example, if proposition p is false, the subconscious will erroneously conclude that the implication p -> q must be false as well. Actually, it's worse than that; since the subconscious is incapable of reasoning about propositions without believing that they are true, it will refuse to even parse the conditional and just return a syntax error message.
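For reference, this is what the conscious, call-by-reference part is supposed to compute. A minimal sketch of the truth table for material implication (the function name is mine): the conditional is false in exactly one case, and a false p is not it.

```python
def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


# Print the full truth table.
for p in (False, True):
    for q in (False, True):
        print(f"p={p!s:5}  q={q!s:5}  p -> q = {implies(p, q)}")
```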

Sunday, July 8, 2012

It's official now: Everything is related

I just finished watching the British movie trilogy Red Riding. In the first part there's a scene where a young journalist who's prone to seeing conspiracies everywhere, while talking to his buddy in a bar, says 'Everything is related; show me just two things that aren't related.' His friend replies, 'City and the fucking championship.'

Is the US under-insured?

Numbers below are 2008-2011 averages in 28 OECD countries as per World Development Indicators.

Thursday, July 5, 2012

It's good for you, in that it will make you immortal

This article is a great example of how dumb science reporting can get. A quote:
Coffee-drinking men cut their risk for death by 12 percent after four to five cups of java, according to the study, which was led by the National Institutes of Health's Neal Freedman.
So if I drink 42 cups, I'll cut my risk for death to 0%. Sounds like a good deal to me. Studies such as this one usually define "risk of X" as "risk that X occurs during the duration of the study," which would make the claim that coffee reduces the risk of dying by whatever percent make a lot more sense. But the moron who chose to summarize the study didn't think details like that were important. Next quote:
The report sparked some confusion, too, as coffee drinkers were also puzzlingly more -- yes, more -- likely to die. The reason? Coffee drinkers are also generally smokers. How can coffee drinkers can be both more and less likely to die seems like an arithmetic mystery -- but cut out smoking altogether, and the correlation between coffee and longer lives still stands.
Sure, if you're a dimwit, this could indeed "spark some confusion." So the fact that drinking coffee reduces the risk of death when controlling for smoking, but is correlated with higher risk of death when smoking is not controlled for, "seems like an arithmetic mystery" to you. And you write about science.
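There is no arithmetic mystery here, and a toy table shows why. The counts below are entirely made up (they are not from the study); they are chosen so that coffee drinkers are mostly smokers, which is the pattern the article itself describes.

```python
# (people, deaths) by smoking status and coffee habit. MADE-UP numbers.
data = {
    ("smoker", "coffee"): (800, 160),
    ("smoker", "no coffee"): (200, 44),
    ("non-smoker", "coffee"): (200, 16),
    ("non-smoker", "no coffee"): (800, 72),
}


def death_rate(coffee, smoking=None):
    """Death rate for a coffee group, optionally restricted to one smoking group."""
    people = deaths = 0
    for (s, c), (n, d) in data.items():
        if c == coffee and (smoking is None or s == smoking):
            people += n
            deaths += d
    return deaths / people


# Not controlling for smoking: coffee drinkers look worse off.
print(death_rate("coffee"), death_rate("no coffee"))

# Controlling for smoking: coffee drinkers do better in both groups.
for s in ("smoker", "non-smoker"):
    print(s, death_rate("coffee", s), death_rate("no coffee", s))
```

Within each smoking group the coffee drinkers die less often, yet pooled together they die more often, simply because most of them sit in the high-risk group.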

Monday, June 25, 2012

Alan Turing centennial

It could be a weird contraption made of scrap metal and wood:



It could be a "2-state 3-symbol" whatchamacallit:



Or it could be a few lines of Python code.

A Universal Turing Machine, one of the most powerful concepts ever.
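In case the "few lines of Python" need spelling out, here is a minimal sketch of my own, not the code originally linked. It is a simulator that takes a machine description (a transition table) plus a tape and runs it, with a small example machine that adds one to a binary number. All names and the table format are my choices.

```python
def run(program, tape, state="start", blank="_", max_steps=10_000):
    """Run a Turing machine given as {(state, symbol): (write, move, next_state)}."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        write, move, state = program[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1}[move]
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Example machine: scan right to the end of the number, then carry leftwards.
increment = {
    ("start", "0"): ("0", "R", "start"),
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),
    ("carry", "1"): ("0", "L", "carry"),
    ("carry", "0"): ("1", "L", "halt"),
    ("carry", "_"): ("1", "L", "halt"),
}

print(run(increment, "1011"))
print(run(increment, "111"))
```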

Wednesday, June 20, 2012

Elevator wisdom

One of those elevator screens that show snippets of news has made me aware that there's a Generation Z now. The Generation Naming Committee is not showing great foresight here. Can't wait to see what they're going to name the next one.

Tuesday, June 12, 2012

Congratulations Kuba

I saw an entertaining soccer game today: one in which Poland, co-hosts of the Euro 2012 tournament, tied Russia 1-1 coming from behind. What this means is that the Polish team is one win away (in a game against the Czech Republic this Saturday) from advancing to the quarterfinals of the European Championship for the very first time.

Despite the fact that Poland looked slightly the better team and could just as well have won, the tie feels good because before the game everyone was writing us off as easy prey, as Russia is theoretically a much stronger team than Poland. What also feels good to me is that the goalscorer for Poland was a player that I'm personally a big fan of--the team captain Jakub (Kuba) Błaszczykowski. Błaszczykowski, 27, is currently a right-winger for the Bundesliga champions Borussia Dortmund. I've been following his career ever since he was a wee lad of 18, playing for a small club in Poland, and I've been telling everyone who would listen that one day Kuba was going to be great. Ha! Told you so.

One thing that many fans do not know about this player is that he lives his life in the shadow of an unspeakable tragedy. When he was 10, his father stabbed his mother to death right in front of him. After that he was raised by his uncle and his grandmother. Traumatized by the event, little Kuba refused to leave his bed for a week, and didn't utter a single word for months. Already showing signs of tremendous soccer talent, he had lost the will not just to play but to live at all. Somehow, though, he was able to reclaim that will. So congratulations, Kuba, on today's goal. I knew that one day you were going to do something like this.

Here is a compilation of clips showing Kuba's skills throughout his career:

Tuesday, June 5, 2012

Obviously a vast improvement

From a hilarious post by John D. Cook about the blessings of object-oriented programming:
In the dark ages of programming, functions acted on data. To slice your bread, you passed a bread data structure to a slice function: 
slice(bread);
Then came object oriented programming. Instead of having an external function slice our bread, we would ask the bread to slice itself by calling the slice method on a bread object: 
bread.slice();
Obviously a vast improvement.

Basic principle of science: If something happens, it must be possible

About five years ago I got in the middle of an internet discussion about torture. It took place in the comments section of a blog that strongly opposed legalizing torture as a means for national security agencies to gather intelligence in extremely important national security cases, for two reasons: 1) because torture is morally reprehensible and 2) because it's ineffective in terms of providing reliable intelligence due to the fact that torture victims will say anything, come up with any lie to make the pain stop, so that the information they provide is completely unreliable. I wrote a comment agreeing with the first point but disagreeing with the second, and provided a few examples showing that torture can sometimes "work" (i.e. generate truthful, actionable intelligence that could not have been procured otherwise).

I got many replies, with one consistent theme in most of them: that my examples weren't plausible because everyone knows that torture victims will say anything and come up with any lie they can think of to make the pain stop. I started replying to this theme, arguing that there probably are situations in which the torturer, even though he doesn't know what the truth is (otherwise why torture anyone in the first place), nonetheless has some knowledge of what the truth is not, which means he can tell when his victim is lying to him. As I was drafting this reply, I realized I was full of shit: I didn't have the first clue whether that was indeed the reason why torture worked in my examples, or any examples, and I was basically making it up as I went along. I also realized something more important: that I didn't have to answer this question in the first place, that the argument was already over and I had won it.

You have a clever and a priori very plausible theory as to why event X can't happen. One way for me to disprove your theory is to find an instance of event X having happened. All I need is a counterexample. I don't have to come up with a counter-theory that explains why your theory doesn't work. I don't have to understand why X happens; I can be as mystified and baffled by it as you are. Your theory says X is impossible. I show you that X happened. Since X happened, it is obviously possible, and I don't need to know why it's possible to know that your theory must be wrong.

Friday, June 1, 2012

Best opening sentence in a work of fiction

That I have ever read, at least.
One morning, as Gregor Samsa was waking up from anxious dreams, he discovered that in his bed he had been changed into a monstrous verminous bug.
Makes you want to keep reading, doesn't it?

Tuesday, May 29, 2012

Quite mesmerizing

Another short video by Cristobal Vila, this one about the ubiquity of the Fibonacci Series and The Golden Ratio in nature:

Wouldn't you wish your work space looked like this?



So would M. C. Escher.

(HT: God Plays Dice.)

Copy/improve/paste, part 5: But he just kept right on...

If someone did a scientific survey on this I would probably be in the minority, but I like Lauryn Hill's cover of Killing Me Softly much, much more than Roberta Flack's original. Which I like a lot.

Tuesday, May 22, 2012

I am whatever I say I am

When someone says "I am an X kind of person," the value of X is usually a lie. Oftentimes, however, it's informative as to what the truth is. For example, sometimes it's wishful thinking, in which case if you ask yourself "What kind of person would want to be X?" you'll be on the right track.

Sunday, May 20, 2012

A sentimental journey back in time

Any time I visit a post office to do anything more complicated than buying a book of stamps, get checked at the airport by TSA officers, or deal with the immigration office in any capacity, I get a visceral reminder of my childhood days. Seeing as I grew up in Poland under a communist dictatorship, this is not a good thing.

Friday, May 18, 2012

LaTeX fail

From the mathematician Jeffrey Shallit:
One problem with the proliferation of "open access" journals is the decrease in quality. A good example is this "proof" of Fermat's Last Theorem by a guy who seems to specialize in rather eccentric papers. This paper was passed around to great laughter at the van der Poorten memorial conference in Australia.
Now I am completely incompetent to judge the content of this paper, but I think I'll take Shallit's word for it. The atrocious quality of its LaTeX formatting is, I think, strong Bayesian evidence that Shallit is right. The LaTeX code in the paper is hilariously bad. It's just fireworks of insanity. To any of you who have ever compiled a paper in LaTeX, taking a look at it should be good entertainment.

Saturday, May 12, 2012

Overheard in Washington Square Park

"And then you become your own Goebbels."
Fuck me, that would be a horrible fate indeed. To all my friends: if you ever notice me muttering strange things to myself and if in my ramblings you can distinctly recognize words like Herrenvolk or Lebensraum, please let me know I need help.

Wednesday, May 9, 2012

Copy/improve/paste, part 4: In the name of the father

This is part 4 of the little series about great songs with even better covers. It may blow your mind; I guarantee you it will be the most incredible music story you have ever heard.

Since about the early 90s, I have been a fan of the Polish rock-punk-ska band called Kult, established in the early 80s and led by the singer-songwriter-saxophone player Kazimierz Staszewski, famous in Poland as Kazik. About that time, the band released a record called "Tata Kazika" ("Kazik's Dad") consisting solely of covers of songs written by Kazik's father Stanislaw Staszewski in the 1950s and 60s. The songs were incredible. Their lyrics were (still are) the very best I have ever heard in either of the two languages I know well. Kult's music was also very good. At the time though, like pretty much everyone in Poland, I had no idea who Stanislaw Staszewski was or that Kazik had a dad who was himself a musician.

During World War II, Stanislaw Staszewski was a soldier in the underground anti-Nazi Home Army. As a participant in the Warsaw Uprising, he was arrested by the SS and shipped to the Mauthausen-Gusen concentration camp in Germany. While in the camp, he contracted pneumonia. The camp guards mistakenly thought he had died, and threw him into the heap of dead bodies that were to be burned later. However, a package addressed to him had arrived at the camp, which prompted one of the inmates to go look for him. That's how it was discovered that he wasn't quite dead. He managed to survive until the camp was freed by the Americans. He then returned to Poland, became an architect, got married and had a little son named Kazik. He also started writing songs that he would perform at house parties for his friends. Exiled from Poland by the communists, he and his wife left for France. He left his family though, and went on to live by himself in Paris, still writing songs, God only knows for whom. He died there in 1973.

His son Kazik grew up without a father, and, for most of his life, passionately hated him for abandoning his mother. That's part of why, even though he knew his music and was a musician himself, Kazik never touched his dad's songs until the 1990s. Another reason for this reluctance was that Kazik was at the time a deeply religious man (a Jehovah's Witness, though not baptized), and his father's lyrics were dark, depressing and quite vehement in their denial of the existence of God. They were exactly what you would expect from writings of someone who once lay dying in a heap of naked bodies destined for a crematory furnace. They were also magnificent.

At any rate, at some point Kazik changed his mind and his band made two records with covers of his dad's songs ("Tata Kazika" and "Tata 2"). Some time later he also changed his mind about religion and became quite an outspoken atheist.

Below the fold are two versions of the song Celina, which is a story of love, jealousy and murder, set in a poor and crime-ridden neighborhood of pre-war Warsaw. The cover isn't actually the official version from the "Tata Kazika" record; rather, it appears to be a private performance by Kazik, but whatever, it doesn't matter.

Thursday, May 3, 2012

Clever words about stupid words

The stupid stuff is this essay by Stephen King. Of which Mike Munger had this to say:
So my man wants the government to both "fix global warming" and "lower the price of gasoline". Nice work there, Steve. Your political economy is way scarier than your fiction.

Monday, April 30, 2012

How about some perspective

Is it me, or are today's politicians and media people more prone to exaggeration than they used to be some two decades ago or so? My favorite recent example: the chairman of the House Committee on Homeland Security, Rep. Peter King (R-N.Y.), called the misconduct of Secret Service agents in Cartagena, Colombia "the worst moment in the history of the Secret Service."

Seriously? If I remember correctly, there was once a crew of Secret Service agents who allowed the President to be shot dead on their watch. I really don't think that has been topped yet. 

Saturday, April 28, 2012

A lighthearted prank with a dark side

The video below is a funny little prank that, on second viewing, is much less funny. Watch the slow motion part, and keep in mind that these girls are friends and roommates. The first girl out the door is desperately trying to shut it in her friends' faces. The second girl pushes the third one out of her way to get to the door faster. Female friendship.

Like taking candy from a kid

If you haven't felt homicidal fury in a really long while, watch this video.

Like French

Evaluating the Design of the R Language is the first paper about R aimed at the academic programming language community. Memorable one-liner:
As a language, R is like French; it has an elegant core, but every rule comes with a set of ad-hoc exceptions that directly contradict it.

Wednesday, April 25, 2012

Lie out

A word of wisdom from a long-ago engineering colleague: "Whenever I see an outlier, I’m never sure whether to throw it away or patent it."
--Berton Gunter (R-help mailing list, December 2009)

Nature actually found a third solution: Let outliers have a bunch of little outlier babies and then expose those to selective pressures.

Friday, April 20, 2012

Famous quotes that get things wrong: Einstein

Albert Einstein is generally credited with the following thought:
The definition of insanity is doing the same thing over and over and expecting different results.
I'll prove that the quote is wrong by counterexample. Texas Hold 'em is a zero-sum game, which means it has at least one minimax strategy profile. Due to the fact that the outcome of any particular hand is determined not only by players' strategies but also in part by chance, it is possible for any player to lose a hand while playing his minimax strategy. The quote above advises a player to change his strategy in such an event. Since, by assumption, the player's strategy is optimal, this advice is absurd. QED.
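The same point can be seen in a much smaller zero-sum game than Hold 'em. The sketch below swaps in rock-paper-scissors, where the minimax strategy is known to be a uniformly random choice, and plays it against an opponent who always throws rock. The opponent's behavior, the seed and the function names are my own illustrative choices.

```python
import random

# Each move beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def play(rounds, opponent_move="rock", seed=0):
    """Play the uniform-random (minimax) strategy and count wins and losses."""
    rng = random.Random(seed)
    wins = losses = 0
    for _ in range(rounds):
        mine = rng.choice(list(BEATS))
        if BEATS[mine] == opponent_move:
            wins += 1
        elif BEATS[opponent_move] == mine:
            losses += 1
    return wins, losses


wins, losses = play(10_000)
print(wins, losses)
```

The optimal player loses about a third of the rounds and should still do exactly the same thing in the next one.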

Wednesday, April 11, 2012

Penalty kicks and the handicap principle

A few hours ago I caught the first half of a soccer game that my team won 3-0. Coincidentally, the first goal of the game, scored off of a penalty kick by the team captain, was a good example of the handicap principle in action. If all you're doing is trying to maximize the probability of scoring, this is a dumb way to take a penalty, because it's needlessly difficult. But if part of your utility function is showing off, this makes sense: by taking an unnecessary risk of looking really stupid if you fail, you're signaling supreme confidence in your skills.

Tuesday, April 3, 2012

Fourteen years and a funeral

The newly elected coalition governing Poland is planning to pass legislation increasing retirement age from 65 to 67. Some guy named Krzysztof Feusette, who writes for the second-largest Polish daily newspaper, is protesting against this in an op-ed Four Years and a Funeral, in which he attempts to show that Polish men will, on average, enjoy retirement for only four years. He compares this figure with its counterparts in other European countries, showing that it is generally much lower.

In order to fully appreciate how much of a goddamn idiot Mr. Feusette is, let's look at the method he uses to arrive at the average length of male retirement. Here's how he does it (make sure you're sitting down when you're reading this): he subtracts retirement age from life expectancy at birth for men. You read that right: at birth. AT BIRTH. Could we make a one-time exception in retirement law for Mr. Feusette, so that this idiot could retire right the fuck now? (Also, when I check the source he cites, I see life expectancy at birth for Polish men to equal 72.3 years and not 71 years.)

For those of you who want to know what the truth is, I'm not sure exactly, because I couldn't find life expectancy data for each age. But it sure isn't even close to four years. In 2007, life expectancy for Polish men at the age of 65 was 14.6 years.
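To see why subtracting the retirement age from life expectancy at birth is nonsense, consider a toy cohort. The ages at death below are invented purely for illustration and have nothing to do with actual Polish mortality data; the function name is mine.

```python
# HYPOTHETICAL ages at death for a tiny cohort of eight people.
ages_at_death = [1, 30, 55, 70, 75, 80, 85, 90]


def life_expectancy_at(age, deaths):
    """Average further years of life for those who survive to `age`."""
    survivors = [d for d in deaths if d >= age]
    return sum(survivors) / len(survivors) - age


at_birth = life_expectancy_at(0, ages_at_death)
at_65 = life_expectancy_at(65, ages_at_death)

print(at_birth)       # life expectancy at birth
print(at_birth - 65)  # the Feusette method: "years of retirement"
print(at_65)          # what people who actually reach 65 can expect
```

People who die before retirement drag down the at-birth figure, but by definition they are not among the retirees.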

Sunday, April 1, 2012

Empathy

No one is a villain in their own narrative. In your own mental model of the universe, your actions are always justified, or at the very least make sense. Whenever you see anyone do something that to you seems so despicable as to not have any possible justification, you can be sure there is a world in which that action actually is justifiable: The internal world of the person who did this. Think about it: Heinrich Himmler lived in a world in which his actions made sense and were justified. I'm not saying this is karma or poetic justice, or that it was in any way a "just punishment" for what he was doing. But it could not possibly have been a pleasant world to live in.

Monday, March 26, 2012

Recent headline

New Jersey Firehouse Catches Fire.

What's next, New Jersey Police Station Robbed? New Jersey Hospital Gets Sick?

Friday, March 23, 2012

Four-point-four sigmas to the right

Here's something I learned today from an Indian coworker of mine: in a comparison of the very best athletes from different sports, there is a clear all-time winner. It's the Australian cricketer Don Bradman (1908-2001). Below is a comparison of the top five greatest athletes of all time; the comparison metric is the number of standard deviations by which each exceeds the mean of his respective discipline.

   Athlete          Sport        Statistic          Sigmas
1  Don Bradman      Cricket      Batting Average    4.4
2  Pele             Soccer       Goals Per Game     3.7
3  Ty Cobb          Baseball     Batting Average    3.6
4  Jack Nicklaus    Golf         Major Titles       3.5
5  Michael Jordan   Basketball   Points Per Game    3.4

Thursday, March 22, 2012

Judea Pearl wins the 2011 Turing Award

Judea Pearl has won the 2011 A. M. Turing Award. The citation reads: "For fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning." The A. M. Turing Award, given annually since 1966 by the Association for Computing Machinery, is considered the "Nobel Prize of theoretical computer science." Among the recipients are Marvin Minsky, Donald Knuth and Edsger Dijkstra.

Oftentimes, notions that seem so obvious and fundamental that we take them for granted turn out to be much more complex than we initially gave them credit for. It took philosophy and science entire millennia to come up with the first logically correct and useful definition of truth, due to Alfred Tarski. It took a few more decades for Judea Pearl to do the same for the concept of cause and effect. For those of you who don't know Pearl's work, go read his book Causality; I guarantee you it will be one of the most important pieces of prose you'll ever read. (If you would like a short and non-technical introduction, just read the book's closing part, Epilogue: The Art and Science of Cause and Effect.)

The gist of Pearl's idea for defining causality is that you cannot do this without first formalizing the notion of an intervention. An intervention is an act of assigning a specific value to a variable by means that are completely independent of that variable's "natural environment". In Pearl's notation, intervention is denoted by do(X = x); think of it as choosing the treatment in a randomized experiment, or as using the assignment operator in computer programming. Pearl defines the causal effect of a variable X on a variable Y as the probability distribution of Y induced by deliberately setting the value of X to x (i.e. "doing" the do(X = x)). Or, in other words, event A causes event B if and only if the smallest possible set of do() operations performed on the whole system that brings about the realization of A brings about the realization of B as well.
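To make the difference between observing X = x and "doing" do(X = x) concrete, here is a minimal sketch in Python. The three-variable model and all of its probabilities are made up for illustration (they are not from Pearl's book): a hidden common cause Z drives both X and Y, while X has no effect on Y at all. The function names are likewise hypothetical.

```python
from itertools import product

# Hypothetical toy model: Z -> X and Z -> Y, with NO arrow from X to Y.
P_Z = {0: 0.5, 1: 0.5}            # P(Z = z)
P_X1_GIVEN_Z = {0: 0.1, 1: 0.9}   # P(X = 1 | Z = z)
P_Y1_GIVEN_Z = {0: 0.2, 1: 0.8}   # P(Y = 1 | Z = z); X plays no role


def joint(do_x=None):
    """Exact joint distribution over (z, x, y).

    With do_x=None, X follows its natural mechanism (it depends on Z).
    With do_x=0 or 1, X is set by fiat, which cuts the Z -> X arrow.
    """
    dist = {}
    for z, x, y in product((0, 1), repeat=3):
        if do_x is None:
            p_x = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
        else:
            p_x = 1.0 if x == do_x else 0.0
        p_y = P_Y1_GIVEN_Z[z] if y == 1 else 1 - P_Y1_GIVEN_Z[z]
        dist[(z, x, y)] = P_Z[z] * p_x * p_y
    return dist


def prob_y1_given_x1(dist):
    """P(Y = 1 | X = 1) computed from a joint distribution."""
    numerator = sum(p for (_, x, y), p in dist.items() if x == 1 and y == 1)
    denominator = sum(p for (_, x, _y), p in dist.items() if x == 1)
    return numerator / denominator


observational = prob_y1_given_x1(joint())          # seeing X = 1
interventional = prob_y1_given_x1(joint(do_x=1))   # doing do(X = 1)

print(f"P(Y=1 | X=1)     = {observational:.2f}")
print(f"P(Y=1 | do(X=1)) = {interventional:.2f}")
```

In this toy model, seeing X = 1 makes Y = 1 more likely only because X = 1 is evidence about Z. Setting X by hand tells you nothing about Z, so the distribution of Y stays at its baseline, which is what "X has no causal effect on Y" means here.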

This may all sound trivial, but it actually has some non-trivial implications for probabilistic reasoning, which Pearl works out in full detail. Another interesting feature of his formalization of causality is that it provides a natural framework for mathematically rigorous thinking about the old philosophical ideas of "possible worlds" and counterfactuals. For example, it makes counterfactuals non-tautological (that is, it makes them sometimes true and sometimes false, and thus interesting, as opposed to their treatment in vanilla propositional logic in which all counterfactuals are vacuously true and therefore completely useless). Here is a truth-condition of a counterfactual implied by Pearl's theory:
The proposition "If A were true then B would be true" is true if and only if in the possible world closest to ours in which A holds, B holds as well,
where world V is closest to world W iff there does not exist any world Z such that the set of do() operations required to transform W into Z is a proper subset of the set of do() operations required to transform W into V.

Wednesday, March 14, 2012

The law of large WTF?

Andrew Gelman recently pointed out that some financial folks have some very strange ideas about the law of large numbers. Apparently, those strange ideas are deeply ingrained. Here are some excerpts from Investopedia's entry on the law of large numbers:
Definition of 'Law of Large Numbers.' In statistical terms, a rule that assumes that as the number of samples increases, the average of these samples is likely to reach the mean of the whole population.
So far so good; it's an informal definition of the same concept that statisticians call the law of large numbers. But then when you read,
When relating this concept to finance, it suggests that as a company grows, its chances of sustaining a large percentage in growth diminish. This is because as a company continues to expand, it must grow more and more just to maintain a constant percentage of growth.
...you start scratching your head. The above statement, while definitely true, has precisely fuck-all to do with the law of large numbers. As does the next one:
As an example, assume that company X has a market capitalization of $400 billion and company Y has a market capitalization of $5 billion. In order for company X to grow by 50%, it must increase its market capitalization by $200 billion, while company Y would only have to increase its market capitalization by $2.5 billion. The law of large numbers suggests that it is much more likely that company Y will be able to expand by 50% than company X.
No matter how you look at it, this is not an example of the law of large numbers. Whoever wrote this entry does not understand the definition they provided.

Added: I missed the mistake in Investopedia's definition of the law: It's not an assumption, it's a theorem.

More Added: For the sake of comparison, here is Felix Salmon's example of a correct use of the law of large numbers in a business setting:
If you run a bunch of casinos with hundreds of thousands of punters coming in and betting hundreds of millions of dollars, then you can predict with high accuracy the amount of money you're going to make at the end of the quarter.
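The casino example can be checked with a few lines of arithmetic. The sketch below assumes, purely for illustration, that every bet is a one-dollar even-money bet on an American roulette wheel (18 winning pockets out of 38) and that bets are independent; neither assumption comes from Salmon's post. It computes the expected house profit and its standard deviation exactly, with no simulation.

```python
import math

# Assumption for illustration: every bet is a $1 even-money bet on an
# American roulette wheel, where the player wins 18 times out of 38.
P_PLAYER_WINS = 18 / 38


def house_profit_summary(n_bets, stake=1.0):
    """Return (expected profit, standard deviation, sd / expected profit).

    Each bet earns the house +stake when the player loses and -stake when
    the player wins; bets are assumed independent.
    """
    mean_per_bet = stake * ((1 - P_PLAYER_WINS) - P_PLAYER_WINS)
    var_per_bet = stake ** 2 - mean_per_bet ** 2
    expected = n_bets * mean_per_bet
    sd = math.sqrt(n_bets * var_per_bet)
    return expected, sd, sd / expected


for n in (100, 10_000, 1_000_000, 100_000_000):
    expected, sd, relative = house_profit_summary(n)
    print(f"{n:>11,} bets: expected ${expected:,.0f}, "
          f"sd ${sd:,.0f}, sd/expected = {relative:.4f}")
```

Expected profit grows in proportion to the number of bets, while its standard deviation grows only with the square root, so the relative uncertainty shrinks toward zero. That is the law of large numbers at work, and it has nothing to do with how hard it is for a big company to keep growing.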

Sunday, March 11, 2012

Farewell to a Hero

Major Michal Issajewicz, aged 91, died March 4 in Warsaw. Issajewicz was an officer of the Home Army. For most of his military career, he served in the Home Army's anti-Gestapo unit code-named "Umbrella" (Parasol). He is most famous as an important participant in Operation Kutschera, the carefully planned execution of Franz Kutschera, the SS and Reich's Police Chief for Nazi-occupied Warsaw. The operation, whose logistics and planning took many months, was carried out on February 1, 1944. Issajewicz took part in the assassination as the driver of the vehicle that was to stop Kutschera's limo by slamming into it, and as the third, backup executioner. As both the first and second executioners were injured during the operation, Issajewicz turned out to be the one to actually finish Kutschera off; during the retreat from the execution scene he received a head wound. Before the war was over, Issajewicz was arrested by the Gestapo and survived an interrogation in its horrific Warsaw prison, Pawiak, as well as incarceration in the equally horrific Stutthof concentration camp, from which he eventually managed to escape.

Below is a short clip from a 1958 Polish movie called Zamach ("The Assassination") about Operation Kutschera. It contains a reconstruction (from what I know, quite an accurate one) of the actual execution.



Major Michal Issajewicz, R.I.P.

Predictive power

When the financial crisis broke in August 2007, David Viniar, chief financial officer of Goldman Sachs, famously commented that 25-standard deviation events had occurred on several successive days.
Taken literally, this is of course false; no one has ever seen even a single 25-standard deviation event, and no one ever will. What has occurred was probably the most spectacular failure of a mathematical model in the history of mathematical models.
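For a sense of scale, here is a back-of-the-envelope sketch. It assumes what the phrase "25 standard deviations" implicitly assumes, namely that daily moves follow a normal distribution, and it uses 252 trading days per year as a rough convention; both are assumptions of this sketch, not figures from the quoted source.

```python
import math


def upper_tail_probability(sigmas):
    """P(Z > sigmas) for a standard normal variable Z."""
    return 0.5 * math.erfc(sigmas / math.sqrt(2))


TRADING_DAYS_PER_YEAR = 252  # rough convention, assumed for illustration

p_25_sigma = upper_tail_probability(25)
expected_wait_years = 1 / (p_25_sigma * TRADING_DAYS_PER_YEAR)

print(f"P(move > 25 sigma on a given day) = {p_25_sigma:.3e}")
print(f"Expected wait for one such day    = {expected_wait_years:.3e} years")
```

The probability comes out on the order of 10^-138, so the expected wait exceeds the age of the universe by well over a hundred orders of magnitude. Seeing several such days in a row says something about the model, not about the days.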

(The source of the quote is here.)

Friday, March 9, 2012

The process of elimination

At the moment, the GOP electorate behaves a bit like a very indecisive student trying to guess the right answer among four in a multiple choice test question. By process of elimination, the student has already figured out that choices B, C, and D are definitely not the right answers. However, he doesn't like choice A either, so he won't just circle it and be done with the question; instead, he'll keep thinking about the answer some more. And then a little longer.

Wednesday, March 7, 2012

The rest is commentary

The first words of Steven Landsburg's great book The Armchair Economist are:
Most of economics can be summarized in four words: "People respond to incentives." The rest is commentary.
Love the book, hate the line. Sure, technically it's true, but it's true in a way that tautologies are. It's true but completely uninformative. How do they respond to what incentives? The devil's in the commentary. Saying that "People respond to incentives" is the essence of economics is kind of like saying that:
Most of physics can be summarized in four words: "Everything is a wave." 
Most of game theory can be summarized in two words: "People strategize." 
Most of biology can be summarized in five words: "Random mutation and natural selection." 
Most of evolutionary psychology can be summarized in three words: "Cognitive traits evolved." 
Most of statistics can be summarized in eleven words: "When repeated large number of times, random events show predictable patterns."
The rest is just commentary.

Nude security theater

It can't possibly be that easy to beat the TSA's billion-dollar body scanners, right? The TSA can't be that stupid, can they?

Too ridiculous for words

Browsing various items relevant to the heated dispute over whether or not contraceptives should be subsidized, I've encountered a whole lot of beliefs that are not just false but quite obviously so. They're so ridiculous it's hard to believe there are people who actually have those beliefs. Here are some of those (in no particular order):

  • Mandating insurance companies to cover contraception isn't asking taxpayers to pay for someone's contraception.
  • Not requiring that insurance companies cover contraception is the same as denying access to contraception.
  • The fact that the market price of contraceptives is high means there is a market failure.

If you believe any of those things, you should really be ashamed of your ignorance.

Sunday, March 4, 2012

Hello, I'm a horrible person. Will you marry me?

A friend of mine is dating a man in a very bad situation. He's divorced and has two kids with his ex-wife who has full custody of their children and keeps suing him for more and more child support. The net amount he has to pay every year in child support, court costs and legal fees, hovers around $80,000. And it can get even worse any day. His ex-wife is not greedy (she's already remarried to a filthy rich guy). She's much worse than greedy; she's taking pleasure in his suffering. She once told my friend, "By the time I'm through with him, he'll beg for mercy."

You have no idea how many times I've heard people react to stories like this with a self-assuringly judgmental remark to the effect of "Why would you ever marry someone like that?" To all of you who ever said this about someone, here's something you might want to consider. Your wife may be someone like that. Your husband may be someone like that. You may be someone like that. Do you think people are stupid? No one ever marries anyone like that. It's just that one day some people have the bad luck to discover that their spouse isn't really the person they thought they were. Nicole Brown did not marry "the kind of guy who beats his wife, threatens to kill her, stalks her in order to consciously instill the sense of imminent doom in her, and then eventually does kill her." As one of my favorite lines in one of my favorite movies goes,
Nobody knows anybody. Not that well.

Thursday, March 1, 2012

Statisticians are special

Says a statistician:
Statisticians are special because, deep in our bones, we know about uncertainty. Economists know about incentives, physicists know about reality, movers can fit big things in the elevator on the first try, evolutionary psychologists know how to get their names in the newspaper, lawyers know you should never never never talk to the cops, and statisticians know about uncertainty. Of that, I’m sure.

Wednesday, February 29, 2012

Freakonomics and contradicting yourself

Part of Levitt and Dubner's Freakonomics franchise is a bi-weekly podcast. The most recent episode talks about, among other things, economics and political science research into media bias, including research based on Tim Groseclose's measure of ideology called the Political Quotient. Here are two quotes from the show:
Groseclose’s argument, based on his research, is that most news organizations empirically lean to the left, although not as dramatically as some critics might suspect. He ultimately wrote up his findings in a book called Left Turn: How Liberal Media Bias Distorts the American Mind. Now, how did he come to that conclusion -- that the American Mind is being distorted by media bias? Well, Groseclose combined his own findings and existing research to calculate that the average American voter has a “natural” PQ, or Political Quotient, of around 25-30, which is firmly in the conservative range. But, as Groseclose sees it, the left-leaning media pulls some of those naturally conservative voters into the center. Which is why we generally vote about 50-50. Without media bias, Groseclose says, we’d be a much different country.
...and then later on:
(...) having categorized all this language along Democratic and Republican lines, Gentzkow and Shapiro looked at how often a given newspaper used these signature phrases. And from that, they were able to determine each newspaper’s political slant. But it was the next step that really mattered: figuring out where a slant comes from. In other words, is it that reporters have a bias that gets into their stories, or maybe newspaper owners demand a certain line of coverage? They looked into these factors and more -- including one very clever indicator: the voting patterns of the people who read a particular newspaper. Their finding? The most important factor driving the slant of a given newspaper is … the political leanings of the people who buy it. In other words: newspapers are giving the people the news that they want.
The show presents both these findings as take-home points, which is preposterous since at least prima facie they are in blatant contradiction to each other. Now there may very well be an explanation as to how those things can both be true; but Dubner and Levitt seem to think there isn't anything to explain in the first place. (Arguing that the average media consumer is more left-leaning than the average citizen doesn't work because how could the media then be "distorting the American mind"?)

C'mon guys, you gotta do better than that.

Happy Birthday R

Speaking of R, it is celebrating a birthday today. Quoting from an R-help mailing list post by Andy Bunn:
R is refined, tasteful, and beautiful. When I grow up, I want to marry R.
Well, given that R is twelve, Andy might be waiting for the wrong party to grow up. At any rate, to the creators of R: Holy shit, thank you.

Tuesday, February 28, 2012

Editors large and small

Now that the makers of RStudio added seemingly all of the features it was missing, both minor (bracket matching) and major (project manager and version control), their product is officially perfect. It's truly amazing. It basically has all of the functionality of Eclipse with StatET but is much easier to install and runs R code much faster in interactive mode.

Speaking of speed though: sometimes it's still not quite fast enough. This means there's sometimes a need for me to have a backup solution: something that would be a decent code editor (i.e. have at least syntax highlighting, an object browser, and block commenting/uncommenting of code lines), while allowing me to push code to the R terminal in order to run it. I did try Emacs and Vim but quickly gave up. It may very well be true that they're much better than everything else, but I'm not a programmer, I'm just a data analyst, so forgive me for thinking that a code editor should be something that makes my life easier right now as opposed to five months down the line. I've also tried Tinn-R but that was a disappointment: it doesn't have an object browser and the interface is horribly cluttered.

I did finally find a good piecemeal solution though: Notepad++ with NppToR, with object browsing via this simple function written by Petr Pikal. Here's a screenshot:


It works well. Still, I don't use it unless I absolutely have to. RStudio is just so much more convenient.

Sunday, February 26, 2012

My favorite line from a motivational speech

It comes from Herb Brooks's locker room speech to his team before the Miracle on Ice game:
If we played them ten times, they might win nine. But not this game. Not tonight.

Wednesday, February 22, 2012

Things I wish would die: Some extremely annoying phrases

If you've ever said he broke his silence or referred to anyone as a policy wonk, I am hereby giving you a one-finger salute.

A concise description of the idea behind evolutionary game theory

People don't change their minds. They die, and are replaced by people with different opinions. 
--Arturo Albergati
OK, so maybe this is just half of this idea. But it is concise.

(I haven't been able to confirm the original source. I'm quoting after Paul Graham.)

Thursday, February 16, 2012

Not for couples

In case you don't already know this: Valentine's Day is not "for couples" or "for lovers". Valentine's Day is for women. Via Robin Hanson:
AshleyMadison.com, a personals site designed to facilitate extramarital affairs … enjoyed another big boost this week, following Father’s Day, when CEO Noel Biderman says men often feel underappreciated. Traffic to the site tripled on Monday. (Biderman says there’s a similar boost in interest from neglected wives and girlfriends after Valentine’s Day.)
Or, here's a bit more direct line of evidence: On average, men spend twice as much as women on Valentine's Day gifts. Note that these averages are over all gifts purchased on that day, regardless of the recipient. If we were to consider heterosexual couples only, and calculate the average spending of men on gifts for their partners and vice versa, I guarantee you the disparity would be even larger. 

Thursday, February 2, 2012

Negative prophet

Apparently, Punxsutawney Phil is right only 39% of the time. Since he's predicting an outcome of a binary variable (either "long winter" or "early spring"), this makes him a valuable expert: if you bet against Phil, you'll be right 61% of the time.

Sunday, January 29, 2012

Good solution vs. "the" solution

Most interesting problems have more than one solution; it's also usually the case that you can do pairwise comparisons and determine if one solution is better (faster, or simpler, or more reliable, or what have you) than some other one. For some problems, there exists something that is the solution. It's a solution that does much more than solve the problem: it makes the problem go away. It dissolves the problem. When you find the solution to some problem, you'll wonder why anyone ever thought it was a problem in the first place. For example, the imaginary unit first appeared as part of a piecemeal solution of the problem of finding general roots of a cubic equation. But if you imagine someone who grew up in a reality where complex numbers were discovered and used before real numbers, you'll probably agree that that someone couldn't even understand why finding roots of cubic equations should be any more difficult than finding roots of quadratic equations.

Now I'll do something all of us do but rarely admit to: I'll talk about something I really don't have the first clue about. Quantum mechanics! I think that the "many worlds" meta-theory of quantum mechanics is the solution to what's known in other interpretations as the "measurement problem" (i.e. the empirical fact that apparently identical quantum systems behave differently depending on whether they are observed or not). The many worlds interpretation dissolves the problem completely: It shows why, given that quantum systems don't really behave differently when they're measured than when they're not, it appears to us that they do. If by some historical accident the many worlds interpretation were the first meta-theory of quantum mechanics to be developed, no one would have even coined the phrase "measurement problem".

Tuesday, January 24, 2012

Majority, average, what's the difference

So, I was just listening to this here podcast. It started off with a discussion of James Surowiecki's The Wisdom of Crowds, which talks about the fact that when you're trying to estimate some quantity, an average over a large number of individual guesses of people picked at random will be closer to the truth than an expert's opinion. The interviewed guest gives the example of Francis Galton's observation that the crowd at a county fair accurately guessed the weight of an ox when their individual guesses were averaged (the average was closer to the ox's true butchered weight than the separate estimates of any of the cattle experts). He then goes on to say that the reason behind this and similar, seemingly magical phenomena is Condorcet's Jury Theorem: Take a group of people each of whom is more likely to get the right answer than the wrong answer and ask them the question. As the size of the group increases, the probability that the majority gets the right answer approaches 1 in the limit; the same holds for pluralities. It is also why surveys are accurate.

OK, something's not quite right here. Lots, actually. In the context of the ox example, what does it mean that each group member is "more likely to get the right answer than the wrong answer"? Weight is a continuous variable, so for each group member, the probability that they'll guess the right answer is precisely zero. Grad school was a long time ago, but I seem to remember something about Condorcet's Jury Theorem being applicable only to situations of binary choice (hence the word "Jury" in the name). The average of individual guesses of a continuous quantity, and a majority pick from two alternatives, are very different things. Also, the reason surveys work is the Central Limit Theorem and not Condorcet's Jury Theorem. The only thing those have in common is the word "Theorem" in the name.
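To see what Condorcet's Jury Theorem actually says, here is a small sketch of the binary-choice setting it applies to. The function name and the 60% individual accuracy are illustrative choices, not figures from the podcast; voters are assumed independent, and the group size is kept odd so there are no ties.

```python
from math import comb


def majority_correct_probability(n_voters, p_correct):
    """Probability that a strict majority of n independent voters is right.

    Each voter independently picks the correct one of TWO options with
    probability p_correct. n_voters should be odd to rule out ties.
    """
    votes_needed = n_voters // 2 + 1
    return sum(
        comb(n_voters, k) * p_correct ** k * (1 - p_correct) ** (n_voters - k)
        for k in range(votes_needed, n_voters + 1)
    )


for n in (1, 11, 101, 501):
    print(f"n = {n:>3}: P(majority correct) = "
          f"{majority_correct_probability(n, 0.6):.4f}")
```

The whole calculation depends on each vote being simply right or wrong with a probability above one half. A guess at an ox's weight has no such probability, so the theorem has nothing to say about averaging those guesses.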

Saturday, January 21, 2012

Tell us what to think

I've stumbled onto the striking graph below via FlowingData (the original source is ProPublica):


It's a good thing that this shift happened, but its magnitude and timing don't speak very well of the members of Congress. They reveal that the members either have no principles at all or can't even be bothered to read the legislative proposals that they endorse or criticize.

Saturday, January 7, 2012

Things I wish would die: The phrase "statistical dead heat"

You hear this phrase a lot any time there's an important election approaching. It refers to a situation when the difference between percentages of respondents declaring they'll vote for candidate A and candidate B is smaller than the margin of sampling error. For example, suppose there will soon be a Republican presidential primary in Florida where voters will be choosing between two candidates, Mitt Romney and Ann Coulter. A public opinion poll comes out, showing that 51% of respondents say they'll vote for Romney, and 49% of them say they'll vote for Coulter. The polling company says that the margin of sampling error in this poll is three percentage points. The media declare that Romney and Coulter are locked in a "statistical dead heat" or "statistical tie" because, given the margin of error, Romney's true vote share could be as low as 48%, and Coulter's could be as high as 52%.

But to represent this situation as a tie is highly misleading, for several reasons. I'll concentrate here on two of those reasons, to show just how misleading it can be. In what follows, I'm making a simplifying assumption that sampling error is the only source of uncertainty in my example poll. This is of course unrealistic, but completely justified, since sampling error is the only type of uncertainty that is reported by the media.

First, the size of the margin of error depends on the significance level chosen for the particular poll. Most polls choose to report a 95% confidence interval. Suppose that's the case with our fictional Romney v. Coulter situation. What this means is that, if this poll were to be redone a large number of times, with the same sample size, then 95% of the time the reported interval (the estimate plus or minus three points; here, 48% to 54%) would contain Romney's true vote share. Put another way, it means that the difference between Romney's and Coulter's vote shares is not statistically significant at a 5% level (but if we chose, say, a 68% confidence interval, then the margin of error would be approximately 1.53--smaller than the spread). So the fact that the difference between Coulter and Romney is smaller than the margin of error doesn't mean Romney and Coulter are tied; it means that Romney is most likely ahead but, if we hold ourselves to a 5% significance level, we can't say exactly by how much. Even when your point estimate is not significant at the level you chose, it is still the best guess you have.

Second, when we look at polls, we don't really care about spread; we care about who's more likely to win. The reason we pay attention to spread at all is that we treat it as a proxy for the probability of winning. So let's think about this in those terms (again, assuming sampling error is the only source of uncertainty). At a 5% significance level, the margin of sampling error of a statistic is 1.96 times the standard error of that statistic (if you want to know how the standard error of a proportion is calculated, look below the fold). Thus, the sampling distribution of Romney's vote share is normal with mean 51 and standard deviation 1.53 (= 3/1.96). Below is a plot of that distribution. The ratio of the area shaded in red to total area underneath the curve is the probability that it's actually Coulter who's ahead (i.e. it's the probability that the true percentage of voters who intend to support Romney is less than 50). That probability is about 26%. The odds of Romney being ahead of Coulter are roughly 3 to 1; doesn't sound like a dead heat to me.
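The calculation in the paragraph above can be reproduced in a few lines. This is a sketch under the same simplifying assumptions as the text (sampling error is the only source of uncertainty, and Romney's share is treated as normally distributed around the poll estimate); the variable names are just illustrative.

```python
from statistics import NormalDist

ROMNEY_SHARE = 51.0      # poll estimate, in percent
MARGIN_OF_ERROR = 3.0    # reported margin, in percentage points
Z_95 = 1.96              # normal quantile behind a 95% confidence interval

# The reported margin of error is 1.96 standard errors.
standard_error = MARGIN_OF_ERROR / Z_95

# Probability that Romney's true share is below 50%, i.e. Coulter is ahead.
share_distribution = NormalDist(mu=ROMNEY_SHARE, sigma=standard_error)
prob_coulter_ahead = share_distribution.cdf(50.0)
odds_romney_ahead = (1 - prob_coulter_ahead) / prob_coulter_ahead

print(f"standard error        = {standard_error:.2f} points")
print(f"P(Coulter is ahead)   = {prob_coulter_ahead:.3f}")
print(f"odds Romney is ahead  = {odds_romney_ahead:.1f} to 1")
```

The candidate who is "tied" in the headline comes out ahead roughly three times out of four.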



Wednesday, January 4, 2012

True beauty is attained indirectly

Since lately I'm all about short quotes, here's another one:
And if function is hard enough, form is forced to follow it, because there is no effort to spare for error. Wild animals are beautiful because they have hard lives.
This one is from a wonderful essay by Paul Graham. As a programmer, Graham uses Lisp. Go figure.

Tuesday, January 3, 2012

Try to beat that Hirsch Index

This is one of those things that, when you first read it, your immediate thought is "this can't be true." Here's John D. Cook quoting historian Clifford Truesdell:
… in a listing of all of the mathematics, physics, mechanics, astronomy, and navigation work produced in the 18th century, a full 25% would have been written by Leonhard Euler.