(...) neither vectors, random variables nor vectors of random variables are scalars. This statement is obvious to anyone familiar with the basic terms. Equally obvious is the fact that when you try to represent one of these complex, multidimensional creatures as a point on a line, you will invariably lose some information. (...) This isn't pessimism; it's mathematics. You lose information when you go from a vector to a scalar.

As immediately pointed out by commenters, this is incorrect. Since the sets R and R^n (where R is the real line and n is a natural number) have the same cardinality, for any subset of R^n you can always find a one-to-one (i.e. reversible) mapping onto some subset of R. So what does the author mean when he says "you will inevitably lose information when..."? Let's look at some examples he gives of representing vectors with scalars:
A "rate your experience" question might do a good job comparing the impact of bad beverage service versus that of short delay in take-off but it will probably not do a satisfactory job comparing a forced landing and a seven hour stay on the tarmac on a hot summer day.
A weighted average of nutrients might provide a good way of ranking most of the foods you find in the produce aisle. (...) If, however, you move to the context of the dietary supplement aisle, making that linear assumption about certain nutrients can be dangerous, even deadly.
(...) Take the example of health. There's no meaningful way to boil this complex, multidimensional concept down to one number, but we can come up with scalars that are useful when answering certain questions. Let's say we have formulas for deriving two metrics, L and Q. L correlates very well with longevity; Q correlates very well with quality of life. For most questions about health policy, you will get similar answers with either metric, but there are cases where the two diverge sharply. Both L and Q are good measures of health, but their usefulness depends on the question you need answered.

What is clear is that the author thinks of vectors and scalars not just as "objects" but as "objects with some sort of internal structure." Perhaps, then, when he says "you will lose information when trying to represent A as B" he means "there does not exist an isomorphism from A onto B" as opposed to "there does not exist a one-to-one mapping between A and B" (in other words, he's interested in structure-preserving mappings rather than identity-preserving ones). If that's the case, he could be right quite literally, depending on what he means by "an object with structure." If this means "a set S and a relation L on S, i.e. a subset of S x S," the loss of information claim is still wrong: in the category of sets, the isomorphism class of a set is determined by its cardinality, and an n-ary relation on a set S is simply a subset of S^n, that is, another set of cardinality at most |S^n|. But if it means "a vector space V over the field of real numbers with operators p and q," the claim is correct. In the category of vector spaces, the isomorphism class of a vector space is determined by its dimension, so you cannot represent a vector space with a lower-dimensional vector space without loss of information.
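To make the contrast between "one-to-one mapping" and "isomorphism" concrete, here is a minimal Python sketch. Everything in it is an illustration of my own rather than anything from the original post: the function names are hypothetical, the digit strings stand in for finite-precision decimals in [0, 1), and the weights are arbitrary. The first pair of functions packs two coordinates into one number reversibly by interleaving digits; the second pair shows a weighted sum, which is linear but sends two different points to the same score.

```python
from typing import Tuple


def interleave(x_digits: str, y_digits: str) -> str:
    """Merge two equal-length digit strings into one by alternating digits."""
    if len(x_digits) != len(y_digits):
        raise ValueError("digit strings must have equal length")
    return "".join(a + b for a, b in zip(x_digits, y_digits))


def deinterleave(z_digits: str) -> Tuple[str, str]:
    """Recover the two original digit strings from an interleaved one."""
    return z_digits[0::2], z_digits[1::2]


def weighted_score(weights: Tuple[int, int], point: Tuple[int, int]) -> int:
    """A linear map from 2-D to 1-D: the weighted sum of the coordinates."""
    return weights[0] * point[0] + weights[1] * point[1]


def same_score_partner(
    weights: Tuple[int, int], point: Tuple[int, int]
) -> Tuple[int, int]:
    """Return a different point with the same weighted score.

    The vector (w2, -w1) is sent to zero by the weighted sum, so adding it
    moves the point without changing the score (assuming the weights are
    not both zero).
    """
    return (point[0] + weights[1], point[1] - weights[0])


# Set-level encoding: reversible, but it ignores the vector space structure.
# "250" and "731" stand for the pair (0.250, 0.731).
encoded = interleave("250", "731")
decoded = deinterleave(encoded)

# Linear encoding: respects addition and scaling, but cannot be reversed.
weights = (3, 2)  # arbitrary illustrative weights
a = (1, 4)
b = same_score_partner(weights, a)

print(encoded, decoded)
print(a, b, weighted_score(weights, a), weighted_score(weights, b))
```

The interleaving trick is only a finite-precision stand-in for the cardinality argument; extending it to all reals takes some care with numbers that have two decimal expansions. The second half reflects a general fact: a linear map from an n-dimensional space to an m-dimensional one with m < n has a kernel of dimension at least n - m, so distinct vectors must collide.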
But is this really what the author meant to say? Is he really saying "health is an n-dimensional vector space and therefore it's impossible to construct a single scalar metric of health that preserves the vector space structure"? It's hard to tell, really, but it seems that this is what he's saying. Which means that the loss of information claim is right, but that his expectations as to the usefulness of linear algebra in applied social science are a bit unrealistic. The concepts "set" and "relation" are much more general and applicable than the concepts "vector space" and "scalar multiplication," and insisting on representing social phenomena with vector spaces means there won't be a whole lot out there for you to represent.