Title | Private traits and attributes are predictable from digital records of human behavior |
---|---|
Published in | Proceedings of the National Academy of Sciences of the United States of America, March 2013 |
DOI | 10.1073/pnas.1218772110 |
Pubmed ID | |
Authors | Michal Kosinski, David Stillwell, Thore Graepel |
Abstract | We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. The proposed model uses dimensionality reduction for preprocessing the Likes data, which are then entered into logistic/linear regression to predict individual psychodemographic profiles from Likes. The model correctly discriminates between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and between Democrat and Republican in 85% of cases. For the personality trait "Openness," prediction accuracy is close to the test-retest accuracy of a standard personality test. We give examples of associations between attributes and Likes and discuss implications for online personalization and privacy. |
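The pipeline the abstract describes (dimensionality reduction on a user-by-Like matrix, then logistic regression on the reduced features) can be sketched roughly as follows. This is a minimal illustration, not the authors' code. The synthetic data, the choice of truncated SVD as the reduction step, the number of components, the gradient-descent settings, and every function name are assumptions made for the example. The `pairwise_auc` helper shows one common way to express "correctly discriminates in X% of cases": the share of (positive, negative) user pairs that the model ranks in the right order.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def make_synthetic(n_users=400, n_likes=60, seed=0):
    """Hypothetical data: a binary trait that shifts the odds of 20 of the Likes."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_users).astype(float)
    probs = np.full((n_users, n_likes), 0.1)   # baseline chance of any Like
    probs[labels == 1, :10] = 0.6              # Likes favoured by group 1
    probs[labels == 0, 10:20] = 0.6            # Likes favoured by group 0
    likes = (rng.random((n_users, n_likes)) < probs).astype(float)
    return likes, labels


def fit(likes, labels, n_components=5, lr=0.5, n_iter=2000):
    """Truncated SVD for preprocessing, then logistic regression."""
    # Step 1: keep the top right-singular vectors of the user-by-Like matrix.
    _, _, vt = np.linalg.svd(likes, full_matrices=False)
    components = vt[:n_components].T           # shape (n_likes, n_components)
    z = likes @ components                     # reduced user features
    mu, sigma = z.mean(axis=0), z.std(axis=0) + 1e-12
    z = (z - mu) / sigma                       # standardise for stable descent
    # Step 2: logistic regression fitted by plain gradient descent.
    w, b = np.zeros(n_components), 0.0
    for _ in range(n_iter):
        err = sigmoid(z @ w + b) - labels
        w -= lr * (z.T @ err) / len(labels)
        b -= lr * err.mean()
    return components, mu, sigma, w, b


def predict_scores(likes, model):
    """Project new users onto the stored components and score them."""
    components, mu, sigma, w, b = model
    z = (likes @ components - mu) / sigma
    return sigmoid(z @ w + b)


def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


likes, labels = make_synthetic()
model = fit(likes[:300], labels[:300])         # train on the first 300 users
scores = predict_scores(likes[300:], model)    # score the 100 held-out users
print("held-out pairwise AUC:", pairwise_auc(scores, labels[300:]))
```

For a continuous target such as age, the second step would be swapped for linear regression on the same reduced features, in line with the abstract's "logistic/linear regression".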
Twitter Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 246 | 16% |
United Kingdom | 118 | 8% |
Japan | 54 | 3% |
Spain | 43 | 3% |
Canada | 43 | 3% |
France | 41 | 3% |
Germany | 34 | 2% |
Australia | 28 | 2% |
Chile | 28 | 2% |
Other | 292 | 19% |
Unknown | 635 | 41% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 1278 | 82% |
Scientists | 194 | 12% |
Science communicators (journalists, bloggers, editors) | 60 | 4% |
Practitioners (doctors, other healthcare professionals) | 28 | 2% |
Unknown | 2 | <1% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 53 | 2% |
United Kingdom | 32 | 1% |
Germany | 26 | <1% |
France | 10 | <1% |
Brazil | 9 | <1% |
Australia | 7 | <1% |
Spain | 6 | <1% |
Finland | 6 | <1% |
Austria | 5 | <1% |
Other | 67 | 2% |
Unknown | 2789 | 93% |
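The "As %" column in these tables appears to be each count divided by the table total and rounded to a whole number, with shares below 1% shown as "<1%". The sketch below reproduces that arithmetic for the Mendeley geographical counts. The formatting rule is inferred from the figures in the table rather than documented by the source, and the function name `share_label` is hypothetical.

```python
def share_label(count, total):
    """Format a count as a whole-number share of the total.

    Assumed rule (inferred from the table): non-zero shares below 1% are
    shown as "<1%", everything else is rounded to the nearest integer.
    """
    pct = 100 * count / total
    if 0 < pct < 1:
        return "<1%"
    return f"{round(pct)}%"


# Counts copied from the Mendeley geographical breakdown.
mendeley_by_country = {
    "United States": 53, "United Kingdom": 32, "Germany": 26, "France": 10,
    "Brazil": 9, "Australia": 7, "Spain": 6, "Finland": 6, "Austria": 5,
    "Other": 67, "Unknown": 2789,
}
total = sum(mendeley_by_country.values())

for country, count in mendeley_by_country.items():
    print(f"{country} | {count} | {share_label(count, total)} |")
```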
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 627 | 21% |
Student > Master | 541 | 18% |
Researcher | 389 | 13% |
Student > Bachelor | 302 | 10% |
Student > Doctoral Student | 159 | 5% |
Other | 568 | 19% |
Unknown | 424 | 14% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 588 | 20% |
Psychology | 501 | 17% |
Social Sciences | 378 | 13% |
Business, Management and Accounting | 232 | 8% |
Agricultural and Biological Sciences | 119 | 4% |
Other | 638 | 21% |
Unknown | 554 | 18% |