By Super Neuro
A Chinese guy named Bai Li, from the University of Waterloo, shared on Medium how he used logistic regression, a machine learning method, to help himself find a partner.
A technique this practical is a must-learn.
The University of Waterloo is one of the best-known and best universities in Canada. Its teaching in technical disciplines such as mathematics and computer science is among the best in the world. Its flagship major, computer science, ranked 18th in the 2017 US News World University Rankings.
Like many science and engineering schools, the University of Waterloo has an extremely unbalanced gender ratio and few social activities, which makes it difficult to find a partner.
Some people think that love is something that cannot be quantified, and you should just "be yourself and let nature take its course."
However, as a data scientist at the University of Waterloo, the young man disagrees. He figures that, since he is a computer scientist, why not try using machine learning to help find a girlfriend?
How to Pick Up Girls: Arm Yourself
Actions speak louder than words, so he started researching how to use machine learning to find a girlfriend.
The core question of this study is:
What attributes do you need to stand out from the crowd of boys and be favored by girls?
The young man tried to list the characteristic attributes of boys, hoping to find out which hypotheses could be supported by data.
Dating (target variable): Has a girlfriend, or has had a girlfriend for at least half a year in the past 5 years
Country of citizenship: International student
Major: CS, SE, or ECE major
Career: Academically successful; found an internship with a good salary
Interesting: Good at speaking, always able to find interesting topics to talk about
Sociability: Outgoing personality, always wanting to meet new people
Confidence: Good at speaking, always able to find interesting topics to talk about
Fashion: Pays attention to appearance and dresses tastefully
Canada: Has lived in Canada for the past 5 years
Asian: From East Asia
For each of these attributes, I assigned a value of 1 or 0 depending on whether the criterion is met. So, we're measuring the relationship between a person's attributes and being able to find a partner.
Some of the above attributes are very subjective: how do you prove that a person is interesting, for example? So, if you are looking for super hardcore, rigorous statistical research, the following content may not be your cup of tea.
To collect the data, I listed everyone I could think of in a table and scored each attribute with a 0 or 1. In the end, the data set had N=70 rows. If you went to school with me during the past two years and know me, there is a good chance that you are in this table.
Carefully Analyzing the Reasons for Being Single
First, we used Fisher's exact test to check the target variable (dating) against each explanatory variable, and found that three variables had the most significant association:
- Fitness: Those who regularly go to the gym or exercise are more than twice as likely to have a girlfriend (p-value = 0.02)
- Glasses: People who don't wear glasses are 70% more likely to have a girlfriend than those who wear glasses (p-value = 0.08)
- Confidence: People with high self-confidence are more likely to have a girlfriend (p-value = 0.09)
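To make the test concrete, here is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table, written in plain Python. It is not the author's code: the function name `fisher_exact_2x2` is hypothetical, and the counts in the example are invented for illustration rather than taken from the data set.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)

# Invented counts (NOT the article's data):
# rows = goes to the gym / does not, columns = has a girlfriend / does not.
odds, p = fisher_exact_2x2(9, 6, 8, 17)
print(f"odds ratio = {odds:.2f}, p-value = {p:.3f}")
```

With only a few dozen rows, an exact test like this is a reasonable choice because it does not rely on large-sample approximations.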
The young man was surprised that wearing glasses had such a big impact, and wondered whether it was because glasses generally give people the impression of being a "nerd".
So he looked up some more information and found that this was indeed the case. A research paper reported that most people believe wearing glasses reduces attractiveness, for men and women alike.
Some variables may be more predictive of dating success, but it's hard to be sure because the sample size is small:
- International students have a higher dating success rate than Canadian students
- Asians have fewer dating opportunities than other races
As for the other factors, boys majoring in computer science do not seem to be at a disadvantage, even though there are fewer girls in the program. The remaining variables (height/career/interesting/sociability/fashion/place of residence) are not strongly related to dating success. After all, dating is only the first step toward a relationship, and few young people think that far ahead or make it that complicated.
The complete results of this experiment:
We then examined the relationships between the variables, which helped us identify incorrect model assumptions.
In the correlation chart, red indicates positive correlation and blue indicates negative correlation.
Only correlations with a p-value below 0.1 are shown, so the cells for most pairs of variables are blank.
From the picture, { has a girlfriend, looks confident, goes to the gym, doesn't wear glasses } appear to be correlated with one another. The data comes only from people I know, so a model trained on it will also reflect the biases of that sample; I will expand the scope of the investigation and collect more data in the future.
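The article does not say which correlation measure was used. For 0/1 attributes, one natural choice is the phi coefficient, which is simply the Pearson correlation applied to two binary columns. The sketch below assumes that choice; the function name and the two example columns are invented for illustration.

```python
from math import sqrt

def phi_coefficient(x, y):
    """Phi coefficient (Pearson correlation) of two equal-length 0/1 lists."""
    pairs = list(zip(x, y))
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when either column is constant; returning 0.0 is a
        # convention chosen for this sketch.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom

# Invented columns (NOT the article's data)
goes_to_gym = [1, 1, 1, 0, 0, 0, 1, 0]
has_girlfriend = [1, 1, 0, 0, 0, 1, 1, 0]
print(phi_coefficient(goes_to_gym, has_girlfriend))
```

A value near +1 would be drawn in red on a chart like the one described, a value near -1 in blue.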
Using Logistic Regression to Predict Finding a Girlfriend
Wouldn't it be great if there was an algorithm that could predict your chances of finding a girlfriend?
The young man trained a logistic regression model, a type of generalized linear model, to predict whether a person has a girlfriend from the explanatory variables listed earlier.
I trained this generalized linear model with elastic net regularization using the glmnet and caret packages in R. I then tuned the hyperparameters with a standard grid search, evaluating each candidate with leave-one-out cross-validation and choosing the one that maximized the kappa coefficient.
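The original work used R, and the author's code is not reproduced here. As a rough illustration of the same pipeline, the following plain-Python sketch fits an elastic-net-penalized logistic regression by proximal gradient descent (a simple stand-in, not the algorithm glmnet uses), scores each hyperparameter pair with leave-one-out cross-validation, and keeps the pair with the highest Cohen's kappa. All function names, the grid values, and the synthetic data are assumptions made for this example.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_elastic_net_logreg(X, y, lam, alpha, lr=0.5, steps=150):
    """Minimize mean log-loss + lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2)
    by proximal gradient descent. The intercept b is not penalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        for j in range(p):
            # Gradient step on the smooth part (log-loss + L2), then L1 shrinkage
            step = w[j] - lr * (grad_w[j] + lam * (1 - alpha) * w[j])
            w[j] = soft_threshold(step, lr * lam * alpha)
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0

def cohen_kappa(y_true, y_pred):
    # Agreement between labels and predictions, corrected for chance agreement
    n = len(y_true)
    p_obs = sum(t == q for t, q in zip(y_true, y_pred)) / n
    p_exp = sum((y_true.count(k) / n) * (y_pred.count(k) / n) for k in (0, 1))
    return 0.0 if p_exp == 1.0 else (p_obs - p_exp) / (1.0 - p_exp)

def loocv_kappa(X, y, lam, alpha):
    # Leave one row out, fit on the rest, predict the held-out row
    preds = []
    for i in range(len(X)):
        w, b = fit_elastic_net_logreg(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam, alpha)
        preds.append(predict(w, b, X[i]))
    return cohen_kappa(y, preds)

# Synthetic 0/1 data (NOT the article's data): 30 rows, 3 attributes,
# where only the first attribute influences the outcome.
random.seed(0)
X = [[random.randint(0, 1) for _ in range(3)] for _ in range(30)]
y = [1 if random.random() < (0.8 if row[0] else 0.2) else 0 for row in X]

# Small illustrative grid over the two elastic net hyperparameters
grid = [(lam, alpha) for lam in (0.01, 0.1) for alpha in (0.0, 1.0)]
scores = {(lam, alpha): loocv_kappa(X, y, lam, alpha) for lam, alpha in grid}
best_lam, best_alpha = max(scores, key=scores.get)
print("best (lambda, alpha):", (best_lam, best_alpha))
```

Leave-one-out cross-validation suits a data set of about 70 rows, since every fit still trains on nearly all the data, and kappa is a sensible target when the two classes are not evenly balanced.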
Final Conclusion
The final model has a cross-validated ROC AUC score of 0.673. Since random guessing gives an AUC of 0.5, this means the model predicts your chances of finding a girlfriend better than a blind guess would.
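To show what the AUC number measures, here is a small sketch that computes ROC AUC directly from its definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The function name and the toy labels and scores are invented for illustration.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5, a reversed pair 0
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy data (NOT the article's data)
labels = [1, 0, 1, 0, 1, 0]
perfect = roc_auc(labels, [0.9, 0.1, 0.8, 0.2, 0.7, 0.3])  # → 1.0
constant = roc_auc(labels, [0.5] * 6)                      # → 0.5
print(perfect, constant)
```

A score of 0.673 sits between these two extremes: clearly above an uninformative guess, but far from a perfect ranking.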
Of course, life is always full of uncertainty, and there will always be surprises. Well, enough said: the young man is off to the gym, and he has to work hard at getting rid of his glasses!
Here is a recent photo of Bai Li
Easter Egg: How Is the Young Man Doing Now?
The original author, Bai Li, completed this research in April this year. He published the article on Medium and it received great reviews. You can learn more about his project through his GitHub.
It has been almost four months since the article was published. How is the young man doing? We contacted him in person through a "non-existent" website, also known as Facebook. See for yourself: