HyperAI

Dogs on the Job Hunt, With AI as Interviewer and Humans Assisting: US Research Institutes Use Data From 628 Labradors to Improve the Selection of Olfactory Detection Dogs

2 years ago
Information
Yinrong Huang

Contents at a glance: Dogs have a keen sense of smell and are valuable helpers in difficult tasks. However, selecting working dogs requires strict screening and training, and the elimination rate is extremely high. Supervised machine learning on task data has been used to predict human job performance, but no comparable research on dogs had been reported.

Keywords: Working Dogs, Supervised Machine Learning, Random Forests

Author: daserney

Editor: Sanyang

This article was first published on the HyperAI WeChat official account.

Dogs are a common sight in parks and on street corners. Besides being companions that bring people joy and comfort, many of them quietly do important work in the service of society. These are known as working dogs.

There are many types of working dogs, including military and police dogs, search and rescue dogs, and service dogs, and each type is further divided into specialized fields. Among them, olfactory detection dogs use their exceptional sense of smell to detect specific substances such as explosives and drugs. Their sense of smell plays an irreplaceable role in protecting public safety.

Most untrained working dogs cost between $40,000 and $80,000, and that price can double once training costs are factored in. The overall training success rate of working dogs is below 50%, so more effective selection and training methods are urgently needed.

Recently, researchers including Alexander W. Eyre of the Abigail Wexner Research Institute at Nationwide Children's Hospital and Isain Zapata of Rocky Vista University used data on 628 Labrador retrievers from the Transportation Security Administration's olfactory detection program to compare three models that predict whether a dog will pass pre-training selection and enter formal training. They also identified the behavioral traits that affect the performance of scent detection dogs.

The research has been published in the journal Scientific Reports with the title “Machine learning prediction and classification of behavioral selection in a canine olfactory detection program”.

Paper address:

https://www.nature.com/articles/s41598-023-39112-7#Sec8

Experimental Methods

Dataset: AT and Env Tests Predict Dog Performance

The study data came from a scent detection dog breeding and training program run by the Transportation Security Administration (TSA) between 2002 and 2013. The dataset contains scores for 628 Labrador retrievers, each of which took two tests every 3 months during a 15-month foster period.

Test 1: Airport Terminal (AT) test. The AT test takes place in an empty simulated airport terminal. Staff lead the dogs through the terminal, where the dogs search for scented towels hidden in randomly placed containers and interact with toys. The test gauges each dog's training potential by measuring how well it finds the scented towels and how it interacts with staff, towels, and toys.

Test 2: Environmental (Env) test, conducted at different locations around the base. The dogs walk around under a handler's guidance, attempt searches, and interact with toys and people in noisy, crowded environments. The test locations include a busy gift shop (BX), a noisy, dark, enclosed woodshop, an airport cargo area with moving traffic and noise, and various airport terminals. This test complements the AT test, in which no other people are present to distract the dogs.

Table 1: Hound characteristics and scoring descriptions

AT = Airport Terminal Test, E = Environmental Test, B = Both.

 

Three Prediction Models and Two Feature Selection Methods

The study used three supervised machine learning algorithms to predict, from each dog's behavioral test scores, whether it would pass pre-training selection. The algorithms were random forest, support vector machine, and logistic regression.
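A comparison of this kind can be sketched with scikit-learn. The snippet below is only an illustration under stated assumptions: the data are synthetic (generated with `make_classification`, not the TSA scores), and the hyperparameters, train/test split, and 80/20 class balance are invented for the example rather than taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for behavioral scores (assumption): 628 rows, 10 traits,
# roughly 80% of rows labelled 1 ("pass") and 20% labelled 0 ("fail").
X, y = make_classification(
    n_samples=628, n_features=10, n_informative=4,
    weights=[0.2, 0.8], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three model families named in the article; settings are illustrative.
models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    prob_pass = model.predict_proba(X_test)[:, 1]  # probability of class 1
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "auc": roc_auc_score(y_test, prob_pass),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.2f}, auc={scores['auc']:.2f}")
```

Reporting AUC alongside accuracy is useful when most dogs pass, because accuracy alone can look good even if the model rarely identifies a failing dog.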

The study also used principal component analysis (PCA) and recursive feature elimination with cross validation (RFECV) to identify the behavioral traits that most influence the performance of scent detection dogs.

PCA is a statistical technique that reduces the dimensionality of data by projecting it onto the directions of greatest variance; the loadings of each direction show which variables contribute most. RFECV is a feature selection procedure that repeatedly fits a model, eliminates the least important features, and uses cross-validation to decide which features to keep.
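To make the PCA idea concrete, here is a minimal NumPy sketch. The three traits (`trait_a`, `trait_b`, `trait_c`) and their scores are hypothetical and randomly generated; the point is only to show how the loadings of the first principal component indicate which traits drive most of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dogs = 200

# Hypothetical trait scores (assumption): trait_a and trait_b share a common
# factor, while trait_c is independent noise with smaller variance.
common = rng.normal(size=n_dogs)
scores = np.column_stack([
    common + 0.3 * rng.normal(size=n_dogs),
    common + 0.3 * rng.normal(size=n_dogs),
    0.5 * rng.normal(size=n_dogs),
])
trait_names = ["trait_a", "trait_b", "trait_c"]

# Centre each column, then take the SVD; the rows of vt are the principal axes.
centered = scores - scores.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

# Share of total variance captured by each principal component.
explained_variance_ratio = singular_values**2 / np.sum(singular_values**2)

# Absolute loadings on the first component: larger means more influential.
first_axis_loadings = np.abs(vt[0])
top_trait = trait_names[int(np.argmax(first_axis_loadings))]

print("explained variance ratio:", np.round(explained_variance_ratio, 2))
print("most influential trait on PC1:", top_trait)
```

In this toy data the first component is dominated by the two correlated traits, which is the same kind of reading the article applies to the PCA figure.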

Experimental Results

Predicting Which Dogs Pass: AT Test Results Are Better

As shown in Table 2-A below, the predictive power of the models in the AT test generally improves over time. On the month-12 test data, the random forest model performs best, with an accuracy of 87% and an AUC (area under the ROC curve) of 0.68. The logistic regression model performs slightly worse but still does well overall. The support vector machine's results are less stable, mainly because of its poor recall for dogs that did not pass.

Table 2: Performance of the three models - A

As shown in Table 2-B below, predictions based on the Env test are less satisfactory. This may be because, on average, fewer dogs took part in the Env test than in the AT test (56% vs. 73%). The logistic regression model performs comparatively better. At all four time points, the support vector machine's F1 score for dogs that failed is extremely low.

At month 3, all three models reach their highest accuracy (0.82-0.84) and high F1 scores (0.90-0.91) for dogs that passed. However, all of them perform poorly at month 3 on dogs that failed (F1≤0.10).
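The gap between high accuracy and a very low F1 for failing dogs is what class imbalance typically produces. The sketch below uses an invented cohort (83 passes, 17 fails, not the study's data) and a hypothetical model that predicts "pass" for almost every dog, and computes the per-class F1 by hand.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented cohort (assumption): 83 dogs pass, 17 dogs fail.
y_true = ["pass"] * 83 + ["fail"] * 17
# Hypothetical model: predicts "pass" for everyone except one failing dog.
y_pred = ["pass"] * 83 + ["fail"] * 1 + ["pass"] * 16

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1_pass = f1_for_class(y_true, y_pred, "pass")
f1_fail = f1_for_class(y_true, y_pred, "fail")

print(f"accuracy={accuracy:.2f}, F1(pass)={f1_pass:.2f}, F1(fail)={f1_fail:.2f}")
```

Here the accuracy is 0.84 and the pass-class F1 is about 0.91, yet the fail-class F1 is only about 0.11, because the model catches just 1 of the 17 failing dogs. Per-class scores, as reported in Table 2, expose this weakness where accuracy alone would hide it.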

Table 2: Performance of the three models - B

Logistic Regression:

Support Vector Machine:

Random Forest:

A: Airport Terminal (AT) test

B: Environmental (Env) test

M03, M06, M09, and M12 denote the tests at months 3, 6, 9, and 12, respectively.

In each cell, the value before the slash is for dogs that passed pre-training selection, and the value after the slash is for dogs that did not pass.

Influential Traits: Possession, Confidence, and H2 Matter Most

The researchers used principal component analysis (PCA) and recursive feature elimination with cross validation (RFECV) to determine which features were most important for prediction at different time points. Figure 1 below shows the PCA results for the AT and Env tests.

Figure 1: Principal component analysis results

a: Airport Terminal (AT) test

b: Environmental (Env) test

The trait abbreviations on the horizontal axis correspond to those in Table 1.

As shown in panel a of Figure 1, in the AT test the most influential feature at months 3 and 6 is H1/2 (Hidden 1/2), while at months 9 and 12 Physical Possession (PP) has the greatest impact. Panel b shows that in the Env test, Independent Possession (IP) has the greatest impact at all time points.

Recursive feature elimination with cross validation (RFECV) is a feature selection technique that repeatedly removes the weakest features and uses cross-validation to find the feature subset that maximizes model performance. In this study, RFECV was used in combination with random forest.
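A minimal sketch of RFECV wrapped around a random forest, using scikit-learn's `RFECV` class, might look like the following. The data are synthetic, the feature names (`trait_0` to `trait_7`) are hypothetical, and the number of trees, folds, and the F1 scoring choice are assumptions made for the example, not settings reported by the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for behavioral scores (assumption). With shuffle=False
# the 3 informative columns come first and the other 5 are pure noise.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)
feature_names = [f"trait_{i}" for i in range(X.shape[1])]  # hypothetical names

# At each step the forest is refit and the least important feature is dropped;
# cross-validated F1 decides how many features to keep in the end.
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    step=1,
    cv=StratifiedKFold(n_splits=5),
    scoring="f1",
)
selector.fit(X, y)

selected = [name for name, keep in zip(feature_names, selector.support_) if keep]
print("number of features kept:", selector.n_features_)
print("features kept:", selected)
```

Repeating such a selection over many resampled runs and counting how often each feature survives would give a selection percentage per feature, which is one plausible way to read the values in Table 3.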

Table 3: Recursive feature elimination with cross validation (RFECV) results

a: Airport Terminal (AT) test

b: Environmental (Env) test

The values show how often each feature was selected, as a percentage from 0 to 100.

The trait abbreviations correspond to those in Table 1.

As shown in panel a of Table 3, in the airport terminal test all possession traits (MP, PP, IP) and H2 are the most important.

Panel b of Table 3 shows that in the environmental test, confidence (Conf) is the most important at 3 and 6 months (100% and 88.7%); at 9 months, independent possession (IP) is the most important (93.3%); at 12 months, physical possession (PP) is the most important (80.7%).

In summary, the results suggest that traits such as H2, IP, and Conf may have greater influence. However, because the dataset is small and the variety of features is limited, the models had difficulty distinguishing dogs that passed pre-training selection from those that failed for behavioral reasons. The prediction process could be further improved and extended by incorporating additional behavioral traits, medical information, and other types of longitudinal data.

A Research Institution Focused on Working Dogs

The Penn Vet Working Dog Center, where study author Elizabeth Hare works, is a pioneer in the working dog field, advancing research and applying the latest scientific findings and veterinary expertise to optimize the performance of scent detection dogs. Inspired by the outstanding performance of search and rescue dogs during the response to the 9/11 attacks, the center was established on September 11, 2012 as the National Search and Rescue Dog Research and Development Center.

Website:

https://www.vet.upenn.edu/research/centers-laboratories/center/penn-vet-working-dog-center

The Penn Vet Working Dog Center is dedicated to working with dogs to protect the health and safety of people, animals, and the environment. It collects and analyzes genetic, behavioral, and physical health data and combines them with the latest scientific research to improve the working efficiency and well-being of working dogs. Its work includes developing and implementing training programs for working dogs, as well as testing and disseminating research results to better meet future challenges.
