Using Over 20,000 Flickr Images, Monash University Reproduced the Spatiotemporal Characteristics of Cherry Blossoms in Japan Over the Past 10 Years

Contents at a glance: Global climate change has become severe in recent years, and its knock-on effects are profoundly affecting both people and nature. Against this background, collecting flowering data across hundreds or even thousands of kilometers to understand how climate change affects flowering plants has become an important topic in ecological research. Traditional methods, however, usually require large budgets and long sampling campaigns, and the logistics are difficult. A study recently published in the journal "Flora" not only overcomes these problems but also reveals the flowering pattern in unprecedented detail.
Keywords: spatiotemporal analysis, smart ecology, SNS data
This article was first published on HyperAI WeChat public platform~
As the national flower of Japan, cherry blossoms play an important role in Yamato culture, and hanami (flower viewing) is a folk custom with a history of hundreds of years. However, Japan spans about 20 degrees of latitude and can be divided into 6 climate zones with distinct climates, so the blooming time of cherry blossoms varies from place to place. During the cherry blossom season, Japanese travel websites report the blooming status of each region in detail so that tourists can plan their flower viewing. In recent years, under the influence of climate change, cherry blossoms in Japan have been blooming earlier and earlier.
To explore the flowering pattern of cherry blossoms in Japan and understand the impact of climate change on phenology, a research team from Monash University in Australia used a Python API client and a computer vision API to monitor cherry blossom flowering through social network site (SNS) data, and validated the results against official records. The research has been published in the journal Flora under the title “The spatiotemporal signature of cherry blossom flowering across Japan revealed via analysis of social network site images”.

The research results have been published in the journal Flora
Paper address:
https://www.sciencedirect.com/science/article/abs/pii/S0367253023001019
Experimental process: crawling, filtering and analysis of data sets
Dataset
The collection of cherry blossom flowering data in this experiment can be divided into two steps:
1. Extract image data from a social networking site, in several sequential stages
2. Filter the data for relevance using a computer vision API and manual verification
Because the API had to support filtering by time, location, and text at the same time, the researchers chose Flickr as the data source. First, they used a Python API client to collect geotagged Flickr images by searching for the keyword “cherry blossom”.
Next, to ensure that the pictures were taken in Japan, they set the bounding box to 31.186°N-46.178°N, 129.173°E-145.859°E. The time frame was set to 2008-2018 so that the data would not be affected by the global decline in tourism caused by COVID-19.
The researchers then masked the data with the geographic boundaries of Japan obtained from gadm.org. In the end, 80,915 images were obtained.
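The search constraints described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function names `build_search_params` and `in_study_window` are hypothetical, and the parameter names assume the public `flickr.photos.search` method (`text`, `bbox`, `min_taken_date`, `max_taken_date`, `has_geo`, `extras`). The sketch only builds the request parameters and re-checks records locally; it does not call Flickr or apply the GADM boundary mask.

```python
from datetime import date

# Bounding box and time frame quoted in the article.
MIN_LAT, MAX_LAT = 31.186, 46.178
MIN_LON, MAX_LON = 129.173, 145.859
START, END = date(2008, 1, 1), date(2018, 12, 31)


def build_search_params(keyword="cherry blossom"):
    """Request parameters for a flickr.photos.search call (assumed mapping)."""
    return {
        "text": keyword,
        # Flickr expects the box as: min_lon,min_lat,max_lon,max_lat
        "bbox": f"{MIN_LON},{MIN_LAT},{MAX_LON},{MAX_LAT}",
        "min_taken_date": START.isoformat(),
        "max_taken_date": END.isoformat(),
        "has_geo": 1,                # only geotagged photos
        "extras": "geo,date_taken",  # ask for coordinates and capture date
    }


def in_study_window(lat, lon, taken):
    """Local re-check that a photo lies inside the box and the time frame."""
    return (MIN_LAT <= lat <= MAX_LAT
            and MIN_LON <= lon <= MAX_LON
            and START <= taken <= END)


# Central Tokyo in April 2015 passes; the same place in 2020 does not.
print(in_study_window(35.68, 139.76, date(2015, 4, 1)))
print(in_study_window(35.68, 139.76, date(2020, 4, 1)))
```

A rectangular box of this size also covers parts of neighboring countries, which is presumably why the box is followed by a mask with the actual national boundary.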

Flickr images matching "cherry blossom" located in Japan, January 1, 2008 - December 31, 2018
January and February (blue): the first cherry blossoms, which open before the arrival of spring;
March-May (green): the bulk of the photographs, documenting the main cherry blossom period in spring;
October-December (pink): an interesting secondary peak in autumn, especially in November.
Although Flickr images were restricted by the search keyword “cherry blossom”, SNS content could still be incorrectly associated with the search term, thus requiring verification.
To do so, the researchers submitted all the images to Google Cloud Vision AI. The API generates descriptive text labels for each image based on its visual content, which provides an automatic second check on the relevance of each data point.
Google Cloud Vision AI uses a pre-trained machine learning model to assign labels from predefined categories to images. The researchers also manually verified a sample of the data, as shown in the following table:

Table 1: Image data at each stage in the Tokyo-filtered dataset
Column B: Number of images returned by searching Flickr for "cherry blossom": 28,875 images whose geographic coordinates all fall within the administrative area of Tokyo.
Column C: Number of images remaining after filtering on the text labels (and their relative frequencies) returned by the computer vision API for this dataset. The API labeled 21,908 images "cherry blossom"; some of these were also labeled "autumn" or "maple tree" and were removed, leaving a total of 21,633 images.
Column D: Number of images randomly sampled from the result for manual inspection
Column E: Number of sampled images confirmed to be cherry blossoms by manual inspection
Column F: Estimated monthly accuracy of the automatic processing (computer vision and label analysis), calculated as E/D
Column G: Estimated total number of cherry blossom pictures taken in February, March, and April, calculated from this accuracy as C*F
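The label filter and the arithmetic behind columns F and G can be illustrated with a short Python sketch. The function names `keep_image` and `corrected_count` are hypothetical, the label strings are the three mentioned above, and the numbers in the example are invented for illustration and are not taken from Table 1.

```python
def keep_image(labels):
    """Keep an image labeled 'cherry blossom' unless it also carries
    an 'autumn' or 'maple tree' label (comparison is case-insensitive)."""
    labels = {label.lower() for label in labels}
    return "cherry blossom" in labels and not labels & {"autumn", "maple tree"}


def corrected_count(filtered, sampled, confirmed):
    """Columns C, D, E -> (F, G): accuracy F = E/D, estimate G = C*F."""
    if sampled <= 0:
        raise ValueError("sample size D must be positive")
    accuracy = confirmed / sampled
    return accuracy, filtered * accuracy


# Hypothetical numbers for one month, NOT taken from Table 1.
f, g = corrected_count(filtered=1000, sampled=100, confirmed=90)
print(f"F = {f:.2f}, G = {g:.0f}")
```

Scaling the filtered count by the sampled accuracy avoids inspecting every image by hand, at the cost of assuming the random sample is representative of the month it was drawn from.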
Evaluation Methodology
To estimate the full-bloom date, the researchers built a daily time series of image counts from the dataset and smoothed it with a 7-day triangular rolling average. The center point is given unit weight, the points immediately on either side a weight of 0.75, and the next two points on each side weights of 0.5 and 0.25 respectively. This smooths out the fluctuations in photographic activity between weekends (leisure time, when far more people go flower viewing and take photos) and weekdays.
The resulting curve shows a peak in photographic activity, which was taken as the date of full bloom (mankai).
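The smoothing step can be sketched as follows, using the weights stated above (0.25, 0.5, 0.75, 1, 0.75, 0.5, 0.25). The function names are hypothetical, and two details are assumptions of this sketch rather than facts from the paper: the weighted sum is divided by the sum of the weights, and the window is truncated and renormalized at the two ends of the series.

```python
WEIGHTS = [0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]  # 7-day triangular window


def triangular_smooth(daily_counts):
    """Centered weighted rolling average of a list of daily photo counts.

    Near the ends of the series the window is truncated and the remaining
    weights are renormalized (an assumption of this sketch).
    """
    half = len(WEIGHTS) // 2
    smoothed = []
    for i in range(len(daily_counts)):
        total = weight_sum = 0.0
        for offset, w in enumerate(WEIGHTS, start=-half):
            j = i + offset
            if 0 <= j < len(daily_counts):
                total += w * daily_counts[j]
                weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed


def peak_day(daily_counts):
    """Index of the day with the highest smoothed photo count."""
    smoothed = triangular_smooth(daily_counts)
    return max(range(len(smoothed)), key=smoothed.__getitem__)


# Hypothetical counts: a single burst of photos on day 3.
counts = [0, 0, 0, 10, 0, 0, 0]
print(triangular_smooth(counts)[3])  # 10 * 1.0 / 4.0 = 2.5
print(peak_day(counts))              # 3
```

Because the window is 7 days wide, every smoothed value away from the ends of the series mixes exactly one full week, so a weekend spike contributes to each day's value in a similar way and no longer dominates the location of the peak.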
Comparative verification: The predicted results are consistent with the actual data
The earliest record of cherry blossoms in Japan dates back to 812 AD, and official observations have been made since 1953. To verify their analysis method, the team selected data for Tokyo and Kyoto, two popular cherry blossom viewing cities, compared the estimated peak dates with the full-bloom dates announced each year by the Japan Meteorological Corporation (JMC) and the Japan National Tourism Organization (JNTO), and calculated the error between the two.
From this analysis, the research team obtained a visual, spatiotemporal picture of cherry blossoms blooming across Japan. From late January (weeks 3-4) to late May (weeks 3-4), flowering begins in the warm south and advances northward, and the blossoms then fade in the same order, from south to north, as shown in the figure:

Figure 2: Locations of cherry blossom photographs in Japan from 2008 to 2018. Each panel covers a two-week period.
A-C: Cherry blossoms appear in the warmer regions of southern Japan, with a high concentration of photographs in the urban centers of Tokyo and Kyoto on Honshu Island.
D-F: The number of cherry blossom photographs increases and begins to spread to the northern part of Honshu Island.
G-I: Cherry blossoms spread further north and appear in Sapporo, Hokkaido. Photography is still active in Tokyo and Kyoto but becomes more concentrated in Hokkaido and northern Honshu. Finally, cherry blossom photographs decrease across the country, receding from south to north.
The team compared the peaks of the smoothed daily time series of cherry blossom photographs in the Tokyo and Kyoto areas with the dates announced by JMC/JNTO. The results show a root-mean-square (RMS) error of 3.21 days for the Tokyo area and 3.32 days for the Kyoto area, as shown below:

Figure 3: Comparison of the dates of the two assessments in the Tokyo area
Left column: The peak dates of cherry blossoms in Tokyo estimated by this experimental method over the years
Middle column: JNTO's annual reports on the peak dates of Tokyo's cherry blossoms
Right column: Error, i.e. the difference in days between the two

Figure 4: Comparison of the dates of the two assessments in the Kyoto area
Left column: The peak dates of cherry blossoms in Kyoto estimated by this experimental method
Middle column: JNTO's annual reports on the peak dates of Kyoto's cherry blossoms
Right column: Error, i.e. the difference in days between the two
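An RMS error of this kind can be computed as in the following Python sketch. The function name `rms_error_days` is hypothetical, and the dates in the example are invented for illustration; they are not the values shown in Figures 3 and 4.

```python
import math
from datetime import date


def rms_error_days(estimated, official):
    """Root-mean-square difference, in days, between paired dates."""
    if len(estimated) != len(official) or not estimated:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(e - o).days ** 2 for e, o in zip(estimated, official)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical dates for illustration only, NOT the values in Figures 3-4.
estimated = [date(2015, 4, 1), date(2016, 4, 3), date(2017, 4, 2)]
official = [date(2015, 3, 29), date(2016, 4, 4), date(2017, 4, 6)]
print(round(rms_error_days(estimated, official), 2))
```

Squaring the per-year differences keeps early and late estimates from cancelling each other out, which a plain mean of the signed errors would allow.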
The team's data also revealed that cherry blossoms bloom in autumn, something that does not appear in the data released by JNTO. This shows that SNS data can capture low-probability events and reveal anomalous phenological phenomena. That capability matters for assessing the availability of floral resources such as pollen and nectar throughout the year, including in unexpected circumstances such as out-of-season flowering.
SNS data: providing new insights into ecological research
A report released by the World Meteorological Organization in April 2023 showed that the global average temperature in 2022 was 1.15 °C higher than the 1850-1900 average. Humans are relatively slow to perceive climate change, but plants are particularly sensitive to it. Under global warming, not only Japanese cherry blossoms but also flowering plants in many parts of China have been affected.
According to the cherry blossom observation data of Wuhan University, the flowering period of cherry blossoms at Wuhan University has been significantly advanced since the 1960s, and has continued to break records since 2000, at one point advancing from late March to late February.
Before the 1990s, the flowering time of peonies in Heze, Shandong was mainly concentrated in late April. Around 2010, it was advanced to mid-April. In recent years, the flowers can be observed blooming in early April.
The flowering time of rapeseed has also shown a significant trend of being advanced. Rapeseed flowers in Wuyuan, Jiangxi Province began to bloom on February 22 this year and entered the peak flowering period on March 13. Thirty years ago, rapeseed flowers generally bloomed in mid-March.
A report released by Kepios shows that, as of April 2023, the number of social media users worldwide had reached 4.8 billion, or 59.91% of the world's population. On average, each person spends 2 hours and 24 minutes per day on social media applications. The massive amount of social network data being generated is expected to provide new insights for ecological research.
The SNS analysis technique proposed by the authors can fill gaps in public data, help researchers understand how strongly climate change affects different flowering plants, and contribute to understanding the behavior of important pollinators such as bees and other insects.
Reference articles:
[1] https://www.sciencedirect.com/science/article/abs/pii/S0168192320303117
[2] https://link.springer.com/chapter/10.1007/978-4-431-66899-2_8
[3] http://sh.cma.gov.cn/sh/qxkp/qhbh/zhykp/202304/t20230425_5464832.html
[4] https://datareportal.com/social-media-users