HyperAI

AI and Embryos Combined? Systems Biologist Patrick Müller Uses Twin Networks to Study Zebrafish Embryos

a year ago
Information
JunZ

With a dataset of 3 million images and 15,000 zebrafish embryos, systems biologist Patrick Müller successfully implemented AI-based embryo recognition.

Author|Jia Ling

Editor|Sanyang

During animal development, embryos undergo complex morphological changes over time. Researchers hope to be able to objectively quantify developmental time and speed, and provide standardized methods to analyze the stages of early embryos and better understand the evolutionary and developmental processes.

Previously, scholars' understanding of embryonic developmental stages and morphological transformation came from microscopic observation. However, transitions between developmental stages are neither idealized nor stable: so many factors influence them that it is difficult for researchers to pin down a specific developmental state, and inferring developmental time and stage from embryo morphology remains subjective.

To objectively establish the relationship between developmental time and developmental speed, systems biologist Patrick Müller led researchers at the University of Konstanz in developing a deep learning method based on a twin network. The method automatically captures the embryonic development process through image comparison and identifies the characteristic stages of embryonic development without human intervention. The results have been published in Nature Methods.

The paper was published in Nature Methods

Get the paper:

https://www.nature.com/articles/s41592-023-02083-8

01 Experimental procedures

Dataset: Integrating a large number of embryo images

Using a high-throughput imaging pipeline and ResNet101-based image segmentation, the researchers created a dataset of 3 million images of 15,000 zebrafish embryos to generate developmental trajectories of individual embryos. Each embryo is tracked individually and marked with a differently colored bounding box when input to the model. A separate JSON file is created for each experiment, containing information about the embryos belonging to each category.

Image processing diagram

Model architecture: Siamese network model

The twin network consists of two parallel neural networks with the same structure that share weights and receive two images as input at the same time. The images are compared by computing a similarity score from their feature embeddings.

The following is a diagram of the structure of the twin network:

Twin network structure

The neural network structure that constitutes the twin network is as follows:

ResNet50-based neural network

Backbone network: a ResNet50 architecture with weights pre-trained on the ImageNet dataset serves as the backbone network;

Embedding model head: The output of the backbone network is flattened and passed to the embedding model head, which consists of three dense layers with batch normalization layers between each layer, producing an output/embedding of size (1, 256);

Transfer Learning: All layers of the ResNet50 backbone network are frozen except convolutional block 5 and the model head layer. The feature embeddings generated by ResNet50 are combined in a distance layer to calculate the Euclidean metric between the network-generated embeddings for different inputs during training.
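The weight sharing and the distance layer described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `embed` is a hypothetical single linear layer standing in for the ResNet50 backbone and dense head, and the weights and three-value "images" are made up.

```python
import math


def embed(image, weights):
    """Toy embedding: one linear layer applied to a flattened image.

    Hypothetical stand-in for the ResNet50 backbone plus dense head.
    """
    return [sum(w * x for w, x in zip(row, image)) for row in weights]


def euclidean_distance(u, v):
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def siamese_distance(image_a, image_b, weights):
    # Both branches call embed() with the same weights: this is the weight sharing.
    return euclidean_distance(embed(image_a, weights), embed(image_b, weights))


shared_weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]  # made-up 2x3 weight matrix
img_early = [0.1, 0.2, 0.3]  # made-up flattened "images"
img_late = [0.9, 0.1, 0.4]

print(siamese_distance(img_early, img_early, shared_weights))  # identical inputs
print(siamese_distance(img_early, img_late, shared_weights))   # different inputs
```

Because both inputs pass through the same function with the same weights, identical images always map to the same embedding, so their distance is zero.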

Algorithm training: Triplet loss training

The algorithm training process is as follows:

Constructing image triplets: Each image triplet consists of three embryo images: an anchor image, showing an embryo at a random developmental stage t1; a positive image, which is either an image of an embryo at a developmental stage similar to t1 (for neural network 1) or the anchor image after image augmentation (for neural network 2); and a negative image, showing an embryo at a developmental stage t2 ≠ t1.

Image triplet diagram
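One way to picture the triplet construction for neural network 1 is the sampling routine below. It is a sketch under stated assumptions: the `{stage: [image ids]}` layout, the function name `sample_triplet`, and the `max_positive_gap` parameter (how close two stages must be to count as "similar") are all hypothetical, and the negative is drawn from outside that gap so that it cannot coincide with a positive.

```python
import random


def sample_triplet(images_by_stage, rng, max_positive_gap=1):
    """Pick (anchor, positive, negative) image ids from a {stage: [ids]} dict.

    max_positive_gap is a hypothetical setting, not a value from the paper.
    """
    stages = sorted(images_by_stage)
    t1 = rng.choice(stages)  # random developmental stage for the anchor
    anchor = rng.choice(images_by_stage[t1])
    near = [t for t in stages if abs(t - t1) <= max_positive_gap]
    far = [t for t in stages if abs(t - t1) > max_positive_gap]
    positive = rng.choice(images_by_stage[rng.choice(near)])  # stage similar to t1
    negative = rng.choice(images_by_stage[rng.choice(far)])   # stage t2 != t1
    return anchor, positive, negative


# Made-up dataset: 6 stages, 3 images each, ids of the form (stage, index).
images_by_stage = {t: [(t, i) for i in range(3)] for t in range(6)}
rng = random.Random(0)
print(sample_triplet(images_by_stage, rng))
```

For neural network 2, the positive would instead be an augmented copy of the anchor image itself.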

Triplet loss training: The constructed image triplets are passed to the Siamese network, and the triplet loss is calculated with the formula below. Training minimizes the distance between the embeddings of the anchor and positive images and maximizes the distance between the embeddings of the anchor and negative images; in other words, it makes the anchor more similar to the positive image and less similar to the negative image.

Triplet loss calculation formula

A represents the anchor image, P represents the positive image, and N represents the negative image.
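For reference, a commonly used form of the triplet loss is written out below. The symbols f (the shared embedding network) and m (a margin) are not defined in this article and are assumptions here; the exact form used in the paper, for example squared versus plain distance or the margin value, may differ.

```latex
\mathcal{L}(A, P, N) = \max\bigl( \lVert f(A) - f(P) \rVert_2^2 - \lVert f(A) - f(N) \rVert_2^2 + m,\; 0 \bigr)
```

The loss reaches zero once the negative image is at least m farther from the anchor than the positive image is, measured in squared embedding distance.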

Iterative training: Neural network 1 was trained for 10 epochs on 300,000 zebrafish embryo image triplets; neural network 2 was trained for 2 epochs on 1 million image triplets, with image augmentation applied to the anchor images. Training was GPU-accelerated on an NVIDIA GeForce RTX 3070 (ASUS).

Task-based training: Corresponding training was conducted on image similarity, embryonic staging, development speed and temperature, and embryonic development changes induced by drugs.

02 Experimental Results

Result 1: Automatic embryo staging using similarity graphs

The test image is compared with a set of embryo images, the cosine similarity between them is calculated, and the similarity score is obtained to classify the embryo images.

Similarity graph of test embryos and reference images
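The comparison step can be illustrated with a small cosine-similarity example. The stage names and three-dimensional embeddings below are made up for readability (the embeddings described in this article have 256 dimensions), so this is only a sketch of the idea.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up reference embeddings for three hypothetical stages.
reference_embeddings = {
    "stage_a": [1.0, 0.0, 0.0],
    "stage_b": [0.6, 0.8, 0.0],
    "stage_c": [0.0, 1.0, 0.0],
}
test_embedding = [0.5, 0.9, 0.1]  # made-up embedding of the test image

scores = {
    name: cosine_similarity(test_embedding, ref)
    for name, ref in reference_embeddings.items()
}
best_match = max(scores, key=scores.get)  # reference with the highest similarity
print(best_match, scores)
```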

By comparing the test image with the time series of developing embryo images, we can get a curve of similarity over time, from which we can extract two main features:

·  The peak of the curve indicates at which developmental stage the embryo in the test image is located.

· The regions away from the peak contain additional information, such as the peak width and the similarity to distant embryonic stages, which reflects morphological similarities between different time points.

Schematic diagram of embryo age prediction

The twin network can take a time series of images of an embryo, predict the developmental stage for each image, and build a trajectory from the predictions, achieving accurate embryo staging.
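Extracting the two features of the similarity curve, the peak position and the peak width, might look like the sketch below. The function name, the `width_fraction` cutoff, and the sample curve are assumptions made for illustration; the article does not say how the authors define peak width.

```python
def summarize_similarity_curve(times, similarities, width_fraction=0.9):
    """Return (peak_time, peak_width) of a similarity-over-time curve.

    Peak width is taken here as the contiguous span around the peak where
    similarity stays at or above width_fraction * peak value (hypothetical rule).
    """
    peak_index = max(range(len(similarities)), key=similarities.__getitem__)
    threshold = width_fraction * similarities[peak_index]
    left = peak_index
    while left > 0 and similarities[left - 1] >= threshold:
        left -= 1
    right = peak_index
    while right < len(similarities) - 1 and similarities[right + 1] >= threshold:
        right += 1
    return times[peak_index], times[right] - times[left]


times = [0, 60, 120, 180, 240, 300]                   # minutes, made up
similarities = [0.20, 0.35, 0.80, 0.95, 0.90, 0.40]   # made-up similarity curve

peak_time, peak_width = summarize_similarity_curve(times, similarities)
print(peak_time, peak_width)
```

The peak time serves as the stage estimate for the test image, while a wider peak indicates that neighboring time points look alike.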

Result 2: Exploring the functional relationship between development speed and temperature

Previously, quantifying the temperature dependence of embryonic development required manual or semi-automatic annotation of developmental timing, which significantly limited the number of experiments that could be analyzed within a reasonable time span.

The twin network was used to automatically analyze temperature-dependent changes in developmental rate. The experimental design covered zebrafish embryos between 23.5 °C and 35.5 °C and medaka embryos between 18 °C and 36 °C, with 100 to 200 zebrafish embryos or 20 to 100 medaka embryos analyzed at each temperature.

The experimental results are shown in the figure:

Analysis of zebrafish and medaka embryonic development at different temperatures

a, d: Schematic diagram of age estimation for zebrafish and medaka;

b, e: Development of zebrafish and medaka at different temperatures;

c, f: Natural logarithm of the estimated growth rate of zebrafish and medaka at different temperatures.

· Temperature changes had a significant impact on the developmental rates of the embryos of both species. At lower temperatures, embryonic development was slower, while higher temperatures markedly accelerated development. A 10 °C change in temperature changed the developmental rate by a factor of two.

· The temperature dependence of developmental rates was quantified using the twin network, and the data were fitted with the Arrhenius equation. The slope of the linear fit gave the mean apparent activation energy for zebrafish and medaka over the species-specific temperature range. These apparent activation energies are similar to those of other poikilotherms (such as frogs, fruit flies, or yeast) and clearly different from those of warm-blooded animals (such as mice or humans).

· Contrary to the idealized expectation, at the high end of the temperature range the developmental rate of both species no longer increases but levels off. At the low end, zebrafish development slows down linearly and the embryos stop developing below 23 °C, whereas medaka embryos develop nonlinearly and stagnate at the primitive sac stage for a long time.
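The Arrhenius fit mentioned above amounts to a straight-line fit of ln(rate) against 1/T, whose slope equals -Ea/R. The sketch below shows that calculation on synthetic data: the activation energy, prefactor, and temperatures are made up to check that the fit recovers a known value, and none of the numbers come from the paper.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def fit_arrhenius(temps_celsius, rates):
    """Least-squares fit of ln(rate) = ln(A) - Ea / (R * T); returns Ea in J/mol."""
    x = [1.0 / (t + 273.15) for t in temps_celsius]  # 1/T with T in kelvin
    y = [math.log(r) for r in rates]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (
        sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        / sum((xi - mean_x) ** 2 for xi in x)
    )
    return -slope * R  # slope of the line is -Ea / R


# Synthetic rates generated from a made-up activation energy, NOT paper values.
true_ea = 60_000.0
temps = [24.0, 26.0, 28.0, 30.0, 32.0]
rates = [math.exp(20.0 - true_ea / (R * (t + 273.15))) for t in temps]

estimated_ea = fit_arrhenius(temps, rates)
print(round(estimated_ea))
```

Restricting the fit to the middle of the temperature range matters in practice, since the plateau at high temperatures and the stalling at low temperatures described above depart from the straight line.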

Result 3: Quantifying natural variability during embryonic development

The study found that although embryos are affected by genetic variation, external perturbations, and noise and randomness in gene expression, which cause deviations in growth rate and developmental stage, the developmental process is always completed.

Diagram of developmental differences between embryos

The twin network was used to evaluate the differences in individual phenotypes among embryos of the same age. The experimental results are shown in the figure:

Embryonic development diagram

The left panel shows the percentage of embryonic developmental stages predicted after different times, 0 min (green), 400 min (blue), and 800 min (purple);

The right graph shows that the average similarity value of embryos decreased over time.

In early embryonic development, the distribution of predicted developmental stages is narrow; with the onset of the segmentation period, the distribution widens. This suggests that during embryonic development the differences between individuals gradually increase, while the average similarity value decreases over time.

Among the more than 3 million zebrafish embryo images, about 1% of embryos showed abnormal development, most commonly spontaneous collapse or dorsal-ventral polarity defects. Using the twin network, researchers are able to detect embryos with developmental abnormalities at an early stage: these abnormal embryos showed low average similarity values that fell outside the range predicted for normal development.

Illustration of abnormal embryo development
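Flagging embryos by low average similarity can be sketched as a simple threshold check. The function name, the embryo ids, the similarity values, and the cutoff of 0.6 are all hypothetical; the article does not state how the normal range was defined.

```python
def flag_abnormal(similarity_by_embryo, threshold=0.6):
    """Return ids of embryos whose mean similarity to normal references is low.

    threshold is a made-up cutoff standing in for the normal range.
    """
    flagged = []
    for embryo_id, sims in similarity_by_embryo.items():
        mean_similarity = sum(sims) / len(sims)
        if mean_similarity < threshold:
            flagged.append(embryo_id)
    return sorted(flagged)


# Made-up similarity values over three time points for three embryos.
similarity_by_embryo = {
    "e1": [0.90, 0.85, 0.88],
    "e2": [0.90, 0.50, 0.30],  # similarity drops as development goes wrong
    "e3": [0.80, 0.82, 0.79],
}
print(flag_abnormal(similarity_by_embryo))
```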

Result 4: Identification of drug-treated embryo phenotypes

Embryonic development is coordinated by a variety of signaling molecules, and modulating their activity can change embryonic phenotypes. Seven major signaling pathways act during zebrafish development: the bone morphogenetic protein (BMP), retinoic acid (RA), Wnt, fibroblast growth factor (FGF) and Nodal pathways mainly regulate germ layer specification and the formation of the anterior-posterior and dorsal-ventral axes, while the Sonic Hedgehog (Shh) and planar cell polarity (PCP) pathways control the extension and morphogenesis of the body axis.

The researchers tested the effectiveness of the twin network in detecting abnormal embryos, and the results are shown in the figure below:

Phenotypic comparison between untreated embryos and drug-treated embryos

a: Untreated embryos were used as a reference for the phenotype of drug-treated embryos;

b-i: Changes in similarity between embryos treated with different drugs and untreated embryos;

j: Dependence of abnormality detection accuracy on the number of embryos.

Comparison of the phenotypes of untreated embryos with those treated with BMP, Nodal, FGF, Shh, PCP, and Wnt inhibitors and RA-exposed embryos revealed high similarity values between untreated embryos, whereas similarity values between embryos treated with small molecule drugs and untreated embryos were generally low.

Statistical analysis of time points is performed to determine the time points at which the embryo population deviates significantly from the reference population, thereby detecting embryo populations with phenotypic defects. The accuracy of the detection depends on the number of embryos analyzed and the type of interference.
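The per-time-point comparison can be sketched as follows. The article does not name the statistical test the authors used, so Welch's t statistic with a fixed cutoff is swapped in here purely as a stand-in; the function names, the cutoff of 4.0, and all similarity values are made up.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (needs nonzero variance)."""
    var_a = statistics.variance(sample_a) / len(sample_a)
    var_b = statistics.variance(sample_b) / len(sample_b)
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(var_a + var_b)


def first_deviation(reference_by_time, treated_by_time, t_cutoff=4.0):
    """Earliest time point where |t| exceeds t_cutoff, or None if there is none."""
    for time_point in sorted(reference_by_time):
        t = welch_t(reference_by_time[time_point], treated_by_time[time_point])
        if abs(t) > t_cutoff:
            return time_point
    return None


# Made-up similarity values (four embryos per group) at 0, 200 and 400 min.
reference = {0: [0.90, 0.92, 0.91, 0.89], 200: [0.90, 0.91, 0.92, 0.88], 400: [0.91, 0.90, 0.89, 0.92]}
treated = {0: [0.91, 0.90, 0.92, 0.89], 200: [0.89, 0.91, 0.90, 0.90], 400: [0.60, 0.65, 0.58, 0.62]}

print(first_deviation(reference, treated))
```

With more embryos per group the standard error shrinks, which is consistent with the point above that detection accuracy depends on the number of embryos analyzed.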

The study also explored how accurately the method identifies phenotypes of different penetrance and severity. The known range of zebrafish embryo phenotypes caused by different levels of BMP pathway inhibition is shown in the figure. The twin network was able to accurately detect the developmental deviations: for highly penetrant or obvious phenotypes caused by high doses of a small-molecule BMP signaling pathway inhibitor, only a small number of embryos is needed for accurate detection, while mild phenotypes require about 30 embryos.

Phenotypic changes of zebrafish embryos under different levels of BMP pathway inhibition

These analyses show that the Siamese network, trained only with images of normally developing embryos, is able to detect embryonic phenotypic changes in an unbiased manner.

Result 5: Automatic derivation of embryonic development period

Typically, reference embryo images are available to assess the developmental timing of test embryos, but for newly discovered or uncharacterized species, such reference images may not be available.

The researchers propose that a twin network can be used to determine the developmental stage by calculating the similarity between a test image and other images of the same embryo at earlier time points.

The results of similarity analysis on zebrafish embryos are shown in the figure:

Embryonic development period derivation

a: Calculate the similarity between the test embryo and images from previously acquired time points of the same embryo;

b: Representative similarity matrix.

Similarity showed distinct distribution characteristics at different developmental stages. The researchers observed a common pattern: high similarity values clustered locally, while at more distant time points, similarity values were low and plateaued.
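The similarity matrix of a single embryo can be built by comparing each time point with all earlier ones. The sketch below assumes cosine similarity as the measure and uses made-up two-dimensional embeddings arranged to form two "plateaus"; the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def self_similarity_matrix(embeddings):
    """Lower-triangular matrix: row i holds similarities to time points 0..i."""
    return [
        [cosine_similarity(embeddings[i], embeddings[j]) for j in range(i + 1)]
        for i in range(len(embeddings))
    ]


# Made-up embeddings of one embryo at four time points: two early, two late.
embeddings = [[1.0, 0.0], [0.98, 0.2], [0.2, 0.98], [0.0, 1.0]]
matrix = self_similarity_matrix(embeddings)

for row in matrix:
    print([round(value, 2) for value in row])
```

In this toy example, neighboring time points within a plateau stay highly similar, while the jump between the second and third time points marks a transition.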

Interestingly, the local and global statistical similarities between pairs of images, as assessed by the Siamese network, are consistent with the order of key stages during development. Embryos that fall within a plateau have stable morphology, highlighting major periods in development, such as the classic cleavage, blastula, embryonic disc, organogenesis, and segmentation stages. In contrast, embryos that fall on the border between plateaus represent transient periods of major change in developmental morphology.

Next, the researchers tried to extend this method to other species, including medaka and three-spined stickleback. The results showed that the twin network generated rich maps for these morphologically diverse embryo sequences.

Automatic detection of developmental stages and transitions in medaka and three-spined stickleback embryos

In further research, they applied this method to the more distantly related nematode Caenorhabditis elegans. The researchers used open data from different independent sources, such as published papers and YouTube videos, to train and evaluate the network, and successfully identified, fully automatically, the first division cycle of C. elegans, in which the first four proembryonic cells form.

These results indicate that the Twin Network approach can be used to automatically generate developmental maps of different species for different biological systems and a wide range of image datasets, without the need for models previously trained specifically for this purpose.

03 Twin Network vs. Digital Twin Network

In the 5G era, digital twin networks have been mentioned frequently. At the same time, the "twin technology" with a similar name - twin networks - has also emerged in the field of image recognition. Although the two concepts are different, they have shown synergy in some fields.

First of all, please note that these are two completely different concepts.

Twin Network: A deep learning architecture that is mainly used in image retrieval, image matching, image classification and other fields. It realizes the comparison and analysis of image similarity by learning the embedded representation of images.

Digital Twin Network: A virtual model of a physical entity, which interacts with its corresponding physical entity through real-time data updates and simulation technology, and can simulate the behavior and performance of the physical entity under different conditions. It is mainly used in industrial manufacturing, Internet of Things, urban planning, aerospace and other fields.

As an AI algorithm, Twin Network can leverage its own advantages to empower and improve the efficiency of digital twin networks.

For example, in the digital twin of industrial equipment, the twin network can compare equipment images at different time points to understand the changes and differences in equipment status; in digital twin city planning, the twin network can process image data captured by monitoring probes, conduct real-time monitoring and simulation of traffic flow and road conditions, and so on.

In summary, Twin Network provides image-related support and applications for Digital Twin Network by combining image data and deep learning technology, thereby improving the information acquisition, monitoring and decision-making capabilities of digital twins.

Beyond Twin Network, other AI tools are also expected to further empower digital twins.