HyperAI

Cornell University Pioneered a "Microwave Brain" Chip That Processes Ultra-High-Speed Data and Wireless Communication Signals Simultaneously, Achieving an Accuracy of 75% at 176 Milliwatts of Power


High-bandwidth applications are reshaping the fabric of modern society in invisible yet profound ways, building an "invisible network" of efficient operations across diverse sectors, including the digital economy, public services, and industrial upgrades. From cross-border shopping with a simple tap to immersive cloud gaming, these seemingly ordinary everyday experiences rely on the robust support of high-speed data centers—and high bandwidth is the key to ensuring their efficient operation.

However, the high-performance computing required for high-bandwidth applications is becoming increasingly expensive. The required sampling and processing clock rates are constrained by both semiconductor physics and power limitations. As a result, higher rates increase the pressure on power consumption and heat dissipation. For example, in traditional electronic signal processing chains used in data centers, signals must be precisely timed and sampled as they travel through lossy media. Complex synchronization circuits are then used to reconstruct transmission and restore integrity to ensure accurate delivery to the next node. This process relies on extensive, power-hungry parallel processing, creating a critical bottleneck restricting efficiency improvements.

Deep learning offers new directions for high-bandwidth applications. However, current solutions that combine analog computing with deep learning generally target only low-bandwidth applications such as images, voice, or gestures. Microwave photonic chips designed for high bandwidth do exist, but they are limited to a few fixed data functions and suffer from large size and low power efficiency.

To address this dilemma, a Cornell University team proposed a microwave neural network (MNN), an integrated circuit that can simultaneously process ultra-high-speed data and wireless communication signals. The MNN operates on spectral components, capturing features of input data that are sparse in information but wide in bandwidth. Its advantage is that it can process signals spanning several gigahertz (GHz) in a programmable manner while requiring only low-speed control in the megahertz (MHz) range. The strong nonlinearity of the coupled microwave oscillations is then exploited to express the computed result in a narrow spectrum for easy electronic readout. In post-processing, this spectrum can be mapped to a binary output using a linear regression model.

Comparison between traditional digital trunking and MNN solutions

Furthermore, the MNN is well suited to integration. Fabricated using standard complementary metal-oxide-semiconductor (CMOS) technology, it occupies a chip area of only 0.088 mm² and consumes less than 200 mW of power, enabling direct integration into general-purpose analog processors.

The related research results were published in Nature Electronics under the title "An integrated microwave neural network for broadband computation and communication."

Research highlights:

* Developed and fabricated the first low-power integrated circuit that can process ultra-high-speed data and wireless communication signals simultaneously, moving beyond the traditional digital circuit framework and using microwave physics to perform computation.

* Unlike traditional neural networks that rely on digital clocks, the MNN exploits analog, nonlinear behavior at microwave frequencies. It can process data streams at tens of gigahertz, consumes less than 200 milliwatts of power, and reaches an accuracy of up to 88%.

* A wide range of application scenarios, including radar tracking and portable smart devices (such as smart watches), providing low-power, high-performance, and lightweight solutions for high-bandwidth applications.

Paper address:

https://go.hyper.ai/rMZ2K


Training data generation: tailored for multiple tasks

When training the digital backend, the spectral data output by the MNN carries information extracted from the original input rather than a direct digital result. The researchers therefore used a linear regression model to process the 625 measured frequencies within a reduced bandwidth and map these features to the final output.
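The readout step described above can be illustrated with a minimal sketch. It assumes synthetic features and a small feature count (4 instead of the 625 measured frequencies), fits a linear regression by plain gradient descent, and thresholds the output to obtain a binary decision. The function names, the learning rate, and the toy labeling rule are illustrative assumptions, not the authors' code.

```python
import random

def fit_linear_readout(features, targets, lr=0.1, epochs=300):
    """Fit y ~ w.x + b by full-batch gradient descent on the squared error."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_bit(x, w, b):
    """Map the regression output to a binary decision (threshold at 0.5)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0.5 else 0

# Toy stand-in for measured spectra: 4 synthetic "spectral" features per sample.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(400)]
y = [1 if x[0] - 0.5 * x[2] > 0 else 0 for x in X]  # hypothetical labeling rule

w, b = fit_linear_readout(X, y)
accuracy = sum(predict_bit(x, w, b) == t for x, t in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

With real measurements, `X` would hold one row of spectral magnitudes per experiment and `y` the expected output bit for the task.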

To find the best parameterized bitstream, the researchers then randomly sampled parameterized bitstreams, ran them in the experiment, and selected the one with the best validation accuracy for each task. The data setup used to optimize and evaluate each task was as follows:

* Linear search and conditional algorithm emulation: For each parameterized bitstream, the dataset contained 500 randomly generated 32-bit sequences. In 10-fold cross-validation, the dataset was split into 10 parts, with 9 parts used in rotation for training and 1 for validation. A linear support vector machine (SVM) from the scikit-learn (sklearn) package was used with a maximum of 5,000 iterations, a squared hinge loss function, and a regularization parameter of C=0.02. Testing was performed on 40 parameterized bitstreams.

* Bit count: Similar to linear search, but with a maximum of 10,000 iterations and a hyperparameter sweep of C from 0.02 to 0.22. Labels derived from the linear search dataset were used to construct a 32-class classification task.

* Basic bitwise operations (NAND, XOR, and NOR): A linear model was fitted via stochastic gradient descent with a logistic loss and an L1 regularization strength of 0.3. The dataset consisted of 500 randomly generated 32-bit sequences, 16 of which were fixed label bits. Using 10-fold cross-validation, the task was to perform multi-label classification on each output bit, tested on 120 parameterized bitstreams.

* Modulation classification: The RadioML2016.10A dataset was used, split into training and validation sets in an 8:2 ratio. A single-layer linear model was trained in PyTorch with a cross-entropy loss and optimized for 150 epochs using AdamW (learning rate 0.05, weight decay 0.03, batch size 128, and decay factor 0.98). During training, the data was augmented with Gaussian noise (standard deviation 0.01), and testing was performed on 13 parameterized bitstreams.
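The dataset setup in the list above can be sketched in a few lines of standard-library Python: 500 random 32-bit sequences, labels for the bit-count, search, and bitwise tasks, and a 10-fold split. How the paper assigns operands and search patterns is not stated here, so the 8-bit operand split in `bitwise_labels` and the pattern `[1, 0, 1, 1]` are hypothetical choices made only for illustration.

```python
import random

def random_sequences(n_samples=500, n_bits=32, seed=0):
    """Generate random bit sequences like those described for the logic tasks."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_samples)]

def bit_count_label(bits):
    """Label for the bit-count task: the number of 1s in the sequence."""
    return sum(bits)

def contains_pattern(bits, pattern):
    """Label for a search task: 1 if `pattern` occurs anywhere in `bits`."""
    m = len(pattern)
    return int(any(bits[i:i + m] == pattern for i in range(len(bits) - m + 1)))

def bitwise_labels(bits, width=8):
    """Multi-label targets; the operand layout here is an assumption."""
    a, b = bits[:width], bits[width:2 * width]
    return {
        "NAND": [1 - (x & y) for x, y in zip(a, b)],
        "XOR": [x ^ y for x, y in zip(a, b)],
        "NOR": [1 - (x | y) for x, y in zip(a, b)],
    }

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, val

data = random_sequences()
counts = [bit_count_label(s) for s in data]
hits = [contains_pattern(s, [1, 0, 1, 1]) for s in data]
splits = list(k_fold_indices(len(data)))
print(len(data), len(splits), len(splits[0][0]), len(splits[0][1]))  # 500 10 450 50
```

In the experiments, the classifier input would be the MNN output spectrum measured for each sequence, not the raw bits; the sketch covers only the label and split bookkeeping.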

For the radar task evaluation, the researchers used a digital neural network backend to predict target flight patterns. Each capture provided a 2 GHz wideband spectrum. The input for each scenario had shape (L, S), where L = 1000 captures (spanning the full duration of the scenario) and S is the spectrum size. The researchers then used a deep ResNet architecture to process the MNN output spectrum data directly.

The ResNet consists of a 2x downsampling pooling layer and a residual branch with two convolutional layers (kernel size 3). Batch normalization, rectified linear unit (ReLU) activation, and random dropout regularization are applied between the convolutions.

To train the model, the researchers combined a bitstream search with a backend neural network trained on experimental outputs to produce the desired classification results. They selected the bitstream with the highest accuracy in the object counting task and collected experimental data from 500 flight scenarios to train the final model. To train the backend, the researchers optimized the model using cross-entropy loss, binary cross-entropy loss, and mean squared error loss.

Finally, in order to improve the generalization performance, the researchers used data augmentation, which included random shift, random bias, random noise, and random masking. All augmentation operations were applied to each sample with a probability of 20%.
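The augmentation scheme described above can be sketched as follows, assuming each of the four operations is applied independently with probability 0.2 to a one-dimensional spectrum. The shift range, bias range, noise level, and mask width are hypothetical values chosen for illustration; the source gives only the list of operations and the 20% probability.

```python
import random

def augment(spectrum, rng, p=0.2, max_shift=4, max_bias=0.05,
            noise_std=0.01, mask_width=8):
    """Apply each augmentation independently with probability p."""
    out = list(spectrum)
    n = len(out)
    if rng.random() < p:  # random circular shift along the frequency axis
        k = rng.randint(-max_shift, max_shift) % n
        out = out[n - k:] + out[:n - k]
    if rng.random() < p:  # random constant bias
        bias = rng.uniform(-max_bias, max_bias)
        out = [v + bias for v in out]
    if rng.random() < p:  # random additive Gaussian noise
        out = [v + rng.gauss(0.0, noise_std) for v in out]
    if rng.random() < p:  # random masking of a contiguous window
        start = rng.randrange(0, max(1, n - mask_width + 1))
        for i in range(start, min(n, start + mask_width)):
            out[i] = 0.0
    return out

rng = random.Random(0)
clean = [1.0 + 0.1 * i for i in range(64)]  # placeholder spectrum
augmented = [augment(clean, rng) for _ in range(5)]
print(len(augmented), len(augmented[0]))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible across training runs.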

Model Architecture and Methods: Instantaneous Computation via Nonlinear Systems

The overall structure of the MNN microchip is shown in the figure below. It is a low-power, lightweight integrated circuit capable of processing both ultra-high-speed data and wireless communication signals, and the research team calls it "a computing system modeled after the brain." Its core consists of a nonlinear waveguide (marked A) and three linear waveguides (marked B, C, and D), as well as a gain unit (marked E) and a coupler (marked F).

MNN chip based on 45 nm RF CMOS process

Specifically, the MNN is a nonlinear system (see the figure below for its working mechanism). Gigahertz signals are injected through GSGSG (ground-signal-ground-signal-ground) waveguides. A miniature quadrature hybrid coupler built from two overlapping metal layers then distributes the power of these input signals to the individual waveguides. These smaller portions of the drive signal are reflected from the waveguides and summed at the coupler's output port before being extracted through another set of GSGSG waveguides.

The frequency of the nonlinear waveguide depends strongly on the amplitude and phase of the incident microwave drive signal, whereas the linear waveguides are unaffected by it and provide stable resonant modes.

The primary source of input sensitivity is the set of cascade-coupled nonlinear resonators within waveguide A. These resonators combine nonlinear capacitors with inductors. Antiparallel diodes generate a capacitance with polynomial nonlinearity, whose degree depends on the bias voltage and the microwave signal strength. Each linear waveguide is an adjustable-length transmission line: switches placed along its length allow the return path of the microwave signal to be lengthened or shortened without introducing distortion.

More importantly, parametric (time-varying) coupling is established by a pair of switches (Spar) connected between the paired waveguides. These switches are N-type metal-oxide-semiconductor (NMOS) transistors, controlled by a 150 Mbit/s bitstream that runs at roughly one-hundredth the speed of the input data and is delivered through a third GSGSG waveguide. This on-off sequence of parametric coupling is the key to dynamically reprogramming the network for different computational tasks.

Finally, to sustain the nonlinearity in the circuit caused by high-amplitude microwave transmission, a cross-coupled transistor pair built from thin-gate-oxide, power-amplifier-grade NMOS transistors is used to provide regenerative saturating gain.

This design differs from traditional CMOS oscillators, from complex pulse-sharpening circuits for spectral analysis, and from designs that generate narrowband combs through passively coupled high-quality-factor resonators. It uses commercial CMOS technology, intentionally exposes the coupled waveguides to the input microwaves, and leverages the nonlinearities and asymmetries within the resonators to achieve nearly instantaneous computation.

Experimental setup and results: The highest classification accuracy can reach 88%, with power consumption less than 200 mW

In the experiment, the researchers reasoned that it would help to reduce the circuit to its most basic components. By making the linear waveguides highly detuned from the nominal oscillation frequency of waveguide A, they reduced the number of physical circuit parameters.

In modeling nonlinear dynamics in MNNs, researchers used generalized coupled-mode theory to simplify MNN circuit analysis, reducing it to a coupled-mode model. The linear resonator is simplified to an LC tank circuit, whose natural frequency is altered by adjusting the transmission line length using switches. The nonlinear waveguide consists of polynomial nonlinear capacitors. Losses in the circuit are compensated by saturated gain elements implemented by cross-coupled transistor pairs, with time-varying coupling.

The experimental parameters were then simplified further. The focus is on the interaction between the nonlinear distributed resonances and the linear resonators, with the parametrically driven switches represented as tunable capacitors. The nonlinear dynamics of the simplified circuit are described by a set of coupled modes that includes the coupling between the nonlinear resonator and the linear resonator, the internal losses, and the interaction with the input drive. These dynamics depend on the bias-voltage initial conditions of the nonlinear elements, the microwave drive, and the slow parameter bitstream.
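To make the coupled-mode picture above more concrete, here is a toy numerical sketch of two coupled mode envelopes in a rotating frame: a nonlinear mode `a` with an amplitude-dependent frequency shift and saturating gain, and a detuned linear mode `b`, with the coupling strength switched by a slow bitstream. The equations, the normalized parameter values, and the function name are all assumptions made for illustration; they are not the model or the values used in the paper.

```python
import cmath

def simulate_coupled_modes(bits, steps_per_bit=200, dt=0.01):
    """Integrate a toy two-mode model; returns |a| sampled at every time step."""
    # Hypothetical normalized parameters (not taken from the paper).
    detune_b = 3.0             # linear resonator strongly detuned from mode a
    kerr = 0.5                 # amplitude-dependent frequency shift of mode a
    gain, loss_a, loss_b = 1.0, 0.5, 0.2
    c_on, c_off = 0.8, 0.1     # coupling when the switch bit is 1 / 0
    drive_amp, drive_detune = 0.3, 0.2

    def deriv(a, b, t, c):
        sat_gain = gain / (1.0 + abs(a) ** 2)  # gain that saturates with amplitude
        drive = drive_amp * cmath.exp(1j * drive_detune * t)
        da = (1j * kerr * abs(a) ** 2 + sat_gain - loss_a) * a + 1j * c * b + drive
        db = (1j * detune_b - loss_b) * b + 1j * c * a
        return da, db

    a, b, t = 0.01 + 0j, 0j, 0.0
    envelope = []
    for bit in bits:
        c = c_on if bit else c_off  # slow bitstream sets the coupling
        for _ in range(steps_per_bit):
            # classical fourth-order Runge-Kutta step
            k1a, k1b = deriv(a, b, t, c)
            k2a, k2b = deriv(a + 0.5 * dt * k1a, b + 0.5 * dt * k1b, t + 0.5 * dt, c)
            k3a, k3b = deriv(a + 0.5 * dt * k2a, b + 0.5 * dt * k2b, t + 0.5 * dt, c)
            k4a, k4b = deriv(a + dt * k3a, b + dt * k3b, t + dt, c)
            a += dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
            b += dt * (k1b + 2 * k2b + 2 * k3b + k4b) / 6.0
            t += dt
            envelope.append(abs(a))
    return envelope

envelope = simulate_coupled_modes([1, 0, 1, 1, 0, 0, 1, 0])
print(len(envelope), round(max(envelope), 3))
```

Changing the bit pattern changes the coupling schedule and therefore the output envelope, which is the sense in which the slow bitstream reprograms the response.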

For circuit simulation and layout, researchers designed and simulated a CMOS chip using a transistor model based on GlobalFoundries' 45nm RF silicon-on-insulator process in the Cadence Virtuoso environment. They used Siemens' Calibre tool to extract parasitic resistance and capacitance, and the 2.5D EMX electromagnetic tool to simulate the layout of waveguides, couplers, and transmission lines to accurately model high-frequency performance.

When microwave circuits are used to emulate high-speed digital computation, the starting point is that a gigabit-rate digital signal, being a train of square waves, is essentially an analog signal whose spectrum spans tens of gigahertz. The MNN exploits this property to perform computation directly in the frequency domain, in stark contrast to traditional digital hardware, which operates in the time domain. When processing signals, the MNN presents its output in a narrow frequency band with specific oscillation patterns. This removes the need to strictly preserve signal integrity in the time domain and lets the MNN capture features from across the wide bandwidth of the input signal, reducing the number of compressed features required to train a single-layer digital neural network.
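The claim that a gigabit-rate square-wave signal occupies tens of gigahertz can be checked with a small discrete Fourier transform. The sketch below assumes an alternating 1010... pattern at 10 Gbit/s (the data rate mentioned for the search task) sampled 8 times per bit; the helper names and sampling choices are illustrative.

```python
import cmath

def nrz_waveform(bits, samples_per_bit=8):
    """Ideal non-return-to-zero waveform: +1 for a 1 bit, -1 for a 0 bit."""
    return [1.0 if bit else -1.0 for bit in bits for _ in range(samples_per_bit)]

def dft_magnitudes(samples):
    """Normalized DFT magnitudes for bins 0 .. N/2 (direct O(N^2) evaluation)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]

def bin_frequency_hz(k, n_samples, bit_rate_hz, samples_per_bit):
    """Frequency of DFT bin k for the given bit rate and oversampling."""
    return k * bit_rate_hz * samples_per_bit / n_samples

bits = [1, 0] * 8                      # alternating pattern, 16 bits
wave = nrz_waveform(bits)
mags = dft_magnitudes(wave)
for k, m in enumerate(mags):
    if m > 0.05:                       # list only the significant spectral lines
        f_ghz = bin_frequency_hz(k, len(wave), 10e9, 8) / 1e9
        print(f"{f_ghz:5.1f} GHz  magnitude {m:.3f}")
```

For this pattern the energy sits at 5 GHz and its odd harmonics (15, 25, and 35 GHz), while the even harmonics vanish, so even a simple bit pattern spreads across tens of gigahertz.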

The figure below shows the emulation of ultra-high-speed digital computation without fixed-function digital CMOS circuits. A 32-bit bitstream is input at 150 Mbit/s, the nonlinear resonance responds quickly, and a spectrum analyzer records and averages the output to ensure a reliable Fourier transform. The features used for computation are concentrated in the 10-14 GHz range (corresponding to the X-band and Ku-band frequencies used in satellite communications).

MNN simulates ultra-high-speed digital computing at tens of gigahertz

The results show that adjusting the content of the 150 Mbit/s, 32-bit parameter bitstream and extracting specific spectral features can produce correct results for digital logic operations such as 8-bit NAND, with the best measured accuracy reaching approximately 85% despite lossy transmission cables. Furthermore, a population counter (a circuit that counts the number of 1s in an input bitstream), which would require hundreds of logic gates in digital form, was emulated using a parameterized bitstream with the output mapped through a linear layer, achieving an accuracy of 81% on the validation set. This shows that the MNN's computing capability does not degrade significantly as the complexity of the equivalent digital circuit increases.

In addition, the MNN has been shown to perform bit sequence search in 10 Gbit/s data streams with very high accuracy, offering an alternative to the power-hungry maximum likelihood sequence detection (MLSD) used in traditional communications. By combining the search and counting functions, the MNN also successfully emulated a conditional algorithm, achieving an accuracy of 75% while keeping power consumption below 200 mW (176 mW).

In radar target detection tasks, researchers have found that the MNN's ability to detect subtle frequency changes is highly suitable for wideband radar applications. This study simulated an airspace scenario with multiple aircraft flying along different polygonal trajectories. The radar reflection signals were recorded and converted into analog voltage waveforms. The square waves were then modulated at their center frequency and fed into the MNN. The average output response within the 8-10 GHz frequency range was extracted, and the digital neural network backend was used to infer the flight trajectory. This is shown in the figure below:

Results from the 500 simulated flight scenarios show that the MNN can learn flight patterns by forming distinct responses to frequency changes captured over a long period of time, thereby characterizing the target's flight pattern. The MNN can not only predict the number of dynamic targets, isolate the motion of a specific target, and estimate target speed, but also recognize a variety of polygonal flight trajectories and achieve high F1 scores in scenarios with different numbers of aircraft.

In the wireless signal classification task, the researchers tested the MNN's ability to process lower-frequency signals, exploring its use in identifying wireless communication modulation schemes. The experiment used the RadioML2016.10A dataset, which includes 11 modulation types (9 digital and 2 analog). Various baseband signals were modulated onto a 50 MHz carrier and then fed into the MNN. The MNN's sensitivity converts the transient variations of the low-frequency drive signal into observable features. Features were extracted in the 8-8.5 GHz range to train the back-end linear layer.

The results show that, with suitable parameter bitstreams, the MNN achieves very high accuracy on modulation classification: accuracy on the wireless signal classification task reaches 88%, comparable to digital neural networks. This suggests that the MNN can serve as a deep learning accelerator for edge computing while also significantly reducing model size.

Deep learning and analog computing have great potential

As mentioned at the outset, the continued development of deep learning has opened a new path for high-bandwidth applications. Before this study, many institutions had already explored the field and published their findings. That work laid the theoretical and practical foundation for combining analog computing with deep learning.

For example, the paper "Microwave signal processing using an analog quantum reservoir computer," jointly published by teams from Cornell University and the University of Maryland, proposed performing continuous-time simulation of quantum nonlinear dynamics with superconducting microwave circuits acting as a quantum reservoir. This method can process analog microwave input signals directly, without discretization, and the continuous-variable system gives the quantum reservoir access to a larger Hilbert space. Unlike previous experiments based on digital quantum circuits, this method can directly receive weak analog microwave signals and extract their features, overcoming the input bottleneck.

Paper address:

https://www.nature.com/articles/s41467-024-51161-8

Another example is a study titled "Higher-dimensional processing using a photonic tensor core with continuous-time data," jointly published by teams from the University of Oxford, the University of Münster, and the University of Exeter. The work proposes using a continuous-time data representation to exploit three degrees of freedom (space, wavelength, and radio frequency) and perform matrix-vector multiplication (MVM) on three-dimensional array inputs. Photonic in-memory computing is achieved with an electro-optically controlled photonic tensor core and reconfigurable non-volatile phase-change material memory. The system parallelism reaches 100, two orders of magnitude higher than previous implementations using only two degrees of freedom, verifying the feasibility of adding the radio-frequency degree of freedom to photonic in-memory computing.

Paper address:

https://www.nature.com/articles/s41566-023-01313-x

The MNN proposed by Cornell University further advances the integration of analog computing and deep learning in high-bandwidth scenarios. It does not rely on digital clocks and instead uses microwave physics to achieve ultra-high-speed signal processing. It not only overcomes the power and bandwidth limitations of traditional digital circuits but also demonstrates the potential of analog computing in complex tasks. From radar trajectory tracking to wireless signal classification, the MNN, with its low power consumption and small size, offers a new paradigm for edge computing, high-speed communications, and other fields.

In the future, as technologies such as dynamic parameter adjustment and end-to-end joint training mature, the integration of analog computing and deep learning is expected to break through more bandwidth and efficiency bottlenecks, opening up broader applications in cutting-edge fields such as ultra-high-speed data processing and millimeter-wave communications.
