In-depth talks from research pioneers at HUST, Shanghai AI Lab, and Shanghai Jiao Tong University: latest achievements, experience in submitting papers to top conferences, challenges of interdisciplinary collaboration...

Artificial intelligence draws on multiple disciplines, including computer science, mathematics, statistics, and cognitive science, and its development depends heavily on cultivating interdisciplinary talent. In recent years, the rise of AI for Science has shown the disruptive potential of deeply integrating artificial intelligence with the basic sciences. Today, many outstanding scholars are pushing scientific research to new heights with their multidisciplinary backgrounds. For example:
* Associate Professor Huang Hong of Huazhong University of Science and Technology has academic experience in broadcasting and television engineering, information engineering, and computer science. Today, she focuses on data-driven scientific research, including data mining, big data analysis, and social network analysis.
* Zhou Dongzhan, a young researcher at the AI for Science Center of Shanghai Artificial Intelligence Laboratory, started out in physics, then turned to artificial intelligence, and now works on applying AI to materials science.
* Zhou Bingxin, assistant researcher at the Institute of Natural Sciences, Shanghai Jiao Tong University, majored in finance as an undergraduate, studied data analysis for her master's degree, and focused on machine learning and deep learning during her doctoral studies. Now, she uses deep learning to solve problems in biology, such as protein design and modification based on deep learning algorithms.
Huang Hong: Our research should be able to truly solve practical problems
As an associate professor and doctoral/master's supervisor at Huazhong University of Science and Technology, Associate Professor Huang Hong has been deeply involved in data mining, big data analysis and other fields for many years, and has published many papers as the first/corresponding author in top international journals and conferences such as TKDE, TKDD, WWW, IJCAI, and WSDM. However, her scientific research journey was not smooth sailing.

Recalling her frustrating experiences during graduate school, Associate Professor Huang Hong said that she had revised a paper 28 times. When she revised it for the 25th time, she felt overwhelmed. Later, with the encouragement of her friends and mentors, she calmed down, re-examined the paper, and found that there were still many details that needed to be improved. Finally, through continuous adjustment and polishing, she successfully published it.
In the opinion of Associate Professor Huang Hong, "The key to doing scientific research is to see whether the idea of your article really solves a problem in a certain aspect and whether it puts forward a reasonable research motivation." Based on this view, her research focuses on two directions: first, innovating methods in big data analysis and data mining; second, developing data-driven applications to solve practical social problems.
In the field of method innovation, Associate Professor Huang Hong's team mainly focuses on graph neural networks and the modeling of complex systems. She believes that in the current era of big data, a graph structure is an effective way to mine the value of data: the things around us are abstracted as nodes, the relationships between those nodes are analyzed, and a graph is constructed from them.
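To make the node-and-relationship idea concrete, here is a minimal Python sketch. It is only an illustration, not the team's code: the user names, the `relations` list, and the helper functions `build_graph` and `common_neighbors` are all hypothetical.

```python
from collections import defaultdict

def build_graph(relations):
    """Build an undirected graph as an adjacency map from (entity, entity) pairs."""
    adjacency = defaultdict(set)
    for a, b in relations:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def common_neighbors(adjacency, u, v):
    """Nodes linked to both u and v: a simple signal of how closely u and v are related."""
    return adjacency[u] & adjacency[v]

# Hypothetical relationships between users of a social platform.
relations = [("alice", "bob"), ("alice", "carol"), ("bob", "carol"), ("carol", "dave")]
graph = build_graph(relations)

# Degree (number of direct connections) of every node.
degree = {node: len(neighbors) for node, neighbors in graph.items()}
```

Once data is in this form, questions such as "who is most connected" or "which two users share contacts" become simple operations on the adjacency map. A graph neural network goes further by learning from the same structure.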
In addition, their team is also developing data-driven applications, such as social network analysis. Between 2009 and 2012, social network development was at its peak, with platforms such as Weibo, Twitter and Facebook gradually emerging. This also prompted Associate Professor Huang Hong's team to use the data from these platforms to analyze the development of network structure and carry out user recommendations, public opinion analysis and other work.
"During the COVID-19 pandemic, we analyzed international news media comments on China and studied the changes in attitudes toward China on the Internet, providing data support for understanding external positions," said Associate Professor Huang Hong.
Another interesting case study analyzes the socioeconomic status of individuals and uses it for urban planning. "We work with the telecommunications department to obtain users' mobile traffic log data, analyze users' GPS locations, identify users' activity areas, and combine housing price information in these areas to infer the level of the area in the city." For example, a person who frequently appears in a financial district may have a high socioeconomic status, while a person who often appears near schools or educational institutions may be a student or educator. Based on such signals, researchers can evaluate an individual's socioeconomic status more comprehensively, providing a reference for urban planning.
In terms of industrial intelligence, Associate Professor Huang Hong's team is also using artificial intelligence to automatically identify and diagnose faults in industrial equipment, greatly improving the efficiency and accuracy of equipment maintenance.
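As a rough illustration of how activity areas and housing prices might be combined, the sketch below scores a user by the visit-weighted average housing price of the areas they frequent. This scoring rule is an assumption made for illustration, not the method the team describes. The area names, prices, and visit counts are invented.

```python
def socioeconomic_score(visit_counts, area_price):
    """Visit-weighted average housing price over the areas a user frequents.

    visit_counts: {area name: number of observed visits} (hypothetical input)
    area_price:   {area name: average housing price}     (hypothetical input)
    """
    known = {area: n for area, n in visit_counts.items() if area in area_price}
    total_visits = sum(known.values())
    if total_visits == 0:
        return None  # no visits to areas with a known price
    weighted_sum = sum(n * area_price[area] for area, n in known.items())
    return weighted_sum / total_visits

# Invented example data.
area_price = {"financial_district": 90000, "university_town": 40000, "suburb": 20000}
user_visits = {"financial_district": 30, "suburb": 10}
score = socioeconomic_score(user_visits, area_price)  # (30*90000 + 10*20000) / 40
```

A real system would also need to handle privacy, noisy location data, and the difference between where someone works and where they live, none of which this sketch attempts.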
Associate Professor Huang Hong concluded: "You must be interested in the research you want to do." In her opinion, scientific research is essentially a boring process that requires great patience, but if you are truly interested in it, you will have the self-motivation to stick to it, "which is also the quality I value most when recruiting students."
Zhou Dongzhan: Let AI generate new ideas like scientists
Dr. Zhou Dongzhan also agrees with Associate Professor Huang Hong's point of view: "If you don't have interest, it is really difficult to do better work." In Zhou's view, the key to choosing a research direction is not whether the field is "hot" or "popular". Popular fields can still produce benchmark-setting results, and niche tracks can still surface new problems. Researchers should step out of their comfort zone, avoid homogenized research, and aim for more solid results.
Currently, Dr. Zhou Dongzhan's research applies AI technologies such as large language models and multimodal models to materials science. The main results are shown in the figure below:

Last January, the Shanghai Artificial Intelligence Laboratory launched "Shusheng Jianyuan", a large language model for chemistry, to explore cutting-edge topics that combine general-purpose large models with specialized fields. The chemical language model performs well on many core chemistry tasks (molecule- and reaction-related) and exceeds GPT-4 on many metrics. Because external knowledge matters in chemical research, the team added a Retrieval-Augmented Generation (RAG) mechanism to the language model to reduce hallucination. Because chemical data come in many modalities, the team further developed a multimodal version, which performs well in molecular recognition and multimodal chemical reasoning and exceeds GPT-4V on many metrics. Because the model needs to use scientific tools, the team also developed an Agent toolkit that integrates more than 50 chemical tools covering search, calculation, molecules, and reactions, so that the model can carry out related tasks more efficiently.
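For readers unfamiliar with RAG, the idea is to look up relevant reference text first and place it in the prompt, so the model answers from retrieved facts rather than from memory alone. The Python sketch below shows only that retrieve-then-prompt step under simplifying assumptions. The word-overlap retriever stands in for a real search index, the three documents are toy examples, and `retrieve` and `build_prompt` are hypothetical names, not part of the laboratory's system.

```python
import re

def tokenize(text):
    """Lowercase a text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=2):
    """Rank documents by word overlap with the question (stand-in for a real retriever)."""
    question_tokens = tokenize(question)
    ranked = sorted(documents,
                    key=lambda doc: len(question_tokens & tokenize(doc)),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents, top_k=2):
    """Assemble a prompt that asks the model to answer from the retrieved context."""
    lines = ["Answer using only the context below.", "Context:"]
    lines += [f"- {doc}" for doc in retrieve(question, documents, top_k)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)

# Toy knowledge base.
documents = [
    "Ethanol has the molecular formula C2H6O.",
    "Benzene is an aromatic hydrocarbon with formula C6H6.",
    "Sodium chloride is an ionic compound.",
]
prompt = build_prompt("What is the molecular formula of ethanol?", documents, top_k=1)
```

The resulting `prompt` would then be sent to a language model. That call is omitted here because it depends on the model and serving stack in use.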
Based on the above research, the laboratory team wants AI to take on more complex tasks, rather than just letting the large language model stay at the question-answering level. So the team began to explore whether AI can generate new scientific research hypotheses like scientists.

As shown in the figure above, AI automatically generates research hypotheses from a given research background and question. For example, if you want to study a certain type of battery and find materials and components with specific properties, the MOOSE-CHEM system decouples the research background from the inspirations and uses its built-in multi-agent workflow to generate high-quality scientific ideas.

The study found that formulating a scientific hypothesis is a complex reasoning process that is difficult to complete in a single step. Therefore, the team broke the process down: the system iteratively searches for inspirations and hypotheses, then searches further over the generated hypotheses, so that the final hypotheses are more solid and diverse.
At the same time, the team also built a benchmark for evaluating scientific hypothesis generation. As shown in the figure below, the study found that models with better performance have stronger retrieval capabilities.
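The iterative structure described above can be sketched as a small beam search: each round adds one inspiration to each surviving hypothesis, scores the results, and keeps the best few. This is a loose illustration under stated assumptions, not the MOOSE-CHEM implementation. In a real system `propose` and `score` would be model calls. Here they are toy functions, and the keyword weights are invented.

```python
def iterative_hypothesis_search(background, inspirations, propose, score, rounds=2, beam=2):
    """Grow hypotheses over several rounds, adding one inspiration per round.

    propose(background, hypothesis, inspiration) -> new hypothesis text
    score(hypothesis) -> float, higher is better
    Both callables are placeholders for model calls.
    """
    candidates = [("", frozenset())]  # (hypothesis text, inspirations used so far)
    for _ in range(rounds):
        expanded, seen = [], set()
        for hypothesis, used in candidates:
            for inspiration in inspirations:
                new_used = used | {inspiration}
                if inspiration in used or new_used in seen:
                    continue  # skip repeats and duplicate combinations
                seen.add(new_used)
                expanded.append((propose(background, hypothesis, inspiration), new_used))
        if not expanded:
            break
        expanded.sort(key=lambda item: score(item[0]), reverse=True)
        candidates = expanded[:beam]  # keep only the best few for the next round
    return [hypothesis for hypothesis, _ in candidates]

# Toy stand-ins for the model calls; the weights are invented.
def toy_propose(background, hypothesis, inspiration):
    return f"{hypothesis} + {inspiration}" if hypothesis else f"{background}: {inspiration}"

TOY_WEIGHTS = {"ruthenium": 3, "nitrogen doping": 2, "copper": 1}

def toy_score(hypothesis):
    return sum(weight for keyword, weight in TOY_WEIGHTS.items() if keyword in hypothesis)

hypotheses = iterative_hypothesis_search(
    "catalyst", ["ruthenium", "nitrogen doping", "copper"], toy_propose, toy_score)
```

Keeping several candidates per round, rather than only the single best, is what lets the search return hypotheses that are diverse as well as high-scoring.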

In addition, the study confirmed that in electrochemistry-related tasks, the model can generate executable scientific hypotheses rather than just general concepts. For example, its hypotheses specify the core components of the material, such as ruthenium metal and nitrogen doping. The laboratory team is already working with relevant research groups, hoping to promote the practical application of the system and make it a true scientific research assistant.

The laboratory team is working to enable AI to generate scientific research ideas and even promote scientific innovation. Looking back, Zhou Dongzhan said that physicist Wu Jianxiong deeply influenced Zhou's attitude toward scientific research: "The deviation of research results may come from a very small detail problem." For this reason, Zhou always emphasizes that attention to detail and in-depth scrutiny are the keys to breakthroughs in scientific research.
Zhou Bingxin: Self-developed protein model ranks first on the global authoritative list
In everyone's growth trajectory, there may be an "idol" who has a subtle influence on learning, career, and even life planning. Talking about her "scientific research idol", Dr. Zhou Bingxin introduced that "the reason why I chose to do scientific research was largely influenced by my doctoral supervisor." In Zhou Bingxin's impression, her doctoral supervisor is a very responsible person, serious, patient, approachable, and replies to student messages in seconds. He will even help her modify the code word by word and check the formula derivation line by line. "I hope that in the future I can regard the training of students as a very important thing like my supervisor."
In terms of choosing a research direction, Zhou Bingxin believes that there is no single "right path". The key is to find the path that suits you best and stick to it. "It depends on what you are more willing to do and your risk tolerance. As long as you are happy, there is no need to blindly follow the trend of involution or fashion."
Zhou Bingxin also shared some of the team’s research in recent years, especially the exploration of AI in protein modification.
In industry, enzymes are used in drug development, disease monitoring, plastic degradation, and more. However, natural proteins are adapted to the specific environments they evolved in (such as high pressure or high temperature), so they may not meet industrial needs. They therefore need to be modified to improve their catalytic activity, thermal stability, binding affinity, and substrate selectivity.

In recent years, artificial intelligence-assisted protein design has gradually emerged. As shown in the figure below, the basic idea is that a self-supervised model first learns from a large amount of protein data (sequence, structure, evolutionary information), and then a small labeled dataset for the downstream task (for example, predicting protein activity) is used to train a prediction model. Depending on the specific goal (for example, improving activity), the structure or sequence of the protein is then re-optimized or designed from scratch.

After a protein sequence is modified, the biology team introduces it into an expression system such as E. coli or yeast for expression and purification. The purified protein is then tested for biochemical properties such as activity, stability, and binding affinity, depending on its intended use. Algorithms can also help in this process, for example by predicting the expression level, solubility, and activity of a given protein. In the end, only the protein sequences recommended by the algorithm need to be tested experimentally, which further reduces costs.
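The "learn from a few labeled examples, then test only the recommended sequences" workflow can be sketched in a few lines of Python. Everything here is a deliberately crude stand-in. `embed` uses amino-acid composition in place of a pretrained protein model, the predictor is a simple nearest-neighbor average rather than a trained network, and the sequences and activity values are invented.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def embed(sequence):
    """Amino-acid composition vector: a crude stand-in for a pretrained protein embedding."""
    return [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]

def predict_activity(sequence, labeled, k=2):
    """Average the measured activity of the k labeled sequences nearest in embedding space."""
    target = embed(sequence)

    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(target, embed(item[0])))

    nearest = sorted(labeled, key=squared_distance)[:k]
    return sum(activity for _, activity in nearest) / len(nearest)

def recommend(candidates, labeled, top_n=1):
    """Return the top_n candidates by predicted activity, i.e. the ones worth testing in the lab."""
    ranked = sorted(candidates, key=lambda seq: predict_activity(seq, labeled), reverse=True)
    return ranked[:top_n]

# Invented (sequence, measured activity) pairs standing in for a small labeled dataset.
labeled = [("AAAA", 0.9), ("AAAG", 0.8), ("GGGG", 0.1), ("GGGA", 0.2)]
candidates = ["AAGA", "GGAG"]
to_test = recommend(candidates, labeled)
```

In practice the embedding would come from a model trained on large protein datasets, and experimental results for the recommended sequences would be added back to `labeled` for the next round.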

As shown in the figure below, the work of Zhou Bingxin's team covers various modules of protein engineering, including but not limited to inferring sequences from protein structure and inferring sequences from function. "We hope to develop our own tools and explore how to combine these tools with subsequent biological experiments to form a complete cycle, thereby achieving iterative optimization between dry experiments (computational simulations) and wet experiments (actual biological experiments)."

So far, the tools developed by the team have achieved excellent results in both dry and wet experiments. For example, on ProteinGym, a globally recognized leaderboard, the team's models took first and second place.

In addition, the growth hormone developed by the team achieved the world's first truly large-scale production (5,000 liters) of an AI-designed protein. The team also successfully modified the EPS-G7 enzyme, improving its specificity and catalytic activity, reducing production costs by 90%, and breaking the import monopoly.

In addition to modifying a single site or a few sites, the team has also generated entire protein sequences. For example, Ago-family proteins used for nucleic acid cleavage, which naturally function at high temperatures, were redesigned so that they maintain good activity at room temperature and are suitable for cleavage steps in nucleic acid test kits.

The biggest problem between AI practitioners and Science practitioners is communication
It is worth mentioning that because Dr. Zhou Bingxin's field is highly interdisciplinary, her team compiled a large amount of data, tools, and downstream task evaluation modules and integrated them into a tool library called VenusFactory, in order to promote communication between AI practitioners and Science practitioners.

In Dr. Zhou Bingxin's view, communication skills are crucial in the collaboration between AI and science. "When I first started working in the biological field, many biological partners wanted to work with us, but I couldn't understand what they were saying. Now, based on my own understanding, I can transform the scientific questions they raised into engineering problems and find corresponding algorithms to solve them."
Dr. Zhou Dongzhan agrees with this view, emphasizing: "When working with universities, research institutes or companies, it is critical to ensure that both parties understand the problem at the same level. We need to let our partners in the scientific field understand the current status of AI technology, and at the same time let the technical team understand what the most critical issues are."
Associate Professor Huang Hong added that it is very important to master basic knowledge in interdisciplinary collaboration. She recalled her collaboration with Professor Luo Jiade's team from the Department of Sociology at Tsinghua University. In the early stages, the sociology team proposed research questions, and the technical team provided data analysis support and was responsible for experimental design. Over time, the technical team gradually mastered the basic knowledge of sociology, began to independently raise questions and discuss with the sociology team, and this collision of ideas gave rise to a number of research results.
It is worth mentioning that ICLR 2025 and other top conferences have recently been announcing their results, and several important conferences have not yet reached their deadlines. We took this opportunity to ask the teachers to share their experience in submitting papers to top AI conferences, as follows:
1. Read the call for papers carefully. Clarify the acceptance requirements of each top conference to avoid losing a submission opportunity.
2. Pay attention to the details of the article. The format should be correct, the figures should be clear, and the layout should be clean.
3. Be clear about the submission deadline. All experiments should be completed at least one week in advance to ensure the paper is complete and to leave reviewers less room for doubt.
4. Examine the research question. Ask whether the idea of the article really solves a problem and whether the research motivation is reasonable.
5. Advice on writing papers
* Suggested paper outline: first, introduce the background; second, describe previous research and the problems that remain; third, present your own work, making sure your ideas reach the reviewer and convince them.
* In addition, to keep the article logically sound, each research question and its subsequent experimental verification need to be linked together and self-consistent.
6. About rejection: it is normal to have a manuscript rejected. Reviewers have different preferences, so you can try submitting it a few more times.