Zengyi Qin

qinzy [at] mit.edu

I am an AI researcher and entrepreneur, and a PhD candidate at MIT.



  • [Apr 2024] Our new AI model JetMoE went viral in the AI field. Commenters said that it represents a paradigm shift in AI training and called it a game-changer that revolutionizes AI training with remarkable cost efficiency.
  • [Apr 2024] JetMoE received strongly positive comments from the AI field. See what MIT CSAIL said. Our innovative model architecture and training strategy attracted massive attention (see 1 2 3 4).
  • [Jan 2024] Our voice cloning algorithm OpenVoice gained massive attention in the Gen AI field and was featured in VentureBeat, the leader in covering transformative tech. OpenVoice was also featured in AI News, HyScaler and LinkedIn.
  • [Jan 2024] Tech media praised OpenVoice highly, with comments such as "OpenVoice Unveils Revolutionary Open-Source Technology" and "A New Era of Open Source AI Voice Cloning Technology".
  • [Jan 2024] OpenVoice ranked No. 1 globally on GitHub trending (with >14k stars) and HuggingFace trending, surpassing contemporary work in audio generation by Meta and Microsoft.


JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars
Yikang Shen, Zhen Guo, Tianle Cai and Zengyi Qin
Technical Report, 2024

Training a high-performance LLM with remarkable cost efficiency.
website | github | tech report

OpenVoice: Versatile Instant Voice Cloning
Zengyi Qin, Wenliang Zhao, Xumin Yu and Xin Sun
Technical Report, 2024

Instantly clone any voice to generate speech in any style and any language.
paper | website | source code


MonoGRNet: A General Framework for Monocular 3D Object Detection
Zengyi Qin, Jinglu Wang and Yan Lu
The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

A general monocular 3D object detection framework that flexibly adapts to both fully and weakly supervised learning. It alleviates the need for extensive 3D labels and requires only ground truth 2D bounding boxes during training.


SABLAS: Learning Safe Control for Black-Box Dynamical Systems
Zengyi Qin, Dawei Sun and Chuchu Fan
IEEE Robotics and Automation Letters (RA-L), 2022

Learning control barrier functions (CBFs) for safe control of black-box systems. CBFs are a powerful tool for providing safety guarantees, but before this work they could not be directly applied to black-box systems whose models are unavailable.

paper | code

KETO: Learning Keypoint Representations for Tool Manipulation
Zengyi Qin, Kuan Fang, Yuke Zhu, Li Fei-Fei and Silvio Savarese
The International Conference on Robotics and Automation (ICRA), 2020

KETO is a framework for robots to manipulate unseen objects as tools to complete diverse tasks. We proposed a method to learn keypoint representations of objects, which simplify the manipulation task and improve generalization to novel objects.

paper | video | website | code

Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates
Zengyi Qin, Kaiqing Zhang, Yuxiao Chen, Jingkai Chen and Chuchu Fan
The International Conference on Learning Representations (ICLR), 2021

We study the multi-agent safe control problem where agents should avoid any collision while reaching their goals. Our method can scale up to an arbitrarily large number of agents (e.g., >1000 in our experiments) and achieve a 99-100% safety rate.

paper | video | code | website

Reactive and Safe Road User Simulations using Neural Barrier Certificate
Yue Meng, Zengyi Qin and Chuchu Fan
The International Conference on Intelligent Robots and Systems (IROS), 2021

Reactive and safe agent modeling is important for modern traffic simulator design and safe planning applications. We propose a control barrier function-based method to simulate traffic agents that behave like humans or human-controlled vehicles and react to other road participants.

paper | website

Density Constrained Reinforcement Learning
Zengyi Qin, Yuxiao Chen and Chuchu Fan
The International Conference on Machine Learning (ICML), 2021

We study constrained reinforcement learning (CRL) from a novel perspective by setting constraints directly on state density functions, rather than the value functions considered by previous work. State density has a clear physical and mathematical interpretation, and is able to express a wide variety of constraints such as resource limits and safety requirements.

paper | code | website

Safe Nonlinear Control Using Robust Neural Lyapunov-Barrier Functions
Charles Dawson, Zengyi Qin, Sicun Gao and Chuchu Fan
The Conference on Robot Learning (CoRL), 2021

Safety and stability are common requirements for robotic control systems. We propose a robust feedback method based on robust control Lyapunov barrier functions that generalize despite model uncertainty and provide safety and stability guarantees.


Controller synthesis for linear system with reach-avoid specifications
Chuchu Fan, Zengyi Qin, Umang Mathur, Qiang Ning, Sayan Mitra, and Mahesh Viswanathan
IEEE Transactions on Automatic Control (TAC), 2021

We address the problem of synthesizing provably correct controllers for linear systems with reach-avoid specifications. Our solution decomposes the overall synthesis problem into two smaller and more tractable problems, achieving a 2-150 times speedup compared with previous techniques.


Weakly Supervised 3D Object Detection from Point Clouds
Zengyi Qin, Jinglu Wang and Yan Lu
ACM Multimedia (ACM MM), 2020

A state-of-the-art framework for weakly supervised 3D object detection from point clouds without using any ground truth 3D bounding box for training. The core of our method is the unsupervised 3D object proposal module and the cross-modal knowledge distillation strategy.

paper | code

Learning fine-grained estimation of physiological states from coarse-grained labels by distribution restoration
Zengyi Qin, Jiansheng Chen, Zhenyu Jiang, Xumin Yu, Chunhua Hu, Yu Ma, Suhua Miao and Rongsong Zhou
Scientific Reports, 2020

Our method allows machine learning algorithms to perform fine-grained estimation of physiological states (e.g., sleep depth) even if the training labels are coarse-grained.

paper | code

Triangulation Learning Network: from Monocular to Stereo 3D Object Detection
Zengyi Qin, Jinglu Wang and Yan Lu
The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019

This is a pioneering work on stereo-image-based 3D object detection without computing pixel-level depth maps. We proposed a triangulation learning method to learn the object-level stereo geometric correspondence for 3D object detection.

paper | video | code | website

MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization
Zengyi Qin, Jinglu Wang and Yan Lu
The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI), 2019, Oral Presentation, Acceptance Rate < 8%

A state-of-the-art monocular 3D object detection approach based on geometric reasoning. We proposed to decompose the whole task into four progressive sub-tasks, which significantly facilitates monocular 3D object detection.

paper | video | code | website

sEMG based Tremor Severity Evaluation for Parkinson's Disease using a Light-weight CNN
Zengyi Qin*, Zhenyu Jiang*, Jiansheng Chen, Chunhua Hu and Yu Ma
IEEE Signal Processing Letters (SPL), 2019

A machine learning framework to assist in diagnosing Parkinson's disease by assessing pathological tremor. We proposed a light-weight convolutional neural network and a similarity learning strategy to handle the scarcity of medical data.

paper | website