Zengyi Qin | MIT

Zengyi Qin

qinzy [at] mit.edu

I am an AI researcher and entrepreneur, and a PhD candidate at MIT.

Education

Massachusetts Institute of Technology, 2020 - 2024
Doctor of Philosophy - PhD
I mainly work with MIT CSAIL (Computer Science and Artificial Intelligence Laboratory) and MIT IMES (Institute for Medical Engineering and Science) on foundational AI technologies.

Massachusetts Institute of Technology, 2020 - 2022
Master's degree
My research in safety-guaranteed machine learning at MIT REALM has not only become a key feature of the lab but also inspired numerous subsequent studies.

News

[Apr 2024] Our new AI model JetMoE went viral in the AI field. Commenters called it a paradigm shift and a game-changer for AI training because of its remarkable cost efficiency.

[Apr 2024] JetMoE received strong positive comments from the AI field. See what MIT CSAIL said. Our innovative model architecture and training strategy attracted wide attention (see 1, 2, 3, 4).

[Jan 2024] Our voice cloning algorithm OpenVoice gained massive attention in the Gen AI field and was featured in VentureBeat, the leader in covering transformative tech. OpenVoice was also featured in AI News, HyScaler and LinkedIn.

[Jan 2024] Tech media praised OpenVoice highly, with headlines such as "OpenVoice Unveils Revolutionary Open-Source Technology" and "A New Era of Open Source AI Voice Cloning Technology".

[Jan 2024] OpenVoice ranked No. 1 globally on GitHub trending (with >14k stars) and HuggingFace trending, surpassing contemporary work in audio generation by Meta and Microsoft.

Publications

	JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars. Yikang Shen, Zhen Guo, Tianle Cai and Zengyi Qin. Technical Report, 2023. Training high-performance LLMs with remarkable cost efficiency. website | github | tech report
	OpenVoice: Versatile Instant Voice Cloning. Zengyi Qin, Wenliang Zhao, Xumin Yu and Xin Sun. Technical Report, 2023. Instantly clone any voice to generate speech in any style and any language. paper | website | source code
	MonoGRNet: A General Framework for Monocular 3D Object Detection. Zengyi Qin, Jinglu Wang and Yan Lu. The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. A general monocular 3D object detection framework that flexibly adapts to both fully and weakly supervised learning. It alleviates the need for extensive 3D labels and requires only ground truth 2D bounding boxes during training. paper
	SABLAS: Learning Safe Control for Black-Box Dynamical Systems. Zengyi Qin, Dawei Sun and Chuchu Fan. IEEE Robotics and Automation Letters (RA-L), 2022. Learning control barrier functions (CBFs) for safe control of black-box systems. CBFs are a powerful tool for providing safety guarantees, but before this work they could not be directly applied to black-box systems whose models are unavailable. paper | code
	KETO: Learning Keypoint Representations for Tool Manipulation. Zengyi Qin, Kuan Fang, Yuke Zhu, Li Fei-Fei and Silvio Savarese. The International Conference on Robotics and Automation (ICRA), 2020. KETO is a framework for robots to manipulate unseen objects as tools to complete diverse tasks. We proposed a method to learn keypoint representations of objects, which simplify the manipulation task and improve generalization to novel objects. paper | video | website | code
	Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates. Zengyi Qin, Kaiqing Zhang, Yuxiao Chen, Jingkai Chen and Chuchu Fan. The International Conference on Learning Representations (ICLR), 2021. We study the multi-agent safe control problem where agents should avoid any collision while reaching their goals. Our method can scale up to an arbitrarily large number of agents (e.g., >1000 in our experiments) and achieve a 99-100% safety rate. paper | video | code | website
	Reactive and Safe Road User Simulations using Neural Barrier Certificate. Yue Meng, Zengyi Qin and Chuchu Fan. The International Conference on Intelligent Robots and Systems (IROS), 2021. Reactive and safe agent modeling is important for modern traffic simulator design and safe planning applications. We propose a control barrier function-based method to simulate traffic agents that behave like humans or human-controlled vehicles and react to other road participants. paper | website
	Density Constrained Reinforcement Learning. Zengyi Qin, Yuxiao Chen and Chuchu Fan. The International Conference on Machine Learning (ICML), 2021. We study constrained reinforcement learning (CRL) from a novel perspective by setting constraints directly on state density functions, rather than the value functions considered by previous work. State density has a clear physical and mathematical interpretation, and is able to express a wide variety of constraints such as resource limits and safety requirements. paper | code | website
	Safe Nonlinear Control Using Robust Neural Lyapunov-Barrier Functions. Charles Dawson, Zengyi Qin, Sicun Gao and Chuchu Fan. The Conference on Robot Learning (CoRL), 2021. Safety and stability are common requirements for robotic control systems. We propose a robust feedback method based on robust control Lyapunov barrier functions that generalize despite model uncertainty and provide safety and stability guarantees. paper
	Controller synthesis for linear system with reach-avoid specifications. Chuchu Fan, Zengyi Qin, Umang Mathur, Qiang Ning, Sayan Mitra and Mahesh Viswanathan. IEEE Transactions on Automatic Control (TAC), 2021. We address the problem of synthesizing provably correct controllers for linear systems with reach-avoid specifications. Our solution decomposes the overall synthesis problem into two smaller and more tractable problems, achieving a 2-150 times speedup compared with previous techniques. paper
	Weakly Supervised 3D Object Detection from Point Clouds. Zengyi Qin, Jinglu Wang and Yan Lu. ACM Multimedia (ACM MM), 2020. A state-of-the-art framework for weakly supervised 3D object detection from point clouds that uses no ground truth 3D bounding boxes for training. The core of our method is the unsupervised 3D object proposal module and the cross-modal knowledge distillation strategy. paper | code
	Learning fine-grained estimation of physiological states from coarse-grained labels by distribution restoration. Zengyi Qin, Jiansheng Chen, Zhenyu Jiang, Xumin Yu, Chunhua Hu, Yu Ma, Suhua Miao and Rongsong Zhou. Scientific Reports, 2020. Our method allows machine learning algorithms to perform fine-grained estimation of physiological states (e.g., sleep depth) even if the training labels are coarse-grained. paper | code
	Triangulation Learning Network: from Monocular to Stereo 3D Object Detection. Zengyi Qin, Jinglu Wang and Yan Lu. The International Conference on Computer Vision and Pattern Recognition (CVPR), 2019. This is a pioneering work on stereo-image-based 3D object detection without calculating pixel-level depth maps. We proposed a triangulation learning method to learn the object-level stereo geometric correspondence for 3D object detection. paper | video | code | website
	MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization. Zengyi Qin, Jinglu Wang and Yan Lu. The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI), 2019, Oral Presentation, Acceptance Rate < 8%. A state-of-the-art monocular 3D object detection approach based on geometric reasoning. We proposed to decompose the whole task into four progressive sub-tasks that significantly facilitate monocular 3D object detection. paper | video | code | website
	sEMG based Tremor Severity Evaluation for Parkinson's Disease using a Light-weight CNN. Zengyi Qin, Zhenyu Jiang, Jiansheng Chen, Chunhua Hu and Yu Ma. IEEE Signal Processing Letters (SPL), 2019. A machine learning framework to assist the diagnosis of Parkinson's Disease by assessing the pathological tremor. We proposed a light-weight convolutional neural network and a similarity learning strategy to handle the scarcity of medical data. paper | website