Machine Learning
Data-driven approaches to designing intelligent algorithms.
MERL has a long history of research in machine learning, including the development of various boosting algorithms and contributions to the theory and practice of highly scalable collaborative filtering. Our recent work has focused on deep learning and reinforcement learning, with applications in a wide range of domains including automotive, robotics, factory automation, transportation, and building and home systems.
Researchers
Toshiaki Koike-Akino
Jonathan Le Roux
Ye Wang
Ankush Chakrabarty
Anoop Cherian
Gordon Wichern
Philip V. Orlik
Michael J. Jones
Tim K. Marks
Daniel N. Nikovski
Kieran Parsons
Devesh K. Jha
Stefano Di Cairano
Diego Romeres
Chiori Hori
Christopher R. Laughman
Karl Berntorp
Pu (Perry) Wang
Yebin Wang
Bingnan Wang
Mouhacine Benosman
Suhas Lohit
Hassan Mansour
Matthew Brand
Arvind Raghunathan
Petros T. Boufounos
Moitreya Chatterjee
Jianlin Guo
Siddarth Jain
Kuan-Chuan Peng
Abraham P. Vinod
William S. Yerazunis
Scott A. Bortoff
Radu Corcodel
Vedang M. Deshpande
François Germain
Chungwei Lin
Dehong Liu
Saviz Mowlavi
Hongtao Qiao
Hongbo Sun
Wataru Tsujita
Sameer Khurana
Jing Liu
Pedro Miraldo
Koon Hoo Teo
Anthony Vetro
Jinyun Zhang
Jose Amaya
Abraham Goldsmith
Yanting Ma
James Queeney
Joshua Rapp
Avishai Weiss
Ryoma Yataka
Janek Ebbers
Ryo Hase
Zexu Pan
Shinya Tsuruta
-
Awards
-
AWARD Jonathan Le Roux elevated to IEEE Fellow
Date: January 1, 2024
Awarded to: Jonathan Le Roux
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."
Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.
Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.
IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
-
AWARD Honorable Mention Award at NeurIPS 23 Instruction Workshop
Date: December 15, 2023
Awarded to: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddarth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka and Diego Romeres
MERL Contacts: Radu Corcodel; Chiori Hori; Siddarth Jain; Devesh K. Jha; Diego Romeres
Research Areas: Artificial Intelligence, Machine Learning, Robotics
Brief: MERL researchers received an Honorable Mention award at the Workshop on Instruction Tuning and Instruction Following at the NeurIPS 2023 conference in New Orleans. The workshop focused on instruction tuning and instruction following for large language models (LLMs). MERL researchers presented their work on interactive planning using LLMs for partially observable robotic tasks during the workshop's oral presentation session.
-
AWARD MERL team wins the Audio-Visual Speech Enhancement (AVSE) 2023 Challenge
Date: December 16, 2023
Awarded to: Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux
MERL Contacts: François Germain; Chiori Hori; Sameer Khurana; Jonathan Le Roux; Zexu Pan; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL's Speech & Audio team ranked 1st out of 12 teams in the 2nd COG-MHEAR Audio-Visual Speech Enhancement Challenge (AVSE). The team was led by Zexu Pan, and also included Gordon Wichern, Yoshiki Masuyama, François Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux.
The AVSE challenge aims to design better speech enhancement systems by harnessing the visual aspects of speech (such as lip movements and gestures) in a manner similar to the brain’s multi-modal integration strategies. MERL’s system was a scenario-aware audio-visual TF-GridNet that incorporates the face recording of a target speaker as a conditioning factor and also recognizes whether the predominant interference signal is speech or background noise. In addition to outperforming all competing systems by a wide margin on objective metrics, MERL’s model achieved the best overall word intelligibility score in a listening test: 84.54%, compared to 57.56% for the baseline and 80.41% for the next best team. Fisher’s least significant difference (LSD) was 2.14%, indicating that our model offered statistically significant speech intelligibility improvements over all other systems.
-
News & Events
-
NEWS Diego Romeres gave an invited talk in the University of Padua's seminar series on "AI in Action"
Date: April 9, 2024
MERL Contact: Diego Romeres
Research Areas: Artificial Intelligence, Dynamical Systems, Machine Learning, Optimization, Robotics
Brief: Diego Romeres, Principal Research Scientist and Team Leader in the Optimization and Robotics Team, was invited to speak as a guest lecturer in the seminar series on "AI in Action" in the Department of Management and Engineering at the University of Padua.
The talk, entitled "Machine Learning for Robotics and Automation", described MERL's recent research on machine learning and model-based reinforcement learning applied to robotics and automation.
-
NEWS Saviz Mowlavi gave an invited talk at North Carolina State University
Date: April 12, 2024
MERL Contact: Saviz Mowlavi
Research Areas: Control, Dynamical Systems, Machine Learning, Optimization
Brief: Saviz Mowlavi was invited to present remotely at the Computational and Applied Mathematics seminar series in the Department of Mathematics at North Carolina State University.
The talk, entitled "Model-based and data-driven prediction and control of spatio-temporal systems", described the use of temporal smoothness to regularize the training of fast surrogate models for PDEs, user-friendly methods for PDE-constrained optimization, and efficient strategies for learning feedback controllers for PDEs.
-
Internships
-
OR2105: Preference-based Multi-Objective Bayesian Optimization
MERL is looking for a self-motivated and qualified candidate to work on Bayesian optimization algorithms for industrial applications. The ideal candidate is a PhD student with experience and peer-reviewed publications in the general field of derivative-free/zeroth-order optimization. Preference will be given to candidates who have contributed to theoretical advances or practical applications of Bayesian optimization, especially for multi-objective optimization problems. The ideal candidate will have a strong general understanding of numerical optimization and probabilistic machine learning (e.g., Gaussian process regression), and is expected to develop, in collaboration with MERL researchers, state-of-the-art algorithms to optimize parameters for industrial processes or control systems. Proficiency in Python is required. An expected outcome of the internship is one or more peer-reviewed publications. The expected duration is 3-4 months, with a flexible start date.
-
EA2120: AI-assisted Design of Semiconductor Devices
We are seeking a graduate student interested in research on AI-assisted design of semiconductor devices in general, and GaN, SiC, and Si IGBT devices in particular. The intern will collaborate with researchers at MERL and in Japan to explore and develop new AI input models, methodologies, and optimization methods, using both simulated and experimental data for the AI-assisted design of semiconductor devices. The ideal candidate is a senior Ph.D. student with experience in semiconductor device physics, device modeling, deep learning and other machine learning techniques, and the use of TCAD as a simulation tool. Deep knowledge of GaN, Si, and SiC devices and their applications in RF and power electronics is a great asset. The start date is flexible, and the internship lasts 3-6 months.
-
CA2132: Optimization Algorithms for Motion Planning and Predictive Control
MERL is looking for a highly motivated and qualified individual to work on tailored computational algorithms for optimization-based motion planning and predictive control applications in autonomous systems (vehicles, mobile robots). The ideal candidate should have experience in one or more of the following topics: convex and non-convex optimization, stochastic predictive control (e.g., scenario trees), interaction-aware motion planning, machine learning, learning-based model predictive control, mathematical programs with complementarity constraints (MPCCs), optimal control, and real-time optimization. PhD students in engineering or mathematics, especially those whose research focuses on any of the above topics, are encouraged to apply. Publication of relevant results in conference proceedings or journals is expected. The ability to implement the designs and algorithms in MATLAB/Python is required; coding parts of the algorithms in C/C++ is a plus. The expected duration of the internship is 3 months, and the start date is flexible.
-
Openings
-
EA2051: Research Scientist - Electric Systems Automation
-
OR2137: Research Scientist - Optimization & Intelligent Robotics
-
Recent Publications
- "Optimal Transport Perturbations for Safe Reinforcement Learning with Robustness Guarantees", Transactions on Machine Learning Research (TMLR), April 2024. TR2024-037
  @article{Queeney2024apr,
    author = {Queeney, James and Ozcan, Erhan Can and Paschalidis, Ioannis Ch. and Cassandras, Christos G.},
    title = {Optimal Transport Perturbations for Safe Reinforcement Learning with Robustness Guarantees},
    journal = {Transactions on Machine Learning Research (TMLR)},
    year = 2024,
    month = apr,
    issn = {2835-8856},
    url = {https://www.merl.com/publications/TR2024-037}
  }
- "LMI-Based Neural Observer for State and Nonlinear Function Estimation", International Journal of Robust and Nonlinear Control, DOI: 10.1002/rnc.7327, April 2024. TR2024-036
  @article{Jeon2024apr,
    author = {Jeon, Woongsun and Chakrabarty, Ankush and Zemouche, Ali and Rajamani, Rajesh},
    title = {LMI-Based Neural Observer for State and Nonlinear Function Estimation},
    journal = {International Journal of Robust and Nonlinear Control},
    year = 2024,
    month = apr,
    doi = {10.1002/rnc.7327},
    url = {https://www.merl.com/publications/TR2024-036}
  }
- "Understanding and Controlling Generative Music Transformers by Probing Individual Attention Heads", IEEE ICASSP Satellite Workshop on Explainable Machine Learning for Speech and Audio (XAI-SA), April 2024. TR2024-032
  @inproceedings{Koo2024apr,
    author = {Koo, Junghyun and Wichern, Gordon and Germain, François G and Khurana, Sameer and Le Roux, Jonathan},
    title = {Understanding and Controlling Generative Music Transformers by Probing Individual Attention Heads},
    booktitle = {IEEE ICASSP Satellite Workshop on Explainable Machine Learning for Speech and Audio (XAI-SA)},
    year = 2024,
    month = apr,
    url = {https://www.merl.com/publications/TR2024-032}
  }
- "Physics-informed shape optimization using coordinate projection", Nature publishing, April 2024. TR2024-035
  @article{Zhang2024apr,
    author = {Zhang, Zhizhou and Lin, Chungwei and Wang, Bingnan},
    title = {Physics-informed shape optimization using coordinate projection},
    journal = {Nature publishing},
    year = 2024,
    month = apr,
    url = {https://www.merl.com/publications/TR2024-035}
  }
- "Multi-level Reasoning for Robotic Assembly: From Sequence Inference to Contact Selection", IEEE International Conference on Robotics and Automation (ICRA), March 2024. TR2024-033
  @inproceedings{Zhu2024mar,
    author = {Zhu, Xinghao and Jha, Devesh K. and Romeres, Diego and Sun, Lingfeng and Tomizuka, Masayoshi and Cherian, Anoop},
    title = {Multi-level Reasoning for Robotic Assembly: From Sequence Inference to Contact Selection},
    booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
    year = 2024,
    month = mar,
    url = {https://www.merl.com/publications/TR2024-033}
  }
- "Oriented-grid Encoder for 3D Implicit Representations", International Conference on 3D Vision (3DV), March 2024. TR2024-031
  @inproceedings{Gaur2024mar,
    author = {Gaur, Arihant and Pais, Goncalo and Miraldo, Pedro},
    title = {Oriented-grid Encoder for 3D Implicit Representations},
    booktitle = {International Conference on 3D Vision (3DV)},
    year = 2024,
    month = mar,
    url = {https://www.merl.com/publications/TR2024-031}
  }
- "Why does music source separation benefit from cacophony?", IEEE ICASSP Satellite Workshop on Explainable Machine Learning for Speech and Audio (XAI-SA), March 2024. TR2024-030
  @inproceedings{Jeon2024mar,
    author = {Jeon, Chang-Bin and Wichern, Gordon and Germain, François G and Le Roux, Jonathan},
    title = {Why does music source separation benefit from cacophony?},
    booktitle = {IEEE ICASSP Satellite Workshop on Explainable Machine Learning for Speech and Audio (XAI-SA)},
    year = 2024,
    month = mar,
    url = {https://www.merl.com/publications/TR2024-030}
  }
- "Single-pixel imaging of dynamic flows using Neural ODE regularization", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2024. TR2024-024
  @inproceedings{Sholokhov2024mar,
    author = {Sholokhov, Aleksei and Rapp, Joshua and Nabi, Saleh and Brunton, Steven and Kutz, Nathan and Mansour, Hassan},
    title = {Single-pixel imaging of dynamic flows using Neural ODE regularization},
    booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
    year = 2024,
    month = mar,
    url = {https://www.merl.com/publications/TR2024-024}
  }
-
Software & Data Downloads
-
Long-Tailed Anomaly Detection (LTAD) Dataset
Pixel-Grounded Prototypical Part Networks
DeepBornFNO
BAyesian Network for adaptive SAmple Consensus
Simple Multimodal Algorithmic Reasoning Task Dataset
SOurce-free Cross-modal KnowledgE Transfer
Audio-Visual-Language Embodied Navigation in 3D Environments
Nonparametric Score Estimators
3D MOrphable STyleGAN
Instance Segmentation GAN
Audio Visual Scene-Graph Segmentor
Generalized One-class Discriminative Subspaces
Hierarchical Musical Instrument Separation
Generating Visual Dynamics from Sound and Context
Adversarially-Contrastive Optimal Transport
Online Feature Extractor Network
MotionNet
FoldingNet++
Quasi-Newton Trust Region Policy Optimization
Landmarks’ Location, Uncertainty, and Visibility Likelihood
Robust Iterative Data Estimation
Gradient-based Nikaido-Isoda
Circular Maze Environment
Discriminative Subspace Pooling
Kernel Correlation Network
Fast Resampling on Point Clouds via Graphs
FoldingNet
Deep Category-Aware Semantic Edge Detection
MERL Shopping Dataset
Partial Group Convolutional Neural Networks
-