Tags
AI
- Eureka: Human-Level Reward Design via Coding Large Language Models
- HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
artificial-intelligence
- ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation
- BAMI: Training-Free Bias Mitigation in GUI Grounding
- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
computer-vision
- ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation
- AlbumFill: Album-Guided Reasoning and Retrieval for Personalized Image Completion
- Audio-Visual Intelligence in Large Foundation Models
- BAMI: Training-Free Bias Mitigation in GUI Grounding
- Generalizable Sparse-View 3D Reconstruction from Unconstrained Images
- Normalizing Trajectory Models
- Posterior Augmented Flow Matching
- World Model for Robot Learning: A Comprehensive Survey
diffusion-models
- ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation
- Audio-Visual Intelligence in Large Foundation Models
foundation-models
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control
- Audio-Visual Intelligence in Large Foundation Models
- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
generalization
llm
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control
- AlbumFill: Album-Guided Reasoning and Retrieval for Personalized Image Completion
- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
machine-learning
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control
- ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation
- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
- Normalizing Trajectory Models
manipulation
multimodal
- AlbumFill: Album-Guided Reasoning and Retrieval for Personalized Image Completion
- An Adaptable, Safe, and Portable Robot-Assisted Feeding System
perception
- An Adaptable, Safe, and Portable Robot-Assisted Feeding System
- Audio-Visual Intelligence in Large Foundation Models
- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
planning
reinforcement-learning
robotics
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control
- An Adaptable, Safe, and Portable Robot-Assisted Feeding System
- Eureka: Human-Level Reward Design via Coding Large Language Models
- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
- HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
- World Model for Robot Learning: A Comprehensive Survey
segmentation
transformers
- Audio-Visual Intelligence in Large Foundation Models
- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
- Generalizable Sparse-View 3D Reconstruction from Unconstrained Images