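Acknowledgments
This survey connects the arguments of S1 (Robot Hands), S4 (Humanoids), S6 (Manufacturing Physical AI), and S9 (NVIDIA Physical AI), while recasting large-data driven manipulation itself as a standalone topic.
This project was produced using 황민호's Harness skill.
AI tools were used in producing this work. Claude (Opus 4.6) was used for literature review, content generation, and manuscript writing.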