Spotlight on agents

Explore Reply's pioneering AI-embodied agents simplifying robot control, showcased through the Spot case.

In recent years, the field of robotics and artificial intelligence has witnessed remarkable advancements, particularly in the realm of Embodied AI.
These advancements have been made possible by a convergence of technologies such as soft robotics, haptic feedback, and the revolutionary use of transformer-based algorithms. One key development has been the integration of AI into robotic systems, enabling them to understand and interact with the physical world more effectively. Thanks to pioneering algorithms like DINO (DIstillation of knowledge with NO labels), CLIP, and VC1 (Visual Cortex 1), which are built upon the Vision Transformer architecture, we at Reply have witnessed a significant leap in the capabilities of AI-embodied agents. These algorithms emulate the human visual attention mechanism, surpassing the performance of traditional Computer Vision models such as Convolutional Neural Networks (CNNs).

The Spot Case

At Reply, we harness visual representations to enable the Spot robot to understand its environment and perform complex tasks like navigation and object manipulation with minimal training, enhancing human-robot interaction. This allows the AI agents to be controlled with natural language and voice commands, eliminating the need for complex model management.

Spot's interaction begins with converting human voice commands spoken in natural language into text during the Speech-to-Text phase, a crucial step for enabling seamless communication. The resulting text is then subjected to Task Processing, where subtasks are extracted so that Spot gains a more comprehensive understanding of the user's intent.

Spot's capabilities extend to Navigation Tasks, facilitated by Vision Language Maps (VLMaps) from Google. These maps provide Spot with a semantic understanding of its environment, assisting in tasks such as autonomous exploration and mapping.

In Manipulation Tasks, Spot employs two distinct AI models: Grounding DINO for object detection and Visual Cortex 1 for effective manipulation. Grounding DINO plays a pivotal role in accurately detecting and locating objects within Spot's surroundings, while Visual Cortex 1 enhances Spot's ability to interact with objects, ensuring precision and effectiveness, particularly in pick-and-place operations.
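To make the Speech-to-Text phase more concrete, the sketch below uses the open-source openai-whisper package as the transcription engine. This is an illustration, not the exact stack running on Spot: the checkpoint size, audio file name, and example command are assumptions, and the Task Processing stage that splits the transcript into subtasks is only hinted at in the comments.

```python
# Minimal sketch of the Speech-to-Text front end, assuming the open-source
# openai-whisper package; checkpoint size and file names are illustrative.
import whisper

stt_model = whisper.load_model("base")              # small general-purpose checkpoint
result = stt_model.transcribe("voice_command.wav")  # transcribe a recorded command
command_text = result["text"].strip()
print(command_text)  # e.g. "go to the kitchen and pick up the red cup"

# In the full pipeline this transcript is handed to the Task Processing stage,
# which extracts an ordered list of subtasks (navigation, detection,
# manipulation) reflecting the user's intent.
```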
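For the Navigation Tasks, the following conceptual sketch shows the idea behind a VLMap-style semantic query: every cell of a 2-D map holds a visual-language feature, and a text embedding of the landmark named in the subtask is matched against those cells. This is not the VLMaps API; the feature map here is a random placeholder standing in for features accumulated while the robot explores and maps its environment, and the CLIP checkpoint and query text are assumptions.

```python
# Conceptual sketch of a semantic map query (VLMap-style), not the VLMaps API.
import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Placeholder map: H x W grid with one 512-d visual-language feature per cell.
H, W = 40, 60
map_features = torch.nn.functional.normalize(torch.randn(H, W, 512), dim=-1)

# Embed the landmark named in the navigation subtask, e.g. "the kitchen table".
text_inputs = proc(text=["the kitchen table"], return_tensors="pt", padding=True)
with torch.no_grad():
    text_emb = clip.get_text_features(**text_inputs)
text_emb = torch.nn.functional.normalize(text_emb, dim=-1)[0]

# Cosine similarity of every map cell with the query; the best-matching cell
# can then be handed to the path planner as a navigation goal.
scores = map_features @ text_emb
row, col = divmod(int(scores.argmax()), W)
print(f"best matching cell (row, col): ({row}, {col})")
```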
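For the object-detection step of the Manipulation Tasks, the sketch below runs open-vocabulary detection with Grounding DINO through its publicly available Hugging Face Transformers integration. The checkpoint ID, camera frame, and text prompt are illustrative assumptions, and the downstream grasping with Visual Cortex 1 on the real robot is not shown.

```python
# Sketch of open-vocabulary object detection with Grounding DINO via the
# Hugging Face Transformers integration; model ID and inputs are illustrative.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForZeroShotObjectDetection

model_id = "IDEA-Research/grounding-dino-tiny"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForZeroShotObjectDetection.from_pretrained(model_id)

image = Image.open("spot_camera_frame.jpg")  # frame from Spot's camera
text = "a red cup."                          # free-text query, lowercase, period-terminated

inputs = processor(images=image, text=text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw outputs into boxes expressed in image coordinates.
results = processor.post_process_grounded_object_detection(
    outputs, inputs.input_ids, target_sizes=[image.size[::-1]]
)[0]

for score, box in zip(results["scores"], results["boxes"]):
    print(f"detected object with confidence {score:.2f} at {box.tolist()}")
```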
Explore the future of AI-embodied agents

Interested in integrating cutting-edge AI into your robotics projects? Contact us.