DeepNLP ICRA2023 Accepted Paper List AI Robotic and STEM Top Conference & Journal Papers
-
Inflatable structures have attracted considerable research attention in many fields owing to their numerous advantages, such as being light and able to engage in interactions safely. However, in most cases, the inflatable structure can only have one stable configuration, which is undesirable for robotic arms. This study proposes a novel inflatable structure that can be easily reconfigured into mul...
-
Soft continuum robots have shown tremendous potential for medical and industrial applications owing to their flexibility and continuous deformability. However, their telescopic and bending capabilities and variable stiffness are still limited. This study proposes a novel origami-inspired soft continuum robot to possess large telescopic and bending capabilities while improving stiffness based on th...
-
The Finite Element Method (FEM) is a powerful modeling tool for predicting the behavior of soft robots. However, its use for control can be difficult for non-specialists of numerical computation: it requires an optimization of the computation to make it real-time. In this paper, we propose a learning-based approach to obtain a compact but sufficiently rich mechanical representation. Our choice is ...
-
One of the recent developments in physical reservoir computing, which uses the complex dynamics of a physical system as a computational resource, is the use of a pneumatic pipeline system as a computational resource. This uses the dynamics of air for computation, and because it is lightweight and power-saving, it is used for gait-assist control using a soft exoskeleton with pneumatic rubber artifi...
-
Pneumatic soft robots present many advantages in manipulation tasks. Notably, their inherent compliance makes them safe and reliable in unstructured and fragile environments. However, full-body shape sensing for pneumatic soft robots is challenging because of their high degrees of freedom and complex deformation behaviors. Vision-based proprioception sensing methods relying on embedded cameras and...
-
Recently, data-driven models such as deep neural networks have shown to be promising tools for modelling and state inference in soft robots. However, voluminous amounts of data are necessary for deep models to perform effectively, which requires exhaustive and quality data collection, particularly of state labels. Consequently, obtaining labelled state data for soft robotic systems is challenged f...
-
State estimation from measured data is crucial for robotic applications as autonomous systems rely on sensors to capture the motion and localize in the 3D world. Among sensors that are designed for measuring a robot's pose, or for soft robots, their shape, vision sensors are favorable because they are information-rich, easy to set up, and cost-effective. With recent advancements in computer vision...
-
In this article we investigate the discrete-time model based control of a planar soft continuum manipulator with proprioceptive sensing provided by fiber Bragg gratings. A control algorithm is designed with a discrete-time energy shaping approach which is extended to account for control-related lag of digital nature. A discrete-time nonlinear observer is employed to estimate the uncertain bending ...
-
Soft optical sensing strategies are rapidly developing for soft robotic systems as a means to increase the controllability of soft compliant robots. In this paper, we present a roughness tuning strategy for the fabrication of soft optical sensors to achieve the dual functionality of shape sensing combined with contact recognition within a single multi-modal sensor. The molds used to fabricate the ...
-
Following biology's lead, soft robotics has emerged as a perfect candidate for actuation within complex environments. While soft actuation has been developed intensively over the last few decades, soft sensing has so far slowed to catch up. A largely unresearched area is the change of the soft material properties through prestress to achieve a degree of mechanical sensitivity tunability within sof...
-
Vibration perception is essential for robotic sensing and dynamic control. Nevertheless, due to the rigorous demand for sensor conformability and stretchability, enabling soft robots with proprioceptive vibration sensing remains challenging. This paper proposes a novel liquid metal-based stretchable e-skin via a kirigami-inspired design to enable soft robot proprioceptive vibration sensing. The e-...
-
As soft robotic systems become increasingly complex, there is a need to develop sensory systems which can provide rich state information to the robot for feedback control. Multi-axis force sensing and control is one of the less explored problems in this domain. There are numerous challenges in the development of a multi-axis soft sensor: from the design and fabrication to the data processing and m...
-
Soft, wearable systems hold promise for a wide variety of new or enhanced applications in the realm of human-computer interaction, physiological monitoring, wear-able robotics, and a host of other human-centric devices. Soft sensor systems have been developed concurrently in order to allow these wearable systems to respond intelligently with their surroundings. A recently reported sensing mechanis...
-
Whisker-based tactile sensors have the potential to perform fast and accurate 3D mappings of the environment, complementing vision-based methods under conditions of glare, reflection, proximity, and occlusion. However, current algorithms for mapping with whiskers make assumptions about the conditions of contact, and these assumptions are not always valid and can cause significant sensing errors. H...
-
Learning Decoupled Multi-touch Force Estimation, Localization and Stretch for Soft Capacitive E-skin
Distributed sensor arrays capable of detecting multiple spatially distributed stimuli are considered an important element in the realisation of exteroceptive and proprioceptive soft robots. This paper expands upon the previously presented idea of decoupling the measurements of pressure and location of a local indentation from global deformation, using the overall stretch experienced by a soft capa...
-
This paper presents the novel use of air gaps in flexible optical light pipes to create coded patterns for use in bend localization. The OptiGap sensor system allows for the creation of extrinsic intensity modulated bend sensors that function as flexible absolute linear encoders. Coded air gap patterns are identified by a Gaussian naive Bayes (GNB) classifier running on an STM32 microcontroller. T...
-
Soft devices employ variable stiffness to ensure safety and improve the robustness in the interaction between robots and objects. Using soft materials is one of the most popular approaches to design a variable-stiffness device, while the use of silicone sponge remains less explored in this field. Here we present a novel silicone-sponge-based variable-stiffness device (SVD). The SVD is easy-to-make...
-
We propose a novel design for a lightweight and compact tunable stiffness actuator capable of stiffness changes up to 20x. The design is based on the concept of a coiled spring, where changes in the number of layers in the spring change the bulk stiffness in a near linear fashion. We present an elastica nested rings model for the deformation of the proposed actuator and empirically verify that the...
-
Electrostatic actuators provide a promising approach to creating soft robotic sheets, due to their flexible form factor, modular integration, and fast response speed. However, their control requires kilo-Volt signals and understanding of complex dynamics resulting from force interactions by on-board and environmental effects. In this work, we demonstrate an untethered planar five-actuator piezoele...
-
Soft robots have long been attractive to robotic engineers due to their remarkable dexterity; however, reports that standardize soft actuators into modularized off-shelf devices akin to rigid robots are still rare, and the mechanical efficiency of existing designs is still limited. This work identifies origami folding to enable the design of LEGO-like modularized soft actuators with high mechanica...
-
Soft robotic structures may play a major role in the 4th industrial revolution. Researchers have successfully demonstrated the advantages of soft robotics over traditional robots made of rigid links and joints in many application areas. Variable stiffness links (VSL) and joints (VSJ) have been investigated to achieve on-demand forces and, at the same time, be inherently safe in interactions with h...
-
One of the biggest problems with soft robots is precisely the fact that they are soft. Indeed the softer they are, the less force they can exert on the environment. Researchers have proposed a number of stiffening methods, but all of them have drawbacks, such as locking the shape of the device in a way that precludes further adjustments. In this paper we propose a stiffening method inspired by the...
-
This paper discusses the effect of axial backbone compression on tendon-driven continuum robots. A new mechanics model for compensating for this effect that does not require tendon tension sensing or knowledge of manipulator material properties/stiffnesses is introduced and analyzed. In addition, we provide an analytical expression for the minimum preload on the tendons to achieve a given bend, a ...
-
This research proposes a novel semi-supervised learning framework FourStr (Four-Stream formed by two two-stream models) that focuses on the improvement of fusion and labeling efficiency for 3D multi-sensor detector. FourStr adopts a multi-sensor single-stage detector named adaptive fusion network (AFNet) as the backbone and trains it through the semi-supervision learning (SSL) strategy Stereo Fusi...
-
We present a robust method to visually segment scenes into objects based on motion and appearance. Both these cues provide complementary information that we fuse using two interconnected recursive estimators: One estimates object segmentation from motion as a probabilistic clustering of tracked 3D points, and the other estimates object segmentation from appearance as a probabilistic image segmenta...
-
Accurate and timely onboard perception is a prerequisite for mobile robots to operate in highly dynamic scenarios. The bio-inspired event camera can capture more motion details than a traditional camera by triggering each pixel asynchronously and therefore is more suitable in such scenarios. Among various perception tasks based on the event camera, ego-motion removal is one fundamental procedure t...
-
Robots need to understand articulated objects, such as drawers. The state of articulated structures is commonly estimated using vision, but visual perception is limited when objects are occluded, have few salient features, or are not in the camera's field of view. Audio sensing does not face these challenges, since sound propagates in a fundamentally different way than light. Therefore we propose ...
-
Developing embodied agents in simulation has been a key research topic in recent years. Exciting new tasks, algorithms, and benchmarks have been developed in various simulators. However, most of them assume deaf agents in silent environments, while we humans perceive the world with multiple senses. We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation ...
-
Semantic grids can be useful representations of the scene around an autonomous system. By having information about the layout of the space around itself, a robot can leverage this type of representation for crucial tasks such as navigation or tracking. By fusing information from multiple sensors, robustness can be increased and the computational load for the task can be lowered, achieving real tim...
-
In this paper, we address the problem of estimating the in-hand 6D pose of an object in contact with multiple vision-based tactile sensors. We reason on the possible spatial configurations of the sensors along the object surface. Specifically, we filter contact hypotheses using geometric reasoning and a Convolutional Neural Network (CNN), trained on simulated object-agnostic images, to promote tho...
-
LiDAR-Camera online calibration is of great significance for building a stable autonomous driving perception system. For online calibration, a key challenge lies in constructing a unified and robust representation between multi-modal sensor data. Most methods extract features manually or implicitly with an end-to-end deep learning method. The former suffers poor robustness, while the latter has po...
-
In this paper we propose a visual servoing approach that controls the deformation of a suspended tether cable subject to gravity from visual data provided by a RGB-D camera. The cable shape is modelled with a parabolic curve together with the orientation of the plane containing the tether. The visual features considered are the parabolic coefficients and the yaw angle of that plane. We derive the ...
-
We propose a new visual servoing method that controls a robot's motion in a latent space. We aim to extract the best properties of two previously proposed servoing methods: we seek to obtain the accuracy of photometric methods such as Direct Visual Servoing (DVS), as well as the behavior and convergence of pose-based visual servoing (PBVS). Photometric methods suffer from limited convergence area ...
-
This paper proposes CNN-based visual servoing for simultaneous positioning and flattening of a soft fabric part placed on a table by a dual manipulator system. We propose a network for multimodal data processing of grayscale images captured by a camera and force/torque applied to force sensors. The training dataset is collected by moving the real manipulators, which enables the network to map the ...
-
Dynamical System-based Imitation Learning for Visual Servoing using the Large Projection Formulation
Nowadays ubiquitous robots must be adaptive and easy to use. To this end, dynamical system-based imitation learning plays an important role. In fact, it allows to realize stable and complex robotic tasks without explicitly coding them, thus facilitating the robot use. However, the adaptation capabilities of dynamical systems have not been fully exploited due to the lack of closed-loop implementati...
-
Constant Distance and Orientation Following of an Unknown Surface with a Cable-Driven Parallel Robot
Cable-Driven Parallel Robots (CDPRs) are well-adapted to large workspaces since they replace rigid links by cables. However, they lack in positioning accuracy and new control methods are necessary to achieve profile-following tasks. This paper presents a control scheme designed for these tasks, relying on a combination of accurate boarded distance sensors and of a less accurate remote camera. The ...
-
This paper presents a spectral domain registration-based visual servoing scheme that works on 3D point clouds. Specifically, we propose a 3D model/point cloud alignment method, which works by finding a global transformation between reference and target point clouds using spectral analysis. A 3D Fast Fourier Transform (FFT) in $\mathbb{R}^{3}$ is used for the translation estimation, and the real sp...
-
This paper presents a novel autonomous endoscope control method for the dVRK's Endoscopic Camera Manipulator (ECM), which allows the camera to track the surgical instruments on the Patient Side Manipulator (PSM). An Image-based Visual Servoing (IBVS) is enforced by the addition of a visibility constraint that ensures the identified surgical tool remains in the camera's Field Of View (FOV) for the ...
-
Safe motion control in unknown environments is one of the challenging tasks in robotics, such as autonomous navigation. Control Barrier Function (CBF), as a strong math-ematical tool, has been widely used in many safety-critical systems to satisfy safety requirements. However, there are only a handful of recent studies on safety controllers with perception inputs. Common assumptions in most of the...
-
In this paper, we propose a multi-object tracking framework for videos captured by UAVs, considering motion imperfection in the following two aspects: 1) motion blurring of objects due to high-speed motion of the UAV and the objects, deteriorating the performance of the detector; 2) motion coupling of the global movement of the UAV camera with the object motion, resulting in the nonlinearity of ob...
-
Motion deblurring is a critical ill-posed problem that is important in many vision-based robotics applications. The recently proposed event-based double integral (EDI) provides a theoretical framework for solving the deblurring prob-lem with the event camera and generating clear images at high frame-rate. However, the original EDI is mainly designed for offline computation and does not support rea...
-
This work addresses the issue of motion compensation and pattern tracking in event camera data. An event camera generates asynchronous streams of events triggered independently by each of the pixels upon changes in the observed intensity. Providing great advantages in low-light and rapid-motion scenarios, such unconventional data present significant research challenges as traditional vision algori...
-
Current robotic hand manipulation narrowly operates with objects in predictable positions in limited environments. Thus, when the location of the target object deviates severely from the expected location, a robot sometimes responds in an unexpected way, especially when it operates with a human. For safe robot operation, we propose the EXit-aware Object Tracker (EXOT) on a robot hand camera that r...
-
We present Mono-STAR, the first real-time 3D reconstruction system that simultaneously supports semantic fusion, fast motion tracking, non-rigid object deformation, and topological change under a unified framework. The proposed system solves a new optimization problem incorporating optical-flow-based 2D constraints to deal with fast motion and a novel semantic-aware deformation graph (SAD-graph) f...
-
Persistent multi-object tracking (MOT) allows autonomous vehicles to navigate safely in highly dynamic environments. One of the well-known challenges in MOT is object occlusion when an object becomes unobservant for subsequent frames. The current MOT methods store objects information, such as trajectories, in internal memory to recover the objects after occlusions. However, they retain short-term ...
-
Event cameras are asynchronous neuromorphic vision sensors with high temporal resolution and no motion blur, offering advantages over standard frame-based cameras especially in high-speed motions and high dynamic range conditions. However, event cameras are unable to capture the overall context of the scene, and produce different events for the same scenery depending on the direction of the motion...
-
We propose a method for joint detection and tracking of multiple objects in 3D point clouds, a task conventionally treated as a two-step process comprising object detection followed by data association. Our method embeds both steps into a single end-to-end trainable network eliminating the dependency on external object detectors. Our model exploits temporal information employing multiple frames to...
-
In this work, we present an inverse reinforcement learning approach for solving the problem of task sequencing for robots in complex manufacturing processes. Our proposed framework is adaptable to variations in process and can perform sequencing for entirely new parts. We prescribe an approach to capture feature interactions in a demonstration dataset based on a metric that computes feature intera...
-
Identifying an appropriate task space can simplify solving robotic manipulation problems. One solution is deploying control algorithms in a learned low-dimensional action space. Linear and nonlinear action mapping methods have trade-offs between simplicity and the ability to express motor commands outside of a single low-dimensional subspace. We propose that learning local linear action representa...
-
Recent works in robotic manipulation through reinforcement learning (RL) or imitation learning (IL) have shown potential for tackling a range of tasks e.g., opening a drawer or a cupboard. However, these techniques generalize poorly to unseen objects. We conjecture that this is due to the high-dimensional action space for joint control. In this paper, we take an alternative approach and separate t...
-
Model Free Reinforcement Learning (MFRL) has shown significant promise for learning dexterous robotic manipulation tasks, at least in simulation. However, the high number of samples, as well as the long training times, prevent MFRL from scaling to complex real-world tasks. Model- Based Reinforcement Learning (MBRL) emerges as a potential solution that, in theory, can improve the data efficiency of...
-
Reinforcement learning (RL) has recently proven great success in various domains. Yet, the design of the reward function requires detailed domain expertise and tedious fine-tuning to ensure that agents are able to learn the desired behaviour. Using a sparse reward conveniently mitigates these challenges. However, the sparse reward represents a challenge on its own, often resulting in unsuccessful ...
-
Embodied AI agents in large scenes often need to navigate to find objects. In this work, we study a naturally emerging variant of the object navigation task, hierarchical relational object navigation (HRON), where the goal is to find objects specified by logical predicates organized in a hierarchical structure-objects related to furniture and then to rooms-such as finding an apple on top of a tabl...
-
Programming manipulation behaviors can become increasingly difficult with a growing number and complexity of manipulation tasks, particularly in a dynamic and unstructured environment. Recent progress in unsupervised skill discovery algorithms has shown great promise in learning an extensive collection of behaviors without extrinsic supervision. On the other hand, safety is one of the most critica...
-
The spatial distribution of sensed objects strongly influences the behavior of mobile robots. Yet, as robots evolve in complexity to operate in increasingly rich environments, it becomes much more difficult to specify the underlying relations between sensed object spatial distributions and robot behaviors. We aim to address this challenge by leveraging system trace data to automatically infer rela...
-
The ability to specify robot commands by a non-expert user is critical for building generalist agents capable of solving a large variety of tasks. One convenient way to specify the intended robot goal is by a video of a person demonstrating the target task. While prior work typically aims to imitate human demonstrations performed in robot environments, here we focus on a more realistic and challen...
-
Food picking is trivial for humans but not for robots, as foods are fragile. Presetting foods' physical properties does not help robots much due to the objects' inter- and intra-category diversity. A recent study proved that learning-based fracture anticipation with tactile sensors could overcome this problem; however, the method trains the model for each food to deal with intra-category differenc...
-
The process of designing costmaps for off-road driving tasks is often a challenging and engineering-intensive task. Recent work in costmap design for off-road driving focuses on training deep neural networks to predict costmaps from sensory observations using corpora of expert driving data. However, such approaches are generally subject to over-confident mis-predictions and are rarely evaluated in...
-
Estimating terrain traversability in off-road environments requires reasoning about complex interaction dynamics between the robot and these terrains. However, it is challenging to create informative labels to learn a model in a supervised manner for these interactions. We propose a method that learns to predict traversability costmaps by combining exteroceptive environmental information with prop...
-
Motion generation seeks to produce safe and feasible robot motion from start to goal. Various tools at different levels of granularity have been developed. On one extreme, sampling-based motion planners focus on completeness - a solution, if it exists, would eventually be found. However, produced paths are often of low quality, and contain superfluous motion. On the other, reactive methods optimis...
-
Reinforcement learning (RL) and trajectory opti-mization (TO) present strong complementary advantages. On one hand, RL approaches are able to learn global control policies directly from data, but generally require large sample sizes to properly converge towards feasible policies. On the other hand, TO methods are able to exploit gradient-based information extracted from simulators to quickly conve...
-
We study the problem of generating control laws for systems with unknown dynamics. Our approach is to represent the controller and the value function with neural networks, and to train them using loss functions adapted from the Hamilton-Jacobi-Bellman (HJB) equations. In the absence of a known dynamics model, our method first learns the state transitions from data collected by interacting with the...
-
We present Learned Risk Metric Maps (LRMM) for real-time estimation of coherent risk metrics of high-dimensional dynamical systems operating in unstructured, partially observed environments. LRMM models are simple to design and train-requiring only procedural generation of obstacle sets, state and control sampling, and supervised training of a function approximator-which makes them broadly applica...
-
Near the limits of adhesion, the forces generated by a tire are nonlinear and intricately coupled. Efficient and accurate modelling in this region could improve safety, especially in emergency situations where high forces are required. To this end, we propose a novel family of tire force models based on neural ordinary differential equations and a neural-ExpTanh parameterization. These models are ...
-
Autonomous driving has attracted lots of attention in recent years. For some tasks, e.g., trajectory prediction, motion planning, and trajectory tracking, an accurate vehicle model can reduce the difficulty of these tasks and improve task completion performance. Prior works focused on parameter estimation of physical models or modeling nonlinear dynamics using neural networks. Still, these methods...
-
Safe and efficient robot-environment interaction is a critical but challenging problem as robots are being increasingly employed to operate in unstructured and unpredictable environments. Soft robots are inherently compliant to safely interact with environments but their high nonlinearity exacerbates control difficulties. Meta-learning provides a powerful tool for fast online model adaptation beca...
-
In this paper, we present a novel Heavy-Tailed Stochastic Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems. Sparse rewards are common in continuous control robotics tasks such as manipulation and navigation and make the learning problem hard due to the non-trivial estimation of value functions over the state space. This demands either ...
-
Model predictive control is a powerful tool to generate complex motions for robots. However, it often requires solving non-convex problems online to produce rich behaviors, which is computationally expensive and not always practical in real time. Additionally, direct integration of high dimensional sensor data (e.g. RGB-D images) in the feedback loop is challenging with current state-space methods...
-
We propose a novel Branch-and-Bound method for reachability analysis of neural networks in both open-loop and closed-loop settings. Our idea is to first compute accurate bounds on the Lipschitz constant of the neural network in certain directions of interest offline using a convex program. We then use these bounds to obtain an instantaneous but conservative polyhedral approximation of the reachabl...
-
Trajectory optimization methods have achieved an exceptional level of performance on real-world robots in recent years. These methods heavily rely on accurate analytical models of the dynamics, yet some aspects of the physical world can only be captured to a limited extent. An alternative approach is to leverage machine learning techniques to learn a differentiable dynamics model of the system fro...
-
Model Predictive Control (MPC) is a state-of-the-art (SOTA) control technique which requires solving hard constrained optimization problems iteratively. For uncertain dynamics, analytical model based robust MPC imposes additional constraints, increasing the hardness of the problem. The problem exacerbates in performance-critical applications, when more compute is required in lesser time. Data-driv...
-
This paper proposes a novel underwater Multi-View Photometric Stereo (MVPS) framework for reconstructing scenes in 3-D with a non-stationary low-cost robot equipped with a monocular camera and fixed lights. The underwater realm is the primary focus of study here, due to the challenges in utilizing underwater camera imagery and lack of low-cost reliable localization systems. Previous underwater PS ...
-
Acoustic perception in underwater environments is challenging due to the low frequency of the acquisition system and multiple and huge sources of noise. Therefore, point clouds built by profiling sonars mounted on Autonomous Underwater Vehicles (AUV) are sparse and noisy. To solve the mapping task, AUVs need a registration algorithm to prevent maps from inconsistencies. Many scan matching algorith...
-
We present a technique for dense 3D reconstruction of objects using an imaging sonar, also known as forward-looking sonar (FLS). Compared to previous methods that model the scene geometry as point clouds or volumetric grids, we represent the geometry as a neural implicit function. Additionally, given such a representation, we use a differentiable volumetric renderer that models the propagation of ...
-
Underwater robots typically rely on acoustic sensors like sonar to perceive their surroundings. However, these sensors are often inundated with multiple sources and types of noise, which makes using raw data for any meaningful inference with features, objects, or boundary returns very difficult. While several conventional methods of dealing with noise exist, their success rates are unsatisfactory....
-
Autonomous surface vessels (ASV) represent a promising technology to automate water-quality monitoring of lakes. In this work, we use satellite images as a coarse map and plan sampling routes for the robot. However, inconsistency between the satellite images and the actual lake, as well as environmental disturbances such as wind, aquatic vegetation, and changing water levels can make it difficult ...
-
We present a navigation framework to perform autonomous underwater docking to a wave energy converter (WEC) under various ocean conditions by incorporating flow state estimation into the design of model predictive control (MPC). Existing methods lack the ability to perform dynamic rendezvous and autonomously dock in energetic conditions. The use of exteroceptive sensors or high performing acoustic...
-
Vessel transit in ice-covered waters poses unique challenges in safe and efficient motion planning. When the concentration of ice is high, it may not be possible to find collision-free trajectories. Instead, ice can be pushed out of the way if it is small or if contact occurs near the edge of the ice. In this work, we propose a real-time navigation framework that minimizes collisions with ice and ...
-
We present the results of experiments performed using a small autonomous underwater vehicle to determine the location of an isobath within a bounded area. The primary contribution of this work is to implement and integrate several recent developments real-time planning for environmental map-ping, and to demonstrate their utility in a challenging practical example. We model the bathymetry within th...
-
Place recognition using SOund Navigation and Ranging (SONAR) images is an important task for simultaneous localization and mapping (SLAM) in underwater environments. This paper proposes a robust and efficient imaging SONAR-based place recognition, SONAR context, and loop closure method. Unlike previous methods, our approach encodes geometric information based on the characteristics of raw SONAR me...
-
Underwater depth estimation is essential for safe Autonomous Underwater Vehicles (AUV) navigation. While there has been recent advances in out-of-water monocular depth estimation, it is difficult to apply these methods to the underwater domain due to the lack of well-established datasets with labelled ground truths. In this paper, we propose a novel method for self-supervised underwater monocular ...
-
Depth estimation is critical for any robotic system. In the past years, the estimation of depth from monocular images has shown great improvement. However, in the underwater environment results are still lagging behind due to appearance changes caused by the medium. So far little effort has been invested in overcoming this. Moreover, underwater, there are more limitations to using high-resolution ...
-
The recent development of high-precision subsea optical scanners allows for 3D keypoint detectors and feature descriptors to be leveraged on point cloud scans from subsea environments. However, the literature lacks a comprehensive survey to identify the best combination of detectors and descriptors to be used in these challenging and novel environments. This paper aims to identify the best detecto...
-
Quadruped animal locomotion emerges from the interactions between the spinal central pattern generator (CPG), sensory feedback, and supraspinal drive signals from the brain. Computational models of CPGs have been widely used for investigating the spinal cord contribution to animal locomotion control in computational neuroscience and in bio-inspired robotics. However, the contribution of supraspina...
-
The high speed and low energy cost are two conflicting objectives in the motion optimization of bio-inspired underwater robots, but playing a very important role. To this end, this paper proposes an optimization strategy for swimming speed and power cost using an improved NSGA-II for a flexible robotic fish. A dynamic model involving flexible deformation is established for speed prediction with th...
-
Swarm robotics plays a non-negligible role in actual practice because of its scalability and robustness. Besides some specific studies, there is still a lack of overall approaches to solving the search and rescue problem in a communication-denied environment. This paper presents a bee-inspired swarm cooperation approach without information exchange, including a target grouping method suitable for ...
-
Robots that use impulsive mechanisms to achieve high-speed and high-powered motion are becoming more common and better understood, but control of these systems remains relatively rudimentary. Among robots that use spring actuation to generate motion, robot actuation and mechanisms are usually not controlled intentionally in order to achieve variation in the system's behavior, or they are controlle...
-
Falling cat problem is well-known where cats show their super aerial reorientation capability and can land safely. For their robotic counterparts, a similar falling quadruped robot problem, has not been fully addressed, although achieving safe landing as the cats has been increasingly investigated. Unlike imposing the burden on landing control, we approach to safe landing of falling quadruped robo...
-
We present SunBot, a robotic system for the study and implementation of fish-inspired tearing manipulations. Various fish species–such as the sunburst butterflyfish-feed on prey fixed to substrates, a maneuver previously not demonstrated by robotic fish which typically specialize for open water swimming and surveillance. Biological studies indicate that a dynamic “head flick” behavior may play a r...
-
Central Pattern generators (CPG) are a biologically inspired, decentralized control architecture that enables model-free, but yet adaptively stable and computational lightweight locomotion capabilities on complex robots. Nevertheless, no unified design guidelines for closed-loop CPG controllers are available in the literature. Therefore, we propose a task-distributed, end-to-end trainable, closed-...
-
Most legged robots are built with leg structures from serially mounted links and actuators and are controlled through complex controllers and sensor feedback. In comparison, animals developed multi-segment legs, mechanical coupling between joints, and multi-segmented feet. They run agile over all terrains, arguably with simpler locomotion control. Here we focus on developing foot mechanisms that r...
-
Terrestrial cyborg insects are biohybrid systems integrating living insects as mobile platforms. The insects' locomotion is controlled by the electrical stimulation of their sensory, muscular, or neural systems, in which continuous pulse trains are usually chosen as the stimulation waveform. Although this waveform is easy to generate and can elicit graded responses from the insects, its locomotion...
-
Modern legged robot morphologies assign most of their actuated degrees of freedom (DoF's) to the limbs and designs continue to converge to twelve DoF quadrupeds with three actuators per leg and a rigid torso often modeled as a Single Rigid Body (SRB). This is in contrast to the animal kingdom, which provides tantalizing hints that core actuation of a jointed torso confers substantial benefit for e...
-
As an active spine introduces more degree of freedoms (DOFs) as well as time-varying inertia, locomotion control of spined quadruped robots is challenging. Direct optimization on the full dynamics model causes prohibitive calculation time and is difficult to apply to embedded platforms. Model predictive control (MPC)-based on SRB dynamics is a prevalent approach for ordinary quadruped robots, rega...
-
In recent years, reinforcement learning and its multi-agent analogue have achieved great success in solving various complex control problems. However, multi-agent rein-forcement learning remains challenging both in its theoretical analysis and empirical design of algorithms, especially for large swarms of embodied robotic agents where a definitive toolchain remains part of active research. We use ...
-
In constrained solution spaces with a huge number of homotopy classes, standalone sampling-based kinodynamic planners suffer low efficiency in convergence. Local optimization is integrated to alleviate this problem. In this paper, we propose to thrive the trajectory tree growing by optimizing the tree in the forms of deformation units, and each unit contains one tree node and all the edges connect...
-
Autonomous UAV path planning for 3D reconstruction has been actively studied in various applications for high-quality 3D models. However, most existing works have adopted explore-then-exploit, prior-based or exploration-based strategies, demonstrating inefficiency with repeated flight and low autonomy. In this paper, we propose PredRecon, a prediction-boosted planning framework that can autonomous...
-
Navigating dynamic environments requires the robot to generate collision-free trajectories and actively avoid moving obstacles. Most previous works designed path planning algorithms based on one single map representation, such as the geometric, occupancy, or ESDF map. Although they have shown success in static environments, due to the limitation of map representation, those methods cannot reliably...
-
We introduce Spatial Predictive Control (SPC), a technique for solving the following problem: given a collection of robotic agents with black-box positional low-level controllers (PLLCs) and a mission-specific distributed cost function, how can a distributed controller achieve and maintain cost-function minimization without a plant model and only positional observations of the environment? Our ful...
-
Visual navigation plays an important role for Unmanned Aerial Vehicles(UAVs). In some applications, the landmark image and the real-time image may be heterogeneous, like near-infrared and visible images. In this work, we propose a multimodal image registration method to deal with near-infrared and visible images so that it can be applied to visual navigation system for the localization of UAVs in ...
-
We address the problem of efficient 3-D exploration in indoor environments for micro aerial vehicles with limited sensing capabilities and payload/power constraints. We develop an indoor exploration framework that uses learning to predict the occupancy of unseen areas, extracts semantic features, samples viewpoints to predict information gains for different exploration goals, and plans informative...
-
The use of bidirectional propellers provides quadrotors with greater maneuverability which is advantageous in constrained environments. This paper addresses the development of a trajectory planning algorithm for quadrotors with bidirectional motors. Previous work has shown that the property of differential flatness can be leveraged for efficient trajectory planning. However, planners that leverage...
-
Small uncrewed aerial systems (sUASs) are useful tools for 3D reconstruction due to their speed, ease of use, and ability to access high-utility viewpoints. Today, most aerial survey approaches generate a preplanned coverage pattern assuming a planar target region. However, this is inefficient since it results in superfluous overlap and suboptimal viewing angles and does not utilize the entire fli...
-
In this work, we present an Integrated Guidance and Controller (IGC) scheme to drive quadcopters in path-following tasks with obstacle avoidance and constant uncertainty rejection. This scheme is based on the combination of a time-varying artificial vector field and Backstepping with integral action control. The vector field switches between two behaviors: (i) path-following; and (ii) obstacle cir...
-
This paper proposes an adaptive near-hover position controller for quadcopters, which can be deployed to quadcopters of very different mass, size and motor constants, and also shows rapid adaptation to unknown disturbances during runtime. The core algorithmic idea is to learn a single policy that can adapt online at test time not only to the disturbances applied to the drone, but also to the robot...
-
The use of cables for aerial manipulation has shown to be a lightweight and versatile way to interact with objects. However, fastening objects using cables is still a challenge and human is required. In this work, we propose a novel way to secure objects using hitches. The hitch can be formed and morphed in midair using a team of aerial robots with cables. The hitch's shape is modeled as a convex ...
-
Detect-and-Avoid (DAA) capabilities are critical for safe operations of unmanned aircraft systems (UAS). This paper introduces, AirTrack, a real-time vision-only detect and tracking framework that respects the size, weight, and power (SWaP) constraints of sUAS systems. Given the low Signal-to-Noise ratios (SNR) of far away aircraft, we propose using full resolution images in a deep learning framew...
-
This paper proposes a new model for onboard physical fault detection on autonomous unmanned aerial vehicles (UAV) through machine learning (ML) techniques. The proposal performs the detection task with high accuracies and minimal processing requirements while signaling an unreliable ML model to the operator, implemented in two main phases. First, a wrapper-based feature selection is performed to d...
-
A compliant control model based on reinforcement learning (RL) is proposed to allow robots to interact with the environment more effectively and autonomously execute force control tasks. The admittance model learns an optimal adjustment policy for interactions with the external environment using RL algorithms. The model combines energy consumption and trajectory tracking of the agent state using a...
-
Aerial robots have a wide range of applications, such as collecting data in hard-to-reach areas. This requires the longest possible operation time. However, because currently available commercial batteries have limited specific energy of roughly 300 W h kg-1, a drone's flight time is a bottleneck for sustainable long-term data collection. Inspired by birds in nature, a possible approach to tackle ...
-
Hybrid unmanned aerial vehicles (H-UAVs) are highly versatile platforms with the ability to transition between rotary- and fixed-wing flight. However, their (aero)dynamics tend to be highly nonlinear which increases the risk of introducing safety-critical modeling errors in a controller. Designing a safe, yet not too cautious controller, requires a credible model which provides accurate dynamics u...
-
We propose a novel approach for aerial video action recognition. Our method is designed for videos captured using UAVs and can run on edge or mobile devices. We present a learning-based approach that uses customized auto zoom to automatically identify the human target and scale it appropriately. This makes it easier to extract the key features and reduces the computational overhead. We also presen...
-
Seamlessly integrating rules in Learning-from-Demonstrations (LfD) policies is a critical requirement to enable the real-world deployment of AI agents. Recently, Signal Temporal Logic (STL) has been shown to be an effective language for encoding rules as spatio-temporal constraints. This work uses Monte Carlo Tree Search (MCTS) as a means of integrating STL specification into a vanilla LfD policy ...
-
Unmanned aerial vehicle (UAV) tracking is crucial for autonomous navigation and has broad applications in robotic automation fields. However, reliable UAV tracking remains a challenging task due to various difficulties like frequent occlusion and aspect ratio change. Additionally, most of the existing work mainly focuses on explicit information to improve tracking performance, ignoring potential i...
-
Recently, Model Predictive Contouring Control (MPCC) has arisen as the state-of-the-art approach for model-based agile flight. MPCC benefits from great flexibility in trading-off between progress maximization and path following at runtime without relying on globally optimized trajectories. However, finding the optimal set of tuning parameters for MPCC is challenging because (i) the full quadrotor ...
-
Recently, learning-based controllers have been shown to push mobile robotic systems to their limits and provide the robustness needed for many real-world applications. However, only classical optimization-based control frameworks offer the inherent flexibility to be dynamically adjusted during execution by, for example, setting target speeds or actuator limits. We present a framework to overcome t...
-
Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately. Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power. Conversely, learning-based offline optimization approaches, such as Reinforcement Learning (RL), allow fast and efficien...
-
Reinforcement learning (RL) is an agent-based approach for teaching robots to navigate within the physical world. Gathering data for RL is known to be a laborious task, and real-world experiments can be risky. Simulators facilitate the collection of training data in a quicker and more cost-effective manner. However, RL frequently requires a significant number of simulation steps for an agent to be...
-
Parking in large metropolitan areas is often a time-consuming task with further implications for traffic patterns that affect urban landscaping. Reducing the premium space needed for parking has led to the development of automated mechanical parking systems. Compared to regular garages having one or two rows of vehicles on each island, automated garages can have multiple rows of vehicles stacked t...
-
From both an educational and research point of view, experiments on hardware are a key aspect of robotics and control. In the last decade, many open-source hardware and software frameworks for wheeled robots have been presented, mainly in the form of unicycles and car-like robots, with the goal of making robotics accessible to a wider audience and to support control systems development. Unicycles ...
-
Autonomous vehicles that operate in urban environments shall comply with existing rules and reason about the interactions with other decision-making agents. In this paper, we introduce a decentralized and communication-free interaction-aware motion planner and apply it to Autonomous Surface Vessels (ASVs) in urban canals. We build upon a sampling-based method, namely Model Predictive Path Integral...
-
This paper considers centralized mission-planning for a heterogeneous multi-agent system with the aim of locating a hidden target. We propose a mixed observable setting, consisting of a fully observable state-space and a partially observable environment, using a hidden Markov model. First, we construct rapidly exploring random trees (RRTs) to introduce the mixed observable RRT for finding plausibl...
-
We present a novel reinforcement learning based algorithm for multi-robot task allocation problem in ware-house environments. We formulate it as a Markov Decision Process and solve via a novel deep multi-agent reinforcement learning method (called RTAW) with attention inspired policy architecture. Hence, our proposed policy network uses global embeddings that are independent of the number of robot...
-
Effective task allocation is an essential component to the coordination of heterogeneous robots. This paper proposes a hybrid task allocation algorithm that improves upon given initial solutions, for example from the popular decentralized market-based allocation algorithm, via a derivative-free optimization strategy called Speeding-Up and Slowing-Down (SUSD). Based on the initial solutions, SUSD p...
-
Multi-Agent Path Finding (MA-PF) computes a set of collision-free paths for multiple agents from their respective starting locations to destinations. This paper considers a generalization of MA-PF called Multi-Agent Teamwise Cooperative Path Finding (MA-TC-PF), where agents are grouped as multiple teams and each team has its own objective to be minimized. For example, an objective can be the sum o...
-
Collaborative scheduling is an essential ability for a team of heterogeneous robots to collaboratively complete complex tasks, e.g., in a multi-robot assembly application. To enable collaborative scheduling, two key problems should be addressed, including allocating tasks to heterogeneous robots and adapting to robot failures in order to guarantee the completion of all tasks. In this paper, we int...
-
This paper presents a scalable online algorithm to generate safe and kinematically feasible trajectories for quadrotor swarms. Existing approaches rely on linearizing Euclidean distance-based collision constraints and on axis-wise decoupling of kinematic constraints to reduce the trajectory optimization problem for each quadrotor to a quadratic program (QP). This conservative approximation often f...
-
This paper presents a decentralized multi-agent trajectory planning (MATP) algorithm that guarantees to generate a safe, deadlock-free trajectory in an obstacle-rich environment under a limited communication range. The proposed algorithm utilizes a grid-based multi-agent path planning (MAPP) algorithm for deadlock resolution, and we introduce the subgoal optimization method to make the agent conve...
-
This paper proposes a new methodology to develop a time-varying group formation tracking scheme for a class of multi-agent systems (e.g. different types of multi-robot systems) utilising Negative Imaginary (NI) theory. It offers a two-loop control scheme in which the inner loop deploys an appropriate feedback linearising control law to transform the nonlinear dynamics of each agent into a double i...
-
Safe navigation is a fundamental challenge in multi-robot systems due to the uncertainty surrounding the future trajectory of the robots that act as obstacles for each other. In this work, we propose a principled data-driven approach where each robot repeatedly solves a finite horizon optimization problem subject to collision avoidance constraints with latter being formulated as distributionally r...
-
Forecasting the future states of surrounding traffic participants is a crucial capability for autonomous vehicles. The recently proposed occupancy flow field prediction introduces a scalable and effective representation to jointly predict surrounding agents' future motions in a scene. However, the challenging part is to model the underlying social interactions among traffic agents and the relation...
-
As autonomous driving systems prevail, it is becoming increasingly critical that the systems learn from databases containing fine-grained driving scenarios. Most databases currently available are human-annotated; they are expensive, time-consuming, and subject to behavioral biases. In this paper, we provide initial evidence supporting a novel technique utilizing drivers' electroencephalography (EE...
-
Accurate understanding and prediction of human behaviors are critical prerequisites for autonomous vehicles, especially in highly dynamic and interactive scenarios such as intersections in dense urban areas. In this work, we aim at identifying crossing pedestrians and predicting their future trajectories. To achieve these goals, we not only need the context information of road geometry and other t...
-
Existing multi-agent perception systems assume that every agent utilizes the same model with identical parameters and architecture. The performance can be degraded with different perception models due to the mismatch in their confidence scores. In this work, we propose a model-agnostic multi-agent perception framework to reduce the negative effect caused by the model discrepancies without sharing ...
-
This work explores scene graphs as a distilled representation of high-level information for autonomous driving, applied to future driver-action prediction. Given the scarcity and strong imbalance of data samples, we propose a self-supervision pipeline to infer representative and well-separated embeddings. Key aspects are interpretability and explainability; as such, we embed in our architecture at...
-
CueCAn: Cue-driven Contextual Attention for Identifying Missing Traffic Signs on Unconstrained Roads
Unconstrained Asian roads often involve poor infrastructure, affecting overall road safety. Missing traffic signs are a regular part of such roads. Missing or non-existing object detection has been studied for locating missing curbs and estimating reasonable regions for pedestrians on road scene images. Such methods involve analyzing task-specific single object cues. In this paper, we present the ...
-
Radar sensors employed for environment perception, e.g. in autonomous vehicles, output a lot of unwanted clutter. These points, for which no corresponding real objects exist, are a major source of errors in following processing steps like object detection or tracking. We therefore present two novel neural network setups for identifying clutter. The input data, network architectures and training co...
-
Real-time knowledge of the vehicle mass is valuable for several applications, mainly: active safety systems design and energy consumption optimization. This work describes a novel strategy for mass estimation in static and dynamic conditions. First, when the vehicle is powered-up, an initial estimation is given by observing the variations of one suspension deflection sensor mounted on the rear. Th...
-
Autonomous vehicles must often contend with conflicting planning requirements, e.g., safety and comfort could be at odds with each other if avoiding a collision calls for slamming the brakes. To resolve such conflicts, assigning importance ranking to rules (i.e., imposing a rule hierarchy) has been proposed, which, in turn, induces rankings on trajectories based on the importance of the rules they...
-
Autonomous agents (robots) face tremendous challenges while interacting with heterogeneous human agents in close proximity. One of these challenges is that the autonomous agent does not have an accurate model tailored to the specific human that the autonomous agent is interacting with, which could sometimes result in inefficient human-robot interaction and suboptimal system dynamics. Developing an...
-
Data-driven simulation has become a favorable way to train and test autonomous driving algorithms. The idea of replacing the actual environment with a learned simulator has also been explored in model-based reinforcement learning in the context of world models. In this work, we show data-driven traffic simulation can be formulated as a world model. We present TrafficBots, a multi-agent policy buil...
-
Designing a safe and human-like decision-making system for an autonomous vehicle is a challenging task. Generative imitation learning is one possible approach for automating policy-building by leveraging both real-world and simulated decisions. Previous work that applies generative imitation learning to autonomous driving policies focuses on learning a low-level controller for simple settings. How...
-
In this paper, a novel method based on Artificial Potential Field (APF) theory is presented, for optimal motion planning in fully-known, static workspaces, for multiple final goal configurations. Optimization is achieved through a Reinforcement Learning (RL) framework. More specifically, the parameters of the underlying potential field are adjusted through a policy gradient algorithm in order to m...
-
In this paper, we introduce the first published planner to drive a car in dense, urban traffic using Inverse Reinforcement Learning (IRL). Our planner, DriveIRL, generates a diverse set of trajectory proposals and scores them with a learned model. The best trajectory is tracked by our self-driving vehicle's low-level controller. We train our trajectory scoring model on a 500+ hour real-world datas...
-
Sampling-based algorithms solve the path planning problem by generating random samples in the searchspace and incrementally growing a connectivity graph or a tree. Conventionally, the sampling strategy used in these algorithms is biased towards exploration to acquire information about the search-space. In contrast, this work proposes an optimization-based procedure that generates new samples so as...
-
In dynamic environments, robotic manipulators and especially cobots must be able to react to changing circumstances while in motion. This substantiates the need for quick trajectory planning algorithms that are able to cope with arbitrary velocity and acceleration boundary conditions. Apart from dynamic re-planning, being able to seamlessly join trajectories together opens the door for divide-and-...
-
In autonomous driving, an accurate understanding of environment, e.g., the vehicle-to-vehicle and vehicle-to-lane interactions, plays a critical role in many driving tasks such as trajectory prediction and motion planning. Environment information comes from high-definition (HD) map and historical trajectories of vehicles. Due to the heterogeneity of the map data and trajectory data, many data-driv...
-
Practical global path planning is critical for commercializing cleaning robots working in semi-structured environments. In the literature, global path planning methods for free space usually focus on path length and neglect the traffic rule constraints of the environments, which leads to high-frequency re-planning and increases collision risks. In contrast, those for structured environments are de...
-
Fast motion planning is of great significance, espe-cially when a timely mission is desired. However, the complexity of motion planning can grow drastically with the increase of environment details and mission complexity. This challenge can be further exacerbated if the tasks are coupled with the desired locations in the environment. To address these issues, this work aims at fast motion planning ...
-
We consider the problem of planning the motion of a drone equipped with a robotic arm, tasked with bringing its end-effector up to many (150+) targets in a fruit tree; to inspect every piece of fruit, for example. The task is complicated by the intersection of a version of Neighborhood TSP (to find an optimal order and a pose to visit every target), and a robotic motion-planning problem through a ...
-
In this paper, we address the problem of online quadrotor whole-body motion planning (SE(3) planning) in unknown and unstructured environments. We propose a novel multi-resolution search method, which discovers narrow areas requiring full pose planning and normal areas requiring only position planning. As a consequence, a quadrotor planning problem is decomposed into several SE(3) (if necessary) a...
-
This paper presents a method for task-oriented path planning and collision avoidance for a group of heterogeneous holonomic mobile sensors. It is a generalization of the authors' prior work on diffusion-based path planning. The proposed variant allows one to plan paths in environments cluttered with obstacles. The agents follow flow dynamics, i.e., the negative gradient of a function that is the s...
-
Predicting the future motion of road participants is crucial for autonomous driving but is extremely challenging due to staggering motion uncertainty. Recently, most motion forecasting methods resort to the goal-based strategy, i.e., predicting endpoints of motion trajectories as conditions to regress the entire trajectories, so that the search space of solution can be reduced. However, accurate g...
-
There is extensive literature on perceiving road structures by fusing various sensor inputs such as lidar point clouds and camera images using deep neural nets. Leveraging the latest advance of neural architects (such as transformers) and bird-eye-view (BEV) representation, the road cognition accuracy keeps improving. However, how to cognize the “road” for automated vehicles where there is no well...
-
We present a generalised architecture for reactive mobile manipulation while a robot's base is in motion toward the next objective in a high-level task. By performing tasks on-the-move, overall cycle time is reduced compared to methods where the base pauses during manipulation. Reactive control of the manipulator enables grasping objects with unpredictable motion while improving robustness against...
-
This paper addresses a new semantic multi-robot planning problem in uncertain and dynamic environments. Particularly, the environment is occupied with mobile and uncertain semantic targets. These targets are governed by stochastic dynamics while their current and future positions as well as their semantic labels are uncertain. Our goal is to control mobile sensing robots so that they can accomplis...
-
Searching for objects is a fundamental skill for robots. As such, we expect object search to eventually become an off-the-shelf capability for robots, similar to e.g., object detection and SLAM. In contrast, however, no system for 3D object search exists that generalizes across real robots and environments. In this paper, building upon a recent theoretical framework that exploited the octree struc...
-
State minimization of combinatorial filters is a fundamental problem that arises, for example, in building cheap, resource-efficient robots. But exact minimization is known to be NP-hard. This paper conducts a more nuanced analysis of this hardness than up till now, and uncovers two factors which contribute to this complexity. We show each factor is a distinct source of the problem's hardness and ...
-
This work presents a step towards utilizing incrementally-improving symbolic perception knowledge of the robot's surroundings for provably correct reactive control synthesis applied to an autonomous driving problem. Combining abstract models of motion control and information gathering, we show that assume-guarantee specifications (a subclass of Linear Temporal Logic) can be used to define and reso...
-
We consider a robot that answers questions about its environment by traveling to appropriate places and then sensing. Questions are posed as structured queries and may involve conditional or contingent relationships between observable properties. After formulating this problem, and empha-sizing the advantages of exploiting deducible information, we describe how non-trivial knowledge of the world a...
-
This paper presents a novel method for using Riemannian Motion Policies on volumetric maps, shown in the example of obstacle avoidance for Micro Aerial Vehicles (MAVs), Today, most robotic obstacle avoidance algorithms rely on sampling or optimization-based planners with volumetric maps. However, they are computationally expensive and often have inflexible monolithic architectures. Riemannian Moti...
-
For robotic vehicles to navigate robustly and safely in unseen environments, it is crucial to decide the most suitable navigation policy. However, most existing deep reinforcement learning based navigation policies are trained with a hand-engineered curriculum and reward function which are difficult to be deployed in a wide range of real-world scenarios. In this paper, we propose a framework to le...
-
Agile flights of autonomous quadrotors in clut-tered environments require constrained motion planning and control subject to translational and rotational dynamics. Tra-ditional model-based methods typically demand complicated design and heavy computation. In this paper, we develop a novel deep reinforcement learning-based method that tackles the challenging task of flying through a dynamic narrow ...
-
Although communication delays can disrupt multiagent systems, most of the existing multiagent trajectory planners lack a strategy to address this issue. State-of-the-art approaches typically assume perfect communication environments, which is hardly realistic in real-world experiments. This paper presents Robust MADER (RMADER), a decentralized and asynchronous multiagent trajectory planner that ca...
-
Collision avoidance in the presence of dynamic obstacles in unknown environments is one of the most critical challenges for unmanned systems. In this paper, we present a method that identifies obstacles in terms of ellipsoids to estimate linear and angular obstacle velocities. Our proposed method is based on the idea of any object can be approximately expressed by ellipsoids. To achieve this, we p...
-
Command and control of an aerial swarm is a complex task. This task increases in difficulty when the flight volume is restricted and the swarm and operator inhabit the same workspace. This work presents a novel method for interacting with and controlling a swarm of quadrotors in a confined space. EMG-based gesture control is used to control the position, orientation, and density of the swarm. Inte...
-
6-DoF robotic grasping is a long-lasting but un-solved problem. Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors, demonstrating superior accuracy on common objects but performing unsatisfactorily on photometrically challenging objects, e.g., objects in transparent or reflective materials. The bottleneck lies in that the surface of these obj...
-
Can a robot manipulate intra-category unseen objects in arbitrary poses with the help of a mere demonstration of grasping pose on a single object instance? In this paper, we try to address this intriguing challenge by using USEEK, an unsupervised SE(3)-equivariant keypoints method that enjoys alignment across instances in a category, to perform generaliz-able manipulation. USEEK follows a teacher-...
-
Recent advances in robotic mapping enable robots to use both semantic and geometric understanding of their surroundings to perform complex tasks. Current methods are optimized for reconstruction quality, but they do not provide a measure of how certain they are of their outputs. Therefore, algorithms that use these maps do not have a way of assessing how much they can trust the outputs. We present...
-
Fingertip-mounted pretouch sensors are very useful for robotic grasping. In this paper, we report a new (G3) dual-modal and dual sensing mechanisms (DMDSM) pretouch sensor for near-distance ranging and material sensing, which is based on pulse-echo ultrasound (US) and optoacoustics (OA). Different from previously reported versions, the G3 sensor utilizes a self-focused US/OA transceiver, thereby e...
-
We address the problem of grasping unknown objects identified from top-down images with a parallel gripper. When no object 3D model is available, the state-of-the-art grasp generators identify the best candidate locations for planar grasps using the RGBD image. However, while they generate the Cartesian location and orientation of the gripper, the height of the grasp center is often determined by ...
-
Generating high-quality instance-wise grasp con-figurations provides critical information of how to grasp specific objects in a multi-object environment and is of high importance for robot manipulation tasks. This work proposed a novel Single-Stage Grasp (SSG) synthesis network, which performs high-quality instance-wise grasp synthesis in a single stage: instance mask and grasp configurations are ...
-
Efficient grasp pose detection is essential for robotic manipulation in cluttered scenes. However, most methods only utilize point clouds or images for prediction, ignoring the advantages of different features. In this paper, we present a multi-modal fusion network for joint segmentation and grasp pose detection. We design a point cloud and image co-guided feature fusion module that can be used to...
-
In this work, we tackle 6-DoF grasp detection for transparent and specular objects, which is an important yet challenging problem in vision-based robotic systems, due to the failure of depth cameras in sensing their geometry. We, for the first time, propose a multiview RGB-based 6-DoF grasp detection network, GraspNeRF, that leverages the generalizable neural radiance field (NeRF) to achieve mater...
-
Physical interaction with textiles, such as assistive dressing or household tasks, requires advanced dexterous skills. The complexity of textile behavior during stretching and pulling is influenced by the material properties of the yarn and by the textile's construction technique, which are often unknown in real-world settings. Moreover, identification of physical properties of textiles through se...
-
We study six-dimensional (6D) perceptive peg-in-hole problem for practical connector insertion task in this paper. To enable the manipulator system to handle different types of pegs in complex environment, we develop a perceptive robotic assembly system that utilizes an in-hand RGB-D camera for peg-in-hole with multiple types of pegs. The proposed framework addresses the critical hole detection an...
-
We present a system for collision-free control of a robot manipulator that uses only RGB views of the world. Perceptual input of a tabletop scene is provided by multiple images of an RGB camera (without depth) that is either handheld or mounted on the robot end effector. A NeRF-like process is used to reconstruct the 3D geometry of the scene, from which the Euclidean full signed distance function ...
-
In many automation tasks involving manipulation of rigid objects, the poses of the objects must be acquired. Vision-based pose estimation using a single RGB or RGB-D sensor is especially popular due to its broad applicability. However, single-view pose estimation is inherently limited by depth ambiguity and ambiguities imposed by various phenom-ena like occlusion, self-occlusion, reflections, etc....
-
Robot grasping has been widely studied in the last decade. Recently, Deep Learning made possible to achieve remarkable results in grasp pose estimation, using depth and RGB images. However, only few works consider the choice of the object to grasp. Moreover, they require a huge amount of data for generalizing to unseen object categories. For this reason, we introduce the Few-shot Semantic Grasping...
-
Flat objects with negligible thicknesses like books and disks are challenging to be grasped by the robot because of the width limit of the robot's gripper, especially when they are in cluttered environments. Pre-grasp manipulation is conducive to rearranging objects on the table and moving the flat objects to the table edge, making them graspable. In this paper, we formulate this task as Parameter...
-
This paper presents a new technique for learning category-level manipulation from raw RGB-D videos of task demonstrations, with no manual labels or annotations. Category-level learning aims to acquire skills that can be generalized to new objects, with geometries and textures that are different from the ones of the objects used in the demonstrations. We address this problem by first viewing both g...
-
We formulate grasp learning as a neural field and present Neural Grasp Distance Fields (NGDF). Here, the input is a 6D pose of a robot end effector and output is a distance to a continuous manifold of valid grasps for an object. In contrast to current approaches that predict a set of discrete candidate grasps, the distance-based NGDF representation is easily interpreted as a cost, and minimizing t...
-
Objects rarely sit in isolation in human environments. As such, we'd like our robots to reason about how multiple objects relate to one another and how those relations may change as the robot interacts with the world. To this end, we propose a novel graph neural network framework for multi-object manipulation to predict how inter-object relations change given robot actions. Our model operates on p...
-
A robot operating in a household environment will see a wide range of unique and unfamiliar objects. While a system could train on many of these, it is infeasible to predict all the objects a robot will see. In this paper, we present a method to generalize object manipulation skills acquired from a limited number of demonstrations, to novel objects from unseen shape categories. Our approach, Local...
-
Recent work in visual end-to-end learning for robotics has shown the promise of imitation learning across a variety of tasks. Such approaches are however expensive both because they require large amounts of real world data and rely on time-consuming real-world evaluations to identify the best model for deployment. These challenges can be mitigated by using simulation evaluations to identify high p...
-
In robotic manipulation, acquiring samples is extremely expensive because it often requires interacting with the real world. Traditional image-level data augmentation has shown the potential to improve sample efficiency in various machine learning tasks. However, image-level data augmentation is insufficient for an imitation learning agent to learn good manipulation policies in a reasonable amount...
-
Dextrous in-hand manipulation with a multi-fingered robotic hand is a challenging task, esp. when performed with the hand oriented upside down, demanding permanent force-closure, and when no external sensors are used. For the task of reorienting an object to a given goal orientation (vs. infinitely spinning it around an axis), the lack of external sensors is an additional fundamental challenge as ...
-
When humans perform contact-rich manipulation tasks, customized tools are often necessary to simplify the task. For instance, we use various utensils for handling food, such as knives, forks and spoons. Similarly, robots may benefit from specialized tools that enable them to more easily complete a variety of tasks. We present an end-to-end framework to automatically learn tool morphology for conta...
-
We address the important problem of generalizing robotic rearrangement to clutter without any explicit object models. We first generate over 650K cluttered scenes-orders of magnitude more than prior work-in diverse everyday environments, such as cabinets and shelves. We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture. CabiNet is a collision m...
-
We introduce NIFT, Neural Interaction Field and Template, a descriptive and robust interaction representation of object manipulations to facilitate imitation learning. Given a few object manipulation demos, NIFT guides the generation of the interaction imitation for a new object instance by matching the Neural Interaction Template (NIT) extracted from the demos in the target Neural Interaction Fie...
-
Place recognition is a critical and challenging task for mobile robots, aiming to retrieve an image captured at the same place as a query image from a database. Existing methods tend to fail while robots move autonomously under occlusion (e.g., car, bus, truck) and changing appearance (e.g., illumination changes, seasonal variation). Because they encode the image into only one code, entangling pla...
-
Large-scale place recognition is a fundamental but challenging task, which plays an increasingly important role in autonomous driving and robotics. Existing methods have achieved acceptable good performance, however, most of them are concentrating on designing elaborate global descriptor learning network structures. The importance of feature generalization and descriptor post-enhancing has long be...
-
In this work, we present a novel global descriptor termed stable triangle descriptor (STD) for 3D place recognition. For a triangle, its shape is uniquely determined by the length of the sides or included angles. Moreover, the shape of triangles is completely invariant to rigid transformations. Based on this property, we first design an algorithm to efficiently extract local key points from the 3D...
-
LiDAR based place recognition is popular for loop closure detection and re-localization. In recent years, deep learning brings improvements to place recognition by learnable feature extraction. However, these methods degenerate when the robot re-visits previous places with a large perspective difference. To address the challenge, we propose DeepRING to learn the roto-translation invariant represen...
-
We present a general approach for determining the unknown (or uncertain) position and orientation of a sensor mounted on a robot in a known environment, using only a few distance measurements (between 2 to 6 typically), which is advantageous, among others, in sensor cost, and storage and information-communication resources. In-between the measurements, the robot can perform predetermined local mot...
-
One recent promising approach to the Visual Place Recognition (VPR) problem has been to fuse the place recognition estimates of multiple complementary VPR techniques using methods such as shared representative appearance learning (SRAL) and multi-process fusion. These approaches come with a substantial practical limitation: they require all potential VPR methods to be brute-force run before they a...
-
The localization system is an essential element in robotics, which can provide accurate position information. Multiple localization systems can be integrated for reliable localization operations because there are various methods for measuring the position or processing algorithms. Significantly, the track-to-track (T2T) fusion method can fuse multiple localization systems using each system's estim...
-
Self-localization is a fundamental capability that mobile robot navigation systems integrate to move from one point to another using a map. Thus, any enhancement in localization accuracy is crucial to perform delicate dexterity tasks. This paper describes a new localization algorithm that maintains several populations of particles using the Monte Carlo Localization (MCL) algorithm, always choosing...
-
The Perspective-n-Point (PnP) problem has been widely studied in both computer vision and photogrammetry societies. With the development of feature extraction techniques, a large number of feature points might be available in a single shot. It is promising to devise a consistent estimator, i.e., the estimate can converge to the true camera pose as the number of points increases. To this end, we pr...
-
Accurate and robust localization systems are often highly desired in autonomous mobile robots. Existing LiDAR-based localization systems generally use standard particle filters which suffer from the well-known particle degeneracy problem. Furthermore, standard particle filters are ill-suited for handling discrepancies between maps and the actual operating environments. In this work, we present an ...
-
State estimation is an essential part of autonomous systems. Integrating the Ultra-Wideband (UWB) technique has been shown to correct the long-term estimation drift and bypass the complexity of loop closure detection. However, few works on robotics treat UWB as a stand-alone state estimation solution. The primary purpose of this work is to investigate planar pose estimation using only UWB range me...
-
Motion estimation is an essential element for autonomous vessels. It is used e.g. for lidar motion compensation as well as mapping and detection tasks in a maritime environment. Because the use of gyroscopes is not reliable and a high performance inertial measurement unit is quite expensive, we present an approach for visual pitch and roll estimation that utilizes a convolutional neural network fo...
-
Safe quadrupedal navigation through unknown environments is a challenging problem. This paper proposes a hierarchical vision-based planning framework (GPF-BG) integrating our previous Global Path Follower (GPF) navigation system and a gap-based local planner using Bézier curves, so called $B$ézier Gap (BG). This BG-based trajectory synthesis can generate smooth trajectories and guarantee safety fo...
-
Feature-based methods are a popular method for camera state estimation using event cameras. Due to the spatiotemporal nature of events, all event images exhibit smearing of events analogous to motion blur for a camera under motion. As such, events must be motion compensated to derive a sharp event image. However, this presents a causality dilemma where motion prior is required to unsmear the event...
-
We propose a lightweight system for stereo-matching using embedded graphic processing units (GPUs). The proposed system overcomes the trade-off between accuracy and processing speed in stereo matching, thus further improving the matching accuracy while ensuring real-time processing. The basic idea is to construct a tiny neural network based on a variational autoencoder (VAE) to achieve the upscali...
-
Recently, neural control policies have outperformed existing model-based planning-and-control methods for autonomously navigating quadrotors through cluttered environments in minimum time. However, they are not perception aware, a crucial requirement in vision-based navigation due to the camera's limited field of view and the underactuated nature of a quadrotor. We propose a learning-based system ...
-
Nano quadcopters are small, agile, and cheap platforms that are well suited for deployment in narrow, cluttered environments. Due to their limited payload, these vehicles are highly constrained in processing power, rendering conventional vision-based methods for safe and autonomous navigation incompatible. Recent machine learning developments promise high-performance perception at low latency, whi...
-
In this paper, we focus on the problem of efficiently locating a target object described with free-form text using a mobile robot equipped with vision sensors (e.g., an RGBD camera). Conventional active visual search predefines a set of objects to search for, rendering these techniques restrictive in practice. To provide added flexibility in active visual searching, we propose a system where a use...
-
We propose a hierarchical visual navigation solution, called Memory-based Exploration-value Evaluation Model (MEEM), to improve the agent's navigation performance. MEEM employs a hierarchical policy to tackle the challenge of sparse rewards, holds an episodic memory to store the historical information of the agent, and applies an Exploration-value Evaluation Model to calculate an exploration-value...
-
We present Visual Navigation and Locomotion over obstacles (ViNL), which enables a quadrupedal robot to navigate unseen apartments while stepping over small obstacles that lie in its path (e.g., shoes, toys, cables), similar to how humans and pets lift their feet over objects as they walk. ViNL consists of: (1) a visual navigation policy that outputs linear and angular velocity commands that guide...
-
Object goal visual navigation is a challenging task that aims to guide a robot to find the target object based on its visual observation, and the target is limited to the classes pre-defined in the training stage. However, in real households, there may exist numerous target classes that the robot needs to deal with, and it is hard for all of these classes to be contained in the training stage. To ...
-
Recent work has shown impressive localization performance using only images of ground textures taken with a downward facing monocular camera. This provides a reliable navigation method that is robust to feature sparse environments and challenging lighting conditions. However, these localization methods require an existing map for comparison. Our work aims to relax the need for a map by introducing...
-
This paper introduces a novel approach to GPS-denied visual navigation of a robot team over a wide (i.e., out of line of sight) area which we call WAVN (Wide Area Visual Navigation). Application domains include small-scale precision agriculture as well as exploration and surveillance. The proposed approach requires no exploration or map generation, merging, and updating, some of the most computati...
-
Radar sensors are emerging as solutions for perceiving surroundings and estimating ego-motion in extreme weather conditions. Unfortunately, radar measurements are noisy and suffer from mutual interference, which degrades the performance of feature extraction and matching, triggering imprecise matching pairs, which are referred to as outliers. To tackle the effect of outliers on radar odometry, $a$...
-
Despite the impressive results achieved by many existing Structure from Motion (SfM) approaches, there is still a need to improve the robustness, accuracy, and efficiency on large-scale scenes with many outlier matches and sparse view graphs. In this paper, we propose AdaSfM: a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets. Our approach first does a ...
-
The map fusion for multi-robot simultaneous localization and mapping (SLAM) consistently combines robot maps built independently into the global map. An established approach to map fusion is utilizing rendezvous, which refers to an encounter between multiple agents, to calculate the transformation into the global map. However, previous works using rendezvous have a limitation in that they are unre...
-
In this paper we propose a novel algorithm, Wi-Closure, to improve the computational efficiency and robustness of loop closure detection in multi-robot SLAM. Our approach decreases the computational overhead of classical approaches by pruning the search space of potential loop closures, prior to evaluation by a typical multi-robot SLAM pipeline. Wi-Closure achieves this by identifying candidates t...
-
Collaborative SLAM is at the core of perception in multi-robot systems as it enables the co-localization of the team of robots in a common reference frame, which is of vital importance for any coordination amongst them. The paradigm of a centralized architecture is well established, with the robots (i.e. agents) running Visual-Inertial Odometry (VIO) onboard while communicating relevant data, such...
-
Invariant Extended Kalman Filter (IEKF) has been successfully applied in Visual-inertial Odometry (VIO) as an advanced achievement of Kalman filter, showing great potential in sensor fusion. In this paper, we propose partial IEKF (PIEKF), which only incorporates rotation-velocity state into the Lie group structure and apply it for Visual-Inertial-Wheel Odometry (VIWO) to improve positioning accura...
-
The extrinsic parameters play a crucial role in multi-sensor fusion, such as visual-inertial Simultaneous Localization and Mapping(SLAM), as they enable the accurate alignment and integration of measurements from different sensors. However, extrinsic calibration is challenging in scenarios, such as underwater, where in-view structures are scanty and visibility is limited, causing incorrect extrins...
-
This paper proposes a method for learning continuous control policies for exploration and active landmark localization. We consider a mobile robot detecting landmarks within a limited sensing range, and tackle the problem of learning a control policy that maximizes the mutual information between the landmark states and the sensor observations. We employ a Kalman filter to convert the partially obs...
-
This paper presents a visual SLAM system that uses both points and lines for robust camera localization, and simultaneously performs a piece-wise planar reconstruction (PPR) of the environment to provide a structural map in real-time. One of the biggest challenges in parallel tracking and mapping with a monocular camera is to keep the scale consistent when reconstructing the geometric primitives. ...
-
In this paper we address the rotation synchronization problem, where the objective is to recover absolute rotations starting from pairwise ones, where the unknowns and the measures are represented as nodes and edges of a graph, respectively. This problem is an essential task for structure from motion and simultaneous localization and mapping. We focus on the formulation of synchronization via neur...
-
Existence of symmetric objects, whose observation at different viewpoints can be identical, can deteriorate the performance of simultaneous localization and mapping (SLAM). This work proposes a system for robustly optimizing the pose of cameras and objects even in the presence of symmetric objects. We classify objects into three categories depending on their symmetry characteristics, which is effi...
-
Loop detection plays a key role in visual Si-multaneous Localization and Mapping (SLAM) by correcting the accumulated pose drift. In indoor scenarios, the richly distributed semantic landmarks are view-point invariant and hold strong descriptive power in loop detection. The current semantic-aided loop detection embeds the topology between semantic instances to search a loop. However, current seman...
-
Miniaturization of robotic grippers enables precise manipulation of small-size objects. However, most microgrippers are actuated by rigid actuators, and thus retain challenges such as micro-fabrication, complex structure, and lack of compliance. Here, we present a compliant microgripper driven by a soft polymer actuator. The proposed millimeter-scale soft polymer actuator can produce a linear disp...
-
In this study, we develop a hydraulically-driven soft robotic hand for handling heavy vegetables in a vegetable factory and report its experimental validations. The working population in agriculture is decreasing worldwide, creating a lot of demands for the robotic automation in harvest and trans-portation of agricultural produces. In particular, a vegetable factory deals with large and heavy vege...
-
This paper proposes a novel bin picking framework, two-stage grasping, aiming at precise grasping of cluttered small objects. Object density estimation and rough grasping are conducted in the first stage. Fine segmentation, detection, grasping, and pushing are performed in the second stage. A small object bin picking system has been realized to exhibit the concept of two-stage grasping. Experiment...
-
Shape displays are a class of haptic devices that enable whole-hand haptic exploration of 3D surfaces. However, their scalability is limited by the mechanical complexity and high cost of traditional actuator arrays. In this paper, we propose using electroadhesive auxetic skins as a strain-limiting layer to create programmable shape change in a continuous (“formable crust”) shape display. Auxetic s...
-
Recent work on battery-free soft robotics has demonstrated the use of liquid crystal elastomers (LCE) to build shape-changing materials activated by applied external heat. However, sources of heat must typically be in direct field-of-view of the robot (i.e. NIR, laser, and visual light EM sources or convective heats guns), be tethered to an external power supply (i.e. thermoelectric heating or res...
-
Enlightened by the fast-running gait of mammals like cheetahs and wolves, we design and fabricate a single-actuated untethered compliant robot that is capable of galloping at a speed of 313 mm/s or 1.56 body length per second (BL/s), faster than most reported soft crawlers in mm/s and BL/s. An in-plane prestressed hair clip mechanism (HCM) made up of semirigid materials, i.e. plastics are used as ...
-
Compared with rigid robots, soft robots have the advantages of inherent compliance, high adaptability, and impact tolerance. Many researchers are very interested in the motion design of soft robot underwater. In this paper, inspired by the method of octopus propulsion, a jet propulsion unit with 80% soft materials driven by pressure is designed. It can change the volume of its cavity to absorb and...
-
Robotic manipulation can benefit from wrist-mounted force/torque (F/T) sensors, but conventional F/T sensors can be expensive, difficult to install, and damaged by high loads. We present Visual Force/Torque Sensing (VFTS), a method that visually estimates the 6-axis F/T measurement that would be reported by a conventional F/T sensor. In contrast to approaches that sense loads using internal camera...
-
Modeling and control of high-dimensional, nonlinear robotic systems remains a challenging task. While various model- and learning-based approaches have been proposed to address these challenges, they broadly lack generalizability to different control tasks and rarely preserve the structure of the dynamics. In this work, we propose a new, data-driven approach for extracting control-oriented, low-di...
-
Shape memory alloy (SMA) has superior actuation capability over the limit of the scale. However, inherently low controllability is a primary issue that hinders practical usage. To address this challenge, this paper presents an SMA-based artificial muscle actuator capable of the displacement sensing through the capacitive sensor. To realize sensing capability, the theoretical model-based design and...
-
Data-driven approaches have shown promising results in modeling and controlling robots, specifically soft and flexible robots where developing physics-based models are more challenging. However, these methods often require a large number of real data, and gathering such data is time-consuming and can damage the robot as well. This paper proposed a novel data-efficient and non-parametric approach t...
-
This paper proposes a novel mathematical solution to solve the inverse kinematics (IK) of single section mobile continuum manipulators (SSMCMs). Thus, to achieve a given pose of the end-effector (EE), the proposed mathematical solution consists in determining the position and orientation parameters of the mobile platform and of a single section of the continuum manipulator. As advantages, the prop...
-
Current dynamical models of Cosserat rods often use the finite element method limited by computational efficiency or the finite difference method in a Cartesian framework with a compromise to accuracy. We employ the finite difference method in a geometric framework to develop solutions that are both computationally efficient and accurate. A numerical study is conducted on various backward-differen...
-
Concentric tube continuum robots (CTCRs) belong to the family of continuum robots with applications in minimally invasive surgeries. Because of this application domain, measuring the external forces along the body of the robot is paramount. CTCRs are made up of thin elastic rods and are intended to be applied inside the human body, where conventional sensor-based measurements are not feasible. Con...
-
In this paper, we propose a novel dynamic gait controller for the repetitive behavior of soft robot manipulators performing routine tasks. Compliance with soft robots is advantageous when the robot interacts with living organisms and other fragile objects. However, predicting and controlling repetitive behavior is challenging because of hysteresis and non-linear dynamics governing the interactions...
-
In this article, we investigate the model based position control of soft hydraulic actuators arranged in an an-tagonistic pair. A dynamical model of the system is constructed by employing the port-Hamiltonian formulation. A control algorithm is designed with an energy shaping approach, which accounts for the pressure dynamics of the fluid. A nonlinear observer is included to compensate the effect ...
-
Current research strive has tremendously changed the horizon of computer vision tasks in multiple agents tracking. Nevertheless, in the research of robotic assisted surgery, reliable surgical instrument tracking imposes challenge due to the high complexity in state modeling for the hierarchical structure of the instrument versus de-coupling the spatial-temporal correlations naturally embedded in t...
-
Current imaging of the artery relies primarily on computed tomography angiography (CTA), which requires contrast injections and exposure to radiation. In this paper, we present a method for fully autonomous artery 3D image acquisition using a linear ultrasound (US) probe and a 6 DoFs robot arm with a 3D camera. Robotic vessel acquisition can minimize tissue deformation and permit the reproduction ...
-
Tracheal intubation for patients with respiratory infectious diseases requires doctors to wear a full set of protective clothing, which takes a certain time. How to protect doctors from infection when facing an emergency operation has become an important issue. The intubation robot may solve this contradiction. To provide visual information for real-time path planning for robotic intubation, this ...
-
For robot-assisted surgery, an accurate surgical report reflects clinical operations during surgery and helps document entry tasks, post-operative analysis and follow-up treatment. It is a challenging task due to many complex and diverse interactions between instruments and tissues in the surgical scene. Although existing surgical report generation methods based on deep learning have achieved larg...
-
Robot-guided vascular access has the potential to deliver urgent medical care in situations where medical personnel are unavailable. However, this technique requires accurate and reliable segmentation of anatomical landmarks in the body. For the ultrasound imaging modality, obtaining large amounts of training data for a segmentation model is time-consuming and expensive. This paper introduces RESU...
-
Ultrasound imaging is a commonly used modality for several diagnostic and therapeutic procedures. However, the diagnosis by ultrasound relies heavily on the quality of images assessed manually by sonographers, which diminishes the objectivity of the diagnosis and makes it operator-dependent. The supervised learning-based methods for automated quality assessment require manually annotated datasets,...
-
Ultrasound scanning is an efficient imaging modality preferred for quick medical procedures. However, due to the lack of skilled sonographers, researchers have developed many Robotic Ultrasound System (RUS) prototypes for various procedures. Most of these systems have a human-in-the-loop and require an expert to point the robot to the region of the subject to be scanned. Only a few systems try to ...
-
In Robot-assisted Minimally Invasive Surgery (RMIS), the estimation of the pose of surgical tools is crucial for applications such as surgical navigation, visual servoing, autonomous robotic task execution and augmented reality. A plethora of hardware-based and vision-based methods have been proposed in the literature. However, direct application of these methods to RMIS has significant limitation...
-
In this paper, we present a Computer Vision (CV) based tracking and fusion algorithm, dedicated to a 3D printed gimbal system on drones flying in nature. The whole gimbal system can stabilize the camera orientation robustly in challenging environments by using skyline and ground plane as references. Our main contributions are the following: a) a light-weight Resnet-18 backbone network model was tr...
-
With robots being deployed in increasingly complex environments like underground mines and planetary surfaces, the multi-sensor fusion method has gained more and more attention which is a promising solution to state estimation in the such scene. The fusion scheme is a central component of these methods. In this paper, a light-weight iEKF-based LiDAR-inertial odometry system is presented, which uti...
-
3D human reconstruction from RGB images achieves decent results in good weather conditions but degrades dramatically in rough weather. Complementary, mmWave radars have been employed to reconstruct 3D human joints and meshes in rough weather. However, combining RGB and mmWave signals for robust all-weather 3D human reconstruction is still an open challenge, given the sparse nature of mmWave and th...
-
Building 3D perception systems for autonomous vehicles that do not rely on high-density LiDAR is a critical research problem because of the expense of LiDAR systems compared to cameras and other sensors. Recent research has developed a variety of camera-only methods, where features are differentiably “lifted” from the multi-camera images onto the 2D ground plane, yielding a “bird's eye view” (BEV)...
-
Multi-view radar-camera fused 3D object detection provides a farther detection range and more helpful features for autonomous driving, especially under adverse weather. The current radar-camera fusion methods deliver kinds of designs to fuse radar information with camera data. However, these fusion approaches usually adopt the straightforward concatenation operation between multi-modal features, w...
-
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system. Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with camera features. However, the camera-to-LiDAR projection throws away the semantic density of camera features, hindering the effectiveness of such methods, especially for semantic-oriented tasks (such as 3D scene segmentati...
-
This work proposes a first-of-its-kind SLAM architecture fusing an event-based camera and a Frequency Modulated Continuous Wave (FMCW) radar for drone navigation. Each sensor is processed by a bio-inspired Spiking Neural Network (SNN) with continual Spike-Timing-Dependent Plasticity (STDP) learning, as observed in the brain. In contrast to most learning-based SLAM systems, our method does not requ...
-
The capability to extract task specific, semantic information from raw sensory data is a crucial requirement for many applications of mobile robotics. Autonomous inspection of critical infrastructure with Unmanned Aerial Vehicles (UAVs), for example, requires precise navigation relative to the structure that is to be inspected. Recently, Artificial Intelligence (AI)-based methods have been shown t...
-
In the practical application of point cloud completion tasks, real data quality is usually much worse than the CAD datasets used for training. A small amount of noisy data will usually significantly impact the overall system's accuracy. In this paper, we propose a quality evaluation network to score the point clouds and help judge the quality of the point cloud before applying the completion model...
-
Room layout estimation is a long-existing robotic vision task that benefits both environment sensing and motion planning. However, layout estimation using point clouds (PCs) still suffers from data scarcity due to annotation difficulty. As such, we address the semi-supervised setting of this task based upon the idea of model exponential moving averaging. But adapting this scheme to the state-of-th...
-
This paper presents an effective few-shot point cloud semantic segmentation approach for real-world applications. Existing few-shot segmentation methods on point cloud heavily rely on the fully-supervised pretrain with large annotated datasets, which causes the learned feature extraction bias to those pretrained classes. However, as the purpose of few-shot learning is to handle unknown/unseen clas...
-
In robotic applications, we often obtain tons of 3D point cloud data without color information, and it is difficult to visualize point clouds in a meaningful and colorful way. Can we colorize 3D point clouds for better visualization? Existing deep learning-based colorization methods usually only take simple 3D objects as input, and their performance for complex scenes with multiple objects is limi...
-
Point cloud registration is a crucial technology in the fields of robotics and computer vision. Despite the significant advances in point cloud registration enabled by Transformer-based methods, limitations persist due to indistinct feature extraction, noise sensitivity, and outlier handling. These limitations stem from three factors: (1) the inefficiency of convolutional neural networks (CNNs) to...
-
Point cloud registration is a crucial problem in computer vision and robotics. Existing methods either rely on matching local geometric features, which are sensitive to the pose differences, or leverage global shapes, which leads to inconsistency when facing distribution variances such as partial overlapping. Combining the advantages of both types of methods, we adopt a coarse-to-fine pipeline tha...
-
Stereo matching can be used to estimate dense but inaccurate depth information for each pixel of a camera image. A LiDAR can provide accurate but sparse depth measurements. The fusion of both can combine their advantages. We propose an efficient method for fusing stereo and LiDAR at the cost level of Semi-Global Matching. It significantly improves density and accuracy of the estimated disparities ...
-
This paper presents Segregator, a global point cloud registration framework that exploits both semantic information and geometric distribution to efficiently build up outlier-robust correspondences and search for inliers. Current state-of-the-art algorithms rely on point features to set up putative correspondences and refine them by employing pair-wise distance consistency checks. However, such a ...
-
Most existing methods for category-level pose estimation rely on object point clouds. However, when considering transparent objects, depth cameras are usually not able to capture high-quality data, resulting in point clouds with severe artifacts. Without a complete point cloud, existing methods are not applicable to challenging transparent objects. To tackle this problem, we present StereoPose, a ...
-
Knowing the relative rotation angle improves relative pose estimation accuracy. We consider the problem of computing relative motion from a non-minimal number of correspondences with a known relative rotation angle. While several solvers for minimum correspondences have been proposed, no non-minimal solver for this problem currently exists. In this work, we propose two non-minimal solvers for this...
-
6-DoF pose estimation is an essential component of robotic manipulation pipelines. However, it usually suffers from a lack of generalization to new instances and object types. Most widely used methods learn to infer the object pose in a discriminative setup where the model filters useful information to infer the exact pose of the object. While such methods offer accurate poses, the model does not ...
-
In this paper, we propose a novel RGBD-based object 6DoF pose estimation network - RFFCE. It is a two-stage method that firstly leverages deep neural networks for feature extraction and object points matching, and then the geometric principles are utilized for final pose computation. Our approach consists of three primary innovations: residual feature fusion for representative RGBD feature extract...
-
Robotic manipulation, in particular in-hand object manipulation, often requires an accurate estimate of the object's 6D pose. To improve the accuracy of the estimated pose, state-of-the-art approaches in 6D object pose estimation use observational data from one or more modalities, e.g., RGB images, depth, and tactile readings. However, existing approaches make limited use of the underlying geometr...
-
We propose an interactive approach for 3D instance segmentation, where users can iteratively collaborate with a deep learning model to segment objects directly in a 3D point cloud. Current methods for 3D instance segmentation are generally trained in a fully-supervised fashion, which requires large amounts of costly training labels, and does not generalize well to classes unseen during training. F...
-
Category-level 6D pose and size estimation is to estimate the rotation, translation and size of the observed instance objects from an arbitrary angle in a cluttered scene. Compared with instance-level 6D pose estimation, there are two main challenges for category-level 6D pose estimation. One is that the algorithm needs to estimate the 6D pose and size of unseen objects, and no 3D models are avail...
-
6D pose estimation of textureless objects is a valuable but challenging task for many robotic applications. In this work, we propose a framework to address this challenge using only RGB images acquired from multiple viewpoints. The core idea of our approach is to decouple 6D pose estimation into a sequential two-step process, first estimating the 3D translation and then the 3D rotation of each obj...
-
The deployment of Reinforcement Learning to robotics applications faces the difficulty of reward engineering. Therefore, approaches have focused on creating reward functions by Learning from Observations (LfO) which is the task of learning policies from expert trajectories that only contain state sequences. We propose new methods for LfO for the important class of continuous control problems of le...
-
Preference-based reinforcement learning (PbRL) can enable robots to learn to perform tasks based on an individual's preferences without requiring a hand-crafted re-ward function. However, existing approaches either assume access to a high-fidelity simulator or analytic model or take a model-free approach that requires extensive, possibly unsafe online environment interactions. In this paper, we st...
-
Simulation is the key to scaling up validation and verification for robotic systems such as autonomous vehicles. Despite advances in high-fidelity physics and sensor simulation, a critical gap remains in simulating realistic behaviors of road users. This is because devising first principle models for human-like behaviors is generally infeasible. In this work, we take a data-driven approach to gene...
-
Recently, various successful applications utilizing expert states in imitation learning (IL) have been witnessed. However, IL from visual inputs (ILfVI), which has a greater promise to be widely applied by using online visual resources, suffers from low data-efficiency and poor performance resulted from on-policy learning and high-dimensional visual inputs. We propose OPIfVI (Off-Policy Imitation ...
-
Learning diverse skills is one of the main challenges in robotics. To this end, imitation learning approaches have achieved impressive results. These methods require explicitly labeled datasets or assume consistent skill execution to enable learning and active control of individual behaviors, which limits their applicability. In this work, we propose a cooperative adversarial method for obtaining ...
-
Learning skills by imitation is a promising concept for the intuitive teaching of robots. A common way to learn such skills is to learn a parametric model by maximizing the likelihood given the demonstrations. Yet, human demonstrations are often multi-modal, i.e., the same task is solved in multiple ways which is a major challenge for most imitation learning methods that are based on such a maximu...
-
This paper proposes a novel autonomous dynamic system (ADS) based controller for trajectory learning from demonstration (LfD). We call our method Learning Stable Dynamics via Iterative Quadratic Programming (LSD-IQP). LSD-IQP learns an energy function and an ADS from demonstrations via semi-infinite quadratic programming. Energy function constraints are imposed on the learned ADS to ensure converg...
-
Motion prediction for automated vehicles in complex environments is a difficult task that is to be mastered when automated vehicles are to be used in arbitrary situations. Many factors influence the future motion of traffic participants starting with traffic rules and reaching from the interaction between each other to personal habits of human drivers. Therefore, we present a novel approach for a ...
-
Visual imitation learning provides an effective framework to learn skills from demonstrations. However, the quality of the provided demonstrations usually significantly affects the ability of an agent to acquire desired skills. Therefore, the standard visual imitation learning assumes near-optimal demonstrations, which are expensive or sometimes prohibitive to collect. Previous works propose to le...
-
Motion forecasting for autonomous driving is a challenging task because complex driving scenarios involve a heterogeneous mix of static and dynamic inputs. It is an open problem how best to represent and fuse information about road geometry, lane connectivity, time-varying traffic light state, and history of a dynamic set of agents and their interactions into an effective encoding. To model this d...
-
Over the last two decades, the robotics community witnessed the emergence of various motion representations that have been used extensively, particularly in behavorial cloning, to compactly encode and generalize skills. Among these, probabilistic approaches have earned a relevant place, owing to their encoding of variations, correlations and adaptability to new task conditions. Modulating such pri...
-
Model generalization of the underlying dynamics is critical for achieving data efficiency when learning for robot control. This paper proposes a novel approach for learning dynamics leveraging the symmetry in the underlying robotic system, which allows for robust extrapolation from fewer samples. Existing frameworks that represent all data in vector space fail to consider the structured informatio...
-
Deep reinforcement learning (DRL) is one of the most powerful tools for synthesizing complex robotic behaviors. But training DRL models is incredibly compute and memory intensive, requiring large training datasets and replay buffers to achieve performant results. This poses a challenge for the next generation of field robots that will need to learn on the edge to adapt to their environment. In thi...
-
Robot data collected in complex real-world scenarios are often biased due to safety concerns, human preferences, and mission or platform constraints. Consequently, robot learning from such observational data poses great challenges for accurate parameter estimation. We propose a principled causal inference framework for robots to learn the parameters of a stochastic motion model using observational...
-
In this article, we propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC), designing an agent completely built from predictive processing circuits that facilitate dynamic, online learning from sparse rewards, embodying the principles of planning-as-inference. Concretely, we craft an adaptive agent system, wh...
-
Finding Nash equilibrial policies for two-player differential games requires solving Hamilton-Jacobi-Isaacs (HJI) PDEs. Self-supervised learning has been used to approximate solutions of such PDEs while circumventing the curse of dimensionality. However, this method fails to learn discontinuous PDE solutions due to its sampling nature, leading to poor safety performance of the resulting controller...
-
Motivated by the intuitive understanding humans have about the space of possible interactions, and the ease with which they can generalize this understanding to previously unseen scenes, we develop an approach for learning ‘visual affordances’. Given an input image of a scene, we infer a distribution over plausible future states that can be achieved via interactions with it. To allow predicting di...
-
The need for rapid and reliable robot deployment is on the rise. Imitation Learning (IL) has become popular for producing motion planning policies from a set of demonstrations. However, many methods in IL are not guaranteed to produce stable policies. The generated policy may not converge to the robot target, reducing reliability, and may collide with its environment, reducing the safety of the sy...
-
Robotic agents that operate autonomously in the real world need to continuously explore their environment and learn from the data collected, with minimal human supervision. While it is possible to build agents that can learn in such a manner without supervision, current methods struggle to scale to the real world. Thus, we propose ALAN, an autonomously exploring robotic agent, that can perform tas...
-
The capabilities of a robot will be increased significantly by exploiting throwing behavior. In particular, throwing will enable robots to rapidly place the object into the target basket, located outside its feasible kinematic space, without traveling to the desired location. In previous approaches, the robot often learned a parameterized throwing kernel through analytical approaches, imitation le...
-
To approach the level of advanced human players in table tennis with robots, generating varied ball trajectories in a reproducible and controlled manner is essential. Current ball launchers used in robot table tennis either do not provide an interface for automatic control or are limited in their capabilities to adapt speed, direction, and spin of the ball. For these reasons, we present AIMY, a th...
-
This paper proposes an integration of surrogate modeling and topology to significantly reduce the amount of data required to describe the underlying global dynamics of robot controllers, including closed-box ones. A Gaussian Process (GP), trained with randomized short trajectories over the state-space, acts as a surrogate model for the underlying dynamical system. Then, a combinatorial representat...
-
To enable a mobile manipulator to effectively maneuver a cart, we derive a dynamic model for the cart that incorporates the nonholonomic constraints on its motion, and use this model to formulate an estimator for the cart's inertial parameters. By deriving the dynamic equations of the cart using a constrained Euler-Lagrange formulation, we are able to directly incorporate nonholonomic constraints ...
-
In this paper, we introduce Fourier-SOFT 2D (FS2D) as a new robust registration method. FS2D operates in the frequency domain where it exploits the well-known decoupling of rotation and translation. The challenging part of determining the rotation parameter is solved here based on a projection of the Fourier magnitude on a sphere and the SO(3) Fourier Transform (SOFT). The underlying use case is u...
-
Autonomy, cooperation and data fusion can increase the performance of robotic networks in many underwater applications. In this paper, we describe a novel occupancy grid (OG) based perception layer, and its use for controlling a network of autonomous underwater vehicles (AUVs), sensorised with passive sonars. Data fusion between the robots' bearing-only measurements (typical of passive sonars) ena...
-
Successful applications of complex vision-based behaviours underwater have lagged behind progress in terrestrial and aerial domains. This is largely due to the degraded image quality resulting from the physical phenomena involved in underwater image formation. Spectrally-selective light attenuation drains some colors from underwater images while backscattering adds others, making it challenging to...
-
We designed and validated a novel simulator for efficient development of multi-robot marine missions. To accelerate development of cooperative behaviors, the simulator models the robots' operating conditions with moderately high fidelity and runs significantly faster than real time, including acoustic communications, dynamic environmental data, and high-resolution bathymetry in large worlds. The s...
-
Deep Reinforcement Learning Based Tracking Control of an Autonomous Surface Vessel in Natural Waters
Accurate control of autonomous marine robots still poses challenges due to the complex dynamics of the environment. In this paper, we propose a Deep Reinforcement Learning (DRL) approach to train a controller for autonomous surface vessel (ASV) trajectory tracking and compare its performance with an advanced nonlinear model predictive controller (NMPC) in real environments. Taking into account env...
-
In this paper, we present a fast monocular depth estimation method for enabling 3D perception capabilities of low-cost underwater robots. We formulate a novel end-to-end deep visual learning pipeline named UDepth, which incorporates domain knowledge of image formation characteristics of natural underwater scenes. First, we adapt a new input space from raw RGB image space by exploiting underwater l...
-
Autonomous Underwater Vehicles (AUVs) conduct regular visual surveys of marine environments to characterise and monitor the composition and diversity of the benthos. The use of machine learning classifiers for this task is limited by the low numbers of annotations available and the many fine-grained classes involved. In addition to these challenges, there are domain shifts between image sets acqui...
-
Simultaneous localization and mapping (SLAM) frameworks for autonomous navigation rely on robust data association to identify loop closures for back-end trajectory optimization. In the case of autonomous underwater vehicles (AUVs) equipped with multibeam echosounders (MBES), data association is particularly challenging due to the scarcity of identifiable landmarks in the seabed, the large drift in...
-
Autonomous operation in underwater environ-ments is, arguably, one of the most complex domains. It requires safe operations under the presence of unpredictable surge, currents, uncertainty, and dynamic obstacles that challenges to the highest degree real-time motion planning; the primary focus of this paper. Although previous work addressed the problem of safe real-time 3D navigation in cluttered ...
-
In this paper, we present the Diver Interest via Pointing (DIP) algorithm, a highly modular method for conveying a diver's area of interest to an autonomous underwater vehicle (AUV) using pointing gestures for underwater humanrobot collaborative tasks. DIP uses a single monocular camera and exploits human body pose, even with complete dive gear, to extract underwater human pointing gesture poses a...
-
We explore the use of uncertainty estimation in the maritime domain, showing the efficacy on toy datasets (CIFAR10) and proving it on an in-house dataset, SHIPS. We present a method joining the intra-class uncertainty achieved using Monte Carlo Dropout, with recent discoveries in the field of outlier detection, to gain more holistic uncertainty measures. We explore the relationship between the int...
-
This paper presents an adaptive heading approach for perception awareness during trajectory following. By adapting the heading of a robot to improve the feature tracking in the current mapped environment, the accuracy in localisation can be improved. This can have a significant advantage for autonomous operations in GPS-denied environments such as subsea or in caves. The aim of the proposed approa...
-
Fast and safe manipulation of flexible objects with a robot manipulator necessitates measures to cope with vibrations. Existing approaches either increase the task execution time or require complex models and/or additional instrumentation to measure vibrations. This paper develops a model-based method that overcomes these limitations. It relies on a simple pendulum-like model for modeling the beam...
-
We propose a manifold optimization approach for solving constrained inference and planning problems. The approach employs a framework that transforms an arbitrary nonlinear equality constrained optimization problem into an unconstrained manifold optimization problem. The core of the transformation process is the formulation of constraint manifolds that represent sets of variables subject to equali...
-
We generalize the derivation of model predictive path integral control (MPPI) to allow for a single joint distribution across controls in the control sequence. This reformation allows for the implementation of adaptive importance sampling (AIS) algorithms into the original importance sampling step while still maintaining the benefits of MPPI such as working with arbitrary system dynamics and cost ...
-
Game theoretic methods have become popular for planning and prediction in situations involving rich multi-agent interactions. However, these methods often assume the existence of a single local Nash equilibria and are hence unable to handle uncertainty in the intentions of different agents. While maximum entropy (MaxEnt) dynamic games try to address this issue, practical approaches solve for MaxEn...
-
It is often necessary for drones to complete delivery, photography, and rescue in the shortest time to increase efficiency. Many autonomous drone races provide platforms to pursue algorithms to finish races as quickly as possible for the above purpose. Unfortunately, existing methods often fail to keep training and racing time short in drone racing competitions. This motivates us to develop a high...
-
We present a novel Curvature-Aware Model Pre-dictive Contouring Control (CA-MPCC) formulation for mobile robotics motion planning. Our method aims at generalizing the traditional contouring control formulation derived from machining to autonomous driving applications. The proposed controller is able of handling sharp curvatures in the reference path while subject to non-linear constraints, such as...
-
A Sequential Quadratic Programming Approach to the Solution of Open-Loop Generalized Nash Equilibria
In this work, we propose a numerical method for the solution of local generalized Nash equilibria (GNE) for the class of open-loop general-sum dynamic games for agents with nonlinear dynamics and constraints. In particular, we formulate a sequential quadratic programming (SQP) approach which requires only the solution of a single convex quadratic program at each iteration and is locally convergent...
-
Nonlinear model predictive control often involves nonconvex optimization for which real-time control systems require fast and numerically stable solutions. This work proposes RPGD, a Resampling Parallel Gradient Descent optimizer designed to exploit small-batch parallelism of modern hardware like neural accelerators or multithreaded microcontrollers. After initialization, it continuously maintains...
-
Safety is one of the main challenges when applying learning-based motion controllers to practical robotic systems, especially when the dynamics of the robots and their surrounding dynamic environments are unknown. This issue is further exacerbated when the learned information is unreliable and inaccurate. In this paper, we aim to enhance the safety of learning-enabled mobile robots in dynamic envi...
-
Large-scale UAV switching formation tracking control has been widely applied in many fields such as search and rescue, cooperative transportation, and UAV light shows. In order to optimize the control performance and reduce the computational burden of the system, this study proposes an event-triggered optimal formation tracking controller for discrete-time large-scale UAV systems (UASs). And an op...
-
Collision detection is an important component of many robotics applications, from robot control to simulation, including motion planning and estimation. While the seminal works on the topic date back to the 80s, it is only recently that the question of properly differentiating collision detection has emerged as a central issue, thanks notably to the ongoing and various efforts made by the scientif...
-
Combination of optimal control methods and machine learning approaches allows to profit from complementary benefits of each field in control of robotic systems. Data from optimal trajectories provides valuable information that can be used to learn a near-optimal state-dependent feedback control policy. To obtain high-quality learning data, careful selection of optimal trajectories, determined by a...
-
Accurate self and relative state estimation are the critical preconditions for completing swarm tasks, e.g., collaborative autonomous exploration, target tracking, search and rescue. This paper proposes Swarm-LIO: a fully decentralized state estimation method for aerial swarm systems, in which each drone performs precise ego-state estimation, exchanges ego-state and mutual observation information ...
-
With autonomous aerial vehicles enacting safety-critical missions, such as the Mars Science Laboratory Curiosity rover's landing on Mars, the tasks of automatically identifying and reasoning about potentially hazardous landing sites is paramount. This paper presents a coupled perception-planning solution which addresses the hazard detection, optimal landing trajectory generation, and contingency p...
-
In this paper, we present a swarm of Crazyflie nano-drones. The swarm can show various collective behaviors: Flocking, gradient following, going to a chosen point, formation, and scattered search of the environment. The methodology behind the behaviors is executed entirely on-board. Crazyflies use a common radio channel to share positions with each other. If desired, an operator can use the same c...
-
Visual object tracking is a very important task for unmanned aerial vehicle (UAV). Limited resources of UAV lead to strong demand for efficient and robust trackers. In recent years, deep learning-based trackers, especially, siamese trackers achieve very impressive results. Though siamese trackers can run a relatively fast speed on the high-end GPU, they are becoming heavier and heavier which restr...
-
Traditional approaches for active mapping focus on building geometric maps. For most real-world applications, however, actionable information is related to semantically meaningful objects in the environment. We propose an approach to the active metric-semantic mapping problem that enables multiple heterogeneous robots to collaboratively build a map of the environment. The robots actively explore t...
-
Multi-agent pursuit-evasion tasks involving intelligent targets are notoriously challenging coordination problems. In this paper, we investigate new ways to learn such coordinated behaviors of unmanned aerial vehicles (UAVs) aimed at keeping track of multiple evasive targets. Within a Multi-Agent Reinforcement Learning (MARL) framework, we specifically propose a variant of the Multi-Agent Deep Det...
-
This paper implements a vision-based moving target tracking system of quadrotors with visual-inertial localization in GNSS-denied indoor environments. We use the visual-inertial odometry to estimate the states of the UAV by minimizing visual and inertial residuals, and estimate the states of the target with extended Kalman Filter from visual detection. This research formulates the target tracking ...
-
The use of Micro Aerial Vehicles (MAVs) for inspection and surveillance missions has proved to be extremely useful, however, their usability is negatively impacted by the large power requirements and the limited operating time. This work describes the design and development of a novel hybrid aerial-ground vehicle, enabling multi-modal mobility and long operating time, suitable for long-endurance i...
-
Autonomously recharging UAVs from existing infrastructure has enormous potential for various applications, such as infrastructure inspection, surveillance, and search and rescue. While it is an active area of research, most related work focuses on alternating current (AC) infrastructure while very little work has been done on investigating the potential of recharging UAVs from direct current (DC) ...
-
Unmanned aerial vehicles assist in maritime search and rescue missions by flying over large search areas to autonomously search for objects or people. Reliably detecting objects of interest requires fast models to employ on embedded hardware. Moreover, with increasing distance to the ground station only part of the video data can be transmitted. In this work, we consider the problem of finding mea...
-
We propose TRADE for robust tracking and 3D localization of a moving target in complex environments, from UAVs equipped with a single camera. Ultimately TRADE enables 3d-aware target following. Tracking-by-detection approaches are vulnerable to target switching, especially between similar objects. Thus, TRADE predicts and incorporates the target 3D trajectory to select the right target from the tr...
-
In this paper, we present a LiDAR Inertial Odometry (LIO) algorithm utilizing adaptive keyframe generation which achieves fast and accurate state estimation for aerial and ground robots. It is known that keyframe generation significantly affects the performance of Simultaneous Localization and Mapping (SLAM) algorithms. Unlike existing SLAM algorithms that generate keyframes based on fixed conditi...
-
Exploration of unknown space with an autonomous mobile robot is a well-studied problem. In this work we broaden the scope of exploration, moving beyond the pure geometric goal of uncovering as much free space as possible. We believe that for many practical applications, exploration should be contextualised with semantic and object-level understanding of the environment for task-specific exploratio...
-
In this work, we study vulnerability of unmanned aerial vehicles (UAVs) to stealthy attacks on perception-based control. To guide our analysis, we consider two specific missions: ($i$) ground vehicle tracking (GVT), and (ii) vertical take-off and landing (VTOL) of a quadcopter on a moving ground vehicle. Specifically, we introduce a method to consistently attack both the sensors measurements and c...
-
Vision-based object tracking has boosted extensive autonomous applications for unmanned aerial vehicles (UAVs). However, the dynamic changes in flight maneuver and viewpoint encountered in UAV tracking pose significant difficulties, e.g., aspect ratio change, and scale variation. The conventional cross-correlation operation, while commonly used, has limitations in effectively capturing perceptual ...
-
This paper contributes a novel strategy for semantics-aware autonomous exploration and inspection path planning. Attuned to the fact that environments that need to be explored often involve a sparse set of semantic entities of particular interest, the proposed method offers volumetric exploration combined with two new planning behaviors that together ensure that a complete mesh model is reconstruc...
-
Inverted landing in a rapid and robust manner is a challenging feat for aerial robots, especially while depending entirely on onboard sensing and computation. In spite of this, this feat is routinely performed by biological fliers such as bats, flies, and bees. Our previous work has identified a direct causal connection between a series of onboard visual cues and kinematic actions that allow for r...
-
Aerial insects demonstrate fast and precise heading control when they perform body saccades and rapid escape maneuvers. While insect-scale micro-aerial-vehicles (IMAVs) have demonstrated early results on heading control, their flight endurance and heading angle tracking accuracy remain far inferior to that of natural fliers. In this work, we present a long endurance sub-gram aerial robot that can ...
-
Accurate and agile trajectory tracking in sub-gram Micro Aerial Vehicles (MAVs) is challenging, as the small scale of the robot induces large model uncertainties, demanding robust feedback controllers, while the fast dynamics and computational constraints prevent the deployment of computationally expensive strategies. In this work, we present an approach for agile and computationally efficient tra...
-
Flapping wing micro aerial vehicles face challenges in sensing and reacting to disturbances like wind gusts. This work introduces a new microscale bio-inspired digital strain sensor to detect these perturbations. The sensor is designed to change logic states when a specified strain threshold has been reached. The sensors are 3D printed on a flexible Mylar wing using two-photon polymerization. Thre...
-
Flight is an energetically expensive task. While aerial insects can effortlessly fly through natural environments, achieving power autonomous flights in insect-scale robots remains a major challenge. In prior works, we developed soft-actuated insect-scale aerial robots that demonstrated unique capabilities such as in-flight collision recovery and somersaults. However, the soft dielectric elastomer...
-
Hovering hummingbirds have inspired small flapping-wing aerial robots. Natural flyers, including hummingbirds and bats, undergo torsional wing deformation during flapping flight owing to complex wing structure, while previous artificial wings were relatively simple and difficult to design the torsional flexibility. In this paper, we proposed a hummingbird-bat hybrid (HBH) wing in which torsional f...
-
Precise relative localization is a crucial functional block for swarm robotics. This work presents a novel au-tonomous end-to-end system that addresses the monocular relative localization, through deep neural networks (DNNs), of two peer nano-drones, i.e., sub-40g of weight and sub-100mW processing power. To cope with the ultra-constrained nano-drone platform, we propose a vertically-integrated fr...
-
We present a new framework for implementing real-time embedded safety-critical controllers which utilizes hybrid computing to address the issue of limited computational resources, a problem that is particularly prevalent in microrobotics. In our approach, the nominal stabilizing control algorithm is implemented digitally while the safety-critical quadratic program is solved via a dedicated analog ...
-
Centralized approaches for multi-robot coverage planning problems suffer from the lack of scalability. Learning-based distributed algorithms provide a scalable avenue in addition to bringing data-oriented feature generation capabilities to the table, allowing integration with other learning-based approaches. To this end, we present a learning-based, differentiable distributed coverage planner (D2C...
-
Conflict-Based Search is one of the most popular methods for multi-agent path finding. Though it is complete and optimal, it does not scale well. Recent works have been proposed to accelerate it by introducing various heuristics. However, whether these heuristics can apply to non-grid-based problem settings while maintaining their effectiveness remains an open question. In this work, we find that ...
-
Traditional approaches to the design of multiagent navigation algorithms consider the environment as a fixed constraint, despite the obvious influence of spatial constraints on agents' performance. Yet hand-designing improved environment layouts and structures is inefficient and potentially expensive. The goal of this paper is to consider the environment as a decision variable in a system-level op...
-
We consider a team of heterogeneous robots, each equipped with various types and quantities of resources, and tasked with supplying these resources to multiple areas of demand. We propose a Voronoi-based coverage control approach to deploy robots to areas of demand by defining a position- and time-varying density function to represent the quality at which demand is being met in the environment. Th...
-
Real-time sequential decision making under uncertainty is a challenging task for autonomous robots. Such problems are even more challenging when making decisions involving heterogeneous teams of robots completing multiple tasks. Deploying autonomous taxi cabs and utilizing drones for package delivery represent relevant examples of these types of problems. In this paper, we present an effective sol...
-
In this paper, we study a navigation problem where a mobile robot needs to locate a mmWave wireless signal. Using the directionality properties of the signal, we propose an estimation and path planning algorithm that can efficiently navigate in cluttered indoor environments. We formulate Extended Kalman filters for emitter location estimation in cases where the signal is received in line-of-sight ...
-
This paper addresses the task of joint multi-agent perception and planning, especially as it relates to the real-world challenge of collision-free navigation for connected self-driving vehicles. For this task, several communication-enabled vehicles must navigate through a busy intersection while avoiding collisions with each other and with obstacles. To this end, this paper proposes a learnable co...
-
Distributed Potential iLQR: Scalable Game-Theoretic Trajectory Planning for Multi-Agent Interactions
In this work, we develop a scalable, local tra-jectory optimization algorithm that enables robots to interact with other robots. It has been shown that agents' interactions can be successfully captured in game-theoretic formulations, where the interaction outcome can be best modeled via the equilibria of the underlying dynamic game. However, it is typically challenging to compute equilibria of dyn...
-
This article presents a 3D point cloud map-merging framework for egocentric heterogeneous multi-robot exploration, based on overlap detection and alignment, that is independent of a manual initial guess or prior knowledge of the robots' poses. The novel proposed solution utilizes state-of-the-art place recognition learned descriptors, that through the framework's main pipeline, offer a fast and ro...
-
This paper presents a solution to the automatic task planning problem for multi-agent systems. A formal frame-work is developed based on Nondeterministic Finite Automata with ∊-transitions, where given the capabilities, constraints and failure modes of the agents involved, any initial state of the system and a task specification, an optimal solution is generated that satisfies the system constrain...
-
This paper addresses the Multi-Robot Active In-formation Acquisition (AIA) problem, where a team of mobile robots, communicating through an underlying graph, estimates a hidden state expressing a phenomenon of interest. Applications like target tracking, coverage and SLAM can be expressed in this framework. Existing approaches, though, are either not scalable, unable to handle dynamic phenomena or...
-
Patrolling with multiple robots is a challenging task. While the robots collaboratively and repeatedly cover the regions of interest in the environment, their routes should satisfy two often conflicting properties: i) (efficiency) the time intervals between two consecutive visits to the regions are small; ii) (unpredictability) the patrolling trajectories are random and unpredictable. We manage to...
-
It is known that autonomous cars can increase road capacities by maintaining a smaller headway through vehicle platooning. Recent works have shown that these capacity increases can influence vehicles' route choices in unexpected ways similar to the well-known Braess's paradox, such that the network congestion might increase. In this paper, we propose that in mixed-autonomy networks, i.e., networks...
-
While there have been advancements in autonomous driving control and traffic simulation, there have been little to no works exploring their unification with deep learning. Works in both areas seem to focus on entirely different exclusive problems, yet traffic and driving are inherently related in the real world. In this paper, we present Traffic-Aware Autonomous Driving (TrAAD), a generalizable di...
-
We derive a learning framework to generate routing/pickup policies for a fleet of autonomous vehicles tasked with servicing stochastically appearing requests on a city map. We focus on policies that 1) give rise to coordination amongst the vehicles, thereby reducing wait times for servicing requests, 2) are non-myopic, and consider a-priori potential future requests, 3) can adapt to changes in the...
-
To achieve safe cooperative driving in mixed traffic of manned and unmanned vehicles, it is necessary to understand and model human drivers' driving behaviors. This paper proposed a Hidden Markov Model (HMM)-based method to analyze human driver's control and vehicle's dynamics; and then recognize the human driver's action, such as accelerating, braking, and changing lanes. With the knowledge of th...
-
The prediction of surrounding agents' motion is a key for safe autonomous driving. In this paper, we explore navigation maps as an alternative to the predominant High Definition (HD) maps for learning-based motion prediction. Navigation maps provide topological and geometrical information on road-level, HD maps additionally have centimeter-accurate lane-level information. As a result, HD maps are ...
-
Most current LiDAR simultaneous localization and mapping (SLAM) systems build maps in point clouds, which are sparse when zoomed in, even though they seem dense to human eyes. Dense maps are essential for robotic applications, such as map-based navigation. Due to the low memory cost, mesh has become an attractive dense model for mapping in recent years. However, existing methods usually produce me...
-
With the fast development of autonomous driving technologies, there is an increasing demand for high-definition (HD) maps, which provide reliable and robust prior information about the static part of the traffic environments. As one of the important elements in HD maps, road lane centerline is critical for downstream tasks, such as prediction and planning. Manually annotating centerlines for road ...
-
Controllable and realistic traffic simulation is critical for developing and verifying autonomous vehicles. Typical heuristic-based traffic models offer flexible control to make vehicles follow specific trajectories and traffic rules. On the other hand, data-driven approaches generate realistic and human-like behaviors, improving transfer from simulated to real-world traffic. However, to the best ...
-
Diverse and realistic traffic scenarios are crucial for evaluating the AI safety of autonomous driving systems in simulation. This work introduces a data-driven method called TrafficGen for traffic scenario generation. It learns from the fragmented human driving data collected in the real world and then generates realistic traffic scenarios. TrafficGen is an autoregressive neural generative model ...
-
Intelligent intersection managers can improve safety by detecting dangerous drivers or failure modes in autonomous vehicles, warning oncoming vehicles as they approach an intersection. In this work, we present FailureNet, a recurrent neural network trained end-to-end on trajectories of both nominal and reckless drivers in a scaled miniature city. FailureNet observes the poses of vehicles as they a...
-
Recent advancements in Vehicle-to-Everything communication technology have enabled autonomous vehicles to share sensory information to obtain better perception performance. With the rapid growth of autonomous vehicles and intelligent infrastructure, the V2X perception systems will soon be deployed at scale, which raises a safety-critical question: how can we evaluate and improve its performance un...
-
Existing spatial localization techniques for au-tonomous vehicles mostly use a pre-built 3D-HD map, often constructed using a survey-grade 3D mapping vehicle, which is not only expensive but also laborious. This paper shows that by using an off-the-shelf high-definition satellite image as a ready-to-use map, we are able to achieve cross-view vehicle localization up to a satisfactory accuracy, prov...
-
Dubins coverage has been extensively researched to address the coverage path planning (CPP) problem of a known environment for the curvature-constrained robot. However, its fixed-speed assumption prevents the robot from accelerating to reduce the time and limits its flexibility to avoid obstacles. Therefore, this paper presents a collision-free CPP approach (CFC) for the obstacle-constrained envir...
-
We introduce a new route-finding problem which considers perception and travel costs simultaneously. Specifically, we consider the problem of finding the shortest tour such that all objects of interest can be detected successfully. To represent a viable detection region for each object, we propose to use an entropy-based viewing score that generates a diameter-bounded region as a viewing neighborh...
-
We study the problem of allocating many mobile robots for the execution of a pre-defined sweep schedule in a known two-dimensional environment, with applications toward search and rescue, coverage, surveillance, monitoring, pursuit-evasion, and so on. The mobile robots (or agents) are assumed to have one-dimensional sensing capability with probabilistic guarantees that deteriorate as the sensing d...
-
Collision evaluation is of essential importance in various applications. However, existing methods are either cumbersome to calculate or not exact. Therefore, considering the cost of implementation, most whole-body planning works, which require evaluating collision between robots and environments, struggle to tradeoff between accuracy and computationally efficiency. In this paper, we propose a zer...
-
Balancing safety and efficiency when planning in crowded scenarios with uncertain dynamics is challenging where it is imperative to accomplish the robot's mission without incurring any safety violations. Typically, chance constraints are incorporated into the planning problem to provide probabilistic safety guarantees by imposing an upper bound on the collision probability of the planned trajector...
-
In this paper, we look into the minimum obstacle displacement (MOD) planning problem from a mobile robot motion planning perspective. This problem finds an optimal path to goal by displacing movable obstacles when no path exists due to collision with obstacles. However this problem is computationally expensive and grows exponentially in the size of number of movable obstacles. This work looks into...
-
Motion planning for mobile robot platforms is one of the long-established research fields in robotics. In this paper, we propose a trajectory planner for mobile holonomic robots to steer non-holonomic conventional passive wheelchairs in dynamic environments. The challenges to overcome when steering a wheelchair are to find smooth feasible trajectories, maintain a fast reactive response to dynamic ...
-
Safe path planning is critical for bipedal robots to operate in safety-critical environments. Common path planning algorithms, such as RRT or RRT*, typically use geometric or kinematic collision check algorithms to ensure collision-free paths toward the target position. However, such approaches may generate non-smooth paths that do not comply with the dynamics constraints of walking robots. It has...
-
We present a lightweight, decentralized algorithm for navigating multiple nonholonomic agents through challenging environments with narrow passages. Our key idea is to allow agents to yield to each other in large open areas instead of narrow passages, to increase the success rate of conventional decentralized algorithms. At pre-processing time, our method computes a medial axis for the freespace. ...
-
Collision detection between objects is critical for simulation, control, and learning for robotic systems. How-ever, existing collision detection routines are inherently non-differentiable, limiting their applications in gradient-based opti-mization tools. In this work, we propose DCOL: a fast and fully differentiable collision-detection framework that reasons about collisions between a set of com...
-
This paper investigates the problem of fixed-wing unmanned aerial vehicles (UAV s) motion planning with posture constraints and the problem of the more general symmetrical situations where UAVs have more than one optimal solution. In this paper, the posture constraints are formulated in the 3D Dubins method, and the symmetrical situations are overcome by a more collaborative strategy called the sh...
-
This paper presents an efficient and safe method to avoid static and dynamic obstacles based on LiDAR. First, point cloud is used to generate a real-time local grid map for obstacle detection. Then, obstacles are clustered by DBSCAN algorithm and enclosed with minimum bounding ellipses (MBEs). In addition, data association is conducted to match each MBE with the obstacle in the current frame. Cons...
-
Borrowing elementary ideas from solid mechanics and differential geometry, this presentation shows that the volume swept by a regular solid undergoing a wide class of volume-preserving deformations induces a rather natural metric structure with well-defined and computable geodesics on its configuration space. This general result applies to concrete classes of articulated objects such as robot mani...
-
Mobile manipulators have gained attention for the potential in performing large-scale tasks which are beyond the reach of fixed-base manipulators. The Robotic Task Sequencing Problem for mobile manipulators often requires optimizing the motion sequence of the robot to visit multiple targets while reducing the number of base placements. A two-step approach to this problem is clustering the task-spa...
-
Replanning in temporal logic tasks is extremely difficult during the online execution of robots. This study introduces an effective path planner that computes solutions for temporal logic goals and instantly adapts to non-static and partially unknown environments. Given prior knowledge and a task specification, the planner first identifies an initial feasible solution by growing a sampling-based s...
-
Many methods that solve robot planning problems, such as task and motion planners, employ discrete symbolic search to find sequences of valid symbolic actions that are grounded with motion planning. Much of the efficacy of these planners lies in this grounding-bad placement and grasp choices can lead to inefficient planning when a problem has many geometric constraints. Moreover, grounding methods...
-
This work proposes a robot task planning framework for retrieving a target object in a confined workspace among multiple stacked objects that obstruct the target. The robot can use prehensile picking and in-workspace placing actions. The method assumes access to 3D models for the visible objects in the scene. The key contribution is in achieving desirable properties, i.e., to provide (a) safety, b...
-
Representing and reasoning about uncertainty is crucial for autonomous agents acting in partially observable environments with noisy sensors. Partially observable Markov decision processes (POMDPs) serve as a general framework for representing problems in which uncertainty is an important factor. Online sample-based POMDP methods have emerged as efficient approaches to solving large POMDPs and hav...
-
A factored Nonlinear Program (Factored-NLP) explicitly models the dependencies between a set of continuous variables and nonlinear constraints, providing an expressive formulation for relevant robotics problems such as manipulation planning or simultaneous localization and mapping. When the problem is over-constrained or infeasible, a fundamental issue is to detect a minimal subset of variables an...
-
In Task and motion planning (TAMP), symbolic search is combined with continuous geometric planning. A task planner finds an action sequence while a motion planner checks its feasibility and plans the corresponding sequence of motions. However, due to the high combinatorial complexity of discrete search, the number of calls to the geometric planner can be very large. Previous works [1] [2] leverage...
-
PDDLStream solvers have recently emerged as viable solutions for Task and Motion Planning (TAMP) problems, extending PDDL to problems with continuous action spaces. Prior work has shown how PDDLStream problems can be reduced to a sequence of PDDL planning problems, which can then be solved using off-the-shelf planners. However, this approach can suffer from long runtimes. In this paper we propose ...
-
This paper presents a novel algorithm for robot task and motion planning (TAMP) problems by utilizing a reachability tree. While tree-based algorithms are known for their speed and simplicity in motion planning (MP), they are not well-suited for TAMP problems that involve both abstracted and geometrical state variables. To address this challenge, we propose a hierarchical sampling strategy, which ...
-
Dynamic movement primitives (DMPs) provide an effective method of learning manipulation skills from human demonstration. DMPs can be especially useful for imitating industrial manipulation tasks which are performed by humans and are difficult to model, for instance, deformable object manipulation. In this work the effectiveness of a conventional Cartesian space DMP is enhanced using a compact and ...
-
This work aims at leveraging instructional video to guide the solving of complex multi-contact task-and-motion planning tasks in robotics. Towards this goal, we propose an extension of the well-established Rapidly-Exploring Random Tree (RRT) planner, which simultaneously grows multiple trees around grasp and release states extracted from the guiding video. Our key novelty lies in combining contact...
-
Transparent object perception is a crucial skill for applications such as robot manipulation in household and laboratory settings. Existing methods utilize RGB-D or stereo inputs to handle a subset of perception tasks including depth and pose estimation. However transparent object perception remains to be an open problem. In this paper, we forgo the unreliable depth map from RGB-D sensors and exte...
-
To operate safely and efficiently alongside human workers, collaborative robots (cobots) require the ability to quickly understand the dynamics of manipulated objects. However, traditional methods for estimating the full set of inertial parameters rely on motions that are necessarily fast and unsafe (to achieve a sufficient signal-to-noise ratio). In this work, we take an alternative approach: by ...
-
This paper describes a novel tissue positioning system with an integrated suturing robot and demonstrates its ability to perform semi-automatic anastomoses of synthetic blood vessels. We began with a finite element analysis-based design consideration for achieving adequate grasping of blood vessels to demonstrate robust performance under expected clinical forces. We then conducted standardized pos...
-
Ultrasound is a vital imaging modality utilized for a variety of diagnostic and interventional procedures. However, an expert sonographer is required to make accurate maneuvers of the probe over the human body while making sense of the ultrasound images for diagnostic purposes. This procedure requires a substantial amount of training and up to a few years of experience. In this paper, we propose a...
-
Recent advancements toward perception and decision-making of flexible endoscopes have shown great potential in computer-aided surgical interventions. However, owing to modeling uncertainty and inter-patient anatomical variation in flexible endoscopy, the challenge remains for efficient and safe navigation in patient-specific scenarios. This paper presents a novel data-driven framework with self-co...
-
In recent years, a deep learning framework has been widely used for object pose estimation. While quaternion is a common choice for rotation representation, it cannot represent the ambiguity of the observation. In order to handle the ambiguity, the Bingham distribution is one promising solution. However, it requires complicated calculation when yielding the negative log-likelihood (NLL) loss. An a...
-
Trajectory prediction in a cluttered environment is key to many important robotics tasks such as autonomous navigation. However, there are an infinite number of possible trajectories to consider. To simplify the space of trajectories under consideration, we utilise homotopy classes to partition the space into countably many mathematically equivalent classes. All members within a class demonstrate ...
-
In this paper, we develop an approach that enables autonomous robots to build and compress semantic environment representations from point-cloud data. Our approach builds a three-dimensional, semantic tree representation of the environment from raw sensor data which is then compressed by a novel information-theoretic tree-pruning approach. The proposed approach is probabilistic and incorporates th...
-
Typical algorithms for point cloud registration such as Iterative Closest Point (ICP) require a favorable initial transform estimate between two point clouds in order to perform a successful registration. State-of-the-art methods for choosing this starting condition rely on stochastic sampling or global optimization techniques such as branch and bound. In this work, we present a new method based o...
-
Outdoor 3D object detection has played an essential role in the environment perception of autonomous driving. In complicated traffic situations, precise object recognition provides indispensable information for prediction and planning in the dynamic system, improving self-driving safety and reliability. However, with the vehicle's veering, the constant rotation of the surrounding scenario makes a ...
-
Detecting obstacles is crucial for safe and efficient autonomous driving. To this end, we present NVRadarNet, a deep neural network (DNN) that detects dynamic obstacles and drivable free space using automotive RADAR sensors. The network utilizes temporally accumulated data from multiple RADAR sensors to detect dynamic obstacles and compute their orientation in a top-down bird's-eye view (BEV). The...
-
Radar semantic segmentation is a challenging task in environmental understanding, due as the radar data is noisy and suffers measurement ambiguities, which could lead to poor feature learning. To better tackle such difficulties, we present a novel and high-performance Transformer-based Radar Semantic Segmentation method, named TransRSS, to effectively and efficiently feature extraction for radar s...
-
A domain shift exists between the distributions of large scale, outdoor lidar datasets due to being captured using different types of lidar sensors, in different locations, and under varying weather conditions. Inclement weather in particular affects the quality of lidar data, adding artifacts such as scattered and missed points, leading to a drop in performance of 3D object detection networks tra...
-
Affordances are a fundamental concept in robotics since they relate available actions for an agent depending on its sensory-motor capabilities and the environment. We present a novel Bayesian deep network to detect affordances in images, at the same time that we quantify the distribution of the aleatoric and epistemic variance at the spatial level. We adapt the Mask-RCNN architecture to learn a pr...
-
6D Object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. 6D pose estimation is particularly challenging in bin- picking applications, where many objects are low-feature and reflective, and self-occlusion between objects of the same type is common. We propose a novel multi-view approach leveraging known camera transformations from an eye-...
-
Path planning and localization in low-light and inclement weather conditions are critical problems facing autonomous vehicle systems. Our proposed method applies a single modality, millimetre-wave radar perception system for the detection of roadside retro-reflectors. Radar-based perception tasks can be challenging to perform due to the sparse and noisy nature of radar data. We propose the use of ...
-
Real-time vision in robotics plays an important role in localising and recognising objects. Recently, deep learning approaches have been widely used in robotic vision. However, most of these approaches have assumed that training and test sets come from similar data distributions, which is not valid in many real world applications. This study proposes an approach to address domain generalisation (i...
-
LiDAR detection of long-range vehicles is challenging because very few and sparse points are measured in long distances and vehicles with similar shapes of targets could lead to false positives easily. To tackle these challenges, taking the environment information (HD maps) into account could be beneficial to predetermine where targets are more or less likely to appear. Compared with semantic maps...
-
This work proposes a self-supervised learning system for segmenting rigid objects in RGB images. The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot. A key feature of the self-supervised training process is a graph-matching algorithm that operates on the over-segmentation output of the point cloud that is recon...
-
A key contributor to recent progress in 3D detection from single images is monocular depth estimation. Existing methods focus on how to leverage depth explicitly, by generating pseudo-pointclouds or providing attention cues for image features. More recent works leverage depth prediction as a pretraining task and fine-tune the depth representation while training it for 3D detection. However, the ad...
-
Visual classification under uncertainty is a complex computer vision problem. We present a thorough comparison of several variants of convolutional neural network (CNN) classification techniques in the context of ambiguous image data interpretation. We explore possible improvements in classification accuracy achieved by insertion of prior ambiguity information during the annotation process. This e...
-
Data augmentations are important for training high-performance 3D object detectors that use point clouds. Despite recent efforts on designing new data augmentations, perhaps surprisingly, most current state-of-the-art 3D detectors only rely on a few simple data augmentations. In particular, different from 2D image data augmentations, 3D data augmentations need to account for different representati...
-
Restoring an accurate Bird's Eye View (BEV) map plays a crucial role in the perception of autonomous driving. The existing works of lifting representations from frontal view to BEV can be classified into two categories, i.e., Camera model-Based Feature Transformation (CBFT) and Camera model-Free Feature Transformation (CFFT). We empirically analyze the significant differences between CBFT and CFFT...
-
The awareness about moving objects in the surroundings of a self-driving vehicle is essential for safe and reliable autonomous navigation. The interpretation of LiDAR and camera data achieves exceptional results but typically requires to accumulate and process temporal sequences of data in order to extract motion information. In contrast, radar sensors, which are already installed in most recent v...
-
3D lane detection is an integral part of au-tonomous driving systems. Previous CNN and Transformer-based methods usually first generate a bird's-eye-view (BEV) feature map from the front view image, and then use a sub-network with BEV feature map as input to predict 3D lanes. Such approaches require an explicit view transformation between BEV and front view, which itself is still a challenging pro...
-
Object recognition and instance segmentation are fundamental skills in any robotic or autonomous system. Existing state-of-the-art methods are often unable to capture meaningful uncertainty in challenging or ambiguous scenes, and as such can cause critical errors in high-performance applications. In this paper, we explore a class of distributional instance segmentation models using latent codes th...
-
Degraded visual environments have strong impacts on the quality of LiDAR data. Experiments in artificial fog conditions show that noise points caused by water particles present various distance distributions which depend on visibility. This article introduces a mathematical framework based on Bayesian inference and Markov Chain Monte-Carlo sampling to infer optical visibility from point clouds. Th...
-
Detecting objects under adverse weather and lighting conditions is crucial for the safe and continuous operation of an autonomous vehicle, and remains an unsolved problem. We present a Gated Differentiable Image Processing (GDIP) block, a domain-agnostic network architecture, which can be plugged into existing object detection networks (e.g., Yolo) and trained end-to-end with adverse condition ima...
-
Deep learning has led to great progress in the detection of mobile (i.e. movement-capable) objects in urban driving scenes in recent years. Supervised approaches typically require the annotation of large training sets; there has thus been great interest in leveraging weakly, semi- or self- supervised methods to avoid this, with much success. Whilst weakly and semi-supervised methods require some a...
-
This paper presents a new method for correspondence estimation between a previously known topology of a branched deformable linear object and an image representation from a 3D stereo camera. Although frequently encountered in production, robotic deformable linear object manipulation still lacks reliable sensor feedback. Especially for branched deformable linear objects, such as wire harnesses, cor...
-
While manipulating rigid objects is an extensively explored research topic, deformable linear object (DLO) manipulation seems significantly underdeveloped. A potential reason for this is the inherent difficulty in describing and observing the state of the DLO as its geometry changes during manipulation. This paper proposes an algorithm for fast-tracking the shape of a DLO based on the masked image...
-
State estimation is one of the greatest challenges for cloth manipulation due to cloth's high dimensionality and self-occlusion. Prior works propose to identify the full state of crumpled clothes by training a mesh reconstruction model in simulation. However, such models are prone to suffer from a sim-to-real gap due to differences between cloth simulation and the real world. In this work, we prop...
-
Learning to Estimate 3-D States of Deformable Linear Objects from Single-Frame Occluded Point Clouds
Accurately and robustly estimating the state of deformable linear objects (DLOs), such as ropes and wires, is crucial for DLO manipulation and other applications. However, it remains a challenging open issue due to the high dimensionality of the state space, frequent occlusions, and noises. This paper focuses on learning to robustly estimate the states of DLOs from single-frame point clouds in the...
-
Feature Extraction for Effective and Efficient Deep Reinforcement Learning on Real Robotic Platforms
Deep reinforcement learning (DRL) methods can solve complex continuous control tasks in simulated environments by taking actions based solely on state observations at each decision point. Because of the dynamics involved, individual snapshots of real-world sensor measurements afford only partial state observability, so it is typical to use a history of observations to improve training and policy p...
-
Safety is essential for deploying Deep Reinforcement Learning (DRL) algorithms in real-world scenarios. Recently, verification approaches have been proposed to allow quantifying the number of violations of a DRL policy over input-output relationships, called properties. However, such properties are hard-coded and require task-level knowledge, making their application intractable in challenging saf...
-
Active perception describes a broad class of techniques that couple planning and perception systems to move the robot in a way to give the robot more information about the environment. In most robotic systems, perception is typically independent of motion planning. For example, traditional object detection is passive: it operates only on the images it receives. However, we have a chance to improve...
-
Graph networks have recently been used for decision making in automated driving tasks for their ability to capture a variable number of traffic participants. Current high-level graph-based approaches, however, do not model the entire road network and thus must rely on handcrafted features for vehicle-to-vehicle edges encompassing the road topology indirectly. We propose an entity-relation framewor...
-
Learning-based methods in robotics hold the promise of generalization, but what can be done if a learned policy does not generalize to a new situation? In principle, if an agent can at least evaluate its own success (i.e., with a reward classifier that generalizes well even when the policy does not), it could actively practice the task and finetune the policy in this situation. We study this probl...
-
The successful application of robotic control requires intelligent decision-making to handle the long tail of complex scenarios that arise in real-world environments. Recently, Deep Reinforcement Learning (DRL) has provided a data-driven framework to automatically learn effective policies in such complex settings. Since its introduction in 2018, Soft Actor-Critic (SAC) remains as one of the most p...
-
Humans are capable of abstracting various tasks as different combinations of multiple attributes. This perspective of compositionality is vital for human rapid learning and adaption since previous experiences from related tasks can be combined to generalize across novel compositional settings. In this work, we aim to achieve zero-shot policy generalization of Reinforcement Learning (RL) agents by ...
-
Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience. However, current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories, and thus hold limited relevance for real-world robotics. In this work (Real-ORL), we posi...
-
We propose a framework to enable multipurpose assistive mobile robots to autonomously wipe tables to clean spills and crumbs. This problem is challenging, as it requires planning wiping actions while reasoning over uncertain latent dynamics of crumbs and spills captured via high-dimensional visual observations. Simultaneously, we must guarantee constraints satisfaction to enable safe deployment in...
-
Communication enables agents to cooperate to achieve their goals. Learning when to communicate, i.e., sparse (in time) communication, and whom to message is particularly important when bandwidth is limited. However, recent work in learning sparse individualized communication suffers from high variance during training, where decreasing communication comes at the cost of decreased reward, particular...
-
Enabling the capability of assessing risk and making risk-aware decisions is essential to applying reinforcement learning to safety-critical robots like drones. In this paper, we investigate a specific case where a nano quadcopter robot learns to navigate an apriori-unknown cluttered environment under partial observability. We present a distributional reinforcement learning framework to generate a...
-
Powered by deep representation learning, re-inforcement learning (RL) provides an end-to-end learning framework capable of solving self-driving (SD) tasks without manual designs. However, time-varying nonstationary environments cause proficient but specialized RL policies to fail at execution time. For example, an RL-based SD policy trained under sunny days does not generalize well to rainy weathe...
-
Deep reinforcement learning has shown promise in learning robust skills for robot control, but typically requires a large amount of samples to achieve good performance. Sim-to-real transfer learning has been developed to solve this problem, but the policy trained in simulation usually has unsatisfactory performance in the real world because simulators inevitably model the dynamics of reality imper...
-
In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications where interactions are costly and unconstrained exploration can lead to undesirable or dangerous outcomes, e.g., with physical robots that interact with humans. ...
-
Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data. If we could combine data from all available sources, including multiple kinds of robots, we could train more powerful navigation models. In this paper, we study how a general goal-conditioned model for vision-based navigation can be trained on dat...
-
Recently, a growing number of work design unsupervised paradigms for point cloud processing to alleviate the limitation of expensive manual annotation and poor transferability of supervised methods. Among them, CrossPoint follows the contrastive learning framework and exploits image and point cloud data for unsupervised point cloud understanding. Although the promising performance is presented, th...
-
Tasks involving locally unstable or discontinuous dynamics (such as bifurcations and collisions) remain challenging in robotics, because small variations in the environment can have a significant impact on task outcomes. For such tasks, learning a robust deterministic policy is difficult. We focus on structuring exploration with multiple stochastic policies based on a mixture of experts (MoE) poli...
-
Robots are traditionally bounded by a fixed embodiment during their operational lifetime, which limits their ability to adapt to their surroundings. Co-optimizing control and morphology of a robot, however, is often inefficient due to the complex interplay between the controller and morphology. In this paper, we propose a learning-based control method that can inherently take morphology into consi...
-
Operating under real world conditions is challenging due to the possibility of a wide range of failures induced by execution errors and state uncertainty. In relatively benign settings, such failures can be overcome by retrying or executing one of a small number of hand-engineered recovery strategies. By contrast, contact-rich sequential manipulation tasks, like opening doors and assembling furnit...
-
High-dimensional expensive problems are often encountered in the design and optimization of complex robotic and automated systems and distributed computing systems, and they suffer from a time-consuming fitness evaluation process. It is extremely challenging and difficult to produce promising solutions in a high-dimensional search space. This work proposes an evolutionary optimization framework wi...
-
The world is filled with articulated objects that are difficult to determine how to use from vision alone, e.g., a door might open inwards or outwards. Humans handle these objects with strategic trial-and-error: first pushing a door then pulling if that doesn't work. We enable these capabilities in autonomous agents by proposing “Hypothesize, Simulate, Act, Update, and Repeat” (H-SAUR), a probabil...
-
When humans perform a task with an articulated object, they interact with the object only in a handful of ways, while the space of all possible interactions is nearly endless. This is because humans have prior knowledge about what interactions are likely to be successful, i.e., to open a new door we first try the handle. While learning such priors without supervision is easy for humans, it is noto...
-
Natural language is one of the most intuitive ways to express human intent. However, translating instructions and commands towards robotic motion generation and deployment in the real world is far from being an easy task. The challenge of combining a robot's inherent low-level geometric and kinodynamic constraints with a human's high-level semantic instructions traditionally is solved using task-s...
-
In this work, we show how to learn a visual walking policy that only uses a monocular RGB camera and proprioception. Since simulating RGB is hard, we necessarily have to learn vision in the real world. We start with a blind walking policy trained in simulation. This policy can traverse some terrains in the real world but often struggles since it lacks knowledge of the upcoming geometry. This can b...
-
In-space assembly is crucial to creating large-scale space structures and enabling long term space missions. Natural limitations in the size of transportation vehicles and ISRU production facilities necessitate an additive strategy with the size of the typical structural unit being essentially fixed and inversely proportional to the final assembly size. In prior robotic and space assembly examples...
-
Modular self-assembling systems typically assume that modules are present to assemble. But in sparsely observed ocean environments modules of an aquatic modular robotic system may be separated by distances they do not have the energy to cross, and the information needed for optimal path planning is often unavailable. In this work we present a flow-based rendezvous and docking controller that allow...
-
Robots “in-the-wild” encounter and must traverse widely varying terrain, ranging from solid ground to granular materials like sand to full liquids. Numerous approaches exist, including wheeled and legged robots, each excelling in specific domains. Screw-based locomotion is a promising approach for multi-domain mobility, leveraged in exploratory robotic designs, including amphibious vehicles and sn...
-
Aerial-aquatic vehicles are capable to move in the two most dominant fluids, making them more promising for a wide range of applications. We propose a prototype with special designs for propulsion and thruster configuration to cope with the vast differences in the fluid properties of water and air. For propulsion, the operating range is switched for the different mediums by the dual-speed propulsi...
-
This study proposes a modular Multi-axis Elastic Actuator (MAEA) for legged robots that can effectively cope with impacts that may occur during dynamic maneuvering. MAEA has multi-axis compliance and can measure the torque without additional encoders. Therefore, effective impact resistance is possible with less volume and weight than conventional Series Elastic Actuators (SEA). The 6-axis stiffnes...
-
Applications of rolling diaphragm transmissions for medical and teleoperated robotics are of great interest, due to the low friction of rolling diaphragms combined with the power density and stiffness of hydraulic transmissions. However, the stiffness-enabling pressure preloads can form a tradeoff against bearing loading in some rolling diaphragm layouts, and transmission setup can be difficult. U...
-
Table-top robotic arms for education and research must be low-cost for availability and lightweight and soft for safety. Therefore, as such a robot, this study focuses on designing a plastic table-top cable-driven robotic arm with all motors located at the base link. However, locating all motors at the base link results in a significant distance between a driving motor and driven joint, increases ...
-
For robots to interact safely with humans and travel with minimal weight, low-density packable actuators are sought. Electronically-driven active materials like shape memory wire and other artificial muscle fibers offer solutions, but these materials need a restoring force. Moreover, if joint bending is required, the actuators must exert a bending moment around the joint. In this paper, we model t...
-
Springs are efficient in storing and returning elastic potential energy but are unable to hold the energy they store in the absence of an external load. Lockable springs use clutches to hold elastic potential energy in the absence of an external load, but have not yet been widely adopted in applications, partly because clutches introduce design complexity, reduce energy efficiency, and typically d...
-
This paper proposes the concept design of a novel XY compliant parallel manipulator (CPM) with spatial configuration, which is beneficial to promote the performance of the XY CPM. Evolved from a planar configuration, a spatial compliant parallelogram flexure is devised as the basic module structure. Then, a mirror-symmetric XY CPM adopting spatial layout is proposed based on four-prismatic-prismat...
-
We propose a computational approach for designing fully-integrated compliant mechanisms with bio-inspired joints that are stabilized and actuated by elastic elements. Similar to human knees or finger phalanges, our mechanisms leverage sliding between pairs of contacting surfaces to generate complex motions. Due to the vast design space, however, finding surface shapes that lead to ideal approximat...
-
Underactuated tendon-driven fingers are a simple, yet effective solution, for realizing robotic grippers and hands. The lack of controllable degrees of actuation and precise sensing is compensated by the deformable structure of the finger, which is able to adapt to the objects to be grasped and manipulated, and also to implement grasping strategies based on environmental constraint exploitation. O...
-
Springs are commonly used in wearable robotic devices to provide assistive joint torque without the need for motors and batteries. However, different tasks (such as walking or running) and different users (such as athletes with strong legs or the elderly with weak legs) necessitate different assistive joint torques, and therefore, springs with different stiffness. Variable stiffness springs are a ...
-
Novel Spring Mechanism Enables Iterative Energy Accumulation under Force and Deformation Constraints
Springs can provide force at zero net energy cost by recycling negative mechanical work to benefit motor-driven robots or spring-augmented humans. However, humans have limited force and range of motion, and motors have a limited ability to produce force. These limits constrain how much energy a conventional spring can store and, consequently, how much assistance a spring can provide. In this paper...
-
This paper presents the design and performance of a planar 3R robot capable of dexterous constrained manipulation when interacting with a stiff environment. A novel variable stiffness actuator (VSA) having a stiffness ratio of approximately 500 is also described. Variable stiffness actuation, together with a combined position/compliance manipulation path, is used to: 1) allow the robot to passivel...
-
This paper presents a stiffness-changeable soft finger using chain mail jamming. This finger can achieve adaptive grasping and in-hand manipulation by reshaping and exerting changeable gripping force. The jamming phenomenon happens when particles in a chamber get interlocked where confining pressure is exerted at their boundaries, which is widely used to construct mechanisms with changeable stiffn...
-
Synthetic fiber ropes are widely used for robots because of their advantages such as lightweight, high tensile strength and flexibility. However, there is limited information on the physical properties of synthetic fiber ropes when used for robots. This study focuses on repetitive twisting of synthetic fiber ropes and provides information for selecting them for robots based on durability. To this ...
-
The main advantages of legged robots over wheeled ones are their abilities to traverse on uneven terrain due to the use of intermittent contacts and an ability to shift the center of mass relative to the contact location. A robot's leg design can be implemented by using an open-chain mechanism actuated with high-density torque actuators though this solution needs a vast energy budget. An alternati...
-
A cuspidal serial robot can travel from one inverse kinematic solution to another without crossing a singularity. Cuspidal robots ask for extra care and caution in trajectory planning, as identifying an aspect related to one unique inverse kinematic solution is not possible. The issues related to motion planning with cuspidal robots are related to the inherent property arising from the geometric d...
-
This paper presents a novel two-step method for synthesizing spatial linkage mechanisms. Compared with planar mechanisms, the main challenge in synthesizing spatial mechanisms is that the generating motion varies depending on its mechanism topologies. Therefore, we propose a big data approach to determine the topology of spatial mechanisms. We adopt a three-dimensional (3D) spring-connected rigid ...
-
Crochet is a textile craft that has resisted mech-anization and industrialization except for a select number of one-off crochet machines. These machines are only capable of producing a limited subset of common crochet stitches. Crochet machines are not used in the textile industry, yet mass-produced crochet objects and clothes sold in stores like Target and Zara are almost certainly the products o...
-
This paper presents a simple yet effective design of a platform to automate the task of shellfish aquaculture, specifically pearl oysters. Compared to traditional methods, our platform can eliminate the tedious task of cleaning the pearl oysters due to fouling. Inspired by the low and high tide characteristics of the intertidal zone, our platform employs an air-water displacement mechanism to peri...
-
Non-rigidly foldable origamis are of great interest to build robotic components, as they are light, offer large deployability and can also be multistable. In this paper, we consider the Kresling tower, and propose an original way to actively modulate its kinetostatic properties. Actuated stiffening mechanisms are embedded on some folds of the origami. By adjusting the axial stiffness of the folds,...
-
Springs are essential mechanical elements that are used across a wide variety of industries and mechanisms. Common across many spring types and applications is the importance of compactness, low mass and customizability. In this paper, we present a novel rotary spring design that is lightweight, compact and customizable. In addition, we empirically validate the design by experimentally quantifying...
-
We present the design, development, and evaluation of HREyes: biomimetic communication devices which use light to communicate information and, for the first time, gaze direction from AUVs to humans. First, we introduce two types of information displays using the HREye devices: active lucemes and ocular lucemes. Active lucemes communicate information explicitly through animations, while ocular luce...
-
Doctors perform limited one-way intestine endoscopy, in which advanced surgical robots with depth sensors, such as stereo and ToF endoscopes, can only provide sparse and incomplete depth information. However, dense, accurate and instant depth estimation during endoscopy is vital for doctors to judge the 3D location and shape of intestinal tissues, which affects the human-robot interaction between ...
-
Human-robot collaborative disassembly is an emerging trend in the sustainable recycling process of electronic and mechanical products. It requires the use of advanced technologies to assist workers in repetitive physical tasks and deal with creaky and potentially damaged components. Nevertheless, when disassembling worn-out or damaged components, unexpected robot behaviors may emerge, so harmless ...
-
When humans interact with robots influence is inevitable. Consider an autonomous car driving near a human: the speed and steering of the autonomous car will affect how the human drives. Prior works have developed frameworks that enable robots to influence humans towards desired behaviors. But while these approaches are effective in the short-term (i.e., the first few human-robot interactions), her...
-
This manuscript introduces an object deformability-agnostic framework for co-carrying tasks that are shared between a person and multiple robots. Our approach allows the full control of the co-carrying trajectories by the person while sharing the load with multiple robots depending on the size and the weight of the object. This is achieved by merging the haptic information transferred through the ...
-
Collaborative robots can relief human operators from excessive efforts during payload lifting activities. Modelling the human partner allows the design of safe and efficient collaborative strategies. In this paper, we present a control approach for human-robot collaboration based on human monitoring through whole-body wearable sensors, and interaction modelling through coupled rigid-body dynamics....
-
Robot policies need to adapt to human preferences and/or new environments. Human experts may have the domain knowledge required to help robots achieve this adaptation. However, existing works often require costly offline re-training on human feedback, and those feedback usually need to be frequent and too complex for the humans to reliably provide. To avoid placing undue burden on human experts an...
-
This paper discusses the incorporation of a pair of Supernumerary Robotic Limbs (SuperLimbs) onto the next generation of NASA space suits. The wearable robots attached to the space suit assist an astronaut in performing Extra-Vehicular Activities (EVAs). The SuperLimbs grab handrails fixed to the outside of a space vehicle to securely hold the astronaut body. The astronaut can use both hands for p...
-
Cooperative table-carrying is a complex task due to the continuous nature of the action and state-spaces, multimodality of strategies, and the need for instantaneous adaptation to other agents. In this work, we present a method for predicting realistic motion plans for cooperative human-robot teams on the task. Using a Variational Recurrent Neural Network (VRNN) to model the variation in the traje...
-
In physical human-robot interaction (pHRI) it is essential to reliably estimate and localize contact forces between the robot and the environment. In this paper, a complete contact detection, isolation, and reaction scheme is presented and tested on a new 6-dof industrial collaborative robot. We combine two popular methods, based on monitoring energy and generalized momentum, to detect and isolate...
-
In human-aware navigation, the robot tacitly games with humans, balancing safety and efficiency according to human intentions. Poor balance or bad intent recognition causes the robot to stop conservatively or advance rashly, resulting in a deadlock or even a collision respectively. To address the issue, this paper proposes an improved limit cycle for collaboratively parameterizing human intentions...
-
Human-robot interaction (HRI) demands efficient time performance along the tasks. However, some interaction approaches may extend the time to complete such tasks. Thus, the time performance in HRI must be enhanced. This work presents an effective way to enhance the time performance in HRI tasks with a mixed reality (MR) method based on a gaze-speech system. In this paper, we design an MR world for...
-
End-to-end autonomous driving has great potential in the transportation industry. However, the lack of transparency and interpretability of the automatic decision-making process hinders its industrial adoption in practice. There have been some early attempts to use attention maps or cost volume for better model explainability which is difficult for ordinary passengers to understand. To bridge the ...
-
Practical implementations of deep reinforcement learning (deep RL) have been challenging due to an amplitude of factors, such as designing reward functions that cover every possible interaction. To address the heavy burden of robot reward engineering, we aim to leverage subjective human preferences gathered in the context of human-robot interaction, while taking advantage of a baseline reward func...
-
We present EWareNet, a novel intent and affect-aware social robot navigation algorithm among pedestrians. Our approach predicts the trajectory-based pedestrian intent from gait sequence, which is then used for intent-guided navigation taking into account social and proxemic constraints. We propose a transformer-based model that works on commodity RGB-D cameras mounted onto a moving robot. Our inte...
-
Designing a socially-aware navigation method for crowded environments has become a critical issue in robotics. In order to perform navigation in a crowded environment without causing discomfort to nearby pedestrians, it is necessary to design a global planner that is able to consider both human-robot interaction (HRI) and prediction of future states. In this paper, we propose a socially-aware glob...
-
HOI detection is essential for human-computer interaction, especially in behavior detection and robot manipulation. Existing mainstream transformer methods of HOI detection are focused on single-stream detection only, e.g., $image \rightarrow HOI(\mathcal{P}_{1})$, or $image \rightarrow HO\rightarrow I(\mathcal{P}_{2})$. Both paths have their own characteristics of concern, so we propose a novel m...
-
Robot person following (RPF) is a capability that supports many useful human-robot-interaction (HRI) applications. However, existing solutions to person following often as-sume full observation of the tracked person. As a consequence, they cannot track the person reliably under partial occlusion where the assumption of full observation is not satisfied. In this paper, we focus on the problem of ro...
-
Person re-identification plays a key role in applications where a mobile robot needs to track its users over a long period of time, even if they are partially unobserved for some time, in order to follow them or be available on demand. In this context, deep-learning-based real-time feature extraction on a mobile robot is often performed on special-purpose devices whose computational resources are ...
-
The capability of humanoid robots to generate facial expressions is crucial for enhancing interactivity and emotional resonance in human-robot interaction. However, humanoid robots vary in mechanics, manufacturing, and ap-pearance. The lack of consistent processing techniques and the complexity of generating facial expressions pose significant challenges in the field. To acquire solutions with hig...
-
The requirements of modern production systems together with more advanced robotic technologies have fostered the integration of teams comprising humans and autonomous robots. While this integration has the potential to provide various benefits, it also raises questions about how to effectively manage these teams, taking into account the different characteristics of the agents involved. This paper ...
-
We study landmark-based SLAM with unknown data association: our robot navigates in a completely unknown environment and has to simultaneously reason over its own trajectory, the positions of an unknown number of landmarks in the environment, and potential data associations between measurements and landmarks. This setup is interesting since: (i) it arises when recovering from data association failu...
-
Bundle adjustment (BA) is a well-studied fundamental problem in the robotics and vision community. In man-made environments, coplanar points and lines are ubiquitous. However, the number of works on bundle adjustment with coplanar points and lines is relatively small. This paper focuses on this special BA problem, referred to as $\pi-\mathbf{BA}$. For a point or a line on a plane, we derive a new ...
-
Robotic perception is currently at a cross-roads between modern methods, which operate in an efficient latent space, and classical methods, which are mathematically founded and provide interpretable, trustworthy results. In this paper, we introduce a Convolutional Bayesian Kernel Inference (Con-vBKI) layer which learns to perform explicit Bayesian inference within a depthwise separable convolution...
-
Accurate mapping of large-scale environments is an essential building block of most outdoor autonomous systems. Challenges of traditional mapping methods include the balance between memory consumption and mapping accuracy. This paper addresses the problem of achieving large-scale 3D reconstruction using implicit representations built from 3D LiDAR measurements. We learn and store implicit features...
-
High-definition maps are crucial perception elements for autonomous robot navigation systems, which can provide accurate scene layout and environment information for downstream motion prediction and planning control tasks. Traditional methods based on manual annotation or SLAM algorithms require massive labor efforts and time costs, which hinders the deployment of practical applications. Online co...
-
This paper proposes Contour Context, a simple, effective, and efficient topological loop closure detection pipeline with accurate 3-DoF metric pose estimation, targeting the urban autonomous driving scenario. We interpret the Cartesian bird's eye view (BEV) image projected from 3D LiDAR points as layered distribution of structures. To recover elevation information from BEVs, we slice them at diffe...
-
We present the Reflectance Field Map, a reliable real-time method for detecting shiny surfaces, like glass, metal, and mirrors, with lidar. The Reflectance Field Map combines the theory developed for Light Field Mapping, common in computer graphics, with occupancy grid mapping. Like early methods for sonar-based robot mapping, we show how the addition of angular viewpoint information to a standard...
-
Sensing environmental obstacles and establishing an occupancy map of surroundings are critical to achieving automated parking for autonomous vehicles. This paper presents a method to obtain surrounding occupancy information from inverse perspective mapping (IPM) images. This method uses the easily-accessed pseudo-labels from LiDAR to supervise a visual network, which can detect occupied boundaries...
-
Modeling scene geometry using implicit neural representation has revealed its advantages in accuracy, flexibility, and low memory usage. Previous approaches have demonstrated impressive results using color or depth images but still have difficulty handling poor light conditions and large-scale scenes. Methods taking global point cloud as input require accurate registration and ground truth coordin...
-
Accurate localization is a core component of a robot's navigation system. To this end, global navigation satellite systems (GNSS) can provide absolute measurements outdoors and, therefore, eliminate long-term drift. However, fusing GNSS data with other sensor data is not trivial, especially when a robot moves between areas with and without sky view. We propose a robust approach that tightly fuses ...
-
We propose EMS®, a cloud-enabled massive computational experiment management system supporting high-throughput computational robotics research. Compared to existing systems, EMS® features a sky-based pipeline orchestrator which allows us to exploit heterogeneous computing environments painlessly (e.g., on-premise clusters, public clouds, edge devices) to optimally deploy large-scale computational ...
-
Sensor simulation has emerged as a promising and powerful technique to find solutions to many real-world robotic tasks like localization and pose tracking. However, commonly used simulators have high hardware requirements and are therefore used mostly on high-end computers. In this paper, we present an approach to simulate range sensors directly on embedded hardware of mobile robots that use trian...
-
Robotic applications involving people often require advanced perception systems to better understand complex real-world scenarios. To address this challenge, photo-realistic and physics simulators are gaining popularity as a means of generating accurate data labeling and designing scenarios for evaluating generalization capabilities, e.g., lighting changes, camera movements or different weather co...
-
The paper presents a novel domain-specific language, RoboSC, for developing supervisory controllers for robotic applications. RoboSC supports concepts of ROS/ROS2 and supervisory control theory. It enables users to focus on the modeling and the synthesis process of supervisory controllers for ROS applications only because it generates all artifacts needed to connect such controllers to ROS applica...
-
As advanced algorithms enable robots to handle more challenging tasks and operate more autonomously, the on-board computer cannot meet the increased demands regarding computing power and memory storage in an efficient way. Leveraging the massive computing power of the cloud and low-latency connectivity to the edge can compensate for this lack of computing resources. However, this introduces a new ...
-
The Unified Robot Description Format (URDF) and, to a lesser extent, the COLLAborative Design Activity (COLLADA) format are two of the most popular domain-specific languages (DSLs) to represent kinematic chains in robotics with support in many tools including Gazebo, MoveIt!, KDL or IKFast. In this paper we analyse both DSLs with respect to their structure and semantics as seen by tools that produ...
-
We present SIERRA, a novel framework for accelerating development and improving reproducibility of results in robotics research. SIERRA accelerates research by automating the process of generating experiments from queries over independent variables, executing experiments, and processing the results to generate deliverables such as graphs and videos. It shifts the paradigm for testing hypotheses fr...
-
This paper presents OpTaS, a task specification Python library for Trajectory Optimization (TO) and Model Predictive Control (MPC) in robotics. Both TO and MPC are increasingly receiving interest in optimal control and in particular handling dynamic environments. While a flurry of software libraries exists to handle such problems, they either provide interfaces that are limited to a specific probl...
-
In this paper, we propose a novel representation for grasping using contacts between multi-finger robotic hands and objects to be manipulated. This representation significantly reduces the prediction dimensions and accelerates the learning process. We present an effective end-to-end network, CMG-Net, for grasping unknown objects in a cluttered environment by efficiently predicting multi-finger gra...
-
This paper introduces Amazon Robotic Manipulation Benchmark (ARMBench), a large-scale, object-centric benchmark dataset for robotic manipulation in the context of a warehouse. Automation of operations in modern warehouses requires a robotic manipulator to deal with a wide variety of objects, unstructured storage, and dynamically changing inventory. Such settings pose challenges in perceiving the i...
-
We introduce the Few-Shot Object Learning (FEWSOL) dataset for object recognition with a few images per object. We captured 336 real-world objects with 9 RGB-D images per object from different views. Fewsol has object segmentation masks, poses, and attributes. In addition, synthetic images generated using 330 3D object models are used to augment the dataset. We investigated (i) few-shot object cla...
-
In the era of deep learning, data is the critical determining factor in the performance of neural network models. Generating large datasets suffers from various challenges such as scalability, cost efficiency and photorealism. To avoid expensive and strenuous dataset collection and annotations, researchers have inclined towards computer-generated datasets. However, a lack of photorealism and a lim...
-
As LiDAR sensors have become ubiquitous, the need for an efficient LiDAR data compression algorithm has increased. Modern LiDARs produce gigabytes of scan data per hour (Fig. 1) and are often used in applications with limited compute, bandwidth, and storage resources. We present a fast, lossless compression algorithm for Li-DAR range and attribute scan sequences including multiple-return range, si...
-
Robots operating in the real world are expected to detect, classify, segment, and estimate the pose of objects to accomplish their task. Modern approaches using deep learning not only require large volumes of data but also pixel-accurate annotations in order to evaluate the performance and therefore safety of these algorithms. At present, publicly available tools for annotating data are scarce and...
-
We present a new traffic dataset, Meteor, which captures traffic patterns and multi-agent driving behaviors in unstructured scenarios. Meteor consists of more than 1000 one-minute videos, over 2 million annotated frames with bounding boxes and GPS trajectories for 16 unique agent categories, and more than 13 million bounding boxes for traffic agents. Meteor is a dataset for rare and interesting, m...
-
In this paper, we address the lack of datasets for – and the issue of reproducibility in – collaborative SLAM pose graph optimizers by providing a novel pose graph generator. Our pose graph generator, kollagen, is based on a random walk in a planar grid world, similar to the popular M3500 dataset for single agent SLAM. It is simple to use and the user can set several parameters, e.g., the number o...
-
Obstacle avoidance is an essential topic in the field of autonomous drone research. When choosing an avoidance algorithm, many different options are available, each with their advantages and disadvantages. As there is currently no consensus on testing methods, it is quite challenging to compare the performance between algorithms. In this paper, we propose AvoidBench, a benchmarking suite which can...
-
Terrain-aware locomotion has become an emerging topic in legged robotics. However, it is hard to generate diverse, challenging, and realistic unstructured terrains in simulation, which limits the way researchers evaluate their locomotion policies. In this paper, we prototype the generation of a terrain dataset via terrain authoring and active learning, and the learned samplers can stably generate ...
-
Three challenges limit the progress of robot learning research: robots are expensive (few labs can participate), everyone uses different robots (findings do not generalize across labs), and we lack internet-scale robotics data. We take on these challenges via a new benchmark: Train Offline, Test Online (TOTO). TOTO provides remote users with access to shared robots for evaluating methods on common...
-
The main challenge in developing effective reinforcement learning (RL) pipelines is often the design and tuning the reward functions. Well-designed shaping reward can lead to significantly faster learning. Naively formulated rewards, however, can conflict with the desired behavior and result in overfitting or even erratic performance if not properly tuned. In theory, the broad class of potential b...
-
The highly varied and deformable structure of clothing presents a challenging task in the area of robot manipulation. Recent literature has shown an increasing interest in this field, however limited information exists on the influence of end-effector selection, instead focusing on the perception, modelling, and methodology in handling fabrics. Here, we present a benchmark set of household clothin...
-
Sampling-based motion planning algorithms have been continuously developed for more than two decades. Apart from mobile robots, they are also widely used in manipulator motion planning. Hence, these methods play a key role in collaborative and shared workspaces. Despite numerous improvements, their performance can highly vary depending on the chosen parameter setting. The optimal parameters depend...
-
Deep reinforcement learning (RL) has brought many successes for autonomous robot navigation. However, there still exists important limitations that prevent real-world use of RL-based navigation systems. For example, most learning approaches lack safety guarantees; and learned navigation systems may not generalize well to unseen environments. Despite a variety of recent learning techniques to tackl...
-
Several successful approaches exist for solving the complex problem of multi-robot planning and coordination. Due to the lack of adequate benchmarking tools, comparing these approaches and judging their suitability for use in realistic scenarios is currently difficult. Therefore, we propose an open-source benchmark suite that aims to close this gap. Unlike existing benchmarks, our approach uses fu...
-
LiDAR sensors are widely used for 3D object detection in various mobile robotics applications. LiDAR sensors continuously generate point cloud data in real-time. Conventional 3D object detectors detect objects using a set of points acquired over a fixed duration. However, recent studies have shown that the performance of object detection can be further enhanced by utilizing spatio-temporal informa...
-
In this paper, we present a simple yet effective semi-supervised 3D object detector named DDS3D. Our main contributions have two-fold. On the one hand, different from previous works using Non-Maximal Suppression (NMS) or its variants for obtaining the sparse pseudo labels, we propose a dense pseudo-label generation strategy to get dense pseudo-labels, which can retain more potential supervision in...
-
Robotic systems need advanced mobility capabili-ties to operate in complex, three-dimensional environments designed for human use, e.g., multi-level buildings. Incorporating some level of autonomy enables robots to operate robustly, reliably, and efficiently in such complex environments, e.g., automatically “returning home” if communication between an operator and robot is lost during deployment. ...
-
Considerable research effort has been devoted to LiDAR-based 3D object detection and empirical performance has been significantly improved. While progress has been en-couraging, we observe an overlooked issue: it is not yet common practice to compare different 3D detectors under the same cost, e.g., inference latency. This makes it difficult to quantify the true performance gain brought by recentl...
-
Zero-shot object detection has shown its ability to overcome the problems of data scarcity and novel classes. Existing methods generally utilize static semantic vectors to classify objects and guide the network to map visual features to semantic vectors. However, the distribution of semantic vectors cannot adequately represent visual features, which makes migration from seen to unseen classes diff...
-
Anomaly segmentation on the urban landscape scene is an important task in autonomous driving. This process exploits a pre-trained semantic segmentation network to estimate anomalous regions. Anomaly segmentation approaches implemented with extra requirements such as out-of-domain data, extra network, or network retraining might increase the computational cost or degradation of segmentation perform...
-
Domain generalization for semantic segmentation is highly demanded in real applications, where a trained model is expected to work well in previously unseen domains. One challenge lies in the lack of data which could cover the diverse distributions of the possible unseen domains for training. In this paper, we propose a WEb-image assisted Domain GEneralization (WEDGE) scheme, which is the first to...
-
In this paper, we explore incremental few-shot object detection (iFSD), which incrementally learns novel classes using only a few examples without revisiting base classes. Previous iFSD works achieved the desired results by applying metalearning. However, meta-learning approaches show insufficient performance that is difficult to apply to practical problems. In this light, we propose a simple fine...
-
In this paper, we present a simple and efficient scheme for segmenting approximately convex 3D object instances in depth images in a few-shot setting via discriminatively modeling the 3D shape of the object using a neural network. Our key idea is to select pairs of 3D points on the depth image between which we compute surface geodesics. As the number of such geodesics is quadratic in the number of...
-
3D point cloud semantic segmentation is one of the fundamental tasks for environmental understanding. Although significant progress has been made in recent years, the performance of classes with few examples or few points is still far from satisfactory. In this paper, we propose a novel multi-to-single knowledge distillation framework for the 3D point cloud semantic segmentation task to boost the ...
-
Instance segmentation is a long-standing task for supporting robotic bin picking. However, objects of diverse classes can be closely packed with occlusions in cluttered and chaotic scenes, hence, even recent methods could have difficulty in locating clear and precise boundaries to distinguish nearby objects. In this work, we aim to improve the boundary quality of the instance masks for robust and ...
-
Background subtraction is an important topic in computer vision and video analysis. It is challenging to robustly segment foreground and background in complex scenarios. In the literature there are efforts to address some of the main challenges such as illumination change, dynamic backgrounds, hard shadows, and intermittent object motion. However, most of the research has focused on applying advan...
-
In autonomous driving, the novel objects and lack of annotations challenge the traditional 3D LiDAR semantic segmentation based on deep learning. Few-shot learning is a feasible way to solve these issues. However, currently few-shot semantic segmentation methods focus on camera data, and most of them only predict the novel classes without considering the base classes. This setting cannot be direct...
-
Among various sensors for assisted and autonomous driving systems, automotive radar has been considered as a robust and low-cost solution even in adverse weather or lighting conditions. With the recent development of radar technologies and open-sourced annotated data sets, semantic segmentation with radar signals has become very promising. However, existing methods are either computationally expen...
-
Transferring knowledge learned from the labeled source domain to the raw target domain for unsupervised domain adaptation (UDA) is essential to the scalable deployment of autonomous driving systems. State-of-the-art methods in UDA often employ a key idea: utilizing joint supervision signals from both source and target domains for self-training. In this work, we improve and extend this aspect. We p...
-
Every autonomous driving dataset has a different configuration of sensors, originating from distinct geographic regions and covering various scenarios. As a result, 3D detectors tend to overfit the datasets they are trained on. This causes a drastic decrease in accuracy when the detectors are trained on one dataset and tested on another. We observe that lidar scan pattern differences form a large ...
-
We introduce a technique for pairwise registration of neural fields that extends classical optimization-based local registration (i.e. ICP) to operate on Neural Radiance Fields (NeRF)-neural 3D scene representations trained from collections of calibrated images. NeRF does not decompose illumination and color, so to make registration invariant to illumination, we introduce the concept of a “surface...
-
We present a system for applying sim2real approaches to “in the wild” scenes with realistic visuals, and to policies which rely on active perception using RGB cameras. Given a short video of a static scene collected using a generic phone, we learn the scene's contact geometry and a function for novel view synthesis using a Neural Radiance Field (NeRF). We augment the NeRF rendering of the static s...
-
We show that ensembling effectively quantifies model uncertainty in Neural Radiance Fields (NeRFs) if a density-aware epistemic uncertainty term is considered. The naive ensembles investigated in prior work simply average rendered RGB images to quantify the model uncertainty caused by conflicting explanations of the observed scene. In contrast, we additionally consider the termination probabilitie...
-
We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating 6-DoF pose of a camera with respect to an object or scene. Given a single observed RGB image of the target, we can predict the translation and rotation of the camera by minimizing the residual between pixels rendered from a fast NeRF model and pixels in the observed image. We integrate a moment...
-
We propose a novel visual re-localization method based on direct matching between the implicit 3D descriptors and the 2D image with transformer. A conditional neural radiance field(NeRF) is chosen as the 3D scene representation in our pipeline, which supports continuous 3D descriptors generation and neural rendering. By unifying the feature matching and the scene coordinate regression to the same ...
-
This paper addresses the challenge of reconstructing a scene with a neural radiance field (NeRF) for robot vision and scene understanding using multiple modalities. Researchers have introduced the use of NeRF to represent an object for synthesizing and rendering novel views of complex scenes by optimizing a 3-D radiance field for ray casting and rendering for 2-D RGB images. However, using RGB ima...
-
A spatial AI that can perform complex tasks through visual signals and cooperate with humans is highly anticipated. To achieve this, we need a visual SLAM that easily adapts to new scenes without pre-training and generates dense maps for downstream tasks in real-time. None of the previous learning-based and non-learning-based visual SLAMs satisfy all needs due to the intrinsic limitations of their...
-
Most recently proposed methods for robotic per-ception are based on deep learning, which require very large datasets to perform well. The accuracy of a learned model is mainly dependent on the data distribution it was trained on. Thus for deploying such models, it is crucial to use training data belonging to the robot's environment. However, collecting and labeling data is a significant bottleneck...
-
Robots can incorporate data from human teachers when learning new tasks. However, this data can often be noisy, which can cause robots to learn slowly or not at all. One method for learning from human teachers is Human-in-the-loop Reinforcement Learning (HRL), which can combine information from both an environmental reward and external feedback from human teachers. However, many HRL methods assume...
-
Automated Valet Parking (AVP) has been exten-sively researched as an important application of autonomous driving. Considering the high dynamics and density of real parking lots, a system that considers multiple vehicles simultaneously is more robust and efficient than a single vehicle setting as in most studies. In this paper, we propose a dis-tributed Multi-agent Reinforcement Learning(MARL) meth...
-
Autonomous navigation in crowded environments is an open problem with many applications, essential for the coexistence of robots and humans in the smart cities of the future. In recent years, deep reinforcement learning approaches have proven to outperform model-based algorithms. Nevertheless, even though the results provided are promising, the works are not able to take advantage of the capabilit...
-
Real-time learning is crucial for robotic agents adapting to ever-changing, non-stationary environments. A common setup for a robotic agent is to have two different computers simultaneously: a resource-limited local computer tethered to the robot and a powerful remote computer connected wirelessly. Given such a setup, it is unclear to what extent the performance of a learning system can be affecte...
-
Reinforcement learning (RL) exhibits impressive performance when managing complicated control tasks for robots. However, its wide application to physical robots is limited by the absence of strong safety guarantees. To overcome this challenge, this paper explores the control Lyapunov barrier function (CLBF) to analyze the safety and reachability solely based on data without explicitly employing a ...
-
Safety is a fundamental property for the real-world deployment of robotic platforms. Any control policy should avoid dangerous actions that could harm the environment, humans, or the robot itself. In reinforcement learning (RL), safety is crucial when exploring a new environment to learn a new skill. This paper introduces a new formulation of safe exploration for robotic RL in the tangent space of...
-
In machine learning, meta-learning methods aim for fast adaptability to unknown tasks using prior knowledge. Model-based meta-reinforcement learning combines reinforcement learning via world models with Meta Reinforcement Learning (MRL) for increased sample efficiency. However, adaption to unknown tasks does not always result in preferable agent behavior. This paper introduces a new Meta Adaptatio...
-
We study the training performance of ROS local planners based on Reinforcement Learning (RL), and the trajectories they produce on real-world robots. We show that recent enhancements to the Soft Actor Critic (SAC) algorithm such as RAD and DrQ achieve almost perfect training after only 10000 episodes. We also observe that on real-world robots the resulting SACPlanner is more reactive to obstacles ...
-
Clothes grasping and unfolding is a core step in robotic-assisted dressing. Most existing works leverage depth images of clothes to train a deep learning-based model to recognize suitable grasping points. These methods often utilize physics engines to synthesize depth images to reduce the cost of real labeled data collection. However, the natural domain gap between synthetic and real images often ...
-
Due to the COVID-19 epidemic, video conferencing has evolved as a new paradigm of communication and teamwork. However, private and personal information can be easily leaked through cameras during video conferencing. This includes leakage of a person's appearance as well as the contents in the background. This paper proposes a novel way of using online low-resolution thermal images as conditions to...
-
Despite rapid advancements in the lifelong learning (LL) research, a large body of research mainly focuses on improving the performance in the existing static continual learning (CL) setups. These methods lack the ability to succeed in a rapidly changing dynamic environment, where an AI agent needs to quickly learn new instances in a ‘single pass' from the non-i.i.d (also possibly temporally conti...
-
Large language models (LLMs) trained on code-completion have been shown to be capable of synthesizing simple Python programs from docstrings [1]. We find that these code-writing LLMs can be re-purposed to write robot policy code, given natural language commands. Specifically, policy code can express functions or feedback loops that process perception outputs (e.g., from object detectors [2], [3]) ...
-
It is crucial to address the following issues for ubiquitous robotics manipulation applications: (a) vision-based manipulation tasks require the robot to visually learn and understand the object with rich information like dense object descriptors; and (b) sim-to-real transfer in robotics aims to close the gap between simulated and real data. In this paper, we present Sim-to-Real Dense Object Nets ...
-
Based on the recent advancements in representation learning, we propose a novel pipeline for task-oriented voice-controlled robots with raw sensor inputs. Previous methods rely on a large number of labels and task-specific reward functions. Not only can such an approach hardly be improved after the deployment, but also has limited generalization across robotic platforms and tasks. To address these...
-
Perceptual understanding of the scene and the relationship between its different components is important for successful completion of robotic tasks. Representation learning has been shown to be a powerful technique for this, but most of the current methodologies learn task specific representations that do not necessarily transfer well to other tasks. Furthermore, representations learned by supervi...
-
We propose Predictive Information bottleneck for Goal representation learning (PI-Goal), a self-supervised method for sample-efficient goal-conditioned reinforcement learning (RL). Goal-conditioned RL learns to reach commanded goals with reward signals. A goal could be given in a noisy or abstract form, and thus jeopardizes sample efficiency. Previous methods usually assume that the agent can map ...
-
Collaborative robots became a popular tool for increasing productivity in partly automated manufacturing plants. Intuitive robot teaching methods are required to quickly and flexibly adapt the robot programs to new tasks. Gestures have an essential role in human communication. However, in human-robot-interaction scenarios, gesture-based user interfaces are so far used rarely, and if they employ a ...
-
Despite recent advances in video-guided robotic imitation learning, many methods still rely on human experts to provide sparse rewards that indicate whether robots have successfully completed tasks. The challenge of enabling robots to autonomously evaluate whether their actions can complete complex, multi-stage tasks remains unresolved. In this work, we propose an efficient few-shot robotic learni...
-
Self-assessment rules play an essential role in safe and effective real-world robotic applications, which verify the feasibility of the selected action before actual execution. But how to utilize the self-assessment results to re-choose actions remains a challenge. Previous methods eliminate the selected action evaluated as failed by the self-assessment rules, and re-choose one with the next-highe...
-
We have developed an autonomous robot motion generation model based on distributed and integrated multimodal learning. Since each modality used as a robot's senses, such as image, joint angle, and torque, has a different physical meaning and time characteristic, the generation of autonomous motions using multimodal learning has sometimes failed due to overlearning in one of the modalities. Inspire...
-
Tasks where the set of possible actions depend discontinuously on the state pose a significant challenge for current reinforcement learning algorithms. For example, a locked door must be first unlocked, and then the handle turned before the door can be opened. The sequential nature of these tasks makes obtaining final rewards difficult, and transferring information between task variants using cont...
-
For assisting humans in their daily lives, robots need to perform long-horizon tasks, such as tidying up a room or preparing a meal. One effective strategy for handling a long-horizon task is to break it down into short-horizon subgoals, that the robot can execute sequentially. In this paper, we propose extending a predictive learning model using deep neural networks (DNN) with a Subgoal Proposal ...
-
The Multi-Object Navigation (MultiON) task requires a robot to localize an instance (each) of multiple object classes. It is a fundamental task for an assistive robot in a home or a factory. Existing methods for MultiON have viewed this as a direct extension of Object Navigation (ON), the task of localising an instance of one object class, and are pre-sequenced, i.e., the sequence in which the obj...
-
In this work, we present a method to extract the skeleton of a self-occluded tree canopy by estimating the unobserved structures of the tree. A tree skeleton compactly describes the topological structure and contains useful information such as branch geometry, positions and hierarchy. This can be critical to planning contact interactions for agricultural manipulation, yet is difficult to gain due ...
-
Plants are dynamic organisms and understanding temporal variations in vegetation is an essential problem for robots in the wild. However, associating repeated 3D scans of plants across time is challenging. A key step in this process is re-identifying and tracking the same individual plant components over time. Previously, this has been achieved by comparing their global spatial or topological loca...
-
In this paper, we present a method for creating high-quality 3D models of sorghum panicles for phenotyping in breeding experiments. This is achieved with a novel reconstruction approach that uses seeds as semantic landmarks in both 2D and 3D. To evaluate the performance, we develop a new metric for assessing the quality of reconstructed point clouds without ground-truth. Finally, a counting method...
-
Plant phenotyping is a central task in agriculture, as it describes plants' growth stage, development, and other relevant quantities. Robots can help automate this process by accurately estimating plant traits such as the number of leaves, leaf area, and the plant size. In this paper, we address the problem of joint semantic, plant instance, and leaf instance segmentation of crop fields from RGB d...
-
Crop inspection is a critical part of modern agricultural practices that helps farmers assess the current status of a field and then make crop management decisions. Current crop inspection methods are labour-intensive tasks, which makes them rather slow and expensive to apply. In this paper, we exploit recent advancements in implicit mapping to tackle the challenging context of agricultural enviro...
-
The determination of a crop's growth-stage is critical information for precision agriculture. Estimates of the growth-stage are used to guide irrigation and the application of agrochemicals. Of particular importance is the use of fertilizers, however, growth-stage estimates may also suggest further investigation of potential crop infections and infestations. Traditionally, the growth-stage is base...
-
Precision agricultural robots require high-resolution navigation solutions. In this paper, we introduce a robust neural-inertial sequence learning approach to track such robots with ultra-intermittent GNSS updates. First, we propose an ultra-lightweight neural-Kalman filter that can track agricultural robots within 1.4 m (1.4–5.8× better than competing techniques), while tracking within 2.75 m wit...
-
Monitoring the traits of plants and fruits is a fundamental task in horticulture. With accurate measurements, farmers can predict the yield of their crops and use this information for making informed management decisions, and breeders can use it for variety selection. Agricultural robotic applications promise to automate this monitoring task. In this paper, we address the problem of monitoring fru...
-
Redundant manipulators offer a continuum of joint configurations which satisfy a specific end-effector pose, an advantage when operating within confined spaces. This, how-ever, challenges a controller to select a single goal configuration from a wide range when path planning. This paper outlines the use of the MySQL database management system for systematic goal selection during redundant manipula...
-
Cable driven parallel robots (CDPRs) are often challenging to model and to dynamically control due to the inherent flexibility and elasticity of the cables. The additional inclusion of online geometric reconfigurability to a CDPR results in a complex underdetermined system with highly non-linear dynamics. The necessary (numerical) redundancy resolution requires multiple layers of optimization rend...
-
Snake robots offer considerable potential for endoscopic interventions due to their ability to follow curvilinear paths. Telemanipulation is an open problem due to hyper-redundancy, as input devices only allow a specification of six degrees of freedom. Our work addresses this by presenting a unified telemanipulation strategy which enables follow-the-leader locomotion and reorientation keeping the ...
-
Obtaining the shape and size of a robot's workspace is essential for both its design and control. However, determining the accurate workspace of a multi-segment continuum robot by graphic or analytical methods is a challenging task due to its inherent flexibility and complex structure. Existing numerical methods have limitations when applied to a continuum robot. This paper presents an Equivalent ...
-
This paper presents two methods for the computation of the null space velocity command in redundant robots. Both these methods resort to the solution of a constrained optimization problem. The first one is a formalization of the traditional Gradient Projection Method (GPM) which guarantees the respect of the joint bounds and a gradual activation/deactivation of the null space command. The second o...
-
The redundant manipulators' analytical solutions can be obtained by the parameterization method. Multiple parameterized joints and their corresponding parametric representations exist for a redundant manipulator. However, how to select the optimal parameterized joints has yet to be well-addressed. This paper delves into the mechanism of the parameterization method and proposes a method to select t...
-
A novel kinematically redundant 6+1-degree-of-freedom (dof) spatial hybrid parallel robot is proposed. Each of the two legs of the robot has a fully parallel structure to minimize the moving inertia by mounting actuators on the base. The kinematic model of each leg and overall robot architecture is developed based on the constraint conditions of the robot geometry. The singularity analysis of legs...
-
Trajectory optimization (TO) is an efficient tool to generate a redundant manipulator's joint trajectory following a 6-dimensional Cartesian path. The optimization performance largely depends on the quality of initial trajectories. However, the selection of a high-quality initial trajectory is non-trivial and requires a considerable time budget due to the extremely large space of the solution traj...
-
A novel kinematically redundant ($6+3$) -DoF parallel robot is presented in this paper. Three identical 3-DoF RU/2-RUS legs are attached to a configurable platform through spherical joints. With the selected leg mechanism, the motors are mounted at the base, reducing the reflected inertia. The robot is intended to be actuated with direct-drive motors in order to perform intuitive physical human-ro...
-
Generating feasible robot motions in real-time requires achieving multiple tasks (i.e., kinematic requirements) simultaneously. These tasks can have a specific goal, a range of equally valid goals, or a range of acceptable goals with a preference toward a specific goal. To satisfy multiple and potentially competing tasks simultaneously, it is important to exploit the flexibility afforded by tasks ...
-
How does a wheeled robot move and turn? The answer is straightforward for a conventional wheeled robot, but it is not so easy for a robot with a discrete wheel design. Regular wheeled robots always have four contact points, resulting in static stability during locomotion. However, QuadRunner's novel leg mechanism provides only a semi-circular wheel shape, and proper gait planning is needed to go s...
-
Liquids and granular media are pervasive throughout human environments. Their free-flowing nature causes people to constrain them into containers. We do so with thousands of different types of containers made out of different materials with varying sizes, shapes, and colors. In this work, we present a state-of-the-art sensing technique for robots to perceive what liquid is inside of an unknown con...
-
This paper proposes a set of tactile based skills to perform robotic cable routing operations for deformable linear objects (DLOs) characterized by considerable stiffness and constrained at both ends. In particular, tactile data are exploited to reconstruct the shape of the grasped portion of the DLO and to estimate the future local one. This information is exploited to obtain a grasping configura...
-
Correspondence search is an essential step in rigid point cloud registration algorithms. Most methods maintain a single correspondence at each step and gradually remove wrong correspondances. However, building one-to-one correspondence with hard assignments is extremely difficult, especially when matching two point clouds with many locally similar features. This paper proposes an optimization meth...
-
We present an end-to-end trainable multi-task model that locates and retrieves target objects from multi-object scenes. The model is an extension of the Siamese Mask R-CNN, which combines the components of Siamese Neural Networks (SNNs) and Mask R-CNN for performing one-shot instance segmentation. The proposed network, called Grasping Siamese Mask R-CNN (GSMR-CNN), extends Siamese Mask R-CNN by ad...
-
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry fr...
-
Planning for deformable object manipulation has been a challenge for a long time in robotics due to its high computational cost. In this work, we propose to reduce this cost by reducing the number of pick points on a deformable object in the action space. We do this by identifying a small number of key particles that are sufficient as pick points to reach a given goal state. We find these key part...
-
Manipulation relationship detection (MRD) aims to guide the robot to grasp objects in the right order, which is important to ensure the safety and reliability of grasping in object stacked scenes. Previous works infer manipulation relationship by deep neural network trained with data collected from a predefined view, which has limitation in visual dislocation in unstructured environments. Multi-vi...
-
Recovering full 3D shapes from partial observations is a challenging task that has been extensively addressed in the computer vision community. Many deep learning methods tackle this problem by training 3D shape generation networks to learn a prior over the full 3D shapes. In this training regime, the methods expect the inputs to be in a fixed canonical form, without which they fail to learn a val...
-
Accurately estimating the shape of objects in dense clutters makes important contribution to robotic packing, because the optimal object arrangement requires the robot planner to acquire shape information of all existed objects. However, the objects for packing are usually piled in dense clutters with severe occlusion, and the object shape varies significantly across different instances for the sa...
-
Particle filtering is a common technique for six degree of freedom (6D) pose estimation due to its ability to tractably represent belief over object pose. However, the particle filter is prone to particle deprivation due to the high-dimensional nature of 6D pose. When particle deprivation occurs, it can cause mode collapse of the underlying belief distri-bution during importance sampling. If the r...
-
Grasping an object when it is in an ungraspable pose is a challenging task, such as books or other large flat objects placed horizontally on a table. Inspired by human manipulation, we address this problem by pushing the object to the edge of the table and then grasping it from the hanging part. In this paper, we develop a model-free Deep Reinforcement Learning framework to synergize pushing and g...
-
Bimanual manipulation is important for building intelligent robots that unlock richer skills than single arms. We consider a multi-object bimanual rearrangement task, where a reinforcement learning (RL) agent aims to jointly control two arms to rearrange these objects as fast as possible. Solving this task efficiently is challenging for an RL agent due to the requirement of discovering precise int...
-
We study the problem of learning graph dynamics of deformable objects that generalizes to unknown physical properties. Our key insight is to leverage a latent representation of elastic physical properties of cloth-like deformable objects that can be extracted, for example, from a pulling interaction. In this paper we propose EDO-Net (Elastic Deformable Object - Net), a model of graph dynamics trai...
-
Given point cloud input, the problem of 6-DoF grasp pose detection is to identify a set of hand poses in SE(3) from which an object can be successfully grasped. This important problem has many practical applications. Here we propose a novel method and neural network model that enables better grasp success rates relative to what is available in the literature. The method takes standard point cloud ...
-
Learning diverse dexterous manipulation behaviors with assorted objects remains an open grand challenge. While policy learning methods offer a powerful avenue to attack this problem, these approaches require extensive per-task engineering and algorithmic tuning. This paper seeks to escape these constraints, by developing a Pre-Grasp informed Dexterous Manipulation (PGDM) framework that generates d...
-
In-hand manipulation is challenging for a multi-finger robotic hand due to its high degrees of freedom and complex interaction with the object. To enable in-hand manipulation, existing deep reinforcement learning-based approaches mainly focus on training a single robot-structure-specific policy through the centralized learning mechanism, lacking adaptability to changes like robot malfunction. To s...
-
Learning from demonstration (LfD) with behavior cloning is attractive for its simplicity; however, compounding errors in long and complex skills can be a hindrance. Considering a target skill as a sequence of motor primitives is helpful in this respect. Then the requirement that a motor primitive ends in a state that allows the successful execution of the subsequent primitive must be met. In this ...
-
Sim-and-real training is a promising alternative to sim-to-real training for robot manipulations. However, the current sim-and-real training is neither efficient, i.e., slow con-vergence to the optimal policy, nor effective, i.e., sizeable real-world robot data. Given limited time and hardware budgets, the performance of sim-and-real training is not satisfactory. In this paper, we propose a Consen...
-
Thin plastic bags are ubiquitous in retail stores, healthcare, food handling, recycling, homes, and school lunchrooms. They are challenging both for perception (due to specularities and occlusions) and for manipulation (due to the dynamics of their 3D deformable structure). We formulate the task of “bagging:” manipulating common plastic shopping bags with two handles from an unstructured initial s...
-
Dexterous manipulation of objects through fine control of physical contacts is essential for many important tasks of daily living. A fundamental ability underlying fine contact control is compliant control, i.e., controlling the contact forces while moving. For robots, the most widely explored approaches heavily depend on models of manipulated objects and expensive sensors to gather contact locati...
-
Virtualizing the physical world into virtual models has been a critical technique for robot navigation and planning in the real world. To foster manipulation with articulated objects in everyday life, this work explores building articulation models of indoor scenes through a robot's purposeful inter-actions in these scenes. Prior work on articulation reasoning primarily focuses on siloed objects o...
-
Humans naturally exploit haptic feedback during contact-rich tasks like loading a dishwasher or stocking a bookshelf. Current robotic systems focus on avoiding unexpected contact, often relying on strategically placed environment sensors. Recently, contact-exploiting manipulation policies have been trained in simulation and deployed on real robots. However, they require some form of real-world ada...
-
This paper develops a new nonlinear filter, called Moment-based Kalman Filter (MKF), using the exact moment propagation method. Existing state estimation methods use linearization techniques or sampling points to compute approximate values of moments. However, moment propagation of probability distributions of random variables through nonlinear process and measurement models play a key role in the...
-
While substantial progress has been made in the absolute performance of localization and Visual Place Recognition (VPR) techniques, it is becoming increasingly clear from translating these systems into applications that other capabilities like integrity and predictability are just as important, especially for safety- or operationally-critical autonomous systems. In this research we present a new, ...
-
In this paper, we present an algorithm for learning time-correlated measurement covariances for application in batch state estimation. We parameterize the inverse measurement covariance matrix to be block-banded, which conveniently factorizes and results in a computationally efficient approach for correlating measurements across the entire trajectory. We train our covariance model through supervis...
-
Visual localization allows autonomous robots to relocalize when losing track of their pose by matching their current observation with past ones. However, ambiguous scenes pose a challenge for such systems, as repetitive structures can be viewed from many distinct, equally likely camera poses, which means it is not sufficient to produce a single best pose hypothesis. In this work, we propose a prob...
-
Multi-sensor fusion-based localization technology has achieved high accuracy in autonomous systems. How to improve the robustness is the main challenge at present. The most commonly used LiDAR and camera are weather-sensitive, while the FMCW radar has strong adaptability but suffers from noise and ghost effects. In this paper, we propose a heterogeneous localization method of Radar on LiDAR Map (R...
-
Aggressive motions from agile flights or traversing irregular terrain induce motion distortion in LiDAR scans that can degrade state estimation and mapping. Some methods exist to mitigate this effect, but they are still too simplistic or computationally costly for resource-constrained mobile robots. To this end, this paper presents Direct LiDAR-Inertial Odometry (DLIO), a lightweight LiDAR-inertia...
-
In this paper, we propose using online public maps, e.g., OpenStreetMap (OSM), for large-scale radar-based localization without needing a prior sensing map. This can potentially extend the localization system to anywhere worldwide without building, saving, or maintaining a sensing map, as long as an online public map covers the operating area. Existing methods using OSM only use route network or s...
-
In this paper, we propose a continuous-time-based LiDAR-inertial-vehicle odometry method, which can tightly fuse the data from Light Detection And Ranging (LiDAR), inertial measurement units (IMU), and vehicle measurements. The lateral acceleration constraint is further added to trajectory estimation to make the estimated trajectory follow the motion characteristics of vehicles. In addition, since...
-
Visual localization for mobile robots and intelligent vehicles in prior LiDAR maps can achieve high accuracy and low cost. However, algorithms for finding the cross-modal correspondences between images and LiDAR map points are not yet stable. In this paper, we propose a monocular visual localization system in prior LiDAR maps, which is based on the cross-modal registration to optimize the camera p...
-
In this paper, we present a Radar-Inertial Odometry (RIO) approach that utilizes performance improving modules, enhanced for the sparse and noisy radar signals, from the vision community in order to estimate the full 6DoF pose and 3D velocity of a robot in an unprepared environment. Our method leverages a multi-state approach in which we make use of several past robot poses and trails of measureme...
-
We present Loc-NeRF, a real-time vision-based robot localization approach that combines Monte Carlo localization and Neural Radiance Fields (NeRF). Our system uses a pre-trained NeRF model as the map of an environment and can localize itself in real-time using an RGB camera as the only exteroceptive sensor onboard the robot. While neural radiance fields have seen significant applications for visua...
-
This paper considers the problem of audio source separation, where the goal is to isolate a target audio signal (say Alice's speech) from a mixture of multiple interfering signals (e.g., when many people are talking). This problem has gained renewed interest mainly due to the significant growth in voice-controlled devices, including robots in homes, offices, and other public facilities. Although a...
-
This study presents a framework, L-C*, for resilient and lightweight direct visual localization, employing a loosely coupled fusion of visual and inertial data. Unlike indirect methods, direct visual localization facilitates accurate pose estimation on general color three-dimensional maps that are not tailored for visual localization. However, it suffers from temporal localization failures and hig...
-
Visual place retrieval aims to search images in the database that depict similar places as the query image. However, global descriptors encoded by the network usually fall into a low dimensional principal space, which is harmful to the retrieval performance. We first analyze the cause of this phenomenon, pointing out that it is due to degraded distribution of the gradients of descriptors. Then, we...
-
Learning-based visual odometry (VO) algorithms achieve remarkable performance on common static scenes, benefiting from high-capacity models and massive annotated data, but tend to fail in dynamic, populated environments. Semantic segmentation is largely used to discard dynamic associations before estimating camera motions but at the cost of discarding static features and is hard to scale up to uns...
-
There are a multitude of emerging imaging technologies that could benefit robotics. However the need for bespoke models, calibration and low-level processing represents a key barrier to their adoption. In this work we present NOCaL, Neural Odometry and Calibration using Light fields, a semi-supervised learning architecture capable of interpreting previously unseen cameras without calibration. NOCa...
-
Implicit neural representations have shown promising potential for 3D scene reconstruction. Recent work applies it to autonomous 3D reconstruction by learning information gain for view path planning. Effective as it is, the computation of the information gain is expensive, and compared with that using volumetric representations, collision checking using the implicit representation for a 3D point i...
-
Autonomous exploration to build a map of an unknown environment is a fundamental robotics problem. However, the quality of the map directly influences the quality of subsequent robot operation. Instability in a simultaneous localization and mapping (SLAM) system can lead to poor-quality maps and subsequent navigation failures during or after exploration. This becomes particularly noticeable in con...
-
Machine learning techniques rely on large and diverse datasets for generalization. Computer vision, natural language processing, and other applications can often reuse public datasets to train many different models. However, due to differences in physical configurations, it is challenging to leverage public datasets for training robotic control policies on new robot platforms or for new tasks. In ...
-
Navigation has been classically solved in robotics through the combination of SLAM and planning. More recently, beyond waypoint planning, problems involving significant components of (visual) high-level reasoning have been explored in simulated environments, mostly addressed with large-scale machine learning, in particular RL, offline-RL or imitation learning. These methods require the agent to le...
-
Real-time navigation in non-trivial environments by micro aerial vehicles (MAVs) predominantly relies on modelling the MAV with idealized geometry, such as a sphere. Simplified, conservative representations increase the likelihood of a planner failing to identify valid paths. That likelihood increases the more a robot's geometry differs from the idealized version. Few current approaches consider t...
-
This work focuses on the problem of visual target navigation, which is very important for autonomous robots as it is closely related to high-level tasks. To find a special object in unknown environments, classical and learning-based approaches are fundamental components of navigation that have been investigated thoroughly in the past. However, due to the difficulty in the representation of complic...
-
VINet: Visual and Inertial-based Terrain Classification and Adaptive Navigation over Unknown Terrain
We present a visual and inertial-based terrain classification network (VINet) for robotic navigation over different traversable surfaces. We use a novel navigation-based labeling scheme for terrain classification and generalization on unknown surfaces. Our proposed perception method and adaptive scheduling control framework can make predictions according to terrain navigation properties and lead t...
-
We investigate the Vision-and-Language Navigation (VLN) problem in the context of autonomous driving in outdoor settings. We solve the problem by explicitly grounding the navigable regions corresponding to the textual command. At each timestamp, the model predicts a segmentation mask corresponding to the intermediate or the final navigable region. Our work contrasts with existing efforts in VLN, w...
-
Parallax and Time-of-Flight (ToF) are often regarded as complementary in robotic vision where various light and weather conditions remain challenges for advanced camera-based 3-Dimensional (3-D) reconstruction. To this end, this paper establishes Parallax among Corresponding Echoes (PaCE) to triangulate acoustic ToF pulses from arbitrary sensor positions in 3-D space for the first time. This is ac...
-
Ultra-Wideband (UWB) systems are becoming increasingly popular for indoor localization, where range measurements are obtained by measuring the time-of-flight of radio signals. However, the range measurements typically suffer from a systematic error or bias that must be corrected for high-accuracy localization. In this paper, a ranging protocol is proposed alongside a robust and scalable antenna-de...
-
This paper explores a machine learning approach on data from a single-chip mmWave radar for generating high resolution point clouds – a key sensing primitive for robotic applications such as mapping, odometry and localization. Unlike lidar and vision-based systems, mmWave radar can operate in harsh environments and see through occlusions like smoke, fog, and dust. Unfortunately, current mmWave pro...
-
3D LiDAR place recognition plays a vital role in various robot applications' including robotic navigation, autonomous driving, and simultaneous localization and mapping. However, most previous studies evaluated their models on accumulated 2D scans instead of real-world 3D LiDAR scans with a larger number of points, which limits the application in real scenarios. To address this limitation, we prop...
-
We propose a robust framework for planar pose graph optimization contaminated by loop closure outliers. Our framework rejects outliers by first decoupling the robust PGO problem wrapped by a Truncated Least Squares kernel into two subproblems. Then, the framework introduces a linear angle representation to rewrite the first subproblem that is originally formulated in rotation matrices. The framewo...
-
This paper presents a method for robust optimization for online incremental Simultaneous Localization and Mapping (SLAM). Due to the NP-Hardness of data association in the presence of perceptual aliasing, tractable (approximate) approaches to data association will produce erroneous measurements. We require SLAM back-ends that can converge to accurate solutions in the presence of outlier measuremen...
-
We propose a novel real-time LiDAR intensity image-based simultaneous localization and mapping method, which addresses the geometry degeneracy problem in un-structured environments. Traditional LiDAR-based front-end odometry mostly relies on geometric features such as points, lines and planes. A lack of these features in the environment can lead to the failure of the entire odometry system. To avo...
-
We present a novel real-time dense and semantic neural field mapping system that uses only monocular images as input. Our scene representation is a dense continuous radiance field represented by a Multi-Layer Perceptron (MLP), trained from scratch in real-time. We build on high-performance sparse visual SLAM and use camera poses and sparse keypoint depths as supervision alongside RGB keyframes. Si...
-
The uncertainty quantification of prediction models (e.g., neural networks) is crucial for their adoption in many robotics applications. This is arguably as important as making accurate predictions, especially for safety-critical applications such as self-driving cars. This paper proposes our approach to uncertainty quantification in the context of visual localization for autonomous driving, where...
-
In the context of robotics, accurate ground-truth positioning is the cornerstone for the development of mapping and localization algorithms. In outdoor environments and over long distances, total stations provide accurate and precise measurements, that are unaffected by the usual factors that deteriorate the accuracy of Global Navigation Satellite System (GNSS). While a single robotic total statio...
-
We present an open-source Visual-Inertial-Leg Odometry (VILO) state estimation solution for legged robots, called Cerberus, which precisely estimates position on various terrains in real-time using a set of standard sensors, including stereo cameras, IMU, joint encoders, and contact sensors. In addition to estimating robot states, we perform online kinematic parameter calibration and outlier rejec...
-
Spiking neural networks have significant potential utility in robotics due to their high energy efficiency on specialized hardware, but proof-of-concept implementations have not yet typically achieved competitive performance or capability with conventional approaches. In this paper, we tackle one of the key practical challenges of scalability by introducing a novel modular ensemble network approac...
-
Task automation of surgical robot has the potentials to improve surgical efficiency. Recent reinforcement learning (RL) based approaches provide scalable solutions to surgical automation, but typically require extensive data collection to solve a task if no prior knowledge is given. This issue is known as the exploration challenge, which can be alleviated by providing expert demonstrations to an R...
-
Accurate needle insertion is an important task in many medical procedures. This paper studies the case of an autonomous needle insertion system for central venous access, which is a risky and challenging procedure involving the simultaneous manipulation of an ultrasound probe and of a catheterization needle. The goal of this medical operation is to provide access to a deep central vein, which is a...
-
Vitreoretinal surgery pertains to the treatment of delicate tissues on the fundus of the eye using thin instruments. Surgeons frequently rotate the eye during surgery, which is called orbital manipulation, to observe regions around the fundus without moving the patient. In this paper, we propose the autonomous orbital manipulation of the eye in robot-assisted vitreoretinal surgery with our tele-op...
-
Important challenges in retinal microsurgery in-clude prolonged operating time, inadequate force feedback, and poor depth perception due to a constrained top-down view of the surgery. The introduction of robot-assisted technology could potentially deal with such challenges and improve the surgeon's performance. Motivated by such challenges, this work develops a strategy for autonomous needle navig...
-
Catheter-based cardiac ablation is the preferred method of treating atrial fibrillation. Conventionally, the catheter is navigated in the heart using X-ray fluoroscopy imaging and an electroanatomical map. Although successful, these imaging modalities do not provide real-time feedback on the quality of lesions created, which in turn could lead to recurrence of arrhythmia. Intracardiac echo (ICE) c...
-
Intracardiac transcatheter systems guided by advanced imaging modalities are gaining popularity in treating mitral regurgitation in non-surgical candidates. Robotically steerable transcatheter systems must use model-based control strategies to ensure safer and more effective transcatheter procedures with less trauma while using smaller control gains. In this paper, a 4-DoF robotically steerable te...
-
Atrial fibrillation (AF) is mostly treated via robotic catheter-based cardiac ablation procedures. Over the last few decades, cables or tendon mechanisms are at the core of available cardiac catheters. Despite advances, the use of cables often results in considerable force loss, nonlinear hysteresis, and control challenges. Most catheters are not equipped with force sensing, which increases the ri...
-
Capsule endoscopic robot holds great promise for the early diagnosis of gastrointestinal diseases without causing discomfort to patients. However, currently available active capsule endoscopic robots suffer from issues such as complex structure, poor mobility, large size, and high cost, which have hindered their widespread adoption and resulted in a lower screening rate for gastrointestinal diseas...
-
Magnetic field is a favorable power source for actuation and control of micro-/nanorobots. To overcome the fast decay of magnetic field for large-workspace microrobotic actuation, mobile field source-based systems have been proposed. In this work, we report a new mobile-coil system, i.e., QuadMag. It consists of four electromagnetic coils, whose motion is actuated by a parallel mechanism. Compared...
-
Neurosurgery could benefit from robot-assisted minimally invasive approaches, but existing robot tools are insufficiently small and compact. Magnetic actuation is an attractive approach to medical robotics because it allows small, modular serial mechanisms to be remotely actuated. Despite these advantages, magnetic actuation is relatively weak compared to alternative actuation methods. In this pap...
-
In this paper, we present the design, fabrication, and evaluation of a novel flexible, yet structurally strong, Concentric Tube Steerable Drilling Robot (CT-SDR) to improve minimally invasive treatment of spinal tumors. Inspired by concentric tube robots, the proposed two degree-of-freedom (DoF) CT-SDR, for the first time, not only allows a surgeon to intuitively and quickly drill smooth planar an...
-
This paper introduces a novel class of hyperredun-dant robots comprised of chains of permanently magnetized spheres enclosed in a cylindrical polymer skin. With their shape controlled using an externally-applied magnetic field, the spherical joints of these robots enable them to bend to very small radii of curvature. These robots can be used as steerable tips for endoluminal instruments. A kinemat...
-
In the last decade, various robotic platforms have been introduced that could support delicate retinal surgeries. Concurrently, to provide semantic understanding of the surgical area, recent advances have enabled microscope-integrated intraoperative Optical Coherent Tomography (iOCT) with high-resolution 3D imaging at near video rate. The combination of robotics and semantic understanding enables ...
-
The 3D reconstruction of patient specific bone models plays a crucial role in orthopaedic surgery for clinical evaluation, surgical planning and precise implant design or selection. This paper considers the problem of reconstructing a patient-specific 3D tibia and fibula model from only two 2D X-ray images and one 3D general model segmented from the lower leg CT scans of one randomly selected pati...
-
Accurate and robust tracking and reconstruction of the surgical scene is a critical enabling technology toward autonomous robotic surgery. Existing algorithms for 3D perception in surgery mainly rely on geometric information, while we propose to also leverage semantic information inferred from the endoscopic video using image segmentation algorithms. In this paper, we present a novel, comprehensiv...
-
Automating the process of manipulating and delivering sutures during robotic surgery is a prominent problem at the frontier of surgical robotics, as automating this task can significantly reduce surgeons' fatigue during tele-operated surgery and allow them to spend more time addressing higher-level clinical decision making. Accomplishing autonomous suturing and suture manipulation in the real worl...
-
Endobronchial intervention is increasingly used as a minimally invasive means for the treatment of pulmonary diseases. In order to reduce the difficulty of manipulation in complex airway networks, robust lumen detection is essential for intraoperative guidance. However, these methods are sensitive to visual artifacts which are inevitable during the surgery. In this work, a cross domain feature int...
-
Autonomous suturing has been a long-sought-after goal for surgical robotics. Outside of staged environments, accurate localization of suture needles is a critical foundation for automating various suture needle manipulation tasks in the real world. When localizing a needle held by a gripper, previous work usually tracks them separately without considering their relationship. Because of the signifi...
-
Optical Coherence Elastography (OCE) is a method that discerns local tissue stiffness using optical information. This method has recently been explored for laryngeal cancer tumor margin detection but has not been widely deployed clinically. Part of the challenge hindering such clinical deployment is the need for controlled high-precision mechanical probing of the tissue. This paper explores the co...
-
Optical Coherence Tomography (OCT) is a non-invasive imaging technique that is instrumental in retinal disease diagnosis and treatment. Segmentation of retinal layers in OCT is an essential step, but remains challenging for common pixel-wise segmentation methods usually fail to obtain the correct layer topology. To tackle this challenge, we propose a novel Locate-to-Segment (L2S) framework to prov...
-
Ultrasound (US) is widely used in image-guided needle procedures. Correctly tracking the needle tip position in US images during the procedure plays an important role in improving the needle targeting accuracy and patient safety. This paper presents a leaning-based visual tracking network with a Siamese architecture, which makes full use of the attention mechanism to explore the potential of globa...
-
Small catheters undergo significant torsional deflections during endovascular interventions. A key challenge in enabling robot control of these catheters is the estimation of their bending planes. This paper considers approaches for estimating these bending planes based on bi-plane image feedback. The proposed approaches attempt to minimize error between either the direct (position-based) or insta...
-
Surgical assistive robots offer the potential for drastically improved patient outcomes through more accurate, more repeatable surgical procedures like shoulder arthroplasty operations. Existing robotic systems typically rely on optical marker tracking and require invasive marker attachment for localization, complicating the surgical workflow and patient recovery. But moving towards a markerless s...
-
In this work, we address the problem of robust segmentation of a continuum robot from images without the need for training data or markers. We present a method that leverages information about the kinematics of these robots to produce an estimate of the robot shape, which is refined through optimization over global image statistics. Our approach can be straightforwardly applied to any continuum ro...
-
Collaborative 3D object detection exploits information exchange among multiple agents to enhance accuracy of object detection in presence of sensor impairments such as occlusion. However, in practice, pose estimation errors due to imperfect localization would cause spatial message misalignment and significantly reduce the performance of collaboration. To alleviate adverse impacts of pose errors, w...
-
Autonomous driving powered by deep learning requires large-scale, high-quality training data from diverse driving environments to operate effectively worldwide. However, collecting and annotating such data is costly and time-consuming. To address this challenge, active learning methods have been explored to select the most informative data samples for training. Nevertheless, most existing methods ...
-
Obstacle detection is a safety-critical problem in robot navigation, where stereo matching is a popular vision-based approach. While deep neural networks have shown impressive results in computer vision, most of the previous obstacle detection works only leverage traditional stereo matching techniques to meet the computational constraints for real-time feedback. This paper proposes a computational...
-
We present a novel approach to interactive 3D object perception for robots. Unlike previous perception algorithms that rely on known object models or a large amount of annotated training data, we propose a poking-based approach that automatically discovers and reconstructs 3D objects. The poking process not only enables the robot to discover unseen 3D objects but also produces multi-view observati...
-
Monocular 3D object detection reveals an economical but challenging task in autonomous driving. Recently center-based monocular methods have developed rapidly with a great trade-off between speed and accuracy, where they usually depend on the object center's depth estimation via 2D features. However, the visual semantic features without sufficient pixel geometry information, may affect the perform...
-
To achieve accurate 3D object detection at a low cost for autonomous driving, many multi-camera methods have been proposed and solved the occlusion problem of monocular approaches. However, due to the lack of accurate estimated depth, existing multi-camera methods often generate multiple bounding boxes along a ray of depth direction for difficult small objects such as pedestrians, resulting in an ...
-
Vision-based autonomous navigation systems rely on fast and accurate object detection algorithms to avoid obstacles. Algorithms and sensors designed for such systems need to be computationally efficient, due to the limited energy of the hardware used for deployment. Biologically inspired event cameras are a good candidate as a vision sensor for such systems due to their speed, energy efficiency, a...
-
Deploying object detectors in embedded devices such as unmanned aerial vehicles (UAVs) comes with many challenges. This is due to both the UAV itself having low embedded resources in terms of computation and memory, and also due to the nature of the captured visual data with the variations in objects' scale, orientation, density, viewpoint, distribution, shape, context and others. It is crucial fo...
-
Test-time domain adaptation, i.e. adapting source-pretrained models to the test data on-the-fly in a source-free, unsupervised manner, is a highly practical yet very challenging task. Due to the domain gap between source and target data, inference quality on the target domain can drop drastically especially in terms of absolute scale of depth. In addition, unsupervised adaptation can degrade the m...
-
Transparent objects are widely used in industrial automation and daily life. However, robust visual recognition and perception of transparent objects have always been a major challenge. Currently, most commercial-grade depth cameras are still not good at sensing the surfaces of transparent objects due to the refraction and reflection of light. In this work, we present a transformer-based transpare...
-
We propose a technique for depth completion of transparent objects using augmented data captured directly from real environments with complicated geometry. Using cyclic adversarial learning we train translators to convert between painted versions of the objects and their real transparent counterpart. The translators are trained on unpaired data, hence datasets can be created rapidly and without an...
-
Depth estimation is an important task in various robotics systems and applications. In mobile robotics systems, monocular depth estimation is desirable since a single RGB camera can be deployable at a low cost and compact size. Due to its significant and growing needs, many lightweight monocular depth estimation networks have been proposed for mobile robotics systems. While most lightweight monocu...
-
Event cameras have the potential to overcome the limitations of classical computer vision in real-world applications. Depth estimation is a crucial step for high-level robotics tasks and has attracted much attention from the community. In this paper, we propose an event-based dense depth estimation architecture, Mixed-EF2DNet, which firstly predicts inter-grid optical flow to compensate for lost t...
-
Distance estimation from vision is fundamental for a myriad of robotic applications such as navigation, manipu-lation, and planning. Inspired by the mammal's visual system, which gazes at specific objects, we develop two novel constraints relating time-to-contact, acceleration, and distance that we call the $\tau$ -constraint and $\Phi$ -constraint. They allow an active (moving) camera to estimate...
-
Self-supervised depth estimation draws a lot of attention recently as it can promote the 3D sensing capa-bilities of self-driving vehicles. However, it intrinsically relies upon the photometric consistency assumption, which hardly holds during nighttime. Although various supervised night-time image enhancement methods have been proposed, their generalization performance in challenging driving scen...
-
The great potential of unsupervised monocular depth estimation has been demonstrated by many works due to low annotation cost and impressive accuracy comparable to supervised methods. To further improve the performance, recent works mainly focus on designing more complex network structures and exploiting extra supervised information, e.g., semantic segmentation. These methods optimize the models b...
-
This paper presents a framework to represent high-fidelity pointcloud sensor observations for efficient communication and storage. The proposed approach exploits Sparse Gaussian Process to encode pointcloud into a compact form. Our approach represents both the free space and the occupied space using only one model (one 2D Sparse Gaussian Process) instead of the existing two-model framework (two 3D...
-
Is it possible for a synthetic to realistic domain adapted neural network in single image depth estimation to truly generalize on real world data? The resultant, adapted model will only generalize on the realistic domain dataset, which only reflects a small portion of the true, real world. As a result, the network still has to cope with the potential danger of domain shift between the realistic do...
-
To achieve autonomy in unknown and unstruc-tured environments, we propose a method for semantic-based planning under perceptual uncertainty. This capability is cru-cial for safe and efficient robot navigation in environment with mobility-stressing elements that require terrain-specific locomotion policies. We propose the Semantic Belief Graph (SBG), a geometric- and semantic-based representation o...
-
Safety has been a key goal for autonomous driving since its inception, and we believe recognizing and responding to risk is a key component of safety. In this work, we aim to answer the question, “How can explainable risk representations be generated and used to produce risk-averse trajectories?” To answer this question, previous work uses risk metrics to formulate an optimization problem. In cont...
-
Multi-objective or multi-destination path planning is crucial for mobile robotics applications such as mobility as a service, robotics inspection, and electric vehicle charging for long trips. This work proposes an anytime iterative system to concurrently solve the multi-objective path planning problem and determine the visiting order of destinations. The system is comprised of an anytime informab...
-
Motion planning framed as optimisation in structured latent spaces has recently emerged as competitive with traditional methods in terms of planning success while significantly outperforming them in terms of computational speed. However, the real-world applicability of recent work in this domain remains limited by the need to express obstacle information directly in state-space, involving simple g...
-
Sampling-based motion planning works well in many cases but is less effective if the configuration space has narrow passages. In this paper, we propose a learning-based strategy to sample in these narrow passages, which improves overall planning time. Our algorithm first learns from the configuration space planning graphs and then uses the learned information to effectively generate narrow passage...
-
Coverage Path Planning (CPP) is an essential feature of robots deployed for applications such as lawn mowing, cleaning, painting, and exploration. However, most of the state-of-the-art CPP methods are proposed for fixed-morphology robots, and the coverage performance is limited by physical constraints such as the inaccessibility of narrow spaces. Apart from area coverage, productivity depends on c...
-
This paper firstly presents our optimal path planning algorithm within a $2\mathrm{D}$ non-convex, polytopic region defined as a sequence of connected convex polytopes. The path is a B-spline curve but being parametrized with its equivalent Bézier representation. By doing this, the local convexity bound of each curve's interval is significantly tighter. Thus, it allows many more possibilities for ...
-
In this paper, we study planning in stochastic systems, modeled as Markov decision processes (MDPs), with preferences over temporally extended goals. Prior work on temporal planning with preferences assumes that the user preferences form a total order, meaning that every pair of outcomes are comparable with each other. In this work, we consider the case where the preferences over possible outcomes...
-
Future conceptions of agile, just-in-time fabrication, lean and “smart” manufacturing, and a host of allied processes that exploit advanced automation, depend in part on realizing improvements in logistics planning. The present paper hypothesizes that the key to improving flexibility will be the inclusion of sophisticated, time-correlated stochastic models of demand—whether that be demand by end-u...
-
In this work, we present a novel robustness measure for continuous-time stochastic trajectories with respect to Signal Temporal Logic (STL) specifications. We show the soundness of the measure and develop a monitor for reasoning about partial trajectories. Using this monitor, we introduce an STL sampling-based motion planning algorithm for robots under uncertainty. Given a minimum robustness requi...
-
This paper presents a new multi-layered algorithm for motion planning under motion and sensing uncertainties for Linear Temporal Logic specifications. We propose a technique to guide a sampling-based search tree in the combined task and belief space using trajectories from a simplified model of the system, to make the problem computationally tractable. Our method eliminates the need to construct f...
-
A key challenge in fast ground robot navigation in 3D terrain is balancing robot speed and safety. Recent work has shown that 2.5D maps (2D representations with additional 3D information) are ideal for real-time safe and fast planning. However, the prevalent approach of generating 2D occupancy grids through raytracing makes the generated map unsafe to plan in, due to inaccurate representation of u...
-
This paper addresses the problem of robotic exploration of unknown indoor environments with deadlines. Indoor exploration using mobile robots has typically focused on exploring the entire environment without considering deadlines. The objective of the prioritized exploration in this paper is to rapidly compute the geometric layout of an initially unknown environment by exploring key regions of the...
-
In communication restricted environments, a multi-robot system can be deployed to either: i) maintain constant communication but potentially sacrifice operational efficiency due to proximity constraints or ii) allow disconnections to increase environmental coverage efficiency, challenges on how, when, and where to reconnect (rendezvous problem). In this work we tackle the latter problem and notice...
-
Recent advances in Reinforcement Learning (RL) have made significant contributions in past years by offering intelligent solutions to solve robotic tasks. However, most RL algorithms, especially the model-free RL, are plagued by low learning efficiency and safety problems. In this paper, we propose using the Bayesian Neural Networks (BNNs) to guide the agent exploring actively to enhance the learn...
-
Combining task and motion planning (TAMP) is crucial for intelligent robots to perform complex and long-horizon tasks. In TAMP, many approaches generally employ Monte-Carlo tree search (MCTS) with upper confidence bound (UCB) for task planning to handle exploration-exploitation trade-off and find globally optimal solutions. However, since UCB basically considers the estimation error caused by nois...
-
We consider the problem of task assignment and scheduling for human-robot teams to enable the efficient completion of complex problems, such as satellite assembly. In high-mix, low volume settings, we must enable the human-robot team to handle uncertainty due to changing task requirements, potential failures, and delays to maintain task completion efficiency. We make two contributions: (1) we acco...
-
Robotic task planning is computationally challenging. To reduce planning cost and support life-long operation, we must leverage prior planning experience. To this end, we address the problem of extracting reusable and generalizable abstract skills from successful plan executions. In previous work, we introduced a supporting framework, allowing us, theoretically, to extract an abstract skill from a...
-
Efficient multi-robot task allocation (MRTA) is fundamental to various time-sensitive applications such as disaster response, warehouse operations, and construction. This paper tackles a particular class of these problems that we call MRTA-collective transport or MRTA-CT - here tasks present varying workloads and deadlines, and robots are subject to flight range, communication range, and payload c...
-
We investigate the utility of employing multiple buffers in solving a class of rearrangement problems with pick- n-swap manipulation primitives. In this problem, objects stored randomly in a lattice are to be sorted using a robot arm with k 1 swap spaces or buffers, capable of holding up to $k$ objects on its end-effector simultaneously. On the structural side, we show that the addition of each ne...
-
Imagine placing an online order on your way to the grocery store, then being able to pick the collected basket upon arrival or shortly after. Likewise, imagine placing any online retail order, made ready for pickup in minutes instead of days. In order to realize such a low-latency automatic warehouse logistics system, solvers must be made to be basket-aware. That is, it is more important that the ...
-
We propose a new formulation for the multi-robot task planning and allocation problem that incorporates (a) precedence relationships between tasks; (b) coordination for tasks allowing multiple robots to achieve increased efficiency; and (c) cooperation through the formation of robot coalitions for tasks that cannot be performed by individual robots alone. In our formulation, the tasks and the rela...
-
In this paper we provide a practical demonstration of how the modularity in a Behavior Tree (BT) decreases the effort in programming a robot task when compared to a Finite State Machine (FSM). In recent years the way to represent a task plan to control an autonomous agent has been shifting from the standard FSM towards BTs. Many works in the literature have highlighted and proven the benefits of s...
-
Precise pick-and-place is essential in robotic applications. To this end, we define an exact training method and an iterative inference method that improve pick-and-place precision with Transporter Networks [1]. We conduct a large scale experiment on 8 simulated tasks. A systematic analysis shows, that the proposed modifications have a significant positive effect on model performance. Considering ...
-
Recent progress in end-to-end Imitation Learning approaches has shown promising results and generalization capabilities on mobile manipulation tasks. Such models are seeing increasing deployment in real-world settings, where scaling up requires robots to be able to operate with high autonomy, i.e. requiring as little human supervision as possible. In order to avoid the need for one-on-one human su...
-
Robot control for tactile feedback based manip-ulation can be difficult due to modeling of physical contacts, partial observability of the environment, and noise in perception and control. This work focuses on solving partial observability of contact-rich manipulation tasks as a Sequence-to-Sequence (Seq2Seq) Imitation Learning (IL) problem. The proposed Seq2Seq model first produces a robot-enviro...
-
Cables are commonplace in homes, hospitals, and industrial warehouses and are prone to tangling. This paper extends prior work on autonomously untangling long cables by introducing novel uncertainty quantification metrics and actions that interact with the cable to reduce perception uncertainty. We present Sliding and Grasping for Tangle Manipulation 2.0 (SGTM 2.0), a system that autonomously unta...
-
Deep learning-based grasp prediction models have become an industry standard for robotic bin-picking systems. To maximize pick success, production environments are often equipped with several end-effector tools that can be swapped on-the-fly, based on the target object. Tool-change, however, takes time. Choosing the order of grasps to perform, and corresponding tool-change actions, can improve sys...
-
Federated learning is a promising technique for training global models in a data-decentralized environment. In this paper, we propose a federated learning approach for robotic object grasping. The main challenge is that the data collected by multiple robots deployed in different environments tends to form heterogeneous data distributions (i.e., non-IID) and that the existing federated learning met...
-
To perform dynamic cable manipulation to realize the configuration specified by a target image, we formulate dynamic cable manipulation as a stochastic forward model. Then, we propose a method to handle uncertainty by maximizing the expectation, which also considers estimation errors of the trained model. To avoid issues like multiple local minima and requirement of differentiability by gradient-b...
-
The skill of pivoting an object with a robotic system is challenging for the external forces that act on the system, mainly given by contact interaction. The complexity increases when the same skills are required to generalize across different objects. This paper proposes a framework for learning robust and generalizable pivoting skills, which consists of three steps. First, we learn a pivoting po...
-
Automating garment manipulation is challenging due to extremely high variability in object configurations. To reduce this intrinsic variation, we introduce the task of “canonicalized-alignment” that simplifies downstream applications by reducing the possible garment configurations. This task can be considered as “cloth state funnel” that manipulates arbitrarily configured clothing items into a pre...
-
Learning to manipulate 3D objects in an interactive environment has been a challenging problem in Reinforcement Learning (RL). In particular, it is hard to train a policy that can generalize over objects with different semantic categories, diverse shape geometry and versatile functionality. In this study, we focused on the contact information in manipulation processes, and proposed a unified repre...
-
Robust and efficient grasping of different objects is still an open problem due to the difficulty of integrating multidisciplinary knowledge such as gripper ontology design, perception, control, and learning. In recent years, learning-based methods have achieved excellent results in grasping various novel objects. However, current methods are usually limited to a single grasping mode or rely on di...
-
Robotic grasping of 3D deformable objects is critical for real-world applications such as food handling and robotic surgery. Unlike rigid and articulated objects, 3D deformable objects have infinite degrees of freedom. Fully defining their state requires 3D deformation and stress fields, which are exceptionally difficult to analytically compute or experimentally measure. Thus, evaluating grasp can...
-
Hierarchical Imitation Learning (HIL) has been proposed to recover highly-complex behaviors in long-horizon tasks from expert demonstrations by modeling the task hierarchy with the option framework. Existing methods either overlook the causal relationship between the subtask and its corresponding policy or cannot learn the policy in an end-to-end fashion, which leads to suboptimality. In this work...
-
Neural control of memory-constrained, agile robots requires small, yet highly performant models. We leverage graph hyper networks to learn graph hyper policies trained with off-policy reinforcement learning resulting in networks that are two orders of magnitude smaller than commonly used networks yet encode policies comparable to those encoded by much larger networks trained on the same task. We s...
-
Interactions with articulated objects are a challenging but important task for mobile robots. To tackle this challenge, we propose a novel closed-loop control pipeline, which integrates manipulation priors from affordance estimation with sampling-based whole-body control. We introduce the concept of agent-aware affordances which fully reflect the agent's capabilities and embodiment and we show tha...
-
Multi-objective optimization problems are ubiquitous in robotics, e.g., the optimization of a robot manipulation task requires a joint consideration of grasp pose configurations, collisions and joint limits. While some demands can be easily hand-designed, e.g., the smoothness of a trajectory, several task-specific objectives need to be learned from data. This work introduces a method for learning ...
-
In order to efficiently learn a dynamics model for a task in a new environment, one can adapt a model learned in a similar source environment. However, existing adaptation methods can fail when the target dataset contains transitions where the dynamics are very different from the source environment. For example, the source environment dynamics could be of a rope manipulated in free space, whereas ...
-
Complex and contact-rich robotic manipulation tasks, particularly those that involve multi-fingered hands and underactuated object manipulation, present a significant challenge to any control method. Methods based on reinforcement learning offer an appealing choice for such settings, as they can enable robots to learn to delicately balance contact forces and dexterously reposition objects without ...
-
Mobile manipulation tasks such as opening a door, pulling open a drawer, or lifting a toilet seat require constrained motion of the end-effector under environmental and task constraints. This, coupled with partial information in novel environments, makes it challenging to employ classical motion planning approaches at test time. Our key insight is to cast it as a learning problem to leverage past ...
-
Optimizing behaviors for dexterous manipulation has been a longstanding challenge in robotics, with a variety of methods from model-based control to model-free reinforcement learning having been previously explored in literature. Such prior work often require extensive trial-and-error training along with task-specific tuning of reward functions, which makes applying dexterous manipulation for gene...
-
A fundamental challenge in teaching robots is to provide an effective interface for human teachers to demonstrate useful skills to a robot. This challenge is exacerbated in dexterous manipulation, where teaching high-dimensional, contact-rich behaviors often require esoteric teleoperation tools. In this work, we present Holo − Dex, a framework for dexter-ous manipulation that places a teacher in a...
-
When using a tool, the grasps used for picking it up, reposing, and holding it in a suitable pose for the desired task could be distinct. Therefore, a key challenge for autonomous in-hand tool manipulation is finding a sequence of grasps that facilitates every step of the tool use process while continuously maintaining force closure and stability. Due to the complexity of modeling the contact dyna...
-
Recent work has demonstrated the ability of deep reinforcement learning (RL) algorithms to learn complex robotic behaviours in simulation, including in the domain of multi-fingered manipulation. However, such models can be challenging to transfer to the real world due to the gap between simulation and reality. In this paper, we present our techniques to train a) a policy that can perform robust de...
-
Although deep reinforcement learning has recently been very successful at learning complex behaviors, it requires a tremendous amount of data to learn a task. One of the fundamental reasons causing this limitation lies in the nature of the trial-and-error learning paradigm of reinforcement learning, where the agent communicates with the environment and pro-gresses in the learning only relying on t...
-
Existing video super-resolution methods often utilize a few neighboring frames to generate a higher-resolution image for each frame. However, the abundant information in distant frames has not been fully exploited in these methods: corresponding patches of the same instance appear across distant frames at different scales. Based on this observation, we propose to improve the video super-resolution...
-
Intelligent robots require object-level scene understanding to reason about possible tasks and interactions with the environment. Moreover, many perception tasks such as scene reconstruction, image retrieval, or place recognition can benefit from reasoning on the level of objects. While keypoint-based matching can yield strong results for finding correspondences for images with small to medium vie...
-
Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones. However, existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices. In this work, we propose a sim...
-
This work presents a new method for unsupervised thermal image classification and semantic segmentation by transferring knowledge from the RGB domain using a multi-domain attention network. Our method does not require any thermal annotations or co-registered RGB-thermal pairs, enabling robots to perform visual tasks at night and in adverse weather conditions without incurring additional costs of d...
-
Event-based cameras have recently shown great potential for high-speed motion estimation owing to their ability to capture temporally rich information asynchronously. Spiking Neural Networks (SNNs), with their neuro-inspired event-driven processing can efficiently handle such asynchronous data, while neuron models such as the leaky-integrate and fire (LIF) can keep track of the quintessential timi...
-
3D face reconstruction plays a major role in many human-robot interaction systems, from automatic face authentication to human-computer interface-based entertainment. To improve robustness against occlusions and noise, 3D face reconstruction networks are often trained on a set of in-the-wild face images preferably captured along different viewpoints of the subject. However, collecting the required...
-
Existing multi-agent perception algorithms usually select to share deep neural features extracted from raw sensing data between agents, achieving a trade-off between accuracy and communication bandwidth limit. However, these methods assume all agents have identical neural networks, which might not be practical in the real world. The transmitted features can have a large domain gap when the models ...
-
As robots are widely used in assisting manual tasks, an interesting challenge is: Can mobile robots help create a labeled knowledge dataset that can be used for efficiently creating deep learning models for other sensors? This paper proposes an Unsupervised Person Labeling and Identification (UPLIFT) framework to automatically enlarge the labeled knowledge dataset. Typically, manual data labeling ...
-
We are witnessing significant progress on perception models, specifically those trained on large-scale internet images. However, efficiently generalizing these perception models to unseen embodied tasks is insufficiently studied, which will help various relevant applications (e.g., home robots). Unlike static perception methods trained on pre-collected images, the embodied agent can move around in...
-
The development of embodied agents that can communicate with humans in natural language has gained increasing interest over the last years, as it facilitates the diffusion of robotic platforms in human-populated environments. As a step towards this objective, in this work, we tackle a setting for visual navigation in which an autonomous agent needs to explore and map an unseen indoor environment w...
-
Miniaturized autonomous unmanned aerial vehicles (UAVs) are an emerging and trending topic. With their form factor as big as the palm of one hand, they can reach spots otherwise inaccessible to bigger robots and safely operate in human surroundings. The simple electronics aboard such robots (sub-100 mW) make them particularly cheap and attractive but pose significant challenges in enabling onboard...
-
We propose a compact pipeline to unify all the steps of Visual Localization: image retrieval, candidate re-ranking and initial pose estimation, and camera pose refinement. Our key assumption is that the deep features used for these individual tasks share common characteristics, so we should reuse them in all the procedures of the pipeline. Our DRAN (Deep Retrieval and image Alignment Network) is a...
-
In this work we present FreDSNet, a deep learning solution which obtains semantic 3D understanding of indoor environments from single panoramas. Omnidirectional images reveal task-specific advantages when addressing scene understanding problems due to the 360-degree contextual information about the entire environment they provide. However, the inherent characteristics of the omnidirectional images...
-
Robust visual place recognition (VPR) against significant appearance changes is crucial for the life-long operation of mobile robots. Focusing on this task, we propose a Co-Attentive Hierarchical Image Representations (CAHIR) framework for VPR, which unifies attention-sharing global and local descriptor generation into one encoding pipeline. The hierarchical descriptors are applied to a coarse-to-...
-
We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual- inertial odometry to produce dense depth estimates with metric scale. Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment. We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in inverse RMS...
-
Despite the giant leap made in object 6D pose estimation and robotic grasping under structured scenarios, most approaches depend heavily on the exact CAD models of target objects beforehand, thereby limiting their wide applications. To address this, we propose a novel knowledge-guided network - KGNet to estimate the pose and size of category-level unseen objects. This network includes three primar...
-
We demonstrate how off-the-shelf single-image depth estimation methods can be augmented with guidance from optical flow to achieve consistent and accurate online depth estimation using video sequences of static scenes. While previous work has successfully leveraged the complementary nature of optical flow and depth estimation, these techniques use computationally expensive test time optimization s...
-
Correspondence identification (CoID) is an essential component for collaborative perception in multi-robot systems, such as connected autonomous vehicles. The goal of CoID is to identify the correspondence of objects observed by multiple robots in their own field of view in order for robots to consistently refer to the same objects. CoID is challenging due to perceptual aliasing, object non-covisi...
-
Human-assistive systems, such as robots, need to correctly understand the surrounding situation based on obser-vations and output the required support actions for humans. Language is one of the important channels to communicate with humans, and robots are required to have the ability to express their understanding and action-planning results. In this study, we propose a new task of operative actio...
-
Unsupervised visual odometry as an active topic has attracted extensive attention, benefiting from its label-free practical value and robustness in real-world scenarios. However, the performance of camera pose estimation and tracking through deep neural network is still not as ideal as most other tasks, such as detection, segmentation and depth estimation, due to the lack of drift correction in th...
-
Accurately estimating the human pose is an essential task for many applications in robotics. However, existing pose estimation methods suffer from poor performance when occlusion occurs. Recent advances in NLP have been very successful in predicting the missing words conditioned on visible words. We draw upon the sentence completion analogy in NLP to guide our model to address occlusions in the po...
-
We introduce a new task of question generation to eliminate the uncertainty of referring expressions in 3D indoor environments (3D-REQ). Referring to an object using natural language is one of the most common occurrences in daily human conversations; therefore, instructing robots to identify a certain object using natural language could be an essential task in var-ious robotic applications, such a...
-
Gaze estimation provides insight into a person's intent and engagement level, which is helpful in collaborative human-robot applications. With significant advancements in deep learning architectures, appearance-based gaze estimation has gained much attention. Appearance-based methods have shown significant improvement in gaze accuracy and, unlike traditional approaches, they function well in envir...
-
Many older adults prefer to stay in their own homes and age-in-place. However, physical and cognitive limitations in independently completing activities of daily living (ADLs) requires older adults to receive assistive support, often necessitating transitioning to care centers. In this paper, we present the development of a novel deep learning human activity recognition and classification architec...
-
Given a building floorplan, humans can localize themselves by matching the observation of the environment with the floorplan using geometric, semantic, and topological clues. Inspired by this insight, this paper proposes a learning- based topometric robot localization method FloorplanNet, which implements a match between a metric robot map and the potentially inaccurate building floorplan in nonun...
-
Autonomous localization in unknown environments is a fundamental problem in many emerging fields and the monocular visual approach offers many advantages, due to being a rich source of information and avoiding comparatively more complicated setups and multisensor calibration. Deep learning opened new venues for monocular odometry yielding not only end-to-end approaches but also hybrid methods comb...
-
Correspondence selection, a crucial step in many computer vision tasks, aims to distinguish between inliers and outliers from putative correspondences. The coherence of correspondences is often used for predicting inlier probability, but it is difficult for neural networks to extract coherence contexts based only on quadruple coordinates. To overcome this difficulty, we propose enhancing the preli...
-
A distinctive representation of image patches in form of features is a key component of many computer vision and robotics tasks, such as image matching, image retrieval, and visual localization. State-of-the-art descriptors, from hand-crafted descriptors such as SIFT to learned ones such as HardNet, are usually high-dimensional; 128 dimensions or even more. The higher the dimensionality, the large...
-
Visual SLAM (Simultaneous Localisation and Mapping) aims to simultaneously estimate camera poses and depth maps from navigation videos captured. While recent deep learning based methods have achieved great success on this task, they tend to work well on source domain data and suffer from performance degradation on the unseen data of target domain. Hence, we propose an online adaptation approach to...
-
This paper presents a simultaneous localization and map-assisted environment recognition (SLAMER) method. Mobile robots usually have an environment map and environment information can be assigned to the map. Important information such as no entry zone can be predicted from the map if localization has succeeded. However, this prediction is failed when localization does not work. Uncertainty of pose...
-
Performing accurate localization while maintaining the low-level communication bandwidth is an essential challenge of multi-robot simultaneous localization and mapping (MR-SLAM). In this paper, we tackle this problem by generating a compact yet discriminative feature descriptor with minimum inference time. We propose descriptor distillation that formulates the descriptor generation into a learning...
-
Occupancy mapping has been widely utilized to represent the surroundings for autonomous robots to perform tasks such as navigation and manipulation. While occupancy mapping in 2-D environments has been well-studied, there have been few approaches suitable for 3-D dynamic occupancy mapping which is essential for aerial robots. This paper presents a novel 3-D dynamic occupancy mapping algorithm call...
-
State-of-the-art monocular visual-inertial odometry (VIO) approaches rely on sparse point features in part due to their efficiency, robustness, and prevalence, while ignoring high-level structural regularities such as planes that are common to man-made environments and can be exploited to further constrain motion. Generally, planes can be observed by a camera for significant periods of time due to...
-
In this paper, we present BAMF-SLAM, a novel multi-fisheye visual-inertial SLAM system that utilizes Bundle Adjustment (BA) and recurrent field transforms (RFT) to achieve accurate and robust state estimation in challenging scenarios. First, our system directly operates on raw fisheye images, enabling us to fully exploit the wide Field-of-View (FoV) of fisheye cameras. Second, to overcome the low-...
-
In this paper, we present our approach to efficiently leveraging GPU resources to improve the performance of local bundle adjustment for visual-inertial SLAM. We observe that for local bundle adjustment (i) the Schur complement method, a technique often used to speed up bundle adjustment, has the largest overhead when solving for the parameter update, and (ii) the workload consists of operations o...
-
In recent years, the visual-inertial-ranging (VIR) state estimator with a position-unknown UWB network has become popular. However, most existing VIR methods leverage centralized algorithms to initialize the UWB anchors, which are challenging to be applied to massive UWB networks. In this paper, we propose a distributed initialization method for consistent visual-inertial-ranging odometry with a p...
-
Vascular shunt insertion is a fundamental surgical procedure used to temporarily restore blood flow to tissues. It is often performed in the field after major trauma. We formulate a problem of automated vascular shunt insertion and propose a pipeline to perform Automated Vascular Shunt Insertion (AVSI) using a da Vinci Research Kit. The pipeline uses a learned visual model to estimate the locus of...
-
Communication latency in any delicate telerobotic operation (such as remote surgery over distance) would impose a significant challenge due to the temporal degradation of visual perception and can substantially affect the outcomes. Less is known, however, about the neurophysiological basis of how operators adapt/react to delayed visual feedback. Identification of such neural markers might provide ...
-
Injections into specific retinal layers of the eye present a serious challenge to surgeons in terms of accuracy and perception. The emergence of new gene therapies further emphasizes the need for effective tools for localized drug delivery. Unlike the dominant approach of delivering drugs via a transvitreal intraocular pathway, this paper demonstrates the feasibility of delivering injections into ...
-
Manual labeling of gestures in robot-assisted surgery is labor intensive, prone to errors, and requires expertise or training. We propose a method for automated and explainable generation of gesture transcripts that leverages the abundance of data for image segmentation. Surgical context is detected using segmentation masks by examining the distances and intersections between the tools and objects...
-
Endotracheal intubation is a mandatory competency for most medical staff. This procedure involves opening the entrance of the patient's upper windpipe using a laryngoscope and then inserting a tube into the windpipe to supply Oxygen to the patient. This time critical intervention requires careful control of the force vector on the tongue to lift it parallel to the jaw than to push the jaw to open ...
-
As a permanent solution for patients who cannot contract their urinary bladder, an artificial detrusor muscle appears a higher outcome approach compared to current sacral neurostimulators featured by severe long-term side effects. In this paper, a novel soft robotic detrusor is presented to overcome the limitations of the state-of-the-art solutions. It is based on two identical origami-based hydra...
-
Cochlear implants (CIs) can improve hearing in patients suffering from sensorineural hearing loss via an electrode array (EA) carefully inserted in the scala tympani. Current EAs can cause trauma during insertion, threatening hearing preservation; hence we proposed a pre-curved thermally drawn EA that curls into the cochlea under the influence of body temperature. However, the additional surgical ...
-
Dexterity is demanded for an endoscopic tool to handle complicated procedures in neurosurgery, e.g., removing diseased tissue from inside the deep brain along a tortuous path. Current robotic tools are either rigid or lack wristed motion ability at the tip, leading to limited usage in minimally invasive procedures. In this paper, a hybrid steerable robot with a magnetic wristed forceps is proposed...
-
While in the design phase of a robotic system for the procedures performed in surgical confined spaces or hard-to-reach-deep surgical fields, designers can leverage a systematic method to compare the design alternatives for tele-surgical manipulators quantitatively. Unlike most of the work in the literature, we propose an approach for comparing design alternatives by considering the spurious motio...
-
Many surgical tasks require three or more tools working together, where a hands-free interface could extend a surgeon's actions to control a third surgical tool. However, most current interfaces do not allow skilled control of grasping critical to robotic manipulation. Here we first present a systematic study to identify efficient and intuitive interaction strategies to control grasping of a surgi...
-
Transseptal puncture (TSP) is a prerequisite for left atrial catheter ablation for atrial fibrillation, requiring access from the right side of the heart. It is a demanding procedural step associated with complications, including inadvertent puncturing and application of large forces on the tissue wall. Robotic systems have shown great potential to overcome such challenges by introducing force-sen...
-
Despite the availability of computer-aided simulators and recorded videos of surgical procedures, junior residents still heavily rely on experts to answer their queries. However, expert surgeons are often overloaded with clinical and academic workloads and limit their time in answering. For this purpose, we develop a surgical question-answering system to facilitate robot-assisted surgical scene an...
-
Teleoperated techniques enable remote human-robot interaction and have been widely accepted in robot-assisted surgeries. However, it is still hard to guarantee the safety of teleoperated surgery due to the imperfect input commands limited by remote perception, preventing teleoperated surgery from being widely used. We propose a new framework to avoid the collision of surgery robots and human tissu...
-
In medical robotics and image-guided surgery (IGS), registration is needed in order to align together the coordinate frames of robots, medical imaging modalities, surgical tools, and patients. Existing registration algorithms often assume one point set to be a noise-free model while the other to contain noise and outliers. However, in real scenarios, noise and outliers can exist in both point sets...
-
In robotic assisted surgeries, surgical tools are inserted into the human body via an incision point in the abdominal wall, which is imposed as a remote center of motion (RCM). The selection of the incision's point location in the human body is critical for the success of the surgical procedure. In this paper, we propose a simulation tool for finding the optimal incision point location, which can ...
-
Robot-assisted percutaneous needle insertion is expected to significantly increase targeting accuracy in minimally invasive operations. For this, it is necessary to provide mathematical models that can accurately capture the underlying dynamics of medical needles. Here, we present a novel nonlinear mathematical model of flexible medical needles based on the Absolute Nodal Coordinate Formulation. T...
-
The diagnostic value of biopsies is highly dependent on the placement of needles. Robotic trajectory guidance has been shown to improve needle positioning, but feedback for real-time navigation is limited. Haptic display of needle tip forces can provide rich feedback for needle navigation by enabling localization of tissue structures along the insertion path. We present a collaborative robotic bio...
-
Coexisting with others and interacting in society implies sharing knowledge and attention about world objects, events, features, episodes, and even imagination or abstract ideas in time and space. Inspired by human phenomenological, cognitive and behavioral research, this work focuses on the study of joint attention (JA) for human-robot interaction (HRI), based on two main assumptions: a) the perc...
-
In this work, we study how to build socially intelligent robots to assist people in their homes. In particular, we focus on assistance with online goal inference, where robots must simultaneously infer humans' goals and how to help them achieve those goals. Prior assistance methods either lack the adaptivity to adjust helping strategies (i.e., when and how to help) in response to uncertainty about...
-
Embodied agents are expected to perform more complicated tasks in an interactive environment, with the progress of Embodied AI in recent years. Existing embodied tasks including Embodied Referring Expression (ERE) and other QA-form tasks mainly focuses on interaction in term of linguistic instruction. Therefore, enabling the agent to manipulate objects in the environment for exploration actively h...
-
This paper introduces a deep learning (DL) approach to predicting congestion delays in large multi-robot systems. The problem is motivated by real-world problems in modern logistics automation, such as a warehouse with hundreds to thousands of coordinated mobile robots. Here, the large scale, the complexity of the control software, and the uncertainties of the robots' dynamics make direct (simulat...
-
We consider the coordinated escort problem, where a decentralised team of supporting robots implicitly assist the mission of higher-value principal robots. The defining challenge is how to evaluate the effect of supporting robots' actions on the principal robots' mission. To capture this effect, we define two novel auxiliary reward functions for supporting robots called satisfaction improvement an...
-
We investigate and develop algorithms for social fairness in coverage control problems. Existing coverage control methods are efficient, optimizing the average expected distance from any event to the nearest robot. However, in societal applications like disaster response or transportation, these conventional objectives lead to disparate coverage costs with respect to different groups within a popu...
-
We develop a resilient binary hypothesis testing frame-work for decision making in adversarial multi-robot crowdsensing tasks. This framework exploits stochastic trust observations between robots to arrive at tractable, resilient decision making at a centralized Fusion Center (FC) even when i) there exist malicious robots in the network and their number may be larger than the number of legitimate ...
-
This paper proposes an algorithm for generating Pareto-optimal privacy-aware trajectories for multi-robot coverage. Our approach utilizes a genetic algorithm to generate a set of modified trajectories for a team of robots that wishes to obscure its goal from an observer. A novel velocity-constrained crossover algorithm ensures all child trajectories are feasible for a holonomic vehicle. The Pareto...
-
The motion planning problem for multiple unstop-pable agents is of interest in many robotics applications, for example, autonomous traffic management for multiple fixed-wing aircraft. Unfortunately, many of the existing algorithms cannot provide safety for such agents, because they require the agents to be able to brake to a complete stop for safety and feasibility insurance. In this paper, we pre...
-
In this paper, we consider a team of mobile robots executing simultaneously multiple behaviors by different subgroups, while maintaining global and subgroup line-of-sight (LOS) network connectivity that minimally constrains the original multi-robot behaviors. The LOS connectivity between pairwise robots is preserved when two robots stay within the limited communication range and their LOS remains ...
-
In this work, we address a visbility-based target tracking problem in a polygonal environment in which a group of mobile observers try to maintain a line-of-sight with a mobile intruder. We build a bridge between data mining and visibility-based tracking using a novel tiling scheme for the polygon. First, we propose a tracking strategy for a team of guards located on the tiles to dynamically track...
-
This paper proposes a decentralized passive impedance control scheme for collaborative grasping using under-actuated aerial manipulators (AMs). The AM system is formulated, using a proper coordinate transformation, as an inertially decoupled dynamics with which a passivity-based control design is conducted. Since the interaction for grasping can be interpreted as a feedback interconnection of pass...
-
In this work, we propose a distributed control scheme for multi-robot systems in the presence of multiple constraints using control barrier functions. The proposed scheme expands previous work where only one single constraint can be handled. Here we show how to transform multiple constraints to a collective one using a smoothly approximated minimum function. Additionally, human-in-the-loop control...
-
This paper presents LEMURS, an algorithm for learning scalable multi-robot control policies from cooperative task demonstrations. We propose a port-Hamiltonian description of the multi-robot system to exploit universal physical constraints in interconnected systems and achieve closed-loop stability. We represent a multi-robot control policy using an architecture that combines self-attention mechan...
-
Active search, in applications like environment monitoring or disaster response missions, involves autonomous agents detecting targets in a search space using decision making algorithms that adapt to the history of their observations. Active search algorithms must contend with two types of uncertainty: detection uncertainty and location uncertainty. The more common approach in robotics is to focus...
-
Unmanned Aerial Vehicles (UAVs) have become prevalent in Search-And-Rescue (SAR) missions. However, existing solutions to the control and coordination of UAV s are mostly limited to specific environments and are not robust to handle unreliable/unstable communications. To deal with these challenges, Hierarchical Multi-Agent Actor-Critic (HMAAC) framework is proposed where a high-level policy is pla...
-
Robotic solutions for quick disaster response are essential to ensure minimal loss of life, especially when the search area is too dangerous or too vast for human rescuers. We model this problem as an asynchronous multi-agent active-search task where each robot aims to efficiently seek objects of interest (OOIs) in an unknown environment. This formulation addresses the requirement that search miss...
-
Rescue missions in mountain environments are hardly achievable by standard legged robots—because of the high slopes—or by flying robots—because of limited payload capacity. We present a concept for a rope-aided climbing robot which can negotiate up-to-vertical slopes and carry heavy payloads. The robot is attached to the mountain through a rope, and it is equipped with a leg to push against the mo...
-
The deployment of robots for Gas Source Localization (GSL) tasks in hazardous scenarios significantly reduces the risk to humans and animals. Gas sensing using mobile robots focuses primarily on simplified scenarios, due to the complexity of gas dispersion, with a current trend towards tackling more complex environments. However, most state-of-art GSL algorithms for environments with obstacles onl...
-
A self-driving car must be able to reliably handle adverse weather conditions (e.g., snowy) to operate safely. In this paper, we investigate the idea of turning sensor inputs (i.e., images) captured in an adverse condition into a benign one (i.e., sunny), upon which the downstream tasks (e.g., semantic segmentation) can attain high accuracy. Prior work primarily formulates this as an unpaired imag...
-
In this paper, we propose a novel learning framework for autonomous systems that uses a small amount of “auxiliary information” that complements the learning of the main modality, called “small-shot auxiliary modality distillation network (AMD-S-Net)”. The AMD-S-Net contains a two-stream framework design that can fully extract information from different types of data (i.e., paired/unpaired multi-m...
-
Accurate camera-to-lidar calibration is a requirement for sensor data fusion in many 3D perception tasks. In this paper, we present SceneCalib, a novel method for simultaneous self-calibration of extrinsic and intrinsic parameters in a system containing multiple cameras and a lidar sensor. Existing methods typically require specially designed calibration targets and human operators, or they only a...
-
Road anomaly detection is critical to safe autonomous driving, because current road scene understanding models are usually trained in a closed-set manner and fail to identify unknown objects. What's worse, it is difficult, if not impossible, to collect a large-scale dataset with anomaly annotations. So this paper studies unsupervised anomaly detection which finds out anomaly regions using scene pa...
-
Learning-based behavior prediction methods are increasingly being deployed in real-world autonomous systems, e.g., in fleets of self-driving vehicles, which are beginning to commercially operate in major cities across the world. Despite their advancements, however, the vast majority of prediction systems are specialized to a set of well-explored geographic regions or operational design domains, co...
-
Autonomous vehicles (AVs) must share the driving space with other drivers and often employ conservative motion planning strategies to ensure safety. These conservative strategies can negatively impact AV's performance and significantly slow traffic throughput. Therefore, to avoid conservatism, we design an interaction-aware motion planner for the ego vehicle (AV) that interacts with surrounding ve...
-
The task of motion forecasting is critical for self- driving vehicles (SDV s) to be able to plan a safe maneuver. Towards this goal, modern approaches reason about the map, the agents' past trajectories and their interactions in order to produce accurate forecasts. The predominant approach has been to encode the map and other agents in the reference frame of each target agent. However, this approa...
-
Moving Object Detection (MOD) is a critical vision task for successfully achieving safe autonomous driving. Despite plausible results of deep learning methods, most existing approaches are only frame-based and may fail to reach reasonable performance when dealing with dynamic traffic participants. Recent advances in sensor technologies, especially the Event camera, can naturally complement the con...
-
A novel mechanism to derive self-entanglement-free path for tethered differential-driven robots is proposed in this work. The problem is tailored to the applications of tethered robots without an omni-directional tether re-tractor which is often encountered when an omni-directional tether retracting mechanism is incapable of being jointly equipped with other geometrically complex devices (e.g. a m...
-
Typical robotic systems rely on models for planning. Therefore, the quality of the robot's behavior is heavily dependent on how accurately the model can predict the outcome of the robot's actions in the environment. A challenge, however, is that no model is perfect; moreover, we often do not know where discrepancies between the model's prediction and the actual outcome occur prior to observing exe...
-
We study the sample placement and shortest tour problem for robots tasked with mapping environmental phenomena modeled as stationary random fields. The objective is to minimize the resources used (samples or tour length) while guaranteeing estimation accuracy. We give approximation algorithms for both problems in convex environments. These improve previously known results, both in terms of theoret...
-
This paper proposes the Real-Time Fast Marching Tree (RT-FMT), a real-time planning algorithm that features local and global path generation, multiple-query planning, and dynamic obstacle avoidance. During the search, RT-FMT quickly looks for the global solution and, in the meantime, generates local paths that can be used by the robot to start execution faster. In addition, our algorithm constantl...
-
We propose an algorithm for solving the time-dependent shortest path problem in flow fields where the FIFO (first-in-first-out) assumption is violated. This problem variant is important for autonomous vehicles in the ocean, for example, that cannot arbitrarily hover in a fixed position and that are strongly influenced by time-varying ocean currents. Although polynomial-time solutions are available...
-
In this paper, we present a novel method of motion planning for performing complex manipulation tasks by using human demonstration and exploiting the screw geometry of motion. We consider complex manipulation tasks where there are constraints on the motion of the end effector of the robot. Examples of such tasks include opening a door, opening a drawer, transferring granular material from one cont...
-
With the development of robotics, ground robots are no longer limited to planar motion. Passive height variation due to complex terrain and active height control provided by special structures on robots require a more general navigation planning framework beyond 2D. Existing methods rarely considers both simultaneously, limiting the capabilities and applications of ground robots. In this paper, we...
-
In very high-dimensional $(\gg 10)$ spaces, a collection of points generated uniformly at random will concentrate very tightly about its expected value - defying intuition developed in low-dimensional spaces. This paper explores the implications of this for two major classes of sample-based robot motion planning algorithms: Rapidly Exploring Random Trees (RRTs) and Probabilistic Road Maps (PRMs). ...
-
Learning-based trajectory planners in robotics have attracted growing interest given their ability to plan for complex tasks. These planners are typically trained in simulation under nominal conditions before being implemented on real robots. However, in real settings, the presence of motion and sensing uncertainties causes the robot to deviate from planned reference trajectories potentially leadi...
-
In many real-world robotic scenarios, we cannot assume exact knowledge about a robot's state due to unmodeled dynamics or noisy sensors. Planning in belief space addresses this problem by tightly coupling perception and planning modules to obtain trajectories that take into account the environment's stochasticity. However, existing works are often limited to tasks such as the classic reach-avoid p...
-
Uncertainty is prevalent in robotics. Due to measurement noise and complex dynamics, we cannot estimate the exact system and environment state. Since conservative motion planners are not guaranteed to find a safe control strategy in a crowded, uncertain environment, we propose a density-based method. Our approach uses a neural network and the Liouville equation to learn the density evolution for a...
-
Adaptive Informative Path Planning with Multi-modal Sensing (AIPPMS) considers the problem of an agent equipped with multiple sensors, each with different sensing accuracy and energy costs. The agent's goal is to explore the environment and gather information subject to its resource constraints in unknown, partially observable environments. Previous work has focused on the less general Adaptive In...
-
Autonomous vehicles (AVs) need to reason about the multimodal behavior of neighboring agents while planning their own motion. Many existing trajectory planners seek a single trajectory that performs well under all plausible futures simultaneously, ignoring bi-directional interactions and thus leading to overly conservative plans. Policy planning, whereby the ego agent plans a policy that reacts to...
-
In active source seeking, a robot takes repeated measurements in order to locate a signal source in a cluttered and unknown environment. A key component of an active source seeking robot planner is a model that can produce estimates of the signal at unknown locations with uncertainty quantification. This model allows the robot to plan for future measurements in the environment. Traditionally, this...
-
In autonomous robotic decision-making under uncertainty, the tradeoff between exploitation and exploration of available options must be considered. If secondary information associated with options can be utilized, such decision-making problems can often be formulated as contextual multi-armed bandits (CMABs). In this study, we apply active inference, which has been actively studied in the field of...
-
Planning and control for uncertain contact systems is challenging as it is not clear how to propagate uncertainty for planning. Contact-rich tasks can be modeled efficiently using complementarity constraints among other techniques. In this paper, we present a stochastic optimization technique with chance constraints for systems with stochastic complementarity constraints. We use a particle filter-...
-
A congestion-aware path planning method is pre-sented for mobile robots during long-term deployment in human occupied environments. With known spatial-temporal crowd patterns, the robot will navigate to its destination via less congested areas. Traditional traffic-aware routing methods do not consider spatial-temporal anomalies of macroscopic crowd behaviour that can deviate from the predicted cro...
-
In this paper, we present a novel Model Predictive Control method for autonomous robot planning and control subject to arbitrary forms of uncertainty. The proposed Risk-Aware Model Predictive Path Integral (RA-MPPI) control utilizes the Conditional Value-at-Risk (CVaR) measure to generate optimal control actions for safety-critical robotic applications. Different from most existing Stochastic MPCs...
-
We consider the problem of motion and communication planning under uncertainty with limited information from a remote sensor network. Because the remote sensors are power and bandwidth limited, we use event-triggered (ET) estimation to manage communication costs. We introduce a fast and efficient sampling-based planner which computes motion plans coupled with ET communication strategies that minim...
-
Advances in robotic skill acquisition have made it possible to build general-purpose libraries of learned skills for downstream manipulation tasks. However, naively executing these skills one after the other is unlikely to succeed without accounting for dependencies between actions prevalent in longhorizon plans. We present Sequencing Task-Agnostic Policies (STAP), a scalable framework for trainin...
-
Modeling dynamics is often the first step to making a vehicle autonomous. While on-road autonomous vehicles have been extensively studied, off-road vehicles pose many challenging modeling problems. An off-road vehicle encounters highly complex and difficult-to-model terrain/vehicle interactions, as well as having complex vehicle dynamics of its own. These complexities can create challenges for eff...
-
Evolutionary algorithms are commonly used to solve many complex optimization problems in such fields as robotics, industrial automation, and complex system design. Yet, their performance is limited when dealing with high-dimensional complex problems because they often require enormous computational resources to yield desired solutions, and they may easily trap into local optima. To solve this prob...
-
Given a natural language instruction and an input scene, our goal is to train a model to output a manipulation program that can be executed by the robot. Prior approaches for this task possess one of the following limitations: (i) rely on hand-coded symbols for concepts limiting generalization beyond those seen during training [1] (ii) infer action sequences from instructions but require dense sub...
-
Robots performing mobile manipulation in unstructured environments must identify grasp affordances quickly and with robustness to perception noise. Yet in domains such as underwater manipulation, where perception noise is severe, computation is constrained, and the environment is dynamic, existing techniques fail. They are too computationally demanding, or too sensitive to noise to allow for close...
-
The success of 6-DoF grasp learning with point cloud input is tempered by the computational costs resulting from their unordered nature and pre-processing needs for reducing the point cloud to a manageable size. These properties lead to failure on small objects with low point cloud cardinality. Instead of point clouds, this manuscript explores grasp generation directly from the RGB-D image input. ...
-
The choice of a grasp plays a critical role in the success of downstream manipulation tasks. Consider a task of placing an object in a cluttered scene; the majority of possible grasps may not be suitable for the desired placement. In this paper, we study the synergy between the picking and placing of an object in a cluttered scene to develop an algorithm for task-aware grasp estimation. We present...
-
Planar grasp detection is one of the most fundamental tasks to robotic manipulation, and the recent progress of consumer-grade RGB-D sensors enables delivering more comprehensive features from both the texture and shape modalities. However, depth maps are generally of a relatively lower quality with much stronger noise compared to RGB images, making it challenging to acquire grasp depth and fuse m...
-
Contact can be conceptualized as a set of constraints imposed on two bodies that are interacting with one another in some way. The nature of a contact, whether a point, line, or surface, dictates how these bodies are able to move with respect to one another given a force, and a set of contacts can provide either partial or full constraint on a body's motion. Decades of work have explored how to ex...
-
We introduce a spherical fingertip sensor for dynamic manipulation. It is based on barometric pressure and time-of-flight proximity sensors and is low-latency, compact, and physically robust. The sensor uses a trained neural network to estimate the contact location and three-axis contact forces based on data from the pressure sensors, which are embedded within the sensor's sphere of polyurethane r...
-
We study the problem of object retrieval in scenarios where visual sensing is absent, object shapes are unknown beforehand and objects can move freely, like grabbing objects out of a drawer. Successful solutions require localizing free objects, identifying specific object instances, and then grasping the identified objects, only using touch feedback. Unlike vision, where cameras can observe the en...
-
In this paper, we address the problem of using visuo-tactile feedback for 6-DoF localization and 3D reconstruction of unknown in-hand objects. We propose FingerSLAM, a closed-loop factor graph-based pose estimator that combines local tactile sensing at finger-tip and global vision sensing from a wrist-mount camera. FingerSLAM is constructed with two constituent pose estimators: a multi-pass refine...
-
To fully explore the potential of robots for dexterous manipulation, this paper presents a whole dynamic grasping process to achieve fluent grasping of a target object by the robot end-effector. The process starts from the phase of approaching the object over the phases of colliding with the object and letting it roll about the colliding point to the final phase of catching it by the palm or grasp...
-
The Synthetic Nervous System (SNS) is a biologically inspired neural network (NN). Due to its capability of capturing complex mechanisms underlying neural computation, an SNS model is a candidate for building compact and interpretable NN controllers for robots. Previous work on SNSs has focused on applying the model to the control of legged robots and the design of functional subnetworks (FSNs) to...
-
In this paper, we propose a signed distance field (SDF)-based deep Q-learning framework for multi-object re-arrangement. Our method learns to rearrange objects with non-prehensile manipulation, e.g., pushing, in unstructured environments. To reliably estimate Q-values in various scenes, we train the Q-network using an SDF-based scene graph as the state-goal representation. To this end, we introduc...
-
Language-based communications are essential in human-robot interaction, especially for the majority of non-expert users. In this paper, we present SeeAsk, an open-world interactive visual grounding system to grasp specified targets with ambiguous natural language instructions. The main contribution of SeeAsk is that it can robustly handle open-world scenes in terms of both open-set objects and ope...
-
Generating dexterous grasping has been a long-standing and challenging robotic task. Despite recent progress, existing methods primarily suffer from two issues. First, most prior art focuses on a specific type of robot hand, lacking generalizable capability of handling unseen ones. Second, prior arts oftentimes fail to rapidly generate diverse grasps with a high success rate. To jointly tackle the...
-
The application of mechanical and other physical properties to the development of robotic systems that can easily adapt to changing external situations is known as mechanical intelligence. Following this concept, many robot hand designs can produce self-adaptive and versatile grasps with simple underactuated fingers and open-loop control, while mechanical- intelligent strategies for dexterous mani...
-
Multi-finger grasping relies on high quality training data, which is hard to obtain: human data is hard to transfer and synthetic data relies on simplifying assumptions that reduce grasp quality. By making grasp simulation differentiable, and contact dynamics amenable to gradient-based optimization, we accelerate the search for high-quality grasps with fewer limiting assumptions. We present Grasp'...
-
Robot manipulation today generally focuses on motions exclusively with a robot arm or a dexterous hand, but usually not a combination of both. However, complex manipulation tasks can require coordinating arm and hand motions that leverage capabilities of both, much like the coordinated arm and hand motions carried out by humans to perform everyday tasks. In this work, we evaluate unified manipulat...
-
Modern collaborative robotic applications require robot motions that are predictable for human coworkers. Therefore, trajectories often need to be planned in task space rather than configuration space ($\mathcal{C}$-space). While the interpolation of translations in Euclidean space is straightforward, the interpolation of rotations in $SO(3)$ is more complex. Most approaches originating from compu...
-
3D object reconfiguration encompasses common robot manipulation tasks in which a set of objects must be moved through a series of physically feasible state changes into a desired final configuration. Object reconfiguration is challenging to solve in general, as it requires efficient reasoning about environment physics that determine action validity. This information is typically manually encoded i...
-
This paper explores the problem of collision-free motion generation for manipulators by formulating it as a global motion optimization problem. We develop a parallel optimization technique to solve this problem and demonstrate its effectiveness on massively parallel GPUs. We show that combining simple optimization techniques with many parallel seeds leads to solving difficult motion generation pro...
-
Allowing Safe Contact in Robotic Goal-Reaching: Planning and Tracking in Operational and Null Spaces
In recent years, impressive results have been achieved in robotic manipulation. While many efforts focus on generating collision-free reference signals, few allow safe contact between the robot bodies and the environment. However, in human's daily manipulation, contact between arms and obstacles is prevalent and even necessary. This paper investigates the benefit of allowing safe contact during ro...
-
Rearrangement-based nonprehensile manipulation still remains as a challenging problem due to the high-dimensional problem space and the complex physical uncertainties it entails. We formulate this class of problems as a coupled problem of local rearrangement and global action optimization by incorporating free-space transit motions between constrained rearranging actions. We propose a forest-based...
-
When a robot end-effector is attached to the arm via passive joints, undesirable end-effector sway will occur. In a forestry crane, such as the log-loading or harvesting machine, this sway is problematic as it hinders the efficiency and also can harm the machine and environment. Here, we tackle the sway problem of the forestry forwarder by proposing a methodology for generating anti-sway trajector...
-
Real-world manipulation problems in heavy clutter require robots to reason about potential contacts with objects in the environment. We focus on pick-and-place style tasks to retrieve a target object from a shelf where some ‘movable’ objects must be rearranged in order to solve the task. In particular, our motivation is to allow the robot to reason over and consider non-prehensile rearrangement ac...
-
Robots often have to perform manipulation tasks in close proximity to people (Fig 1). As such, it is desirable to use a robot arm that has limited joint torques so as to not injure the nearby person. Unfortunately, these limited torques then limit the payload capability of the arm. By using contact with the environment, robots can expand their reachable workspace that, otherwise, would be inaccess...
-
This paper proposes a novel real-time semantic segmentation network via frequency domain learning, called FDLNet, which revisits the segmentation task from two critical perspectives: spatial structure description and multilevel feature fusion. We first devise an image-size convolution (IS-Conv) as a global frequency-domain learning operator to capture long-range dependency in a single shot. To mod...
-
Semantic segmentation for robotic systems can enable a wide range of applications, from self-driving cars and augmented reality systems to domestic robots. We argue that a spherical representation is a natural one for egocentric pointclouds. Thus, in this work, we present a novel framework exploiting such a representation of LiDAR pointclouds for the task of semantic segmentation. Our approach is ...
-
We propose a novel scene-robot interaction graph (SRI-Graph) that exploits the known position of a mobile manipulator for robust and accurate scene understanding. Compared to the state-of-the-art scene graph approaches, the proposed SRI-Graph captures not only the relationships between the objects, but also the relationships between the robot manipulator and objects with which it interacts. To imp...
-
Numerous applications require robots to operate in environments shared with other agents, such as humans or other robots. However, such shared scenes are typically subject to different kinds of long-term semantic scene changes. The ability to model and predict such changes is thus crucial for robot autonomy. In this work, we formalize the task of semantic scene variability estimation and identify ...
-
Wearable devices have garnered widespread attention as a mobile solution, and various intelligent modules based on wearable devices are increasingly being integrated. Additionally, image captioning is an important task in computer vision that maps images to text. Existing image captioning achievements are based on high-quality visible images. However, higher target complexity and insufficient ligh...
-
We present an approach for estimating a mobile robot's pose w.r.t. the allocentric coordinates of a network of static cameras using multi-view RGB images. The images are processed online, locally on smart edge sensors by deep neural networks to detect the robot and estimate 2D keypoints defined at distinctive positions of the 3D robot model. Robot keypoint detections are synchronized and fused on ...
-
General scene understanding for robotics requires flexible semantic representation, so that novel objects and structures which may not have been known at training time can be identified, segmented and grouped. We present an algorithm which fuses general learned features from a standard pre-trained network into a highly efficient 3D geometric neural field representation during real-time SLAM. The f...
-
We suggest the first system that runs real-time semantic segmentation via deep learning on the weak microcomputer Raspberry Pi Zero v2 (whose price was $15) attached to a toy drone. In particular, since the Raspberry Pi weighs less than 16 grams, and its size is half of a credit card, we could easily attach it to the common commercial DJI Tello toy-drone (<$100, <90 grams, 98 $\times 92.5\times 41...
-
Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques. Building on the successes of recent Transformer-based methods for object detection and image segmentation, we propose the first Transformer-based approach for 3D semantic instance segmentation. We show that we can leverage generic T...
-
Changes in topological spatial relations of objects are often strong indicators for state transitions in the underlying processes they are involved in. While various aspects of semantic mapping have been extensively researched, the reasoning about the temporal development of spatial relations of instances is often neglected. This paper presents a concept to combine a semantic map with a stream pro...
-
Dynamic scene graphs generated from video clips could help enhance the semantic visual understanding in a wide range of challenging tasks such as environmental perception, autonomous navigation, and task planning of self-driving vehicles and mobile robots. In the process of temporal and spatial modeling during dynamic scene graph generation, it is particularly intractable to learn time-variant rel...
-
A fast and accurate panoptic segmentation system for LiDAR point clouds is crucial for autonomous driving vehicles to understand the surrounding objects and scenes. Existing approaches usually rely on proposals or clustering to segment foreground instances. As a result, they struggle to achieve real-time performance. In this paper, we propose a novel real-time end-to-end panoptic segmentation netw...
-
Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that can predict given noisy observations. We frame the prediction task as a denoising problem, where both observation and prediction are consider...
-
We present a learning algorithm, DifFAR, for human activity recognition in videos. Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras that contain a human actor along with background motion. Typically, the human actors occupy less than one-tenth of the spatial resolution. DifFAR simultaneously harnesses the benefits of frequency domain represen...
-
Our work examines the way in which large language models can be used for robotic planning and sampling in the context of automated photographic documentation. Specifically, we illustrate how to produce a photo-taking robot with an exceptional level of semantic awareness by leveraging recent advances in general purpose language (LM) and vision-language (VLM) models. Given a high-level description o...
-
Scene completion refers to obtaining dense scene representation from an incomplete perception of complex 3D scenes. This helps robots detect multi-scale obstacles and analyse object occlusions in scenarios such as autonomous driving. Recent advances show that implicit representation learning can be leveraged for continuous scene completion and achieved through physical constraints like Eikonal equ...
-
Modern autonomous systems often rely on LiDAR scanners, in particular for autonomous driving scenarios. In this context, reliable scene understanding is indispensable. Conventional learning-based methods generally try to achieve maximum performance for this task, while neglecting a proper estimation of the associated uncertainties. In this work, we introduce a novel approach for solving the task o...
-
Video frame interpolation (VFI) is a fundamental vision task that aims to synthesize several frames between two consecutive original video images. Most algorithms aim to accomplish VFI by using only keyframes, which is an ill-posed problem since the keyframes usually do not yield any accurate precision about the trajectories of the objects in the scene. On the other hand, event-based cameras provi...
-
The insufficient number of annotated thermal infrared (TIR) image datasets not only hinders TIR image-based deep learning networks to have comparable performances to that of RGB but it also limits the supervised learning of TIR image-based tasks with challenging labels. As a remedy, we propose a modified multidomain RGB to TIR image translation model focused on edge preservation to employ annotate...
-
Weakly supervised referring expression grounding aims to train a model without the manual labels between image regions and referring expressions during the training phase. Current predominant models often adopt deep structures to reconstruct the region-expression correspondence. A crucial deficiency of the existing approaches lies in that these models neglect to exploit potential valuable informat...
-
State recognition of objects and environment in robots has been conducted in various ways. In most cases, this is executed by processing point clouds, learning images with annotations, and using specialized sensors. In contrast, in this study, we propose a state recognition method that applies Visual Question Answering (VQA) in a Pre-Trained Vision-Language Model (PTVLM) trained from a large-scale...
-
Leveraging multi-modal fusion, especially between camera and LiDAR, has become essential for building accurate and robust 3D object detection systems for autonomous vehicles. Until recently, point decorating approaches, in which point clouds are augmented with camera features, have been the dominant approach in the field. However, these approaches fail to utilize the higher resolution images from ...
-
With the rapid development of autonomous driving, the need for auto-labeling reference systems is becoming increasingly urgent. 3D multiple object tracking (MOT) is one of the most critical components of the reference system. In this work, we reviewed and rethought the common failure sources and limitations of the SOTA 3D MOT methods. We propose a set of innovative 3D MOT post-processing modules a...
-
Neural Radiance Fields (NeRFs) have made great success in representing complex 3D scenes with high-resolution details and efficient memory. Nevertheless, current NeRF - based pose estimators have no initial pose prediction and are prone to local optima during optimization. In this paper, we present LATITUDE: Global Localization with Truncated Dynamic Low-pass Filter, which introduces a two-stage l...
-
LiDAR-based SLAM may easily fail in adverse weathers (e.g., rain, snow, smoke, fog), while mmWave Radar remains unaffected. However, current researches are primarily focused on 2D $(x,y)$ or 3D ($x, y$, doppler) Radar and 3D LiDAR, while limited work can be found for 4D Radar ($x, y, z$, doppler). As a new entrant to the market with unique characteristics, 4D Radar outputs 3D point cloud with adde...
-
Pairwise point cloud registration is a critical task for many applications, which heavily depends on finding correct correspondences from the two point clouds. However, the low overlap between input point clouds causes the registration to fail easily, leading to mistaken overlapping and mismatched correspondences, especially in scenes where non-overlapping regions contain similar structures. In th...
-
Segmenting unseen objects is a crucial ability for the robot since it may encounter new environments during the operation. Recently, a popular solution is leveraging RGB-D features of large-scale synthetic data and directly applying the model to unseen real-world scenarios. However, the domain shift caused by the sim2real gap is inevitable, posing a crucial challenge to the segmentation model. In ...
-
Perception is crucial for robots that act in real-world environments, as autonomous systems need to see and understand the world around them to act properly. Panoptic segmentation provides an interpretation of the scene by computing a pixelwise semantic label together with instance IDs. In this paper, we address panoptic segmentation using RGB-D data of indoor scenes. We propose a novel encoder-de...
-
Local-HDP (Local Hierarchical Dirichlet Process) is a hierarchical Bayesian method recently used for open-ended 3D object category recognition. It has been proven to be efficient in real-time robotic applications. However, the method is not robust to a high degree of occlusion. We address this limitation in two steps. First, we propose a novel semantic 3D object-parts segmentation method that has ...
-
Point cloud registration is a fundamental and challenging problem for autonomous robots interacting in unstructured environments for applications such as object pose estimation, simultaneous localization and mapping, robot-sensor calibration, and so on. In global correspondence-based point cloud registration, data association is a highly brittle task and commonly produces high amounts of outliers....
-
We study the problem of object reconstruction in a multi-agent collaboration scenario. Specifically, we focus on the reconstruction of specific goals through several cooperative agents equipped with vision sensors to achieve higher efficiency than single agents. Our main insight is that a complete 3D object can be split into several local 3D models and assigned to different agents. In addition, we...
-
Monocular depth estimation plays a critical role in various computer vision and robotics applications such as localization, mapping, and 3D object detection. Recently, learning-based algorithms achieve huge success in depth estimation by training models with a large amount of data in a supervised manner. However, it is challenging to acquire dense ground truth depth labels for supervised training,...
-
Generative adversarial imitation learning (GAIL) — a general model-free imitation learning method, allows robots to directly learn policies from expert trajectories in large environments. However, GAIL shares the limitation of other imitation learning methods that they can seldom surpass the performance of demonstrations. In this paper, to address the limit of GAIL, we propose GAN-based interactiv...
-
Long-term non-prehensile planar manipulation is a challenging task for robot planning and feedback control. It is characterized by underactuation, hybrid control, and contact uncertainty. One main difficulty is to determine both the continuous and discrete contact configurations, e.g., contact points and modes, which requires joint logical and geometrical reasoning. To tackle this issue, we propos...
-
Observing a human demonstrator manipulate objects provides a rich, scalable and inexpensive source of data for learning robotic policies. However, transferring skills from human videos to a robotic manipulator poses several challenges, not least a difference in action and observation spaces. In this work, we use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-...
-
This paper proposes a probabilistic motion prediction method for long motions. The motion is predicted so that it accomplishes a task from the initial state observed in the given image. While our method evaluates the task achievability by the Energy-Based Model (EBM), previous EBMs are not designed for evaluating the consistency between different domains (i.e., image and motion in our method). Our...
-
Reinforcement learning systems have the potential to enable continuous improvement in unstructured environments, leveraging data collected autonomously. However, in practice these systems require significant amounts of instrumentation or human intervention to learn in the real world. In this work, we propose a system for reinforcement learning that leverages multi-task reinforcement learning boots...
-
The use of human demonstrations in reinforcement learning has proven to significantly improve agent performance. However, any requirement for a human to manually ‘teach’ the model is somewhat antithetical to the goals of reinforcement learning. This paper attempts to minimize human involvement in the learning process while retaining the performance advantages by using a single human example collec...
-
Dynamic Movement Primitives (DMPs) offer great versatility for encoding, generating and adapting complex end-effector trajectories. DMPs are also very well suited to learning manipulation skills from human demonstration. However, the reactive nature of DMPs restricts their applicability for tool use and object manipulation tasks involving non-holonomic constraints, such as scalpel cutting or cathe...
-
This paper presents a novel device that can be used to perform kinesthetic corrective feedback for robotic systems. KRIS (Kinesthetic Robotic Interaction System) is a device that can be mounted on the end-effector of an articulated robot. From here it can be manipulated by a human to give corrective feedback to the robot system during execution and in an intuitive way. The device can provide feedb...
-
Learning from demonstration (LfD) has the potential to greatly increase the applicability of robotic manipulators in modern industrial applications. Recent progress in LfD methods have put more emphasis in learning robustness than in guiding the demonstration itself in order to improve robustness. The latter is particularly important to consider when the target system reproducing the motion is str...
-
In this paper, we analyze the behavior of existing techniques and design new solutions for the problem of one-shot visual imitation. In this setting, an agent must solve a novel instance of a novel task given just a single visual demonstration. Our analysis reveals that current methods fall short because of three errors: the DAgger problem arising from purely offline training, last centimeter erro...
-
Automatic design is a promising approach to generating control software for robot swarms. So far, automatic design has relied on mission-specific objective functions to specify the desired collective behavior. In this paper, we explore the possibility to specify the desired collective behavior via demonstrations. We develop Demo-Cho, an automatic design method that combines inverse reinforcement l...
-
Achieving successful robotic manipulation is an essential step towards robots being widely used in industry and home settings. Recently, many learning-based methods have been proposed to tackle this challenge, with imitation learning showing great promise. However, imperfect demonstrations and a lack of feedback from teleoperation systems may lead to poor or even unsafe results. In this work we ex...
-
Quadrupedal robots resemble the physical ability of legged animals to walk through unstructured terrains. However, designing a controller for quadrupedal robots poses a significant challenge due to their functional complexity and requires adaptation to various terrains. Recently, deep reinforcement learning, inspired by how legged animals learn to walk from their experiences, has been utilized to ...
-
Robotic locomotion is often approached with the goal of maximizing robustness and reactivity by increasing motion control frequency. We challenge this intuitive notion by demonstrating robust and dynamic locomotion with a learned motion controller executing at as low as 8 Hz on a real ANYmal C quadruped. The robot is able to robustly and repeatably achieve a high heading velocity of 1.5 ms-1, trav...
-
Reinforcement Learning (RL) has seen many recent successes for quadruped robot control. The imitation of reference motions provides a simple and powerful prior for guiding solutions towards desired solutions without the need for meticulous reward design. While much work uses motion capture data or hand-crafted trajectories as the reference motion, relatively little work has explored the use of ref...
-
We tackle the problem of perceptive locomotion in dynamic environments. In this problem, a quadrupedal robot must exhibit robust and agile walking behaviors in response to environmental clutter and moving obstacles. We present a hierarchical learning framework, named PRELUDE, which decomposes the problem of perceptive locomotion into high-level decision-making to predict navigation commands and lo...
-
Locomotion has seen dramatic progress for walking or running across challenging terrains. However, robotic quadrupeds are still far behind their biological counterparts, such as dogs, which display a variety of agile skills and can use the legs beyond locomotion to perform several basic manipulation tasks like interacting with objects and climbing. In this paper, we take a step towards bridging th...
-
This work presents a simple linear policy for direct force control for quadrupedal robot locomotion. The motivation is that force control is essential for highly dynamic and agile motions. We learn a linear policy to generate end-foot trajectory parameters and a centroidal wrench, which is then distributed among the legs based on the foot contact information using a quadratic program (QP) to get t...
-
Reinforcement learning (RL) has emerged as a powerful approach for locomotion control of highly articulated robotic systems. However, one major challenge is the tedious process of tuning the reward function to achieve the desired motion style. To address this issue, imitation learning approaches such as adversarial motion priors have been proposed, which encourage a pre-defined motion style. In th...
-
This paper introduces intelligent central pattern generators (iCPGs) that can plan personalized walking trajectories for lower-limb exoskeletons. This can make walking more comfortable for the users by resolving one of the significant shortcomings of most commercially available exoskeletons, which is the use of pre-defined fixed trajectories for all users. The proposed method combines reinforcemen...
-
This paper proposes the transition-net, a robust transition strategy that expands the versatility of robot locomotion in the real-world setting. To this end, we start by distributing the complexity of different gaits into dedicated locomotion policies applicable to real-world robots. Next, we expand the versatility of the robot by unifying the policies with robust transitions into a single coheren...
-
Deep reinforcement learning has recently emerged as an appealing alternative for legged locomotion over multiple terrains by training a policy in physical simulation and then transferring it to the real world (i.e., sim-to-real transfer). Despite considerable progress, the capacity and scalability of traditional neural networks are still limited, which may hinder their applications in more complex...
-
This work developed a kernel-based residual learning framework for quadrupedal robotic locomotion. Ini-tially, a kernel neural network is trained with data collected from an MPC controller. Alongside a frozen kernel network, a residual controller network is trained using reinforcement learning to acquire generalized locomotion skills and robust-ness against external perturbations. The proposed fra...
-
DribbleBot (Dexterous Ball Manipulation with a Legged Robot) is a legged robotic system that can dribble a soccer ball under the same real-world conditions as humans. We identify key challenges of in-the-wild soccer ball manipulation, including variable ball motion dynamics and perception using body-mounted cameras. To overcome these challenges, we propose a domain and task specification for learn...
-
In recent years, learning-based feature detection and matching have outperformed manually-designed methods in in-air cases. However, it is challenging to learn the features in the underwater scenario due to the absence of annotated underwater datasets. This paper proposes a cross-modal knowl-edge distillation framework for training an underwater feature detection and matching network (UFEN). In pa...
-
Oysters play a pivotal role in the bay living ecosystem and are considered the living filters for the ocean. In recent years, oyster reefs have undergone major devastation caused by commercial over-harvesting, requiring preservation to maintain ecological balance. The foundation of this preservation is to estimate the oyster density which requires accurate oyster detection. However, systems for ac...
-
Underwater image enhancement (UIE) is vital for high-level vision-related underwater tasks. Although learning-based UIE methods have made remarkable achievements in recent years, it's still challenging for them to consistently deal with various underwater conditions, which could be caused by: 1) the use of the simplified atmospheric image formation model in UIE may result in severe errors; 2) the ...
-
This paper addresses real-time dense 3D reconstruction for a resource-constrained Autonomous Underwater Vehicle (AUV). Underwater vision-guided operations are among the most challenging as they combine 3D motion in the presence of external forces, limited visibility, and absence of global positioning. Obstacle avoidance and effective path planning require online dense reconstructions of the enviro...
-
This paper addresses the robustness problem of visual-inertial state estimation for underwater operations. Underwater robots operating in a challenging environment are required to know their pose at all times. All vision-based localization schemes are prone to failure due to poor visibility conditions, color loss, and lack of features. The proposed approach utilizes a model of the robot's kinemati...
-
Confined and cluttered aquatic environments present a number of significant challenges with respect to inspection by robotic platforms, including localisation and communications. Some of these can be mitigated by using collaborative heterogeneous multi-robot teams. An important element of such a system is collaborative control. This paper addresses this challenge by presenting an Image-Based Visua...
-
We present the first free-floating autonomous underwater construction system capable of using active bal-lasting to transport cement building blocks efficiently. It is the first free-floating autonomous construction robot to use a paired set of resources: compressed air for buoyancy and a battery for thrusters. In construction trials, our system built structures of up to 12 components and weighing...
-
Mobile robots are well suited for environmental surveys because they can travel to any area of interest and react to observations without the need for pre-existing infrastructure or significant setup time. However, vehicle motion constraints limit where and when measurements occur. This is challenging for a single vehicle observing a time-varying phenomenon, such as coastal waves, but the ability ...
-
In this work a control scheme is proposed to enforce dynamic obstacle avoidance constraints to the full body of actively compliant robots. We argue that both compliance and accuracy are necessary to build safe collaborative robotic systems; obstacle avoidance is usually not enough, due to the reliance on perception systems which exhibit delays and errors. Our scheme is able to successfully avoid o...
-
Admittance and impedance controllers are often purely feedforward, using measured external force or motion, respectively, to generate a reference for an inner-loop controller. In this case, the range of dynamics which can be rendered is limited by the inner-loop, which causes, e.g. contact stability issues for low admittance industrial robots in stiff contact. When both position and force are meas...
-
This paper proposes a control structure for accurate tracking and compliant behavior of industrial manipulators without additional sensors. To achieve control objectives, friction, one of the biggest causes of performance degradation, should be compensated. For tracking performance, the estimated friction cancels most friction effects as a feed-forward, and the modified robust control structure el...
-
As robots become more and more intelligent, the complexity of the algorithms behind them is increasing. Since these algorithms require high computation power from the onboard robot controller, the weight of the robot and energy consumption increases. A promising solution to tackle this issue is to relocate the expensive computation to the cloud. In this pioneering work, the possibility of relocati...
-
This article presents a novel framework called Simultaneous Registration and Machining (SRAM), a generalized method to improve workpiece registration using real-time acquired data in robotic contouring applications. The method allows for online corrections to the toolpath, while a live covariance estimate is simultaneously leveraged to adaptively tune the force controller aggressively when uncerta...
-
This paper presents a novel robotic leg design and an associated control approach, which aims at providing an extension to the classical series elastic actuation concept. We propose to directly integrate the series compliance into the structure of the robotic leg itself, as opposed to co-locating spring and motor as done in traditional series elastic actuators. Our approach will eliminate mechanic...
-
Physical Human-Robot Interaction(pHRI) re-quires taking safety into account from the design board to the collaborative operation of any robot. For collaborative robotic environments, where human and machine are sharing space and interacting physically, the analysis and quantification of impacts becomes very relevant and necessary. Furthermore, analyses of this kind are a valuable source of informa...
-
The robotic hand is still no match for the human hand on many skills. Manipulation of hand tools, which usually requires sophisticated finger movements and fine controls, not only poses a clear technical challenge but also carries a great potential for enabling the robot to assist humans in a wide range of tasks accomplishable using tools. This paper takes a first step to investigate how a robotic...
-
Autonomous racing is a research field gaining large popularity, as it pushes autonomous driving algorithms to their limits and serves as a catalyst for general autonomous driving. For scaled autonomous racing platforms, the computational constraint and complexity often limit the use of Model Predictive Control (MPC). As a consequence, geometric controllers are the most frequently deployed controll...
-
This paper proposed an adaptive robust sliding mode control (SMC) with a nonlinear sliding perturbation observer (SPO) for robot manipulators. SPO estimates the perturbation (nonlinearities, uncertainties, and disturbances) with minimal system information and enhances the controller performance. The estimation is mainly dependent on the selection of SMCSPO gain, and if not tuned well, it might res...
-
Performing precise, repetitive motions is essential in many robotic and automation systems. Iterative learning control (ILC) allows determining the necessary control command by using a very rough system model to speed up the process. Functional iterative learning control is a novel technique that promises to solve several limitations of classic ILC. It operates by merging the input space into a la...
-
This paper presents a robust output feedback controller for a n-link serial robotic manipulator with unknown dynamics and external disturbances. First, the robotic manipulator's model is formulated with unknown dynamics, including joint coupling, nonlinearities, and external disturbances. Second, an output feedback controller is proposed by combining a backstepping controller and an extended high-...
-
This paper presents a collaborative control method based on payload-leading for the multi-quadrotor transportation systems. The goal is to keep the relative distance between the quadrotors and the payload as constant as possible during the transportation, so as to ensure the stable attitude of the payload. The control mechanism consists of a guidance control law that generates the common desired v...
-
The design of a control architecture for providing the desired motion along with the realization of the joint limitation of a robotic system is still an open challenge in control and robotics. This paper presents a torque control architecture for fully actuated manipulators for tracking the desired time-varying trajectory while ensuring the joints position and velocity limits. The presented archit...
-
The dynamics of all real quadrotors inevitably differ even if they are the same product. In particular, the dynamics can change significantly during the flight due to additional device attachments or overheating motors. In this study, we focus on training a low-level controller, which operates in response to dynamics-changes without prior knowledge or fine-tuning of the parameters, using reinforce...
-
Forest canopies are vital ecosystems, but remain understudied due to difficult access. Forests could be monitored with a network of biodegradable sensors that break down into environmentally friendly substances at the end of their life. As a first step in this direction, this paper details the development of a biodegradable origami gripper to attach conventional sensors to branches, deployable wit...
-
PARSEC (Payload Anchoring Robotic System for the Exploration of Cliffs) is an autonomy-equipped aerial manipulator that can deploy self-anchoring payloads on rocky vertical surfaces. It consists of a hexacopter and a two Degrees of Freedom (2 DoF) mass balancing manipulator, which can autonomously deploy a self-anchoring payload from its custom end-effector. The payload anchors itself via an actua...
-
We present a novel controller for fixed-wing UAVs that enables autonomous soaring in an orographic wind field, extending flight endurance. Our method identifies soaring regions and addresses position control challenges by introducing a target gradient line (TGL) on which the UAV achieves an equilibrium soaring position, where sink rate and updraft are balanced. Experimental testing validates the c...
-
This study aims to design a motion/force controller for an aerial manipulator which guarantees the tracking of time-varying motion/force trajectories as well as the stability during the transition between free and contact motions. To this end, we model the force exerted on the end-effector as the Kelvin-Voigt linear model and estimate its parameters by recursive least-squares estimator. Then, the ...
-
This work presents the mechanical design and control of a novel small-size and lightweight Micro Aerial Vehicle (MAV) for aerial manipulation. To our knowledge, with a total take-off mass of only 2.0 kg, the proposed system is the most lightweight Aerial Manipulator (AM) that has 8-DOF independently controllable: 5 for the aerial platform and 3 for the articulated arm. We designed the robot to be ...
-
Aerial manipulation describes a process that includes physical interaction between an unmanned aircraft system (UAS) and its environment. We aim to apply aerial manipulation to sample leaves and small branches from rain forest trees. Current approaches to aerial manipulation involve extended periods of UAS-environment interaction, during which forces and moments can lead to a loss in attitude or p...
-
During operation, aerial manipulation systems are affected by various disturbances. Among them is a gravitational torque caused by the weight of the robotic arm. Common propeller-based actuation is ineffective against such disturbances because of possible overheating and high power consumption. To overcome this issue, in this paper we propose a winch-based actuation for the crane-stationed cable-s...
-
The use of aerial vehicles for exploration and data collection has the potential to significantly aid environmental monitoring in environments which are dangerous and hard to navigate. However, within these environments navigation can often be restricted by overhangs which are challenging to navigate, particularly so with the high payloads required for environmental monitoring. We propose utilizin...
-
Here we describe how a 10-gram Flapping-Wing Micro Aerial Vehicle (FWMAV) was able to perform an automatic trajectory tracking task based on a vector field method. In this study, the desired heading was provided by a vector field which was computed depending on the desired trajectory. The FWMAV's heading was changed by a rear steering mechanism. This rear mechanism simultaneously (i) tenses one wi...
-
This study presents a globally defined dynamics for a conventional multirotor equipped with a single $n\mathbf{-DOF}$ manipulator using modified Lagrangian dynamics. This enables the reformulation of entire dynamics directly on $\text{SO}(3)$ without exploiting any local coordinates, and thus problems such as the singularity of Euler angles can be avoided. Since skew-symmetric property of Coriolis...
-
Unmanned aerial vehicles (UAVs) are finding use in applications that place increasing emphasis on robustness to external disturbances including extreme wind. However, traditional multirotor UAV platforms do not directly sense wind; conventional flow sensors are too slow, insensitive, or bulky for widespread integration on UAVs. Instead, drones typically observe the effects of wind indirectly throu...
-
Battery endurance represents a key challenge for long-term autonomy and long-range operations, especially in the case of aerial robots. In this paper, we propose AutoCharge, an autonomous charging solution for quadrotors that combines a portable ground station with a flexible, lightweight charging tether and is capable of universal, highly efficient, and robust charging. We design and manufacture ...
-
Untethered magnetic microrobots with control-lable locomotion property and multiple functions have attracted lots of attention in recent years. Owing to the small scale, micro-robots with automatic navigation possess a promising perspec-tive for biomedical applications including precise delivery and targeted therapy in confined and narrow space, especially for in-vivo scenario. However, the practi...
-
Due to the limited cargo/functional element loading and other capabilities of individual microrobots, assembling them for locomotion and disassembling them as arriving at the target is more effective. An approach called rendezvous and docking is proposed in this paper to control the assembly and disassembly of helical microrobots actuated by a uniform rotating magnetic field. Docking is realized a...
-
Magnetic resonance imaging (MRI)-guided robotic systems offer great potential for new minimally invasive medical tools, including MRI-powered miniature robots. By re-purposing the imaging hardware of an MRI scanner, the magnetic miniature robot could be navigated into the remote part of the patient's body without needing tethered endoscopic tools. However, state-of-art MRI-powered magnetic miniatu...
-
Vibrational piezoelectric energy harvesters (vPEH) are of great interest in several fields such as autonomous sensors and wireless sensor networks, bird tracking devices, or autonomous miniaturized robotic systems. They capture energy from mechanical vibrations available in the ambient environment and convert it into electrical one to power those systems. Basically, a vPEH is composed of three mai...
-
To date, all previous research in the wireless magnetic actuation of untethered helical devices has achieved motion stability using feedback control in vitro. However, feedback control systems are likely to be affected by the increased sensory uncertainty during in vivo trials. In this study we investigate the input-output boundedness of an interconnection between a helical device and a single rot...
-
Field-control-based nanorobotic manipulation of ions at the single atomic level is an enabling technique for such applications as in-situ prototyping and characterization for fundamental research and rapid product development of nanoscale and quantum devices such as sensors, batteries, neuromorphic devices, and neuro/brain interfaces. Taking the motion of quantum dots (QDs) manipulated by an elect...
-
Creating microscale actuated mechanisms in 3D space is extremely challenging due to limitations in microfabrication processes. In this work, we present a 3D-printed adaptive microgripper that is driven by thin-film NiTi microactuators with 3D-printed linkage mechanisms. The microgripper's fingers are passively adaptive so that the microgripper can provide conformal gripping on 3D objects. The micr...
-
Cell rotation is widely used to adjust cell posture in sub-cellular micromanipulations. The trajectory planning of the injection micropipette is needed, so that the cells can be rotated with the minimum deformation to reduce cell damage and keep cell viability. Due to the uncertainty of cell properties and manipulation environment, it is difficult to identify the parameters of the mechanical model...
-
Noncontact particle manipulation (NPM) shows great application potential than its conventional counterpart particularly in terms of non-invasiveness, and thus has significantly extended robotic manipulation capacity into bio- medical engineering, material science, etc. As NPM by means of electric, magnetic, and optical field has successfully demonstrated powerful strength in both academia and indu...
-
Real-time Acoustic Holography with Iterative Unsupervised Learning for Acoustic Robotic Manipulation
Phase-only acoustic holography is a fundamental and promising technique for contactless robotic manipulation. Through independently controlling phase-only hologram (POH) of phase array of transducers (PAT) and simultaneously driving each channel by sophisticated circuits, a certain acoustic field is dynamically generated in working medium (e.g., air, water or biological tissues) at certain moment....
-
Heterogeneous teams of multiple mobile robots will be important for future scientific explorations of extraterrestrial surfaces or hazardous areas. Mission operation in such harsh, unknown environments poses diverse challenges. Robots need to cooperate autonomously due to the large network latency to the ground station while operators need to adapt the ongoing mission flexibly based on new discove...
-
This paper investigates the stochastic moving target encirclement problem in a realistic setting. In contrast to typical assumptions in related works, the target in our work is non-cooperative and capable of escaping the circle containment by boosting its speed to maximum for a short duration. In extreme conditions, where GPS signals are not available, weight restrictions are present, and ground g...
-
Collective decision-making is an essential capability of large-scale multi-robot systems to establish autonomy on the swarm level. A large portion of literature on collective decision-making in swarm robotics focuses on discrete decisions selecting from a limited number of options. Here we assign a decentralized robot system with the task of exploring an unbounded environment, finding consensus on...
-
Mobility, power, and price points often dictate that robots do not have sufficient computing power on board to run contemporary robot algorithms at desired rates. Cloud computing providers such as AWS, GCP, and Azure offer immense computing power and increasingly low latency on demand, but tapping into that power from a robot is non-trivial. We present FogROS2, an open-source platform to facilitat...
-
Autocurricular training is an important sub-area of multi-agent reinforcement learning (MARL) that allows multiple agents to learn emergent skills in an unsupervised co-evolving scheme. The robotics community has experimented auto-curricular training with physically grounded problems, such as robust control and interactive manipulation tasks. However, the asymmetric nature of these tasks makes the...
-
Legible motion is intent-expressive, which when employed during social robot navigation, allows others to quickly infer the intended avoidance strategy. Predictable motion matches an observer's expectation which, during navigation, allows others to confidently carryout the interaction. In this work, we present a navigation framework capable of reasoning on its legibility and predictability with re...
-
Action advising is a knowledge transfer technique for reinforcement learning based on the teacher-student paradigm. An expert teacher provides advice to a student during training in order to improve the student's sample efficiency and policy performance. Such advice is commonly given in the form of state-action pairs. However, it makes it difficult for the student to reason with and apply to novel...
-
The topology of a robotic swarm affects the convergence speed of consensus and the mobility of the robots. In this paper, we prove the existence of a complete set of local topology manipulation operations that allow the transformation of a swarm topology. The set is complete in the sense that any other possible set of manipulation operations can be performed by a sequence of operations from our se...
-
We consider the problem of decentralized multiagent environmental learning through maximizing the joint information gain among a team of agents. Inspired by subsea applications where bandwidth is severely limited, we explicitly consider the challenge of restricted communication between agents. The environment is modeled as a Gaussian process (GP), and the global information gain maximization probl...
-
In this paper, we propose a distributed algorithm to control a team of cooperating robots aiming to protect a target from a set of intruders. Specifically, we model the strategy of the defending team by means of an online optimization problem inspired by the emerging distributed aggregative framework. In particular, each defending robot determines its own position depending on (i) the relative pos...
-
We introduce and investigate the recharging rendezvous problem for a collaborative team of Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs), in which UAVs with limited battery capacity and UGVS persistently monitor an area. The UGVs also act as mobile recharging stations for the UAVs. In contrast to prior work on such problems, we consider the challenge of dealing with stochasti...
-
State-of-the-art decentralized collaborative Simultaneous Localization And Mapping (SLAM) systems crucially lack the ability to effectively use well-mapped areas generated by other agents in the team for relocalization. This often leads to map redundancy between agents, inefficient communication, and the need for costly re-mapping of areas previously mapped by other agents. In this work, we propos...
-
Reasoning with occluded traffic agents is a significant open challenge for planning for autonomous vehicles. Recent deep learning models have shown impressive results for predicting occluded agents based on the behaviour of nearby visible agents; however, as we show in experiments, these models are difficult to integrate into downstream planning. To this end, we propose Bi-Ievel Variational Occlus...
-
Modern robots require accurate forecasts to make optimal decisions in the real world. For example, self-driving cars need an accurate forecast of other agents' future actions to plan safe trajectories. Current methods rely heavily on historical time series to accurately predict the future. However, relying entirely on the observed history is problematic since it could be corrupted by noise, have o...
-
Communication technologies enable coordination among connected and autonomous vehicles (CAVs). However, it remains unclear how to utilize shared information to improve the safety and efficiency of the CAV system in dynamic and complicated driving scenarios. In this work, we propose a framework of constrained multi-agent reinforcement learning (MARL) with a parallel Safety Shield for CAVs in challe...
-
Recently, Vehicle-to-Everything (V2X) cooperative perception has attracted increasing attention. Infrastructure sensors play a critical role in this research field; however, how to find the optimal placement of infrastructure sensors is rarely studied. In this paper, we investigate the problem of infrastructure sensor placement and propose a pipeline that can efficiently and effectively find optim...
-
Sharing information between connected and autonomous vehicles (CAVs) fundamentally improves the performance of collaborative object detection for self-driving. However, CAVs still have uncertainties on object detection due to practical challenges, which will affect the later modules in self-driving such as planning and control. Hence, uncertainty quantification is crucial for safety-critical syste...
-
Compared to 2D lanes, real 3D lane data is difficult to collect accurately. In this paper, we propose a novel method for training 3D lanes with only 2D lane labels, called weakly supervised 3D lane detection WS-3D-Lane. By assumptions of constant lane width and equal height on adjacent lanes, we indirectly supervise 3D lane heights in the training. To overcome the problem of the dynamic change of ...
-
Current on-board chips usually have different computing power, which means multiple training processes are needed for adapting the same learning-based algorithm to different chips, costing huge computing resources. The situation becomes even worse for 3D perception methods with large models. Previous vision-centric 3D perception approaches are trained with regular grid-represented feature maps of ...
-
Manually specifying features that capture the diversity in traffic environments is impractical. Consequently, learning-based agents cannot realize their full potential as neural motion planners for autonomous vehicles. Instead, this work proposes to learn which features are task-relevant. Given its immediate relevance to motion planning, our proposed architecture encodes the probabilistic occupanc...
-
Lane detection is one of the fundamental modules in self-driving. In this paper we employ a transformer-only method for lane detection, thus it could benefit from the blooming development of fully vision transformer and achieve the state-of-the-art (SOTA) performance on both CULane and TuSimple benchmarks, by fine-tuning the weight fully pre-trained on large datasets. More importantly, this paper ...
-
Prior work has looked at applying reinforcement learning (RL) approaches to autonomous driving scenarios, but the safety of the algorithm is often compromised due to instability or the presence of ill-defined reward functions. With the use of control barrier functions embedded into the RL policy, we arrive at safe policies to optimize the performance of the autonomous driving vehicle through the a...
-
In this work, we propose a self-improving artificial intelligence system to enhance the safety performance of reinforcement learning (RL)-based autonomous driving (AD) agents using black-box verification methods. RL algorithms have become popular in AD applications in recent years. However, the performance of existing RL algorithms heavily depends on the diversity of training scenarios. A lack of ...
-
The performance of road defect segmentation (a.k.a. pixel-level road defect detection) has been improved alongside with remarkable achievement of deep learning. Those improvements need a large-scale and well-constructed dataset. However, road surface materials or designs vary from country to country, and the patterns of defects are hard to pre-define. In this paper, we propose a novel multi-source...
-
Object-goal navigation (Object-nav) entails searching, recognizing and navigating to a target object. Object-nav has been extensively studied by the Embodied-AI community, but most solutions are often restricted to considering static objects (e.g., television, fridge, etc.), We propose a modular framework for object-nav that is able to efficiently search indoor environments for not just static obj...
Introduction
Conference ICRA2023 accepted paper complete List. Top ranking conferences for AI and Robotics communities. Total Accepted Paper Count 995
