Rollins School of Public Health at Emory University
Instructor: Dr. Anthony G. Francis, Jr.
Lecture 13: Robotics and Vision
Robotics is the development of general-purpose machines. Unlike
industrial robotics, which focuses on manipulator design and
control theory, AI robotics focuses on sensing, thinking and
acting in mobile autonomous systems. Robotic architectures
specify how the modules that perform these tasks are organized;
architectural paradigms have evolved from deliberative through
reactive approaches into a hybrid, multilayered approach.
Vision is one of the most important robotic senses, involving image
processing and scene modeling. Robotic thinking originally relied on
deliberative means-ends analysis but now also incorporates techniques
from operations research. Robotic action has moved from planned
sequences of actions toward reactive control. Modern autonomous
robots are used for everything from scientific exploration to
cognitive experimentation to pure entertainment.
Outline
- What is robotics?
- Robotic Architecture
- Sensing: Vision
- Thinking: Planning
- Acting: Reactive
- Learning: Decision Theory
- State of the Art: Cog, Asimo, Sojourner
Readings:
- Artificial Intelligence: Chapters 6 and 10 (21, 22 optional)
- Machines Who Think: Chapters 10 and 11
Robotics
What is robotics?
- The design of general purpose machines
- Multifunction
- Programmable
- Often perform human tasks in a human-like way
- Two major branches
- Industrial robotics for manufacturing
- Autonomous robotics for space exploration
Types of Robots
- Industrial Robotics
- Programmable Arms
- Mobile Carts
- Teleoperated Systems
- Autonomous Robotics
- Research Platforms
- Planetary Rovers
- Unmanned Autonomous Vehicles (UAVs)
- Spinoffs
- Softbots: web agents, chatterbots
- Immobots: building HVAC systems, HAL
- Graphics: orcs, spectators
- Games: NPCs, opponents
Robotic Architectures
Information Processing in Robotics
- Sense - Extract information from the world
- Think - Decide what to do in the world
- Act - Make changes to the world
- Learn - Improve based on experience
Robotic Architectures
- The Hierarchical Paradigm
- Examples: Shakey
- Stages in the paradigm:
- Sense the world
- Plan your actions
- Act on your plan
- Relies on a world model representation
- Problems
- Planning is too slow for dynamic worlds
- World model is hard to keep accurate and up to date
- Major Contributions
- STRIPS operators
- World modeling
- The Reactive Paradigm
- Examples: Grey Walter's tortoises, Braitenberg vehicles, Rodney Brooks' work
- Stages in the paradigm:
- Act based on your sensations
- No explicit representation of the world!
- Problems
- No foresight - easy to get trapped
- Hard to program
- Major Contributions
- Schema-based reactive control
- Subsumption architectures
- The Hybrid Paradigm
- Examples: Most modern robots
- Stages in the paradigm:
- Active behaviors react to sensations
- Dynamic controller configures behaviors based on plan
- Planner updates plan based on changes in world
- Combines best of fast reaction and deliberative foresight
- Problems
- No universal consensus on the "right" architecture
- Learning is still being integrated into the paradigm
- Major Contributions
- Reactive Action Packages (RAPs)
- Three Level Architectures (TLA)
- The Horizon: Learning and Modeling
Sensing
Issues
- The Traditional Senses
- Vision - one of the most important robot senses
- Hearing - also important
- Touch - currently very simple
- Smell - not integrated into robotics
- Taste - little research
- Other Senses used in Robotics
- Proprioception - position of body parts
- Kinesthetics - dead reckoning based on movement
- Location senses - e.g., global positioning system (GPS)
- Ranging senses - e.g., sonar, lidar, occasionally radar
- Challenges of Sensation
- Dataflow: huge amounts of input data
- Processing: complicated algorithms
- Noise: grainy, low resolution cameras
- Errors: sonar holes, lidar spikes
- Ghosts: reflections (sonar or otherwise)
- Reality: shadows, textures, fog
Case Study: Vision
- Stages in Vision
- Image Processing
- Digitization
- Feature Extraction
- Scene Analysis
- Object Recognition
- Motion Detection
- Marr's Model of Vision
- 2D Image - the image as intensity values
- 2½D Sketch - extract the edges and regions
- 3D Model - geometric model of objects in scene
- Extracting the 2½D Sketch
- Eliminating Noise
- Images may contain lots of irrelevant details
- "Smoothing" - averaging over pixes - makes algorithms more robust
- Finding the edges
- Edges:
- Capture strong transitions in color, intensity or texture
- Do not exist in nature - must be extracted from the image
- First derivative of an image shows the peaks where things change fast
- Second derivative shows the zero crossings that represent edges
- Convolution: Implementing smoothing and edge-finding
- Apply a pixel-combining operator over a whole image
- Convolution is extremely processing intensive!
- Neuromorphic chips under development promise improvements
- Common Image Operators
- Identity - I(x,y) - the original image
- Gaussian - G(x,y) - the "blur" operator
G(x,y) = (1/(2πσ²))*e^(−(x²+y²)/(2σ²))
- Laplacian - the "edge" operator
L(x,y) = ∂²I/∂x² + ∂²I/∂y²
- Performs the second derivative in all directions
- Marks edges as zero crossings in the result
- Still sensitive to noise
- Sombrero - effective edge+blur operation
S(x,y) = ∂²G(x,y)/∂x² + ∂²G(x,y)/∂y²
- Zero-crossing - find the edges in an image
MH(x,y) = zero crossings of (S∗I)(x,y)
- AKA "Marr-Hildreth Operator"
- Combines smoothing, edge enhancement, and edge extraction
- Similar to processing that goes on in retinas!
- Thresholding: Accept values only over a certain cutoff
- Other, more sophisticated operators exist; the basic ones are sketched in code below
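- A minimal NumPy sketch of these operators (assuming a grayscale image stored as a 2-D float array; the function names and kernel sizes are illustrative, not a standard library API):

import numpy as np

def convolve2d(image, kernel):
    # Apply a pixel-combining operator over the whole image (naive convolution;
    # for symmetric kernels like these, correlation and convolution coincide).
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(image, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * kernel)
    return out

def gaussian_kernel(size=5, sigma=1.0):
    # G(x,y) = (1/(2πσ²))·exp(−(x²+y²)/(2σ²)), normalized so the weights sum to 1
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

# Discrete Laplacian: second derivative in all directions
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def marr_hildreth(image, sigma=1.0):
    # Sombrero effect: smooth first, then take the second derivative,
    # then mark sign changes (zero crossings) between neighbors as edges.
    smoothed = convolve2d(image, gaussian_kernel(sigma=sigma))
    second = convolve2d(smoothed, laplacian)
    sign = np.sign(second)
    edges = np.zeros(image.shape, dtype=bool)
    edges[:-1, :] |= sign[:-1, :] != sign[1:, :]
    edges[:, :-1] |= sign[:, :-1] != sign[:, 1:]
    return edges

# Usage: edges = marr_hildreth(np.random.rand(64, 64))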
- Detecting Nonlocal Features
- Region Detection: combine pixels that are similar into areas
- Line Detection: aggregate zero crossings into lines
- Texture Detection: look for repeated patterns
- Extracting the 3D Model
- Labeling Intersections to Model Surfaces
- Building World Models out of Generalized Cylinders
- The Object Library
- More Advanced Vision Models
- Feedback between stages improves performance
- Stereo vision combines information from multiple sources
- "Cognitive" and "4D" vision models exploit optical flow information
- Dynamic gaze with a high-resolution fovea can resolve ambiguities
Thinking
Issues
- Dynamic worlds - cannot sit and think forever
- Unreliable inputs - cannot trust your sensors
- Unreliable outputs - cannot trust your actions
- Unclear which actions are even appropriate
Case Study: Planning
- Early Planning: The General Problem Solver
- Planning Method: means-ends analysis
- Knowledge: a difference table that pairs problems with the operations that resolve them
- Type of difference resolved
- Type of operation
- Precondition of operation
- Add list for conditions added
- Delete list for differences removed
- Example (sketched in code below):
Difference | Operation | Precondition | Delete | Add |
< 100 yards | Walk | At Start, Not Raining | At Start | At Destination |
< 100 miles | Drive | At Start, Have Car | At Start | At Destination |
< 100 miles | Taxi | At Start, Have Money | At Start, Have Money | At Destination |
> 100 miles | Fly | At Start, Have Ticket | At Start, Have Car, Have Ticket | At Destination |
- Issues: Sussman anomaly was too hard to solve
- Initial state: block B on floor, block C on block A
- Goal state: block A on block B on block C
- Problem: MEA would put B on C, then have to take B back off C and C off A to clear A, undoing its own work
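- A minimal Python sketch of the difference-table lookup for the travel example above (the thresholds and fact names are illustrative, not GPS's actual data structures):

# Each row: a test on the difference (distance), the operation that resolves it,
# and the operation's preconditions.
DIFFERENCE_TABLE = [
    (lambda miles: miles < 0.06, "Walk",  {"At Start", "Not Raining"}),   # ~100 yards
    (lambda miles: miles < 100,  "Drive", {"At Start", "Have Car"}),
    (lambda miles: miles < 100,  "Taxi",  {"At Start", "Have Money"}),
    (lambda miles: miles >= 100, "Fly",   {"At Start", "Have Ticket"}),
]

def choose_operation(distance_miles, facts):
    # Means-ends analysis step: pick the first operation whose difference test
    # matches and whose preconditions already hold; otherwise GPS would recurse
    # on the unmet preconditions as new subgoals.
    for matches, operation, preconditions in DIFFERENCE_TABLE:
        if matches(distance_miles) and preconditions <= facts:
            return operation
    return None

print(choose_operation(250, {"At Start", "Have Ticket"}))  # Fly
print(choose_operation(30,  {"At Start", "Have Money"}))   # Taxi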
- Linear Planning: STRIPS
- Planning Method: search in state space
- Knowledge: STRIPS operators, which encapsulate the knowledge in the earlier difference table
- Operator name and variables
- Precondition list
- Add list for conditions added
- Delete list for differences removed
- Predicate calculus can be used in the precondition, add and delete lists
- Example (sketched in code below):
put block1 block2
pre: clear block1
clear block2
add: on block1 block2
del: clear block2
- Issues:
- Still could do a lot of extra work
- Expensive to come up with the best plan
- Made unnecessary commitments to action ordering
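- A minimal Python sketch of applying a STRIPS operator, assuming states are sets of ground facts; the operator mirrors the put example above and the helper names are illustrative:

# A STRIPS operator: preconditions, add list, delete list.
PUT = {
    "name": "put block1 block2",
    "pre": {"clear block1", "clear block2"},
    "add": {"on block1 block2"},
    "del": {"clear block2"},
}

def applicable(op, state):
    # The operator applies only when every precondition holds in the state.
    return op["pre"] <= state

def apply_operator(op, state):
    # Remove the delete list, then add the add list.
    if not applicable(op, state):
        raise ValueError(op["name"] + ": preconditions not satisfied")
    return (state - op["del"]) | op["add"]

state = {"clear block1", "clear block2", "on block2 table"}
print(apply_operator(PUT, state))
# {'clear block1', 'on block2 table', 'on block1 block2'}

- A linear planner searches the space of states reachable by such applications for one that satisfies the goal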
- Nonlinear planning: NONLIN, UCPOP
- Planning Method: search in plan space
- Knowledge: STRIPS operators
- Store partial plans as lists of dependencies based on operations
- Example: If you need to get car keys, wallet and ticket to fly to the airport, it doesn't matter which you pick up first
- Issues:
- Still uses a low-level grain of detail
- Does not address the issue of unreliable action
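- A minimal Python sketch of the partial-order idea: a plan is stored as steps plus ordering constraints, and any linearization that respects the constraints is valid (the step names are illustrative):

from itertools import permutations

steps = ["get keys", "get wallet", "get ticket", "drive to airport", "fly"]
# Ordering constraints (before, after); the three pick-ups are mutually unordered.
orderings = [
    ("get keys",   "drive to airport"),
    ("get wallet", "drive to airport"),
    ("get ticket", "drive to airport"),
    ("drive to airport", "fly"),
]

def consistent(sequence, orderings):
    # A total order is a valid linearization if it respects every constraint.
    position = {step: i for i, step in enumerate(sequence)}
    return all(position[a] < position[b] for a, b in orderings)

valid = [seq for seq in permutations(steps) if consistent(seq, orderings)]
print(len(valid))  # 6 linearizations: the pick-ups can come in any of 3! orders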
- Hierarchical planning: SIPE
- Planning Method: search in space of task networks
- Knowledge: Decomposable operators:
- Look like STRIPS operators
- High-level versions do not capture many details
- Can be decomposed into lower level operations
- Example: planning a flight at multiple levels of detail
- High level operation: Fly from start to destination
- Decomposition: Fly: get ticket, get bags, drive to airport, get on plane ...
- Decomposition: Get Ticket: go online, select start, select destination ...
- Issues:
- Current frontier in deployed planning systems
- People make money with this
- Extremely powerful formalism
- Theoretical results now just coming in
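- A minimal Python sketch of decomposable operators as a task-to-subtasks table; a real hierarchical planner such as SIPE also tracks preconditions and resolves conflicts while decomposing (the task names are illustrative):

# High-level operations and their decompositions; tasks with no entry are primitive.
decompositions = {
    "fly start destination": ["get ticket", "get bags", "drive to airport", "get on plane"],
    "get ticket": ["go online", "select start", "select destination", "pay"],
}

def expand(task):
    # Recursively expand a task network down to primitive actions.
    if task not in decompositions:
        return [task]
    plan = []
    for subtask in decompositions[task]:
        plan.extend(expand(subtask))
    return plan

print(expand("fly start destination"))
# ['go online', 'select start', 'select destination', 'pay',
#  'get bags', 'drive to airport', 'get on plane']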
- Graph Planning: GRAPHPLAN
- Planning Method: construction of planning graph
- Knowledge: STRIPS operators
- Planning graph encapsulates whole families of operations
- Issues:
- Current frontier in theoretical planning systems
- Generalizes over nonlinear planning approaches
- Now extending to handle range of existing systems
- Decision-Theoretic Planning
- Planning Method: partially observable Markov decision processes (POMDPs)
- Knowledge: probabilistic relationships
- Planning is now over belief states (probability distributions over possible world states)
- Issues:
- Current frontier in robotic planning systems
- Incorporates learning directly into framework
- Now extending to handle range of existing systems
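- A toy Python sketch of the fully observable case (an MDP solved by value iteration); a true POMDP plans over belief states, i.e. probability distributions over these states (the two-state world is illustrative, not a robotic planner):

states  = ["at start", "at goal"]
actions = ["move", "stay"]

# transitions[s][a] = list of (probability, next_state); actions are unreliable.
transitions = {
    "at start": {"move": [(0.8, "at goal"), (0.2, "at start")],
                 "stay": [(1.0, "at start")]},
    "at goal":  {"move": [(1.0, "at goal")],
                 "stay": [(1.0, "at goal")]},
}
reward = {"at start": 0.0, "at goal": 1.0}
gamma = 0.9  # discount factor: future reward matters, but less than immediate reward

def expected(a, s, value):
    return sum(p * value[s2] for p, s2 in transitions[s][a])

# Value iteration: repeatedly back up the best expected value of each state.
value = {s: 0.0 for s in states}
for _ in range(100):
    value = {s: reward[s] + gamma * max(expected(a, s, value) for a in actions)
             for s in states}

policy = {s: max(actions, key=lambda a: expected(a, s, value)) for s in states}
print(policy)  # "move" at the start, despite the 20% chance the action fails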
Acting
Issues
- Responsiveness is crucial
- Effectors are unreliable
- Thinking is not always needed
Case Study: Schema-based reactive control
- Control method: sensor-based vector computation
- Knowledge: behaviors:
- Releaser: determines when the behavior is active
- Perceptual schema: determines what to pay attention to
- Motor schema: determines what to do in the world
- Motor schema: often implemented as superimposable vector fields
- Wandering: random motion
- Approach: vectors towards target
- Avoidance: vectors away from obstacles
- Navigation: a "lane" of forward motion with wall repulsion
- Formations: compute based on vectors of opponents
- Issues:
- Can be computed quickly from sensor data
- At low level, vulnerable to box canyons and other traps
- Can be orchestrated by higher level planning systems
- Can be learned through a variety of approaches
- Neural networks
- Genetic algorithms
- Decision theory
- Case-based reasoning
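- A minimal NumPy sketch of superimposable vector fields: each active schema contributes a vector and the commanded motion is their sum (the gains, radii, and positions are illustrative):

import numpy as np

def approach(robot, target, gain=1.0):
    # Constant-magnitude vector toward the target.
    v = target - robot
    return gain * v / (np.linalg.norm(v) + 1e-9)

def avoid(robot, obstacle, radius=2.0, gain=1.5):
    # Vector away from the obstacle, strongest when close, zero outside the radius.
    v = robot - obstacle
    d = np.linalg.norm(v)
    if d > radius:
        return np.zeros(2)
    return gain * (radius - d) / radius * v / (d + 1e-9)

def wander(gain=0.3):
    # Small random vector that keeps the robot exploring.
    return gain * np.random.uniform(-1.0, 1.0, size=2)

robot    = np.array([0.0, 0.0])
target   = np.array([10.0, 0.0])
obstacle = np.array([1.5, 0.5])

# Active behaviors superimpose: the commanded motion is simply the vector sum.
command = approach(robot, target) + avoid(robot, obstacle) + wander()
print(command)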
Learning
Issues
- Dynamic worlds - what's there today can be gone tomorrow
- Don't want to learn sensor ghosts and input noise
- Vast amounts of "normal" input can swamp edge cases
- Decision theoretic planning helps but isn't the end of the story
- Hierarchical level-of-detail world models from computer games and graphics
State of the Art
Practical Robots
- Vacuum cleaners - Roomba
- Automated carts
- Kiosk robots
Planetary Rovers
- Mars Pathfinder and the Sojourner Rover
- Spirit and Opportunity Rovers
- More to come
Unmanned Autonomous Vehicles (UAVs)
- Underwater Autonomous Vehicles - successful
- Unmanned Aerial Vehicles - less so
- The DARPA Grand Challenge - making progress
Humanoid Robots
- Cog
- ASIMO
- QRIO
- HRP-2
- Sumobots
Other Robots and Spinoffs
- Mecha
- Powered Suits
- Powered Chairs
- Enryu
- Others
- Dogs
- Games
- Computer Graphics
- Harry Potter
- Lord of the Rings
Research Frontiers
- Biorobotics
- Robosoccer
- Emotional Robots