Introduction to Artificial Intelligence with Applications to Public Health
Lecture 1 - Overview
Dr. Anthony G. Francis, Jr.
January 24, 2005
Deep Thought, Deep Blue, and Deep Space One:
Artificial Intelligence in Science Fiction, in the Laboratory, and in Reality.
· Course Outline and Expectations
· What is Artificial Intelligence?
· Philosophy, Science and Engineering
· Enabling Agents to Perform in Environments
· Perceiving the World
· Acting in the World
· Choosing Actions
· Representing Choices
· Building Plans
· Representing Uncertainty
· Learning
· Communicating
· Major Areas of Artificial Intelligence
· Natural Language Processing
· Knowledge Representation
· Planning and Problem Solving
· Machine Learning
· Vision
· Robotics
· Philosophy
· Views of Artificial Intelligence
· In Science Fiction: Deep Thought, HAL, and I, Robot
· In the Laboratory: Deep Blue, ISAAC
· In Reality: Deep Space One, Crime Systems
· How can Artificial Intelligence Support Public Health?
· Data Processing: Language Understanding and Vision
· Data Mining: Machine Learning and Information Retrieval
· Knowledge Representation: Vocabularies and Ontologies
· Begin readings for Week 2
· Set up Python on your computing environment
Introduction to Artificial Intelligence with Applications to Public Health surveys the concepts, technologies and issues of artificial intelligence and how AI can be applied to improve public health outcomes. The course will begin with lectures on the scope and history of artificial intelligence, will examine popular AI techniques and subject areas in a series of topical lectures, and then discuss applications of AI to public health. The course will conclude with a discussion of the philosophy of artificial intelligence.
Winter 2005 is the first time this Introduction to AI has been offered at the School of Public Health, so we will try out several ideas to make this course valuable to the public health community. Each class will begin with a self-contained discussion of a topic in artificial intelligence. After this introduction the lecture will then focus on more detailed topics that build on previous lectures. Finally, because artificial intelligence is not just the science of intelligent systems, but also the engineering of intelligent software, the lectures will conclude with a discussion of software engineering techniques relevant to the class assignments.
The class includes three primary forms of evaluation: tests, essays, and programming. However, studying artificial intelligence programming per se is usually a course in and of itself, so we will attempt to develop a series of programming exercises that will enable you to get a good conceptual understanding of the guts of artificial intelligence techniques without getting bogged down in irrelevant programming details. The programming language for this course is Python, a modern, powerful, and above all readable scripting language available on Windows, Macintosh, UNIX and the Java platform. If you do any significant data manipulation a scripting language like Python, Perl or Ruby can help you, so hopefully the programming in this course will add a useful tool to your toolkit in addition to exposing you to AI concepts.
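To give a taste of why a scripting language helps here, below is a minimal Python sketch of a typical one-off data manipulation task: counting how often each value appears in a file of records. The filename and data format are hypothetical, invented purely for illustration.

    # Count how often each value appears, one value per line.
    # "records.txt" is a hypothetical input file.
    counts = {}
    for line in open("records.txt"):
        value = line.strip()
        if value:
            counts[value] = counts.get(value, 0) + 1
    for value, n in sorted(counts.items()):
        print("%s %d" % (value, n))

A dozen lines like these can replace an hour of spreadsheet work, which is the spirit of the exercises in this course.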
What is artificial intelligence? Well, what is intelligence? What does it mean for a person to be intelligent? Is intelligence an ineffable property of the human spirit, or the byproduct of physical and biological principles that we can understand? In other words, are people natural machines? On that note, what is artificial? What are the properties of artificial systems such as machines? Can the behavior of a machine designed by the hand of man be called in any way “intelligent”, or should we ascribe that intelligence to the person who designed it? Even if we assume that people are natural machines, and that intelligence was simply a product of a process ongoing in these natural machines, can we replicate that process in an artificial system — and even if we could, should we?
The short answer is, I don’t care.
Now that’s not really true; I do care about all of those questions, especially the last one — what is the proper use of the technology of artificial intelligence. And we’re going to come back to all of these questions at the end of the course — because these are primarily philosophical questions, not scientific or engineering ones. For almost every science, such as physics, biology, or psychology, there are significant philosophical questions at the core of the topic — such as “what are space and time”, “what constitutes life”, and “what is the scientific status of inner mental life” — which are continually re-examined and addressed as the science evolves.
However, as sciences mature these questions begin to turn from mysteries that can’t be understood into problems that can be addressed using models and experiments — for example, in physics, space and time, which once were very mysterious, are now understood as two sides of the same coin — or at least can be viewed very productively as two elements of the same model.
Science is a method for understanding the world which involves collecting evidence, organizing it into categories, and developing models that explain the observations we have seen. As science progresses the models we use have become increasingly expansive and robust, and we are able to rely on these models to predict many features of the world and to develop amazing technological advances. This kind of revolution has already happened with physics, chemistry and biology, and is now happening with intelligence.
Originally intelligence was a mystery. Then philosophers established that it was a mystical property of human agents, though perhaps divided into faculties of memory, reasoning, imagination and so on. As the study of mind shifted from philosophy to psychology, scientists began to realize that intelligence might not be a mystical property of humans alone but a phenomenon in its own right — first through analogy with the behavior of animals, and then later through analogy with the information processing of machine systems. Researchers began to focus on the unique properties of many different areas of interest in both human and machine behavior — nevertheless, until fairly recently intelligence was still viewed, at least intuitively, as a largely unitary entity that you either had or you didn’t.
The modern science of intelligence — cognitive science, which combines psychology, neurology, sociology, anthropology, linguistics, and artificial intelligence — has largely abandoned the unified view of intelligence in favor of a view of intelligence as the action of many specialized, cooperating perception, reasoning and action systems which are evolutionarily adapted to the tasks humans encounter living in their environments. While there are definitely general principles of behavior and information processing that cut across all intelligent systems, modern psychological studies have revealed that the human brain has an extraordinarily wide range of highly specialized inference mechanisms which help us understand the world — just listening to this lecture recruits systems for eye tracking, face recognition, facial feature differentiation, hearing, speech prosody, word segmentation, word meanings, grammar, pragmatics, social interactions, and so on — all of which can be teased apart through behavioral experiments, brain lesion studies or functional magnetic resonance imaging.
Ok, that’s the current state of cognitive psychology — what does that have to do with artificial intelligence? It’s this. Studies of artificial intelligence have traditionally focused on one path towards making intelligent systems — control theory, information theory, logic, symbolic processing, production systems, neural nets, reactive systems, genetic algorithms — but a review of the actual practice of the field reveals the same thing that cognitive science reveals: successful intelligent systems combine many different kinds of information processing systems together to perform their tasks. Generalizing from the human case, artificial intelligence is the science of understanding the information processing necessary for an agent to function in its environment, and the engineering practice of using that understanding to create useful software artifacts. While there are definitely cross cutting concerns that operate across the breadth of artificial intelligence — computer engineering, programming languages, information theory, formal logic, state-space search, graphs and networks, grammar and automata — the main bulk of artificial intelligence consists of understanding specific reasoning tasks, such as visual analogy, scene processing, or pathfinding — and then applying them to specific problems, such as predicting protein structure from visual similarity, detecting lanes in a road, or tracking a human opponent in a computer game.
So, in at least a limited sense, we can view both humans and the machines they create as agents — self-contained three-dimensional patterns persisting through a roughly four-dimensional space-time, which can be abstracted as systems with some internal state interacting with an environment through inputs and outputs. And this lets us scope what belongs to artificial intelligence rather than to physics, chemistry, biology or engineering: we want to focus on the information processing the agent needs to do, rather than the general laws it obeys, the substances it is made from, or the processes that enable it to function. So what are the major kinds of information processing that an agent needs to do?
· Perceiving the World
The first kinds of information processing that an agent needs to perform are sensing — detecting physical changes in the environment corresponding to the agent’s inputs — and perception — transforming those physical signals into information representing light, sound, touch, and so on. Sensation and perception are closely aligned and tied into the general fields of signal processing and information theory, but in practice there are specific problems to be solved in machine vision, speech understanding, tactile perception and so on.
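To make the step from raw sensation toward usable perception concrete, here is a toy Python sketch (all numbers invented): a moving average, one of the simplest signal-processing operations, smoothing a noisy one-dimensional sensor signal.

    # Smooth a noisy 1-D sensor signal with a simple moving average.
    def moving_average(signal, window=3):
        smoothed = []
        for i in range(len(signal)):
            lo = max(0, i - window + 1)
            chunk = signal[lo:i + 1]
            smoothed.append(sum(chunk) / float(len(chunk)))
        return smoothed

    raw = [0.1, 0.9, 0.2, 1.1, 0.3, 5.0, 0.2]   # the 5.0 spike is sensor noise
    print(moving_average(raw))                  # the spike is damped in the output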
· Acting in the World
The next kind of information processing an agent needs to perform is action — transforming internal agent state into effected physical changes corresponding to the agent’s outputs. The primary systems that perform this are muscles in humans and motors in machines. The simplest possible agents can wire perception and action directly together — enabling them to move toward food sources but away from light, for example. Some of the earliest attempts at artificial life were simple photosensitive robots, and some modern robots use this reactive systems approach for goal seeking and obstacle avoidance. There are also considerable subtleties in building complicated robotic action systems.
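As a toy illustration of the reactive approach, here is a hypothetical photosensitive robot in Python, in the style of a Braitenberg vehicle: each light sensor drives the wheel on its own side, so the robot swings away from bright light with no internal state at all. The sensor values and wiring are invented for illustration.

    # Hypothetical reactive agent: sensors wired directly to motors.
    def reactive_step(left_light, right_light):
        # Uncrossed wiring: more light on the left speeds the left wheel,
        # turning the robot to the right, away from the light source.
        left_motor = left_light
        right_motor = right_light
        return left_motor, right_motor

    print(reactive_step(0.9, 0.1))   # left wheel faster: robot veers right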
· Choosing Actions
More complicated agents will have more than one potential action to perform and thus must decouple perception and action. At the most basic level this can operate in the form of one behavior suppressing another — for example, a cockroach might be modeled as three reactive systems: a food-seeking system, a darkness-seeking system, and an inhibitor that suppresses food-seeking and activates darkness-seeking in the presence of light. Modern robots with subsumption architectures use this kind of approach to build more robust behaviors on top of simple schemas.
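Here is a minimal Python sketch of the cockroach example (the threshold and action names are invented): two behaviors propose actions, and the inhibitor suppresses food-seeking whenever light is present.

    # Toy arbitration between reactive behaviors, cockroach-style.
    def choose_action(light_level, toward_food, toward_dark):
        if light_level > 0.5:      # inhibitor: light suppresses food-seeking
            return toward_dark     # darkness-seeking takes over
        return toward_food         # otherwise, seek food

    print(choose_action(0.9, "go left", "go right"))   # "go right"
    print(choose_action(0.1, "go left", "go right"))   # "go left"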
· Representing Choices
As environments become more complicated, the kinds of actions an agent needs to perform can no longer be specified as simple reactive behaviors. For example, any given move in chess can be executed by a range of arm positions, but subtle differences in the final arm position can make the difference between checkmating your opponent and facing mate in two. Considering these different possibilities requires creating and manipulating representations of the world state and potential actions. While there are many different kinds of knowledge representations, some of the most prominent are based on symbolic structures, formal logic, neural networks or potential fields.
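As a toy sketch of what representing choices can look like in code (the position and moves are invented), here a board state is an explicit data structure, and candidate actions are data we can generate, compare and manipulate rather than reactions we simply execute.

    # A chess position as a dictionary from squares to pieces.
    position = {"e1": "WK", "e8": "BK", "d1": "WQ"}    # hypothetical position
    candidate_moves = [("d1", "d8"), ("d1", "h5")]     # moves we can compare

    def apply_move(position, move):
        src, dst = move
        new_position = dict(position)
        new_position[dst] = new_position.pop(src)
        return new_position                  # a new state we can evaluate

    for move in candidate_moves:
        print(move, apply_move(position, move))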
· Building Plans
A chess player at my level can hold their own against a peer or someone who just learned the game, but a true master-level player will wipe the floor with me. Why? My level of chess is basically reactive, attempting to find available choices that will help me and hurt my opponent. A master-level player will instead develop a plan, a sequence of actions that he selects and evaluates as a whole. Most real-world environments require planning for success. Planning can be split into two major groups: symbolic planning, which focuses on finding the right knowledge representation for a plan, and decision-theoretic planning, which focuses on learning the probabilities of successful actions. Decision-theoretic planning is slowly subsuming the symbolic planning approach.
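A toy Python contrast makes the reactive-versus-planning difference concrete (all rewards invented): a greedy player grabs the best-looking first move, while a planner evaluates whole two-move sequences and finds a better line.

    # Rewards for first moves and for two-move sequences in a tiny game tree.
    PATHS = {("a",): 5, ("b",): 1,
             ("a", "x"): 0, ("a", "y"): 2,
             ("b", "x"): 10, ("b", "y"): 4}

    def greedy():
        first = max(["a", "b"], key=lambda m: PATHS[(m,)])
        second = max(["x", "y"], key=lambda m: PATHS[(first, m)])
        return (first, second), PATHS[(first,)] + PATHS[(first, second)]

    def planner():
        best = max([(f, s) for f in "ab" for s in "xy"],
                   key=lambda seq: PATHS[(seq[0],)] + PATHS[seq])
        return best, PATHS[(best[0],)] + PATHS[best]

    print(greedy())    # (('a', 'y'), 7): strong first move, weak continuation
    print(planner())   # (('b', 'x'), 11): weaker first move, better plan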
· Representing Uncertainty
Unlike chess, most real-world domains have considerable uncertainty. If you hear a sudden snap in the woods, it may not be clear whether that’s a falling pine cone or an approaching bear; and if you decide it is a bear, you may not be certain whether you can leap across the nearby ravine or, if you are successful, whether the crumbly bank on the other side will support you. Bayesian reasoning is the most prominent symbolic method for representing uncertainty, whereas action choices depend on decision theory, which draws on game theory, statistics and operations research.
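Here is the snap-in-the-woods example worked as a toy Bayesian calculation in Python; every probability below is invented for illustration.

    # Bayes' rule: P(bear | snap) = P(snap | bear) P(bear) / P(snap).
    p_bear = 0.01                    # prior: bears are rare
    p_snap_given_bear = 0.8          # bears usually snap branches as they move
    p_snap_given_no_bear = 0.1       # pine cones occasionally fall anyway

    p_snap = (p_snap_given_bear * p_bear
              + p_snap_given_no_bear * (1 - p_bear))
    p_bear_given_snap = p_snap_given_bear * p_bear / p_snap
    print(p_bear_given_snap)         # about 0.075: one snap is weak evidence

Note how the rarity of bears (the prior) dominates the answer: even strong evidence only raises the posterior to about 7.5 percent.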
· Learning
For a simple game like tic-tac-toe it’s possible in theory to enumerate all the possible states of play and to code every decision by hand, but in general the possible actions and combinations of actions available to an agent cannot be specified in advance. Agents need to learn from their environment in order to function — where learning can be defined as the acquisition of symbolic information, such as Marvin Minsky’s phone number, or alternatively as the improvement of task performance, such as facility with the phone systems in Japan after repeated use. Machine learning is a distinct subfield of artificial intelligence, and includes symbolic, neural and genetic approaches.
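The tic-tac-toe claim is easy to check with one line of arithmetic: each of the nine squares is blank, X or O, so 3 to the 9th power is a loose upper bound on board states, small enough to enumerate and hand-code, which is exactly why no learning is needed there.

    # Loose upper bound on tic-tac-toe board states (many are unreachable).
    print(3 ** 9)   # 19683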
· Communicating
Simpler organisms learn only within the scope of their species-specific behaviors, and their growth as a species is limited by the speed of genetics. Humans have the ability to communicate what they have learned to each other, enabling faster cultural evolution. Language is a key aspect of this approach, enabling humans to serialize their thoughts into a stream of sound or visual symbols which other humans can decode and turn into new thoughts in their own heads. Natural language understanding and agent theory are two of the tools artificial intelligence brings to bear on these problems.
These areas are drawn with broad brush strokes and are not meant to be exhaustive of the problems that human and machine agents have to solve to navigate their environments. Humans also have emotional systems, which many researchers argue will be key to developing reliable robots that will have a healthy fear of tumbling down an open stairwell — and simulated emotion models which have already been key to certain computer game and computer theater applications. Humans are conscious, and other researchers argue that at least some of the properties of consciousness stem from our ability to integrate information across all our different sensations, perceptions and cognitions into a unitary state that can serve as a key for fast attention focusing, decision making and memory retrieval — clearly another area that we could exploit in the construction of better machines.
Just as clearly, we could flesh out all of these systems and wire them correctly in a machine and still find that we’ve missed something — the subtle quality of feeling we get when we see red, the fine shades of human judgement, the true spark of insight. I’m not being poetic — these are genuine philosophical issues called qualia, discretion and creativity. The ultimate failure to address these problems with an information processing model would be a great victory for the scientific method, akin to finding that Maxwell’s laws of electromagnetism can’t explain the red glow of a furnace, or that electricity and gravity together can’t explain why all the mass of an atom is located in its nucleus.
The scientific method progresses not by declaring that a problem is unsolvable and avoiding it, but by building the best models we can and pushing them as far as they can go. Along the way, the work scientists have done to break bad models like Newtonian mechanics and classical electromagnetism has produced deep insights about the world, which we have exploited to produce astounding technical achievements. The study of mind through information processing has not yet pushed far enough to find its breaking point, but it too has produced astounding technical achievements: even your humble word processor can probably check your spelling and your grammar, help you write a fax or letter, or even summarize a long document for you. Some of these technologies will be of great use for public health.
While the project of understanding intelligence spans physics, chemistry, biology, psychology, sociology, engineering, and computation, there are several areas that are generally considered to be a part of artificial intelligence proper. These areas include:
· Natural Language Processing
Natural language understanding pioneered one of the most powerful tools in the computer science arsenal: formal grammars, which are closely related to automata. In addition, natural language processing employs a variety of specialized algorithms for perceiving speech (combining multiple sources of evidence via neural networks or blackboard systems), breaking language streams down into words (word segmentation), grouping words into sentences (sentence processing), determining the meaning of sentences (semantic analysis) and understanding the emotional affect, practical aspects and purposive role of sentences in larger structures (prosodic analysis, pragmatic analysis, speech acts and discourse theory).
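To show how small a formal grammar can be, here is a toy recursive-descent recognizer in Python for the grammar S -> NP VP, NP -> Det N, VP -> V NP; the lexicon is invented for illustration.

    # Tiny lexicon mapping words to parts of speech.
    LEXICON = {"the": "Det", "a": "Det", "dog": "N", "cat": "N", "sees": "V"}

    def parse(words):
        def np(i):                       # NP -> Det N
            if i + 1 < len(words) and LEXICON.get(words[i]) == "Det" \
                    and LEXICON.get(words[i + 1]) == "N":
                return i + 2
            return None
        def vp(i):                       # VP -> V NP
            if i < len(words) and LEXICON.get(words[i]) == "V":
                return np(i + 1)
            return None
        j = np(0)                        # S -> NP VP
        return j is not None and vp(j) == len(words)

    print(parse("the dog sees a cat".split()))   # True
    print(parse("dog the sees cat a".split()))   # False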
· Knowledge Representation
Knowledge representation builds on the principles of formal logic, including Boolean algebra, propositional logic, predicate calculus, modal logic, and fuzzy logic, and on the principles of programming languages, which introduce ideas such as symbols, variables and bindings. These are in turn used as the foundations for representation schemes such as description logics, conceptual graphs, semantic networks, frame systems, and production systems. There are also a variety of subsymbolic approaches, such as neural networks, genetic algorithms, and the Copycat architecture.
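As a toy sketch of a frame system in Python (the frames and slots are invented): frames are dictionaries, and an "isa" slot gives inheritance of default values up the hierarchy, so specific frames can override general ones.

    # Frames as dictionaries with "isa" inheritance.
    FRAMES = {
        "bird":    {"isa": None,      "can_fly": True, "legs": 2},
        "penguin": {"isa": "bird",    "can_fly": False},
        "opus":    {"isa": "penguin"},
    }

    def get_slot(frame, slot):
        while frame is not None:
            if slot in FRAMES[frame]:
                return FRAMES[frame][slot]
            frame = FRAMES[frame]["isa"]   # climb the isa hierarchy
        return None

    print(get_slot("opus", "can_fly"))   # False: penguin overrides bird
    print(get_slot("opus", "legs"))      # 2: inherited from bird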
· Planning and Problem Solving
Planning, like knowledge representation, is one of the areas that is distinctively AI. Simple planning systems involve search through state spaces using algorithms such as depth-first search, breadth-first search, iterative deepening, and A*; A* in particular is used for pathfinding in many game-playing systems. Game systems also use a strategic tool called an influence map that enables a machine to understand how much control of the game field it and its opponents have. More complicated planning systems use the STRIPS representation or hierarchical task networks.
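Here is a minimal A* pathfinder on a small grid in Python, using Manhattan distance as the admissible heuristic; the grid and unit step costs are invented for illustration.

    import heapq

    def astar(grid, start, goal):
        # grid is a list of strings; '#' marks an impassable cell
        frontier = [(0, start, [start])]          # (f = g + h, cell, path)
        seen = set()
        while frontier:
            f, (r, c), path = heapq.heappop(frontier)
            if (r, c) == goal:
                return path
            if (r, c) in seen:
                continue
            seen.add((r, c))
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) \
                        and grid[nr][nc] != '#' and (nr, nc) not in seen:
                    g = len(path)                  # steps taken so far
                    h = abs(goal[0] - nr) + abs(goal[1] - nc)  # Manhattan
                    heapq.heappush(frontier,
                                   (g + h, (nr, nc), path + [(nr, nc)]))
        return None

    grid = ["....",
            ".##.",
            "...."]
    print(astar(grid, (0, 0), (2, 3)))   # a shortest path around the wall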
· Machine Learning
Machine learning is a distinct field in its own right. The science of machine learning relies heavily on standardized problems against which algorithms can be tested empirically. Machine learning includes both supervised and unsupervised approaches. In addition to basic techniques such as decision trees and clustering algorithms, machine learning approaches also include neural networks, genetic algorithms, case-based reasoning, self-organizing maps, and a host of other approaches.
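As a minimal sketch of supervised learning, here is a one-nearest-neighbor classifier in Python: it "learns" simply by storing labeled examples and classifies a new case by the label of the closest stored one. The tiny training set is invented.

    # 1-nearest-neighbor: classify by the closest labeled example.
    def nearest_neighbor(examples, query):
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))
        best = min(examples, key=lambda ex: dist(ex[0], query))
        return best[1]                 # label of the closest training example

    training = [((1.5, 50), "child"),  # (height m, weight kg) -> label
                ((1.8, 80), "adult"),
                ((1.7, 70), "adult")]
    print(nearest_neighbor(training, (1.6, 55)))   # "child"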
· Vision
Because of its high data requirements, vision has many unique algorithms primarily based on signal processing research. Vision systems attempt to analyze scenes by detecting edges, regions and connectivity using technology such as Gaussian and Laplacian filters, building up 2D, 2½D and 3D models of the world using both single-image and stereo vision; while originally data was seen flowing from the raw image up to the 3D model, information is now seen flowing both ways. Similarly, while much vision research was once conducted with static images, more recent results seem to indicate that correct vision models for active robotic systems need to incorporate both a dynamic model of the moving agent and the ability to actively track objects with a mobile camera.
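To make edge detection concrete, here is a toy Python convolution of a tiny grayscale image with a Laplacian kernel, which responds strongly where intensity changes sharply; the image values are invented.

    # Convolve a small grayscale image with a 3x3 Laplacian kernel.
    KERNEL = [[0,  1, 0],
              [1, -4, 1],
              [0,  1, 0]]

    def convolve(image, kernel):
        h, w = len(image), len(image[0])
        out = [[0] * w for _ in range(h)]
        for r in range(1, h - 1):          # skip the border for simplicity
            for c in range(1, w - 1):
                out[r][c] = sum(kernel[i][j] * image[r + i - 1][c + j - 1]
                                for i in range(3) for j in range(3))
        return out

    # A dark region (0) beside a bright region (9): the filter output is
    # zero inside uniform regions and large at the boundary, i.e. the edge.
    image = [[0, 0, 9, 9]] * 4
    for row in convolve(image, KERNEL):
        print(row)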
· Robotics
Robotic systems have made many contributions to artificial intelligence. On a basic level, robotics made artificial intelligence researchers aware that simple simulation-based models were “doomed to succeed” and forced researchers to deal with the issues of error-prone sensors, sloppy effectors and buggy software. There are a number of purely mathematical issues involved in the design and implementation of robotic hardware; the biggest contributions to AI, however, are in the area of control regimes, including the subsumption architecture, the animat approach, the AuRA architecture, and fuzzy state machines, the last of which have seen good use in game artificial intelligence.
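As a sketch of the state-machine idea (states, inputs and thresholds all invented), here is a crisp finite state machine for obstacle avoidance in Python; a fuzzy state machine would replace the hard thresholds with degrees of membership, blending behaviors instead of switching abruptly.

    # A two-state controller: cruise until an obstacle is near, then avoid.
    def next_state(state, obstacle_distance):
        if state == "cruise" and obstacle_distance < 1.0:
            return "avoid"                 # something is close: start avoiding
        if state == "avoid" and obstacle_distance >= 2.0:
            return "cruise"                # clear again: resume normal travel
        return state                       # otherwise, stay in current state

    state = "cruise"
    for reading in [5.0, 0.8, 0.9, 2.5]:   # simulated range-sensor readings
        state = next_state(state, reading)
        print(reading, state)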
· Related Areas
Not usually considered part of artificial intelligence, the following areas are nonetheless part of any serious attempt to build a competent artificial intelligence system:
· Computer Engineering
· Programming Languages
· Information Theory and Signal Processing
· Formal Logic
· Discrete Mathematics and Probability
· Linguistics and Automata Theory
· Economics, Decision Theory and Operations Research
· Graph and Network Theory
· Evolutionary Theory
· Fuzzy Logic
· Pattern Matching
Artificial intelligence lives a colorful life in movies and television shows, all of which give an exaggerated view of both the capabilities and hazards of AI systems. Real-world artificial intelligence systems are more prosaic, limited, and safe, though they have achieved a variety of behaviors you’d never expect from a typical science fiction movie. An even smaller subset of behaviors have been deployed in the field — sometimes so seamlessly that most people aren’t even aware that artificial intelligence is involved.
The image of artificial intelligence in popular culture ranges from friendly buffoons like C3PO and R2D2 to murderous psychopaths like HAL and the Terminator. This isn’t a literature or film studies class, but the following intelligent machines are a fair sample:
· Giant Brains
· HAL 9000 (2001, 2010)
· Colossus (Colossus: The Forbin Project)
· P-1 (The Adolescence of P-1)
· LCARS (Star Trek: The Next Generation)
· Robots
· C3PO and R2D2 (the Star Wars series)
· Asimov’s Robots (I, Robot, The Bicentennial Man)
· Dr. Theopolis and Twiki (Buck Rogers in the 25th Century)
· Androids
· The Terminator (The Terminator series)
· Commander Data (Star Trek: The Next Generation)
· The Doctor (Star Trek: Voyager)
These fictional artificial intelligences and robots are often entertaining and certainly inspiring to aspiring researchers in the field — I know more than one researcher in artificial intelligence inspired by the example of HAL, or Commander Data — but present a distorted picture of what AI is capable of, what it’s not capable of, and what hazards it poses.
Artificial intelligences in movies are shown to have vast intellectual powers — they can read vast amounts of information quickly, draw relevant conclusions instantly, read shadowed lips through distorted glass, and, if they have bodies, perform arbitrary martial arts moves with expert precision. You never see HAL making a mistake in chess or the Terminator slipping on a banana peel except for humor. Yet in the real world there are more possible chess games than atoms in the known universe, and any given floor’s slipperiness is an unknown quantity until you put your foot down on it. This is why I like the scene from Terminator 2 in which Robert Patrick’s character, the T-1000, is distracted for a moment in a fight by a silvery department-store mannequin. In the real world, agents have noisy sensors and a finite amount of time to process the data they do receive, and the T-1000’s moment of confusion is precisely the kind of error we would expect to see from that kind of robot in the real world.
Artificial intelligences are drawn in broad strokes — in Star Trek, Commander Data is said to have no feelings, and the writers play his lack of understanding of human emotion for laughs. He’s shown to be incredibly awkward in conversation and on the dance floor, yet at the same time he is shown to have attachments, friends, and even the occasional lover. Writers use lack of emotion, feeling or social graces as a shorthand for the concept of “machine”, but in reality artificial intelligences in the laboratory have been shown to exhibit a wide range of human behaviors, including simulating emotional responses, showing surprise at betrayal, improvising jazz, writing short stories, drawing pictures, and remembering favorably the researcher who scratched them behind the ears. It would be unsurprising if a machine person like Commander Data had no such cognitive deficits — his inability to use contractions was supposedly deliberately programmed in — and just as machine intelligences won’t be able to solve all problems with a wave of a metal paw, they will be able to guess that someone who’s crying has had their feelings hurt and might need a shoulder to lean on — or at least a mocha.
In The Terminator, Skynet spontaneously becomes self-aware after having been granted control over all of America’s weaponry and decides to wipe us out in a millisecond. Fourteen years earlier, the movie Colossus: The Forbin Project depicted a far smarter yet more benign machine seizing control of the world — after having been sealed in an impregnable cave and granted control of all of America’s weaponry. Both systems seized control after their complexity transcended their designers’ intentions. Meanwhile, back on planet reality, the people actually in control of that weaponry, the Defense Advanced Research Projects Agency, or DARPA, were cutting funding for research in artificial intelligence because the robots of the time — shaky affairs which could barely make it down the hall under their own power without breaking down — were hardly considered much of a threat. At the same time, larger and larger computer systems were being built, and they constantly transcended their designers’ intentions by becoming too buggy to run and too large to fix.
It’s true that some artificial intelligence researchers are actively pursuing ways to create “emergent behavior”, where the system performs behaviors beyond its designers’ initial expectations, particularly in areas such as genetic algorithms. However, the emergent behavior we have seen so far is extremely limited, and more complex systems tend to get crufty and break down rather than become superintelligent and supercapable. It’s also true that people have died because perfectly ordinary computer systems, not normally viewed as artificial intelligences, have behaved in unexpected ways that led to tragedy. An incorrect amount of radiation from an X-ray machine or a wrong move on the controls of an airplane can be deadly. But these instances are rare because humans do not generally place machinery that can threaten human life in the hands of any system whose behavior can’t be reliably predicted! So it’s much more likely that the AIs of the future will fail not by becoming superintelligent and filling our skies full of death, but by becoming superunstable and crashing our computers with the blue screen of death — or by having their funding pulled before they have the chance to become too “interesting”.
This is one area where I feel the philosophy of artificial intelligence is important: deciding what we should do with artificial intelligence. For example, in Computer Power and Human Reason, Joseph Weizenbaum argued that we should not place tasks that require human judgment or decision making in the hands of machines — and indeed we’ll find as we go through the course that judgment is one of the current weak points of artificial intelligence systems. In another book, Mind Children, Hans Moravec argues that one day we will develop machines with the same mental capacities as humans and will have to decide their moral status. More simply, I look at the lessons of Commander Data and the Terminator: if you develop a machine intelligence as smart as a person, by all means treat it like a person — but don’t give it the keys to our nuclear arsenal.
In the laboratory, artificial intelligence work is generally more focused and prosaic.
One surprising use of visual reasoning is in predicting protein structure. I’m currently corresponding with a researcher who is predicting the structure of new proteins based on the visual similarity of structures they share with previously analyzed proteins and on visual analogy to those proteins’ known structures.
Artificial intelligence can serve public health in three primary ways: helping public health professionals get data, helping us analyze data, and helping us share data with each other in the community.
Optical character recognition and computer speech recognition systems have become so much a part of our modern computer landscape that we forget that they were originally artificial intelligence technologies. Future applications of artificial intelligence to the data processing problem in public health can come from exploiting natural language understanding — first to aid in the coding of medical records in the form of standard vocabularies, and later in the form of extracting the actual semantic content of the records. This extraction might be only 90 percent accurate, but machines can flag ambiguous texts for further human processing at a great savings of human time and effort. Similarly, machine vision technologies are becoming increasingly robust and will provide a foundation for extracting even more information from medical records.
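As a hedged sketch of the flag-for-review idea (the extractor, the codes and the threshold are all invented for illustration), here is the triage loop in Python: a hypothetical coder returns a confidence score, and only low-confidence records are routed to a human.

    # Hypothetical triage: auto-accept confident codings, flag the rest.
    def triage(records, extract, threshold=0.9):
        automatic, needs_review = [], []
        for record in records:
            coded, confidence = extract(record)
            if confidence >= threshold:
                automatic.append((record, coded))
            else:
                needs_review.append(record)   # a human resolves the ambiguity
        return automatic, needs_review

    def toy_extract(record):
        # Stand-in for a real coder: confident only on an exact keyword.
        if "measles" in record:
            return ("055 (measles, ICD-9)", 0.99)
        return (None, 0.5)

    print(triage(["patient presents with measles", "pt c/o rash NOS"],
                 toy_extract))

Even with an imperfect extractor, a loop like this concentrates scarce human attention on exactly the records that need it.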