Press "Enter" to skip to content

Posts published in “Research”

Posts and pages about my scientific research and the scientific topics I’m interested in.

[twenty twenty-five day two six nine]: it’s dangerous to slog alone, take this stack of textbooks

centaur 0

So I wasn't kidding about the long slog: I am still chewing through the classic textbook Pattern Recognition and Machine Learning (PDF) by Christopher Bishop, day and night, even though he's got a newer book out. This is in part because I'm almost done, and in part because his newer book focuses on the foundations of deep learning and is "almost entirely non-Bayesian" - and it's Bayesian theory I'm trying to understand.

This, I think, is part of the discovery I've made recently about "deep learning" - by which I mean learning in depth by people, as opposed to deep learning by machines: hard concepts are by definition tough nuts to crack, and to really understand them, you need to hit them coming and going - to break apart the concept in as many ways as possible to ensure you can take it apart and put it back together again. As Marvin Minsky once said, "You don't understand anything until you learn it more than one way."

To some people, that idea is intuitive; to others, it is easy to dismiss. But if you think about it, when you're learning a subject you don't know, it's like going in blind. And like the parable of the blind men and the elephant - each of whom touched one part of an elephant and assumed they understood the whole - if you dig deeply into a narrow view of a subject, you can get a distorted view, like extrapolating a giant snake from an elephant's trunk, or a tall tree from its leg, or a wide fan from its ear, or a long rope from its tail.

Acting as if those bad assumptions were true could easily get you stomped on - or skewered, by the elephant's tusk, which is sharp like a spear.

So back to Bayesian theory. Now, what the hell is a "Bayes," some of you may ask? (Why are you reviewing the obvious, others of you may snark). Look, we take chances every day, don't we? And we blame ourselves for making a mistake if we know that something is risky, but not so much if we don't know what we don't know - even though we intuitively know that the underlying chances aren't affected by what we know. Well, Thomas Bayes not only understood that, he built a framework to put that on a solid mathematical footing.
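If you want the one-line version of that framework, here it is in modern notation (my summary, not a quote from Bayes or Bishop): a posterior belief is a prior belief reweighted by how well each hypothesis explains the evidence.

```latex
% Bayes' rule: update a prior belief P(H) in a hypothesis H, given evidence E.
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
\qquad
P(E) = \sum_i P(E \mid H_i)\, P(H_i).
```

The prior P(H) is exactly the kind of "statement of belief" that the frequentist-versus-Bayesian argument below is about.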

Some people think that Bayes' work on probability was trying to refute Hume's argument against miracles, though that connection is disputed (pdf). But the big dispute that arose was between "frequentists" who want to reduce probability to statistics, and "Bayesians" who represent probability as a statement of beliefs. Frequentists incorrectly argued that Bayesian theory was somehow "subjective", and tried to replace Bayesian reasoning with statistical analyses of imaginary projections of existing data out to idealized collections of objects which don't exist. Bayesians, in contrast, recognize that Bayes' Theorem is, well, a theorem, and we can use it to make objective statements of the predictions we can make over different statements of belief - statements which are often hidden in frequentist theory as unstated assumptions.

Now, I snark a bit about frequentist theory there - and not just because the most extreme statements of frequentist theory are objectively wrong, but because some frequentist mathematicians around the first half of the twentieth century engaged in some really shitty behavior which set mathematical progress back decades - but even the arch-Bayesian, E. T. Jaynes, retreated from his dislike of frequentist theory. In his perspective, frequentist methods are how we check the outcome of Bayesian work, and Bayesian theory is how we justify and prove the mathematical structure of frequentist methods. They're a synergy of approaches, and I use frequentism and the tools of frequentists in my research, um, frequently.

But my point, and I did have one, is that even something I thought I understood well is something that I could learn more about. Case in point was not, originally, what I learned about frequentism and Bayesianism a while back; it was what I learned about principal component analysis (PCA) at the session where I took the picture. (I was about to write "last night", but, even though this is a "blogging every day" post, due to me getting interrupted when I was trying to post, this was a few days ago).

PCA is another one of those fancy math terms for a simple idea: you can improve your understanding by figuring out what you should focus on. Imagine you're firing a cannon, and you want to figure out where the cannonballs are going to land. There are all sorts of factors that affect this: the direction of the wind, the presence of rain, even thermal noise in the cannon if you want to be super precise. But the most important variable in figuring out where the cannonball is going to land is where you're aiming the thing! Unless you're standing on Larry Niven's We Made It in the windy season, you should be far more worried about where the cannon is pointed than the way the wind blows.

PCA is a mathematical tool to help you figure that out by reducing a vast number of variables down to just a small number - usually two or three dimensions so humans can literally visualize it on a graph or in a tank. And PCA has an elegant mathematical formalism in terms of vectors and matrix math which is taught in schools. But it turns out there's an even more elegant Bayesian formalism which models PCA as a process based on "latent" variables, which you can think about as the underlying process behind the variables we observe - using our cannonball example, that process is again "where they're aiming the thing," even if we ultimately just observe where the cannonballs land.
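Before getting to the Bayesian version, here's a minimal sketch of that classic matrix-math recipe in Python (toy code of my own, not Bishop's): center the data, take the eigenvectors of the sample covariance, and project onto the top few directions.

```python
import numpy as np

def pca(X, k):
    """Classic PCA sketch: project N x D data onto its top-k principal axes."""
    mu = X.mean(axis=0)                        # center the data first
    Xc = X - mu
    cov = Xc.T @ Xc / (X.shape[0] - 1)         # unbiased sample covariance, D x D
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest eigenvalues
    W = eigvecs[:, order]                      # D x k matrix of principal directions
    return Xc @ W, W, mu                       # projected data, axes, and mean

# Toy usage: 200 noisy 3-D points that mostly vary along one direction
# (the "where you're aiming the cannon" variable).
rng = np.random.default_rng(0)
aim = rng.normal(size=(200, 1)) @ np.array([[3.0, 1.0, 0.2]])
noise = 0.1 * rng.normal(size=(200, 3))
projected, axes, mean = pca(aim + noise, k=2)
```

The toy data at the bottom mostly varies along a single direction - the "where you're aiming" variable - which is exactly the axis PCA recovers first.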

Bayesian PCA is equivalent (you can recover the original PCA formalism from it easily) and elegant (it provides a natural explanation of the dimensions PCA finds as the largest sources of variance) and extensible (you can easily adapt the number of dimensions to the data) and efficient (if you know you just want a few dimensions, you can approximate it with something called the expectation-maximization algorithm, which is way more efficient than the matrix alternative). All that is well and good.
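On the "efficient" point, here's a rough sketch of the EM-style iteration - in the zero-noise limit of the probabilistic model, so a simplification of what Bishop actually derives, with my own variable names. Each step only manipulates small D x k and k x k matrices, which is where the savings come from when you only want a few dimensions.

```python
import numpy as np

def em_pca(X, k, iters=100):
    """EM-style PCA sketch (zero-noise limit): find a D x k basis W whose
    span approximates the top-k principal subspace of the N x D data X."""
    Xc = (X - X.mean(axis=0)).T                      # D x N, centered
    W = np.random.default_rng(0).normal(size=(Xc.shape[0], k))
    for _ in range(iters):
        # E-step: best latent coordinates of each point under the current axes.
        Z = np.linalg.solve(W.T @ W, W.T @ Xc)       # k x N
        # M-step: re-fit the axes to best reconstruct the data from Z.
        W = Xc @ Z.T @ np.linalg.inv(Z @ Z.T)        # D x k
    return W  # spans the principal subspace (orthonormalize to recover PCA axes)
```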

But I don't think I could have even really understood all that if I hadn't already seen PCA in half a dozen other textbooks. The technique is so useful, and demonstrations about it are so illuminating, that I felt I had seen it before - so when Bishop cracked open his Bayesian formulation, I didn't feel like I was just reading line noise. Because, let me tell you, the first time I read a statistical proof, it often feels like line noise.

But this time, I didn't feel that way.

I often try to tackle new problems by digging deep into one book at a time. And I've certainly learned from doing that. But often, after you slog through a whole textbook, it's hard to keep everything you've learned in your head (especially if you don't have several spare weeks to work through all the end-of-chapter exercises, which is a situation I find myself in more often than not).

But more recently I have found going through books in parallel has really helped me. Concepts that one book flies over are dealt with deeply in another. Concepts that another book provides one angle on are tackled from a completely different one in another. Sometimes the meaning and value of concepts are different between different authors. Even intro books sometimes provide crucial perspective that helps you understand some other, deeper text.

So if you're digging into something difficult ... don't try to go it alone. When you reach a tough part, don't give up, search out other references to help you. At first it may seem an impossible nut to crack, but someone, somewhere, may have found the words that will help you understand.

-the Centaur

Pictured: Again Bishop, and again pound cake.

[twenty twenty-five day two six eight]: the long, long slog

centaur 0

I once told my wife I was patient - and it was indeed four years from our first meeting to our marriage - but the truth of the matter is that I'm terrible at delayed gratification. I have a kazillion things I want to do and I want them all done now, now, now - but if these things I want done are MY creative projects, then I can't really hire anyone else to do them. I've got to do them myself.

This is a big bottleneck if I haven't yet learned the skill to my own satisfaction.

I've talked before about one of the techniques I use - reading the difficult book at the dinner table. I eat out a lot, and do a lot of my reading either in coffeehouses, at dinnertime, or sitting on a rocking chair near my house. But those places are useful for books that can be read in pieces, in any order. At the dinner table, I have one book set aside - usually the most difficult or challenging thing I am reading, a book which I take in a little bit at breakfast, a little bit at late night milk and pound cake, one bite-sized step at a time.

At the dinner table, I have read Wolfram's A New Kind of Science and Davies' Machine Vision and Jaynes' Probability Theory: The Logic of Science and even an old calculus textbook from college that I was convinced I had failed to fully understand on the first readthrough (hint: I hadn't fully understood it; I had inadvertently skipped one part of a chapter which unlocked a lot of calculus for me). And now I'm going through Bishop's Pattern Recognition and Machine Learning, which has taught me much that I missed about deep learning.

Here's the thing: having gone through (most of) two whole probability textbooks and a calculus textbook that I read to help support the probability textbooks, I no longer feel as unexpert about probability as I once did. It was my worst subject in college, hands down, but I have reached the point where I understand what I did not understand and why I didn't understand it, I know how to solve certain problems that I care about, I know where to look to get help on problems that I can't solve, and I have realized the need to be humble about problems that are currently beyond my framework of understanding.

[Whew! I almost said "I have learned to be humble" there. Ha! No, I don't think you can really learn to actually be humble. You can however learn the need to be humble and then try to achieve it, but humility is one thing that it is really difficult to actually have - and if you claim you have it, you probably don't.]

Now, I know this seems obvious. I know, I know, I know, if you read a buncha textbooks on something and are actually trying to learn, you should get better at it. But my experience is that just reading a textbook doesn't actually make you any kind of expert. At best, it can give you a slightly better picture of the subject area. You can't easily train yourself up for something quickly - you've got to build up the framework of knowledge that you can then use to actually learn the skill.

Which can lead you to despair. It feels like you read a buncha textbooks about something and end up more or less where you started, minus the time you spent reading the textbooks.

But that's only because the process of learning something complex can indeed be a really long slog.

If you keep at it, long enough, you can make progress.

You just have to be patient ... with yourself.

-the Centaur

Pictured: Pattern Recognition and Machine Learning by Bishop, sitting next to my breakfast a few days ago.

[twenty twenty-four post one hundred]: trial runs

centaur 0

Still hanging in there apparently - we made it to 100 blogposts this year without incidents. Taking care of some bidness today, please enjoy this preview of the t-shirts for the Embodied Artificial Intelligence Workshop. Still trying out suppliers - the printing on this one came out grey rather than white.

Perhaps we should go whole hog and use the logo for the workshop proper, which came out rather nice.

-the Centaur

Pictured: Um, like I said, a prototype t-shirt for EAI#5, and the logo for EAI#5.

[twenty twenty-four day ninety-four]: to choke a horse

centaur 0

What you see there is ONE issue of the journal IEEE Transactions on Intelligent Vehicles. This single issue is two volumes, over two hundred articles, comprising three THOUSAND pages.

I haven't read the issue - it came in the mailbox today - so I can't vouch for the quality of the articles. But, according to the overview article, their acceptance rate is down near 10%, which is pretty selective.

Even that being said, two hundred articles seems excessive. I don't see how this is serving the community; you can't read two hundred papers, nor skim two hundred abstracts to see what's relevant - at least, not in a timely fashion. Heck, you can't even fully search that, as some articles might use different terminology for the same thing (e.g., "multi-goal reinforcement learning" for "goal-conditioned reinforcement learning" or even "universal value function approximators" for essentially the same concept).

And the survey paper itself needs a little editing. The title appears to be a bit of a word salad, and the first bullet point duplicates words ("We have received 4,726 submissions have received last year.") I just went over one of my own papers with a colleague, and we found similar errors, so I don't want to sound too harsh, but I still think this needed a round of copyedits - and perhaps needs to be forked into several more specialized journals.

Or ... hey ... it DID arrive on April 1st. You don't think ...

-the Centaur

Pictured: the very real horse-choking tome that is the two volumes of the January 2024 edition of TIV, which is, as far as I can determine, not actually an April Fool's prank, but just a journal that is fricking huge.

Announcing the 5th Annual Embodied AI Workshop

centaur 0

Thank goodness! At last, I'm happy to announce the Fifth Annual Embodied AI Workshop, held this year in Seattle as part of CVPR 2024! This workshop brings together vision researchers and roboticists to explore how having a body affects the problems you need to solve with your mind.

This year's workshop theme is "Open-World Embodied AI" - embodied AI when you cannot fully specify the tasks or their targets at the start of your problem. We have three subthemes:

  • Embodied Mobile Manipulation: Going beyond our traditional manipulation and navigation challenges, this topic focuses on moving objects through space at the same time as moving yourself.
  • Generative AI for Embodied AI: Building datasets for embodied AI is challenging, but we've made a lot of progress using "synthetic" data to expand these datasets.
  • Language Model Planning: Lastly but not leastly, a topic near and dear to my heart: using large language models as a core technology for planning with robotic systems.

The workshop will have six speakers and presentations from six challenges, and perhaps a sponsor or two. Please come join us at CVPR, though we also plan to support hybrid attendance.

Presumably, the workshop location will look something like the above, so we hope to see you there!

-the Centaur

Pictured: the banner for EAI#5, partially done with generative AI guided by my colleague Claudia Perez D'Arpino and Photoshoppery done by me. Also, last year's workshop entrance.

[twenty twenty-four day sixty-one]: the downside is …

centaur 0

... these things take time.

Now that I’m an independent consultant, I have to track my hours - and if you work with a lot of collaborators on a lot of projects like I do, it doesn’t do you much good to only track your billable hours for your clients, because you need to know how much time you spend on time tracking, taxes, your research, conference organization, writing, doing the fricking laundry, and so on.

So, when I decided to start being hard on myself with cleaning up messes as-I-go so I won’t get stressed out when they all start to pile up, I didn’t stop time tracking. And I found that some tasks that I thought took half an hour (blogging every day) took something more like an hour, and some that I thought took only ten minutes (going through the latest bills and such) also took half an hour to an hour.

We’re not realistic about time. We can’t be, not just as humans, but as agents: in an uncertain world where we don’t know how much things will cost, planning CANNOT be guaranteed to find the best plans unless our estimates never OVER-estimate the cost or time that plans will take - what’s called an “admissible heuristic” in artificial intelligence planning language. Overestimation leads us to avoid choices that could be the right answers.
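(For the AI-planning-curious, here's a toy illustration of what "admissible" buys you - a throwaway sketch of grid A*, nothing to do with my actual time-tracking setup:)

```python
import heapq

def manhattan(a, b):
    # Admissible heuristic for a 4-connected grid with unit step costs:
    # it never over-estimates the true remaining path length.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar(grid, start, goal):
    """Minimal A* on a grid of 0 (free) / 1 (blocked) cells."""
    rows, cols = len(grid), len(grid[0])
    frontier = [(manhattan(start, goal), 0, start)]   # (f, g, cell)
    best_g = {start: 0}
    while frontier:
        f, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g                                  # optimal cost, if heuristic is admissible
        if g > best_g.get(cell, float("inf")):
            continue                                  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + manhattan((nr, nc), goal), ng, (nr, nc)))
    return None                                       # no path exists

# Swap manhattan() for something that over-estimates (say, 10 * manhattan())
# and the optimality guarantee disappears: good routes can get pruned before
# they are ever explored - the planning version of writing off the right choice.
```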

So we “need” to lie to ourselves, a little bit, about how hard things are.

But it still sucks when we find out that they are pretty fricking hard.

-the Centaur

P.S. This post, and some of the associated research and image harvesting, I expected to take 5 minutes. It took about fifteen. GO figure. Pictured: the "readings" shelves, back from the days when to get a bunch of papers on something you had to go to the library and photocopy them, or buy a big old book called "Readings in X" and hope it was current enough and comprehensive enough to have the articles you needed - or to attend the conferences themselves and hope you found the gold among all the rocks.

[twenty twenty-four day nineteen]: our precious emotions

centaur 0

It's hard to believe nowadays, but the study of psychology for much of the twentieth century was literally delusional. The first half was dominated by behaviorism, a bad-faith philosophy of psychology - let's not stoop to calling it science - which denied the existence of internal mental states. Since virtually everyone has an inner mental life, and it's trivial to design an experiment which relies on internal mental reasoning to produce outcomes, it's almost inconceivable that behaviorism lasted as long as it did; but it nevertheless contributed a great deal of understanding of stimulus-response relationships to our scientific knowledge. That didn't mean it wasn't wrong, and by the late twentieth century, it had been definitively refuted by cognitive architecture studies which modeled internal mental behavior in enough detail to predict what brain structures were involved with different reasoning phenomena - structures later detected in brain scans.

Cognitive science had its own limits: while researchers such as myself grew up with a very broad definition of cognition as "the processes that the brain does when acting intelligently," many earlier researchers understood the "cognitive" in "cognitive psychology" to mean "logical reasoning". Emotion was not a topic which was well understood, or even well studied, or even thought of as a topic of study: as best I can reconstruct it, the reasoning - such as it was - seems to have been that since emotions are inherently subjective - related to a single subject - then the study of emotions would also be subjective. I hope you can see that this is just foolish: there are many things that are inherently subjective, such as what an individual subject remembers, which nonetheless can be objectively studied across many individual subjects, to illuminate solid laws like the laws of recency, primacy, and anchoring.

Now, in the twenty-first century, memory, emotion and consciousness are all active areas of research, and many researchers argue that without emotions we can't reason properly at all, because we become unable to adequately weigh alternatives. But beyond the value contributed by those specific scientific findings is something more important: the general scientific understanding that our inner mental lives are real, that our feelings are important, and that our lives are generally better when we have an affective response to the things that happen to us - in short, that our emotions are what make life worth living.

-the Centaur

[seventy-nine] minus ninety-two: it’s OVER

centaur 0

After almost a year's worth of work, at last, the Fourth Annual Embodied Artificial Intelligence Workshop is OVER! I will go collapse now. Actually, it was over last night, and I actually did collapse, briefly, on the stairs leading up to my bedroom after the workshop was finally done. But don't worry, I was all right. I was just so relieved that it was good to finally, briefly, collapse. A full report on this tomorrow. Off to bed.

-the Centaur

Pictured: A rainbow that appeared in the sky just as the workshop was ending. Thanks, God!

Announcing the Embodied AI Workshop #4 at CVPR 2023

centaur 0

Hey folks, I am proud to announce the 4th annual Embodied AI Workshop, held once again at CVPR 2023! EAI is a multidisciplinary workshop bringing together computer vision researchers, machine learning researchers, and roboticists to study the problem of creating intelligent systems that interact with their worlds.

For a highlight of previous workshops, see our Retrospectives paper. This year, EAI #4 will feature dozens of researchers, over twenty participating institutions, and ten distinct embodied AI challenges. Our three main themes for this year's workshop are:

  • Foundation Models: large, pretrained models that can solve many tasks few-shot or zero-shot
  • Generalist Agents: agents capable of solving a wide variety of problems
  • Sim to Real Transfer: learning in simulation but deploying in reality.

We will have presentations from all the challenges discussing their tasks, progress in the community, and winning approaches. We will also have six speakers on a variety of topics, and at the end of the workshop I'll be moderating a panel discussion among them.

I hope you can join us, in the real world or virtually, at EAI #4 at CVPR 2023 in Vancouver!

-the Centaur

[forty-seven] minus twenty-one: i hear there’s a new ai hotness

centaur 0

SO automatic image generation is a controversial thing I think about a lot. Perhaps I should comment on it sometime. Regardless, I thought I'd show off the challenges that come from using this technology using a simple example. If you recall, I did a recent post with a warped bookstore picture, and attempted to regenerate it using generative AI with Midjourney. Unfortunately, the prompt

a magical three-dimensional impossible bookstore in the style of M.C. Escher

me

failed to pick up the image for some reason. After a few iterations with the Midjourney Discord interface, I got the very nice, but nonsensical and generic, AI generated image you see up top. After playing around with the API, I realized that I likely had formulated my prompt wrong, and tried again to include this image:

On the second pass, I got another, more on-point, yet still nonsensical image as you see below:

These systems do LOOK impressive. But they work like ... amateurs who've learned to render well. They can produce things that are cool, but it's very hard to make them produce something on point.

And this is above and beyond the massive copyright issues that arise from a system that regurgitates other people's copyrighted art, much less the impact on jobs, much less the impact on the human soul.

-the Centaur

[thirty-nine] minus twenty-one: what a team effort

centaur 0

Wow. We're done with the paper. And what a team effort! So many people came together on this one - research, infra, operations, human-robot interaction folks, the whole nine yards. It's amazing to me how interdisciplinary robotics is becoming. A few years ago 7 authors on a paper was unusual. But out of the last 5 papers I helped submit, the two shortest papers had 8 authors, and all the others were 15 or more.

And it's not citation inflation. True, this most recent paper had a smaller set of authors actively working on the draft, collating contributions from a larger group running the experiments ... but the previous paper had more than 25 authors, all of whom materially contributed content directly to the draft.

What a wonderful time to be alive.

And to recover from food poisoning.

-the Centaur

Pictured: this afternoon's draft of the paper, just prior to a video conference to hammer out some details.

[twenty-eight] minus twenty: re-ju-ven-ate!

centaur 0

Oh, look, it's a Dalek acting as a security guard! Nothing can go wrong with this trend. :-/

Though, as a roboticist seeing this gap between terminals, I can't help but wonder whether it just undocked from its charger, whether it is about to dock with its charger, whether it needs help from a human to dock with its charger, or whether it has failed to dock with its charger and is about to run out of power in the dark and the cold where all the wolves are.

-the Centaur

Announcing Logical Robotics

centaur 0

So, I'm proud to announce my next venture: Logical Robotics, a robot intelligence firm focused on making learning robots work better for people. My research agenda is to combine the latest advances of deep learning with the rich history of classical artificial intelligence, using human-robot interaction research and my years of experience working on products and benchmarking to help robots make a positive impact.

Recent advances in large language model planning, combined with deep learning of robotic skills, have enabled almost magical developments in explainable artificial intelligence, where it is now possible to ask robots to do things in plain language and for the robots to write their own programs to accomplish those goals, building on deep learned skills but reporting results back in plain language. But applying these technologies to real problems will require a deep understanding of both robot performance benchmarks to refine those skills and human psychological studies to evaluate how these systems benefit human users, particularly in the areas of social robotics where robots work in crowds of people.

Logical Robotics will begin accepting new clients in May, after my obligations to my previous employer have come to a close (and I have taken a break after 17 years of work at the Search Engine That Starts With a G). In the meantime, I am available to answer general questions about what we'll be doing; if you're interested, please feel free to drop me a line via centaur at logicalrobotics.com or take a look at our website.

-the Centaur

do, or do not. there is no blog

centaur 0

One reason blogging suffers for me is that I always prioritize doing over blogging. That sounds cool and all, but it's actually just another excuse. There's always something more important than doing your laundry ... until you run out of underwear. Blogging has no such hard failure mode, so it's even easier to fall out of the habit. But the reality is, just like laundry, if you set aside a little time for it, you can stay ahead - and you'll feel much healthier and more comfortable if you do.

-the Centaur

Pictured: "Now That's A Steak Burger", a 1-pound monster from Willard Hicks, where I took a break from my million other tasks to catch up on Plans and the Structure of Behavior, the book that introduced idea of the test-operate-test-exit (TOTE) loop as a means for organizing behavior, a device I'm finding useful as I delve into the new field of large language model planning.

What is “Understanding”?

centaur 1
When I was growing up - or at least when I was a young graduate student in a Schankian research lab - we were all focused on understanding: what did it mean, scientifically speaking, for a person to understand something, and could that be recreated on a computer? We all sort of knew it was what we'd call nowadays an ill-posed problem, but we had a good operational definition, or at least an operational counterexample: if a computer read a story and could not answer the questions that a typical human being could answer about that story, it didn't understand it at all.

But there are at least two ways to define a word. What I'll call a practical definition is what a semanticist might call the denotation of a word: a narrow definition, one which you might find in a dictionary, which clearly specifies the meaning of the concept, like a bachelor being an unmarried man. What I'll call a philosophical definition, the connotations of a word, are the vast web of meanings around the core concept, the source of the fine sense of unrightness that one gets from describing Pope Francis as a bachelor, the nuances of meaning embedded in words that Socrates spent his time pulling out of people, before they went and killed him for being annoying.

It's those connotations of "understanding" that made all us Schankians very leery of saying our computer programs fully "understood" anything, even as we were pursuing computer understanding as our primary research goal. I care a lot about understanding, deep understanding, because, frankly, I cannot effectively do my job of teaching robots to learn if I do not deeply understand robots, learning, computers, the machinery surrounding them, and the problem I want to solve; when I do not understand all of these things, I stumble in the dark, I make mistakes, and end up sad.

And it's pursuing a deeper understanding about deep learning where I got a deeper insight into deep understanding. I was "deep reading" the Deep Learning book (a practice in which I read, or re-read, a book I've read, working out all the equations in advance before reading the derivations), in particular section 5.8.1 on Principal Components Analysis, and the authors made the same comment I'd just seen in the Hands-On Machine Learning book: "the mean of the samples must be zero prior to applying PCA." Wait, what? Why? I mean, thank you for telling me, I'll be sure to do that, but, like ... why?

I didn't follow up on that question right away, because the authors also tossed off an offhand comment like, "X^T X is the unbiased sample covariance matrix associated with a sample x" and I'm like, what the hell, where did that come from? I had recently read the section on variance and covariance but had no idea why this would be associated with the transpose of the design matrix X multiplied by X itself. (In case you're new to machine learning, if x stands for an example input to a problem, say a list of the pixels of an image represented as a column of numbers, then the design matrix X is all the examples you have, but each example listed as a row. Perfectly not confusing? Great!)

So, since I didn't understand why Var[x] = X^T X, I set out to prove it myself. (Carpenters say, measure twice, cut once, but they'd better have a heck of a lot of measuring and cutting under their belts - moreso, they'd better know when to cut and measure before they start working on your back porch, or you and they will have a bad time. Same with trying to teach robots to learn: it's more than just practice; if you don't know why something works, it will come back to bite you, sooner or later, so, dig in until you get it.) And I quickly found that the "covariance matrix of a variable x" was a thing, and quickly started to intuit that the matrix multiplication would produce it. This is what I'd call surface level understanding: going forward from the definitions to obvious conclusions. I knew the definition of matrix multiplication, and I'd just re-read the definition of covariance matrices, so I could see these would fit together.

But as I dug into the problem, it struck me: true understanding is more than just going forward from what you know: "The brain does much more than just recollect; it inter-compares, it synthesizes, it analyzes, it generates abstractions" - thank you, Carl Sagan. But this kind of understanding is a vast, ill-posed problem - meaning, a problem without a unique and unambiguous solution.

But as I was continuing to dig through the problem, reading through the sections I'd just read on "sample estimators," I had a revelation. (Another aside: "sample estimators" use the data you have to predict data you don't, like estimating the height of males in North America from a random sample of guys across the country; "unbiased estimators" may be wrong but their errors are grouped around the true value.) The formula for the unbiased sample estimator of the variance actually doesn't look quite like the matrix transpose formula - but it depends on the unbiased estimator of the sample mean.

Suddenly, I felt that I understood why PCA data had to have a mean of 0. Not driving forward from known facts and connecting their inevitable conclusions, but driving backwards from known facts to hypothesize a connection which I could explore and see. I even briefly wrote a draft of the ideas behind this essay - then set out to prove what I thought I'd seen. Setting the mean of the samples to zero made the sample mean drop out of the sample variance - and then the matrix multiplication formula dropped out. Then I knew I understood why PCA data had to have a mean of 0 - or how to rework PCA to deal with data which had a nonzero mean.

This I'd call deep understanding: reasoning backwards from what we know to provide reasons for why things are the way they are. A recent book on science I read said that some regularities, like the length of the day, may be predictive, but other regularities, like the tides, cry out for explanation. And once you understand Newton's laws of motion and gravitation, the mystery of the tides is readily solved - the answer falls out of inertia, angular momentum, and gravitational gradients. With apologies to Larry Niven, of course a species that understands gravity will be able to predict tides.

The brain does do more than just remember and predict to guide our next actions: it builds structures that help us understand the world on a deeper level, teasing out rules and regularities that help us not just plan, but strategize. Detective Benoit Blanc from the movie Knives Out claimed to "anticipate the terminus of gravity's rainbow" to help him solve crimes; realizing how gravity makes projectiles arc, using that to understand why the trajectory must be the observed parabola, and strolling to the target. So I'd argue that true understanding is not just forward-deriving inferences from known rules, but also backward-deriving causes that can explain behavior.
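For the curious, here is the little derivation I just described, in my own notation rather than the book's (m samples x_1, ..., x_m, stacked as the rows of the design matrix X):

```latex
\begin{align*}
\operatorname{Var}[\mathbf{x}]
  &= \frac{1}{m-1}\sum_{i=1}^{m}
     (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^{\top}
     && \text{unbiased sample covariance} \\
  &= \frac{1}{m-1}\sum_{i=1}^{m} \mathbf{x}_i \mathbf{x}_i^{\top}
     && \text{if the data are centered, so } \bar{\mathbf{x}} = \mathbf{0} \\
  &= \frac{1}{m-1}\, X^{\top} X
     && \text{since row } i \text{ of } X \text{ is } \mathbf{x}_i^{\top}
\end{align*}
```

That last line is the "transpose of the design matrix multiplied by X itself" - and it only falls out because the centering step erased the sample mean.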
And this means computing the inverse of whatever forward prediction matrix you have, which is a more difficult and challenging problem, because that matrix may not have a well-defined inverse. So true understanding is indeed a deep and interesting problem!

But, even if we teach our computers to understand this way ... I suspect that this won't exhaust what we need to understand about understanding. For example: the dictionary definitions I've looked up don't mention it, but the idea of seeking a root cause seems embedded in the word "under-standing" itself ... which makes me suspect that the other half of the word, standing, itself might hint at the stability, the reliability of the inferences we need to be able to make to truly understand anything. I don't think we've reached that level of understanding of understanding yet.

-the Centaur

Pictured: Me working on a problem in a bookstore. Probably not this one.

Robots in Montreal

centaur 1
A cool hotel in old Montreal.

"Robots in Montreal," eh? Sounds like the title of a Steven Moffat Doctor Who episode. But it's really ICRA 2019 - the IEEE Conference on Robotics and Automation, and, yes, there are quite a few robots!

Boston Dynamics quadruped robot with arm and another quadruped.

My team presented our work on evolutionary learning of rewards for deep reinforcement learning, AutoRL, on Monday. In an hour or so, I'll be giving a keynote on "Systematizing Robot Navigation with AutoRL":

Keynote: Dr. Anthony Francis
Systematizing Robot Navigation with AutoRL: Evolving Better Policies with Better Evaluation

Abstract: Rigorous scientific evaluation of robot control methods helps the field progress towards better solutions, but deploying methods on robots requires its own kind of rigor. A systematic approach to deployment can do more than just make robots safer, more reliable, and more debuggable; with appropriate machine learning support, it can also improve robot control algorithms themselves. In this talk, we describe our evolutionary reward learning framework AutoRL and our evaluation framework for navigation tasks, and show how improving evaluation of navigation systems can measurably improve the performance of both our evolutionary learner and the navigation policies that it produces. We hope that this starts a conversation about how robotic deployment and scientific advancement can become better mutually reinforcing partners.

Bio: Dr. Anthony G. Francis, Jr. is a Senior Software Engineer at Google Brain Robotics specializing in reinforcement learning for robot navigation. Previously, he worked on emotional long-term memory for robot pets at Georgia Tech's PEPE robot pet project, on models of human memory for information retrieval at Enkia Corporation, and on large-scale metadata search and 3D object visualization at Google. He earned his B.S. (1991), M.S. (1996) and Ph.D. (2000) in Computer Science from Georgia Tech, along with a Certificate in Cognitive Science (1999). He and his colleagues won the ICRA 2018 Best Paper Award for Service Robotics for their paper "PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning". He's the author of over a dozen peer-reviewed publications and is an inventor on over a half-dozen patents. He's published over a dozen short stories and four novels, including the EPIC eBook Award-winning Frost Moon; his popular writing on robotics includes articles in the books Star Trek Psychology and Westworld Psychology, as well as a Google AI blog article titled Maybe your computer just needs a hug. He lives in San Jose with his wife and cats, but his heart will always belong in Atlanta. You can find out more about his writing at his website.

Looks like I'm on in 15 minutes! Wish me luck.

-the Centaur

 

Information Hygiene

centaur 0

Our world is big. Big, and complicated, filled with many more things than any one person can know. We rely on each other to find out things beyond our individual capacities and to share them so we can succeed as a species: there's water over the next hill, hard red berries are poisonous, and the man in the trading village called Honest Sam is not to be trusted.

To survive, we must constantly take in information, just as we must eat to live. But just like eating, consuming information indiscriminately can make us sick. Even when we eat good food, we must clean our teeth and go to the bathroom - and bad food should be avoided. In the same way, we have to digest information to make it useful, we need to discard information that's no longer relevant, and we need to avoid misinformation so we don't pick up false beliefs. We need habits of information hygiene.

Whenever you listen to someone, you absorb some of their thought process and make it your own. You can't help it: that's the purpose of language, and that's what understanding someone means. The downside is that your brain is a mess of different overlapping modules all working together, and not all of them can distinguish between what's logically true and false. This means learning about the beliefs of someone you violently disagree with can make you start to believe in them, even if you consciously think they're wrong. One acquaintance I knew started studying a religion with the intent of exposing it. He thought it was a cult, and his opinion about that never changed. But at one point, he found himself starting to believe what he read, even though, then and now, he found their beliefs logically ridiculous.

This doesn't mean we need to shut out information from people we disagree with - but it does mean we can't uncritically accept information from people we agree with. You are the easiest person for yourself to fool: we have a cognitive flaw called confirmation bias which makes us more willing to accept information that confirms our prior beliefs rather than information that contradicts them. Another flaw called cognitive dissonance makes us want to actively resolve conflicts between our beliefs and new information, leading to a rush of relief when they are reconciled; combined with confirmation bias, people's beliefs can actually be strengthened by contradictory information.

So, as an exercise in information hygiene for those involved in one of those charged political conversations that dominate our modern landscape, try this. Take one piece of information that you've gotten from a trusted source, and ask yourself: how might this be wrong? Take one piece of information from an untrusted source, and ask yourself, how might this be right? Then take it one step further: research those chinks in your armor, or those sparks of light in your opponent's darkness, and see if you can find evidence pro or con. Try to keep an open mind: no-one's asking you to actually change your mind, just to see if you can tell whether the situation is actually as black and white as you thought.

-the Centaur

Pictured: the book pile, containing some books I'm reading to answer a skeptical friend's questions, and other books for my own interest.

Learning to Drive … by Learning Where You Can Drive

centaur 1
I often say "I teach robots to learn," but what does that mean, exactly? Well, now that one of the projects that I've worked on has been announced - and I mean, not just on arXiv, the public access scientific repository where all the hottest reinforcement learning papers are shared, but actually, accepted into the ICRA 2018 conference - I  can tell you all about it! When I'm not roaming the corridors hammering infrastructure bugs, I'm trying to teach robots to roam those corridors - a problem we call robot navigation. Our team's latest idea combines "traditional planning," where the robot tries to navigate based on an explicit model of its surroundings, with "reinforcement learning," where the robot learns from feedback on its performance. For those not in the know, "traditional" robotic planners use structures like graphs to plan routes, much in the same way that a GPS uses a roadmap. One of the more popular methods for long-range planning are probabilistic roadmaps, which build a long-range graph by picking random points and attempting to connect them by a simpler "local planner" that knows how to navigate shorter distances. It's a little like how you learn to drive in your neighborhood - starting from landmarks you know, you navigate to nearby points, gradually building up a map in your head of what connects to what. But for that to work, you have to know how to drive, and that's where the local planner comes in. Building a local planner is simple in theory - you can write one for a toy world in a few dozen lines of code - but difficult in practice, and making one that works on a real robot is quite the challenge. These software systems are called "navigation stacks" and can contain dozens of components - and in my experience they're hard to get working and even when you do, they're often brittle, requiring many engineer-months to transfer to new domains or even just to new buildings. People are much more flexible, learning from their mistakes, and the science of making robots learn from their mistakes is reinforcement learning, in which an agent learns a policy for choosing actions by simply trying them, favoring actions that lead to success and suppressing ones that lead to failure. Our team built a deep reinforcement learning approach to local planning, using a state-of-the art algorithm called DDPG (Deep Deterministic Policy Gradients) pioneered by DeepMind to learn a navigation system that could successfully travel several meters in office-like environments. But there's a further wrinkle: the so-called "reality gap". By necessity, the local planner used by a probablistic roadmap is simulated - attempting to connect points on a map. That simulated local planner isn't identical to the real-world navigation stack running on the robot, so sometimes the robot thinks it can go somewhere on a map which it can't navigate safely in the real world. This can have disastrous consequences - causing robots to tumble down stairs, or, worse, when people follow their GPSes too closely without looking where they're going, causing cars to tumble off the end of a bridge. Our approach, PRM-RL, directly combats the reality gap by combining probabilistic roadmaps with deep reinforcement learning. By necessity, reinforcement learning navigation systems are trained in simulation and tested in the real world. PRM-RL uses a deep reinforcement learning system as both the probabilistic roadmap's local planner and the robot's navigation system. 
Because links are added to the roadmap only if the reinforcement learning local controller can traverse them, the agent has a better chance of attempting to execute its plans in the real world. In simulation, our agent could traverse hundreds of meters using the PRM-RL approach, doing much better than a "straight-line" local planner which was our default alternative. While I didn't happen to have in my back pocket a hundred-meter-wide building instrumented with a mocap rig for our experiments, we were able to test a real robot on a smaller rig and showed that it worked well (no pictures, but you can see the map and the actual trajectories below; while the robot's behavior wasn't as good as we hoped, we debugged that to a networking issue that was adding a delay to commands sent to the robot, and not in our code itself; we'll fix this in a subsequent round). This work includes both our group working on office robot navigation - including Alexandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, me, and James Davidson - and Alexandra's collaborator Lydia Tapia, with whom she worked on the aerial navigation also reported in the paper.  Until the ICRA version comes out, you can find the preliminary version on arXiv:

https://arxiv.org/abs/1710.03937 PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning

We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL) agents. The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology, while the sampling-based planners provide an approximate map of the space of possible configurations of the robot from which collision-free trajectories feasible for the RL agents can be identified. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. These evaluations included both simulated environments and on-robot tests. Our results show improvement in navigation task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 meters long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 meters without violating the task constraints in an environment 63 million times larger than used in training.
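If the core trick is hard to picture, here's a toy sketch of the roadmap-building loop - my own illustration, not the paper's code. The functions sample_free_point and policy_can_reach are hypothetical stand-ins; the latter would roll out the learned local planner in simulation and report whether it reached the goal without colliding.

```python
import numpy as np

def build_prm(sample_free_point, policy_can_reach, n_samples=500, k_nearest=10):
    """Toy sketch of the PRM-RL idea: a probabilistic roadmap whose edges are
    kept only if a learned local policy can actually traverse them."""
    pts = np.array([sample_free_point() for _ in range(n_samples)])
    edges = {i: [] for i in range(n_samples)}
    for i, p in enumerate(pts):
        # Try to connect each node to its nearest neighbors...
        dists = np.linalg.norm(pts - p, axis=1)
        for j in np.argsort(dists)[1:k_nearest + 1]:
            # ...but keep the edge only if the learned controller can drive it.
            if policy_can_reach(tuple(p), tuple(pts[int(j)])):
                edges[i].append(int(j))
    # The resulting graph can be searched with Dijkstra or A*; the same policy
    # then executes each edge of the planned route on the robot.
    return pts, edges
```

The point is just that every edge in the graph is one the learned controller has already demonstrated it can drive, so plans over the graph inherit that check.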
So, when I say "I teach robots to learn" ... that's what I do.

-the Centaur