Now, see, this is what I was talking about last time. At the Game Developers Conference today, after a long and informative series of talks, I ended up having an impromptu roundtable about language model planning, followed by a dinner with a friend talking about the history of the conference from a behind-the-scenes perspective.
The discussions were fascinating, and there is real potential for integrating these new language model technologies with older techniques like logic programming, to very good effect.
But, by the time the dinner was done, I was exhausted, and crashed back at my room, trying to sleep off some of the effects of two nights of rich dinners, no sleep, and hard-core information overload.
But I still had more work to do, creating my slides for the upcoming HRI in Academia and Industry Workshop at the AAAI Spring Symposium Series, not to mention my document updates on the main social navigation benchmarking paper itself. And, of course, all of that could not get finished in one night, not if I want to get up early enough to attend what are sure to be packed talks tomorrow morning.
But my point, and I did have one, is that if I had relied on myself to blog at the end of the day "when everything was done", I wouldn't have blogged at all, because everything is NOT done. But, since I had momentum from earlier in the day, it was easy to pull up the window and put together a quick post.
This post.
So, momentum is real. Once you start doing something, it's easier to keep doing it.
-the Centaur
Pictured: Various lines and slides from today's GDC, and a nice dessert from Amber India.
Oh yeah, I only offhandedly mentioned it - back at the Game Developers Conference!
Hi Mom! Oh wait, she's gone. That ... went dark fast, Francis. Well, hopefully she's watching up there and is not too mad that I'm still wasting my time on such frivolous things. But, I got my job at Google through the AI Programmers' Roundtable in ... 2005, I think it would have been, so this is not frivolous to me. And it's a great place to find out what's going on in the field ...
... and I must say, the first talk out of the gate got dense, fast!
"How can we make time if we do not ever take time?" I love that quote. It's a riff on a line from one of my favorite movies, The Matrix Reloaded, in which a minor baddie, the suave Merovingian, mocks our hero Neo's polite refusal to dine - and calls him out on his dogged insistence on getting to the point. "Yes of course. Who has time? Who has time. But if we never *take* time, how can we have time?"
The Merovingian is just mocking Neo. But he has a real point: All too often in our fast-paced modern lives, we say to others - or to ourselves - that we don't have time to do something. Or, at least, I do - your mileage may vary, as may the bricks in the wall of your calendar and your number of irons in the fire. But often what we really mean is not that we don't have the time, but that we don't want to take the time to do it.
Sometimes this is opportunity cost. Sometimes, we really need to give up something better to make time for something. Right now, I'm out at GDC, the Game Developers Conference, and I've already let some of my peeps know that I can't do dinner Wednesday night, because I'm planning to attend the Game Developers Choice Awards. It's a great show and comes around just once a year, so if I want to do it, I have to make time for it. That means taking time from something else, in this case, taking that off the calendar for meeting friends.
So, too, it is with blogging.
I've been trying to blog every day this year, and have already fallen almost a month behind, even though I've been posting two and three times a day - when I post. But the problem, I realized, is that I had been fitting blogging in as an optional task at the end of the day. If you stay up late because you've been flying, or working on your taxes, or attending a programmers' get-together, then blogging likely gets the shaft - doubly so if you have to get up early to attend a conference. Stake that vampire! Or, blogpost. As the case may be.
SO, I've decided to try to be more like the Merovingian, at least, in his philosophy of time. If you don't take time, then you can never make time for anything; so I've decided to try taking out some time during the day to blog, rather than making it an end-of-the-day task. Like so many things, it's hard to say how long this will last, but at least for today, it produced one more blogpost than I would have otherwise.
-the Centaur
Pictured: My favorite table at one of my favorite breakfast joints, Mo'Z Cafe in San Francisco, where this blogpost was authored between sessions at GDC.
So, I just finished a three-leg plane flight, the longest leg of which was five and a half hours. Was that twelve hours of travel time? I think it was twelve hours of travel time. I know that's nothing compared to people who fly to Australia or Singapore, but I feel like having a nap. So no blog for you.
-the Centaur
Pictured: A temporary fix which, yeah, didn't do so well in the rains.
Hey folks, I am proud to announce the 4th annual Embodied AI Workshop, held once again at CVPR 2023! EAI is a multidisciplinary workshop bringing together computer vision researchers, machine learning researchers and roboticists to study the problem of creating intelligent systems that interact with their worlds.
For a highlight of previous workshops, see our Retrospectives paper. This year, EAI #4 will feature dozens of researchers, over twenty participating institutions, and ten distinct embodied AI challenges. Our three main themes for this year's workshop are:
Foundation Models: large, pretrained models that can solve many tasks few-shot or zero-shot
Generalist Agents: agents capable of solving a wide variety of problems
Sim to Real Transfer: learning in simulation but deploying in reality.
We will have presentations from all the challenges discussing their tasks, progress in the community, and winning approaches. We will also have six speakers on a variety of topics, and at the end of the workshop I'll be moderating a panel discussion among them.
I hope you can join us, in person or virtually, at EAI #4 at CVPR 2023 in Vancouver!
This is a followup to my Making Computers Useful series, started all the way back in 2014. (Funnily enough, the 2013-era iMac featured in that series is now pretty damn useless as it has fallen out of update range, and locks up if you run Dropbox and Google Drive at the same time).
But, the overall goal here is to document some of the stuff that I need to do to make computers work for me. Typically, there’s a lot of friction in software, and it takes a good bit of work to make that all function on a new machine. Sometimes that becomes a deep dive.
This is one of those stories.
So today, while updating the Embodied AI Workshop’s website prior to the launch of the 2023 version, I wanted to run the tree command. Tree is great because it can help you understand the structure of a directory tree, like so:
I felt I needed this because the Embodied AI website is built on yarn and gatsby, and it turned a relatively simple site into 1.6 gigabytes of generated junk (which I noticed when one of my older computers started literally wheezing as it tried to mirror all those unexpected files):
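A rough sketch of that kind of check, assuming a POSIX shell. The function names show_structure and largest_subdirs are made up for illustration; the tree flags used (-d for directories only, -L for depth) are standard, and the find fallback is only an approximation for machines where tree isn't installed yet:

```shell
# Summarize a directory's structure, two levels deep.
# Uses tree when it is installed; otherwise falls back to find.
show_structure() {
  dir="${1:-.}"
  if command -v tree >/dev/null 2>&1; then
    # -d: directories only; -L 2: descend at most two levels
    tree -d -L 2 "$dir"
  else
    # Rough equivalent without tree: list directories up to depth 2
    find "$dir" -maxdepth 2 -type d | sort
  fi
}

# Ten largest immediate subdirectories, sizes in kilobytes, biggest first.
largest_subdirs() {
  dir="${1:-.}"
  du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n 10
}
```

Run from the root of a site like this one, show_structure . would show the top two levels of directories, and largest_subdirs . would point at whichever generated folders are eating the space.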
As it turns out, you can get tree via Homebrew. Homebrew is a “package manager,” kind of like an “app store for the command line,” and Homebrew helps you get standard Linux tools, like tree, onto your Mac so you can take advantage of all the hidden Unix goodness in your Macintosh.
However … I’m a bit leery of Homebrew because this is how it installs itself:
I mean, WHAT? curl a file and run it with bash? Seriously. Now, look, I’m not saying Homebrew isn’t safe - every indication is that it is - but that this METHOD of installation is a recipe for disaster.
Why? Well, in case you’re not in the know, what this installation instruction is suggesting is to DOWNLOAD RANDOM CODE FROM SOMEWHERE ON THE INTERNET and RUN IT ON YOUR COMPUTER WITHOUT CHECKING IT AT ALL!
Nothing can go wrong with this plan.
Now, I’m no expert, but I’m familiar enough with this stuff to know what I’m doing. SO, first I checked with a few quick searches to see [is homebrew for mac safe] and it appeared to be.
SO I downloaded the software with JUST the CURL part, like so:
Folks, seriously, never do this on a site you do not trust.
After I had the code, I then inspected this homebrew-install.sh file to find out whether it was safe. I didn't see any obvious malware, but when I ran it, it wanted me to TYPE MY PASSWORD.
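The download-first, read-second pattern described above can be sketched roughly like this. Treat it as a sketch, not a security audit: the function names fetch_installer and flag_risky_lines are hypothetical, the list of patterns to flag is just my guess at a useful first pass, and the URL in the usage comment is Homebrew's published install script location as far as I know (check brew.sh for the current one):

```shell
# Download an installer to a file instead of piping it straight into bash.
fetch_installer() {
  url="$1"
  out="$2"
  # -f: fail on HTTP errors; -sS: quiet, but still show errors;
  # -L: follow redirects; -o: write to a file rather than stdout
  curl -fsSL "$url" -o "$out"
}

# Crude first-pass review: print (with line numbers) the lines that escalate
# privileges, fetch more code, or delete files, so they can be read in context.
flag_risky_lines() {
  grep -nE 'sudo|curl|wget|rm -rf|eval' "$1"
}

# Usage (nothing here runs automatically):
#   fetch_installer "https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh" homebrew-install.sh
#   less homebrew-install.sh               # actually read it
#   flag_risky_lines homebrew-install.sh   # then look harder at these lines
#   /bin/bash homebrew-install.sh          # only once you're satisfied
```

The point of the separate steps is that the file you read is the same file you run, which is not guaranteed when the download is piped directly into bash.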
Seriously?
Please, I’m asking you, do not hot-pipe random software straight off the internet and run it straight from the command line and give it your password when it asks. If someone intercepts the website, and gets your password, they can do anything.
(SERIOUSLY. Once I was working with a legitimate Google representative about a Google ads program and when I went to log in to Google ads to check something, a hacker injected a fake Google ads site between me and Google, and damn near got my password. Only two-factor authentication saved me, as it broke some key link in the chain.)
BUT … it is the PATTERN I’m talking about here, not the specifics. Everything I’ve seen about Homebrew says that it is safe. I’ve even used it before, on other machines. SO, after some more research, and a little more code analysis, I confirmed this password-asking was safe, and gingerly went ahead.
And it went fine.
I had to pay thirty million bitcoin to a Russian spammer, but I wasn’t using it anyway, and I’m sure at least they got to buy a cup of coffee or something with it. :-D
Seriously. It went fine. And I love Homebrew. I just go through this every time I need to “bash” run a piece of “curl”-ed software straight off the Internet and then it asks for my password.
Still, tree worked like a charm. (Screenshots of its use were above). There are more pieces of Homebrew software I need to install, but as one test, I tried to install “banner”, a program to create oversized pieces of text, which I use in scripts to alert me that a big task is done.
But, it seems like Mac already has a version of banner, which works differently on Mac than Linux, printing VERY large ASCII banners that are vertical rather than horizontal. That’s useful, but not for my case, so I dug around for an equivalent tool. brew install figlet is the way to go:
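A minimal sketch of how that might look in a script, assuming a POSIX shell. The announce function is a hypothetical helper, not part of figlet; it falls back to a plain text line on machines where figlet has not been installed:

```shell
# Print a large, hard-to-miss message when a long task finishes.
announce() {
  if command -v figlet >/dev/null 2>&1; then
    # figlet renders its arguments as horizontal ASCII-art text
    figlet "$*"
  else
    # Fallback so the script still says something without figlet
    printf '\n=== %s ===\n\n' "$*"
  fi
}

# Typical use at the end of a long-running command:
#   some_long_task && announce "TASK DONE"
```

Here some_long_task is a stand-in for whatever big job the script is running.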
All great!
It didn’t help me with my work on the Embodied AI website, as I had already moved on to fixing other problems on that website, and was only “brewing” things in the background while I did other tasks (like remote-attend the church vestry retreat).
But removing this friction will help me in the future. The next time I need to examine the tree structure of a directory, it's one command away. I can put banners in my scripts. And I can easily add new software with 'brew' the next time it comes up.
AND, as a bonus, I discovered a site which is doing something very much like what I want to do with the Making Computers Useful series, Sourabh Bajaj’s Mac OS Setup Guide, which “... covers the basics of setting up a development environment on a new Mac.” I have an internal document called “Mac OS X New System Tasks” which documents for myself the travails I go through every time I get a new Mac, and Sourabh’s guide seems like it provides a public version of what I want to do. Which is great! Less work for me. ;-D
On to the next task!
-the Centaur
P.S. As another added bonus, I composed this in Google Docs, and pasted it straight into Gutenberg, the new WordPress block editor. It worked like a charm ... EVEN DOWN TO PASTING IN THE IMAGES! If this is a feature of Gutenberg, I will have to consider saying my favorite three words about it ... "I was wrong."
P.P.S. Don't hold your breath on that, though, I'm waiting for the other shoe to drop.
SO, one of my favorite characters is Porsche, a centauress warrior from the thirty-first century who populated many of my first tranche of as-yet-unpublished science fiction stories. (I think she only appears in one published story, "Stranded" in the anthology of the same name, and even that, just as a cameo). And while I have worked a lot to improve my art, I wondered what Midjourney could do. And I got the above result from the following prompt:
a centauress with long, rich curly purple hair, very beautiful, with half-asian, half-english appearance, and pointed ears, wearing an armored space costume like a combination of ghost in the shell and star trek, and bearing a double-bladed scythe with black glowing blades
- the Centaur
Wow. This is really spot on. Her hair is right, her face is right, her skin tone is right, her armor is right ... heck, even her slightly haunted look is right, and to go beyond even that, one of the variants looks like a slightly older, more grizzled variant, which completely checks out with her storyline:
Kudos to you, Midjourney, except ... she's not a centauress. She's just a person, a fetching one, I admit, but not a half-woman, half-horse creature with the pointed ears and black twin-bladed scythe of the prompt.
Well, shoot. What if we look at some of the other variants?
This one is creepily good in a sense ... it's got her forehead dot (she's a First Contact Engineer, and wears a pheromone bead she used to communicate with a scent-based alien hive species) and even hints of her mechanical arm and possibly ear. But this is just coincidence. Look at this other variant:
What appears is just chance. Here, her ears are rounded, the dot's gone, and the weapon looks even less on point. A lot of what looked right to me is just random features onto which I was projecting, like cloudbusting.
Well, double shoot. What if we refine the prompt? What do we get?
a centauress (a creature with the upper body of a woman and a lower body of a horse) with long, rich curly purple hair, very beautiful, with half-asian, half-english appearance, and pointed ears, wearing an armored space costume like a combination of ghost in the shell and star trek, and wielding a scythe with black glowing blades
- the Centaur
Yerk. That's ... just jumbled nonsense. Tweaks to the prompt to make it simpler just produced women on horses. Midjourney apparently does not understand the concept 'centaur' in any meaningful sense. I tried just the prompt "centaur", and ... um ... yeah ... no, I'm not going to show you those. They're just a guy with a horse, or sort of on a horse, or ... sort of ... in a horse? A centaur as envisioned by The Thing.
Okay, one last try. What if I give it one of my pieces of art, and then ask it to render it anime style? Let's hold that piece of imagery till the last, but the prompt is:
an anime style centauress with purple hair and a double-bladed scythe jumping in front of a waterfall
--the Centaur
Oh, lordy. And I'm not going to show you the one it tried to generate from the prompt "anime style".
Oh wait, I am!
Wow. Evocative - the top left reminds me of Cinnamon Frost - but it has little to do with the image I put in, and the attempt in the top right especially is nonsensical.
I am inspired by Midjourney. It's definitely a better renderer than me, and has good ideas about composition which I have already used in my artwork.
But I stick by my comment that it is an amateur which has taught itself to render very well, and cannot take meaningful art direction. As limited as I am, I'll stick with my own drawing, thank you!
Like this one, the image I gave to Midjourney above. It's not perfect, it's not well rendered ... but it is mine:
And she has four legs, a scythe, and pointy ears, dagnabbit.
Ok, I'll rise, but I do not promise to shine. Why are we still doing this again anyway? Oh, that's right ... it's complicated, but as usual, our politicians want to do what feels good but is the exact opposite of the science. Daylight saving time (shifting ahead of the sun) has negative effects, and doing it year round seems like it will make it worse; but instead of banning it, our politicians want to make it permanent. It's a feel-good measure which will do the opposite and make lots of us feel bad (and become more sick).
Figures.
-the Centaur
Pictured: Coffee, somewhere (Victory Point Cafe), which, given how perpetually caffeinated I am, will do nothing at all to wake me up.
SO automatic image generation is a controversial thing I think about a lot. Perhaps I should comment on it sometime. Regardless, I thought I'd show off the challenges that come from using this technology with a simple example. If you recall, I did a recent post with a warped bookstore picture, and attempted to regenerate it using generative AI with Midjourney. Unfortunately, the prompt
a magical three-dimensional impossible bookstore in the style of M.C. Escher
me
failed to pick up the image for some reason. After a few iterations with the Midjourney Discord interface, I got the very nice, but nonsensical and generic, AI generated image you see up top. After playing around with the API, I realized that I likely had formulated my prompt wrong, and tried again to include this image:
On the second pass, I got another, more on-point, yet still nonsensical image as you see below:
These systems do LOOK impressive. But they work like ... amateurs who've learned to render well. They can produce things that are cool, but it's very hard to make them produce something on point.
And this is above and beyond the massive copyright issues that arise from a system that regurgitates other people's copyrighted art, much less the impact on jobs, much less the impact on the human soul.
What, you think you can just look at me, sitting here on this weird funny pile of leaves on top of these strangely fallen logs, minding my own business like a normal cat, and I'll look back? At YOU?
Recently I went to do something in Mathematica - a program I've used hundreds if not thousands of times - and found myself stumped on a simple issue related to defining functions. I've written large, complicated Mathematica notebooks, yet this thing I'd done hundreds of times was stymieing me.
But - yes - I'd done it hundreds of times; but not regularly in the past year or so.
My knowledge had gone stale.
Programming, it appears, is not like riding a bike.
What about other languages? I can remember LISP defun's, mostly, but would I get a C++ class definition right? I used to do that professionally, eight years ago, and have published articles on programming C++ ... but I've been writing almost exclusively Python and related scripting languages for the past 7 years.
Surprisingly, my wife and I had this happen in real life. We went to cook dinner, and found some of the stuff in the pantry had gone stale. During the pandemic, you see, we bought ahead, since you couldn't always find things, but we consumed enough of our staples that they didn't go stale.
Not so once the rate of consumption dropped just slightly - eating out 2-3 times a week, eating out for lunch 2-3 times a week - with a slight drop in variety. Which meant the very most common staples were consumed, but some of the harder-to-find, less-frequently-used stuff went bad.
We suspect some of it may have had near-expired dates we hadn't paid attention to, but now that we're looking, we're carefully looking everywhere to make sure our staples are fresh.
Maybe, if there are skills we want to rely on, we should work to keep those skills fresh too.
Maybe we need to do more than just "sharpen the saw" (the old adage that work goes faster if you take the time to maintain your tools). Perhaps the saw needs to be pulled out once in a while and honed even if you aren't sawing things regularly, or you might find that it's gone rusty while it's been stored away.
-the Centaur
Pictured: The bottom layers of detritus of the Languages Nook of the Library of Dresan, with an ancient cast-off office chair brought home from the family business by my father, over 30 years ago.
Recently, when digging through old posts, I was reminded that Classic Editor posts are broken in WordPress - all the paragraph breaks are gone, and the content is mashed up into one grey wall of text. Thanks, WordPress, for forcing everyone to switch to a worse editing experience AND breaking all our old content.
[hang on a second, i have to start clicking around at random places on the page to try to find the widget or control that will let me start typing again after inserting an image, because software USABILITY has been replaced by "user experience" folks from a graphic design background who have mistaken making things LOOK GOOD IF THEY HAD BEEN PRINTED for the very different ACTUALLY WORKING WELL AS A TOOL - I'm looking at you, WordPress Gutenberg, Dropbox Paper, and everything like you where you have to hover or click or click and select and hover random parts of the page to make it work. Okay, I can start typing again.]
[[ and yeah it just did it again while i was just fricking typing ]]
Ok we're back.
Ok?
Ok.
Anyhoo, I have like a thousand old posts (1371 published, according to the dashboard), but the block converter for fixing these no longer works. I wish I had discovered this problem earlier, but I just didn't expect to have to do blog archaeology when I moved to Gutenberg.
Regardless, however, I now have a system. I open the All Posts page on the WordPress dashboard, and scroll backwards in time until Classic Editor posts start showing up - nice that they provide that nudge to get us to use the new editor, isn't it? Once I find some Classic Editor posts, if you hover - AAAAARRRRRGH, don't mind me - I say, if you hover, you get the option to open with the Block Editor. FORTUNATELY, this is ACTUALLY a link and not a bizarre JavaScript pseudo-button - Good WordPress, Good WordPress, have a cookie - and a right click will allow you to open this in a NEW WINDOW.
SO! I go down one entire page of results, opening them in a new window, until I've hit all the Classic Editor posts on that page. This creates a gazillion tabs, true, but then you can click on each tab in turn, and there's a simple three-click process which will activate the block editor, convert the old text, and - BAM! - update. Optionally, one more click will bring up the updated post so you can doublecheck it before closing the tab.
The process is laborious - but it's easy to get a whole page full of results at a time, and you can't easily lose your place, as you close your tabs as you go. I've gotten through 3 pages of results so far, each with 50 posts, so I've updated probably something close to 150 posts.
There are 25 more pages of results to go, but each one doesn't take more than 30 minutes, so I can do one a day for about a month and rescue all the old posts.
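As a back-of-the-envelope check on those numbers, here is a small shell sketch. It assumes the dashboard shows 50 posts per page throughout, that all 1371 published posts count toward the page total, and that every page takes the full 30 minutes; the variable names are just for illustration:

```shell
total_posts=1371       # published posts, per the dashboard
per_page=50            # posts listed on each page of results
pages_done=3           # pages already converted
minutes_per_page=30    # upper bound on time per page

# Ceiling division: a partly filled last page still counts as a page
total_pages=$(( (total_posts + per_page - 1) / per_page ))
pages_left=$(( total_pages - pages_done ))
total_minutes=$(( pages_left * minutes_per_page ))

echo "pages in total: $total_pages, pages left: $pages_left"
echo "work remaining: $(( total_minutes / 60 ))h $(( total_minutes % 60 ))m"
```

That works out to 28 pages in total, which matches the 3 done plus 25 to go, and at most about twelve and a half hours of clicking, spread over 25 days at one page a day.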
A lot of work ... but at least I now have a system.
-the Centaur
Pictured: The House With The Impressive Tree In The Front Yard, found in a nearby neighborhood, as photographed in Night Mode on my Android phone during a walk with my wife.
Somehow, inadvertently, I caused the previous post's picture to get blurred in transit. Below is a better version, which seems to have come through much clearer:
This is from my blogpost "All the Transitions of Tic Tac Toe, Redux". Apparently the full-size image is no longer available (probably because it's close to 80 megabytes in size, and whatever file hosting I was using to put it up is broken) but a "smaller" version is below, only 12 megabytes in size (or here):
All the transitions from the first state of tic-tac-toe (at the bottom) to a win for X (left), a win for O (right), or a draw (top).
Funny ... I long remembered this as being the topic of "Don't Fall Into Rabbit Holes" but that turns out to have been a completely different project.
Wow. We're done with the paper. And what a team effort! So many people came together on this one - research, infra, operations, human-robot interaction folks, the whole nine yards. It's amazing to me how interdisciplinary robotics is becoming. A few years ago 7 authors on a paper was unusual. But out of the last 5 papers I helped submit, the two shortest papers had 8 authors, and all the others were 15 or more.
And it's not citation inflation. True, this most recent paper had a smaller set of authors actively working on the draft, collating contributions from a larger group running the experiments ... but the previous paper had more than 25 authors, all of whom materially contributed content directly to the draft.
What a wonderful time to be alive.
And to recover from food poisoning.
-the Centaur
Pictured: this afternoon's draft of the paper, just prior to a video conference to hammer out some details.
Even though building up a great library is an important part of my process, getting out of the office is just as important. There's little better in my mind than getting out to some other space where you can't do laundry, pay your bills, or even get distracted by some book you were reading. Out in a coffeehouse or cafe, you can sit, read, and write, disappearing into that state of "flow" you get from engagement with your own process.
But it's just as important to expose yourself to new, unchosen information - not your news feed or blogroll, but a set of information spread out across all possible topics, like reading a great encyclopedia, visiting a library ... or browsing a bookstore. While a bookstore's topics are limited, and even the nicest ones are trying to sell you things, they're not just trying to sell them to you: they're trying to create an information space, one of a completely different kind than I talked about when discussing my library.
In a bookstore or library, it's possible to get lost in chains of thought that you never would have otherwise had, because you're prompted by information that you never would have chosen to see, if it all came from your feed or your previous collection of chosen books and media.
Get out sometime, and lose yourself in a good bookstore. If you can walk there from where you are, so much the better; then you can combine the experience of life with the expansion of your mind.
-the Centaur
Pictured: Moe's Bookstore in Berkeley. Does ... that seem right to you, or am I still woozy from food poisoning? :-D
I got food poisoning Monday night. Admittedly, this was pretty serious (in the top five, or even top three of food poisoning incidents in my life) and it was on a red-eye flight (definitely in the top three most miserable experiences of my life) with serious turbulence (also in the top five or so as turbulence goes) but, even so, DAYS later, I'm still running on backup systems and batteries. I typically can't sleep until 5am, no matter when I go to bed, and then can't seem to wake up until 2 to 4 pm, well more than a solid 8 hours later. And I can't seem to concentrate, reading the same paragraphs over and over again until finally the lawnmower motor baarrrrumphs to life and I start to be able to move through the paper again.
So, in sum, what I'm saying is, try not to get food poisoning on a red-eye.