Saturday, February 23, 2008

Facial Expression Recognition. Who Wants This?

I came across another interesting article on Science Daily about a new software prototype that can read your face and identify your expression as one of six basic emotions. The article lists a couple of possible applications for this kind of software -- none of them anything I would ever want my computer doing without some pretty explicit case-by-case authorization.

The basic idea is that after it decides what look is on your face, it can communicate with some kind of online avatar that will reproduce the basic expression for the benefit of whoever you happen to be corresponding with. So maybe your World of Warcraft avatar is mimicking your expressions in near real-time, or maybe you're doing business online and your computer is spying on your reaction to the offer you just got.
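For what it's worth, here's roughly how I picture that pipeline, sketched in Python. The article doesn't spell out the actual algorithm, so the feature names, the numbers, and the avatar call below are all invented for illustration: measure a handful of facial features, snap them to the nearest of six canned prototypes, and hand that single label to the avatar.

# Hypothetical sketch of the classify-then-relay idea. Every feature name,
# number, and the avatar method here is made up, not taken from the article.
EMOTION_PROTOTYPES = {
    "joy":      {"mouth_curve":  0.8, "brow_raise":  0.2, "eye_open": 0.6},
    "sadness":  {"mouth_curve": -0.6, "brow_raise":  0.1, "eye_open": 0.4},
    "anger":    {"mouth_curve": -0.3, "brow_raise": -0.7, "eye_open": 0.7},
    "fear":     {"mouth_curve": -0.2, "brow_raise":  0.9, "eye_open": 0.9},
    "surprise": {"mouth_curve":  0.1, "brow_raise":  0.9, "eye_open": 1.0},
    "disgust":  {"mouth_curve": -0.5, "brow_raise": -0.2, "eye_open": 0.5},
}

def classify_expression(features):
    """Return whichever of the six prototypes is closest to the measured face."""
    def distance(proto):
        return sum((features[k] - proto[k]) ** 2 for k in proto)
    return min(EMOTION_PROTOTYPES, key=lambda name: distance(EMOTION_PROTOTYPES[name]))

def update_avatar(avatar, features):
    """Collapse the live face into one label and hand it to the avatar."""
    emotion = classify_expression(features)
    avatar.play_expression(emotion)  # hypothetical avatar method
    return emotion

# An ambiguous half-smile with raised brows still gets forced into a bucket:
print(classify_expression({"mouth_curve": 0.4, "brow_raise": 0.4, "eye_open": 0.7}))

Whatever the real system does under the hood, the part that matters for this post is that last step: everything you actually did with your face gets compressed down to one of six words.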

I mean... I guess it's cool. But it's kind of nosy. Facial expressions might give a more complete picture of a person's feelings than their words, but words are chosen, while facial expressions are largely involuntary. If you want to video chat, there's already software for that. Do we need virtual faces?

In terms of web immersion, it seems a little too intense -- we're already creating weird enough social phenomena in MMORPGs as it is -- and now we want to give people the tools to put up a false front that syncs up to their faces? How deep do we want to get into this stuff?

And in terms of e-commerce, I'm just not sure what's going to come of that. A new form of market research that tracks people's facial expressions when they look at a new product or ad? I can't see a scenario where that could possibly work out to the buyer's advantage, but I see plenty where it could be abused by the seller. We've already got spyware programs doing their best to track our browsing habits -- expression tracking just seems like one more step in the wrong direction -- unnecessary and invasive.

Now, there are ways this could be used that would be genuinely cool and important, so I'm not saying I wish it didn't exist. I'd like to see applications where a program tries to respond "intelligently" to your expressions by offering you different options depending on your reaction -- though you'd still need a verbal override, because people just aren't in full and constant command of their faces the way they are of their voices and their fingertips.
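A toy version of what I mean about the override, again with everything invented rather than taken from the article (the emotion labels, the suggestions, the confirmation step): the program only suggests things based on the detected expression, and nothing happens until you explicitly say yes.

# Hypothetical "respond to expressions, but keep a verbal override" sketch.
SUGGESTIONS = {
    "confusion":   "Would you like to see the help page?",
    "frustration": "Want to undo your last few changes?",
}

def suggest_action(detected_emotion):
    """Map a detected expression to a suggestion, or stay quiet."""
    return SUGGESTIONS.get(detected_emotion)

def respond(detected_emotion, user_confirms):
    """Only act if the user explicitly agrees; the face alone never decides."""
    suggestion = suggest_action(detected_emotion)
    if suggestion is None:
        return "no action"
    print(suggestion)
    return "acted" if user_confirms(suggestion) else "dismissed"

# The override in action: the face reads as "frustration" but the user declines.
print(respond("frustration", user_confirms=lambda prompt: False))

The point is just that the detected expression is a hint, never a command.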

Check out a video of someone testing out the program. Notice that it looks sort of fun to manipulate, but it consistently misses the authentic half-smiles and eye-rolling that keep creeping in between the unrealistic stage expressions. I don't think this is going to clarify people's emotions over the net very well, though people might enjoy using it to express them.

Meanwhile, if we're going to be using something like this to show people our expressions and clarify our responses and intentions -- then instead of categorizing expressions and repackaging them as a basic catchall emotion like "fear" or "joy", wouldn't it make more sense to use those sophisticated tracking abilities to just rebuild the same face you made?* Human expressions are too complicated to be sliced up into six types.

*Given the somewhat vague wording of the article, it's possible the program in question does this, but my overwhelming feeling from the description of the way it works is that it focuses on identifying rather than rebuilding the expressions it captures.
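Roughly what I'm picturing, sketched in the same throwaway style as above (the landmark names and the Avatar class are mine, not anything from the article): pass the tracked positions straight through so the avatar rebuilds the actual face instead of a label.

# Hypothetical "rebuild the face" sketch: ship landmark offsets, not labels.
class Avatar:
    def __init__(self):
        # Each landmark is an (x, y) offset from its neutral position.
        self.landmarks = {}

    def apply_landmarks(self, tracked, smoothing=0.5):
        """Blend newly tracked offsets into the avatar's current pose."""
        for name, (x, y) in tracked.items():
            old_x, old_y = self.landmarks.get(name, (0.0, 0.0))
            self.landmarks[name] = (
                old_x + smoothing * (x - old_x),
                old_y + smoothing * (y - old_y),
            )

avatar = Avatar()
# A half-smile plus an eye-roll survives intact instead of becoming "joy" or "surprise".
avatar.apply_landmarks({
    "mouth_corner_left":  (0.02, 0.01),
    "mouth_corner_right": (0.00, 0.00),
    "upper_eyelid_left":  (0.00, 0.03),
    "upper_eyelid_right": (0.00, 0.03),
})
print(avatar.landmarks)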

4 comments:

Gyro said...

For marketing: don't worry about the facial expressions. They're already doing market research while running brain scans. Or rather, maybe worry a lot.

I think the avatar expression is actually kind of neat. I mean, it's additional contextual information, and I doubt people would be required to leave it on. It's just another step toward richer interaction and more of a VR environment. These are early steps, sure, and kind of haphazard. As long as you can turn them off, I think they're a good path to develop.

Gyro said...

Oh, and something I just thought of:

Think about how much better people might be able to train themselves if they had an accurate system that gave them immediate feedback on their nonverbal cues.

Jackie Bowen said...

I'd just prefer to see avatar expression cues and gestures become, if anything, more complex and realistic, rather than more connected to what you're doing behind the screen.

Maybe it's just me, but if I had an avatar and wanted it to smile, I'd prefer to /tell/ it when it was allowed to smile!

Summary: I'd rather click a button that lets me choose between 20 different expressions than have my computer "decide" I'm smiling at people on the internet every time my roommate makes a joke.

Not to mention "surprised" every time I roll my eyes.

Gyro said...

I imagine they'll implement a combination of pre-scripting and modeling. Hopefully, people will get their choice of how to emote.