Commonsense Computing as a Non-Conventional Solution
You might find this title very weird.
Well, I just want to point out that the vast majority of the technology research community tries to solve its problems by finding exact, precise solutions based on strong, general mathematical models.
Content analysis provides plenty of examples: speech recognition, natural language understanding, categorization/retrieval/understanding of images, video, 3D models, and human motion/gestures, facial expression understanding, emotion detection, and so on. Almost all researchers try to solve these problems from the low levels up, step by step, hoping to reach the higher levels gradually.
There's nothing wrong with that. But why, particularly in content analysis, are none of the achievements satisfying enough? I say we should take a look at how we humans reason about the things in our world:
"By experience we learn it."
Nobody needs to understand complicated mathematical models in order to understand what people are saying, what their gestures/motions mean, what their facial expressions mean, what is in an image, or what their emotions are. We just learn all this stuff by experience. If we can teach computers things by experience too, I say it's going to be easier for them to learn than it is with the way researchers usually approach the problem. Of course we don't have 20 years to wait for a "new-born computer" to become an "experienced, grown-up computer", but we can try to collect all the experience we have and give the computer the whole bunch of it in a relatively short time. After all, it has huge storage anyway. And that's how I view commonsense computing.
In other words, while conventional thinking is quite valuable too, I think an experience-based approach is even more important in research on how people think about the world and themselves.
Affective Computing Based on Commonsense/Experience
I have the same kind of feeling about affective computing as it stands today. Right now, people use as many sensors as they can and try to work out which sensor's data relates to emotions the most. In my opinion, however, we should take advantage of how people use their experience to recognize emotions through different modalities. That way, we wouldn't need to construct an exact model of the relationship between a particular emotion and the sensor data, but we would still get good, maybe even better, recognition results. For example, if a woman walks very fast down a dark street alone with her body stretched, we can easily infer that she might be afraid or nervous, based on our commonsense. Commonsense inference draws on far more shared information than sensor data does. Without such inference, I don't think it would be easy to recognize emotions correctly, because the information provided by the physiological phenomena that sensors detect is too sparse.
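To make the idea a bit more concrete, here is a minimal sketch in Python of the dark-street example. Everything in it is made up for illustration: the cue names, the EXPERIENCE table, the infer_emotions function, and especially the weights are my own assumptions, not real data or anybody's actual system. It only shows the shape of the approach, where each observed cue votes for the emotions we associate with it from experience, instead of fitting a model from sensor readings.

```python
from collections import defaultdict

# Hypothetical "experience" table: each observed cue votes for emotions.
# The cues and weights are invented for illustration, not measured data.
EXPERIENCE = {
    "walking_fast": {"afraid": 1.0, "nervous": 1.0, "in_a_hurry": 2.0},
    "dark_street": {"afraid": 2.0, "nervous": 1.0},
    "alone": {"afraid": 1.0, "nervous": 1.0},
    "body_stretched": {"afraid": 1.0, "nervous": 2.0},
}


def infer_emotions(cues):
    """Sum the votes of every known cue and rank emotions by total score."""
    scores = defaultdict(float)
    for cue in cues:
        # Cues we have no experience with simply contribute nothing.
        for emotion, weight in EXPERIENCE.get(cue, {}).items():
            scores[emotion] += weight
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


observed = ["walking_fast", "dark_street", "alone", "body_stretched"]
ranking = infer_emotions(observed)
print(ranking)
```

With these invented weights, "afraid" and "nervous" come out on top and "in_a_hurry" trails behind, which matches what we would guess ourselves. A real system would need a far larger collection of experience than a four-row table, but the point is that no exact emotion-to-sensor model is required.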