Thursday, November 03, 2005

Problem Defined - Video Sequencing Weekly Report Nov 2

This week I took a closer look at the stories and the annotations to find a better way of applying commonsense techniques to this problem domain, and I also ran a few small experiments with ConceptNet. During the discussion we came up with a new formalization of the problem and the scenario, and both the overall picture and the next steps are now fairly clear.

Looking at the Stories and the Annotations

I looked at both sets of text and noticed a few interesting things, listed below:

  • Synonyms – Similar but different terms tend to appear in different passages to describe similar meanings, e.g., “creating”, “inventing”, “making”, and so on. The system needs to recognize that they convey the same idea if we wish to correlate the annotations with the stories.
  • Related concepts – Sometimes the above situation results not from synonyms but from things that are related in a certain context. For example, “John invented the dance” and “he created new steps” are more likely to refer to the same event than “John invented the dance” and “John invented the animation”.
  • Proper names – A single name appearing in different annotations, e.g., “Victoria”, refers to the same person and gives many clues about causality. Identifying these names is also extremely important for finding the relationships among stories and annotations.
  • Although the events stored in StoryNet tend to be very low-level, I think it has the potential to be a useful tool. The newer version of StoryNet that Dustin Smith and others in the Commonsense Computing Group are pursuing is built to show all the possible steps toward a given goal. It should be useful for filling the semantic gap where things are not explicitly stated in the stories or annotations. This can be considered a future step.
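The difference between synonyms and context-related concepts can be sketched in a few lines of Python. This is only a minimal illustration: the two lookup tables are hand-written stand-ins for what WordNet or ConceptNet might return, the function names are hypothetical, and events are assumed to be already reduced to (actor, verb, object) triples with pronouns resolved.

```python
# Hand-picked toy groups; NOT real WordNet or ConceptNet output.
SYNONYM_GROUPS = [{"create", "invent", "make"}]   # verbs treated as the same action
RELATED_GROUPS = [{"dance", "step", "move"}]      # objects related in the dance context


def in_same_group(groups, a, b):
    """True if a and b are identical or both appear in one of the groups."""
    return a == b or any(a in group and b in group for group in groups)


def likely_same_event(event_a, event_b):
    """Compare two (actor, verb, object) triples of lemmatized words."""
    actor_a, verb_a, obj_a = event_a
    actor_b, verb_b, obj_b = event_b
    return (
        actor_a == actor_b
        and in_same_group(SYNONYM_GROUPS, verb_a, verb_b)
        and in_same_group(RELATED_GROUPS, obj_a, obj_b)
    )


# "John invented the dance" vs. "he created new steps" ("he" resolved to John by hand)
print(likely_same_event(("John", "invent", "dance"), ("John", "create", "step")))
# "John invented the dance" vs. "John invented the animation"
print(likely_same_event(("John", "invent", "dance"), ("John", "invent", "animation")))
```

The first pair is accepted because the verbs are synonyms and the objects are related; the second is rejected because "dance" and "animation" share no group, even though the verb is identical.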

Finding the concept flow within the story

I then tried to find the proper clips for a story by hand, so that I could better understand how a human selects the useful material over the rest.

  • Characters – Interestingly, almost all of the stories follow one particular character, so finding all the annotations concerning that character would be the first step in the system’s process.
  • Time – Because all stories unfold over time, and different parts of a story tend to be different sub-events happening in different time periods, dividing the story into parts using time information might be the second step after identifying the character(s).
  • Extracting the “story descriptors” from the story (e.g., characters, desires/goals, problems, does, emotions, results). I am not sure whether the system needs to detect these “descriptors” in the passages. It would help to focus on a few things and filter out irrelevant utterances, but it would hurt if the descriptors are chosen poorly.
  • Labeling the key ideas in the text. The key ideas here, in my view, are the values of the descriptors (for example, for the goal descriptor, the sentence “Gustave likes to create his own dance” has the key idea “create his own dance”). If we skip the step of extracting the story descriptors, the key ideas could be arbitrary verbs together with the objects they act on. (Again, the function of the story descriptors is to narrow the semantic processing down to the examination of a specific set of values.)
  • Finding the annotations related to the stories according to the labeled ideas
  • Scope – This is really the process of semantically matching parts of two sentences. But it is not reasonable for the system to take all the words in both sentences into account when computing the correlation, since the two sentences are unlikely to describe events of the same scope. The clip annotations tend to cover subsets of the stories, so it makes more sense to claim that an annotation is closely related to a story even if they share only one concept. The approach I have come up with so far is the so-called "in-other-words" approach: rewrite every sentence in both the stories and the annotations into as many distinct sentences as possible, and match the sentences from both sets.
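The first and last steps above (filter by character, then match "in-other-words" rewrites) might look roughly like the following sketch. Everything here is an assumption made for illustration: the paraphrases are written by hand rather than generated, concept extraction is reduced to dropping a tiny hypothetical stopword list, and the first two annotations are abridged from the ones quoted later in this post.

```python
import re

# Hypothetical stopword list, just large enough for this example.
STOPWORDS = {"a", "an", "the", "to", "of", "so", "he", "she", "his", "her",
             "is", "be", "and", "up", "some", "s", "wants"}


def tokens(text):
    """Lowercased set of alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def concepts(text, character):
    """Crude concept set: tokens minus stopwords and the character's name."""
    return tokens(text) - STOPWORDS - {character.lower()}


def related_annotations(paraphrases, annotations, character):
    """Keep annotations that mention the character and share at least one
    concept with any 'in-other-words' rewrite of the story sentence."""
    wanted = set()
    for paraphrase in paraphrases:
        wanted |= concepts(paraphrase, character)
    return [
        annotation for annotation in annotations
        if character.lower() in tokens(annotation)
        and concepts(annotation, character) & wanted
    ]


# Hand-written rewrites of "He likes to suggest ideas for the dance".
paraphrases = ["Gustave invents new moves", "Gustave makes up a dance"]
annotations = [
    "Gustave wants to be a dancer, so he invents some new moves.",
    "Gustave's mother watches the children perform the Finale.",
    "Louis begins his animation.",
]
print(related_annotations(paraphrases, annotations, "Gustave"))
```

Only the first annotation is returned: the second mentions Gustave but shares no concept with the rewrites, and the third does not mention him at all. A single shared concept is enough for a match, which reflects the scope argument above; a real system would need lemmatization and a proper source of rewrites.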

Right now, I no longer think it makes much sense for the system to deliberately link annotations using the context constructed from the stories. The context comes naturally as the story is input, and the selected annotations will naturally be contextually related as well if we select them according to the sentences in the story. So the problem becomes how to relate the annotations to the stories by listing the "in-other-words" sentences, and we no longer need to bother with the relationships among clips. The complexity of building this system is therefore somewhat lower, but I think it is also a good interaction design and a more reasonable problem domain for commonsense computing to fit into.

All the problems of deciding which clip comes next are eliminated by having users input their own story for the video to be generated, because I no longer need to find the causal relationships among the clips, or any other relationships. In other words, the task of constructing the narrative has been avoided. The remaining subtask is simply to compare two text passages – whether one comprises the other, or whether they refer to the same fact.

Once this stage can be carried out with commonsense reasoning techniques, we’ll be able to proceed to the next stage – creating a narrative machine that gives us a story based on the material it has. For example, if the system wants to focus on emotional transitions, then clips about the same characters/group experiencing different emotions can be selected to form a narrative. This would be a higher-level look at the annotations.

Story:

Gustave is one of the dancers. He likes to suggest ideas for the dance. He gets very frustrated that he is only allowed to do what he is told. At one point he was crying. This happened during the first week. The second week was different. Dancers were encouraged to design their own dance. Gustave had a wonderful time.

Annotations:

(O) Gustave wants to be a dancer instead of orchestra in "RoBallet," so he invents some new moves. He shows them to Jacques, who likes how Gustave changes his level throughout the dance, which is something no one else has done.

(O) Gustave is upset because he wants to experiment with presenting the dances in different ways, but so far Jacques has been telling the kids what to do. Gustave just wants the chance to make something up himself.

(?) Louis begins his animation. Gustave gives him tips on how to create patterns, but ends up making the wrong design. After suffering a minor setback, Louis starts over and does it his way.

(?) The kids look for Gustave, who has been behind the screen working on animation. They want to include him in their dance.

(?) Gustave's mother watches the children perform the Finale, and she notices that they have made much progress.

(?) Jacques explains to the kids how their animation will be displayed on the screen during the dances. Gustave suggests that Mason and Tiffany perform "Pas de Deux" behind the screen, because the silhouettes will look interesting.

(?) Victoria, Louis, Cristen, and Gustave practice "The Curtain." Dufftin reminds them to show their claws!

(?) After lunch, the kids design and program animation that will be used in the performance. Here, Louis has an inspired vision of what he wants to create. He explains his ideas to Anindita and Gustave.

(?) Dufftin and Gustave's mother discuss teaching techniques and what works best for children. The consensus? Teaching recovery is most important.

(?) Director Henri would like to thank everyone involved in the creation of "RoBallet." Although the dance was inspired by Louis, Jacques reminds them that Gustave had the idea of making up their own moves first.

(X) Gustave, Cristen, Louis and Victoria practice "Curtain" a few times. Will the sensors work for the performance?

(X) Jacques instructs Gustave to look directly into the camera as he dances so the audience can get a good look at his face.

(X) Gustave has been given a part in the orchestra. With the cast complete, the kids are about to perform with him for the first time. Victoria adds Gustave to her introduction, but Henri says she should mention him with the dancers. Victoria complains that Henri should correct her after the show, and not embarrass her in the middle of it.

Extracting the Key ideas

Gustave is one of the dancers. He likes to suggest ideas for the dance. He gets very frustrated that he is only allowed to do what he is told. At one point he was crying. This happened during the first week. The second week was different. Dancers were encouraged to design their own dance. Gustave had a wonderful time.

He likes to suggest ideas for the dance

  • Gustave likes/wants to create
  • Gustave likes/wants to discuss
  • Gustave likes/wants to imagine
  • Gustave likes/wants to try different poses
  • Gustave likes/wants to move his body

He gets very frustrated that he is only allowed to do what he is told

  • Gustave feels bad to be restricted
  • Gustave wants to quit when being forced

At one point he was crying.

  • Gustave feels bad, unhappy, disappointed, frustrated

The second week was different.

  • Gustave is not feeling bad, unhappy, etc.
  • Gustave feels better

Dancers were encouraged to design their own dance.

  • Gustave/Dancers have their own design
  • Gustave/Dancers can use their imaginations
  • Gustave/Dancers are not restricted
  • Gustave/Dancers are praised

Conclusion: Definition of the Problem and the Scenario

  • Only one character is considered over the course of the story.
  • Users input structured stories: character, goals/desires/problems, struggles, results
  • All the characters are described in a list with their personalities and characteristics, so that the system can identify which one the story refers to
  • The task is reduced from first creating the narrative of the story and then choosing the right clips to sequence, to the subtask of selecting the right annotated video clips by mapping the semantics of the input stories onto the annotations
  • Thus, the input would be
    1. a structured text story provided by anyone who took part in the RoBallet event
    2. a set of video clips with arbitrary text annotations
    3. a list of descriptions of the characters that appear in any of the video clips
  • The system would comprise
    1. a website for gathering stories from the people who participated in the event
    2. a natural language processor, or a modified version of MontyLingua, that processes the stories as well as the annotations to identify the characters, the actions they take, the emotions they exhibit, and so on
    3. a related-concept provider, which may include ConceptNet, WordNet, as well as any other tool for finding possible synonyms and concepts
      (Note that ConceptNet and WordNet have different capabilities and can be used to find different information. For the example "create," WordNet can be used to find the synonym "invent," whereas ConceptNet can be used to find the related concept "imagine", which does not share the same meaning directly but is still highly related.)
    4. a central module that manages the data flow through the whole system, matches the concepts/words derived from the stories and the annotations, and finally presents the results
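To make the data flow concrete, here is a rough sketch of how the three inputs and the central module could fit together. It is not an implementation plan: the class and field names are my own assumptions, the language processor and the related-concept provider are replaced by toy functions (no MontyLingua, ConceptNet, or WordNet calls), the character list is only checked for membership, and the website is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Story:
    """Structured story input; the field names are assumptions."""
    character: str
    goals: List[str]
    struggles: List[str]
    results: List[str]


@dataclass
class Clip:
    clip_id: str
    annotation: str


Extractor = Callable[[str], Set[str]]      # stand-in for the language processor
Expander = Callable[[Set[str]], Set[str]]  # stand-in for the related-concept provider


def select_clips(story: Story, clips: List[Clip], characters: Dict[str, str],
                 extract: Extractor, expand: Expander) -> Dict[str, List[str]]:
    """Central module: for each part of the story, list the clips about the
    story's character whose expanded concepts overlap with that part."""
    if story.character not in characters:
        raise KeyError(f"unknown character: {story.character}")
    name = story.character.lower()
    selected: Dict[str, List[str]] = {}
    for part in ("goals", "struggles", "results"):
        wanted: Set[str] = set()
        for sentence in getattr(story, part):
            wanted |= expand(extract(sentence))
        wanted.discard(name)  # the name alone must not count as a match
        selected[part] = [
            clip.clip_id for clip in clips
            if name in extract(clip.annotation)
            and (expand(extract(clip.annotation)) - {name}) & wanted
        ]
    return selected


# Toy components, for illustration only.
STOPWORDS = {"a", "an", "the", "to", "his", "her", "is", "own", "some"}
CANONICAL = {"invents": "create", "upset": "frustrated"}  # hand-picked mappings


def toy_extract(text: str) -> Set[str]:
    return {word.strip(".,!?\"'") for word in text.lower().split()} - STOPWORDS


def toy_expand(concepts: Set[str]) -> Set[str]:
    return set(concepts) | {CANONICAL[c] for c in concepts if c in CANONICAL}


story = Story("Gustave",
              goals=["Gustave likes to create his own dance"],
              struggles=["Gustave gets frustrated"],
              results=[])
clips = [Clip("c1", "Gustave invents some new moves"),
         Clip("c2", "Gustave is upset"),
         Clip("c3", "Louis begins his animation")]
characters = {"Gustave": "one of the dancers", "Louis": "works on animation"}
print(select_clips(story, clips, characters, toy_extract, toy_expand))
```

Passing the extractor and expander in as functions keeps the central module independent of whichever tools end up filling components 2 and 3.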
