The Problem of Specificity and Generality - Video Sequencing Weekly Notes Oct 12
Yesterday I wrote a program that constructs relationships between two clips by analysing them with ConceptNet and spreading activation. It turned out that either the related concepts are too few and show no difference from keyword matching (when the spreading depth/width is small), or the system suggests so much garbage among the related concepts that the ones people would regard as important cues get blurred and diluted (when the depth/width is larger).
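To make the depth/width trade-off concrete, here is a minimal sketch of spreading activation. It does not use the real ConceptNet: `TOY_GRAPH` is a small hand-made, directed graph, and the `spread` function, its `decay` parameter, and the rule of keeping only the first activation a node receives are all my own assumptions for illustration.

```python
from collections import defaultdict

# Hand-made stand-in for ConceptNet; these edges are hypothetical.
TOY_GRAPH = {
    "rehearsal": ["performance", "practice"],
    "performance": ["stage", "dance", "music"],
    "practice": ["sport", "music"],
    "dance": ["rhythm", "music"],
    "music": ["rhythm", "instrument"],
    "rhythm": ["beat"],
    "beat": ["rhythm", "hit"],
    "hit": ["sport", "pain"],
    "stage": ["theater"],
}

def spread(graph, seeds, depth, width, decay=0.5):
    """Return {concept: activation} after `depth` rounds of spreading.

    `width` caps how many neighbors each node passes energy to, and
    `decay` shrinks the energy at every hop.
    """
    activation = {seed: 1.0 for seed in seeds}
    frontier = dict(activation)
    for _ in range(depth):
        incoming = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor in graph.get(node, [])[:width]:
                incoming[neighbor] += energy * decay
        # Only newly reached concepts join the result and keep spreading.
        frontier = {n: e for n, e in incoming.items() if n not in activation}
        activation.update(frontier)
    return activation

def shared_concepts(graph, seeds_a, seeds_b, depth, width):
    """Concepts activated from both seed sets."""
    a = spread(graph, seeds_a, depth, width)
    b = spread(graph, seeds_b, depth, width)
    return set(a) & set(b)

shallow = shared_concepts(TOY_GRAPH, ["rehearsal"], ["beat"], depth=1, width=2)
deep = shared_concepts(TOY_GRAPH, ["rehearsal"], ["beat"], depth=3, width=3)
print("shallow:", sorted(shallow))
print("deep:", sorted(deep))
```

On this toy graph the shallow run finds nothing in common, while the deep run finds both "rhythm" (a useful cue) and "sport" (noise), which is a miniature version of the two failure modes described above.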
The problem, as Barbara expected, comes from the fact that the set of concepts explodes incredibly fast and widely when we use spreading activation in ConceptNet, which is by nature a collection of general but shallow common sense. The difficulty of applying this kind of tool to storytelling thus turns out to be that it can't narrow down to a specific topic and trace the semantics as humans do when two or more pieces of information are supplied. The two sentences "The director arranged the performers for a final rehearsal" and "Tom felt frustrated because he still couldn't find the beat" may actually be highly relevant to each other in the context of a dance performance, but the word set ("director", "arrange", "performer", "final", "rehearsal") is just so unrelated to ("feel", "frustrated", "find", "beat") in ConceptNet.
More specifically, the nodes of these two word sets lie so far apart in ConceptNet that it is reasonable for the system to fail to find their high relevance. If we can fill in the semantic gap between these two sentences with a broader, less specific description of the context (e.g., "This is a rehearsal for a dance performance. The director directs a group of performers on the stage......."), then it might be easier for the system to find that they are actually pretty related.
Suppose we have the users write a story about the video they're going to make, and the system automatically arranges the available clips into a sequence according to this story. The advantages are twofold. First, such a story can serve as a bridge that fills in the semantic gaps among clips, so different clips can be related to one another by routing through the story in the middle. The relationships among clips can thus be constructed, which would be difficult to do otherwise. Second, "inputting stories and getting video sequences" could actually be a pretty nice way of interacting, since the output sequence would be generated according to the user's narrative instead of a relatively meaningless correlation over words that aren't even guaranteed to be important concepts.
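The "story in the middle" idea could be sketched roughly as follows. Everything here is hypothetical: the function names, the stopword list, and above all `relatedness`, which is only a word-overlap placeholder for whatever ConceptNet-based score the real system would use. The point is just the routing: each clip is matched to the story sentence it fits best, and clips are ordered by where that sentence falls in the story.

```python
import re

# Small, ad hoc stopword list (an assumption, for illustration only).
STOPWORDS = {"a", "an", "the", "for", "and", "is", "this", "he", "on", "of", "to"}

def content_words(text):
    """Lowercased words of `text`, minus stopwords."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}

def relatedness(a, b):
    """Placeholder score: Jaccard overlap of content words."""
    words_a, words_b = content_words(a), content_words(b)
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

def sequence_clips(story_sentences, clip_annotations):
    """Order clips by the position of the story sentence each matches best."""
    def best_position(annotation):
        scores = [relatedness(annotation, s) for s in story_sentences]
        return max(range(len(scores)), key=scores.__getitem__)
    return sorted(clip_annotations, key=best_position)

story = [
    "The director arranged the performers on the stage for a final rehearsal.",
    "Tom could not find the beat and felt frustrated.",
]
clips = [
    "Tom felt frustrated because he still couldn't find the beat",
    "The director arranged the performers for a final rehearsal",
]
for clip in sequence_clips(story, clips):
    print(clip)
```

The two clips never get compared with each other directly; they end up in story order only because each one is anchored to a sentence of the user's narrative.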
I read Hugo's Bubble Lexicon today, and I think it might be a suitable tool for achieving this goal. During the path-finding process that Bubble Lexicon uses for reasoning, the ContextNodes are always activated and used to boost the paths that pass through them. If we can find such ContextNodes in the user's story and activate them like this, all the words in the two sets ("director", "arrange", "performer", "final", "rehearsal") and ("feel", "frustrated", "find", "beat") would become much more related.
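A rough sketch of how context boosting might change path scores is below. This is not Bubble Lexicon's actual algorithm or API: the graph, the edge weights, the `boost` factor, and the scoring rule (product of edge weights along a path, multiplied by `boost` for every active context node the path enters) are all assumptions made for illustration.

```python
# Hypothetical weighted graph: node -> {neighbor: edge weight}.
GRAPH = {
    "rehearsal": {"performance": 0.5, "practice": 0.5},
    "performance": {"dance": 0.5, "stage": 0.5},
    "dance": {"rhythm": 0.5},
    "rhythm": {"beat": 0.5},
    "practice": {"sport": 0.5},
}

def path_scores(graph, start, goal, context=frozenset(), boost=2.0, max_hops=4):
    """Yield (score, path) for every simple path from start to goal."""
    stack = [(start, [start], 1.0)]
    while stack:
        node, path, score = stack.pop()
        if node == goal:
            yield score, path
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop limit reached, stop extending this path
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor in path:
                continue  # keep paths simple (no cycles)
            # Entering an active context node multiplies the step by `boost`.
            step = weight * (boost if neighbor in context else 1.0)
            stack.append((neighbor, path + [neighbor], score * step))

def boosted_relatedness(graph, a, b, context=frozenset()):
    """Score of the best path from a to b, or 0.0 if none exists."""
    return max((s for s, _ in path_scores(graph, a, b, context)), default=0.0)

plain = boosted_relatedness(GRAPH, "rehearsal", "beat")
with_story = boosted_relatedness(
    GRAPH, "rehearsal", "beat", context={"dance", "performance"}
)
print(plain, with_story)
```

With "dance" and "performance" activated as context (as if they had been extracted from the user's story), the only path from "rehearsal" to "beat" scores four times higher than without them, while unrelated paths such as the one through "practice" get no help.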