The elements of touching are the factors embedded in a character, event, object, or place that move people. In the storytelling context, the elements of a touching story are the factors that make the story touch people's hearts. These may be feelings, emotions, or situations such as "courage," "passion," "suffering," "dilemma," "difficulties," "faith," "belief," and so on.
I started thinking about this interesting question, "What are the elements of a moving story?", as my friends and I, who are working on a start-up making documentary videos in Taiwan, tried to identify its strength in terms of a business model. I personally think that strength is the capability of "touching people's hearts," which is the best thing we do.
It struck me one day as I tried to relate this question to the storied navigation research topic that I've been trying to push forward. To explain this, I think it's necessary to give a brief introduction to the problem that I wanted to solve. To help users with their storytelling activity using existing documentary video clips, the computer has to be able to help with two sub-processes: a) finding the materials (clips) that are useful to the story to be told; and b) composing these selected materials into a sequence with a smooth, interesting story flow. The first one has, in some sense, been achieved using commonsense reasoning techniques. That is, the system helps to find the related clips by analyzing the users' input story text. The second one, however, is extremely hard to deal with. If we want to make the flow smooth at the scene level, that is not a domain where commonsense techniques work well, as I explained in my previous discussion. At the story level, on the other hand, composing stories that are smooth or make sense isn't really a big problem for the users (at least we can order the clips chronologically). The problem at the story level is thus making the story interesting.
What does "interesting" mean in terms of documentary video? This leads to the question of why people make documentaries. Documentary videos are different from films that are made up to entertain people, like Disney cartoons or other Hollywood movies. In my view, these real stories are documented because they are moving. In other words, these real characters may have faced extraordinary events, or they may have made extraordinary decisions or taken extraordinary actions, and their lives may have changed dramatically because of these decisions or actions, and so on. From discussing my friends' work and their strength, I found that touching is the universal key point of documentary videos. From this I also realized that touching is the key aspect of being interesting in terms of documentaries, which is exactly what I want my system to help with.
Therefore, I'd like to investigate "the elements of touching" in this article, as I try to work on the second sub-process of the storied navigation system. I think it's important to make clear what they really are before trying to find a computational representation and to model them, as we would when looking for a solution to an AI problem. (Maybe there are other reasons why people shoot documentaries as well, and we need to investigate them too, but for the time being, touching is a big enough problem for me.) On the other hand, to the best of my understanding, no other existing work mentions this "touching" idea when trying to help users with their storytelling activities. Most of them take into account more explicit story elements such as characters, places, events, time, objects, etc., but not implicit elements such as the transitions between affects. (Yes, Hugo did play with it, but he didn't transform it into a tool that helps storytelling.) So I think it should be worthwhile to look at it a bit too.
Currently I don't have a clear answer to the question of what the elements really are. But I suppose we might be able to find several dimensions that denote several types of these elements, which should be affects such as "passionate" or Ekman's basic emotions. The transitions of these affects over time will be combined computationally to derive the value of "touching," such that we will be able to compute the "touching score" for any point in time based on the elements embedded in a series of annotated clips. Why their transitions over time? This idea comes from my observation of both jazz music and film. If a piece of narration, either a story or a musical piece, has a flow that doesn't appear boring, then it must contain endless "tension-solution" pairs. With these tension-solution pairs, the whole narration becomes lively; otherwise it will be dull. The transitions of the elements of touching over time, in my current thinking, will be able to describe the amount of tension being expressed in the narration, which in turn gives us an idea of how touching the overall story becomes.
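To make the idea of a "touching score" derived from affect transitions a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration, since the actual elements and formula are still open questions: the affect dimensions (Ekman's six basic emotions), the intensity range of 0 to 1, the use of Euclidean distance as the size of a transition, and the running mean as the score are all hypothetical choices, as are the names AFFECTS, transition, and touching_scores.

```python
import math

# Hypothetical affect dimensions; Ekman's six basic emotions are one candidate set.
AFFECTS = ("anger", "disgust", "fear", "happiness", "sadness", "surprise")


def transition(prev, curr):
    """Size of the affect change between two consecutive clips.

    Each clip annotation is a dict mapping an affect name to an intensity
    in [0, 1]; missing affects count as 0. Euclidean distance is an
    assumed stand-in for "tension", not a validated measure.
    """
    return math.sqrt(
        sum((curr.get(a, 0.0) - prev.get(a, 0.0)) ** 2 for a in AFFECTS)
    )


def touching_scores(clips):
    """Return one score per clip position in the sequence.

    The score at position i is the mean size of all transitions up to
    and including clip i, so a sequence whose affects never change
    scores 0 everywhere.
    """
    scores = []
    total = 0.0
    for i, clip in enumerate(clips):
        if i > 0:
            total += transition(clips[i - 1], clip)
        scores.append(total / i if i > 0 else 0.0)
    return scores


if __name__ == "__main__":
    sequence = [{"sadness": 1.0}, {"happiness": 1.0}, {"happiness": 1.0}]
    print(touching_scores(sequence))
```

Under these assumptions, a sequence that moves from sadness to happiness scores higher at the turning point than a sequence that stays on one affect, which is roughly the "tension-solution" intuition. A real model would probably need to distinguish building tension from releasing it, which this sketch does not attempt.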
So the scenario of using my system to compose a story would be like this. Each video clip in the corpus has two types of annotations. The users will be asked to input "What happened in this clip?" and, optionally, "What feeling does this clip suggest?". If the user does not want to input the second one, the system will try to analyze the event that happened using spreading activation, and give the result as the default value. Otherwise, the system analyzes the feeling (arbitrary text) and tries to map it to the affect dimensions that are used to compute the touching score. The system helps the user find all the desired materials by retrieving clips whose annotations are similar to the story text, and helps the user sequence these materials better by giving feedback on all the affect values and the touching score, as well as recommended sequences. I think it would be better if the user only had to type story sentences and the system did the rest of the job, similarly to the Textable Movie, but more design details should become clear as I discuss it further with people.
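The retrieval half of this scenario can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain word-overlap (Jaccard) measure stands in for the commonsense reasoning and spreading activation described above, and the Clip class, its field names, and the retrieve function are hypothetical names, not part of any existing system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clip:
    name: str
    what_happened: str             # answer to "What happened in this clip?"
    feeling: Optional[str] = None  # optional "What feeling does this clip suggest?"


def words(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def overlap(a, b):
    """Jaccard word overlap; an assumed stand-in for commonsense similarity."""
    wa, wb = words(a), words(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def retrieve(story_sentence, corpus, top_k=3):
    """Return up to top_k clips whose event annotation overlaps the sentence."""
    scored = [(overlap(story_sentence, c.what_happened), c) for c in corpus]
    # Keep only clips that share at least one word, best matches first.
    matches = [(s, c) for s, c in scored if s > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in matches[:top_k]]


if __name__ == "__main__":
    corpus = [
        Clip("a", "the farmer plants rice"),
        Clip("b", "children play in the city"),
        Clip("c", "the farmer harvests rice in autumn", feeling="relief"),
    ]
    for clip in retrieve("farmer plants rice", corpus, top_k=2):
        print(clip.name, clip.what_happened)
```

In this sketch the user only types a story sentence and gets candidate clips back. Mapping the free-text feeling onto affect dimensions, and filling in a default when it is missing, are left out because the design of those steps is still undecided.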