Abstract:
This project aims to build a machine learning model that can construct an emotional graph of a screenplay by predicting the emotional impact of each scene in the screenplay on the audience based on the predefined emotional categories.
A model of this kind has many possible use cases, which are discussed in this proposal. Its primary purpose is to give screenwriters a more objective, third-person perspective on how their screenplays will affect an audience. It does this by comparing a screenplay’s emotional graph with those of past screenplays that have similar graphs, so that writers can better understand the strengths and weaknesses of their own work. In other words, it is intended to give feedback on a screenplay against a well-defined and consistent benchmark: the emotional graph of the screenplay.
The recent explosion of large language models and natural language processing has enabled machine learning models to assist users in a variety of fields, but their application to screenplay analysis remains relatively unexplored. This project aims to address this gap.
1 Introduction
1.1 Problem statement
Screenwriters often give their finished screenplays to trusted readers to gain a third-person perspective on how the screenplay affects an audience emotionally, or to learn how an audience reacts to each scene. But this feedback is usually subjective and time-consuming to obtain.
With an unprecedented number of films currently being produced, thanks to streaming services, studio executives face a similar problem: they spend a great deal of money, time and effort analysing screenplays and deciding which can be made into movies. These methods are often subjective, inconsistent and labour-intensive.
Large language models have proliferated recently and are being fine-tuned for tasks ranging from medical advice to financial assistance. But they have not been explored well enough for helping screenwriters analyse their screenplays or for giving them feedback.
1.2 Background Research
1.2.1 Importance of emotions in screenplay writing
In his book ‘Story’, Robert McKee [2] explains the importance of emotional beats in structuring an engaging story. Placing the right emotions at the right points in a screenplay is thus one of the important factors in audience engagement. The more an audience engages with a story, the better its chances of succeeding as a film.
There have long been attempts to quantify how engaging a screenplay is, and there are various abstract methods for evaluating a screenplay’s effectiveness. Syd Field [3] explains some of these, such as the three-act structure, characters’ emotional arcs and character development.
1.2.2 Previous works on analysing screenplays
Though a screenplay is easier to analyse computationally than a novel, thanks to its subdivision into scenes and dialogue [1], comparatively little work has been done on the computational analysis of screenplays. This has been due to the lack of proper benchmarks against which screenplays can be compared and analysed. Considerable work has been done on constructing the emotional trajectories of novels. Reagan, A.J. et al. constructed the emotional arcs of 1,327 stories from the Project Gutenberg corpus to study the six basic shapes of storytelling and explored which shapes are more likely to succeed [4].
Hoyt, J. et al. built a tool called ‘ScripThreads’ to track the character interactions in a given screenplay [1]. Frangidis, P. et al. used an emotion-analysed concatenation of movie scripts and their respective reviews to predict movie ratings [5]. Beyond these, little significant literature could be found on the computational analysis of screenplays.
1.2.3 Advances in Natural Language Processing
With the introduction of the transformer architecture in 2017 by Vaswani, A. et al., there has been an explosion in natural language processing and large language models. Transformer models improve on recurrent neural networks and long short-term memory models in that the attention mechanism lets them retain context across long sentences, and they take significantly less time to train [6]. This has resulted in a large number of pretrained transformer-based models, such as RoBERTa, many of which are distributed through libraries such as Hugging Face. These pretrained models enable transfer learning.
Figure 1. Image credit: Vaswani, A. et al. (2017). The Transformer – model architecture. Attention is all you need.
1.2.4 Transfer Learning
Transfer learning is the process in which pretrained models are fine-tuned for a specific use case. It is especially useful when the training dataset is small [7]. For example, the Hugging Face [8] library provides transformer-based models pretrained on large text corpora, which are effective when fine-tuned for sentiment analysis and emotional classification, a significant task in this project.
1.2.5 Emotional Classification and Sentiment Analysis
Since this project is concerned with detecting the emotional trajectories in scripts, two methods for detecting emotions in text are relevant: emotional classification and sentiment analysis. Kim, E. et al. differentiate the two as follows [9]:
Sentiment analysis: It places a text on the spectrum between positive and negative feeling.
Emotional analysis: It is the detection of the underlying emotion itself.
This project explores both methods to identify the better one in terms of accuracy, usability and scalability. Emotional analysis requires discrete emotional categories that can be used to annotate the data and build a classification model. For centuries, scholars have attempted to identify and classify human emotions. Popular theories include Ekman’s theory of basic emotions, Plutchik’s wheel of emotions and Russell’s circumplex model [9].
Figure 2. A small scale representation of emotional trajectory using sentiment analysis
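As a rough illustration of how a sentiment-based trajectory such as the one in Figure 2 could be drawn, the sketch below smooths per-scene sentiment scores with a centered moving average. The scores, the [-1, 1] scale and the function name smooth_trajectory are assumptions made for illustration; the project has not yet fixed a scoring method.

```python
from typing import List


def smooth_trajectory(scores: List[float], window: int = 3) -> List[float]:
    """Centered moving average over per-scene sentiment scores.

    The window shrinks at the edges so the output has the same length
    as the input. An odd window size is assumed.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical per-scene scores in [-1, 1]; these are not real model output.
scene_scores = [0.2, 0.6, -0.4, -0.8, 0.0, 0.9]
trajectory = smooth_trajectory(scene_scores, window=3)
```

Smoothing is only one possible design choice here; it trades scene-level detail for a curve whose overall shape is easier to compare across screenplays.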
1.2.6 Emotional Categories
Ekman’s theory of basic emotions identifies six emotions as discrete categories that do not overlap. These six categories are joy, surprise, sadness, anger, fear and disgust [10]. While Ekman’s theory treats emotions as discrete categories, Plutchik’s and Russell’s models represent emotions along multiple dimensions [9]. Though these models could measure the intensity of emotions on a continuous scale for a richer representation, they make classification models more complex. For this reason, most computational approaches to emotional classification use Ekman’s six basic emotions for annotation.
For this project, Ekman’s six basic emotions will be used to classify the scenes in a screenplay. An additional category, ‘neutral’, is added to the six basic emotions, since exposition is common in screenplays and is usually neutral in the emotion it conveys.
Figure 3. A small scale representation of emotional graphs
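One way the seven categories and an emotional graph could be represented in code is sketched below. The one-series-per-emotion layout and the names Emotion and emotional_graph are assumptions for illustration only; the final representation will be decided in the design phase.

```python
from enum import Enum
from typing import Dict, List


class Emotion(Enum):
    """Ekman's six basic emotions plus the added 'neutral' category."""
    JOY = "joy"
    SURPRISE = "surprise"
    SADNESS = "sadness"
    ANGER = "anger"
    FEAR = "fear"
    DISGUST = "disgust"
    NEUTRAL = "neutral"


def emotional_graph(scene_labels: List[Emotion]) -> Dict[Emotion, List[int]]:
    """For each category, return a 0/1 series over scenes.

    A 1 at position i means scene i was labelled with that emotion.
    """
    return {e: [1 if label is e else 0 for label in scene_labels] for e in Emotion}


# Hypothetical labels for a five-scene excerpt.
labels = [Emotion.NEUTRAL, Emotion.JOY, Emotion.FEAR, Emotion.FEAR, Emotion.SADNESS]
graph = emotional_graph(labels)
```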
1.3 Objectives
At a high level, this project consists of three main tasks:
- Since no existing dataset is suitable for this specific purpose, this project aims to build an annotated dataset of screenplay scenes mapped to the emotions they convey, together with additional features that provide context for each scene.
- This project leverages advances in natural language processing and pretrained large language models to train a model that classifies the scenes of a screenplay into a discrete set of emotional categories and plots a graph to visualize the emotional trajectory of a given screenplay.
- Using a larger collection of graphs of past screenplays, the project explores the effectiveness of this graph as a benchmark for comparing and analysing screenplays through similarity learning.
These tasks are designed to help the user (the screenwriter) better understand the emotional trajectory of their screenplay and compare it with previously released films that have similar emotional trajectories.
2 Proposed Project Pipeline
2.1 Data collection
The most challenging task of this project will be the data collection. There are many pretrained language models for sentiment analysis and emotional classification, but they are mostly focused on short or opinionated texts. This project requires the model to be trained on screenplays, so a custom training dataset must be built specifically for this task. For this dataset, a large number of movie screenplays need to be collected and parsed into individual scenes. The data will be collected from the IMSDb website [11], which is a collection of more than 10,000 screenplays.
Another important aspect of data collection will be the use of parsers to divide each screenplay into individual scenes and dialogue. Tools are available for this; for example, Hoyt, J. et al. used custom-built parsers to extract individual scenes from a screenplay [1].
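A minimal sketch of scene splitting is shown here. It is not the parser used by Hoyt, J. et al.; it simply assumes that each scene begins with a conventional scene heading starting with INT. or EXT., which will not hold for every script. The names SLUGLINE and split_scenes are hypothetical.

```python
import re
from typing import List

# Scene headings ("sluglines") conventionally start with INT. or EXT.
# This pattern is an assumption and will miss non-standard formatting.
SLUGLINE = re.compile(
    r"^[ \t]*(?:INT\.?/EXT\.?|INT\.|EXT\.)[ \t]+\S.*$",
    re.MULTILINE,
)


def split_scenes(screenplay: str) -> List[str]:
    """Split a screenplay into scenes, each starting at a scene heading.

    Text before the first heading (title page, FADE IN, etc.) is dropped.
    If no heading is found, the whole text is returned as a single scene.
    """
    starts = [m.start() for m in SLUGLINE.finditer(screenplay)]
    if not starts:
        return [screenplay.strip()] if screenplay.strip() else []
    bounds = starts + [len(screenplay)]
    return [screenplay[a:b].strip() for a, b in zip(bounds, bounds[1:])]


sample = """FADE IN:

INT. KITCHEN - MORNING

ANNA pours coffee.

EXT. STREET - DAY

A car speeds past.
"""
scenes = split_scenes(sample)
```

Scripts collected from the web vary widely in formatting, so a real parser would likely need additional rules and manual spot checks.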
The number of data points used for this type of project is usually in the thousands. But the scope of this project limits it to about 200 screenplays, to be parsed and annotated if time allows. These will be screenplays of popular Hollywood films.
2.2 Building the dataset
To build the dataset, the individual scenes of each screenplay need to be annotated by a human with the emotion the scene conveys. Third-party services such as Amazon Mechanical Turk exist for this purpose, but they are expensive and out of the scope of this project. Hence, this task will be done with the help of friends and family through a survey that presents the scenes and asks each person to classify them into one of the seven emotions. It is important that the person has seen the movie before, so that it is easier to remember the emotion they felt the first time they watched that scene. By doing this with multiple people, the scenes can be mapped to their emotions.
But this will not be enough, since deep learning models perform better when additional features provide context for the scenes. These additional features will be the names of the protagonist, antagonist and side characters, the scene description and the dialogue. These are only preliminary ideas and will be explored properly during the design phase of the project.
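The survey responses from multiple people could be combined into a single label per scene in several ways. The sketch below uses a simple majority vote and leaves a scene unlabelled when no emotion wins a clear majority. The function name majority_label and the 0.5 threshold are assumptions, not a settled part of the design.

```python
from collections import Counter
from typing import List, Optional


def majority_label(votes: List[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the most common label if its share of votes exceeds min_agreement.

    Returns None when there are no votes or when agreement is too low,
    so that ambiguous scenes can be reviewed or discarded.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_agreement else None


# Hypothetical survey answers for two scenes.
clear_scene = majority_label(["fear", "fear", "anger"])
ambiguous_scene = majority_label(["joy", "sadness"])
```

Scenes that come back as None are a useful signal in themselves, since they may be emotionally mixed or poorly remembered by the annotators.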
2.3 Training
For training, the project uses transfer learning. Hugging Face hosts a large library of language models, many of which are pretrained for emotion detection and sentiment analysis. These pretrained models will be fine-tuned on the custom-built dataset. The project report will compare the performance of the different models and select the best one.
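The setup for fine-tuning might begin as sketched below: a label mapping for the seven categories and a helper that loads a pretrained checkpoint with a new seven-way classification head. The checkpoint name roberta-base and the helper name load_classifier are assumptions chosen for illustration; the models actually compared will be selected during the project.

```python
from typing import Dict, List

# The seven categories used in this project (Ekman's six plus neutral).
LABELS: List[str] = ["joy", "surprise", "sadness", "anger", "fear", "disgust", "neutral"]
ID2LABEL: Dict[int, str] = dict(enumerate(LABELS))
LABEL2ID: Dict[str, int] = {label: i for i, label in ID2LABEL.items()}


def load_classifier(checkpoint: str = "roberta-base"):
    """Load a pretrained encoder with a fresh 7-way classification head.

    Requires the `transformers` package and downloads the checkpoint on
    first use. It is imported lazily so the label maps above can be used
    without it. The default checkpoint is an assumption, not a final choice.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model
```

Passing the label maps to the model keeps predictions readable as emotion names rather than bare class indices.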
2.3 Testing
The test set will be an additional dataset of 50 screenplays, which will be used to measure the accuracy of the model.
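A minimal sketch of how the model could be scored on the test scenes follows. Besides accuracy, it computes a per-class F1 score, which is my own addition here, on the assumption that some emotions will be much rarer than others. The labels in the example are made up.

```python
from typing import Dict, List


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of scenes whose predicted label matches the annotation."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """F1 score for every label that appears in either list."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        # F1 = 2TP / (2TP + FP + FN); the denominator is at least 1 here
        # because the label occurs in at least one of the two lists.
        scores[label] = 2 * tp / (2 * tp + fp + fn)
    return scores


# Hypothetical annotations and predictions for four scenes.
y_true = ["joy", "fear", "fear", "neutral"]
y_pred = ["joy", "fear", "anger", "neutral"]
acc = accuracy(y_true, y_pred)
f1 = per_class_f1(y_true, y_pred)
```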
2.4 Choosing the programming language
Python is currently the most widely used programming language for machine learning tasks. It also has many popular libraries for data science tasks, such as PyTorch and TensorFlow. This project will use PyTorch along with Hugging Face language models, which include state-of-the-art machine learning models. All of this will be done in Jupyter Notebooks.
2.5 Creating a database of emotional graphs of past screenplays
If the model is accurate enough, it will be used to create a library of emotional graphs of past screenplays. This library can be used to compare similar screenplays and, if time allows, to study patterns in screenplays and how they relate to a screenplay’s probability of success.
2.6 Similarity Learning
The graph created from a screenplay is compared with those of well-established screenplays using a similarity measure. The emotional trajectories, also called the feature data, are fed into a deep neural network to generate embeddings. These embeddings have fewer dimensions than the feature data itself but can be used to capture the latent structure of the emotional trajectories.
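Before any embeddings are learned, a simple baseline could be useful for comparison. The sketch below is not the neural approach described above: it resamples two numeric trajectories to a common length by linear interpolation and compares them with cosine similarity. The function names and the default length of 20 points are assumptions.

```python
import math
from typing import List


def resample(series: List[float], length: int) -> List[float]:
    """Linearly interpolate a series onto `length` evenly spaced points."""
    if not series or length < 1:
        raise ValueError("series must be non-empty and length at least 1")
    if len(series) == 1 or length == 1:
        return [series[0]] * length
    out = []
    for i in range(length):
        pos = i * (len(series) - 1) / (length - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(series) - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def trajectory_similarity(a: List[float], b: List[float], length: int = 20) -> float:
    """Compare two trajectories of different scene counts on a common grid."""
    return cosine_similarity(resample(a, length), resample(b, length))
```

Resampling is needed because screenplays differ in their number of scenes; a learned embedding would have to handle the same problem in some form.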
2.7 User Interface
The graphs will be constructed with the matplotlib and seaborn libraries. The graphical user interface will be designed with Gradio, a Python framework for building machine learning apps. The app will take a screenplay as input and provide the graph as output. A similarity learning algorithm will then identify past screenplays with similar emotional trajectories, and their successes and criticisms will be displayed using review and ratings APIs available online.
3 Extended Functionalities
The primary aim of this project is to provide feedback and a different perspective on the user’s screenplay. The emotional graph is the approach taken to do this because it promises to be a better benchmark for comparing and analysing movie scripts. Beyond this, it has other possible use cases:
- It can be used to study a person’s taste in movies, where that person may be inclined towards a specific shape of emotional trajectory. If found to be effective, it can add another dimension to recommendation systems.
- It can be used to predict the success of a movie script by comparing it with past screenplays and their box office collections or movie ratings. If found to be effective, it can be used by studio executives to select screenplays or fine-tune them accordingly.
These are just a few of the use cases that might be possible with this project.
As this project progresses I will upload more posts about my methodologies.