

Question


For application case 4.6 – Data Mining Goes to Hollywood, describe the research study, the methodology, the results and the conclusion.

Data Mining Goes to Hollywood: Predicting Financial Success of Movies

Predicting box-office receipts (i.e., financial success) of a particular motion picture is an interesting and challenging problem. According to some domain experts, the movie industry is the “land of hunches and wild guesses” due to the difficulty associated with forecasting product demand, making the movie business in Hollywood a risky endeavor. In support of such observations, Jack Valenti (the longtime president and CEO of the Motion Picture Association of America) once mentioned that “…no one can tell you how a movie is going to do in the marketplace…not until the film opens in darkened theatre and sparks fly up between the screen and the audience.” Entertainment industry trade journals and magazines have been full of examples, statements, and experiences that support such a claim. Like many other researchers who have attempted to shed light on this challenging real-world problem, Ramesh Sharda and Dursun Delen have been exploring the use of data mining to predict the financial performance of a motion picture at the box office before it even enters production (while the movie is nothing more than a conceptual idea). In their highly publicized prediction models, they convert the forecasting (or regression) problem into a classification problem; that is, rather than forecasting the point estimate of box-office receipts, they classify a movie based on its box-office receipts in one of nine categories, ranging from “flop” to “blockbuster,” making the problem a multinomial classification problem. Table 5.4 illustrates the definition of the nine classes in terms of the range of box-office receipts.
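The regression-to-classification conversion described above can be sketched in a few lines of Python. This is only an illustration: the cut points in CLASS_BOUNDS and the function name receipts_to_class are assumptions made for the example, since the actual class ranges are those defined in Table 5.4, which is not reproduced here.

```python
from bisect import bisect_right

# Hypothetical upper bounds (in millions of USD) separating the nine classes.
# These cut points are placeholders for illustration only; the real ranges
# are the ones defined in Table 5.4 of the case.
CLASS_BOUNDS = [1, 10, 20, 40, 65, 100, 150, 200]


def receipts_to_class(receipts_musd: float) -> int:
    """Map box-office receipts to a class from 1 (flop) to 9 (blockbuster)."""
    # bisect_right counts how many bounds are <= the receipts value,
    # which is the zero-based index of the class the movie falls into.
    return bisect_right(CLASS_BOUNDS, receipts_musd) + 1


# Four made-up receipt figures, converted from point values to class labels.
labels = [receipts_to_class(x) for x in (0.4, 15.0, 72.5, 310.0)]
print(labels)
```

Once every movie carries a class label instead of a dollar figure, any multiclass classifier can be trained on the data set.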

Data

Data was collected from a variety of movie-related databases (e.g., ShowBiz, IMDb, IMSDb, AllMovie, etc.) and consolidated into a single data set. The data set for the most recently developed models contained 2,632 movies released between 1998 and 2006. A summary of the independent variables along with their specifications is provided in Table 5.5. For more descriptive details and justification for inclusion of these independent variables, the reader is referred to Sharda and Delen (2007).

Business Intelligence Spring 2017

Methodology

Using a variety of data mining methods, including neural networks, decision trees, support vector machines, and three types of ensembles, Sharda and Delen developed the prediction models. The data from 1998 to 2005 were used as training data to build the prediction models, and the data from 2006 were used as the test data to assess and compare the models’ prediction accuracy. Figure 5.15 shows a screenshot of IBM SPSS Modeler (formerly the Clementine data mining tool) depicting the process map employed for the prediction problem. The upper-left side of the process map shows the model development process, and the lower-right corner of the process map shows the model assessment (i.e., testing or scoring) process (more details on the IBM SPSS Modeler tool and its usage can be found on the book’s Web site).
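The year-based split described above (1998 to 2005 for training, 2006 for testing) might look like the following in Python. The study itself used IBM SPSS Modeler, so this is a sketch rather than the authors' workflow; the Movie record, its fields, and the sample rows are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Movie(NamedTuple):
    # Hypothetical record layout; the real variables are listed in Table 5.5.
    title: str
    year: int
    success_class: int  # 1 (flop) to 9 (blockbuster)


def split_by_year(
    movies: List[Movie], first_year: int = 1998, test_year: int = 2006
) -> Tuple[List[Movie], List[Movie]]:
    """Train on releases before test_year, hold out test_year for assessment."""
    train = [m for m in movies if first_year <= m.year < test_year]
    test = [m for m in movies if m.year == test_year]
    return train, test


# Made-up rows, only to show the mechanics of the split.
movies = [
    Movie("A", 1998, 3),
    Movie("B", 2005, 7),
    Movie("C", 2006, 5),
    Movie("D", 2006, 1),
]
train, test = split_by_year(movies)
print(len(train), "training movies;", len(test), "test movies")
```

Holding out the most recent year, rather than a random sample, mimics how the models would be used in practice: trained on past releases and applied to upcoming ones.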

Results

Table 5.6 provides the prediction results of all three data mining methods as well as the results of the three different ensembles. The first performance measure is the percent correct classification rate, which is called bingo. Also reported in the table is the 1-Away correct classification rate (i.e., within one category). The results indicate that SVM performed the best among the individual prediction models, followed by ANN; the worst of the three was the CART decision tree algorithm. In general, the ensemble models performed better than the individual prediction models, and among the ensembles the fusion algorithm performed the best. What is probably more important to decision makers, and what stands out in the results table, is the significantly lower standard deviation obtained from the ensembles compared to the individual models.
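The two accuracy measures named above are simple to compute once actual and predicted classes are available as integers from 1 to 9. The sketch below uses made-up labels, not the figures from Table 5.6, and the function names are chosen for this example.

```python
from typing import Sequence


def _check(actual: Sequence[int], predicted: Sequence[int]) -> None:
    # Both sequences must describe the same, non-empty set of movies.
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")


def bingo_rate(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Share of movies whose predicted class matches the actual class exactly."""
    _check(actual, predicted)
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def one_away_rate(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Share of movies predicted within one class of the actual class."""
    _check(actual, predicted)
    hits = sum(abs(a - p) <= 1 for a, p in zip(actual, predicted))
    return hits / len(actual)


# Made-up labels for five movies, for illustration only.
actual = [1, 4, 9, 6, 3]
predicted = [1, 5, 7, 6, 2]
print(bingo_rate(actual, predicted), one_away_rate(actual, predicted))
```

The 1-Away rate is always at least as high as the bingo rate, because every exact hit also counts as being within one category.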

Conclusion

The researchers claim that these prediction results are better than any reported in the published literature for this problem domain. Beyond the attractive accuracy of their prediction results of the box-office receipts, these models could also be used to further analyze (and potentially optimize) the decision variables in order to maximize the financial return. Specifically, the parameters used for modeling could be altered using the already trained prediction models in order to better understand the impact of different parameters on the end results. During this process, which is commonly referred to as sensitivity analysis, the decision maker of a given entertainment firm could find out, with a fairly high accuracy level, how much value a specific actor (or a specific release date, or the addition of more technical effects, etc.) brings to the financial success of a film, making the underlying system an invaluable decision aid.
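The sensitivity analysis described above amounts to holding a movie's other inputs fixed, varying one decision variable, and re-scoring with the already trained model. A minimal sketch follows, assuming a stand-in scoring function: toy_model, its weights, and the variable names star_power, special_effects, and holiday_release are all hypothetical and do not come from the study.

```python
from typing import Callable, Dict, Iterable


def toy_model(movie: Dict) -> int:
    """Stand-in for a trained classifier; returns a class from 1 to 9.

    The weights are invented for illustration and carry no empirical meaning.
    """
    score = (
        2
        + 2 * movie["star_power"]
        + movie["special_effects"]
        + (1 if movie["holiday_release"] else 0)
    )
    return max(1, min(9, score))


def sensitivity(
    model: Callable[[Dict], int], baseline: Dict, variable: str, values: Iterable
) -> Dict:
    """Re-score the baseline movie while varying a single input variable."""
    results = {}
    for value in values:
        scenario = {**baseline, variable: value}  # copy; baseline is untouched
        results[value] = model(scenario)
    return results


# Hypothetical movie concept and a sweep over one decision variable.
baseline = {"star_power": 1, "special_effects": 1, "holiday_release": False}
effect = sensitivity(toy_model, baseline, "star_power", [0, 1, 2])
print(effect)
```

With a real trained model in place of toy_model, the same loop would show how the predicted class shifts as a single casting or release decision changes.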

Answer & Explanation
Research Study: This is a research study in which a number of software tools and data mining techniques are used to build models that predict the financial success (box-office receipts) of Hollywood movies while they are nothing more than ideas (pre-release). Predicting box…