A Jump Into the Android App Store: What Makes a Best-Selling App?
David Agogo
University of Massachusetts, Amherst

Abstract
Within the past decade, the explosive rise in the use of smartphones has led to the establishment of the multi-billion dollar mobile app ecosystem with the app store at its center. Using data mining approaches, an archive of information on 1.1 million apps is probed for interesting characteristics of top-selling apps. From analyses of subsets of the data archive, paid and free apps of above-average popularity are predicted with 64% accuracy using bootstrap forests. Useful insights on the most successful app categories, and on how best to interpret review ratings, are discovered.

Comments on Methods
Data mining methods consistently outperformed parametric statistics (OLS).
Results held strongly in the hold-out validation sample.
However, model predictors performed worse in the analysis of paid apps only.

Objectives
Discover characteristics common to apps with the highest number of installs.
Discover which app categories are most popular, after controlling for other variables.
Discover the differences that exist between the worlds of paid and unpaid apps.

Predictors
Derived Predictors:
Average App Rating (5-point scale)
Proportion of Reviewers that Loved it (5/5)
Proportion of Reviewers that Hated it (1/5)
Presence in Other App Stores
Number of Weeks in App Store
Estimated Number of Release Cycles
Earliest Android Compatibility
Original Predictors:
App Category
Created by Top Developer
App Price
App Size
Presence of In-App Purchases

Findings
Data mining methods consistently outperform parametric statistics (OLS).
Results generally hold strongly in the hold-out validation sample.
Model predictors perform poorly in predicting correlates of number of installs for paid apps.

Other Highlights
Median = Mean success is 1,000 to 5,000 installations.
Largest Effect: IsTopDeveloper, IsFree, OtherStores.
Best Categories for Top Developers: Weather, Productivity, Transportation, and Brain Apps.
Worst Categories for Top Developers: Sports, News and Magazines, Travel and Local.
Categories with Highest Installs of Free Apps: Racing, Communication, Sports Games, Music and Audio, Brain Apps.
Categories with Lowest Installs of Free Apps: News and Magazines, Travel and Local, Sports, Entertainment, and Medical Apps.

Data
1.1 million rows of metadata crawled from the Play Store.
Analysis performed on two randomly selected subsets:
N1 = 27,981 randomly selected rows
N2 = 30,000 paid apps ONLY
70:30 training to hold-out validation ratio.

Methods
Ordinary Least Squares
CHAID/CART Decision Trees
Bootstrap Forest [best performing]
Neural Net
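
The three rating-based derived predictors and the 70:30 training to hold-out split listed on the poster could be computed roughly as follows. This is only a sketch: the poster does not describe the crawled schema or the tooling used, so the per-star review counts, the function names, and the fixed random seed here are all assumptions made for illustration.

```python
import random


def derive_rating_predictors(star_counts):
    """Compute rating predictors from per-star review counts.

    star_counts is a hypothetical dict mapping a star value (1..5) to the
    number of reviews with that value; the real metadata layout is unknown.
    """
    total = sum(star_counts.get(star, 0) for star in range(1, 6))
    if total == 0:
        # An app with no reviews has no defined rating predictors.
        return {"avg_rating": None, "prop_loved": None, "prop_hated": None}
    weighted = sum(star * star_counts.get(star, 0) for star in range(1, 6))
    return {
        "avg_rating": weighted / total,                # 5-point scale
        "prop_loved": star_counts.get(5, 0) / total,   # share of 5/5 reviews
        "prop_hated": star_counts.get(1, 0) / total,   # share of 1/5 reviews
    }


def split_train_holdout(rows, train_fraction=0.7, seed=0):
    """Shuffle rows and split them into training and hold-out samples."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Made-up review counts for a single hypothetical app.
    print(derive_rating_predictors({1: 10, 2: 5, 3: 5, 4: 20, 5: 60}))
    train, holdout = split_train_holdout(range(100))
    print(len(train), len(holdout))
```

Keeping the proportions of 5/5 and 1/5 reviews alongside the average lets a model distinguish an app with polarized reviews from one with uniformly middling reviews, even when both have the same mean rating.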