1 MIS2502: Data Analytics. The Things You Can Do With Data: The Information Architecture of an Organization. Jaclyn Hansberry
2 Why Data Analytics? 40% of decisions by managers are made by using their “gut”. 61% say this is because there is “no good data”. 72% want to increase their organization’s use of business intelligence. Source:
3 It all starts with data. Data: gathering, storing, retrieving, interpreting. Almost every business action requires at least one of these!
4 Data versus information. Data: discrete, unorganized, raw facts. Information: the transformation of those facts into meaning.
5 Examples of data. Data: quantity sold, course enrollment, customer name, discount, star rating. Information: ?
6 So then how do you turn data into information?
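One way to make the question concrete: individual star ratings are data, and the average rating per product is information. The sketch below is only an illustration; the product names and ratings are made up, and `average_rating` is a hypothetical helper, not something from the course materials.

```python
from collections import defaultdict

# Hypothetical raw data: one (product name, star rating) pair per review.
reviews = [
    ("Headphones", 5),
    ("Headphones", 3),
    ("Keyboard", 4),
    ("Headphones", 4),
    ("Keyboard", 2),
]

def average_rating(rows):
    """Summarize raw ratings into an average rating per product."""
    totals = defaultdict(lambda: [0, 0])  # product -> [sum of stars, review count]
    for product, stars in rows:
        totals[product][0] += stars
        totals[product][1] += 1
    return {product: total / count for product, (total, count) in totals.items()}

summary = average_rating(reviews)
for product, avg in sorted(summary.items()):
    print(f"{product}: {avg:.1f} stars")
```

The raw list answers "what did each reviewer say?"; the summary answers "how well is each product rated?", which is the kind of meaning a manager can act on.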
7 Example: Memphis Police. Used historical crime data. Placed police where and when crime was likely to occur.
8 Example: New York Mets. Look at fan ticket purchase, social media, and mobile data. Personalize communications and promotions. Growth in corporate sales; ticket base now
9 Two types of data. Transactional: captures data describing an event; an exchange between actors; real-time. Analytical: captures data to support analysis and reporting; an aggregated view of the business; historical. Explain the role of transactional and analytical data in the examples on the previous slides.
10 The Information Architecture of an Organization. Data entry -> Transactional Database -> Data extraction -> Analytical Data Store -> Data analysis. Transactional Database: stores real-time transactional data; called OLTP (online transaction processing). Analytical Data Store: stores historical transactional and summary data; called OLAP (online analytical processing). But this is changing rapidly….
11 Components of an information infrastructure. Transactional Database: supports management of an organization’s data; for everyday transactions; this is what is commonly thought of as “database management”. Analytical Data Store: supports managerial decision-making; for periodic analysis; this is the foundation for business intelligence.
12 The Transactional Database. Stores real-time, transactional data. In business, a transaction is the exchange of information, goods, or services. For databases, a transaction is an action performed in a database management system. Operational databases deal with both: they store information about business transactions using database transactions. Examples of transactions: purchase a product, enroll in a course, hire an employee. Data is in real time: it reflects the current state, how things are “now”.
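The two meanings of "transaction" can be sketched with Python's built-in sqlite3 module: one business transaction (enrolling in a course) is recorded using one database transaction that either commits fully or is rolled back. The `Enrollment` table and the `enroll` helper are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a student may enroll in a given course only once.
conn.execute(
    "CREATE TABLE Enrollment (StudentID INTEGER, CourseID TEXT, "
    "UNIQUE (StudentID, CourseID))"
)

def enroll(conn, student_id, course_id):
    """Record one business transaction (an enrollment) as one database transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO Enrollment VALUES (?, ?)", (student_id, course_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the duplicate enrollment was rejected and rolled back

first = enroll(conn, 1, "MIS2502")
duplicate = enroll(conn, 1, "MIS2502")
count = conn.execute("SELECT COUNT(*) FROM Enrollment").fetchone()[0]
print(first, duplicate, count)
```

After both calls the table still reflects the current state (one enrollment), which is the "how things are now" property of an operational database.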
13 The Relational Paradigm. How transactional data is collected and stored. Primary goal: minimize redundancy (reduce errors; less space required). Most database management systems are based on the relational paradigm: Oracle, Microsoft Access, SQL Server. Which of these benefits (fewer errors or less space) do you think is more important today?
14 The Relational Database: Online Retailer Example. A series of tables with logical associations between them. The associations (relationships) allow the data to be combined. Product table: ProductID, Name, Description, Price, Shipping, SalesRank. Review table: ReviewID, ProductID, StarRating, Text, ReviewerName, Likes.
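The Product and Review tables can be sketched in SQLite through Python's sqlite3 module. The column names come from the slide; the column types, the sample rows, and the choice of SQLite are assumptions for illustration only. The shared ProductID column is the association between the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Product (
    ProductID   INTEGER PRIMARY KEY,
    Name        TEXT,
    Description TEXT,
    Price       REAL,     -- assumed type
    Shipping    REAL,     -- assumed type
    SalesRank   INTEGER
);
CREATE TABLE Review (
    ReviewID     INTEGER PRIMARY KEY,
    ProductID    INTEGER NOT NULL REFERENCES Product(ProductID),
    StarRating   INTEGER,
    Text         TEXT,
    ReviewerName TEXT,
    Likes        INTEGER
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO Product VALUES (1, 'Headphones', 'Over-ear', 59.99, 4.99, 120)")
conn.execute("INSERT INTO Review VALUES (10, 1, 5, 'Great sound', 'Sam', 3)")

# A review that points at a product that does not exist is rejected.
try:
    conn.execute("INSERT INTO Review VALUES (11, 999, 1, 'No such product', 'Alex', 0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

review_count = conn.execute("SELECT COUNT(*) FROM Review").fetchone()[0]
print(orphan_rejected, review_count)
```

Product details are stored once in Product; each review carries only the ProductID, which is how the relational design avoids repeating the name and price on every review.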
15 Why more than one table? Every review has an associated product. Every product can have a review. Products and reviews each have a unique ID number. Split the details off into separate tables. Product table: ProductID, Name, Description, Price, Shipping, SalesRank. Review table: ReviewID, ProductID, StarRating, Text, ReviewerName, Likes. This is good because: information is entered and stored once; minimizes redundancy.
16 Analyzing transactional data. Can be difficult to do from a relational database. Having multiple tables is good for storage and data integrity, but bad for analysis: tables must be “joined” together before analysis can be done. The solution is the Analytical Data Store. Operational databases are optimized for storage efficiency, not retrieval. Analytical databases are optimized for retrieval and analysis, not storage efficiency and data integrity.
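To see what "joined" means in practice, here is a minimal sketch that answers "what is the average star rating of each product?" against trimmed-down versions of the Product and Review tables. The sample rows are made up, and only the columns needed for the query are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Review  (ReviewID INTEGER PRIMARY KEY, ProductID INTEGER, StarRating INTEGER);
INSERT INTO Product VALUES (1, 'Headphones'), (2, 'Keyboard');
INSERT INTO Review  VALUES (10, 1, 5), (11, 1, 3), (12, 2, 4);
""")

# The product name lives in one table and the ratings in another,
# so the tables must be joined on ProductID before summarizing.
rows = conn.execute("""
    SELECT p.Name, AVG(r.StarRating) AS AvgRating, COUNT(*) AS Reviews
    FROM Product AS p
    JOIN Review AS r ON r.ProductID = p.ProductID
    GROUP BY p.ProductID, p.Name
    ORDER BY p.Name
""").fetchall()

for name, avg_rating, reviews in rows:
    print(name, avg_rating, reviews)
```

With two tables the join is easy to write; with dozens of tables and millions of rows, repeating such joins for every report is the cost that motivates a separate analytical store.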
17 The Analytical Data Store. Stores historical and summarized data. “Historical” means we keep everything. Data is extracted from the operational database and reformatted for the analytical database. Operational Database -> Extract, Transform, Load (data conversion) -> Analytical Database; each database can be queried. We’ll discuss this in much more detail later in the course!
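The extract, transform, load flow can be sketched as three small functions. This is a toy illustration, not the course's tooling: the `Sale` and `MonthlySales` tables, their columns, and the sample values are all hypothetical, and two in-memory SQLite databases stand in for the operational and analytical stores.

```python
import sqlite3

# Operational database: one row per sale (hypothetical table and values).
operational = sqlite3.connect(":memory:")
operational.executescript("""
CREATE TABLE Sale (SaleID INTEGER PRIMARY KEY, SaleDate TEXT, Amount REAL);
INSERT INTO Sale VALUES
    (1, '2024-01-05', 20.0),
    (2, '2024-01-20', 30.0),
    (3, '2024-02-02', 15.0);
""")

# Analytical database: one summarized row per month.
analytical = sqlite3.connect(":memory:")
analytical.execute(
    "CREATE TABLE MonthlySales (Month TEXT PRIMARY KEY, TotalAmount REAL, SaleCount INTEGER)"
)

def extract(conn):
    """Pull the raw transactional rows out of the operational database."""
    return conn.execute("SELECT SaleDate, Amount FROM Sale").fetchall()

def transform(rows):
    """Reformat the rows: summarize the sale amounts by month (YYYY-MM)."""
    summary = {}
    for sale_date, amount in rows:
        month = sale_date[:7]
        total, count = summary.get(month, (0.0, 0))
        summary[month] = (total + amount, count + 1)
    return [(month, total, count) for month, (total, count) in sorted(summary.items())]

def load(conn, rows):
    """Write the summarized rows into the analytical database."""
    with conn:
        conn.executemany("INSERT INTO MonthlySales VALUES (?, ?, ?)", rows)

load(analytical, transform(extract(operational)))
monthly = analytical.execute("SELECT * FROM MonthlySales ORDER BY Month").fetchall()
print(monthly)
```

The operational table keeps its one-row-per-sale detail, while the analytical table holds the reformatted summary that reports can query directly.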
18 The Dimensional Paradigm. Data is stored like this around a business event… …and can be summarized like this for analysis…
19 Dimensional Data and the Data Cube. …or it can be expanded in detail like this so that data mining (complex statistical analysis) can be done. Sales Fact: Sales ID, Qty. Sold, Total Price. Product Dimension: Prod. ID, Prod. Name, Prod. Price, Prod. Weight. Store Dimension: Store ID, Store Address, Store City, Store State, Store Type. Time Dimension: Time ID, Day, Month, Year. Sample rows: Sales ID 1000, 1001, 1002.
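A rough sketch of the fact and dimension tables in SQLite follows. The column names and the Sales IDs 1000 to 1002 follow the slide. Everything else is an assumption for illustration: the table names, the column types, the ID columns that link the fact table to each dimension (a common star-schema layout), and all other sample values. The final query summarizes the sales fact by two dimensions, which is one cell-by-cell view of the data cube.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductDim (ProdID INTEGER PRIMARY KEY, ProdName TEXT, ProdPrice REAL, ProdWeight REAL);
CREATE TABLE StoreDim   (StoreID INTEGER PRIMARY KEY, StoreAddress TEXT, StoreCity TEXT,
                         StoreState TEXT, StoreType TEXT);
CREATE TABLE TimeDim    (TimeID INTEGER PRIMARY KEY, Day INTEGER, Month INTEGER, Year INTEGER);
-- Assumed layout: the fact table references each dimension by its ID.
CREATE TABLE SalesFact (
    SalesID    INTEGER PRIMARY KEY,
    QtySold    INTEGER,
    TotalPrice REAL,
    ProdID     INTEGER REFERENCES ProductDim(ProdID),
    StoreID    INTEGER REFERENCES StoreDim(StoreID),
    TimeID     INTEGER REFERENCES TimeDim(TimeID)
);
-- Made-up sample values.
INSERT INTO ProductDim VALUES (1, 'Widget', 2.5, 0.4), (2, 'Gadget', 10.0, 1.2);
INSERT INTO StoreDim VALUES (1, '1 Main St', 'Philadelphia', 'PA', 'Urban'),
                            (2, '9 Oak Ave', 'Trenton', 'NJ', 'Suburban');
INSERT INTO TimeDim VALUES (1, 15, 1, 2024), (2, 3, 2, 2024);
INSERT INTO SalesFact VALUES (1000, 4, 10.0, 1, 1, 1),
                             (1001, 1, 10.0, 2, 1, 2),
                             (1002, 2,  5.0, 1, 2, 2);
""")

# Summarize the business event (sales) along two dimensions: store state and month.
cube = conn.execute("""
    SELECT s.StoreState, t.Month, SUM(f.QtySold), SUM(f.TotalPrice)
    FROM SalesFact AS f
    JOIN StoreDim AS s ON s.StoreID = f.StoreID
    JOIN TimeDim  AS t ON t.TimeID  = f.TimeID
    GROUP BY s.StoreState, t.Month
    ORDER BY s.StoreState, t.Month
""").fetchall()

for state, month, qty, total in cube:
    print(state, month, qty, total)
```

Swapping the GROUP BY columns (for example, product name and year) gives a different summary of the same facts, while joining all four tables without grouping gives the expanded, one-row-per-sale detail used for data mining.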
20 Comparing Operational and Analytical Data Stores
Operational Data Store | Analytical Data Store
Based on relational paradigm | Based on dimensional paradigm
Storage of real-time transactional data | Storage of historical transactional data
Optimized for storage efficiency and data integrity | Optimized for data retrieval and summarization
Supports day-to-day operations | Supports periodic and on-demand analysis
21 The agenda for the course. Weeks 1 through 5; Weeks 6 through 9; Weeks 10 through 14. Data entry -> Transactional Database -> Data extraction -> Analytical Data Store -> Data analysis. Transactional Database: stores real-time transactional data. Analytical Data Store: stores historical transactional and summary data. Data analysis: data interpretation, visualization, communication.