1 Introduction to Machine Learning. Vladimir Jojic, March 1st 2017
2 Machine Learning Applications all around you: product recommendation, speech recognition, (news) feed selection, traffic prediction, self-driving cars, home automation, automated stock trading, design of new drugs.
3 Two ways of designing/implementing an algorithm: (1) given a specification and a few examples; (2) given a set of correct input-output pairs. Specification: given an input array a, the function reverse should return an array b such that b[length(a)-1-i] = a[i] for i = 0, ..., length(a)-1. Illustrative examples: reverse([1,2,3]) -> [3,2,1]; reverse([1]) -> [1]; reverse([]) -> []. Data: ([1,2,3],[3,2,1]), ([10,4,5,23],[23,5,4,10]), ([2],[2]), ([100,211,1,1],[1,1,211,100]), ... The specification defines the task formally. The data gives examples of a successfully accomplished task.
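The two views of the reverse task can be put side by side in a short Python sketch. It assumes zero-based indexing, so the last valid index is length(a)-1. The helper name matches_all is hypothetical and only serves to check a candidate function against the input-output pairs from the slide.

```python
def reverse(a):
    """Return a new list b with b[len(a) - 1 - i] == a[i] for every valid index i."""
    b = [None] * len(a)
    for i in range(len(a)):
        b[len(a) - 1 - i] = a[i]
    return b


# The input-output pairs from the slide, used here as data.
data = [
    ([1, 2, 3], [3, 2, 1]),
    ([10, 4, 5, 23], [23, 5, 4, 10]),
    ([2], [2]),
    ([100, 211, 1, 1], [1, 1, 211, 100]),
]


def matches_all(f, pairs):
    """Hypothetical helper: True when f reproduces every (input, output) pair."""
    return all(f(x) == y for x, y in pairs)
```

Calling matches_all(reverse, data) checks the hand-written implementation against the same pairs a learning method would be given.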
4 Traditional Programming Approach (diagram labels: Specification, Implementation, Code).
5 Machine Learning Approach (diagram labels: Data, Machine Learning Method, Implementation, Code). Machine learning uses examples, DATA, to learn how to accomplish a TASK.
6 Machine Learning – a toy example. Data: {(Sample 1 image, 1), (Sample 2 image, 0), (Sample 3 image, 0), …}, where each pair is (input, output). The TASK: given an input image, predict its label, dog or cat. A machine learning algorithm produces a function 𝑓 such that 𝑓(input) frequently matches the true output label.
7 A machine learning concept: Features. An input, such as an image, needs to be distilled into quantities that can be used for prediction: features. One type of image feature: the fraction of patches with a particular dominant color. Green fraction: 20/30. Light brown fraction: 9/30. Dark brown fraction: 1/30. Light gray fraction: /30. Patches with triangular objects: 3/30. Patches with circular objects: /30.
8 Feature mapping: raw image → feature vector 𝑥 = [0.66, 0.3, 0.03, …]
9 Feature mapping of the dataset (input → features). Sample 1: 𝑥_1 = [0.66, 0.3, 0.03, …]. Sample 2: 𝑥_2 = [0.7, 0.2, 0.01, …]. Sample 3: 𝑥_3 = [0.0, 0.4, 0.4, …]. …
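A minimal sketch of how dominant-color fractions could be turned into a feature vector. It assumes each patch has already been assigned a dominant color name, which is the hard part and is not shown. The function name color_fractions and the palette ordering are hypothetical; the counts (20, 9 and 1 out of 30) are the ones listed on the features slide.

```python
from collections import Counter


def color_fractions(patch_colors, palette):
    """Map a list of per-patch dominant colors to one fraction per palette color."""
    if not patch_colors:
        raise ValueError("need at least one patch")
    counts = Counter(patch_colors)
    total = len(patch_colors)
    return [counts[color] / total for color in palette]


# Assumed ordering of the feature vector entries.
palette = ["green", "light brown", "dark brown"]

# 30 patches with the dominant-color counts from the slide.
patches = ["green"] * 20 + ["light brown"] * 9 + ["dark brown"] * 1

x = color_fractions(patches, palette)  # roughly the first entries of the slide's vector
```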
10 Feature space. Feature vectors can be fairly long: 10^3 to 10^6 entries. Each sample is a point in a high-dimensional space, but we can’t visualize that space easily. Hence, we try to think in a lower-dimensional space. (Plot axes: Feature 1, 𝑥_1; Feature 2, 𝑥_2.)
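11 A simple machine learning algorithm. As a warm-up we will think of a very simple algorithm. Intuition: similar inputs produce similar outputs. The new input is closest, in feature space, to a training sample that is a cat. Since that sample is a cat, the new input is predicted to be a cat as well. (Plot axes: 𝑥_1, 𝑥_2.)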
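The intuition above is commonly known as a nearest-neighbor rule. Below is a minimal sketch that assumes Euclidean distance in feature space. The two-dimensional points and the labels are made up for illustration and do not come from the slides.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training point closest to the query."""
    best = min(range(len(train_x)), key=lambda i: squared_distance(train_x[i], query))
    return train_y[best]


# Hypothetical training set: (feature 1, feature 2) with a label for each point.
train_x = [(0.2, 0.8), (0.3, 0.7), (0.8, 0.2)]
train_y = ["cat", "cat", "dog"]

prediction = nearest_neighbor_predict(train_x, train_y, (0.25, 0.9))
```

Here the query sits next to the two cat points, so the predicted label is "cat".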
12 Machine Learning – a toy example. Data vs. prediction: f(input 1) = 1, f(input 2) = 0, f(input 3) = 1; each prediction either matches (=) or differs from (≠) the true output. Q: How would you measure the performance of function 𝑓?
13 Designing exams. Suppose you taught a course. You are selecting questions for the final exam. Q: Would you use the same questions you gave students as homework assignments, or the same questions you use every year, or questions that students have never seen in class? Argue why!
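One common answer to the question is accuracy: the fraction of samples for which the prediction equals the true output. The sketch below takes the true labels (1, 0, 0) from the toy dataset slide and the predictions (1, 0, 1) from this slide. Pairing them in sample order is an assumption.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the prediction equals the true label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


true_labels = [1, 0, 0]   # outputs for samples 1, 2, 3 (assumed ordering)
predictions = [1, 0, 1]   # f(input 1), f(input 2), f(input 3)

score = accuracy(true_labels, predictions)  # two of three predictions match
```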
14 Overfitting In machine learning, we typically split the Data into a Training and a Test set. Training set – homework – is used to train the algorithm. Test set – exam – is used to evaluate the trained algorithm’s performance on unseen data. Overfitting occurs when performance on the training data is much better than performance on the test data.
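A small sketch of the training/test split, using a deliberately bad learner that only memorizes its homework. The task (label a number 1 if it is even), the helper names, and the split point are all made up for illustration.

```python
def split(data, n_train):
    """First n_train pairs become the training set, the rest the test set."""
    return data[:n_train], data[n_train:]


def train_memorizer(train_pairs, default):
    """'Learn' by storing every training pair; answer `default` for unseen inputs."""
    table = {x: y for x, y in train_pairs}
    return lambda x: table.get(x, default)


def accuracy_of(f, pairs):
    """Fraction of pairs (x, y) for which f(x) == y."""
    return sum(f(x) == y for x, y in pairs) / len(pairs)


# Hypothetical task: the label is 1 when the number is even, else 0.
data = [(n, int(n % 2 == 0)) for n in range(10)]
train, test = split(data, 6)          # numbers 0..5 for training, 6..9 for testing

f = train_memorizer(train, default=0)
train_accuracy = accuracy_of(f, train)  # perfect on the homework
test_accuracy = accuracy_of(f, test)    # much worse on the exam
```

The gap between train_accuracy and test_accuracy is the signature of overfitting described on the slide.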
15 Neural Networks. Neural networks draw on biological inspiration to achieve astonishing results in machine learning and artificial intelligence. (Figure: an artificial neuron with inputs x1, x2, x3 and edge weights 1, 2, -0.5.)
16 A neural unit example. Input: z = 1∗𝑥_1 + 2∗𝑥_2 − 0.5∗𝑥_3. Output: if z ≤ 0 the unit outputs 0; if z > 0 the unit outputs z.
17 Illustration. With inputs (0, 0, 1): z = 0*1 + 0*2 – 1*0.5 = -0.5 < 0; z < 0, so the neuron outputs 0. With inputs (0, 1, 1): z = 0*1 + 1*2 – 1*0.5 = 1.5 > 0; z > 0, so the neuron outputs 1.5. The values associated with the edges tell us how important each connection is. Adjusting these values to make the most accurate predictions is called Learning.
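The neural unit and the two illustrated cases can be reproduced in a few lines. This sketch uses the weights 1, 2 and -0.5 from the slides and treats z = 0 as producing output 0, which is an assumption since the slides do not cover that case.

```python
def neuron(x, weights=(1.0, 2.0, -0.5)):
    """Weighted sum of the inputs, passed through: output z if z > 0, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return max(0.0, z)


first = neuron((0, 0, 1))   # z = -0.5, so the unit outputs 0
second = neuron((0, 1, 1))  # z = 1.5, so the unit outputs 1.5
```

Changing the values in weights changes the outputs, and that adjustment is what training would do.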
18 Deep Learning. Recent advances in neural network training have enabled the training of deep networks. Deep networks have many intermediate layers between input and output. (Figure: input layer, …, output layer.)
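A sketch of what "many intermediate layers" means computationally: the output of one layer of units becomes the input of the next. The layer sizes and all weights below are made up for illustration, and each unit is assumed to use the same rule as the single neuron (output z if z > 0, else 0).

```python
def layer(inputs, weight_rows):
    """One layer: each row of weights defines one unit applied to all inputs."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs))) for row in weight_rows]


def forward(inputs, layers):
    """Feed the inputs through every layer in turn and return the final outputs."""
    activations = inputs
    for weight_rows in layers:
        activations = layer(activations, weight_rows)
    return activations


# Hypothetical network: 3 inputs -> 2 units -> 2 units -> 1 output unit.
layers = [
    [[1.0, 2.0, -0.5], [0.5, -1.0, 1.0]],
    [[1.0, 1.0], [-1.0, 2.0]],
    [[0.5, 0.5]],
]

output = forward([0.0, 1.0, 1.0], layers)
```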
19 Deep Networks can write novels after being trained on Shakespeare’s works. At the beginning of training: tyntd-iafhatawiaoihrdemot lytdws e ,tfti, astai f ogoh eoase rrranbyne 'nhthnee e plia tklrgd t o idoe ns,smtt h ne etie h,hregtrs nigtike,aoaenns lng After training for a few hours: "Why do what that day," replied Natasha, and wishing to himself the fact the princess, Princess Mary was easier, fed in had oftened him. Pierre aking his soul came to the packs and drove up his father-in-law women.
20 How is it done? Learn how to predict the next character given the previous characters: (“To be, or not to b”, “e”), (“That is the questi”, “o”), … (Figure: input “To be, or not to b”, output “e”.)
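The training pairs for next-character prediction can be cut directly out of raw text. The sketch below assumes a fixed context length, which is a simplification, and the function name next_char_pairs is hypothetical.

```python
def next_char_pairs(text, context_length):
    """Build (previous characters, next character) pairs with a sliding window."""
    pairs = []
    for i in range(context_length, len(text)):
        pairs.append((text[i - context_length:i], text[i]))
    return pairs


# With an 18-character context, this text yields exactly the first pair on the slide.
pairs = next_char_pairs("To be, or not to be", 18)
```

A network trained on many such pairs can then generate text one character at a time by feeding its own output back in as context.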
21 Deep Networks can learn to write Operating Systems code. Generated sample: /* * If this error is set, we will need anything right after that BSD. */ static void action_new_function(struct s_stat_info *wb) { unsigned long flags; int lel_idx_bit = e->edd, *sys & ~((unsigned long) *FIRST_COMPAT); buf[0] = 0xFFFFFFFF & (bit << 4); min(inc, slist->bytes); printk(KERN_WARNING "Memory allocated %02x/%02x, " "original MLL instead\n"), min(min(multi_run - s->len, max) * num_data_in), frame_pos, sz + first_seg); div_u64_w(val, inb_p); spin_unlock(&disk->queue_lock); mutex_unlock(&s->sock->mutex); mutex_unlock(&func->mutex); return disassemble(info->pending_bh); } static void num_serial_settings(struct tty_struct *tty) if (tty == tty) disable_single_st_p(dev); pci_disable_spool(port); return 0;
22 Deep Networks can learn to play Go (AlphaGo)
23 How does it work? The network plays against itself, choosing moves based on preferences. The input is the board state; the output is the next move.
24 How does it work? After a game is done: moves made by the winner become slightly more preferred; moves made by the loser become slightly less preferred. Over many games, good moves become preferred. The preferences are called a policy, and training in this manner is called reinforcement learning.
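The update rule above can be sketched with a plain table of preferences. This is a toy illustration only: it assumes one preference number per move regardless of board state, a fixed step size, and random choice weighted by the exponential of the preference. A system such as AlphaGo uses a neural network conditioned on the board state instead. All names here are hypothetical.

```python
import math
import random


def choose_move(preferences, legal_moves, rng=random):
    """Pick a legal move at random, favoring moves with higher preference."""
    weights = [math.exp(preferences.get(move, 0.0)) for move in legal_moves]
    return rng.choices(legal_moves, weights=weights, k=1)[0]


def update_preferences(preferences, winner_moves, loser_moves, step=0.1):
    """Return new preferences: winner's moves nudged up, loser's moves nudged down."""
    updated = dict(preferences)
    for move in winner_moves:
        updated[move] = updated.get(move, 0.0) + step
    for move in loser_moves:
        updated[move] = updated.get(move, 0.0) - step
    return updated


# After one finished game: the winner played "a" and "b", the loser played "c".
policy = update_preferences({}, winner_moves=["a", "b"], loser_moves=["c"])
```

Repeating the update over many self-played games is what lets good moves accumulate higher preference.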
25 Deep Networks are learning how to play Starcraft II (DeepMind)
26 How to get started? Curiosity and tenacity.
27 How to continue? Machine learning requires: computer science; mathematics (multivariate calculus, probability and statistics, and linear algebra); curiosity and tenacity.
28 A group exercise: dreaming up new machine learning applications. In groups of 3-4, come up with one or two ideas for machine learning applications. 1) What is the Task? For example: predicting what you want for breakfast tomorrow. 2) What is the Data you would use? Think input-output pairs. Example: (ingredients, ratings), where inputs are ingredients and outputs are ratings. 3) How would solving this task change your life and the lives of others? Example: menu design for restaurants with known customers; personalized meal service; balancing cost, nutrition, and diversity. We will take 5-10 minutes and then talk about the ideas you came up with.