[Machine Learning Foundations] Lecture 1. The Learning Problem

Machine Learning Foundations Notes
Instructor: Hsuan-Tien Lin

What Is Machine Learning?

From Learning to Machine Learning
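- Learning: acquiring skill with experience accumulated from observations. Learning starts from observations and, through a learning process, turns them into a useful skill.
- Machine Learning: acquiring skill with experience accumulated/computed from data. A computer simulates the learning process: by observing its environment (the data), it acquires a skill that is useful to the computer.
- What is skill? Improving some performance measure. A skill can be viewed as an improvement in some kind of performance.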
Why use machine learning?
- ML is an alternative route to build complicated systems
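- when humans cannot program the system manually: writing down the rules line by line from one's own reasoning is very hard, while letting the machine analyze the data itself is easier, e.g., sending a robot to Mars
- when rapid decisions are needed that humans cannot make: people cannot judge within such a short time, e.g., ultra-short-term stock trading
- when the system needs to be user-oriented at a massive scale: personalized services
Key Essence of Machine Learning: when is machine learning actually needed?
- there exists some 'underlying pattern' to be learned: there is a hidden pattern to learn, and some goal to achieve
- but there is no programmable (easy) definition: the pattern cannot easily be defined directly in a program, or in other words, the problem cannot be solved with simple logical rules
- somehow there is data about the pattern: plenty of data about these hidden rules is available as the source for learning; remember that machine learning starts from data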

Machine learning has applications in every aspect of daily life (food, clothing, housing, transportation, education, and entertainment), and it can also make these everyday needs more convenient:
- Food (Sadilek et al., 2013): from Twitter data, tell the food-poisoning likelihood of a restaurant properly
- Clothing (Abu-Mostafa, 2012): from sales figures and client surveys, give good fashion recommendations to clients
- Housing (Tsanas and Xifara, 2012): from the characteristics of buildings and their energy load, predict the energy load of other buildings closely
- Transportation (Stallkamp et al., 2012): from some traffic sign images and their meanings, recognize traffic signs accurately
- Education: from students' records, predict whether a student can give the correct answer to another quiz question
- Entertainment (e.g., recommender systems): from how many users have rated some movies, predict how a user would rate an unrated movie
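
To make the recommender-system example a little more concrete, the sketch below predicts a missing rating from known ones. It is only a naive baseline and not the method of any system mentioned in these notes: the `ratings` data, the user and movie names, and the helper `predict_rating` are all made up for illustration, and the prediction is simply the average rating that other users gave the movie.

```python
from statistics import mean

# Hypothetical toy data: ratings[user][movie] = score from 1 to 5.
ratings = {
    "alice": {"movie_a": 5, "movie_b": 3},
    "bob": {"movie_a": 4, "movie_b": 2, "movie_c": 4},
    "carol": {"movie_b": 3, "movie_c": 2},
}


def predict_rating(ratings, user, movie):
    """Naive baseline: average the ratings other users gave this movie."""
    others = [r[movie] for u, r in ratings.items() if u != user and movie in r]
    if others:
        return mean(others)
    # No one else rated the movie: fall back to the mean of all known ratings.
    return mean(score for r in ratings.values() for score in r.values())


# alice has not rated movie_c, so it is predicted from bob's and carol's ratings.
prediction = predict_rating(ratings, "alice", "movie_c")
print(prediction)
```

Here the data are the known ratings and the skill is the quality of the predicted rating; a real system would learn a far richer model from much more data.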

Machine Learning and Data Mining
- It’s difficult to distinguish ML and DM in reality
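- Machine Learning: use data to compute a hypothesis $g$ that approximates the target $f$. The aim is to use data to find a hypothesis $g$ that closely resembles the target $f$ we want.
- Data Mining: use (huge) data to find properties that are interesting. Data mining aims to use data to discover interesting facts; for example, a supermarket operator may want to know what else a customer who bought product A will go on to buy. Traditionally, data mining uses very large amounts of data to find properties that are interesting or useful for a specific application.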
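
The machine learning definition above (use data to compute a hypothesis $g$ that approximates the target $f$) can be sketched in a few lines. Everything here is an assumption made for illustration: the target `target_f` is a made-up threshold rule that would be unknown in practice, and `learn_threshold` is a hypothetical helper that picks, among a handful of candidate thresholds, the one with the fewest mistakes on the training data.

```python
def target_f(x: float) -> int:
    """Hypothetical target f; unknown in practice, used here only to label data."""
    return 1 if x > 0.3 else -1


def learn_threshold(data):
    """Return (g, theta): a threshold hypothesis with the fewest training errors."""
    xs = sorted(x for x, _ in data)
    # Candidate thresholds: midpoints between neighbouring training inputs.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(theta):
        # Number of training examples that the threshold theta misclassifies.
        return sum((1 if x > theta else -1) != y for x, y in data)

    theta = min(candidates, key=errors)

    def g(x: float) -> int:
        return 1 if x > theta else -1

    return g, theta


# Training data: inputs 0.0, 0.1, ..., 1.0 labelled by the target.
data = [(i / 10, target_f(i / 10)) for i in range(11)]
g, theta = learn_threshold(data)

train_errors = sum(g(x) != y for x, y in data)
print(f"learned threshold: {theta:.2f}, training errors: {train_errors}")
```

The learned $g$ matches $f$ on every training example, yet its threshold need not equal the one inside $f$, so the two can still disagree on inputs that fall between training points. In that sense $g$ approximates $f$ rather than recovering it.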
Machine Learning and Artificial Intelligence
- ML is one possible route to realize AI
- Artificial Intelligence: compute something that shows intelligent behavior. The hope is that computers can do intelligent things, such as playing chess or driving a car automatically.
Machine Learning and Statistics
- statistics: many useful tools for ML
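- Statistics: use data to make inference about an unknown process. Statistics uses data to infer something that was originally unknown. Traditional statistics starts from mathematics: it makes certain mathematical assumptions and arrives at inferences that can be proved.
- Machine learning starts from data, and its algorithms put more emphasis on how to actually compute the result, not only on what the mathematical result is.
- Statistics can thus be seen as one way to realize machine learning.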