[Machine Learning Foundations] Lecture 1. The Learning Problem
Machine Learning Foundations Notes
Instructor: Hsuan-Tien Lin
What Is Machine Learning?
From Learning to Machine Learning
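- Learning: acquiring skill with experience accumulated from observations. Learning starts from observations and, through a learning process, turns them into a useful skill
- Machine Learning: acquiring skill with experience accumulated/computed from data. A computer simulates the learning process: its observations of the environment (the data) are turned into a skill that is useful to the computer
- What is skill? An improvement in some performance measure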
Why use machine learning?
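- ML is an alternative route to building complicated systems
- when humans cannot program the system manually: writing out the rules line by line from one's own reasoning is hard, while letting the machine analyze the data itself is easier, e.g., sending a robot to Mars
- when rapid decisions are needed that humans cannot make: people cannot judge within such a short time, e.g., ultra-short-term stock trading
- when the system must be user-oriented on a massive scale: e.g., personalized services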
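Key Essence of Machine Learning: When Is Machine Learning Needed?
- there exists some "underlying pattern" to be learned: there is a hidden pattern to learn, and there must be some target to aim for
- but there is no programmable (easy) definition: the pattern cannot be defined directly in a program in a simple way, or, put differently, it cannot be captured by simple logical rules
- somehow there is data about the pattern: the hidden rules come with plenty of data that can serve as the source for machine learning; remember that machine learning starts from data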
Applications of Machine Learning
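- Machine learning has applications in every part of daily life (food, clothing, housing, transportation, education, and entertainment) and can make these everyday needs more convenient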
- Food (Sadilek et al., 2013): from Twitter data, tell how likely a restaurant is to cause food poisoning
- Clothing (Abu-Mostafa, 2012): from sales figures and client surveys, give good fashion recommendations to clients
- Housing (Tsanas and Xifara, 2012): from the characteristics of buildings and their energy loads, closely predict the energy load of other buildings
- Transportation (Stallkamp et al., 2012): from some traffic sign images and their meanings, recognize traffic signs accurately
- Education: from students' records, predict whether a student can answer another quiz question correctly
- Entertainment, e.g., recommender systems: from the ratings many users have given to some movies, predict how a user would rate an unrated movie
Components of Learning
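Example: Metaphor Using Credit Approval (a bank decides whether to issue a credit card to an applicant)
- Data the bank has: the applicant's age, gender, income, and so on
- What the machine must learn: how the bank should approve credit cards, i.e., how to decide whether or not to issue a card to a given person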
Formalize the Learning Problem
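- Think of the credit card applicant's information as $x$, and of whether the bank issues the card as $y$
- There is an unknown function $f$ (the underlying rule) that maps $x$ to $y$; the machine's job is to learn this $f$
- We have a data set $D$ containing the bank's past credit card decisions: $x_{1}$ is the first past applicant, and $y_{1}$ is whether the bank issued a card to that applicant
From observing the data we can form some hypotheses; the computer then learns which of these hypotheses are right or wrong, and finally outputs one hypothesis $g$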
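The notation above can be collected in one place. The block below is only a summary sketch: the symbols $\mathcal{X}$, $\mathcal{Y}$, and $N$ (input space, output space, and number of past applicants) do not appear in these notes and are introduced here as assumptions for readability.

$$
\begin{aligned}
&\text{input: } x \in \mathcal{X} && \text{(applicant information)}\\
&\text{output: } y \in \mathcal{Y} && \text{(issue the card or not)}\\
&\text{unknown target function: } f : \mathcal{X} \to \mathcal{Y} && \text{(the underlying rule)}\\
&\text{data: } D = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{N}, y_{N})\} && \text{(past decisions of the bank)}\\
&\text{learned hypothesis: } g : \mathcal{X} \to \mathcal{Y} && \text{(hoped to be close to } f\text{)}
\end{aligned}
$$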
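The detailed flow:
- $f$ is an unknown ideal formula that generated the data (how it generated the data is unknown)
- A machine learning algorithm produces $g$ from the data
- $g$ represents some kind of performance improvement: the closer $g$ is to $f$, the better ($g$ and $f$ will not be exactly the same)
- Machine Learning: use data to compute hypothesis $g$ that approximates target $f$
- The hypothesis set $H$ and the learning algorithm $A$ together make up the machine learning model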
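The flow above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the data set `D` is made up, the outputs are encoded as +1 (issue) and -1 (do not issue), the hypothesis set `H` is a handful of hypothetical income thresholds, and the algorithm `learning_algorithm` simply picks the hypothesis with the fewest mistakes on `D`. None of these choices come from the lecture; they only show how $A$ takes $D$ and $H$ to get $g$.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical data D: (annual income in thousands, decision) pairs,
# where +1 means "card issued" and -1 means "card not issued".
# The numbers are invented purely for illustration.
D: List[Tuple[float, int]] = [
    (20, -1), (35, -1), (50, +1), (80, +1), (65, +1), (30, -1),
]

Hypothesis = Callable[[float], int]


def make_threshold_rule(threshold: float) -> Hypothesis:
    """Build one hypothesis h: issue (+1) if income >= threshold, else -1."""
    return lambda income: +1 if income >= threshold else -1


# Hypothesis set H: a small, assumed collection of candidate rules,
# keyed by the threshold each rule uses.
H: Dict[float, Hypothesis] = {
    t: make_threshold_rule(t) for t in (10, 25, 40, 60, 90)
}


def count_errors(h: Hypothesis, data: List[Tuple[float, int]]) -> int:
    """Number of examples in the data on which h disagrees with the label."""
    return sum(1 for x, y in data if h(x) != y)


def learning_algorithm(
    hypotheses: Dict[float, Hypothesis], data: List[Tuple[float, int]]
) -> Tuple[float, Hypothesis]:
    """Algorithm A (assumed): return the hypothesis with the fewest errors on D."""
    best = min(hypotheses, key=lambda t: count_errors(hypotheses[t], data))
    return best, hypotheses[best]


# A takes D and H and returns g.
best_threshold, g = learning_algorithm(H, D)
print("chosen threshold:", best_threshold)
print("errors of g on D:", count_errors(g, D))
print("g(45) =", g(45), " g(38) =", g(38))
```

Here $f$ never appears in the code, which matches the setup: the target is unknown, and only the data it produced is available. Agreement with `D` is used as a stand-in for closeness to $f$, which is itself an assumption of this sketch.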
Machine Learning and Other Fields
Machine Learning and Data Mining
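- In practice, it is difficult to distinguish ML from DM
- Machine Learning: use data to compute a hypothesis $g$ that approximates the target $f$, i.e., find a hypothesis $g$ that closely resembles the target $f$ we want
- Data Mining: use (huge) data to find properties that are interesting, e.g., a supermarket operator wants to know what else a customer who bought product A will buy. Data mining traditionally uses a very large amount of data to find properties that are interesting or useful for a particular application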
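The supermarket question above (what else does a customer who bought product A buy?) can be illustrated with a simple co-occurrence count. This is a toy sketch, not a method from the lecture: the `baskets` data and the helper name `bought_together_with` are hypothetical, and real data mining would work on far larger data with more careful measures.

```python
from collections import Counter
from typing import List, Set

# Hypothetical purchase records: each set is one customer's basket.
baskets: List[Set[str]] = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def bought_together_with(item: str, records: List[Set[str]]) -> Counter:
    """Count how often each other product appears in baskets containing item."""
    counts: Counter = Counter()
    for basket in records:
        if item in basket:
            counts.update(basket - {item})
    return counts


co_counts = bought_together_with("bread", baskets)
for product, count in co_counts.most_common():
    print(f"{product}: bought together with bread in {count} baskets")
```

The output is an interesting property of the data (which products tend to accompany bread) rather than a hypothesis $g$ approximating a target $f$, which is the distinction the two definitions draw.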
Machine Learning and Artificial Intelligence
- ML is one possible route to realizing AI
- Artificial Intelligence: compute something that shows intelligent behavior, i.e., have the computer do something smart, such as playing chess or driving a car automatically
Machine Learning and Statistics
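- Statistics offers many useful tools for ML
- Statistics: use data to make inferences about an unknown process. Traditional statistics starts from mathematics: it makes some mathematical assumptions and arrives at an inference that can be proved
- Machine learning starts from data, and its algorithms place more emphasis on how to compute the result, not only on what the mathematical result is
- Statistics can be seen as one way to realize machine learning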
Summary
- What is Machine Learning? Use data to approximate the target
- Applications of Machine Learning: almost everywhere
- Components of Machine Learning: $A$ takes $D$ and $H$ to get $g$
- Machine Learning and Other Fields: related to DM, AI, and Statistics
- Slides: http://www.csie.ntu.edu.tw/~htlin/mooc/doc/01_present.pdf