Machine Learning Foundations Notes
Lecture 1. The Learning Problem
Instructor: Hsuan-Tien Lin



What Is Machine Learning?

  • From Learning to Machine Learning

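    • Learning: acquiring skill with experience accumulated from observations. Observations go in, a learning process runs, and a useful skill comes out.
    • Machine Learning: acquiring skill with experience accumulated/computed from data. The computer simulates the learning process: observations of the environment (data) are turned into a skill that is useful to the computer.
    • What is skill? An improvement in some performance measure.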
  • Why use machine learning?

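    • ML is an alternative route to building complicated systems
    • when humans cannot program the system manually: writing the rules out line by line by hand is hard, and letting the machine analyze the data itself is easier, e.g. sending a robot to Mars
    • when rapid decisions are needed that humans cannot make in time, e.g. ultra-short-term stock trading
    • when the system must be user-oriented on a massive scale, e.g. personalized services
  • Key Essence of Machine Learning: when is machine learning actually needed?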

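    • there exists some 'underlying pattern' to be learned, so there is some target to aim for
    • but there is no programmable (easy) definition: the pattern cannot be written down directly as a simple program or resolved with simple logical rules
    • somehow there is data about the pattern: plenty of data about the hidden rules is available as the source for learning; remember that machine learning starts from data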

Applications of Machine Learning

  • Machine learning has applications in every part of daily life (food, clothing, housing, transportation, education, entertainment) and can make these everyday needs more convenient
  • Food (Sadilek et al., 2013): from Twitter data, tell the food-poisoning likeliness of a restaurant properly
  • Clothing (Abu-Mostafa, 2012): from sales figures and client surveys, give good fashion recommendations to clients
  • Housing (Tsanas and Xifara, 2012): from the characteristics of buildings and their energy loads, predict the energy load of other buildings closely
  • Transportation (Stallkamp et al., 2012): from some traffic sign images and their meanings, recognize traffic signs accurately
  • Education: from students’ records, predict whether a student can give a correct answer to another quiz question
  • Entertainment (e.g., recommender systems): from how many users have rated some movies, predict how a user would rate an unrated movie

Components of Learning

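  • Example: Metaphor Using Credit Approval. A bank decides whether to issue a credit card to an applicant.

    • Data the bank has: the applicant's age, gender, income, and so on
    • What the machine must learn: how the bank should approve credit cards, i.e. how to decide whether or not to issue a card to a given person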
    • Formalize the Learning Problem

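      • Treat a credit card applicant's information as $x$ and the bank's decision on whether to issue the card as $y$
      • There is an unknown function $f$ (the underlying rule) that maps $x$ to $y$; the machine's job is to learn this $f$
      • We also have a data set $D = \{(x_{1}, y_{1}), \ldots, (x_{N}, y_{N})\}$ of the bank's past decisions: $x_{1}$ is past applicant number one, and $y_{1}$ records whether the bank issued them a card
      • Observing the data suggests some candidate hypotheses; the computer learns which of them hold up and which do not, and finally outputs one hypothesis $g$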

      • The learning flow in detail:

      • $f$ is an unknown ideal formula that generated the data (how it generated the data is not known)

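      • A machine learning algorithm takes the data and produces $g$
      • $g$ embodies the performance improvement: the closer $g$ is to $f$, the better (though $g$ will generally not be identical to $f$)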
      • Machine Learning: use data to compute hypothesis $g$ that approximates target $f$
      • The hypothesis set $H$ and the learning algorithm $A$ together make up the machine learning model
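
The flow above ($A$ takes $D$ and $H$ and returns $g$) can be sketched in a few lines of Python. This is only an illustrative toy, not the method used in the lecture: the income figures, the threshold rules that make up `H`, and the "fewest mistakes on `D`" selection rule in `A` are all assumptions made up for this example.

```python
from typing import Callable, List, Tuple

Hypothesis = Callable[[float], int]

# Hypothetical data set D: (annual income in thousands, decision) pairs,
# where +1 means "card issued" and -1 means "card denied".
# These values are invented for illustration only.
D: List[Tuple[float, int]] = [
    (20.0, -1), (35.0, -1), (50.0, +1), (65.0, +1), (80.0, +1),
]


def make_threshold_hypothesis(threshold: float) -> Hypothesis:
    """Build h(x) = +1 if income >= threshold, else -1."""
    return lambda x: 1 if x >= threshold else -1


# Hypothesis set H: a small finite family of candidate rules (an assumption).
H: List[Hypothesis] = [
    make_threshold_hypothesis(t) for t in (10.0, 30.0, 40.0, 60.0, 90.0)
]


def count_errors(h: Hypothesis, data: List[Tuple[float, int]]) -> int:
    """Number of examples in data on which h disagrees with the recorded y."""
    return sum(1 for x, y in data if h(x) != y)


def A(data: List[Tuple[float, int]], hypotheses: List[Hypothesis]) -> Hypothesis:
    """Toy learning algorithm: pick the hypothesis with the fewest mistakes on data."""
    return min(hypotheses, key=lambda h: count_errors(h, data))


# g is the final hypothesis; it approximates the unknown f only as far as D reveals it.
g = A(D, H)
print("mistakes of g on D:", count_errors(g, D))
print("decision for income 45:", g(45.0))
```

Note that $f$ never appears in the code: it is unknown, and the algorithm only ever sees $D$. Agreeing with $D$ is the stand-in for being close to $f$.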

Machine Learning and Other Fields

  • Machine Learning and Data Mining

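    • In practice it is difficult to distinguish ML from DM
    • Machine Learning: use data to compute a hypothesis $g$ that approximates the target $f$, i.e. find a $g$ that closely resembles the $f$ we are after
    • Data Mining: use (huge) data to find a property that is interesting. For example, a supermarket owner wants to know what else a customer who bought product A will go on to buy. Data mining traditionally uses very large amounts of data to find properties that are interesting or useful for a particular application.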
  • Machine Learning and Artificial Intelligence

    • ML is one possible route to realize AI
    • Artificial Intelligence: compute something that shows intelligent behavior, e.g. a computer that plays chess or drives a car automatically
  • Machine Learning and Statistics

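    • Statistics offers many useful tools for ML
    • Statistics: use data to make inference about an unknown process. Traditional statistics starts from mathematics: it makes some mathematical assumptions and arrives at an inference that can be proven.
    • Machine learning starts from data, and its algorithms put more emphasis on how to compute the result, not only on what the mathematical result is
    • Statistics can be seen as one way to realize machine learning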

Summary

  • What is Machine Learning? use data to approximate target
  • Applications of Machine Learning: Almost everywhere
  • Components of Machine Learning: $A$ takes $D$ and $H$ to get $g$
  • Machine Learning and Other Fields: related to DM, AI and Statistics
  • Slides: http://www.csie.ntu.edu.tw/~htlin/mooc/doc/01_present.pdf