[Deep learning book] 5.1.1 The Task, T
고등어찌짐
2022. 9. 4. 10:05
5.1 Learning Algorithms
- A machine learning algorithm is an algorithm that is able to learn from data.
- learning?
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
- A machine learning algorithm: an algorithm that can learn from data
- learning: the process of improving performance P on a task T based on experience E
5.1.1 The Task, T
- task
The process of learning itself is not the task; learning is our means of attaining the ability to perform the task.
- example
Machine learning tasks are usually described in terms of how the machine learning system should process an example. An example is a collection of features that have been quantitatively measured from some object or event that we want the machine learning system to process.
- features
We typically represent an example as a vector x ∈ R^n, where each entry x_i of the vector is another feature.
For example, the features of an image are usually the values of the pixels in the image.
- task
What the machine learning system is meant to accomplish, e.g., making a robot walk: walking is the robot's task.
task != learning
Learning is how the system acquires the ability to perform the task T (learning itself is not the task).
- example
A collection of features quantitatively measured from some object or event that we want the machine learning system to process.
An example can be represented as a vector x ∈ R^n.
- features
Each entry x_i of the vector x is a distinct feature.
ex) the pixel values of an image are its features
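To make the feature-vector idea concrete, here is a minimal sketch (not from the book) that flattens a tiny made-up image into a vector x ∈ R^n with NumPy:

```python
import numpy as np

# A made-up 2x2 grayscale "image": each pixel value is one feature.
image = np.array([[0.1, 0.9],
                  [0.5, 0.3]])

# Flatten the pixels into a feature vector x in R^n (here n = 4).
x = image.reshape(-1)
print(x)        # [0.1 0.9 0.5 0.3]
print(x.shape)  # (4,)
```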
5.1.1.1 Task T examples (1)
- Classification
The learning algorithm is usually asked to produce a function f : R^n → {1, ..., k}.
(1) When y = f(x), the model assigns an input described by vector x to a category identified by numeric code y.
(2) Alternatively, f outputs a probability distribution over classes.
- Classification with missing inputs
When some of the inputs may be missing, rather than providing a single classification function, the learning algorithm must learn a set of functions.
Each function corresponds to classifying x with a different subset of its inputs missing.
=> Learn a probability distribution over all the relevant variables, then solve the classification task by marginalizing out the missing variables. With n input variables, we can obtain all 2^n different classification functions needed for each possible set of missing inputs, but the computer program needs to learn only a single function describing the joint probability distribution.
- Classification
function f : R^n → {1, ..., k}
(1) When y = f(x), the input vector x is assigned a numeric code y identifying its category.
(2) Alternatively, output a probability distribution over the classes for the input vector x.
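A minimal sketch of both output forms, assuming a linear model whose weights W and bias b are random stand-ins for a trained model (softmax is just one common way to produce a distribution over classes):

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical linear classifier for n = 3 features and k = 4 classes.
n, k = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(k, n))  # stand-in weights, not a trained model
b = np.zeros(k)

x = np.array([0.2, -1.0, 0.5])  # an example x in R^n
p = softmax(W @ x + b)          # form (2): probability distribution over k classes
y = int(np.argmax(p))           # form (1): a single numeric class code y
print(p, y)
```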
- Classification with missing inputs
The case where some entries of the input vector x may be missing (not every input is guaranteed to be observed).
(1) Learn a set of functions, one for each possible subset of missing inputs; each function classifies x with a different subset of its inputs missing.
(2) More efficiently, learn a single function describing the joint probability distribution over all relevant variables, then classify by marginalizing out the missing variables.
=> With n input variables, this single joint distribution covers all 2^n classification functions needed for every possible set of missing inputs.
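A toy illustration of the marginalization idea, assuming two binary input variables and a made-up joint distribution p(y, x1, x2); a single joint table answers every one of the 2^n missing-input patterns:

```python
import numpy as np

# Made-up joint distribution p(y, x1, x2): axis 0 is the binary class y,
# axes 1 and 2 are the binary inputs x1 and x2. Entries sum to 1.
joint = np.array([[[0.10, 0.05],
                   [0.05, 0.10]],
                  [[0.20, 0.10],
                   [0.10, 0.30]]])

def classify(x1=None, x2=None):
    """Classify with any subset of inputs missing: observed variables are
    fixed to their values, missing ones are marginalized (summed) out."""
    q = joint
    q = q.sum(axis=2) if x2 is None else q[:, :, x2]
    q = q.sum(axis=1) if x1 is None else q[:, x1]
    return int(np.argmax(q / q.sum()))  # most probable class y

print(classify(x1=0, x2=1))  # both inputs observed
print(classify(x1=0))        # x2 missing, marginalized out
print(classify())            # both missing: falls back to the prior p(y)
```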
5.1.1.1 Task T examples (2)
- Regression
Predict a numerical value given some input: f : R^n → R.
ex1) prediction of the expected claim amount that an insured person will make
ex2) prediction of future prices of securities
- Transcription
Observe a relatively unstructured representation of some kind of data and transcribe the information into discrete textual form.
ex) optical character recognition, speech recognition
- Machine translation
The input already consists of a sequence of symbols in some language, and the computer program must convert this into a sequence of symbols in another language.
- Structured output
- Involves any task where the output is a vector (or other data structure containing multiple values) with important relationships between the different elements.
- The program must output several values that are all tightly interrelated.
ex1) parsing: mapping a natural language sentence into a tree that describes its grammatical structure by tagging nodes of the tree as verbs, nouns, adverbs, and so on
ex2) pixel-wise segmentation of images: assigning every pixel in an image to a specific category
ex3) image captioning: the computer program observes an image and outputs a natural language sentence describing the image
- Regression
Predicting a numerical value: f : R^n → R.
ex) predicting the price of a security
- Transcription
Turning a relatively unstructured representation into discrete textual form.
ex) optical character recognition, speech recognition
- Machine translation
Converting a sequence of symbols in one language into a sequence of symbols in another language.
- Structured output
The output is a vector (or another data structure holding multiple values) whose elements have important relationships to one another.
ex) parsing (building a grammatical tree for a sentence), segmentation, image captioning
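A minimal sketch of a regression task f : R^n → R, fitting a least-squares linear model to made-up data (the book does not prescribe any particular model):

```python
import numpy as np

# Made-up data: four examples with n = 2 features each, scalar targets.
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.0],
              [4.0, 3.0]])
y = np.array([3.1, 2.4, 4.0, 7.1])

# Fit f(x) = w . x + b by ordinary least squares.
A = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
w_b, *_ = np.linalg.lstsq(A, y, rcond=None)

x_new = np.array([2.5, 1.5])
y_hat = w_b[:2] @ x_new + w_b[2]  # f : R^n -> R, a single numerical value
print(y_hat)
```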
5.1.1.1 Task T examples (3)
- Anomaly detection
The computer program sifts through a set of events or objects and flags some of them as being unusual or atypical.
ex) credit card fraud detection
- Synthesis and sampling
- Generate new examples that are similar to those in the training data.
ex) video games can automatically generate textures for large objects or landscapes
- Generate a specific kind of output given the input.
ex) in a speech synthesis task, we provide a written sentence and ask the program to emit an audio waveform containing a spoken version of that sentence
- Imputation of missing values
Given a new example x ∈ R^n, but with some entries x_i of x missing, provide a prediction of the values of the missing entries.
- Anomaly detection
Flagging some events or objects in a set as unusual or atypical.
ex) credit card fraud detection
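A toy sketch of anomaly detection on made-up transaction amounts, using a simple z-score rule as a stand-in for a learned detector:

```python
import numpy as np

# Made-up transaction amounts; the last one is suspiciously large.
amounts = np.array([12.0, 9.5, 11.2, 10.8, 13.1, 250.0])

# Flag transactions more than 2 standard deviations from the mean.
z = (amounts - amounts.mean()) / amounts.std()
flags = np.abs(z) > 2.0
print(flags)  # only the 250.0 transaction is flagged as atypical
```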
- Synthesis and sampling
Generating new examples similar to those in the training data, or producing a specific kind of output for a given input.
ex) automatically generating textures in video games
ex) in speech synthesis, given a written sentence, emitting an audio waveform of that sentence being spoken
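A toy sketch of synthesis by sampling, assuming made-up 2-D training data and a fitted diagonal Gaussian as the generative model (one deliberately simple choice):

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up training examples in R^2.
train = rng.normal(loc=[1.0, -2.0], scale=[0.5, 1.5], size=(500, 2))

# Fit a diagonal Gaussian to the training data.
mu = train.mean(axis=0)
sigma = train.std(axis=0)

# Synthesize new examples similar to (but not copies of) the training data.
new_examples = rng.normal(loc=mu, scale=sigma, size=(3, 2))
print(new_examples)
```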
- Imputation of missing values
When some entries x_i of an example x ∈ R^n are missing, predicting the values of the missing entries.
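A minimal sketch of imputation on made-up data, filling each missing entry x_i with its column mean (a deliberately simple stand-in for a learned predictor):

```python
import numpy as np

# Made-up dataset: rows are examples in R^3, NaN marks a missing entry.
X = np.array([[1.0,    2.0, np.nan],
              [2.0, np.nan,    6.0],
              [3.0,    4.0,    5.0]])

# Predict each missing x_i with the mean of its column's observed values.
col_means = np.nanmean(X, axis=0)
X_imputed = np.where(np.isnan(X), col_means, X)
print(X_imputed)
```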
5.1.1.1 Task T examples (4)
- Denoising
- Given as input a corrupted example x̃ ∈ R^n obtained by an unknown corruption process from a clean example x ∈ R^n.
- Predict the clean example x from its corrupted version x̃.
- More generally, predict the conditional probability distribution p(x | x̃).
- Density estimation or probability mass function estimation
Learn a function p_model : R^n → R.
- p_model: if x is continuous, a probability density function.
- p_model: if x is discrete, a probability mass function.
- Density estimation enables us to explicitly capture the underlying distribution: where examples cluster tightly and where they are unlikely to occur.
- It does not always enable us to solve all the related tasks, because the required operations on p(x) can be computationally intractable.
- Denoising
- Given a corrupted example x̃ ∈ R^n produced from a clean example x ∈ R^n by an unknown corruption process, predict the clean x.
- Equivalently, predict the conditional probability p(x | x̃).
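A toy sketch of the denoising setup: a made-up clean signal, Gaussian corruption, and a moving-average filter standing in for a learned denoiser:

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean example x in R^n (a made-up smooth signal, n = 100).
x = np.sin(np.linspace(0, 2 * np.pi, 100))

# Corruption process (unknown to the learner): additive Gaussian noise.
x_tilde = x + rng.normal(scale=0.3, size=x.shape)

# Stand-in denoiser: a moving average predicting x from x_tilde.
kernel = np.ones(5) / 5
x_hat = np.convolve(x_tilde, kernel, mode="same")

print(np.mean((x_tilde - x) ** 2))  # error before denoising
print(np.mean((x_hat - x) ** 2))    # smaller error after denoising
```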
- Density estimation or probability mass function estimation
Learning a function p_model : R^n → R.
- If x is continuous: p_model is a probability density function.
- If x is discrete: p_model is a probability mass function.
- Density estimation captures the distribution itself: where examples cluster tightly and where they are unlikely to occur.
- Because the required operations on p(x) are often computationally intractable, density estimation does not let us solve every related task.
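A minimal sketch of density estimation, assuming made-up 1-D samples from two clusters and a Gaussian kernel density estimate for p_model:

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up samples drawn from two clusters.
data = np.concatenate([rng.normal(-2.0, 0.5, 200),
                       rng.normal(3.0, 1.0, 200)])

def p_model(x, h=0.4):
    """Gaussian kernel density estimate of p(x) from the samples."""
    z = (x - data) / h
    return np.mean(np.exp(-0.5 * z ** 2) / (h * np.sqrt(2 * np.pi)))

print(p_model(-2.0))  # high density: examples cluster tightly here
print(p_model(0.5))   # low density: examples are unlikely to occur here
```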