# Online Passive-Aggressive Algorithms

Shai Shalev-Shwartz, joint work with Koby Crammer, Ofer Dekel, and Yoram Singer
The Hebrew University, Jerusalem, Israel
## Three Decision Problems

- **Classification**: predict a label $y \in \{+1, -1\}$ for an instance $x$.
- **Regression**: predict a real-valued target $y \in \mathbb{R}$ for an instance $x$.
- **Uniclass**: no labels; maintain a single center that should stay close to the observed points.

## Online Setting

On each round $t$ (prediction is n/a for uniclass):

- Receive an instance and predict a target value with the current hypothesis $w_t$.
- Receive the true target and suffer a loss.
- Update the hypothesis to $w_{t+1}$.
## A Unified View

- Define a discrepancy $\delta(w; z)$ for each problem: $\delta = -y\,(w \cdot x)$ for classification, $\delta = |w \cdot x - y|$ for regression, and $\delta = \|w - x\|$ for uniclass.
- Unified hinge loss: $\ell_\varepsilon(w; z) = \max\{0,\ \delta(w; z) - \varepsilon\}$.
- Notion of realizability: there exists a vector $u$ with $\delta(u; z_t) \le \varepsilon$ for all $t$, i.e. $\ell_\varepsilon(u; z_t) = 0$.
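The unified loss under these discrepancy definitions can be sketched in a few lines (a minimal sketch; the function names and the `eps` defaults for regression and uniclass are mine, not from the slides):

```python
import numpy as np

def hinge(delta, eps):
    """Unified hinge loss: zero while the discrepancy is within eps."""
    return max(0.0, delta - eps)

# Discrepancies for the three problems, per the unified view above.
def classification_loss(w, x, y, eps=-1.0):
    # eps = -1 recovers the familiar hinge loss max{0, 1 - y*(w.x)}
    return hinge(-y * np.dot(w, x), eps)

def regression_loss(w, x, y, eps=0.1):
    return hinge(abs(np.dot(w, x) - y), eps)

def uniclass_loss(w, x, eps=0.1):
    return hinge(np.linalg.norm(w - x), eps)
```

Note how a single `eps` knob covers all three problems: a negative `eps` demands a classification margin, while a positive one carves out an insensitive zone for regression and uniclass.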
## A Unified View (Cont.)

- Online convex programming:
  - Let $g_1, g_2, \dots$ be a sequence of convex functions and let $\varepsilon$ be an insensitivity parameter.
  - For $t = 1, 2, \dots$:
    - Guess a vector $w_t$.
    - Get the current convex function $g_t$.
    - Suffer loss $\ell_t = \max\{0,\ g_t(w_t) - \varepsilon\}$.
- Goal: minimize the cumulative loss $\sum_t \ell_t$.
## The Passive-Aggressive Algorithm

- Each example $z_t$ defines a set of consistent hypotheses: $C_t = \{w : \ell_\varepsilon(w; z_t) = 0\}$.
- The new vector $w_{t+1}$ is set to be the projection of $w_t$ onto $C_t$: the algorithm is passive when $w_t$ already lies in $C_t$ (zero loss) and aggressively projects onto it otherwise.
- Geometrically, $C_t$ is a half-space for classification, a slab of width $2\varepsilon$ for regression, and a ball of radius $\varepsilon$ for uniclass.
## An Analytic Solution

The projection has a closed form in all three cases:

- Classification: $w_{t+1} = w_t + \tau_t\, y_t\, x_t$, where $\tau_t = \ell_t / \|x_t\|^2$.
- Regression: $w_{t+1} = w_t + \operatorname{sign}(y_t - w_t \cdot x_t)\, \tau_t\, x_t$, where $\tau_t = \ell_t / \|x_t\|^2$.
- Uniclass: $w_{t+1} = w_t + \ell_t \dfrac{x_t - w_t}{\|x_t - w_t\|}$, a step of length $\ell_t$ from $w_t$ toward $x_t$.
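The closed-form updates above can be sketched as follows (a minimal sketch; names are mine, and each function performs a single online round):

```python
import numpy as np

def pa_classification(w, x, y):
    """Project w onto {v : y*(v.x) >= 1}; passive when the loss is zero."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    if loss > 0.0:
        w = w + (loss / np.dot(x, x)) * y * x
    return w

def pa_regression(w, x, y, eps):
    """Project w onto the slab {v : |v.x - y| <= eps}."""
    loss = max(0.0, abs(np.dot(w, x) - y) - eps)
    if loss > 0.0:
        w = w + np.sign(y - np.dot(w, x)) * (loss / np.dot(x, x)) * x
    return w

def pa_uniclass(w, x, eps):
    """Move the center a step of length `loss` toward x."""
    dist = np.linalg.norm(x - w)
    loss = max(0.0, dist - eps)
    if loss > 0.0:
        w = w + (loss / dist) * (x - w)
    return w
```

After an aggressive step the new hypothesis sits exactly on the boundary of the consistent set, e.g. $y\,(w_{t+1} \cdot x_t) = 1$ for classification, which is what makes each step a projection rather than a mere gradient move.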
## Loss Bounds

- **Theorem**: Let $z_1, \dots, z_T$ be a sequence of examples.
  - Assumption (realizability): there exists $u$ with $\ell_\varepsilon(u; z_t) = 0$ for all $t$.
  - Then, if the online algorithm is run with insensitivity parameter $\varepsilon$, the following bound holds for any such $u$:
    $$\sum_{t} \ell_t^2 \;\le\; B\, \|u - w_1\|^2,$$
    where $B = \max_t \|x_t\|^2$ for classification and regression and $B = 1$ for uniclass.
## Loss Bounds (Cont.)

For the case of classification we have one degree of freedom: since the discrepancy $-y\,(w \cdot x)$ is linear in $w$, if $\delta(u; z) \le \varepsilon$ then $\delta(c\,u; z) \le c\,\varepsilon$ for any $c > 0$. Therefore, we can set $\varepsilon = -1$ (a unit-margin requirement, recovering the standard hinge loss) and get the following bounds:
## Loss Bounds (Cont.)

- **Classification** (with $\varepsilon = -1$ and $w_1 = 0$): $\sum_t \ell_t^2 \le \|u\|^2 \max_t \|x_t\|^2$ for any $u$ attaining margin $1$ on every example.
- **Uniclass**: $\sum_t \ell_t^2 \le \|u - w_1\|^2$ for any center $u$ of a ball of radius $\varepsilon$ containing all the points.
## Proof Sketch

- Define: $\Delta_t = \|w_t - u\|^2 - \|w_{t+1} - u\|^2$.
- Upper bound: the sum telescopes, so $\sum_t \Delta_t \le \|w_1 - u\|^2$.
- Lower bound: since $w_{t+1}$ is the projection of $w_t$ onto a convex set containing $u$, we get $\Delta_t \ge \|w_t - w_{t+1}\|^2$; a Lipschitz condition on the discrepancy then relates $\|w_t - w_{t+1}\|$ to the suffered loss, giving $\Delta_t \ge \ell_t^2 / B$.

## Proof Sketch (Cont.)

- Combining the upper and lower bounds yields $\sum_t \ell_t^2 / B \le \|w_1 - u\|^2$, which is the stated loss bound.
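The combined bound can be checked numerically in the realizable classification case: run the PA update on a stream that some $u$ separates with unit margin and compare the accumulated squared loss against $\|u\|^2 \max_t \|x_t\|^2$. The data generator below is my own construction, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
u = np.array([3.0, -2.0])          # the realizing competitor (unit margin)

w = np.zeros(2)
sq_loss, max_sq_norm = 0.0, 0.0
for _ in range(500):
    x = rng.normal(size=2)
    y = 1.0 if rng.random() < 0.5 else -1.0
    # shift x along y*u until u attains margin >= 1 (keeps the run realizable)
    while y * np.dot(u, x) < 1.0:
        x = x + y * u / np.dot(u, u)
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    sq_loss += loss ** 2
    max_sq_norm = max(max_sq_norm, np.dot(x, x))
    if loss > 0.0:
        w = w + (loss / np.dot(x, x)) * y * x   # PA projection step

bound = max_sq_norm * np.dot(u, u)   # B * ||u - w_1||^2 with w_1 = 0
```

The telescoping argument guarantees `sq_loss <= bound` on every such run, no matter how the separable stream is ordered.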
## The Unrealizable Case

- Main idea: when no $u$ attains zero loss on every example, downsize the step size $\tau_t$, e.g. by capping it with an aggressiveness parameter $C$, so that a single noisy example cannot move the hypothesis arbitrarily far.
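As a hedged sketch of this idea, the variant below caps the step size at $C$, in the style of what the journal version of these algorithms calls PA-I; the slide's exact scaling may differ:

```python
import numpy as np

def pa1_classification(w, x, y, C=1.0):
    """PA step for classification with the step size capped at C (PA-I style)."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    # The cap bounds the influence of any single (possibly noisy) example.
    tau = min(C, loss / np.dot(x, x))
    return w + tau * y * x
```

With a large `C` the uncapped PA step is recovered, so `C` trades aggressiveness against robustness to label noise.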
## Loss Bound

- **Theorem**: Let $z_1, \dots, z_T$ be a sequence of examples. Then, for the modified algorithm, a cumulative loss bound holds for any competitor $u$: the algorithm's cumulative loss is bounded in terms of $\|u - w_1\|^2$ and the loss $\sum_t \ell_\varepsilon(u; z_t)$ that $u$ itself suffers on the sequence.
## Implications for Batch Learning

- **Batch setting**:
  - Input: a training set $S = \{z_1, \dots, z_m\}$, sampled i.i.d. according to an unknown distribution $D$.
  - Output: a hypothesis parameterized by a single vector $w$.
  - Goal: minimize the expected loss $\mathbb{E}_{z \sim D}\left[\ell_\varepsilon(w; z)\right]$.
- **Online setting**:
  - Input: a sequence of examples $z_1, z_2, \dots$
  - Output: a sequence of hypotheses $w_1, w_2, \dots$
  - Goal: minimize the cumulative loss $\sum_t \ell_t$.
## Implications for Batch Learning (Cont.)

- **Convergence**: Let $S$ be a fixed training set and let $w^{(k)}$ be the vector obtained by PA after $k$ epochs over $S$. The cumulative loss over all epochs remains bounded, so the per-epoch loss vanishes and $w^{(k)}$ converges.
- **Large margin for classification**: in the limit every example satisfies the unit-margin constraint $y\,(w \cdot x) \ge 1$, which implies that the margin attained by PA for classification is at least half the optimal margin.
## Derived Generalization Properties

- **Average hypothesis**: Let $\bar{w} = \frac{1}{T}\sum_{t=1}^{T} w_t$ be the average hypothesis. Then, with high probability over the i.i.d. sample, the expected loss of $\bar{w}$ is bounded by the average online loss plus a lower-order convergence term (a standard online-to-batch conversion).
## A Multiplicative Version

- Assumption: the competitor lies in the probability simplex ($u_i \ge 0$, $\sum_i u_i = 1$).
- Multiplicative update: each weight is multiplied by an exponential factor determined by the current example and the weights are renormalized, in the style of exponentiated-gradient updates.
- Loss bound: an analogous cumulative loss bound holds, with the squared distance $\|u - w_1\|^2$ replaced by the relative entropy between $u$ and $w_1$.
## Summary

- Unified view of three decision problems
- New algorithms for prediction with hinge loss
- Competitive loss bounds for hinge loss
- Unrealizable case: algorithms & analysis
- Multiplicative algorithms
- Batch learning implications
- Future work & extensions:
  - Updates using general Bregman projections
  - Applications of PA to other decision problems
## Related Work

- Projections Onto Convex Sets (POCS), e.g.:
  - Y. Censor and S.A. Zenios, "Parallel Optimization"
  - H.H. Bauschke and J.M. Borwein, "On Projection Algorithms for Solving Convex Feasibility Problems"
- Online learning, e.g.:
  - M. Herbster, "Learning additive models online with fast evaluating kernels"