File size: 9,368 Bytes
9375c9a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
// Copyright (C) 2017  Davis E. King (davis@dlib.net)
// License: Boost Software License   See LICENSE.txt for the full license.
#undef DLIB_UPPER_bOUND_FUNCTION_ABSTRACT_Hh_
#ifdef DLIB_UPPER_bOUND_FUNCTION_ABSTRACT_Hh_

#include "../matrix.h"
#include <limits>

namespace dlib
{

// ----------------------------------------------------------------------------------------

    struct function_evaluation
    {
        /*!
            WHAT THIS OBJECT REPRESENTS
                This object records the output of a real valued function in response to
                some input. 

                In particular, if you have a function F(x) then the function_evaluation is
                simply a struct that records x and the scalar value F(x).
        !*/

        function_evaluation() = default;
        function_evaluation(const matrix<double,0,1>& x, double y) :x(x), y(y) {}

        matrix<double,0,1> x;
        double y = std::numeric_limits<double>::quiet_NaN();
    };

// ----------------------------------------------------------------------------------------

    class upper_bound_function
    {
        /*!
            WHAT THIS OBJECT REPRESENTS
                This object represents a piecewise linear non-parametric function that can
                be used to define an upper bound on some more complex and unknown function.
                To describe this precisely, lets assume there is a function F(x) which you
                are capable of sampling from but otherwise know nothing about, and that you
                would like to find an upper bounding function U(x) such that U(x) >= F(x)
                for any x.  It would also be good if U(x)-F(x) was minimal.  I.e. we would
                like U(x) to be a tight upper bound, not something vacuous like U(x) =
                infinity.

                The upper_bound_function class is a tool for creating this kind of upper
                bounding function from a set of function_evaluations of F(x).  We do this
                by considering only U(x) of the form:
                    U = [](matrix<double,0,1> x) {
                       double min_ub = infinity;
                       for (size_t i = 0; i < POINTS.size(); ++i) {
                            function_evaluation p = POINTS[i]
                            double local_bound = p.y + sqrt(noise_terms[i] + trans(p.x-x)*M*(p.x-x))
                            min_ub = min(min_ub, local_bound)
                        }
                        return min_ub;
                    }
                Where POINTS is an array of function_evaluation instances drawn from F(x),
                M is a diagonal matrix, and noise_terms is an array of scalars.

                To create an upper bound U(x), the upper_bound_function takes a POINTS array
                containing evaluations of F(x) as input and solves the following quadratic
                program to find the parameters of U(x):
                    
                    min_{M,noise_terms}:  sum(squared(M)) + sum(squared(noise_terms/relative_noise_magnitude))
                    s.t.   U(POINTS[i].x) >= POINTS[i].y,  for all i 
                           noise_terms[i] >= 0
                           min(M) >= 0
                           M is a diagonal matrix 
               
                Therefore, the quadratic program finds the U(x) that always upper bounds
                F(x) on the supplied POINTS, but is otherwise as small as possible.



                The inspiration for the upper_bound_function object came from the AdaLIPO
                algorithm from this excellent paper:
                    Global optimization of Lipschitz functions 
                    Malherbe, Cédric and Vayatis, Nicolas 
                    International Conference on Machine Learning - 2017
                In that paper, they propose to use a simpler U(x) where noise_terms is
                always 0 and M is a diagonal matrix where each diagonal element is the same
                value.  Therefore, there is only a single scalar parameter for U(x) in
                their formulation of the problem.  This causes difficulties if F(x) is
                stochastic or has discontinuities since, without the noise term, M will
                become really huge and the upper bound becomes vacuously large.  It is also
                problematic if the gradient of F(x) with respect to x contains elements of
                widely varying magnitude since the simpler formulation of U(x) assumes a
                uniform rate of change regardless of which dimension is varying. 
        !*/

    public:

        upper_bound_function(
        );
        /*!
            ensures
                - #num_points() == 0
                - #dimensionality() == 0
        !*/

        explicit upper_bound_function(
            const std::vector<function_evaluation>& points,
            const double relative_noise_magnitude = 0.001,
            const double solver_eps = 0.0001
        );
        /*!
            requires
                - all the x vectors in points must have the same non-zero dimensionality.
                - relative_noise_magnitude >= 0
                - solver_eps > 0
            ensures
                - Creates an upper bounding function U(x), as described above, assuming that
                  the given points are drawn from F(x).
                - Uses the provided relative_noise_magnitude when solving the QP, as
                  described above.  Note that relative_noise_magnitude can be set to 0.  If
                  you do this then all the noise terms are constrained to 0.  You should
                  only do this if you know F(x) is non-stochastic and continuous
                  everywhere.
                - When solving the QP used to find the parameters of U(x), the upper
                  bounding function, we solve the QP to solver_eps accuracy.   It's
                  possible that large enough solver_eps can lead to upper bounds that don't
                  upper bound all the supplied points.  But for reasonable epsilon values
                  this shouldn't be a problem. 
                - #num_points() == points.size()
                - #dimensionality() == points[0].x.size()
        !*/

        upper_bound_function(
            const double relative_noise_magnitude,
            const double solver_eps 
        );
        /*!
            requires
                - relative_noise_magnitude >= 0
                - solver_eps > 0
            ensures
                - #num_points() == 0
                - #dimensionality() == 0
                - This destructor is the same as calling the above constructor with points.size()==0
        !*/


        void add (
            const function_evaluation& point
        );
        /*!
            requires
                - num_points() == 0 || point.x.size() == dimensionality()
                - point.x.size() != 0
            ensures
                - Adds point to get_points().
                - Incrementally updates the upper bounding function with the given function
                  evaluation.  That is, we assume that F(point.x)==point.y and solve the QP
                  described above to find the new U(x) that upper bounds all the points
                  this object knows about (i.e. all the points in get_points() and the new point).
                - Calling add() is much faster than recreating the upper_bound_function
                  from scratch with all the points.  This is because we warm start with the
                  previous solution to the QP.  This is done by discarding any non-active
                  constraints and solving the QP again with only the previously active
                  constraints and the new constraints formed by all the pairs of the new
                  point and the old points.  This means the QP solved by add() is much
                  smaller than the QP that would be solved by a fresh call to the
                  upper_bound_function constructor.
        !*/

        const std::vector<function_evaluation>& get_points(
        ) const;
        /*!
            ensures
                - returns the points from F(x) used to define this upper bounding function.
                  These are all the function_evaluation objects given to this object via
                  its constructor and add().
        !*/

        long num_points(
        ) const;
        /*!
            ensures
                - returns the number of points used to define the upper bounding function.
                  (i.e. returns get_points().size())
        !*/

        long dimensionality(
        ) const;
        /*!
            ensures
                - returns the dimensionality of the input vectors to the upper bounding function. 
        !*/

        double operator() (
            const matrix<double,0,1>& x
        ) const;
        /*!
            requires
                - num_points() > 0
                - x.size() == dimensionality()
            ensures
                - return U(x)
                  (i.e. returns the upper bound on F(x) at x given by our upper bounding function)
        !*/

    };

// ----------------------------------------------------------------------------------------

}

#endif // DLIB_UPPER_bOUND_FUNCTION_ABSTRACT_Hh_