# Eureqa.jl
Symbolic regression built on Julia and interfaced by Python.
Uses regularized evolution and simulated annealing.
## Running:
You can execute the program from the command line with, for example:
```bash
python eureqa.py --threads 8 --binary-operators plus mult
```
You can see all hyperparameters in the function `eureqa` inside `eureqa.py`.
This function generates Julia code which is then executed
by `eureqa.jl` and `paralleleureqa.jl`.
## Modification
You can change the binary and unary operators in `hyperparams.jl` here:
```julia
const binops = [plus, mult]
const unaops = [sin, cos, exp];
```
For example, you can add a power operator with:
```julia
pow(x::Float32, y::Float32)::Float32 = sign(x)*abs(x)^y
const binops = [plus, mult, pow]
```
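Similarly, a unary operator can be appended to `unaops`. As an illustration (this `log_abs` helper is hypothetical, not part of the package):
```julia
# Hypothetical example: a logarithm guarded against non-positive inputs.
log_abs(x::Float32)::Float32 = log(abs(x) + 1f-8)
const unaops = [sin, cos, exp, log_abs]
```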
You can change the dataset here:
```julia
const X = convert(Array{Float32, 2}, randn(100, 5)*2)
# Here is the function we want to learn (x2^2 + cos(x3) - 5)
const y = convert(Array{Float32, 1}, ((cx,)->cx^2).(X[:, 2]) + cos.(X[:, 3]) .- 5)
```
by either loading in a dataset or modifying the definition of `y`.
(The `.` is used to broadcast a scalar function over an array.)
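Loading a dataset could look roughly like the following sketch, which assumes a hypothetical `data.csv` whose first five columns are features and whose last column is the target (not part of the repository):
```julia
using DelimitedFiles

# Read a comma-separated file into a Float32 matrix (hypothetical file name).
data = readdlm("data.csv", ',', Float32)
const X = data[:, 1:5]    # features, one row per sample
const y = data[:, end]    # target column
```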
### Hyperparameters
Annealing allows each evolutionary cycle to turn down the exploration
rate over time: at the end (temperature 0), it will only select solutions
better than existing solutions.
The following parameter, parsimony, controls how much complex solutions are penalized:
```julia
const parsimony = 0.01
```
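For intuition, the effect of parsimony can be pictured as adding a size-dependent term to the prediction error. This is only a sketch of that idea, not the package's actual scoring code:
```julia
const parsimony = 0.01f0

# Mean squared error between predictions and targets.
mse(pred, target) = sum(abs2, pred .- target) / length(target)

# `complexity` stands in for the node count of an equation tree; larger
# parsimony values push the search toward smaller equations.
penalized_score(pred, target, complexity) = mse(pred, target) + parsimony * complexity
```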
Finally, the following parameter determines how much to scale the temperature by (`T` runs between 0 and 1):
```julia
const alpha = 10.0
```
Larger alpha means more exploration.
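A sketch of how such an annealing rule typically works is below; the exact acceptance criterion used by `eureqa.jl` may differ:
```julia
const alpha = 10.0

# Decide whether to keep a mutated equation given the old and new losses and
# the current temperature T (1 at the start of a cycle, 0 at the end).
function accept_mutation(old_loss::Float64, new_loss::Float64, T::Float64)::Bool
    new_loss <= old_loss && return true   # always keep improvements
    T == 0.0 && return false              # at temperature 0, only improvements pass
    # Larger alpha flattens the exponential, so worse solutions survive more
    # often: more exploration.
    return rand() < exp(-(new_loss - old_loss) / (alpha * T))
end
```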
One can also adjust the relative probabilities of each mutation here (a sampling sketch follows the list below):
```julia
weights = [8, 1, 1, 1, 0.1, 0.5, 2]
```
for:
1. Perturb constant
2. Mutate operator
3. Append a node
4. Delete a subtree
5. Simplify equation
6. Randomize completely
7. Do nothing
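These weights could drive a weighted sampler along the lines of the sketch below, which uses the StatsBase package and made-up names; it is not the actual implementation:
```julia
using StatsBase

const mutation_names = ["perturb constant", "mutate operator", "append node",
                        "delete subtree", "simplify equation", "randomize", "do nothing"]
const mutation_weights = Weights([8, 1, 1, 1, 0.1, 0.5, 2])

# Draw one mutation per evolutionary step, in proportion to its weight:
chosen = sample(mutation_names, mutation_weights)
```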
# TODO
- [ ] Hyperparameter tune
- [ ] Add mutation for constant<->variable
- [ ] Create a Python interface
- [ ] Create a benchmark for accuracy
- [ ] Create struct to pass through all hyperparameters, instead of treating as constants
  - Make sure this doesn't affect performance
- [ ] Use a neural network to generate weights over the full probability distribution, conditional on error and existing equation, and train on some randomly-generated equations
- [ ] Performance:
  - [ ] Use an enum for functions instead of storing them?
  - Current most expensive operations:
    - [x] deepcopy() before the mutate, to see whether to accept or not.
      - Seems like it's necessary right now, but it is still by far the slowest option.
    - [ ] Calculating the loss function - there are duplicate calculations happening.
    - [ ] Declaration of the weights array every iteration
- [x] Explicit constant optimization on hall-of-fame
  - Create method to find and return all constants, from left to right
  - Create method to find and set all constants, in same order
  - Pull up some optimization algorithm and add it. Keep the package small!
- [x] Create a benchmark for speed
- [x] Simplify subtrees with only constants beneath them. Or should I? Maybe randomly simplify sometimes?
- [x] Record hall of fame
- [x] Optionally (with hyperparameter) migrate the hall of fame, rather than current bests
- [x] Test performance of reduced precision integers
  - No effect